taxprofiler: Parameters

Define where the pipeline should find input data and save output data.

Path to comma-separated file containing information about the samples and libraries/runs.

required

type: string

pattern: ^\S+\.csv$
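The pattern above accepts any path that contains no whitespace and ends in `.csv`. A minimal sketch of checking a path against it before launching a run; the function name is hypothetical and the pipeline's own validation may behave differently:

```python
import re

# Pattern copied verbatim from the schema entry above.
INPUT_PATTERN = re.compile(r"^\S+\.csv$")


def is_valid_samplesheet_path(path: str) -> bool:
    """Return True if the path has no whitespace and ends in '.csv'."""
    return INPUT_PATTERN.fullmatch(path) is not None


# Hypothetical paths, for illustration only.
for candidate in ("samplesheet.csv", "my samples.csv", "samplesheet.tsv"):
    print(candidate, is_valid_samplesheet_path(candidate))
```

Only the first candidate matches: the second contains a space and the third has the wrong extension.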

Path to comma-separated file containing information about databases and profiling parameters for each taxonomic profiler

required

type: string

The output directory where the results will be saved. Use absolute paths when writing to storage on cloud infrastructure.

required

type: string
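The three required entries above can be collected in a parameters file instead of being typed on the command line. A minimal sketch, assuming the parameter names are `input`, `databases`, and `outdir` (the names are not shown in this listing, although `--databases` is referenced further down) and using hypothetical paths:

```python
import json


def build_params(samplesheet: str, databases: str, outdir: str) -> str:
    """Serialise the three required parameters as JSON text.

    The keys are assumed parameter names; the values are whatever
    paths the caller supplies.
    """
    params = {
        "input": samplesheet,
        "databases": databases,
        "outdir": outdir,
    }
    return json.dumps(params, indent=2)


# Hypothetical paths, for illustration only.
params_json = build_params("samplesheet.csv", "databases.csv", "/data/results")
print(params_json)
```

The resulting text can be saved to a file and supplied through Nextflow's `-params-file` option.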

Email address for completion summary.

type: string

pattern: ^([a-zA-Z0-9_\-\.]+)@([a-zA-Z0-9_\-\.]+)\.([a-zA-Z]{2,5})$
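The email pattern is deliberately simple, so some legitimate addresses are rejected. A small sketch that applies the pattern as written, to show which shapes it accepts; the helper name and addresses are hypothetical:

```python
import re

# Pattern copied verbatim from the schema entry above.
EMAIL_PATTERN = re.compile(
    r"^([a-zA-Z0-9_\-\.]+)@([a-zA-Z0-9_\-\.]+)\.([a-zA-Z]{2,5})$"
)


def is_accepted_email(address: str) -> bool:
    """Return True if the address satisfies the schema pattern."""
    return EMAIL_PATTERN.fullmatch(address) is not None


for candidate in (
    "user@example.com",        # accepted
    "first.last@example.org",  # accepted: dots are allowed in the local part
    "user+tag@example.com",    # rejected: '+' is not in the character class
    "user@example.museum",     # rejected: final label is longer than 5 letters
):
    print(candidate, is_accepted_email(candidate))
```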

MultiQC report title. Printed as page header, used for filename if not otherwise specified.

type: string

Common options across both long and short read preprocessing QC steps

Specify to skip sequencing quality control of raw sequencing reads

type: boolean

Specify the tool used for quality control of raw sequencing reads

type: string

Save reads from samples that went through the adapter clipping, pair-merging, and length filtering steps for both short and long reads

type: boolean

Save only the final reads from all read processing steps (that are sent to classification/profiling) in results directory.

type: boolean

Options for adapter clipping, quality trimming, pair-merging, and complexity filtering

Turns on short read quality control steps (adapter clipping, complexity filtering etc.)

type: boolean

Specify which tool to use for short-read QC

type: string

Skip adapter trimming

type: boolean

Specify adapter 1 nucleotide sequence

type: string

Specify adapter 2 nucleotide sequence

type: string

Specify a list of all possible adapters to trim. Overrides --shortread_qc_adapter1/2. Formats: .txt (AdapterRemoval) or .fasta (fastp).

type: string

Turn on merging of read pairs for paired-end data

type: boolean

Include unmerged reads from paired-end merging in the downstream analysis

type: boolean

Specify the minimum length of reads to be retained

type: integer

default: 15

Turns on nucleotide sequence complexity filtering

type: boolean

Specify which tool to use for complexity filtering

type: string

Specify the minimum sequence entropy level for complexity filtering

type: number

default: 0.3

Specify the window size for BBDuk complexity filtering

type: integer

default: 50

Turn on masking rather than discarding of low complexity reads for BBDuk

type: boolean

Specify the minimum complexity filter threshold of fastp

type: integer

default: 30
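fastp's documentation describes complexity as the percentage of bases that differ from the base immediately following them, which is why this threshold is an integer on a 0 to 100 scale. A minimal sketch of that measure; the function names are hypothetical, and treating the threshold as inclusive is an assumption rather than a statement about fastp's internals:

```python
def sequence_complexity(seq: str) -> float:
    """Percentage of bases that differ from the base that follows them."""
    if len(seq) < 2:
        return 0.0
    changes = sum(1 for a, b in zip(seq, seq[1:]) if a != b)
    return 100.0 * changes / (len(seq) - 1)


def passes_complexity_filter(seq: str, threshold: int = 30) -> bool:
    """Keep a read whose complexity reaches the threshold (assumed inclusive)."""
    return sequence_complexity(seq) >= threshold


# A homopolymer run scores 0; a read that changes base at every step scores 100.
print(sequence_complexity("AAAAAAAA"))
print(sequence_complexity("ACGTACGT"))
print(passes_complexity_filter("AAAAAAAAAT"))
```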

Specify the complexity filter mode for PRINSEQ++

type: string

Specify the minimum dust score for PRINSEQ++ complexity filtering

type: number

default: 0.5

Save reads from samples that went through the complexity filtering step

type: boolean

Options for adapter clipping, quality trimming, and length filtering

Turns on long read quality control steps (adapter clipping, length filtering etc.)

type: boolean

Skip long-read trimming

type: boolean

Skip long-read length and quality filtering

type: boolean

Specify the minimum length of reads to be retained

type: integer

default: 1000

Specify the percent of high-quality bases to be retained

type: integer

default: 90

Specify the number of high-quality bases in the library to be retained

type: integer

default: 500000000

Options for pre-profiling host read removal

Turn on short-read host removal

type: boolean

Turn on long-read host removal

type: boolean

Specify path to single reference FASTA of host(s) genome(s)

type: string

Specify path to the directory containing pre-made Bowtie2 indexes of the host removal reference

type: string

Specify path to a pre-made Minimap2 index file (.mmi) of the host removal reference

type: string

Save mapping index of input reference when not already supplied by user

type: boolean

Save mapped and unmapped reads in BAM format from host removal

type: boolean

Save reads from samples that went through the host-removal step

type: boolean

Options for per-sample run-merging

Turn on run merging

type: boolean

Save reads from samples that went through the run-merging step

type: boolean

Turn on profiling with Centrifuge. Requires the database to be present in the CSV file passed to --databases

type: boolean

Turn on saving of Centrifuge-aligned reads

type: boolean

Turn on profiling with DIAMOND. Requires the database to be present in the CSV file passed to --databases

type: boolean

Specify output format from DIAMOND profiling.

type: string

Turn on saving of DIAMOND-aligned reads. Will override --diamond_output_format and no taxon tables will be generated

type: boolean

Turn on profiling with Kaiju. Requires the database to be present in the CSV file passed to --databases

type: boolean

Turn on expanding of virus hits to individual viruses rather than aggregating at a taxonomic level.

type: boolean

Specify taxonomic rank to be displayed in Kaiju taxon table

type: string

Turn on profiling with Kraken2. Requires the database to be present in the CSV file passed to --databases

type: boolean

Turn on saving of Kraken2-aligned reads

type: boolean

Turn on saving of Kraken2 per-read taxonomic assignment file

type: boolean

Turn on saving of minimizer information in the Kraken2 report, which increases the report to an eight-column layout.

type: boolean

Turn on profiling with KrakenUniq. Requires the database to be present in the CSV file passed to --databases

type: boolean

Turn on saving of KrakenUniq-aligned reads

type: boolean

Specify how large to chunk database when loading into memory for KrakenUniq

type: string

default: 16G

Turn on saving of KrakenUniq per-read taxonomic assignment file

type: boolean

Specify the number of samples for each KrakenUniq run

type: integer

default: 20
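The batch size above controls how many samples are grouped into one KrakenUniq run. The grouping idea can be sketched as follows; this illustrates batching in general under a hypothetical function name and is not the pipeline's own implementation:

```python
from typing import Iterator, List, Sequence


def batch_samples(samples: Sequence[str], batch_size: int = 20) -> Iterator[List[str]]:
    """Yield consecutive groups of at most batch_size samples."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    for start in range(0, len(samples), batch_size):
        yield list(samples[start:start + batch_size])


# 45 hypothetical sample names split with the default batch size.
sample_names = [f"sample_{i}" for i in range(45)]
for batch in batch_samples(sample_names):
    print(len(batch), batch[0], batch[-1])
```

With 45 samples and the default of 20, this yields two full batches and a final batch of 5.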

Turn on Bracken (and the required Kraken2 prerequisite step).

type: boolean

Turn on profiling with MALT. Requires the database to be present in the CSV file passed to --databases

type: boolean

Specify which MALT alignment mode to use

type: string

default: BlastN

Turn on saving of MALT-aligned reads

type: boolean

Turn on generation of MEGAN summary file from MALT results

type: boolean

Turn on profiling with MetaPhlAn. Requires the database to be present in the CSV file passed to --databases

type: boolean

Turn on profiling with mOTUs. Requires the database to be present in the CSV file passed to --databases

type: boolean

Turn on printing relative abundance instead of counts.

type: boolean

Turn on saving of the marker gene cluster (mgc) read counts.

type: boolean

Turn on removing NCBI taxonomic IDs.

type: boolean

Turn on classification with KMCP.

type: boolean

Specify which KMCP profiling mode to use.

type: integer

default: 3

Turn on saving the output of KMCP search

type: boolean

Turn on profiling with ganon. Requires the database to be present in the CSV file passed to --databases.

type: boolean

Turn on saving of ganon per-read taxonomic assignment file(s).

type: boolean

Specify the type of ganon report to save.

type: string

Specify the taxonomic rank the ganon report file should display.

type: string

Specify a percentile within which hits will be reported in ganon report output.

type: integer

Specify a minimum number of reads a hit must have to be retained in the ganon report.

type: integer

Specify the maximum number of reads a hit may have to be retained in the ganon report.

type: integer

Turn on standardisation of taxon tables across profilers

type: boolean

Turn on generation of BIOM output (currently only applies to mOTUs)

type: boolean

Turn on generation of Krona plots for supported profilers

type: boolean

Specify path to Krona taxonomy directories (required for MALT Krona plots)

type: string

The desired output format.

type: string

The path to a directory containing taxdump files.

type: string

Add the taxon name to the output.

type: boolean

Add the taxon rank to the output.

type: boolean

Add the taxon’s entire name lineage to the output.

type: boolean

Add the taxon’s entire ID lineage to the output.

type: boolean

Add the taxon’s entire rank lineage to the output.

type: boolean

Ignore individual profiles that cause errors.

type: boolean

Parameters used to describe centralised config profiles. These should not be edited.

Git commit id for Institutional configs.

hidden

type: string

default: master

Base directory for Institutional configs.

hidden

type: string

default: https://raw.githubusercontent.com/nf-core/configs/master

Institutional config name.

hidden

type: string

Institutional config description.

hidden

type: string

Institutional config contact information.

hidden

type: string

Institutional config URL link.

hidden

type: string

Set the top limit for requested resources for any single job.

Maximum number of CPUs that can be requested for any single job.

hidden

type: integer

default: 16
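A resource cap of this kind means that a job asking for more than the limit is given the limit instead. A minimal sketch of that clamping for CPUs; the function name is hypothetical and the pipeline's own mechanism is not shown in this listing:

```python
def cap_cpus(requested: int, max_cpus: int = 16) -> int:
    """Return the requested CPU count, limited to max_cpus."""
    if requested < 1:
        raise ValueError("a job must request at least one CPU")
    return min(requested, max_cpus)


# A request above the cap is reduced; a request below it is left unchanged.
print(cap_cpus(32))
print(cap_cpus(4))
```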

Maximum amount of memory that can be requested for any single job.

hidden

type: string

default: 128.GB

pattern: ^\d+(\.\d+)?\.?\s*(K|M|G|T)?B$
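The memory pattern requires a number followed by a unit that ends in `B`, with an optional dot or whitespace between the two. A short sketch that tests a few spellings against the pattern as written; the helper name is hypothetical:

```python
import re

# Pattern copied verbatim from the schema entry above.
MEMORY_PATTERN = re.compile(r"^\d+(\.\d+)?\.?\s*(K|M|G|T)?B$")


def is_valid_memory(value: str) -> bool:
    """Return True if the value satisfies the memory pattern."""
    return MEMORY_PATTERN.fullmatch(value) is not None


for candidate in ("128.GB", "128 GB", "1.5GB", "128G"):
    print(candidate, is_valid_memory(candidate))
```

The first three spellings match; `128G` does not, because the trailing `B` is mandatory.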

Maximum amount of time that can be requested for any single job.

hidden

type: string

default: 240.h

pattern: ^(\d+\.?\s*(s|m|h|d|day)\s*)+$
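The time pattern accepts one or more integer-plus-unit pairs, so compound durations are allowed but fractional values are not. A short sketch that applies the pattern as written; the helper name is hypothetical:

```python
import re

# Pattern copied verbatim from the schema entry above.
TIME_PATTERN = re.compile(r"^(\d+\.?\s*(s|m|h|d|day)\s*)+$")


def is_valid_time(value: str) -> bool:
    """Return True if the value satisfies the time pattern."""
    return TIME_PATTERN.fullmatch(value) is not None


for candidate in ("240.h", "1d 12h", "2day", "1.5h", "240"):
    print(candidate, is_valid_time(candidate))
```

The first three values match. `1.5h` is rejected because a digit cannot follow the dot, and `240` is rejected because it has no unit.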

Less common options for the pipeline, typically set in a config file.

Display help text.

hidden

type: boolean

Display version and exit.

hidden

type: boolean

Method used to save pipeline results to output directory.

hidden

type: string

Email address for completion summary, only when pipeline fails.

hidden

type: string

pattern: ^([a-zA-Z0-9_\-\.]+)@([a-zA-Z0-9_\-\.]+)\.([a-zA-Z]{2,5})$

Send plain-text email instead of HTML.

hidden

type: boolean

File size limit when attaching MultiQC reports to summary emails.

hidden

type: string

default: 25.MB

pattern: ^\d+(\.\d+)?\.?\s*(K|M|G|T)?B$

Do not use coloured log outputs.

hidden

type: boolean

Incoming hook URL for messaging service

hidden

type: string

Custom config file to supply to MultiQC.

hidden

type: string

Custom logo file to supply to MultiQC. File name must also be set in the MultiQC config file

hidden

type: string

Custom MultiQC yaml file containing HTML including a methods description.

type: string

Whether to validate parameters against the schema at runtime

hidden

type: boolean

default: true

Show all params when using --help

hidden

type: boolean

Validation of parameters fails when an unrecognised parameter is found.

hidden

type: boolean

Validation of parameters in lenient mode.

hidden

type: boolean

Reference genome related files and options required for the workflow.

Name of iGenomes reference.

hidden

type: string

Directory / URL base for iGenomes references.

hidden

type: string

default: s3://ngi-igenomes/igenomes

Do not load the iGenomes reference config.

hidden

type: boolean

nf-core/taxprofiler