nf-core/taxprofiler
Highly parallelised multi-taxonomic profiling of shotgun short- and long-read metagenomic data
This page documents version 1.1.0. The latest stable release is 1.2.3.
Define where the pipeline should find input data and save output data.
Path to comma-separated file containing information about the samples and libraries/runs.
string
^\S+\.csv$
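The pattern above accepts only a path with no whitespace that ends in `.csv`. A minimal Python sketch of that check follows, assuming the pattern is applied to the whole parameter value; the helper name `is_valid_samplesheet_path` is hypothetical and not part of the pipeline.

```python
import re

# Pattern copied verbatim from the schema entry above.
SAMPLESHEET_PATTERN = re.compile(r"^\S+\.csv$")


def is_valid_samplesheet_path(path: str) -> bool:
    """Return True if the path contains no whitespace and ends in .csv."""
    # Hypothetical helper for illustration only.
    return SAMPLESHEET_PATTERN.fullmatch(path) is not None


print(is_valid_samplesheet_path("samplesheet.csv"))   # True
print(is_valid_samplesheet_path("my samples.csv"))    # False: contains a space
print(is_valid_samplesheet_path("samplesheet.tsv"))   # False: wrong extension
```

A path containing spaces is rejected even if the file exists, so renaming the file or directory is the simplest fix.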
Path to comma-separated file containing information about databases and profiling parameters for each taxonomic profiler
string
The output directory where the results will be saved. Use absolute paths when writing to storage on cloud infrastructure.
string
Email address for completion summary.
string
^([a-zA-Z0-9_\-\.]+)@([a-zA-Z0-9_\-\.]+)\.([a-zA-Z]{2,5})$
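The email pattern above is fairly strict: it allows only letters, digits, underscores, hyphens, and dots on either side of the `@`, and a final label of two to five letters. The Python sketch below shows which addresses pass, assuming the pattern is matched against the whole value; the helper name `is_accepted_email` is hypothetical.

```python
import re

# Pattern copied verbatim from the schema entry above.
EMAIL_PATTERN = re.compile(
    r"^([a-zA-Z0-9_\-\.]+)@([a-zA-Z0-9_\-\.]+)\.([a-zA-Z]{2,5})$"
)


def is_accepted_email(address: str) -> bool:
    """Return True if the address matches the schema's email pattern."""
    # Hypothetical helper for illustration only.
    return EMAIL_PATTERN.fullmatch(address) is not None


print(is_accepted_email("user@example.org"))             # True
print(is_accepted_email("first.last+tag@example.org"))   # False: '+' not allowed
print(is_accepted_email("user@example.technology"))      # False: final label too long
```

Addresses that are valid elsewhere (plus-addressing, long top-level domains) may therefore be rejected by this pattern.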
MultiQC report title. Printed as page header, used for filename if not otherwise specified.
string
Common options across both long and short read preprocessing QC steps
Specify to skip sequencing quality control of raw sequencing reads
boolean
Specify the tool used for quality control of raw sequencing reads
string
Save reads from samples that went through the adapter clipping, pair-merging, and length filtering steps for both short and long reads
boolean
Save only the final reads from all read processing steps (that are sent to classification/profiling) in results directory.
boolean
Options for adapter clipping, quality trimming, pair-merging, and complexity filtering
Turns on short read quality control steps (adapter clipping, complexity filtering etc.)
boolean
Specify which tool to use for short-read QC
string
Skip adapter trimming
boolean
Specify adapter 1 nucleotide sequence
string
Specify adapter 2 nucleotide sequence
string
Specify a list of all possible adapters to trim. Overrides --shortread_qc_adapter1/2. Formats: .txt (AdapterRemoval) or .fasta (fastp).
string
Turn on merging of read pairs for paired-end data
boolean
Include unmerged reads from paired-end merging in the downstream analysis
boolean
Specify the minimum length of reads to be retained
integer
15
Turns on nucleotide sequence complexity filtering
boolean
Specify which tool to use for complexity filtering
string
Specify the minimum sequence entropy level for complexity filtering
number
0.3
Specify the window size for BBDuk complexity filtering
integer
50
Turn on masking rather than discarding of low complexity reads for BBDuk
boolean
Specify the minimum complexity filter threshold of fastp
integer
30
Specify the complexity filter mode for PRINSEQ++
string
Specify the minimum dust score for PRINSEQ++ complexity filtering
number
0.5
Save reads from samples that went through the complexity filtering step
boolean
Options for adapter clipping, quality trimming, and length filtering
Turns on long read quality control steps (adapter clipping, length filtering etc.)
boolean
Skip long-read trimming
boolean
Skip long-read length and quality filtering
boolean
Specify the minimum length of reads to be retained
integer
1000
Specify the percent of high-quality bases to be retained
integer
90
Specify the number of high-quality bases in the library to be retained
integer
500000000
Options for pre-profiling host read removal
Turn on short-read host removal
boolean
Turn on long-read host removal
boolean
Specify path to single reference FASTA of host(s) genome(s)
string
Specify path to the directory containing pre-made Bowtie2 indexes of the host removal reference
string
Specify path to a pre-made Minimap2 index file (.mmi) of the host removal reference
string
Save mapping index of input reference when not already supplied by user
boolean
Save mapped and unmapped reads in BAM format from host removal
boolean
Save reads from samples that went through the host-removal step
boolean
Options for per-sample run-merging
Turn on run merging
boolean
Save reads from samples that went through the run-merging step
boolean
Turn on profiling with Centrifuge. Requires the database to be present in the CSV file passed to --databases
boolean
Turn on saving of Centrifuge-aligned reads
boolean
Turn on profiling with DIAMOND. Requires the database to be present in the CSV file passed to --databases
boolean
Specify output format from DIAMOND profiling.
string
Turn on saving of DIAMOND-aligned reads. This overrides --diamond_output_format, and no taxon tables will be generated
boolean
Turn on profiling with Kaiju. Requires the database to be present in the CSV file passed to --databases
boolean
Turn on expanding of virus hits to individual viruses rather than aggregating at a taxonomic level.
boolean
Specify taxonomic rank to be displayed in Kaiju taxon table
string
Turn on profiling with Kraken2. Requires the database to be present in the CSV file passed to --databases
boolean
Turn on saving of Kraken2-aligned reads
boolean
Turn on saving of Kraken2 per-read taxonomic assignment file
boolean
Turn on saving minimizer information in the Kraken2 report, which increases the report to an eight-column layout.
boolean
Turn on profiling with KrakenUniq. Requires the database to be present in the CSV file passed to --databases
boolean
Turn on saving of KrakenUniq-aligned reads
boolean
Specify how large to chunk database when loading into memory for KrakenUniq
string
16G
Turn on saving of KrakenUniq per-read taxonomic assignment file
boolean
Specify the number of samples for each KrakenUniq run
integer
20
Turn on Bracken (and the required Kraken2 prerequisite step).
boolean
Turn on profiling with MALT. Requires the database to be present in the CSV file passed to --databases
boolean
Specify which MALT alignment mode to use
string
BlastN
Turn on saving of MALT-aligned reads
boolean
Turn on generation of MEGAN summary file from MALT results
boolean
Turn on profiling with MetaPhlAn. Requires the database to be present in the CSV file passed to --databases
boolean
Turn on profiling with mOTUs. Requires the database to be present in the CSV file passed to --databases
boolean
Turn on printing relative abundance instead of counts.
boolean
Turn on saving the mgc reads count.
boolean
Turn on removing NCBI taxonomic IDs.
boolean
Turn on classification with KMCP.
boolean
Specify which KMCP profiling mode to use.
integer
3
Turn on saving the output of KMCP search
boolean
Turn on profiling with ganon. Requires the database to be present in the CSV file passed to --databases.
boolean
Turn on saving of ganon per-read taxonomic assignment file(s).
boolean
Specify the type of ganon report to save.
string
Specify the taxonomic rank the ganon report file should display.
string
Specify a percentile within which hits will be reported in ganon report output.
integer
Specify a minimum number of reads a hit must have to be retained in the ganon report.
integer
Specify a maximum number of reads a hit must have to be retained in the ganon report.
integer
Turn on standardisation of taxon tables across profilers
boolean
Turn on generation of BIOM output (currently only applies to mOTUs)
boolean
Turn on generation of Krona plots for supported profilers
boolean
Specify path to krona taxonomy directories (required for MALT krona plots)
string
The desired output format.
string
The path to a directory containing taxdump files.
string
Add the taxon name to the output.
boolean
Add the taxon rank to the output.
boolean
Add the taxon’s entire name lineage to the output.
boolean
Add the taxon’s entire ID lineage to the output.
boolean
Add the taxon’s entire rank lineage to the output.
boolean
Ignore individual profiles that cause errors.
boolean
Parameters used to describe centralised config profiles. These should not be edited.
Git commit id for Institutional configs.
string
master
Base directory for Institutional configs.
string
https://raw.githubusercontent.com/nf-core/configs/master
Institutional config name.
string
Institutional config description.
string
Institutional config contact information.
string
Institutional config URL link.
string
Set the top limit for requested resources for any single job.
Maximum number of CPUs that can be requested for any single job.
integer
16
Maximum amount of memory that can be requested for any single job.
string
128.GB
^\d+(\.\d+)?\.?\s*(K|M|G|T)?B$
Maximum amount of time that can be requested for any single job.
string
240.h
^(\d+\.?\s*(s|m|h|d|day)\s*)+$
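The memory and time limits above are strings that must match the listed patterns, which is why defaults such as `128.GB` and `240.h` use a dot between the number and the unit. The Python sketch below checks a few values against both patterns, assuming each pattern is matched against the whole value; the helper names are hypothetical.

```python
import re

# Patterns copied verbatim from the schema entries above.
MEMORY_PATTERN = re.compile(r"^\d+(\.\d+)?\.?\s*(K|M|G|T)?B$")
TIME_PATTERN = re.compile(r"^(\d+\.?\s*(s|m|h|d|day)\s*)+$")


def is_valid_memory(value: str) -> bool:
    """Return True if the value matches the schema's memory pattern."""
    # Hypothetical helper for illustration only.
    return MEMORY_PATTERN.fullmatch(value) is not None


def is_valid_time(value: str) -> bool:
    """Return True if the value matches the schema's time pattern."""
    # Hypothetical helper for illustration only.
    return TIME_PATTERN.fullmatch(value) is not None


print(is_valid_memory("128.GB"))    # True: the documented default
print(is_valid_memory("128 GB"))    # True: whitespace before the unit is allowed
print(is_valid_memory("128GiB"))    # False: binary unit suffixes are not accepted
print(is_valid_time("240.h"))       # True: the documented default
print(is_valid_time("1d 12h"))      # True: several number-unit pairs may be chained
print(is_valid_time("10 minutes"))  # False: only s, m, h, d, or day are accepted
```

The same memory pattern appears again further down for the email attachment size limit, so values such as `25.MB` follow the same rules.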
Less common options for the pipeline, typically set in a config file.
Display help text.
boolean
Display version and exit.
boolean
Method used to save pipeline results to output directory.
string
Email address for completion summary, only when pipeline fails.
string
^([a-zA-Z0-9_\-\.]+)@([a-zA-Z0-9_\-\.]+)\.([a-zA-Z]{2,5})$
Send plain-text email instead of HTML.
boolean
File size limit when attaching MultiQC reports to summary emails.
string
25.MB
^\d+(\.\d+)?\.?\s*(K|M|G|T)?B$
Do not use coloured log outputs.
boolean
Incoming hook URL for messaging service
string
Custom config file to supply to MultiQC.
string
Custom logo file to supply to MultiQC. File name must also be set in the MultiQC config file
string
Custom MultiQC yaml file containing HTML including a methods description.
string
Whether to validate parameters against the schema at runtime.
boolean
true
Show all params when using --help
boolean
Validation of parameters fails when an unrecognised parameter is found.
boolean
Validation of parameters in lenient mode.
boolean
Reference genome related files and options required for the workflow.
Name of iGenomes reference.
string
Directory / URL base for iGenomes references.
string
s3://ngi-igenomes/igenomes
Do not load the iGenomes reference config.
boolean