nf-core/detaxizer

A pipeline to identify (and remove) certain sequences from raw genomic data. Default taxon to identify (and remove) is Homo sapiens. Removal is optional.

de-identification, decontamination, edna, fastq, filter, long-reads, metabarcoding, metagenomics, microbiome, nanopore, short-reads, shotgun, taxonomic-classification, taxonomic-profiling

These pages are for an old version of the pipeline (1.0.0). The latest stable release is 1.3.0.

Launch version 1.0.0 https://github.com/nf-core/detaxizer

Define where the pipeline should find input data and save output data.

Path to comma-separated file containing information about the samples in the experiment.

required

type: string

pattern: ^\S+\.csv$
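
As a quick illustration of what the pattern above accepts, here is a minimal Python sketch that checks a samplesheet path against it. The helper name `is_valid_samplesheet_path` is hypothetical and not part of the pipeline; only the regular expression comes from this page.

```python
import re

# Pattern copied verbatim from the parameter description above.
SAMPLESHEET_PATTERN = re.compile(r"^\S+\.csv$")


def is_valid_samplesheet_path(path: str) -> bool:
    """Return True if the path contains no whitespace and ends in '.csv'."""
    return SAMPLESHEET_PATTERN.fullmatch(path) is not None


if __name__ == "__main__":
    for candidate in ("samplesheet.csv", "my samples.csv", "samplesheet.tsv"):
        print(candidate, is_valid_samplesheet_path(candidate))
```

Note that the pattern rejects any path containing whitespace, so a file such as `my samples.csv` would need to be renamed before use.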

The output directory where the results will be saved. Use absolute paths when writing to storage on cloud infrastructure.

required

type: string

default: results

Email address for completion summary.

type: string

pattern: ^([a-zA-Z0-9_\-\.]+)@([a-zA-Z0-9_\-\.]+)\.([a-zA-Z]{2,5})$
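
The email pattern above is fairly strict. The following Python sketch applies it to a few addresses to show what passes; the function name `is_accepted_email` is an assumption made for illustration, and only the regular expression is taken from this page.

```python
import re

# Pattern copied verbatim from the parameter description above.
EMAIL_PATTERN = re.compile(
    r"^([a-zA-Z0-9_\-\.]+)@([a-zA-Z0-9_\-\.]+)\.([a-zA-Z]{2,5})$"
)


def is_accepted_email(address: str) -> bool:
    """Return True if the address satisfies the schema pattern."""
    return EMAIL_PATTERN.fullmatch(address) is not None


if __name__ == "__main__":
    for address in (
        "user@example.org",
        "first.last@sub.example.org",
        "user@example",         # no top-level domain
        "user@example.museum",  # top-level domain longer than 5 letters
    ):
        print(address, is_accepted_email(address))
```

Because the final group allows only 2 to 5 letters, addresses with longer top-level domains are rejected by this pattern even though they can be valid in practice.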

MultiQC report title. Printed as page header, used for filename if not otherwise specified.

type: string

Parameters which enable/disable certain steps used in the workflow.

Whether the filtering step should be carried out.

type: boolean

Whether the pre-processed reads should be used by the filter.

type: boolean

Whether the output of kraken2 should be used for filtering.

type: boolean

Whether blastn should be skipped.

type: boolean

Save intermediates to the results folder.

type: boolean

Parameters used by kraken2 to classify all reads provided. Fine-tuning of the isolation step can be done via the cutoff_* parameters.

The database which is used in the classification step.

type: string

default: https://genome-idx.s3.amazonaws.com/kraken/k2_standard_08gb_20231009.tar.gz

Save unclassified reads and classified reads (those assigned to any taxon, not specifically assessed or filtered) to separate files.

hidden

type: boolean

Confidence in the classification of a read as a certain taxon.

type: number

default: 0.05

Minimum number of k-mers that must be assigned to the taxon/taxa to be assessed or filtered. A read with fewer such k-mers is ignored by the pipeline.

type: integer

default: 2

Per-read ratio of k-mers assigned to tax2filter to k-mers assigned to any other taxon (excluding unclassified k-mers).

type: number

default: 0.5

Per-read ratio of k-mers assigned to tax2filter to unclassified k-mers.

type: number
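
To make the interplay of the three cutoffs above more concrete, here is a hedged Python sketch of one way they could be combined per read. This is not the pipeline's implementation: the `KmerCounts` type, the function names, the direction of each comparison, the handling of a zero denominator, and the `0.0` default for the unclassified ratio (no default is listed on this page) are all assumptions.

```python
from dataclasses import dataclass


@dataclass
class KmerCounts:
    """Hypothetical per-read k-mer tallies derived from kraken2 output."""
    tax2filter: int    # k-mers assigned to the taxon/taxa to assess or filter
    other_taxa: int    # k-mers assigned to any other taxon
    unclassified: int  # k-mers kraken2 could not classify


def ratio(numerator: int, denominator: int) -> float:
    """Plain ratio; an empty denominator counts as infinitely favourable (assumption)."""
    if denominator == 0:
        return float("inf") if numerator > 0 else 0.0
    return numerator / denominator


def passes_cutoffs(
    counts: KmerCounts,
    min_kmers: int = 2,                   # default listed above
    min_ratio_other: float = 0.5,         # default listed above
    min_ratio_unclassified: float = 0.0,  # assumed; no default is given
) -> bool:
    """Return True if the read would be kept as a candidate under these assumptions."""
    if counts.tax2filter < min_kmers:
        return False
    if ratio(counts.tax2filter, counts.other_taxa) < min_ratio_other:
        return False
    return ratio(counts.tax2filter, counts.unclassified) >= min_ratio_unclassified


if __name__ == "__main__":
    print(passes_cutoffs(KmerCounts(tax2filter=10, other_taxa=5, unclassified=3)))
    print(passes_cutoffs(KmerCounts(tax2filter=1, other_taxa=0, unclassified=0)))
```

Raising either ratio makes the isolation step stricter, since a read then needs proportionally more tax2filter k-mers to be considered.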

The taxon or taxonomic group to be assessed or filtered by the pipeline.

type: string

default: Homo

Parameters to fine-tune the output of blastn.

Coverage is the percentage of the query sequence which can be found in the alignments of the sequence match. It can be used to fine-tune the validation step.

type: number

default: 40

The expect value (E-value) describes how many hits with the same score can be expected by chance in a database of the size queried. The parameter can be used to fine-tune the validation step.

type: number

default: 0.01

Identity is the percentage of exact matches between the query and the sequence found in the database. The parameter can be used to fine-tune the validation step.

type: number

default: 40
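
The three blastn thresholds above can be read as a combined filter on each hit. The Python sketch below illustrates that reading under stated assumptions: the `BlastHit` type and `is_validated` function are hypothetical, and treating coverage and identity as minimums and the E-value as a maximum is an assumption rather than something this page confirms.

```python
from typing import NamedTuple


class BlastHit(NamedTuple):
    """Hypothetical summary of one blastn hit."""
    query_id: str
    coverage: float  # percent of the query covered by the alignment
    evalue: float    # expect value of the hit
    identity: float  # percent of exact matches


def is_validated(
    hit: BlastHit,
    min_coverage: float = 40.0,  # default listed above
    max_evalue: float = 0.01,    # default listed above
    min_identity: float = 40.0,  # default listed above
) -> bool:
    """Return True if the hit meets all three thresholds (assumed semantics)."""
    return (
        hit.coverage >= min_coverage
        and hit.evalue <= max_evalue
        and hit.identity >= min_identity
    )


if __name__ == "__main__":
    hits = [
        BlastHit("read_1", coverage=95.0, evalue=1e-30, identity=99.0),
        BlastHit("read_2", coverage=35.0, evalue=1e-5, identity=90.0),
        BlastHit("read_3", coverage=80.0, evalue=0.5, identity=85.0),
    ]
    print([hit.query_id for hit in hits if is_validated(hit)])
```

Under this reading, raising the coverage or identity threshold, or lowering the E-value threshold, makes the validation step more conservative.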

Options to control the behavior of fastp.

fastp option defining the minimum read length

type: integer

fastp option defining if the reads which failed to be trimmed should be saved

type: boolean

fastp option to define the threshold of quality of an individual base

type: integer

fastp option to define the mean quality for trimming

type: integer

default: 15

fastp option to define if the clipped reads should be saved

type: boolean

Reference genome related files and options required for the workflow.

Name of iGenomes reference.

type: string

default: GRCh38

Path to FASTA genome file.

type: string

Do not load the iGenomes reference config.

hidden

type: boolean

default: true

Directory / URL base for iGenomes references.

type: string

default: s3://ngi-igenomes/igenomes/

Parameters used to describe centralised config profiles. These should not be edited.

Git commit id for Institutional configs.

hidden

type: string

default: master

Base directory for Institutional configs.

hidden

type: string

default: https://raw.githubusercontent.com/nf-core/configs/master

Institutional config name.

hidden

type: string

Institutional config description.

hidden

type: string

Institutional config contact information.

hidden

type: string

Institutional config URL link.

hidden

type: string

Set the top limit for requested resources for any single job.

Maximum number of CPUs that can be requested for any single job.

hidden

type: integer

default: 16

Maximum amount of memory that can be requested for any single job.

hidden

type: string

default: 128.GB

pattern: ^\d+(\.\d+)?\.?\s*(K|M|G|T)?B$

Maximum amount of time that can be requested for any single job.

hidden

type: string

default: 240.h

pattern: ^(\d+\.?\s*(s|m|h|d|day)\s*)+$
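
The memory and time limits above are strings that must match the listed patterns. The following Python sketch checks a few values against both; the function names are hypothetical, and only the two regular expressions are taken from this page.

```python
import re

# Patterns copied verbatim from the parameter descriptions above.
MEMORY_PATTERN = re.compile(r"^\d+(\.\d+)?\.?\s*(K|M|G|T)?B$")
TIME_PATTERN = re.compile(r"^(\d+\.?\s*(s|m|h|d|day)\s*)+$")


def is_valid_memory(value: str) -> bool:
    """Return True if the value matches the memory pattern, e.g. '128.GB'."""
    return MEMORY_PATTERN.fullmatch(value) is not None


def is_valid_time(value: str) -> bool:
    """Return True if the value matches the time pattern, e.g. '240.h'."""
    return TIME_PATTERN.fullmatch(value) is not None


if __name__ == "__main__":
    for value in ("128.GB", "128 GB", "128 gb"):
        print("memory", repr(value), is_valid_memory(value))
    for value in ("240.h", "1d 12h", "240"):
        print("time", repr(value), is_valid_time(value))
```

Both patterns are case-sensitive, and the time pattern requires a unit after every number, so a bare `240` is rejected.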

Less common options for the pipeline, typically set in a config file.

Display help text.

hidden

type: boolean

Display version and exit.

hidden

type: boolean

Method used to save pipeline results to output directory.

hidden

type: string

Email address for completion summary, only when pipeline fails.

hidden

type: string

pattern: ^([a-zA-Z0-9_\-\.]+)@([a-zA-Z0-9_\-\.]+)\.([a-zA-Z]{2,5})$

Send plain-text email instead of HTML.

hidden

type: boolean

File size limit when attaching MultiQC reports to summary emails.

hidden

type: string

default: 25.MB

pattern: ^\d+(\.\d+)?\.?\s*(K|M|G|T)?B$

Do not use coloured log outputs.

hidden

type: boolean

Incoming hook URL for a messaging service.

hidden

type: string

Custom config file to supply to MultiQC.

hidden

type: string

Custom logo file to supply to MultiQC. The file name must also be set in the MultiQC config file.

hidden

type: string

Custom MultiQC yaml file containing HTML including a methods description.

type: string

Whether to validate parameters against the schema at runtime.

hidden

type: boolean

default: true

Show all parameters when using --help.

hidden

type: boolean

Validation of parameters fails when an unrecognised parameter is found.

hidden

type: boolean

Validation of parameters in lenient mode.

hidden

type: boolean
