ampliseq: Parameters

Path to tab-separated sample sheet

type: string

pattern: ^\S+\.(tsv|csv|yml|yaml|txt)$
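
As a rough illustration of what the pattern above accepts, the following Python sketch applies the same regular expression to a few paths. The function name `is_valid_samplesheet_path` is made up for this example; the pipeline performs its own schema validation.

```python
import re

# Pattern copied verbatim from the parameter definition above.
SAMPLESHEET_PATTERN = re.compile(r"^\S+\.(tsv|csv|yml|yaml|txt)$")


def is_valid_samplesheet_path(path: str) -> bool:
    """Return True if the path has no whitespace and an accepted extension."""
    return SAMPLESHEET_PATTERN.match(path) is not None


if __name__ == "__main__":
    for candidate in ["samples.tsv", "my samples.tsv", "samples.xlsx"]:
        print(candidate, is_valid_samplesheet_path(candidate))
```

Note that `\S+` rejects any path containing whitespace, so a file named `my samples.tsv` would fail the check even though its extension is accepted.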

Path to ASV/OTU fasta file

type: string

pattern: ^\S+\.(fasta|fas|fna|fa|ffn)$

Path to folder containing zipped FastQ files

type: string

Forward primer sequence

type: string

Reverse primer sequence

type: string

Path to metadata sheet. When missing, most downstream analyses are skipped (barplots, PCoA plots, …).

type: string

Path to multi-region definition sheet, for multi-region analysis with Sidle

type: string

pattern: ^\S+\.(tsv|csv|yml|yaml|txt)$

The output directory where the results will be saved. Use absolute paths when writing to storage on cloud infrastructure.

required

type: string

Save intermediate results such as QIIME2’s qza and qzv files

type: boolean

Email address for completion summary.

type: string

pattern: ^([a-zA-Z0-9_\-\.]+)@([a-zA-Z0-9_\-\.]+)\.([a-zA-Z]{2,5})$

If data has binned quality scores such as Illumina NovaSeq

type: boolean

If data is single-ended PacBio reads instead of Illumina

type: boolean

If data is single-ended IonTorrent reads instead of Illumina

type: boolean

If data is single-ended Illumina reads instead of paired-end

type: boolean

If analysing ITS amplicons or any other region with large length variability with Illumina paired end reads

type: boolean

Type of quality scores in raw read data

type: string

If using --input_folder: samples were sequenced in multiple sequencing runs

type: boolean

If using --input_folder: naming of sequencing files

type: string

default: /*_R{1,2}_001.fastq.gz
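
To show which file names the default naming pattern would pick up, here is a small Python sketch that uses a regular expression equivalent to the glob `*_R{1,2}_001.fastq.gz` and groups forward and reverse files by their shared prefix. Grouping by prefix is an assumption made for illustration; how the pipeline derives sample IDs is not described here, and `pair_reads` is a hypothetical helper.

```python
import re
from collections import defaultdict

# Regex equivalent of the glob *_R{1,2}_001.fastq.gz (file name part only).
READ_FILE = re.compile(r"^(?P<prefix>.+)_R(?P<read>[12])_001\.fastq\.gz$")


def pair_reads(filenames):
    """Group matching file names by prefix into {"R1": ..., "R2": ...} dicts."""
    pairs = defaultdict(dict)
    for name in filenames:
        match = READ_FILE.match(name)
        if match:  # files that do not follow the naming scheme are ignored
            pairs[match.group("prefix")]["R" + match.group("read")] = name
    return dict(pairs)


if __name__ == "__main__":
    files = [
        "soil_S1_L001_R1_001.fastq.gz",
        "soil_S1_L001_R2_001.fastq.gz",
        "notes.txt",
    ]
    print(pair_reads(files))
```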

Set read count threshold for failed samples.

type: integer

default: 1

Ignore input files with too few reads.

type: boolean
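
The interplay between the read count threshold and the option to ignore files with too few reads can be sketched as follows. This assumes that, without the ignore option, samples below the threshold stop the run; the function and argument names (`check_read_counts`, `min_read_counts`, `ignore_failed`) are illustrative and not taken from the pipeline.

```python
def check_read_counts(read_counts, min_read_counts=1, ignore_failed=False):
    """Return the samples that pass the read count threshold.

    Raises ValueError when any sample fails and ignore_failed is False.
    """
    failed = sorted(s for s, n in read_counts.items() if n < min_read_counts)
    if failed and not ignore_failed:
        raise ValueError("Samples with too few reads: " + ", ".join(failed))
    # With ignore_failed=True the failed samples are simply dropped.
    return {s: n for s, n in read_counts.items() if n >= min_read_counts}


if __name__ == "__main__":
    counts = {"sampleA": 12000, "sampleB": 0}
    print(check_read_counts(counts, ignore_failed=True))
```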

Spurious sequences sometimes lack primer sequences, and primers introduce errors; both can be removed during primer trimming.

Cutadapt will retain untrimmed reads. Choose only if input reads are not expected to contain primer sequences.

type: boolean

Sets the minimum overlap for valid matches of primer sequences with reads for cutadapt (-O).

type: integer

default: 3

Sets the maximum error rate for valid matches of primer sequences with reads for cutadapt (-e).

type: number

default: 0.1
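
The two cutadapt settings above (minimum overlap and maximum error rate) work together. The sketch below is a simplified reading of that rule, assuming the number of tolerated errors is the match length multiplied by the error rate, rounded down; the function names are hypothetical and cutadapt's own alignment logic is more involved.

```python
import math


def allowed_errors(match_length, max_error_rate=0.1):
    """Number of errors tolerated for a primer match of the given length."""
    return math.floor(match_length * max_error_rate)


def is_valid_primer_match(match_length, errors, min_overlap=3, max_error_rate=0.1):
    """Accept a match only if it is long enough and has few enough errors."""
    if match_length < min_overlap:
        return False
    return errors <= allowed_errors(match_length, max_error_rate)


if __name__ == "__main__":
    for length in (2, 9, 19, 20):
        print(length, allowed_errors(length), is_valid_primer_match(length, 1))
```

With the defaults, a 20 bp primer match tolerates two errors, while matches shorter than 10 bp must be exact.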

Cutadapt will be run twice to ensure removal of potential double primers

type: boolean

Ignore files with too few reads after trimming.

type: boolean

Read trimming and quality filtering are intended to reduce spurious results and aid error correction

DADA2 read truncation value for forward strand, set this to 0 for no truncation

type: integer

DADA2 read truncation value for reverse strand, set this to 0 for no truncation

type: integer

If --trunclenf and --trunclenr are not set, these values will be automatically determined using this median quality score

type: integer

default: 25

Ensures that truncation values chosen with --trunc_qmin retain at least this fraction of reads.

type: number

default: 0.75
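
One way to picture how a median quality threshold and a minimum retained fraction could interact is sketched below. This is not the pipeline's implementation, only an assumption-laden illustration: the truncation length is set to the number of cycles before the median quality first drops below the threshold, and is then shortened until the requested fraction of reads is at least that long (reads shorter than the truncation length are discarded by DADA2). The name `choose_truncation_length` is hypothetical.

```python
def choose_truncation_length(median_quality, read_lengths, qmin=25, rmin=0.75):
    """Pick a truncation length from per-cycle median qualities.

    median_quality: median quality score per cycle (index 0 = first base).
    read_lengths:   length of each read in the sample.
    """
    # Keep all cycles before the first one whose median quality is below qmin.
    trunc = len(median_quality)
    for cycle, quality in enumerate(median_quality):
        if quality < qmin:
            trunc = cycle
            break
    if not read_lengths:
        return trunc
    # Shorten until at least rmin of the reads are long enough to be kept.
    while trunc > 0:
        retained = sum(1 for n in read_lengths if n >= trunc) / len(read_lengths)
        if retained >= rmin:
            break
        trunc -= 1
    return trunc


if __name__ == "__main__":
    medians = [38, 38, 37, 36, 30, 24, 20]
    print(choose_truncation_length(medians, [7, 7, 7, 7]))
    print(choose_truncation_length(medians, [7, 7, 3, 3]))
```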

DADA2 read filtering option

type: integer

default: 2

DADA2 read filtering option

type: integer

default: 50

DADA2 read filtering option

type: integer

Ignore files with too few reads after quality filtering.

type: boolean

Mode of sample inference: “independent”, “pooled” or “pseudo”

type: string

Strategy to merge paired-end reads. When paired-end reads do not overlap sufficiently for merging, you can use “concatenate” (not recommended). When you have a mix of overlapping and non-overlapping reads, use “consensus”.

type: string

The score assigned for each matching base pair during sequence alignment.

type: integer

default: 1

The penalty score assigned for each mismatched base pair during sequence alignment.

type: integer

default: -2

The penalty score assigned for each gap introduced during sequence alignment.

type: integer

default: -4
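
To make the three scoring parameters above concrete, the following sketch scores an already aligned pair of sequences with the default values (match 1, mismatch -2, gap -4). It only sums column scores for a given alignment; it does not search for the best alignment, and `alignment_score` is a hypothetical helper rather than a DADA2 function.

```python
def alignment_score(seq_a, seq_b, match=1, mismatch=-2, gap=-4):
    """Sum column scores of two aligned sequences ('-' marks a gap)."""
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have the same length")
    score = 0
    for base_a, base_b in zip(seq_a, seq_b):
        if base_a == "-" or base_b == "-":
            score += gap
        elif base_a == base_b:
            score += match
        else:
            score += mismatch
    return score


if __name__ == "__main__":
    print(alignment_score("ACGT", "ACGT"))  # four matches
    print(alignment_score("ACGT", "ACCT"))  # three matches, one mismatch
    print(alignment_score("ACGT", "AC-T"))  # three matches, one gap
```

Because a gap costs twice as much as a mismatch under the defaults, alignments that explain a difference as a substitution score higher than those that introduce an indel.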

The minimum number of overlapping base pairs required to merge forward and reverse reads.

type: integer

default: 12

The maximum number of mismatches allowed within the overlapping region for merging reads.

type: integer

The percentile used to determine a stringent cutoff which will correspond to the minimum observed overlap in the dataset. This ensures that only read pairs with high overlap are merged into consensus sequences. Those with insufficient overlap are concatenated.

type: number

default: 0.001

ASV post-processing takes place after ASV computation but before taxonomic assignment; it affects all downstream processes

Post-cluster ASVs with VSEARCH

type: boolean

Pairwise identity value used when post-clustering ASVs if the --vsearch_cluster option is used (default: 0.97).

type: number

default: 0.97

Raise stack size when filtering VSEARCH clusters

type: boolean

default: true

Enable SSU filtering. Comma-separated list of kingdoms (domains) in Barrnap, a combination (or one) of “bac”, “arc”, “mito”, and “euk”. ASVs whose lowest e-value falls in one of those kingdoms are kept.

type: string
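
The selection rule described above (keep an ASV when its lowest e-value belongs to one of the listed kingdoms) can be sketched in a few lines. The input structure, a dictionary of per-kingdom e-values for each ASV, and the function name `filter_by_kingdom` are assumptions for illustration; ASVs without any hit are dropped here, which may differ from the pipeline's behaviour.

```python
def filter_by_kingdom(evalues, keep="bac,arc"):
    """Return ASV ids whose best (lowest) e-value is in a wanted kingdom.

    evalues: {asv_id: {kingdom: evalue}}
    keep:    comma-separated kingdoms, e.g. "bac,arc"
    """
    wanted = {k.strip() for k in keep.split(",")}
    kept = []
    for asv, per_kingdom in evalues.items():
        if not per_kingdom:
            continue  # no rRNA hit at all (assumption: such ASVs are dropped)
        best_kingdom = min(per_kingdom, key=per_kingdom.get)
        if best_kingdom in wanted:
            kept.append(asv)
    return kept


if __name__ == "__main__":
    hits = {
        "ASV1": {"bac": 1e-50, "euk": 1e-5},
        "ASV2": {"mito": 1e-40, "bac": 1e-10},
    }
    print(filter_by_kingdom(hits, keep="bac"))
```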

Minimal ASV length

type: integer

Maximum ASV length

type: integer

Filter ASVs based on codon usage

type: boolean

Starting position of codon triplets

type: integer

default: 1

Ending position of codon triplets

type: integer

Define stop codons

type: string

default: TAA,TAG
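
A minimal sketch of codon-based filtering, assuming the start and end positions are 1-based and that an ASV is rejected when any in-frame triplet within that range is a stop codon. The helper `has_stop_codon` is hypothetical and the pipeline may apply additional checks.

```python
def has_stop_codon(sequence, start=1, end=None, stop_codons=("TAA", "TAG")):
    """Check in-frame triplets between start and end (1-based, inclusive)."""
    region = sequence.upper()[start - 1:end]
    # Step through complete triplets only; trailing 1-2 bases are ignored.
    codons = [region[i:i + 3] for i in range(0, len(region) - 2, 3)]
    return any(codon in stop_codons for codon in codons)


if __name__ == "__main__":
    print(has_stop_codon("ATGTAAGGG"))        # TAA is in frame
    print(has_stop_codon("ATAAGG"))           # TAA is out of frame
    print(has_stop_codon("ATAAGG", start=2))  # shifting the frame exposes TAA
```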

Choose a method and database for taxonomic assignments to single-region amplicons

Name of supported database, and optionally also version number

type: string

Path to a custom DADA2 reference taxonomy database

type: string

Path to a custom DADA2 reference taxonomy database for species assignment

type: string

Comma separated list of taxonomic levels used in DADA2’s assignTaxonomy function

type: string

If the expected amplified sequences are extracted from the DADA2 reference taxonomy database

type: boolean

If multiple exact matches against different species are returned

type: boolean

If the reverse complement of each sequence will also be tested for classification

type: boolean

ASV fasta will be subset into chunks of this size for classification

type: integer

default: 10000

Newick file with reference phylogenetic tree. Also requires --pplace_aln and --pplace_model.

type: string

File with reference sequences. Also requires --pplace_tree and --pplace_model.

type: string

Phylogenetic model to use in placement, e.g. ‘LG+F’ or ‘GTR+I+F’. Also requires --pplace_tree and --pplace_aln.

type: string
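
Since --pplace_tree, --pplace_aln and --pplace_model each require the other two, a pre-flight check along the following lines can catch incomplete combinations before a run. This is a sketch with a hypothetical function name, not part of the pipeline.

```python
def check_pplace_options(pplace_tree=None, pplace_aln=None, pplace_model=None):
    """Return True if placement is fully configured, False if not requested.

    Raises ValueError when only some of the three options are given.
    """
    options = {
        "--pplace_tree": pplace_tree,
        "--pplace_aln": pplace_aln,
        "--pplace_model": pplace_model,
    }
    missing = [name for name, value in options.items() if not value]
    if 0 < len(missing) < len(options):
        raise ValueError("Missing required option(s): " + ", ".join(missing))
    return not missing


if __name__ == "__main__":
    print(check_pplace_options())
    print(check_pplace_options("ref.newick", "ref.aln.fasta", "GTR+I+F"))
```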

Method used for alignment, “hmmer” or “mafft”

type: string

Tab-separated file with taxonomy assignments of reference sequences.

type: string

A name for the run

hidden

type: string

Name of supported database, and optionally also version number

type: string

Path to files of a custom QIIME2 reference taxonomy database (tarball, or two comma-separated files)

type: string

Path to QIIME2 trained classifier file (typically *-classifier.qza)

type: string

Name of supported database, and optionally also version number

type: string

Path to a custom Kraken2 reference taxonomy database (.tar.gz|.tgz archive or folder)

type: string

Comma separated list of taxonomic levels used in Kraken2. Will overwrite default values.

type: string

Confidence score threshold for taxonomic classification.

type: number

Name of supported database, and optionally also version number

type: string

If ASVs should be assigned to UNITE species hypotheses (SHs). Only relevant for ITS data.

type: boolean

Part of ITS region to use for taxonomy assignment: “full”, “its1”, or “its2”

type: string

Cutoff for partial ITS sequences. Only full sequences by default.

type: integer

Choose database for taxonomic assignments with multi-region amplicons using SIDLE

Name of supported database, and optionally also version number

type: string

Path to reference taxonomy strings (headerless, *.txt)

type: string

pattern: ^.*\.txt$

Path to reference taxonomy sequences in fasta format

type: string

pattern: ^.*\.(fasta|fas|fna|fa|ffn)$

Path to multiple sequence alignment of reference taxonomy sequences in fasta format

type: string

pattern: ^.*\.(fasta|fas|fna|fa|ffn)$

Path to SIDLE reference taxonomy tree (*.qza)

type: string

pattern: ^.*\.qza$

Exclude reference sequences with more than this many degenerate bases

type: integer

default: 5

Arguments for qiime sidle reconstruct-taxonomy regarding ad-hoc cleaning

type: string

Filtering by taxonomy or abundance will affect all downstream analyses

Comma separated list of unwanted taxa, to skip taxa filtering use “none”

type: string

default: mitochondria,chloroplast

Abundance filtering

type: integer

default: 1

Prevalence filtering

type: integer

default: 1
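
Abundance and prevalence filtering can be illustrated with a small count table. The sketch assumes that abundance means the total count of an ASV across all samples and prevalence means the number of samples in which it occurs; `filter_features` and its argument names are made up for this example.

```python
def filter_features(table, min_frequency=1, min_samples=1):
    """Keep ASVs with enough total counts and enough samples containing them.

    table: {asv_id: {sample_id: count}}
    """
    kept = {}
    for asv, counts in table.items():
        total = sum(counts.values())
        prevalence = sum(1 for c in counts.values() if c > 0)
        if total >= min_frequency and prevalence >= min_samples:
            kept[asv] = counts
    return kept


if __name__ == "__main__":
    table = {
        "ASV1": {"s1": 120, "s2": 80},
        "ASV2": {"s1": 3, "s2": 0},
        "ASV3": {"s1": 0, "s2": 0},
    }
    print(sorted(filter_features(table)))
    print(sorted(filter_features(table, min_frequency=10, min_samples=2)))
```

With both defaults at 1, only ASVs that have no counts at all are removed.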

Metadata is used here to visualize data, either for quality control or for publication-ready figures

Comma separated list of metadata column headers for statistics.

type: string

Comma separated list of metadata column headers for plotting average relative abundance barplots.

type: string

Formula for QIIME2 ADONIS metadata feature importance test for beta diversity distances

type: string

If the functional potential of the bacterial community is predicted.

type: boolean

If data should be exported in SBDI (Swedish Biodiversity Data Infrastructure) Excel format.

type: boolean

Minimum rarefaction depth for diversity analysis. Any sample below that threshold will be removed.

type: integer

default: 500

Minimum taxonomy agglomeration level for taxonomic classifications

type: integer

default: 2

Maximum taxonomy agglomeration level for taxonomic classifications

type: integer

default: 6
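
Taxonomy agglomeration between a minimum and maximum level can be pictured as truncating each taxonomy string to the first N ranks and summing counts that share the truncated string. The sketch below assumes semicolon-separated taxonomy strings; the helper names are hypothetical, and the example taxonomy is invented.

```python
from collections import Counter


def collapse_taxonomy(taxon, level):
    """Truncate a semicolon-separated taxonomy string to its first `level` ranks."""
    ranks = [rank.strip() for rank in taxon.split(";")]
    return ";".join(ranks[:level])


def agglomerate(counts, taxonomy, level):
    """Sum ASV counts that share the same taxonomy at the given level."""
    collapsed = Counter()
    for asv, n in counts.items():
        collapsed[collapse_taxonomy(taxonomy[asv], level)] += n
    return dict(collapsed)


if __name__ == "__main__":
    taxonomy = {
        "ASV1": "Bacteria; Firmicutes; Bacilli",
        "ASV2": "Bacteria; Firmicutes; Clostridia",
    }
    counts = {"ASV1": 10, "ASV2": 5}
    for level in range(2, 4):  # the defaults above would span levels 2 to 6
        print(level, agglomerate(counts, taxonomy, level))
```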

Differential abundance analysis relies on provided metadata

Minimum sample counts to retain a sample for ANCOM analysis. Any sample below that threshold will be removed.

type: integer

default: 1

Perform differential abundance analysis with ANCOM

type: boolean

Perform differential abundance analysis with ANCOMBC

type: boolean

Formula to perform differential abundance analysis with ANCOMBC

type: string

Reference level for --ancombc_formula

type: string

Effect size threshold for differential abundance barplot for --ancombc and --ancombc_formula

type: number

default: 1

Significance threshold for differential abundance barplot for --ancombc and --ancombc_formula

type: number

default: 0.05
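
The effect size and significance thresholds act as a joint filter on which features reach the differential abundance barplot. The following sketch assumes a feature is shown when its absolute effect size is at least the effect size threshold and its adjusted p-value is at most the significance threshold; the data layout and the name `select_for_barplot` are illustrative only.

```python
def select_for_barplot(results, effect_size_threshold=1.0, significance_threshold=0.05):
    """Return feature ids passing both thresholds.

    results: {feature_id: (effect_size, q_value)}
    """
    return [
        feature
        for feature, (effect_size, q_value) in results.items()
        if abs(effect_size) >= effect_size_threshold
        and q_value <= significance_threshold
    ]


if __name__ == "__main__":
    results = {
        "ASV1": (2.3, 0.001),   # large, significant effect
        "ASV2": (-1.5, 0.04),   # negative effects count by absolute value
        "ASV3": (0.4, 0.001),   # significant but small
        "ASV4": (3.0, 0.20),    # large but not significant
    }
    print(select_for_barplot(results))
```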

Customization of the pipeline report

Path to R Markdown file (Rmd)

type: string

default: ${projectDir}/assets/report_template.Rmd

Path to style file (css)

type: string

default: ${projectDir}/assets/nf-core_style.css

Path to logo file (png)

type: string

default: ${projectDir}/assets/nf-core-ampliseq_logo_light_long.png

String used as report title

type: string

default: Summary of analysis results

Path to Markdown file (md) that replaces the ‘Abstract’ section

type: string

Skip FastQC

type: boolean

Skip primer trimming with cutadapt. This is not recommended! Use only in case primer sequences were removed before and the data does not contain any primer sequences.

type: boolean

Skip quality check with DADA2. Can only be skipped when --trunclenf and --trunclenr are set.

type: boolean

Skip annotating SSU matches.

type: boolean

Skip all steps that are executed by QIIME2, including QIIME2 software download, taxonomy assignment by QIIME2, barplots, relative abundance tables, diversity analysis, differential abundance testing.

type: boolean

Skip steps that are executed by QIIME2 except for taxonomic classification. Skip steps including barplots, relative abundance tables, diversity analysis, differential abundance testing.

type: boolean

Skip taxonomic classification. Incompatible with --sbdiexport

type: boolean

Skip taxonomic classification with DADA2

type: boolean

Skip species level when using DADA2 for taxonomic classification. This reduces the required memory dramatically under certain conditions. Incompatible with --sbdiexport

type: boolean

Skip producing barplot

type: boolean

Skip producing any relative abundance tables

type: boolean

Skip alpha rarefaction

type: boolean

Skip alpha and beta diversity analysis

type: boolean

Skip exporting phyloseq rds object(s)

type: boolean

Skip exporting TreeSummarizedExperiment rds object(s)

type: boolean

Skip MultiQC reporting

type: boolean

Skip Markdown summary report

type: boolean

Less common options for the pipeline, typically set in a config file.

Specifies the random seed.

type: integer

default: 100

Display version and exit.

hidden

type: boolean

Method used to save pipeline results to output directory.

hidden

type: string

Email address for completion summary, only when pipeline fails.

hidden

type: string

pattern: ^([a-zA-Z0-9_\-\.]+)@([a-zA-Z0-9_\-\.]+)\.([a-zA-Z]{2,5})$

Send plain-text email instead of HTML.

hidden

type: boolean

File size limit when attaching MultiQC reports to summary emails.

hidden

type: string

default: 25.MB

pattern: ^\d+(\.\d+)?\.?\s*(K|M|G|T)?B$
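
The file size pattern above is fairly permissive about spacing and the dot before the unit, but strict about upper-case units. A short Python sketch, using a hypothetical helper `is_valid_size`, shows which spellings pass.

```python
import re

# Pattern copied verbatim from the parameter definition above.
SIZE_PATTERN = re.compile(r"^\d+(\.\d+)?\.?\s*(K|M|G|T)?B$")


def is_valid_size(value: str) -> bool:
    """Return True if the value matches the file size pattern."""
    return SIZE_PATTERN.match(value) is not None


if __name__ == "__main__":
    for value in ["25.MB", "25MB", "1.5 GB", "10B", "25 mb", "MB"]:
        print(repr(value), is_valid_size(value))
```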

Do not use coloured log outputs.

hidden

type: boolean

Incoming hook URL for messaging service

hidden

type: string

Custom config file to supply to MultiQC.

hidden

type: string

Custom logo file to supply to MultiQC. File name must also be set in the MultiQC config file

hidden

type: string

Custom MultiQC yaml file containing HTML including a methods description.

type: string

Boolean whether to validate parameters against the schema at runtime

hidden

type: boolean

default: true

Base URL or local path to location of pipeline test dataset files

hidden

type: string

default: https://raw.githubusercontent.com/nf-core/test-datasets/

Suffix to add to the trace report filename. Default is the date and time in the format yyyy-MM-dd_HH-mm-ss.

hidden

type: string

Parameters used to describe centralised config profiles. These should not be edited.

Git commit id for Institutional configs.

hidden

type: string

default: master

Base directory for Institutional configs.

hidden

type: string

default: https://raw.githubusercontent.com/nf-core/configs/master

Institutional config name.

hidden

type: string

Institutional config description.

hidden

type: string

Institutional config contact information.

hidden

type: string

Institutional config URL link.

hidden

type: string

MultiQC report title. Printed as page header, used for filename if not otherwise specified.

hidden

type: string
