eager: Parameters

Define where the pipeline should find input data and save output data.

Path to tab- or comma-separated file containing information about the samples in the experiment.

required

type: string

pattern: ^\S+\.(c|t)sv$

Specify to convert input BAM files back to FASTQ for remapping

type: boolean

The output directory where the results will be saved. You have to use absolute paths to storage on Cloud infrastructure.

required

type: string

Email address for completion summary.

type: string

pattern: ^([a-zA-Z0-9_\-\.]+)@([a-zA-Z0-9_\-\.]+)\.([a-zA-Z]{2,5})$

MultiQC report title. Printed as page header, used for filename if not otherwise specified.

type: string

Reference genome related files and options required for the workflow.

Path to FASTA file of the reference genome.

type: string

pattern: ^\S+\.fn?a(sta)?(\.gz)?$

Specify path to samtools FASTA index.

type: string

Specify path to Picard sequence dictionary file.

type: string

Specify path to directory containing index files of the FASTA for a given mapper.

type: string

Specify to generate ‘.csi’ BAM indices instead of ‘.bai’ for larger reference genomes.

type: boolean

Specify to save any pipeline-generated reference genome indices in the results directory.

type: boolean

Path to a tab-/comma-separated file containing reference-specific files.

type: string

pattern: ^\S+\.(c|t)sv$

Name of iGenomes reference.

hidden

type: string

Directory / URL base for iGenomes references.

hidden

type: string

default: s3://ngi-igenomes/igenomes/

Do not load the iGenomes reference config.

hidden

type: boolean

Specify the FASTA header of the extended chromosome when using circularmapper.

type: string

Specify the number of bases to extend reference by (circularmapper only).

type: integer

default: 500

Specify an elongated reference FASTA to be used for circularmapper.

type: string

Specify a samtools index for the elongated FASTA file.

type: string

Parameters used to describe centralised config profiles. These should not be edited.

Git commit id for Institutional configs.

hidden

type: string

default: master

Base directory for Institutional configs.

hidden

type: string

default: https://raw.githubusercontent.com/nf-core/configs/master

Institutional config name.

hidden

type: string

Institutional config description.

hidden

type: string

Institutional config contact information.

hidden

type: string

Institutional config URL link.

hidden

type: string

Less common options for the pipeline, typically set in a config file.

Display version and exit.

hidden

type: boolean

Method used to save pipeline results to output directory.

hidden

type: string

Email address for completion summary, only when pipeline fails.

hidden

type: string

pattern: ^([a-zA-Z0-9_\-\.]+)@([a-zA-Z0-9_\-\.]+)\.([a-zA-Z]{2,5})$

Send plain-text email instead of HTML.

hidden

type: boolean

File size limit when attaching MultiQC reports to summary emails.

hidden

type: string

default: 25.MB

pattern: ^\d+(\.\d+)?\.?\s*(K|M|G|T)?B$

Do not use coloured log outputs.

hidden

type: boolean

Incoming hook URL for messaging service

hidden

type: string

Custom config file to supply to MultiQC.

hidden

type: string

Custom logo file to supply to MultiQC. File name must also be set in the MultiQC config file

hidden

type: string

Custom MultiQC yaml file containing HTML including a methods description.

type: string

Boolean whether to validate parameters against the schema at runtime

hidden

type: boolean

default: true

Base URL or local path to location of pipeline test dataset files

hidden

type: string

default: https://raw.githubusercontent.com/nf-core/test-datasets/

Suffix to add to the trace report filename. Default is the date and time in the format yyyy-MM-dd_HH-mm-ss.

hidden

type: string

Removal of adapters, paired-end merging, poly-G removal, etc.

Specify which tool to use for sequencing quality control.

type: string

Specify to skip all preprocessing steps (adapter removal, paired-end merging, poly-G trimming, etc).

type: boolean

Specify which preprocessing tool to use.

type: string

Specify to skip read-pair merging.

type: boolean

Specify to exclude read-pairs that did not overlap sufficiently for merging (i.e., keep merged reads only).

type: boolean

Specify to skip removal of adapters.

type: boolean

Specify the nucleotide sequence for the forward read/R1.

type: string

Specify the nucleotide sequence for the reverse read/R2.

type: string

Specify a list of all possible adapters to trim.

type: string

Specify the minimum length reads must have to be retained.

type: integer

default: 25

Specify number of bases to hard-trim from 5 prime or front of reads.

type: integer

Specify number of bases to hard-trim from 3 prime or tail of reads.

type: integer

Specify to save the preprocessed reads in the results directory.

type: boolean

Specify to turn on sequence complexity filtering of reads.

type: boolean

Specify the complexity threshold that must be reached or exceeded to retain reads.

type: integer

default: 10

Skip AdapterRemoval quality and N base trimming at 5 prime end.

type: boolean

Specify to skip AdapterRemoval quality and N trimming at the ends of reads.

type: boolean

Specify AdapterRemoval minimum base quality for trimming off bases.

type: integer

default: 20

Specify to skip AdapterRemoval N trimming (quality trimming only).

type: boolean

Specify the AdapterRemoval minimum adapter overlap required for trimming.

type: integer

default: 1

Specify the AdapterRemoval maximum Phred score used in input FASTQ files.

type: integer

default: 41

Options for aligning reads against reference genome(s)

Specify to turn on FASTQ sharding.

type: boolean

Specify the number of reads in each shard when splitting.

type: integer

default: 1000000

Specify which mapper to use.

type: string

Specify the amount of allowed mismatches in the alignment for mapping with BWA aln.

type: number

default: 0.01

Specify the maximum edit distance allowed in a seed for mapping with BWA aln.

type: integer

default: 2

Specify the length of seeds to be used for BWA aln.

type: integer

default: 1024

Specify the number of gaps allowed for alignment with BWA aln.

type: integer

default: 2

Specify the minimum seed length for alignment with BWA mem.

type: integer

default: 19

Specify the re-seeding threshold for alignment with BWA mem.

type: number

default: 1.5

Specify the Bowtie 2 alignment mode.

type: string

Specify the level of sensitivity for the Bowtie 2 alignment mode.

type: string

Specify the number of mismatches in seed for alignment with Bowtie 2.

type: integer

Specify the length of seed substrings for Bowtie 2.

type: integer

default: 20

Specify the number of bases to trim off from 5 prime end of read before alignment with Bowtie 2.

type: integer

Specify the number of bases to trim off from 3 prime end of read before alignment with Bowtie 2.

type: integer

Specify the maximum fragment length for Bowtie2 paired-end mapping mode only.

type: integer

default: 500

Turn on to remove reads that did not map to the circularised genome.

type: boolean

Specify the -p parameter for mapAD, i.e. mismatch budget for the alignment.

type: number

default: 0.03

Specify the -f parameter for mapAD, which sets the 5’-overhang length parameter of mapAD’s aDNA scoring model.

type: number

default: 0.5

Specify the -t parameter for mapAD, which sets the 3’-overhang length parameter of mapAD’s aDNA scoring model.

type: number

default: 0.5

Specify the -d parameter for mapAD, which sets the double-stranded deamination rate parameter of mapAD’s aDNA scoring model.

type: number

default: 0.02

Specify the -s parameter for mapAD, which sets the single-stranded deamination rate parameter of mapAD’s aDNA scoring model.

type: number

default: 1

Specify the -i parameter for mapAD, which adjusts the expected indel rate between reads and reference.

type: number

default: 0.001

Specify the -x parameter for mapAD, which adjusts the gap extension penalty.

type: number

default: 0.5

Specify the —gap_dist_ends parameter for mapAD, which defines the distance from the read ends in which no gaps are permitted.

type: number

default: 5

Specify the number of gaps allowed for alignment with mapAD.

type: number

default: 2

Specify the base error rate parameter for mapAD.

type: number

default: 0.02

Set the —ignore_base_quality flag, which instructs mapAD to ignore base quality values in its scoring model.

type: boolean

Set the —no_search_limit_recovery flag, which instructs mapAD to abort instead of trying to recover from full internal data structures.

type: boolean

Options related to length, quality, and map status filtering of reads.

Specify to turn on filtering of reads in BAM files after mapping. By default, only mapped reads retained.

type: boolean

Specify the minimum read length mapped reads should have for downstream genomic analysis.

type: integer

Specify the minimum mapping quality reads should have for downstream genomic analysis.

type: integer

Specify the SAM format flag of reads to remove during BAM filtering for downstream genomic steps.

type: integer

default: 4

Specify to retain unmapped reads in the BAM file used for downstream genomic analyses.

type: boolean

Specify to generate FASTQ files from the filtered BAM files.

type: boolean

Specify to save the intermediate filtered genomic BAM files in the results directory.

type: boolean

Options related to metagenomic screening.

Specify to turn on metagenomic screening of mapped, unmapped or all reads.

type: boolean

Specify which type of reads to use for metagenomic screening.

type: string

Specify to turn on saving of input for metagenomics.

type: boolean

Specify to run a complexity filter on the metagenomics input files before classification.

type: boolean

Specify to save FASTQ files containing the complexity-filtered reads before metagenomic classification.

type: boolean

Specify which tool to use for trimming, filtering or reformatting of FASTQ reads that go into metagenomics screening.

type: string

Specify the entropy threshold under which a sequencing read will be complexity-filtered out.

type: number

default: 0.3

Specify the complexity filter mode for PRINSEQ++.

type: string

Specify the minimum dust score for PRINTSEQ++ complexity filtering

type: number

default: 0.5

Specify which tool to use for metagenomic profiling and screening. Required if --run_metagenomics flagged.

type: string

Specify a databse directory or .tar.gz file of a database directory to run metagenomics profiling on. Required if --run_metagenomics flagged.

type: string

Turn on saving reads assigned by KrakenUniq or Kraken2

type: boolean

Turn on saving of KrakenUniq or Kraken2 per-read taxonomic assignment file

type: boolean

Specify how large to chunk database when loading into memory for KrakenUniq

type: string

default: 16G

Turn on saving minimizer information in the kraken2 report thus increasing to an eight column layout.

type: boolean

Specify which alignment mode to use for MALT.

type: string

Specify alignment method for MALT.

type: string

Percent identity value threshold for MALT.

type: integer

default: 85

Specify the percent for LCA algorithm for MALT (see MEGAN6 CE manual).

type: integer

default: 1

Specify whether to use percent or raw number of reads for minimum support required for taxon to be retained for MALT.

type: string

Specify the minimum percentage of reads a taxon of sample total is required to have to be retained for MALT.

type: number

default: 0.01

Specify a minimum number of reads a taxon of sample total is required to have to be retained in malt or kraken. Not compatible with —malt_min_support_mode ‘percent’.

type: integer

default: 1

Specify the maximum number of queries a read can have for MALT.

type: integer

default: 100

Specify the memory load method. Do not use ‘map’ with GPFS file systems for MALT as can be very slow.

type: string

Specify to also produce SAM alignment files. Note this includes both aligned and unaligned reads, and are gzipped. Note this will result in very large file sizes.

type: boolean

Define how many fastq files should be submitted in the same malt run. Default value of 0 runs all files at once.

type: integer

Activate post-processing of metagenomics profiling tool selected.

type: boolean

Path to a text file with taxa of interest (one taxon per row, NCBI taxonomy name format)

type: string

Path to directory containing containing NCBI resource files (ncbi.tre and ncbi.map; available: https://github.com/rhuebler/HOPS/)

type: string

Specify which MaltExtract filter to use.

type: string

Specify percent of top alignments to use.

type: number

default: 0.01

Turn off destacking.

type: boolean

Turn off downsampling.

type: boolean

Turn off duplicate removal.

type: boolean

Turn on exporting alignments of hits in BLAST format.

type: boolean

Turn on export of MEGAN summary files.

type: boolean

Minimum percent identity alignments are required to have to be reported as candidate reads. Recommended to set same as MALT parameter.

type: number

default: 85

Turn on using top alignments per read after filtering.

type: boolean

Options for removal of PCR duplicates

Specify to skip the removal of PCR duplicates.

type: boolean

Specify which tool to use for deduplication.

type: string

Options for filtering for, trimming or rescaling characteristic ancient DNA damage patterns

Specify to turn on damage rescaling of BAM files using mapDamage2 to probabilistically remove damage.

type: boolean

Specify the length of read sequence to use from each side for rescaling.

type: integer

default: 12

Specify the length of read for mapDamage2 to rescale from 5 prime end.

type: integer

Specify the length of read for mapDamage2 to rescale from 3 prime end.

type: integer

Specify to turn on PMDtools filtering.

type: boolean

Specify PMD score threshold for PMDtools.

type: integer

default: 3

Specify a masked FASTA file with positions to be used with PMDtools.

type: string

pattern: ^\S+\.fa?(\sta)$

Specify a BED file to be used to mask the reference FASTA prior to running PMDtools.

type: string

pattern: ^\S+\.bed?(\.gz)$

Specify to turn on BAM trimming for non-UDG or half-UDG libraries.

type: boolean

Specify the number of bases to clip off reads from ‘left’ (5 prime) end of reads for double-stranded non-UDG libraries.

type: integer

Specify the number of bases to clip off reads from ‘right’ (3 prime) end of reads for double-stranded non-UDG libraries.

type: integer

Specify the number of bases to clip off reads from ‘left’ (5 prime) end of read for double-stranded half-UDG libraries.

type: integer

Specify the number of bases to clip off reads from ‘right’ (3 prime) end of read for double-stranded half-UDG libraries.

type: integer

Specify the number of bases to clip off reads from ‘left’ (5 prime) end of read for single-stranded non-UDG libraries.

type: integer

Specify the number of bases to clip off reads from ‘right’ (3 prime) end of read for single-stranded non-UDG libraries.

type: integer

Specify the number of bases to clip off reads from ‘left’ (5 prime) end of read for single-stranded half-UDG libraries.

type: integer

Specify the number of bases to clip off reads from ‘right’ (3 prime) end of read for single-stranded half-UDG libraries.

type: integer

Specify to turn on soft-trimming instead of hard masking.

type: boolean

Options for variant calling

Specify to turn on genotyping of BAM files.

type: boolean

Specify which input BAM to use for genotyping.

type: string

Specify which genotyper to use.

type: string

Specify to skip generation of VCF-based variant calling statistics with bcftools.

type: boolean

Specify the ploidy of the reference organism.

type: integer

default: 2

Specify the base mapping quality to be used for genotyping with pileupCaller.

type: integer

default: 30

Specify the minimum mapping quality to be used for genotyping with pileupCaller.

type: integer

default: 30

Specify the path to SNP panel in BED format for pileupCaller.

type: string

Specify the path to SNP panel in EIGENSTRAT format for pileupCaller.

type: string

Specify the SNP calling method to use for genotyping with pileupCaller.

type: string

Specify the calling mode for transitions with pileupCaller.

type: string

Specify GATK phred-scaled confidence threshold.

type: integer

default: 30

Specify VCF file for SNP annotation of output VCF files for GATK.

type: string

pattern: ^\S+\.vcf$

Specify the maximum depth coverage allowed for genotyping with GATK before down-sampling is turned on.

type: integer

default: 250

Specify GATK UnifiedGenotyper output mode.

type: string

Specify UnifiedGenotyper likelihood model.

type: string

Specify to keep the BAM output of re-alignment around variants from GATK UnifiedGenotyper.

type: boolean

Specify to supply a default base quality if a read is missing a base quality score.

type: integer

default: -1

Specify GATK HaplotypeCaller output mode.

type: string

Specify HaplotypeCaller mode for emitting reference confidence calls.

type: string

Specify minimum required supporting observations of an alternate allele to consider a variant in FreeBayes.

type: integer

default: 1

Specify to skip over regions of high depth by discarding alignments overlapping positions where total read depth is greater than specified in FreeBayes.

type: integer

Specify which ANGSD genotyping likelihood model to use.

type: string

Specify the formatting of the output VCF for ANGSD genotype likelihood results.

type: string

Options for the calculation of ratio of reads to one chromosome/FASTA entry against all others.

Specify to turn on mitochondrial to nuclear ratio calculation.

type: boolean

Specify the name of the reference FASTA entry corresponding to the mitochondrial genome.

type: string

default: MT

Options for the calculation of mapping statistics

Specify to turn off the computation of library complexity estimation with preseq.

type: boolean

Specify which mode of preseq to run.

type: string

Specify the step size (i.e., sampling regularity) of preseq.

type: integer

default: 1000

Specify the maximum number of terms that preseq’s lc_extrap mode will use.

type: integer

default: 100

Specify the maximum extrapolation to use for preseq’s lc_extrap mode.

type: integer

default: 10000000000

Specify number of bootstraps to perform in preseq’s lc_extrap mode.

type: integer

default: 100

Specify confidence interval level for preseq’s lc_extrap mode.

type: number

default: 0.95

Specify to turn on preseq defects mode to extrapolate without testing for defects in lc_extrap mode.

type: boolean

Specify to turn off coverage calculation with Qualimap.

type: boolean

Specify path to SNP capture positions in BED format for coverage calculations with Qualimap.

type: string

Options for calculating and filtering for characteristic ancient DNA damage patterns.

Specify to turn off ancient DNA damage calculation.

type: boolean

Specify the tool to use for damage calculation.

type: string

Specify the maximum misincorporation frequency that should be displayed on damage plot.

type: number

default: 0.3

Specify number of bases of each read to be considered for plotting damage estimation.

type: integer

default: 25

Specify the length filter for DamageProfiler.

type: integer

default: 100

Specify the maximum number of reads to consider for damage calculation with mapDamage.

type: integer

Options for calculating reference annotation statistics (e.g. gene coverages)

Specify to turn on calculation of number of reads, depth and breadth coverage of features in reference with bedtools.

type: boolean

Specify path to GFF or BED file containing positions of features in reference file for bedtools.

type: string

Options for removing host-mapped reads

Specify to turn on creation of pre-adapter-removal and/or read-pair-merging FASTQ files without reads that mapped to reference (e.g. for public upload of privacy sensitive non-host data).

type: boolean

Specify the host-mapped read removal mode.

type: string

Options for the estimation of contamination in human data

Specify to turn on nuclear contamination estimation for genomes with ANGSD.

type: boolean

Specify the name of the chromosome to be used for contamination estimation with ANGSD.

type: string

default: X

Specify the first position on the chromosome to be used for contamination estimation with ANGSD.

type: integer

default: 5000000

Specify the last position on the chromosome to be used for contamination estimation with ANGSD.

type: integer

default: 154900000

Specify the minimum mapping quality reads should have for contamination estimation with ANGSD.

type: integer

default: 30

Specify the minimum base quality reads should have for contamination estimation with ANGSD.

type: integer

default: 30

Specify path to HapMap file of chromosome for contamination estimation with ANGSD.

type: string

default: ${projectDir}/assets/angsd_resources/HapMapChrX.gz

Options for the calculation of genetic sex of human individuals.

Specify to turn on sex determination for genomes mapped to human reference genomes with Sex.DetERRmine.

type: boolean

Specify path to SNP panel in BED format for error bar calculation.

type: string

nf-core/eager