nf-core/eager
A fully reproducible and state-of-the-art ancient DNA analysis pipeline
22.10.6
Define where the pipeline should find input data and save output data.
Path to tab- or comma-separated file containing information about the samples in the experiment.
string
^\S+\.(c|t)sv$
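The pattern above can be tried out before launching a run. The following is a minimal sketch in Python, assuming the pattern is applied to the whole path string; the helper name `is_valid_samplesheet_path` and the example paths are hypothetical and not part of the pipeline.

```python
import re

# Pattern copied from the parameter description above.
SAMPLESHEET_PATTERN = re.compile(r"^\S+\.(c|t)sv$")


def is_valid_samplesheet_path(path: str) -> bool:
    """Return True if the path contains no whitespace and ends in .csv or .tsv."""
    return SAMPLESHEET_PATTERN.match(path) is not None


if __name__ == "__main__":
    # Hypothetical example paths, for illustration only.
    for candidate in ["samplesheet.tsv", "data/samples.csv", "my samples.csv", "samplesheet.txt"]:
        print(candidate, is_valid_samplesheet_path(candidate))
```

Note that the pattern rejects any path containing whitespace, not just paths with the wrong extension.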
Specify to convert input BAM files back to FASTQ for remapping
boolean
The output directory where the results will be saved. You have to use absolute paths to storage on Cloud infrastructure.
string
Email address for completion summary.
string
^([a-zA-Z0-9_\-\.]+)@([a-zA-Z0-9_\-\.]+)\.([a-zA-Z]{2,5})$
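The email pattern above is stricter than general email syntax. A minimal sketch in Python of what it accepts and rejects, assuming the pattern is matched against the full address; the helper name `matches_email_pattern` and the addresses are hypothetical.

```python
import re

# Pattern copied from the parameter description above.
EMAIL_PATTERN = re.compile(r"^([a-zA-Z0-9_\-\.]+)@([a-zA-Z0-9_\-\.]+)\.([a-zA-Z]{2,5})$")


def matches_email_pattern(address: str) -> bool:
    """Return True if the address satisfies the schema pattern."""
    return EMAIL_PATTERN.match(address) is not None


if __name__ == "__main__":
    # Hypothetical addresses, for illustration only.
    for address in ["user@example.org", "user+tag@example.org", "user@example.technology"]:
        print(address, matches_email_pattern(address))
```

Two consequences of the pattern as written: a `+` in the local part is not allowed, and the final domain label must be 2 to 5 letters, so longer top-level domains do not match.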
MultiQC report title. Printed as page header, used for filename if not otherwise specified.
string
Reference genome related files and options required for the workflow.
Path to FASTA file of the reference genome.
string
^\S+\.fn?a(sta)?(\.gz)?$
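The FASTA pattern above determines which file extensions are accepted. A minimal sketch in Python, assuming the pattern is applied to the whole path; the helper name `is_accepted_fasta_path` and the file names are hypothetical.

```python
import re

# Pattern copied from the parameter description above.
FASTA_PATTERN = re.compile(r"^\S+\.fn?a(sta)?(\.gz)?$")


def is_accepted_fasta_path(path: str) -> bool:
    """Return True if the path ends in an extension the pattern allows."""
    return FASTA_PATTERN.match(path) is not None


if __name__ == "__main__":
    # Hypothetical file names, for illustration only.
    for name in ["genome.fa", "genome.fna", "genome.fasta.gz", "genome.fas", "genome.fasta.bz2"]:
        print(name, is_accepted_fasta_path(name))
```

In this sketch `.fa`, `.fna` and `.fasta` match, each optionally followed by `.gz`, while `.fas` and other compression suffixes do not.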
Specify path to samtools FASTA index.
string
Specify path to Picard sequence dictionary file.
string
Specify path to directory containing index files of the FASTA for a given mapper.
string
Specify to generate ‘.csi’ BAM indices instead of ‘.bai’ for larger reference genomes.
boolean
Specify to save any pipeline-generated reference genome indices in the results directory.
boolean
Path to a tab-/comma-separated file containing reference-specific files.
string
^\S+\.(c|t)sv$
Name of iGenomes reference.
string
Directory / URL base for iGenomes references.
string
s3://ngi-igenomes/igenomes/
Do not load the iGenomes reference config.
boolean
Specify the FASTA header of the extended chromosome when using circularmapper.
string
Specify the number of bases to extend reference by (circularmapper only).
integer
500
Specify an elongated reference FASTA to be used for circularmapper.
string
Specify a samtools index for the elongated FASTA file.
string
Parameters used to describe centralised config profiles. These should not be edited.
Git commit id for Institutional configs.
string
master
Base directory for Institutional configs.
string
https://raw.githubusercontent.com/nf-core/configs/master
Institutional config name.
string
Institutional config description.
string
Institutional config contact information.
string
Institutional config URL link.
string
Less common options for the pipeline, typically set in a config file.
Display version and exit.
boolean
Method used to save pipeline results to output directory.
string
Email address for completion summary, only when pipeline fails.
string
^([a-zA-Z0-9_\-\.]+)@([a-zA-Z0-9_\-\.]+)\.([a-zA-Z]{2,5})$
Send plain-text email instead of HTML.
boolean
File size limit when attaching MultiQC reports to summary emails.
string
25.MB
^\d+(\.\d+)?\.?\s*(K|M|G|T)?B$
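The size pattern above accepts the default `25.MB` as well as a few other spellings. A minimal sketch in Python, assuming the pattern is matched against the full value; the helper name `is_valid_size` and the test values are hypothetical.

```python
import re

# Pattern copied from the parameter description above.
SIZE_PATTERN = re.compile(r"^\d+(\.\d+)?\.?\s*(K|M|G|T)?B$")


def is_valid_size(value: str) -> bool:
    """Return True if the value is a number followed by an upper-case unit ending in B."""
    return SIZE_PATTERN.match(value) is not None


if __name__ == "__main__":
    # Hypothetical values, for illustration only.
    for value in ["25.MB", "25 MB", "1.5GB", "512B", "25", "25mb"]:
        print(repr(value), is_valid_size(value))
```

The unit is case-sensitive and the trailing `B` is mandatory, so `25` and `25mb` are rejected.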
Do not use coloured log outputs.
boolean
Incoming hook URL for messaging service
string
Custom config file to supply to MultiQC.
string
Custom logo file to supply to MultiQC. File name must also be set in the MultiQC config file
string
Custom MultiQC yaml file containing HTML including a methods description.
string
Boolean whether to validate parameters against the schema at runtime
boolean
true
Base URL or local path to location of pipeline test dataset files
string
https://raw.githubusercontent.com/nf-core/test-datasets/
Suffix to add to the trace report filename. Default is the date and time in the format yyyy-MM-dd_HH-mm-ss.
string
Removal of adapters, paired-end merging, poly-G removal, etc.
Specify which tool to use for sequencing quality control.
string
Specify to skip all preprocessing steps (adapter removal, paired-end merging, poly-G trimming, etc).
boolean
Specify which preprocessing tool to use.
string
Specify to skip read-pair merging.
boolean
Specify to exclude read-pairs that did not overlap sufficiently for merging (i.e., keep merged reads only).
boolean
Specify to skip removal of adapters.
boolean
Specify the nucleotide sequence for the forward read/R1.
string
Specify the nucleotide sequence for the reverse read/R2.
string
Specify a list of all possible adapters to trim.
string
Specify the minimum length reads must have to be retained.
integer
25
Specify number of bases to hard-trim from 5 prime or front of reads.
integer
Specify number of bases to hard-trim from 3 prime or tail of reads.
integer
Specify to save the preprocessed reads in the results directory.
boolean
Specify to turn on sequence complexity filtering of reads.
boolean
Specify the complexity threshold that must be reached or exceeded to retain reads.
integer
10
Skip AdapterRemoval quality and N base trimming at 5 prime end.
boolean
Specify to skip AdapterRemoval quality and N trimming at the ends of reads.
boolean
Specify AdapterRemoval minimum base quality for trimming off bases.
integer
20
Specify to skip AdapterRemoval N trimming (quality trimming only).
boolean
Specify the AdapterRemoval minimum adapter overlap required for trimming.
integer
1
Specify the AdapterRemoval maximum Phred score used in input FASTQ files.
integer
41
Options for aligning reads against reference genome(s)
Specify to turn on FASTQ sharding.
boolean
Specify the number of reads in each shard when splitting.
integer
1000000
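To get a feel for the shard size, the number of shards per FASTQ file can be estimated from the read count. A minimal sketch in Python, assuming shards are filled sequentially and the last shard holds the remainder; the function name `estimate_shard_count` is hypothetical and the pipeline's actual splitting behaviour may differ.

```python
def estimate_shard_count(total_reads: int, reads_per_shard: int = 1_000_000) -> int:
    """Estimate how many shards a file with total_reads reads is split into.

    Assumes every shard except possibly the last holds reads_per_shard reads.
    """
    if total_reads < 0:
        raise ValueError("total_reads must not be negative")
    if reads_per_shard <= 0:
        raise ValueError("reads_per_shard must be positive")
    # Integer ceiling division avoids floating-point rounding on large counts.
    return (total_reads + reads_per_shard - 1) // reads_per_shard


if __name__ == "__main__":
    # Hypothetical read counts, for illustration only.
    for reads in [0, 999_999, 1_000_000, 2_500_000]:
        print(reads, estimate_shard_count(reads))
```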
Specify which mapper to use.
string
Specify the amount of allowed mismatches in the alignment for mapping with BWA aln.
number
0.01
Specify the maximum edit distance allowed in a seed for mapping with BWA aln.
integer
2
Specify the length of seeds to be used for BWA aln.
integer
1024
Specify the number of gaps allowed for alignment with BWA aln.
integer
2
Specify the minimum seed length for alignment with BWA mem.
integer
19
Specify the re-seeding threshold for alignment with BWA mem.
number
1.5
Specify the Bowtie 2 alignment mode.
string
Specify the level of sensitivity for the Bowtie 2 alignment mode.
string
Specify the number of mismatches in seed for alignment with Bowtie 2.
integer
Specify the length of seed substrings for Bowtie 2.
integer
20
Specify the number of bases to trim off from 5 prime end of read before alignment with Bowtie 2.
integer
Specify the number of bases to trim off from 3 prime end of read before alignment with Bowtie 2.
integer
Specify the maximum fragment length for Bowtie 2 paired-end mapping mode only.
integer
500
Turn on to remove reads that did not map to the circularised genome.
boolean
Specify the -p parameter for mapAD, i.e. mismatch budget for the alignment.
number
0.03
Specify the -f parameter for mapAD, which sets the 5’-overhang length parameter of mapAD’s aDNA scoring model.
number
0.5
Specify the -t parameter for mapAD, which sets the 3’-overhang length parameter of mapAD’s aDNA scoring model.
number
0.5
Specify the -d parameter for mapAD, which sets the double-stranded deamination rate parameter of mapAD’s aDNA scoring model.
number
0.02
Specify the -s parameter for mapAD, which sets the single-stranded deamination rate parameter of mapAD’s aDNA scoring model.
number
1
Specify the -i parameter for mapAD, which adjusts the expected indel rate between reads and reference.
number
0.001
Specify the -x parameter for mapAD, which adjusts the gap extension penalty.
number
0.5
Specify the --gap_dist_ends parameter for mapAD, which defines the distance from the read ends in which no gaps are permitted.
number
5
Specify the number of gaps allowed for alignment with mapAD.
number
2
Specify the base error rate parameter for mapAD.
number
0.02
Set the --ignore_base_quality flag, which instructs mapAD to ignore base quality values in its scoring model.
boolean
Set the --no_search_limit_recovery flag, which instructs mapAD to abort instead of trying to recover from full internal data structures.
boolean
Options related to length, quality, and map status filtering of reads.
Specify to turn on filtering of reads in BAM files after mapping. By default, only mapped reads are retained.
boolean
Specify the minimum read length mapped reads should have for downstream genomic analysis.
integer
Specify the minimum mapping quality reads should have for downstream genomic analysis.
integer
Specify the SAM format flag of reads to remove during BAM filtering for downstream genomic steps.
integer
4
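The default value 4 corresponds to the "read unmapped" bit (0x4) of the SAM FLAG field. A minimal sketch in Python of how a flag-based filter decides, assuming a read is removed when any bit of the given value is set in its FLAG (the behaviour of `samtools view -F`); whether the pipeline applies the value exactly this way is an assumption, and the function name `is_removed` is hypothetical.

```python
# SAM FLAG bit for an unmapped read, as defined in the SAM specification.
UNMAPPED = 0x4


def is_removed(read_flag: int, remove_flag: int = UNMAPPED) -> bool:
    """Return True if any bit of remove_flag is set in the read's FLAG."""
    return (read_flag & remove_flag) != 0


if __name__ == "__main__":
    # Example FLAG values:
    #   0  = mapped, forward strand
    #   4  = unmapped
    #   16 = mapped, reverse strand
    #   77 = paired (1) + unmapped (4) + mate unmapped (8) + first in pair (64)
    for flag in [0, 4, 16, 77]:
        print(flag, is_removed(flag))
```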
Specify to retain unmapped reads in the BAM file used for downstream genomic analyses.
boolean
Specify to generate FASTQ files from the filtered BAM files.
boolean
Specify to save the intermediate filtered genomic BAM files in the results directory.
boolean
Options related to metagenomic screening.
Specify to turn on metagenomic screening of mapped, unmapped or all reads.
boolean
Specify which type of reads to use for metagenomic screening.
string
Specify to turn on saving of input for metagenomics.
boolean
Specify to run a complexity filter on the metagenomics input files before classification.
boolean
Specify to save FASTQ files containing the complexity-filtered reads before metagenomic classification.
boolean
Specify which tool to use for trimming, filtering or reformatting of FASTQ reads that go into metagenomics screening.
string
Specify the entropy threshold under which a sequencing read will be complexity-filtered out.
number
0.3
Specify the complexity filter mode for PRINSEQ++.
string
Specify the minimum dust score for PRINSEQ++ complexity filtering.
number
0.5
Specify which tool to use for metagenomic profiling and screening. Required if --run_metagenomics is flagged.
string
Specify a database directory or .tar.gz file of a database directory to run metagenomics profiling on. Required if --run_metagenomics is flagged.
string
Turn on saving reads assigned by KrakenUniq or Kraken2
boolean
Turn on saving of KrakenUniq or Kraken2 per-read taxonomic assignment file
boolean
Specify how large to chunk database when loading into memory for KrakenUniq
string
16G
Turn on saving minimizer information in the Kraken2 report, which increases it to an eight-column layout.
boolean
Specify which alignment mode to use for MALT.
string
Specify alignment method for MALT.
string
Percent identity value threshold for MALT.
integer
85
Specify the percent for LCA algorithm for MALT (see MEGAN6 CE manual).
integer
1
Specify whether to use percent or raw number of reads for minimum support required for taxon to be retained for MALT.
string
Specify the minimum percentage of reads a taxon of sample total is required to have to be retained for MALT.
number
0.01
Specify a minimum number of reads a taxon of sample total is required to have to be retained in MALT or Kraken. Not compatible with --malt_min_support_mode ‘percent’.
integer
1
Specify the maximum number of queries a read can have for MALT.
integer
100
Specify the memory load method. Do not use ‘map’ with GPFS file systems for MALT, as it can be very slow.
string
Specify to also produce SAM alignment files. Note that these include both aligned and unaligned reads and are gzipped. This will result in very large file sizes.
boolean
Define how many FASTQ files should be submitted in the same MALT run. Default value of 0 runs all files at once.
integer
Activate post-processing of metagenomics profiling tool selected.
boolean
Path to a text file with taxa of interest (one taxon per row, NCBI taxonomy name format)
string
Path to directory containing NCBI resource files (ncbi.tre and ncbi.map; available: https://github.com/rhuebler/HOPS/)
string
Specify which MaltExtract filter to use.
string
Specify percent of top alignments to use.
number
0.01
Turn off destacking.
boolean
Turn off downsampling.
boolean
Turn off duplicate removal.
boolean
Turn on exporting alignments of hits in BLAST format.
boolean
Turn on export of MEGAN summary files.
boolean
Minimum percent identity alignments are required to have to be reported as candidate reads. Recommended to set same as MALT parameter.
number
85
Turn on using top alignments per read after filtering.
boolean
Options for removal of PCR duplicates
Specify to skip the removal of PCR duplicates.
boolean
Specify which tool to use for deduplication.
string
Options for filtering for, trimming or rescaling characteristic ancient DNA damage patterns
Specify to turn on damage rescaling of BAM files using mapDamage2 to probabilistically remove damage.
boolean
Specify the length of read sequence to use from each side for rescaling.
integer
12
Specify the length of read for mapDamage2 to rescale from 5 prime end.
integer
Specify the length of read for mapDamage2 to rescale from 3 prime end.
integer
Specify to turn on PMDtools filtering.
boolean
Specify PMD score threshold for PMDtools.
integer
3
Specify a masked FASTA file with positions to be used with PMDtools.
string
^\S+\.fa?(\sta)$
Specify a BED file to be used to mask the reference FASTA prior to running PMDtools.
string
^\S+\.bed?(\.gz)$
Specify to turn on BAM trimming for non-UDG or half-UDG libraries.
boolean
Specify the number of bases to clip off reads from ‘left’ (5 prime) end of reads for double-stranded non-UDG libraries.
integer
Specify the number of bases to clip off reads from ‘right’ (3 prime) end of reads for double-stranded non-UDG libraries.
integer
Specify the number of bases to clip off reads from ‘left’ (5 prime) end of read for double-stranded half-UDG libraries.
integer
Specify the number of bases to clip off reads from ‘right’ (3 prime) end of read for double-stranded half-UDG libraries.
integer
Specify the number of bases to clip off reads from ‘left’ (5 prime) end of read for single-stranded non-UDG libraries.
integer
Specify the number of bases to clip off reads from ‘right’ (3 prime) end of read for single-stranded non-UDG libraries.
integer
Specify the number of bases to clip off reads from ‘left’ (5 prime) end of read for single-stranded half-UDG libraries.
integer
Specify the number of bases to clip off reads from ‘right’ (3 prime) end of read for single-stranded half-UDG libraries.
integer
Specify to turn on soft-trimming instead of hard masking.
boolean
Options for variant calling
Specify to turn on genotyping of BAM files.
boolean
Specify which input BAM to use for genotyping.
string
Specify which genotyper to use.
string
Specify to skip generation of VCF-based variant calling statistics with bcftools.
boolean
Specify the ploidy of the reference organism.
integer
2
Specify the base mapping quality to be used for genotyping with pileupCaller.
integer
30
Specify the minimum mapping quality to be used for genotyping with pileupCaller.
integer
30
Specify the path to SNP panel in BED format for pileupCaller.
string
Specify the path to SNP panel in EIGENSTRAT format for pileupCaller.
string
Specify the SNP calling method to use for genotyping with pileupCaller.
string
Specify the calling mode for transitions with pileupCaller.
string
Specify GATK phred-scaled confidence threshold.
integer
30
Specify VCF file for SNP annotation of output VCF files for GATK.
string
^\S+\.vcf$
Specify the maximum depth coverage allowed for genotyping with GATK before down-sampling is turned on.
integer
250
Specify GATK UnifiedGenotyper output mode.
string
Specify UnifiedGenotyper likelihood model.
string
Specify to keep the BAM output of re-alignment around variants from GATK UnifiedGenotyper.
boolean
Specify to supply a default base quality if a read is missing a base quality score.
integer
-1
Specify GATK HaplotypeCaller output mode.
string
Specify HaplotypeCaller mode for emitting reference confidence calls.
string
Specify minimum required supporting observations of an alternate allele to consider a variant in FreeBayes.
integer
1
Specify to skip over regions of high depth by discarding alignments overlapping positions where total read depth is greater than specified in FreeBayes.
integer
Specify which ANGSD genotyping likelihood model to use.
string
Specify the formatting of the output VCF for ANGSD genotype likelihood results.
string
Options for the calculation of ratio of reads to one chromosome/FASTA entry against all others.
Specify to turn on mitochondrial to nuclear ratio calculation.
boolean
Specify the name of the reference FASTA entry corresponding to the mitochondrial genome.
string
MT
Options for the calculation of mapping statistics
Specify to turn off the computation of library complexity estimation with preseq.
boolean
Specify which mode of preseq to run.
string
Specify the step size (i.e., sampling regularity) of preseq.
integer
1000
Specify the maximum number of terms that preseq’s lc_extrap mode will use.
integer
100
Specify the maximum extrapolation to use for preseq’s lc_extrap mode.
integer
10000000000
Specify number of bootstraps to perform in preseq’s lc_extrap mode.
integer
100
Specify confidence interval level for preseq’s lc_extrap mode.
number
0.95
Specify to turn on preseq defects mode to extrapolate without testing for defects in lc_extrap mode.
boolean
Specify to turn off coverage calculation with Qualimap.
boolean
Specify path to SNP capture positions in BED format for coverage calculations with Qualimap.
string
Options for calculating and filtering for characteristic ancient DNA damage patterns.
Specify to turn off ancient DNA damage calculation.
boolean
Specify the tool to use for damage calculation.
string
Specify the maximum misincorporation frequency that should be displayed on damage plot.
number
0.3
Specify number of bases of each read to be considered for plotting damage estimation.
integer
25
Specify the length filter for DamageProfiler.
integer
100
Specify the maximum number of reads to consider for damage calculation with mapDamage.
integer
Options for calculating reference annotation statistics (e.g. gene coverages)
Specify to turn on calculation of number of reads, depth and breadth coverage of features in reference with bedtools.
boolean
Specify path to GFF or BED file containing positions of features in reference file for bedtools.
string
Options for removing host-mapped reads
Specify to turn on creation of pre-adapter-removal and/or read-pair-merging FASTQ files without reads that mapped to reference (e.g. for public upload of privacy sensitive non-host data).
boolean
Specify the host-mapped read removal mode.
string
Options for the estimation of contamination in human data
Specify to turn on nuclear contamination estimation for genomes with ANGSD.
boolean
Specify the name of the chromosome to be used for contamination estimation with ANGSD.
string
X
Specify the first position on the chromosome to be used for contamination estimation with ANGSD.
integer
5000000
Specify the last position on the chromosome to be used for contamination estimation with ANGSD.
integer
154900000
Specify the minimum mapping quality reads should have for contamination estimation with ANGSD.
integer
30
Specify the minimum base quality reads should have for contamination estimation with ANGSD.
integer
30
Specify path to HapMap file of chromosome for contamination estimation with ANGSD.
string
${projectDir}/assets/angsd_resources/HapMapChrX.gz
Options for the calculation of genetic sex of human individuals.
Specify to turn on sex determination for genomes mapped to human reference genomes with Sex.DetERRmine.
boolean
Specify path to SNP panel in BED format for error bar calculation.
string