Define where the pipeline should find input data and save output data.

Path to comma-separated file containing information about the samples in the experiment.

required
type: string
pattern: ^\S+\.csv$
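
As a rough illustration of what the pattern above accepts, the following Python sketch checks a few candidate paths. The helper name `is_valid_samplesheet` and the example paths are hypothetical; only the regular expression is taken from the entry above.

```python
import re

# Pattern copied from the entry above: no whitespace allowed, must end in ".csv".
SAMPLESHEET_PATTERN = re.compile(r"^\S+\.csv$")


def is_valid_samplesheet(path: str) -> bool:
    """Return True when the path satisfies the pattern (hypothetical helper)."""
    return SAMPLESHEET_PATTERN.match(path) is not None


print(is_valid_samplesheet("/data/run1/samplesheet.csv"))  # True
print(is_valid_samplesheet("my samples.csv"))              # False, contains a space
print(is_valid_samplesheet("samplesheet.tsv"))             # False, wrong extension
```

Paths containing spaces are rejected by this pattern, so it is safest to avoid whitespace in the samplesheet location.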

The output directory where the results will be saved. Use absolute paths when the storage is on cloud infrastructure.

required
type: string

Sample metadata that is included in the MultiQC report

type: string
pattern: ^\S+\.[tc]sv$

Email address for completion summary.

type: string
pattern: ^([a-zA-Z0-9_\-\.]+)@([a-zA-Z0-9_\-\.]+)\.([a-zA-Z]{2,5})$
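
The email pattern above is stricter than it may look. The sketch below, with a hypothetical helper `is_accepted_email` and made-up addresses, shows which addresses it accepts. Only the regular expression comes from the entry above.

```python
import re

# Pattern copied from the entry above.
EMAIL_PATTERN = re.compile(
    r"^([a-zA-Z0-9_\-\.]+)@([a-zA-Z0-9_\-\.]+)\.([a-zA-Z]{2,5})$"
)


def is_accepted_email(address: str) -> bool:
    """Return True when the address satisfies the pattern (hypothetical helper)."""
    return EMAIL_PATTERN.match(address) is not None


print(is_accepted_email("user@example.com"))         # True
print(is_accepted_email("first.last@lab.uni.edu"))   # True
print(is_accepted_email("user+tag@example.com"))     # False, "+" is not allowed
print(is_accepted_email("user@example.museum"))      # False, final label longer than 5 letters
```

Addresses with a plus sign or a top-level domain longer than five letters do not pass this pattern, even though they can be valid email addresses.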

Options for the read trimming, low-complexity filtering and host removal steps

Skip read preprocessing and use input reads for downstream analysis

type: boolean

Skip read quality statistics summary tool ‘fastqc’

type: boolean

Save reads after the final preprocessing step

type: boolean
default: true

Save reads after every preprocessing step

type: boolean

With or without UMI detection

type: boolean

With or without UMI extraction

type: boolean
default: true

Specify at what level UMI deduplication should occur.

type: string

Discard R1 or R2 if required. A value of 0 means that no read is discarded.

type: integer

Trimming tool to use

type: string

Skip read trimming

type: boolean

Fasta file of adapters

type: string

Specify true to save files that failed to pass trimming thresholds ending in *.fail.fastq.gz

type: boolean

Specify true to save all merged reads to a file ending in *.merged.fastq.gz

type: boolean

Inputs with fewer than this number of reads will be filtered out of the “reads” output channel

type: integer
default: 1

Skip filtering of low complexity regions in reads

type: boolean
default: true

Specify the decomplexifier to use, bbduk or prinseq

type: string

Reference files containing adapter and/or contaminant sequences for sequence kmer matching (used by bbduk)

type: string

Skip the removal of host read sequences

type: boolean

Kraken2 database used to remove host and contamination

type: string
default: s3://ngi-igenomes/test-data/viralrecon/kraken2_human.tar.gz

Kraken2 library or libraries required to remove host and contamination

type: string
default: human

Skip the fastqc step after host & contaminants were removed

type: boolean

Arguments for FastQC tool

type: string
default: --quiet

Arguments for Fastp tool

type: string
default: --cut_front --cut_tail --trim_poly_x --cut_mean_quality 30 --qualified_quality_phred 30 --unqualified_percent_limit 10 --length_required 50

Arguments for Trimmomatic tool

type: string
default: ILLUMINACLIP:null:2:30:10

Arguments for UMI-tools extract

type: string

Arguments for Humid tool

type: string
default: -a -m 1

Arguments for BBDuk tool

type: string
default: entropy=0.3 entropywindow=50 entropymask=f

Arguments for Prinseq tool for reads

type: string

Arguments for Kraken2 tool for host removal

type: string

Parameters used to determine the metagenomic diversity of the sample

Skip determining the metagenomic diversity of the sample

type: boolean

Specify the taxonomic read classifiers as a comma-separated list; the pattern below accepts ‘kaiju’, ‘kraken2’ and ‘bracken’

type: string
default: kraken2,kaiju
pattern: ^(kaiju|kraken2|bracken)(?:,(kaiju|kraken2|bracken)){0,2}$
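
To make the accepted values concrete, here is a minimal Python sketch that tests a few classifier strings against the pattern above. The helper name `is_valid_classifier_list` is hypothetical; the regular expression is copied from the entry.

```python
import re

# Pattern copied from the entry above: one to three names, comma-separated, no spaces.
CLASSIFIER_PATTERN = re.compile(
    r"^(kaiju|kraken2|bracken)(?:,(kaiju|kraken2|bracken)){0,2}$"
)


def is_valid_classifier_list(value: str) -> bool:
    """Return True when the value satisfies the pattern (hypothetical helper)."""
    return CLASSIFIER_PATTERN.match(value) is not None


print(is_valid_classifier_list("kraken2,kaiju"))                # True (the default)
print(is_valid_classifier_list("bracken"))                      # True
print(is_valid_classifier_list("kraken2, kaiju"))               # False, space after the comma
print(is_valid_classifier_list("kraken2,kaiju,bracken,kaiju"))  # False, more than three entries
print(is_valid_classifier_list("kraken2,kraken2"))              # True, duplicates are not rejected
```

Note that the pattern only checks the shape of the string; it does not prevent the same classifier from being listed twice.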

Save the used databases

type: boolean

Location of the Kraken2 database

type: string
default: https://genome-idx.s3.amazonaws.com/kraken/k2_viral_20230314.tar.gz

Save classified and unclassified reads as fastq files

type: boolean

Save summary overview of read classifications in a txt file

type: boolean

Save the minimizers used by Kraken2

type: boolean

Location of bracken database

type: string
default: https://genome-idx.s3.amazonaws.com/kraken/k2_viral_20230314.tar.gz

Location of Kaiju database

type: string
default: https://kaiju-idx.s3.eu-central-1.amazonaws.com/2023/kaiju_db_rvdb_2023-05-26.tgz

Taxonomic rank that needs to be determined

type: string

Arguments for Kraken2 tool

type: string
default: --report-minimizer-data

Arguments for Kaiju tool

type: string
default: -v

Arguments for Kaiju2Table tool

type: string
default: -e -l species

Arguments for Kaiju2Krona tool

type: string
default: -v -u

Arguments for Krona tool

type: string

Arguments for Bracken tool

type: string

Arguments for Kreport2Krona tool

type: string

Parameters relating to the used assembly methods

Skip de novo assembly of reads

type: boolean

The specified tools for de novo assembly, multiple options are possible

type: string
default: spades,megahit
pattern: ^(trinity|spades|megahit)(?:,(trinity|spades|megahit)){0,2}$

Specific SPAdes mode to run

type: string

File or directory with amino acid HMMs for Spades HMM-guided mode.

type: string

Path to yml file containing read information.

hidden
type: string

Skip the filtering of low complexity contigs with prinseq

type: boolean

Skip the contig extension with sspace_basic

type: boolean

Specify the mean distance between the paired reads

type: integer
default: 350

Specify the deviation of the mean distance that is allowed.

type: number
default: 0.75
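
As a worked example of how the mean distance and its allowed deviation combine, the sketch below assumes that the deviation is a fraction of the mean distance, so that accepted read-pair distances lie within mean ± mean × deviation. That interpretation and the function name `allowed_distance_range` are assumptions made for illustration.

```python
def allowed_distance_range(mean_distance: float, deviation: float) -> tuple[float, float]:
    """Return the (lowest, highest) accepted distance between paired reads.

    Assumes the deviation is expressed as a fraction of the mean distance.
    """
    margin = mean_distance * deviation
    return mean_distance - margin, mean_distance + margin


# Using the defaults listed above: mean distance 350, deviation 0.75.
low, high = allowed_distance_range(350, 0.75)
print(low, high)  # 87.5 612.5
```

Under this assumption the defaults are permissive: read pairs between 87.5 and 612.5 bases apart would be accepted.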

Specify the read orientation.

type: string
default: FR

Arguments for SPAdes tool

type: string
default: --rnaviral

Arguments for MEGAHIT tool

type: string

Arguments for Trinity tool

type: string
default: --max_reads_per_graph 100000

Arguments for QUAST tool

type: string
default: --min-contig 0

Arguments for SSPACE Basic tool

type: string
default: -x 1 -o 15 -r 0.75

Arguments for Prinseq tool for contigs

type: string
default: -out_format 1 -lc_dust .20

Parameters relating to the refinement of de novo contigs

Skip the refinement/polishing of contigs through reference based scaffolding and read mapping

type: boolean

Save intermediate polishing files

type: boolean

Set of fasta sequences used as potential references for the contigs

type: string
default: https://rvdb.dbi.udel.edu/download/C-RVDBvCurrent.fasta.gz

Skip the preclustering of assemblies to facilitate downstream processing of assemblies

type: boolean

Keep the contigs that could not be classified with the taxonomic databases (kaiju_db & kraken2_db)

type: boolean
default: true

Specify the metagenomic classifiers to use for contig taxonomy classification: ‘kraken2,kaiju’

type: string
default: kraken2,kaiju
pattern: ^(kaiju|kraken2)(,(kaiju|kraken2))?$

Cluster algorithm used for contigs

type: string

(only with mash) Algorithm to partition the network.

type: string

Skip creation of the hybrid consensus, instead keep the scaffold with ambiguous bases if the depth of scaffolds is not high enough.

type: boolean

Identity threshold value used in clustering algorithms

type: number
default: 0.85

Minimum cumulative sum of mapped read percentages of each member from a cluster group, set to 0 to disable

type: integer
default: 5
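
The threshold above can be read as a filter on whole clusters. The following sketch is only one plausible reading: the function name `keep_cluster`, the list-of-percentages input, and the inclusive comparison are all assumptions and do not reflect the pipeline's actual code.

```python
def keep_cluster(member_read_percentages: list[float], min_cumulative_pct: float = 5) -> bool:
    """Decide whether a cluster is kept, given the mapped read percentage of each member.

    Assumption: a cluster is kept when the summed percentages reach the threshold,
    and a threshold of 0 disables the filter entirely.
    """
    if min_cumulative_pct == 0:
        return True
    return sum(member_read_percentages) >= min_cumulative_pct


print(keep_cluster([3.0, 2.5]))                        # True, 5.5 reaches the default of 5
print(keep_cluster([1.0, 0.5]))                        # False, only 1.5 in total
print(keep_cluster([1.0, 0.5], min_cumulative_pct=0))  # True, filter disabled
```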

Minimum allowed contig size

type: integer
default: 500

Maximum allowed contig size

type: integer
default: 10000000

Define the maximum percentage of ambiguous bases in a contig

type: integer
default: 50
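
The three contig limits above (minimum size, maximum size, maximum percentage of ambiguous bases) can be pictured as a single filter. The sketch below is illustrative only: the function name, the choice to count every base other than A, C, G or T as ambiguous, and the inclusive comparisons are assumptions.

```python
def passes_contig_filters(
    sequence: str,
    min_length: int = 500,
    max_length: int = 10_000_000,
    max_ambiguous_pct: float = 50,
) -> bool:
    """Return True when a contig is within the size limits and not too ambiguous.

    Assumption: any character other than A, C, G or T counts as an ambiguous base.
    """
    length = len(sequence)
    if length == 0 or length < min_length or length > max_length:
        return False
    ambiguous = sum(1 for base in sequence.upper() if base not in "ACGT")
    return 100 * ambiguous / length <= max_ambiguous_pct


print(passes_contig_filters("ACGT" * 200))           # True, 800 bases and no ambiguity
print(passes_contig_filters("ACGT" * 100))           # False, only 400 bases
print(passes_contig_filters("N" * 600 + "A" * 400))  # False, 60% ambiguous bases
```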

Skip the filtering of contigs that did not cluster together with other contigs

type: boolean

Arguments for BLAST makeblastdb tool

type: string
default: -dbtype nucl

Arguments for BLASTN tool

type: string
default: -max_target_seqs 5

Arguments for BLAST filter tool

type: string
default: --escore 0.01 --bitscore 50 --percent-alignment 0.80

Arguments for Kraken2 tool for contigs

type: string

Arguments for Kaiju tool for contigs

type: string
default: -v

Arguments for precluster extraction

type: string
default: --keep-unclassified true --merge-strategy lca

Arguments for CD-HIT tool

type: string
default: -c 0.85 -mask rRyYkKsSwWmMbBdDhHvVnN

Arguments for VSEARCH tool

type: string
default: --maxseqlength 10000000 --id 0.85 --strand both --iddef 0 --no_progress --qmask none

Arguments for MMseqs2 linclust tool

type: string
default: --min-seq-id 0.85 -c 0.700 --cov-mode 2 --cluster-mode 0

Arguments for MMseqs2 cluster tool

type: string
default: --min-seq-id 0.85 -c 0.700 --cov-mode 2 --cluster-mode 0

Arguments for VRhyme tool

type: string
default: --mems 50

Arguments for Mash distance tool

type: string
default: -s 4000 -k 15

Arguments for network clustering

type: string
default: --score 0.85

Arguments for cluster extraction

type: string
default: --perc_reads_contig 5

Arguments for Minimap2 alignment

type: string

Arguments for Minimap2 index

type: string

Arguments for Mash sketch tool

type: string
default: -i

Arguments for Mash screen tool

type: string

Arguments for selecting reference

type: string

Define parameters for iterations to update de novo consensus using reference based improvements

Don’t realign reads to consensus sequences and redefine the consensus through (multiple) iterations

type: boolean

Number of iterations

type: integer
default: 2

Mapping tool used during iterations

type: string

Variant caller used during iterations

type: string

Call variants during the iterations

type: boolean

Consensus tool used for calling new consensus during iterations

type: string
default: bcftools

Calculate summary statistics during iterations

type: boolean
default: true

Parameters relating to the analysis of variants associated with contigs and scaffolds

Skip the analysis of variants for the external reference or contigs

type: boolean

Define which mapping tool needs to be used when mapping reads to reference

type: string

Sequence to use as a mapping reference instead of the de novo contigs or scaffolds

type: string

Deduplicate the reads

type: boolean
default: true

Define the variant caller to use: ‘ivar’ or ‘bcftools’

type: string

Consensus tool used for calling new consensus in final iteration

type: string

Define the minimum number of mapped reads in order to continue the variant and consensus calling

type: integer
default: 200

Minimum allele frequency threshold for calling consensus

type: number
default: 0.75

Calculate summary statistics in final iteration

type: boolean
default: true

hidden
type: string

Arguments for BWA-MEM2 index

type: string

Arguments for BWA index

type: string

Arguments for BWA MEM

type: string

Arguments for Bowtie2 build

type: string

Arguments for Bowtie2 alignment

type: string
default: --local --very-sensitive-local --seed 1

Arguments for UMI-tools deduplication

type: string
default: --umi-separator=':' --method cluster --unmapped-reads use

Arguments for Picard MarkDuplicates

type: string
default: --ASSUME_SORTED true --VALIDATION_STRINGENCY LENIENT --TMP_DIR tmp --REMOVE_DUPLICATES true

Arguments for Picard CollectMultipleMetrics

type: string
default: --ASSUME_SORTED true --VALIDATION_STRINGENCY LENIENT --TMP_DIR tmp

Arguments for custom mpileup

type: string

Arguments for Mosdepth tool

type: string

Arguments for BCFtools mpileup step 1

type: string
default: --ignore-overlaps --count-orphans --max-depth 800000 --min-BQ 20 --annotate FORMAT/AD,FORMAT/ADF,FORMAT/ADR,FORMAT/DP,FORMAT/SP,INFO/AD,INFO/ADF,INFO/ADR

Arguments for BCFtools mpileup step 2

type: string
default: --ploidy 2 --keep-alts --keep-masked-ref --multiallelic-caller --variants-only

Arguments for BCFtools mpileup step 3

type: string
default: --include 'INFO/DP>=10'

Arguments for BCFtools norm

type: string
default: --do-not-normalize --output-type z --multiallelics -any

Arguments for BCFtools stats

type: string

Arguments for Samtools stats command

type: string

Arguments for Samtools idxstats command

type: string

Arguments for Samtools flagstat command

type: string

Arguments for Tabix tool

type: string
default: -p vcf -f

Arguments for Bedtools merge

type: string

Arguments for Bedtools maskfasta

type: string

Arguments for BCFtools consensus

type: string

Arguments for iVar variants step 1

type: string
default: -q 20 -m 10

Arguments for iVar variants step 2

type: string
default: --ignore-overlaps --count-orphans --max-depth 0 --no-BAQ --min-BQ 0

Arguments for making BED mask

type: string
default: -a --ignore-overlaps --count-orphans --max-depth 0 --no-BAQ --min-BQ 0

Arguments for iVar consensus step 1

type: string
default: -t 0 -q 20 -m 10 -n N

Arguments for iVar consensus step 2

type: string
default: --count-orphans --max-depth 0 --min-BQ 20 --no-BAQ -aa

Apply different quality control techniques on the generated consensus genomes

Skip the quality measurements on consensus genomes

type: boolean

Skip the use of checkv for quality check

type: boolean

Reference database used by checkv for consensus quality control

type: string

Skip the annotation of the consensus constructs

type: boolean

Database used for annotation of the consensus constructs

type: string
default: ftp://ftp.expasy.org/databases/viralzone/2020_4/virosaurus90_vertebrate-20200330.fas.gz

Skip gene estimation & annotation with prokka

type: boolean

Define a prokka --protein database for protein annotation

type: string

Skip the use of QUAST for quality check

type: boolean

Skip the blast search of contigs to the provided reference DB

type: boolean

Skip creating an alignment of each of the collapsed clusters and each iterative step

type: boolean
default: true

Specify the search algorithm to use for mmseqs. 0: auto, 1: amino acid, 2: translated, 3: nucleotide, 4: translated nucleotide alignment

type: integer
default: 4

Arguments for CheckV tool

type: string
default: --remove_tmp

Arguments for MAFFT iterations

type: string
default: --auto --adjustdirection

Arguments for MAFFT QC

type: string
default: --auto --adjustdirection

Arguments for BLASTN QC

type: string
default: -max_target_seqs 5

Arguments for Prokka tool

type: string
default: --centre X --compliant --force --kingdom Viruses

Arguments for MMseqs2 search

type: string
default: --search-type 4 --rescore-mode 3

Arguments for QUAST quality control

type: string

Parameters used to describe centralised config profiles. These should not be edited.

Git commit id for Institutional configs.

hidden
type: string
default: master

Base directory for Institutional configs.

hidden
type: string
default: https://raw.githubusercontent.com/nf-core/configs/master

Institutional config name.

hidden
type: string

Institutional config description.

hidden
type: string

Institutional config contact information.

hidden
type: string

Institutional config URL link.

hidden
type: string

Less common options for the pipeline, typically set in a config file.

Display version and exit.

hidden
type: boolean

Method used to save pipeline results to output directory.

hidden
type: string

Email address for completion summary, only when pipeline fails.

hidden
type: string
pattern: ^([a-zA-Z0-9_\-\.]+)@([a-zA-Z0-9_\-\.]+)\.([a-zA-Z]{2,5})$

Send plain-text email instead of HTML.

hidden
type: boolean

File size limit when attaching MultiQC reports to summary emails.

hidden
type: string
default: 25.MB
pattern: ^\d+(\.\d+)?\.?\s*(K|M|G|T)?B$
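
The size pattern above accepts several spellings of the same limit. This Python sketch, using a hypothetical helper `is_valid_size` and example values chosen for illustration, shows which forms pass. Only the regular expression is taken from the entry.

```python
import re

# Pattern copied from the entry above: a number, an optional dot or space, then B, KB, MB, GB or TB.
SIZE_PATTERN = re.compile(r"^\d+(\.\d+)?\.?\s*(K|M|G|T)?B$")


def is_valid_size(value: str) -> bool:
    """Return True when the value satisfies the pattern (hypothetical helper)."""
    return SIZE_PATTERN.match(value) is not None


print(is_valid_size("25.MB"))   # True (the default)
print(is_valid_size("25MB"))    # True
print(is_valid_size("1.5 GB"))  # True
print(is_valid_size("25"))      # False, unit is missing
print(is_valid_size("25 mb"))   # False, unit must be upper case
```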

Do not use coloured log outputs.

hidden
type: boolean

Incoming hook URL for messaging service

hidden
type: string

MultiQC report title. Printed as page header, used for filename if not otherwise specified.

hidden
type: string

Custom config file to supply to MultiQC.

hidden
type: string

Custom logo file to supply to MultiQC. File name must also be set in the MultiQC config file

hidden
type: string
default: https://github.com/Joon-Klaps/viralgenie/blob/dev/docs/images/ViralGenie-nf-core-theme.png?raw=true

Custom MultiQC yaml file containing HTML including a methods description.

type: string

Delete the output directory if the pipeline fails

hidden
type: boolean

Custom yaml file containing the table column names selection and new names.

hidden
type: string
default: https://github.com/Joon-Klaps/viralgenie/raw/refs/heads/dev/assets/custom_table_headers.yml

Whether to validate parameters against the schema at runtime

hidden
type: boolean
default: true

Prefix of all output files followed by [date][pipelineversion][runName]

type: string

Global prefix set if you don’t want metadata embedded in the output filenames

type: string

Base URL or local path to location of pipeline test dataset files

hidden
type: string
default: https://raw.githubusercontent.com/nf-core/test-datasets/