nf-core/mag

Assembly and binning of metagenomes

annotation, assembly, binning, long-read-sequencing, metagenomes, metagenomics, nanopore, nanopore-sequencing

These pages are for an old version of the pipeline (1.1.2). The latest stable release is 5.4.2.

https://github.com/nf-core/mag

Define where the pipeline should find input data and save output data.

Input FastQ files. Either this or the --manifest parameter is required.

type: string

Manifest file, required for hybrid assembly with metaSPAdes. Alternative to --input.

type: string
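The rule that either --input or --manifest must be given can be sketched as a small check. This is only an illustration, not the pipeline's own validation code: the function name `check_input_source` and the dictionary keys are hypothetical, and rejecting the case where both options are set is an assumption based on the two being described as alternatives.

```python
def check_input_source(params: dict) -> str:
    """Return which input option is set ("input" or "manifest").

    Hypothetical helper: raises ValueError when neither option is given.
    """
    has_input = bool(params.get("input"))
    has_manifest = bool(params.get("manifest"))

    if not (has_input or has_manifest):
        raise ValueError("Provide either --input or --manifest.")
    if has_input and has_manifest:
        # Assumption: the options are alternatives, so setting both is rejected.
        raise ValueError("Use only one of --input and --manifest.")

    return "manifest" if has_manifest else "input"


if __name__ == "__main__":
    print(check_input_source({"input": "reads/*_R{1,2}.fastq.gz"}))
```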

Paths to input FastQ files for tests.

hidden

type: string

Specifies that the input is single-end reads.

type: boolean

The output directory where the results will be saved.

type: string

default: ./results

Email address for completion summary.

type: string

pattern: ^([a-zA-Z0-9_\-\.]+)@([a-zA-Z0-9_\-\.]+)\.([a-zA-Z]{2,5})$
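To see which addresses the pattern above accepts, it can be tried directly with Python's `re` module. This is a minimal sketch; the helper name `is_valid_email` and the sample addresses are made up for illustration. Note that the pattern rejects a plus sign in the local part and top-level domains longer than five letters.

```python
import re

# Pattern copied verbatim from the parameter description above.
EMAIL_PATTERN = re.compile(
    r"^([a-zA-Z0-9_\-\.]+)@([a-zA-Z0-9_\-\.]+)\.([a-zA-Z]{2,5})$"
)


def is_valid_email(address: str) -> bool:
    """Return True if the address matches the documented pattern."""
    return EMAIL_PATTERN.match(address) is not None


if __name__ == "__main__":
    for candidate in ("user@example.com", "user+tag@example.com", "user@example"):
        print(candidate, is_valid_email(candidate))
```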

Options for the reference genome indices used to align reads.

Directory / URL base for iGenomes references.

hidden

type: string

default: s3://ngi-igenomes/igenomes/

Do not load the iGenomes reference config.

hidden

type: boolean

Less common options for the pipeline, typically set in a config file.

Display help text.

hidden

type: boolean

Method used to save pipeline results to output directory.

hidden

type: string

Workflow name.

hidden

type: string

Email address for the completion summary, used only when the pipeline fails.

hidden

type: string

pattern: ^([a-zA-Z0-9_\-\.]+)@([a-zA-Z0-9_\-\.]+)\.([a-zA-Z]{2,5})$

Send plain-text email instead of HTML.

hidden

type: boolean

File size limit when attaching MultiQC reports to summary emails.

hidden

type: string

default: 25.MB

Do not use coloured log outputs.

hidden

type: boolean

Custom config file to supply to MultiQC.

hidden

type: string

Directory to keep pipeline Nextflow logs and reports.

hidden

type: string

default: ${params.outdir}/pipeline_info

Set the top limit for requested resources for any single job.

Maximum number of CPUs that can be requested for any single job.

hidden

type: integer

default: 16

Maximum amount of memory that can be requested for any single job.

hidden

type: string

default: 128.GB

Maximum amount of time that can be requested for any single job.

hidden

type: string

default: 240.h
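The effect of these three limits can be illustrated with a short sketch that clamps a job's request to the configured maximum. It is not the pipeline's implementation: the function names are hypothetical, the defaults are the documented ones (16 CPUs, 128.GB, 240.h), and the unit tables assume 1024-based memory units and the suffixes s, m, h and d for time.

```python
# Assumed unit tables for strings such as "128.GB" and "240.h".
MEMORY_UNITS = {"B": 1, "KB": 1024, "MB": 1024**2, "GB": 1024**3, "TB": 1024**4}
TIME_UNITS = {"s": 1, "m": 60, "h": 3600, "d": 86400}


def parse_quantity(text: str, units: dict) -> int:
    """Convert a '<number>.<unit>' string into a base-unit integer."""
    value, unit = text.split(".", 1)
    return int(value) * units[unit]


def cap_cpus(requested: int, max_cpus: int = 16) -> int:
    """Never request more CPUs than the limit."""
    return min(requested, max_cpus)


def cap_memory(requested: str, max_memory: str = "128.GB") -> str:
    """Return the request unchanged unless it exceeds the memory limit."""
    over = parse_quantity(requested, MEMORY_UNITS) > parse_quantity(max_memory, MEMORY_UNITS)
    return max_memory if over else requested


def cap_time(requested: str, max_time: str = "240.h") -> str:
    """Return the request unchanged unless it exceeds the time limit."""
    over = parse_quantity(requested, TIME_UNITS) > parse_quantity(max_time, TIME_UNITS)
    return max_time if over else requested


if __name__ == "__main__":
    print(cap_cpus(32), cap_memory("200.GB"), cap_time("300.h"))
```

A request at or below the limit passes through unchanged, so the limits only matter for jobs that would otherwise ask for more than a single machine or queue can provide.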

Parameters used to describe centralised config profiles. These should not be edited.

Git commit id for Institutional configs.

hidden

type: string

default: master

Base directory for Institutional configs.

hidden

type: string

default: https://raw.githubusercontent.com/nf-core/configs/master

Institutional configs hostname.

hidden

type: string

Institutional config description.

hidden

type: string

Institutional config contact information.

hidden

type: string

Institutional config URL link.

hidden

type: string

Use these parameters to also enable reproducible results from the individual assembly and binning tools.

Fix number of CPUs for MEGAHIT to 1. Not increased with retries.

type: boolean

Fix number of CPUs used by SPAdes. Not increased with retries.

type: integer

Fix number of CPUs used by SPAdes hybrid. Not increased with retries.

type: integer

RNG seed for MetaBAT2.

type: integer

default: 1

Sequence of 3’ adapter to remove in the forward reads.

type: string

default: AGATCGGAAGAGCACACGTCTGAACTCCAGTCA

Sequence of 3’ adapter to remove in the reverse reads.

type: string

default: AGATCGGAAGAGCGTCGTGTAGGGAAAGAGTGT

Mean qualified quality value for keeping read.

type: integer

default: 15

Trimming quality value for the sliding window.

type: integer

default: 15

Name of iGenomes reference for host contamination removal.

type: string

Fasta reference file for host contamination removal.

type: string

Use the --very-sensitive setting instead of the --sensitive setting for Bowtie 2 when mapping reads against the host genome.

type: boolean

Save the read IDs of removed host reads.

type: boolean

Keep reads similar to the Illumina internal standard PhiX genome.

type: boolean

Genome reference used to remove Illumina PhiX contaminant reads.

hidden

type: string

default: ${baseDir}/assets/data/GCA_002596845.1_ASM259684v1_genomic.fna.gz

Skip removing adapter sequences from long reads.

type: boolean

Discard any read which is shorter than this value.

type: integer

default: 1000

Keep this percent of bases.

type: integer

default: 90

Weight given to read length when choosing the best reads. The higher the value, the more important read length becomes.

type: integer

default: 10

Keep reads similar to the ONT internal standard Escherichia virus Lambda genome.

type: boolean

Genome reference used to remove ONT Lambda contaminant reads.

hidden

type: string

default: ${baseDir}/assets/data/GCA_000840245.1_ViralProj14204_genomic.fna.gz

Taxonomic classification is disabled by default. You have to specify one of the options below to activate it.

Database for taxonomic binning with centrifuge.

type: string

Database for taxonomic binning with kraken2.

type: string

Skip creating a krona plot for taxonomic binning.

type: boolean

Database for taxonomic classification of metagenome assembled genomes.

type: string

Skip Illumina-only SPAdes assembly.

type: boolean

Skip SPAdes hybrid assembly (only available when using manifest input).

type: boolean

Skip MEGAHIT assembly.

type: boolean

Skip metaQUAST.

type: boolean

Skip metagenome binning.

type: boolean

Minimum contig size to be considered for binning and for bin quality check.

type: integer

default: 1500

Minimum length of contigs that are not part of any bin but are treated as individual genomes.

type: integer

default: 1000000

Maximum number of contigs that are not part of any bin but are treated as individual genomes.

type: integer

default: 100
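How the three contig thresholds above might interact can be sketched as follows. This is one plausible reading of the descriptions rather than the pipeline's actual code: the function names are hypothetical, thresholds are treated as inclusive, and picking the longest contigs first when more than the maximum qualify is an assumption.

```python
def contigs_for_binning(contig_lengths: dict, min_contig_size: int = 1500) -> list:
    """Names of contigs long enough to be considered for binning."""
    return [
        name for name, length in contig_lengths.items()
        if length >= min_contig_size  # assumption: threshold is inclusive
    ]


def unbinned_as_genomes(
    unbinned_lengths: dict,
    min_length: int = 1_000_000,
    max_count: int = 100,
) -> list:
    """Names of unbinned contigs to treat as individual genomes.

    Assumption: when more than max_count contigs qualify, the longest are kept.
    """
    long_enough = [
        (name, length) for name, length in unbinned_lengths.items()
        if length >= min_length
    ]
    long_enough.sort(key=lambda item: item[1], reverse=True)
    return [name for name, _ in long_enough[:max_count]]


if __name__ == "__main__":
    print(contigs_for_binning({"c1": 900, "c2": 1500, "c3": 2_500_000}))
    print(unbinned_as_genomes({"u1": 1_200_000, "u2": 999_999, "u3": 3_000_000}))
```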

Disable bin QC with BUSCO.

type: boolean

Download path for BUSCO database.

type: string

default: https://busco-data.ezlab.org/v4/data/lineages/bacteria_odb10.2020-03-06.tar.gz

Save BUSCO reference.

type: boolean
