Define where the pipeline should find input data and save output data.

Path to comma-separated file containing information about the samples in the experiment.

type: string
pattern: ^\S+\.csv$

You will need to create a design file with information about the samples in your experiment before running the pipeline. Use this parameter to specify its location. It has to be a comma-separated file with 3 columns, and a header row. See usage docs.
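A quick pre-flight check of the design file can catch path and layout problems before launching a run. The sketch below applies the path pattern above and checks for a header row and 3 columns per line. The function name `check_design_file` and the example column names are hypothetical; the real column names are defined in the usage docs, and requiring at least one sample row is an assumption.

```python
import csv
import io
import re

# Pattern copied from the schema entry above: no whitespace, must end in .csv
DESIGN_PATH_PATTERN = re.compile(r"^\S+\.csv$")


def check_design_file(path, text):
    """Return a list of problems found in a design file (empty list = looks fine).

    `path` is the file name as given on the command line; `text` is the file content.
    """
    problems = []
    if not DESIGN_PATH_PATTERN.match(path):
        problems.append(f"path {path!r} does not match {DESIGN_PATH_PATTERN.pattern}")

    rows = list(csv.reader(io.StringIO(text)))
    # Assumption: a usable design file has a header row plus at least one sample.
    if len(rows) < 2:
        problems.append("expected a header row plus at least one sample row")
    for number, row in enumerate(rows, start=1):
        if len(row) != 3:
            problems.append(f"line {number} has {len(row)} columns, expected 3")
    return problems


# Hypothetical content: these column names are placeholders, not the pipeline's.
example = "col_a,col_b,col_c\nsample1,value1,value2\n"
print(check_design_file("design.csv", example))
print(check_design_file("my design.txt", "a,b\n"))
```

An empty list means the file passed these basic checks; it does not confirm that the column names or values are what the pipeline expects.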

The output directory where the results will be saved. You have to use absolute paths to storage on Cloud infrastructure.

type: string

Email address for completion summary.

type: string
pattern: ^([a-zA-Z0-9_\-\.]+)@([a-zA-Z0-9_\-\.]+)\.([a-zA-Z]{2,5})$

Set this parameter to your e-mail address to get a summary e-mail with details of the run sent to you when the workflow exits. If set in your user config file (~/.nextflow/config) then you don't need to specify this on the command line for every run.
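The address pattern above is stricter than general e-mail syntax, so it can help to test an address before a run. The sketch below simply applies that pattern with Python's `re` module; the helper name `is_accepted_email` and the sample addresses are illustrative only.

```python
import re

# Pattern copied verbatim from the schema entry above.
EMAIL_PATTERN = re.compile(
    r"^([a-zA-Z0-9_\-\.]+)@([a-zA-Z0-9_\-\.]+)\.([a-zA-Z]{2,5})$"
)


def is_accepted_email(address):
    """Return True if the address satisfies the schema pattern."""
    return EMAIL_PATTERN.match(address) is not None


# Illustrative addresses only.
for address in [
    "user@example.org",
    "first.last@lab.example.ac.uk",
    "user+tag@example.org",      # '+' is not in the allowed character set
    "user@example.technology",   # final label is longer than 5 letters
]:
    print(address, is_accepted_email(address))
```

Addresses with a `+` in the local part, or with a final domain label longer than 5 letters, are rejected by this pattern even though they are valid e-mail addresses.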

MultiQC report title. Printed as page header, used for filename if not otherwise specified.

type: string

Option to generate mapping statistics, creating plots and summaries.

type: boolean

This will create the following:

  • Counts of the total number of reads before and after trimming
  • Scatterplots comparing all replicates (separate for both host and pathogen reads)
  • Plots of the % of mapped/quantified reads
  • Plots of RNA-class statistics (as many types can be identified, the parameter below --RNA_classes_to_replace_host can help to summarise these)

To ignore igenomes reference config

type: string
default: genomes,igenomes_base,igenomes_ignore

iGenomes-related reference files and options required for the workflow.

Name of iGenomes reference.

type: string

If using a reference genome configured in the pipeline using iGenomes, use this parameter to give the ID for the reference. This is then used to build the full paths for all required reference genome files e.g. --genome GRCh38.

See the nf-core website docs for more details.

Path to FASTA genome file.

type: string
pattern: ^\S+\.fn?a(sta)?(\.gz)?$

This parameter is mandatory if --genome is not specified. If you don't have a BWA index available this will be generated for you automatically. Combine with --save_reference to save BWA index for future runs.
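The file-name pattern above accepts only certain FASTA extensions. As a rough illustration, the sketch below applies it to a few hypothetical file names; `is_accepted_fasta` is an illustrative helper, not part of the pipeline.

```python
import re

# Pattern copied verbatim from the schema entry above.
FASTA_PATTERN = re.compile(r"^\S+\.fn?a(sta)?(\.gz)?$")


def is_accepted_fasta(path):
    """Return True if the path satisfies the schema pattern."""
    return FASTA_PATTERN.match(path) is not None


# Hypothetical file names.
for path in [
    "genome.fa",
    "genome.fna",
    "genome.fasta.gz",
    "genome.fas",      # '.fas' is not covered by the pattern
    "my genome.fa",    # whitespace is not allowed
]:
    print(path, is_accepted_fasta(path))
```

Renaming a file to end in `.fa`, `.fna` or `.fasta` (optionally followed by `.gz`) is usually the simplest fix when a path is rejected.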

Directory / URL base for iGenomes references.

type: string
default: s3://ngi-igenomes/igenomes

Do not load the iGenomes reference config.

type: boolean

Do not load igenomes.config when running the pipeline. You may choose this option if you observe clashes between custom parameters and those supplied in igenomes.config.

The path to the files should be enclosed by quotes "../.."

default param???

type: string

Change to a custom name if desired, e.g. Human_hela_cells

type: string

Change to a custom name if desired, e.g. Salmonella_SL1344

type: string

Host genome fasta file

type: string

Pathogen genome fasta file

type: string

Host GFF file

type: string

Pathogen GFF

type: string

Host transcriptome file

type: string

Pathogen transcriptome file

type: string

By default, the pipeline uses FastQC for quality control of sequencing reads, both before and after trimming.

Define a set of additional FastQC parameters you wish to use. Do not include the --quiet, --threads and --noextract flags, which are already set by the dualrnaseq pipeline.

type: string

Adapter and read trimming is performed by either Cutadapt or BBDuk.

Additional parameters if needed

type: string

Define a set of additional Cutadapt parameters you wish to use, such as -a or -A to specify custom adapter sequences.

These parameters are available for Salmon in both Selective Alignment and Alignment-based mode

Options for setting the library type. A = automatic detection

type: string

The pipeline uses gene features from the 3rd column of the host annotation file (gff3) to extract the coordinates of transcripts to be quantified. By default, the pipeline uses exon features from the --gff_host file.

type: string
default: exon,tRNA

The pipeline uses gene features from the 3rd column of the pathogen annotation file (gff3) to extract the coordinates of transcripts to be quantified. By default, the pipeline uses the gene, sRNA, tRNA and rRNA features from the --gff_pathogen file.

type: string
default: gene,sRNA,tRNA,rRNA

This flag defines the gene attribute in the 9th column of the host annotation file (gff3) from which the transcript names are extracted. By default, the pipeline extracts transcript_id from the --gff_host file.

type: string
default: transcript_id

This flag defines the gene attribute in the 9th column of the pathogen annotation file (gff3) from which the names of transcripts, genes or CDS regions are extracted. By default, the pipeline extracts locus_tag from the --gff_pathogen file.

type: string
default: locus_tag
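To make the roles of the 3rd and 9th GFF columns concrete, the sketch below filters records by feature type and reads one attribute from each match. It is an illustration under assumptions, not the pipeline's own implementation: the function names and the GFF records are hypothetical, and attributes are assumed to use the GFF3 `key=value;key=value` layout.

```python
def parse_attributes(column9):
    """Split a GFF3 attribute column (key=value pairs separated by ';') into a dict."""
    attributes = {}
    for pair in column9.strip().split(";"):
        if "=" in pair:
            key, value = pair.split("=", 1)
            attributes[key.strip()] = value.strip()
    return attributes


def extract_names(gff_lines, features, attribute):
    """Return (feature, start, end, name) for records whose 3rd column is in
    `features` (comma-separated) and whose 9th column carries `attribute`."""
    wanted = set(features.split(","))
    names = []
    for line in gff_lines:
        if line.startswith("#") or not line.strip():
            continue  # skip comments, directives and blank lines
        columns = line.rstrip("\n").split("\t")
        if len(columns) != 9 or columns[2] not in wanted:
            continue
        value = parse_attributes(columns[8]).get(attribute)
        if value is not None:
            names.append((columns[2], int(columns[3]), int(columns[4]), value))
    return names


# Hypothetical pathogen records, for illustration only.
pathogen_gff = [
    "##gff-version 3",
    "chr\tsrc\tgene\t100\t900\t.\t+\t.\tID=gene0;locus_tag=TAG_0001",
    "chr\tsrc\tCDS\t100\t900\t.\t+\t0\tID=cds0;Parent=gene0;locus_tag=TAG_0001",
    "chr\tsrc\tsRNA\t1200\t1300\t.\t-\t.\tID=rna0;locus_tag=TAG_0002",
]
print(extract_names(pathogen_gff, "gene,sRNA,tRNA,rRNA", "locus_tag"))
```

With the pathogen defaults, the CDS record is skipped because CDS is not in the feature list. If an annotation stores identifiers under a different attribute, the attribute parameter needs to change to match.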

Still to be described

type: string
default: exon

Still to be described - requires capital P though

type: string
default: Parent

Parameters listed below are available only for Salmon with Selective Alignment.

Run Salmon selective alignment. Does not need a value, just run --run_salmon_SA

type: boolean
default: true

Set of additional parameters for creating an index with Salmon Selective Alignment. By default, the kmer size is set at 21. Multiple parameters can be passed - for example: --salmon_sa_index_args="--keepDuplicates -k 21".

type: string
default: -k 21

Set of additional parameters for mapping with Salmon Selective Alignment. By default, the pipeline allows soft-clipping of overhanging reads. Multiple parameters can be passed - for example: --salmon_sa_args="--softclipOverhangs --allowDovetail"

type: string
default: --softclipOverhangs
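Because these values contain spaces and leading dashes, they must reach the pipeline as a single quoted argument. The sketch below shows one way to assemble such a command line safely with Python's `shlex`; `build_command` is a hypothetical helper, and the option values are the examples given above.

```python
import shlex


def build_command(options):
    """Build a `nextflow run` command line from an option dict, quoting each value."""
    parts = ["nextflow", "run", "nf-core/dualrnaseq"]
    for name, value in options.items():
        # shlex.quote wraps values containing spaces in single quotes
        parts.append(f"--{name}={shlex.quote(value)}")
    return " ".join(parts)


command = build_command(
    {
        "salmon_sa_index_args": "--keepDuplicates -k 21",
        "salmon_sa_args": "--softclipOverhangs --allowDovetail",
    }
)
print(command)

# Splitting the command the way a POSIX shell would confirms that each
# multi-part value stays attached to its parameter as one argument.
print(shlex.split(command))
```

Without quoting, the shell would split the value at each space and the extra pieces would be read as separate options rather than as part of the Salmon argument string.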

Options for Alignment-based mode

To run Salmon alignment-based mode

type: boolean

STAR parameter - To create a transcriptome BAM (to use with Salmon)

type: string
default: TranscriptomeSAM

The nf-core/dualrnaseq pipeline runs STAR to generate transcriptomic alignments. By default, it allows for insertions, deletions and soft-clips (Singleend option). To prohibit this behaviour, please specify IndelSoftclipSingleend

type: string
default: Singleend

Define a set of additional salmon quant parameters you wish to use in salmon alignment-based mode.

type: string

Options for STAR genome alignment

To run STAR genome alignment

type: boolean

Quant value in GFF 3rd column

type: string
default: quant

Parent attribute in GFF - last column

type: string
default: parent

By default, the pipeline saves unmapped reads within the main BAM file. If you want to switch off this option, set the --outSAMunmapped flag to None

type: string
default: Within

For paired-end reads, the KeepPairs parameter will record the unmapped mates for each alignment, and will keep it adjacent to its mapped read (only affects multi-mapping reads).

Option to limit RAM when sorting BAM file. If 0, will be set to the genome index size, which can be quite large when running on a desktop or laptop

type: integer
default: 0

Options for HTSeq count

To run HTSeq count on an aligned genome file

type: boolean

3rd value of GFF to quantify

type: string
default: quant

Host - gene feature to quantify

type: string
default: exon,tRNA

Host - GFF attribute

type: string
default: gene_id

Pathogen - gene feature to quantify (will likely need to be modified)

type: string
default: gene,sRNA,tRNA,rRNA

Pathogen - GFF attribute (Will likely need to be modified)

type: string
default: locus_tag

Parameters used to describe centralised config profiles. These should not be edited.

Git commit id for Institutional configs.

type: string
default: master

Base directory for Institutional configs.

type: string

If you're running offline, Nextflow will not be able to fetch the institutional config files from the internet. If you don't need them, then this is not a problem. If you do need them, you should download the files from the repo and tell Nextflow where to find them with this parameter.

Institutional config name.

type: string

Institutional config description.

type: string

Institutional config contact information.

type: string

Institutional config URL link.

type: string

Less common options for the pipeline, typically set in a config file.

Display help text.

type: boolean

Display version and exit.

type: boolean

Method used to save pipeline results to output directory.

type: string

The Nextflow publishDir option specifies which intermediate files should be saved to the output directory. This option tells the pipeline what method should be used to move these files. See Nextflow docs for details.

Email address for completion summary, only when pipeline fails.

type: string
pattern: ^([a-zA-Z0-9_\-\.]+)@([a-zA-Z0-9_\-\.]+)\.([a-zA-Z]{2,5})$

An email address to send a summary email to when the pipeline is completed - ONLY sent if the pipeline does not exit successfully.

Send plain-text email instead of HTML.

type: boolean

File size limit when attaching MultiQC reports to summary emails.

type: string
default: 25.MB
pattern: ^\d+(\.\d+)?\.?\s*(K|M|G|T)?B$
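The size pattern above is case-sensitive and expects a unit ending in `B`. As a rough check, the sketch below applies it to a few candidate values; `is_accepted_size` is an illustrative helper name.

```python
import re

# Pattern copied verbatim from the schema entry above.
SIZE_PATTERN = re.compile(r"^\d+(\.\d+)?\.?\s*(K|M|G|T)?B$")


def is_accepted_size(value):
    """Return True if the value satisfies the schema pattern."""
    return SIZE_PATTERN.match(value) is not None


for value in ["25.MB", "25MB", "1.5 GB", "512B", "25mb", "MB"]:
    print(value, is_accepted_size(value))
```

Values such as `25.MB`, `25MB` and `1.5 GB` pass, while lower-case units (`25mb`) or a unit with no number (`MB`) do not.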

Do not use coloured log outputs.

type: boolean

Incoming hook URL for messaging service

type: string

Incoming hook URL for messaging service. Currently, MS Teams and Slack are supported.

Custom config file to supply to MultiQC.

type: string

Custom logo file to supply to MultiQC. File name must also be set in the MultiQC config file

type: string

Custom MultiQC yaml file containing HTML including a methods description.

type: string

Whether to validate parameters against the schema at runtime.

type: boolean
default: true

Show all params when using --help

type: boolean

By default, parameters set as hidden in the schema are not shown on the command line when a user runs with --help. Specifying this option will tell the pipeline to show all parameters.

Disable specified tools.

type: string
pattern: ^((baserecalibrator|baserecalibrator_report|bcftools|documentation|fastqc|markduplicates|markduplicates_report|mosdepth|multiqc|samtools|vcftools|versions)?,?)*[^,]+$

Multiple tools can be specified, separated by commas.