Define where the pipeline should find input data and save output data.

Path to comma-separated file containing information about the samples in the experiment.

required
type: string
pattern: ^\S+\.csv$

You will need to create a design file with information about the samples in your experiment before running the pipeline. Use this parameter to specify its location. It must be a comma-separated file with 3 columns and a header row. See the usage docs.
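As a rough illustration, the two constraints stated above (the path pattern and the 3-column layout with a header row) can be checked before launching a run. The sketch below is in Python; the function name `check_design_file` and the example column names are hypothetical and are not taken from the pipeline, so consult the usage docs for the real header. Requiring at least one sample row is also an assumption.

```python
import csv
import io
import re

# Pattern quoted from the parameter definition above.
INPUT_PATTERN = re.compile(r"^\S+\.csv$")

def check_design_file(path, handle):
    """Return a list of problems found in a design file (empty if none).

    `path` is the value that would be passed to the parameter and `handle`
    is an open text stream with the file contents.
    """
    problems = []
    if not INPUT_PATTERN.match(path):
        problems.append("path must contain no whitespace and end in .csv")
    rows = list(csv.reader(handle))
    if len(rows) < 2:
        problems.append("expected a header row plus at least one sample row")
    for number, row in enumerate(rows, start=1):
        if len(row) != 3:
            problems.append(f"line {number}: expected 3 columns, found {len(row)}")
    return problems

# Hypothetical column names, for illustration only.
example = io.StringIO("col_a,col_b,col_c\nS1,x,y\n")
print(check_design_file("samples.csv", example))  # prints []
```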

The output directory where the results will be saved. You have to use absolute paths to storage on Cloud infrastructure.

required
type: string

Email address for completion summary.

type: string
pattern: ^([a-zA-Z0-9_\-\.]+)@([a-zA-Z0-9_\-\.]+)\.([a-zA-Z]{2,5})$

Set this parameter to your e-mail address to get a summary e-mail with details of the run sent to you when the workflow exits. If set in your user config file (~/.nextflow/config) then you don't need to specify this on the command line for every run.
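The pattern above is stricter than general e-mail syntax: it allows only letters, digits, underscores, hyphens and dots on either side of the @, and a final label of 2 to 5 letters. A minimal Python sketch for checking an address against it follows; the helper name `is_accepted_email` and the example addresses are made up for illustration.

```python
import re

# Pattern quoted from the parameter definition above.
EMAIL_PATTERN = re.compile(
    r"^([a-zA-Z0-9_\-\.]+)@([a-zA-Z0-9_\-\.]+)\.([a-zA-Z]{2,5})$"
)

def is_accepted_email(address):
    """Return True if the address satisfies the schema pattern."""
    return EMAIL_PATTERN.match(address) is not None

for address in ("user@example.com", "user+tag@example.com", "user@localhost"):
    print(address, is_accepted_email(address))
```

Addresses containing characters such as `+`, or ending in a label longer than 5 letters, do not match this pattern even though they can be valid e-mail addresses.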

MultiQC report title. Printed as page header, used for filename if not otherwise specified.

type: string

Save FastQ files after merging re-sequenced libraries in the results directory.

type: boolean

Reference genome related files and options required for the workflow.

Name of iGenomes reference.

type: string

If using a reference genome configured in the pipeline using iGenomes, use this parameter to give the ID for the reference. This is then used to build the full paths for all required reference genome files e.g. --genome GRCh38.

See the nf-core website docs for more details.

Path to FASTA genome file.

type: string
pattern: ^\S+\.fn?a(sta)?(\.gz)?$

This parameter is mandatory if --genome is not specified. If you don't have a BWA index available this will be generated for you automatically. Combine with --save_reference to save BWA index for future runs.
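To see which file names the pattern above accepts, it can be tried against a few candidates. This is a small Python sketch; the helper name `matches_fasta_pattern` and the file names are illustrative only.

```python
import re

# Pattern quoted from the parameter definition above.
FASTA_PATTERN = re.compile(r"^\S+\.fn?a(sta)?(\.gz)?$")

def matches_fasta_pattern(path):
    """Return True if the path satisfies the schema pattern."""
    return FASTA_PATTERN.match(path) is not None

for name in ("genome.fa", "genome.fna", "genome.fasta.gz", "genome.fas", "my genome.fa"):
    print(name, matches_fasta_pattern(name))
```

The extensions .fa, .fna and .fasta match, optionally followed by .gz. Paths containing whitespace and other extensions such as .fas do not.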

Path to FASTA dictionary file.

type: string

NB If none provided, will be generated automatically from the FASTA reference.

Path to FASTA reference index.

type: string

NB If none provided, will be generated automatically from the FASTA reference.

Directory / URL base for iGenomes references.

hidden
type: string
default: s3://ngi-igenomes/igenomes

Do not load the iGenomes reference config.

hidden
type: boolean

Do not load igenomes.config when running the pipeline. You may choose this option if you observe clashes between custom parameters and those supplied in igenomes.config.

Path to GTF annotation file.

type: string

This parameter is mandatory if --genome is not specified.

Path to GFF3 annotation file.

type: string

This parameter must be specified if neither --genome nor --gtf is specified.

Path to BED file containing exon intervals. This will be created from the GTF file if not specified.

type: string

Read length

type: number
default: 151

Specify the read length for the STAR aligner.

If generated by the pipeline, save the STAR index in the results directory.

type: boolean

If the STAR index is generated by the pipeline, use this parameter to save it to your results folder. The index can then be reused in future pipeline runs, reducing processing time.

Path to known indels VCF file

type: string

Path to known indels index file

type: string

Path to dbSNP VCF file

type: string

Path to dbSNP VCF index file

type: string

snpEff DB version

type: string

VEP genome

type: string

If you use AWS iGenomes or a local resource with genomes.conf, this has already been set for you appropriately.

VEP species

type: string

If you use AWS iGenomes or a local resource with genomes.conf, this has already been set for you appropriately.

VEP cache version

type: string

Define parameters related to read alignment

Specifies the alignment algorithm to use. The only option currently available is 'star'.

type: string
default: star

This parameter defines which aligner is used for aligning the RNA reads to the reference genome. Currently only the STAR aligner is supported, so use 'star' as the value for this option.

Path to STAR index folder or compressed file (tar.gz)

type: string

This parameter can be used if a pre-built STAR index is available. You can give either the full path to the index directory or a compressed file in tar.gz format.

Enable STAR 2-pass mapping mode.

type: boolean

This parameter enables STAR to perform 2-pass mapping. Default true.

Do not use the GTF file during the STAR index building step.

type: boolean

Do not use parameter --sjdbGTFfile <GTF file> during the STAR genomeGenerate process.

Option to limit RAM when sorting BAM file. Value to be specified in bytes. If 0, will be set to the genome index size.

type: integer

This parameter specifies the maximum available RAM (bytes) for sorting BAM during STAR alignment.
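Because the value is given in bytes, a memory budget expressed in larger units has to be converted first. A minimal Python sketch, assuming binary units (1 GiB = 1024**3 bytes); the helper name `gib_to_bytes` is hypothetical.

```python
def gib_to_bytes(gib):
    """Convert a size in GiB (1024**3 bytes each) to a whole number of bytes."""
    return int(gib * 1024**3)

print(gib_to_bytes(32))  # prints 34359738368
```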

Specifies the number of genome bins for coordinate-sorting

type: integer
default: 50

This parameter specifies the number of bins to be used for coordinate sorting during the STAR alignment step.

Specifies the maximum number of collapsed junctions

type: integer
default: 1000000

Sequencing center information to be added to read group of BAM files.

type: string

This parameter is required for creating a proper BAM header to use in the downstream analysis of GATK.

Specify the sequencing platform used

type: string
default: illumina

This parameter is required for creating a proper BAM header to use in the downstream analysis of GATK.

Where possible, save unaligned reads from aligner to the results directory.

type: boolean

This may either be in the form of FastQ or BAM files depending on the options available for that particular tool.

Save the intermediate BAM files from the alignment step.

type: boolean

By default, intermediate BAM files are not saved, to limit storage usage; only the final BAM files created after the appropriate filtering step are always saved. Set this parameter to also save the intermediate BAM files.

Create a CSI index for BAM files instead of the traditional BAI index. This will be required for genomes with larger chromosome sizes.

type: boolean

Specify whether to remove duplicates from the BAM during Picard MarkDuplicates step.

type: boolean

Set to true to remove duplicates from the BAM file during the Picard MarkDuplicates step.

The minimum phred-scaled confidence threshold at which variants should be called.

type: number
default: 20

Specify the minimum phred-scaled confidence threshold at which variants should be called.

Specify which tools RNAvar should use for annotating variants. Values can be 'snpeff', 'vep' or 'merge'. If you specify 'merge', the pipeline runs both snpeff and VEP annotation.

hidden
type: string

List of tools to be used for variant annotation.

This parameter must be a combination of the following values: snpeff, vep, merge
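The accepted values and the meaning of 'merge' described above can be sketched as a small validation step. This Python example assumes the values are supplied as a comma-separated string, which is not stated here; the function name `annotators_to_run` is hypothetical.

```python
ALLOWED_TOOLS = {"snpeff", "vep", "merge"}

def annotators_to_run(value):
    """Validate the parameter value and list the annotators it implies.

    Assumes a comma-separated string. Following the description above,
    'merge' is treated as running both snpeff and vep.
    """
    tools = [item.strip() for item in value.split(",") if item.strip()]
    unknown = sorted(set(tools) - ALLOWED_TOOLS)
    if unknown:
        raise ValueError(f"unsupported annotation tool(s): {', '.join(unknown)}")
    selected = set()
    for tool in tools:
        if tool == "merge":
            selected.update({"snpeff", "vep"})
        else:
            selected.add(tool)
    return sorted(selected)

print(annotators_to_run("merge"))  # prints ['snpeff', 'vep']
```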

Enable the use of cache for annotation

hidden
type: boolean

This also disables the use of the snpEff- and VEP-specific containers for annotation.

To be used with --snpeff_cache and/or --vep_cache

Enable CADD cache.

hidden
type: boolean

Path to CADD InDels file.

hidden
type: string

Path to CADD InDels index.

hidden
type: string

Path to CADD SNVs file.

hidden
type: string

Path to CADD SNVs index.

hidden
type: string

Enable the use of the VEP GeneSplicer plugin.

hidden
type: boolean

Path to snpEff cache

hidden
type: string

To be used with --annotation_cache

Path to VEP cache

hidden
type: string

To be used with --annotation_cache

Define parameters that control the stages in the pipeline

Skip the process of base recalibration steps i.e., GATK BaseRecalibrator and GATK ApplyBQSR.

type: boolean

This parameter disables the base recalibration step, so an uncalibrated BAM file is used for variant calling.

Skip the process of preparing interval lists for the GATK variant calling step

type: boolean

This parameter disables the preparation of multiple interval lists for use with the HaplotypeCaller module of GATK. Disabling this step is not recommended, as it is required to run the variant calling correctly.

Skip variant filtering of GATK

type: boolean

Set this parameter if you don't want to filter any variants.

Skip variant annotation

type: boolean

Set this parameter if you don't want to run variant annotation.

Skip MultiQC reports

type: boolean

This parameter disables all QC reports.

Define parameters of the tools used in the pipeline

Number of parts into which the gene interval list is split in order to run GATK HaplotypeCaller in parallel.

type: integer
default: 25

Set this parameter to decide the number of splits for the gene interval list file.
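The idea of splitting an interval list so that each part can be processed in parallel can be sketched as follows. This Python example is only an illustration of even chunking; how the pipeline performs the split is not described here, and the function name `split_intervals` and the (contig, start, end) tuples are hypothetical.

```python
def split_intervals(intervals, parts=25):
    """Split a list of intervals into at most `parts` contiguous chunks.

    Chunk sizes differ by at most one. This is an even-chunking sketch,
    not the pipeline's actual splitting method.
    """
    if parts < 1:
        raise ValueError("parts must be at least 1")
    parts = min(parts, len(intervals)) or 1
    base, extra = divmod(len(intervals), parts)
    chunks, start = [], 0
    for index in range(parts):
        size = base + (1 if index < extra else 0)
        chunks.append(intervals[start:start + size])
        start += size
    return chunks

# 60 made-up intervals split with the default of 25 parts.
intervals = [("chr1", i * 1000, i * 1000 + 500) for i in range(60)]
chunks = split_intervals(intervals)
print(len(chunks), [len(chunk) for chunk in chunks[:12]])
```

Each chunk could then be handed to a separate variant-calling job, with the per-chunk results merged afterwards.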

Do not use gene interval file during variant calling

type: boolean

If set to true, the gene intervals are not used during the variant calling step, so variants are called from all regions, including non-genic ones. Default is false.

The window size (in bases) in which to evaluate clustered SNPs.

type: integer
default: 35

This parameter is used by the GATK variant filtration step. It defines the window size (in bases) in which to evaluate clustered SNPs. It has to be used together with the other option, 'cluster'.

The number of SNPs which make up a cluster. Must be at least 2.

type: integer
default: 3

This parameter is used by the GATK variant filtration step. It defines the number of SNPs which make up a cluster within a window. Must be at least 2.
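How the window size and the cluster size interact can be illustrated with a simplified sketch. The Python function below flags SNP positions on a single contig when `cluster` consecutive SNPs span fewer than `window` bases. The exact boundary rule used by GATK is not given here, so the strict comparison is an assumption, and the function name `clustered_snps` is hypothetical.

```python
def clustered_snps(positions, window=35, cluster=3):
    """Return the sorted SNP positions that fall in a cluster.

    A cluster is taken to be `cluster` consecutive SNPs whose first and
    last positions are less than `window` bases apart (assumed rule).
    """
    if cluster < 2:
        raise ValueError("cluster must be at least 2")
    ordered = sorted(positions)
    flagged = set()
    for i in range(len(ordered) - cluster + 1):
        group = ordered[i:i + cluster]
        if group[-1] - group[0] < window:
            flagged.update(group)
    return sorted(flagged)

# Three SNPs within 20 bases form a cluster; the others are isolated.
print(clustered_snps([100, 110, 120, 500, 900, 910]))  # prints [100, 110, 120]
```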

Value to be used for the FisherStrand (FS) filter

type: number
default: 30

This parameter defines the value to use for the FisherStrand (FS) filter in the GATK variant-filtering step.
The value should be given as a floating-point number. Default is 30.0.

Value to be used for the QualByDepth (QD) filter

type: number
default: 2

This parameter defines the value to use for the QualByDepth (QD) filter in the GATK variant-filtering step.
The value should be given as a floating-point number. Default is 2.0.
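The effect of the FS and QD thresholds can be sketched as simple comparisons. The direction of each comparison (flag when FS is above its threshold, flag when QD is below its threshold) follows commonly used GATK hard-filter expressions and is an assumption about this pipeline; the function name `failed_filters` is hypothetical.

```python
def failed_filters(fs, qd, fs_threshold=30.0, qd_threshold=2.0):
    """Return the names of the hard filters a variant would fail.

    Assumes a variant is flagged when FS > fs_threshold or QD < qd_threshold.
    """
    failed = []
    if fs > fs_threshold:
        failed.append("FS")
    if qd < qd_threshold:
        failed.append("QD")
    return failed

print(failed_filters(fs=10.0, qd=15.0))  # prints []
print(failed_filters(fs=45.2, qd=1.1))   # prints ['FS', 'QD']
```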

Parameters used to describe centralised config profiles. These should not be edited.

Git commit id for Institutional configs.

hidden
type: string
default: master

Base directory for Institutional configs.

hidden
type: string
default: https://raw.githubusercontent.com/nf-core/configs/master

If you're running offline, Nextflow will not be able to fetch the institutional config files from the internet. If you don't need them, then this is not a problem. If you do need them, you should download the files from the repo and tell Nextflow where to find them with this parameter.

Institutional config name.

hidden
type: string

Institutional config description.

hidden
type: string

Institutional config contact information.

hidden
type: string

Institutional config URL link.

hidden
type: string

Set the top limit for requested resources for any single job.

Maximum number of CPUs that can be requested for any single job.

hidden
type: integer
default: 16

Use to set an upper-limit for the CPU requirement for each process. Should be an integer e.g. --max_cpus 1

Maximum amount of memory that can be requested for any single job.

hidden
type: string
default: 128.GB
pattern: ^\d+(\.\d+)?\.?\s*(K|M|G|T)?B$

Use to set an upper-limit for the memory requirement for each process. Should be a string in the format integer-unit e.g. --max_memory '8.GB'
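The pattern above determines which memory strings pass validation. A short Python sketch for trying values against it; the helper name `is_valid_memory` and the sample values are illustrative.

```python
import re

# Pattern quoted from the parameter definition above.
MEMORY_PATTERN = re.compile(r"^\d+(\.\d+)?\.?\s*(K|M|G|T)?B$")

def is_valid_memory(value):
    """Return True if the value satisfies the schema pattern."""
    return MEMORY_PATTERN.match(value) is not None

for value in ("128.GB", "8 GB", "1.5GB", "8G", "8.gb"):
    print(value, is_valid_memory(value))
```

The unit must be upper case and end in B, so values such as '8G' or '8.gb' are rejected by the pattern.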

Maximum amount of time that can be requested for any single job.

hidden
type: string
default: 240.h
pattern: ^(\d+\.?\s*(s|m|h|day)\s*)+$

Use to set an upper-limit for the time requirement for each process. Should be a string in the format integer-unit e.g. --max_time '2.h'
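The time pattern above accepts one or more number-unit pairs, using the units s, m, h and day. A short Python sketch for trying values against it; the helper name `is_valid_time` and the sample values are illustrative.

```python
import re

# Pattern quoted from the parameter definition above.
TIME_PATTERN = re.compile(r"^(\d+\.?\s*(s|m|h|day)\s*)+$")

def is_valid_time(value):
    """Return True if the value satisfies the schema pattern."""
    return TIME_PATTERN.match(value) is not None

for value in ("240.h", "1day 6h", "90m", "2d", "1.5h"):
    print(value, is_valid_time(value))
```

Only whole numbers are accepted and 'd' alone is not a recognised unit, so '1.5h' and '2d' are rejected by the pattern.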

Less common options for the pipeline, typically set in a config file.

Display help text.

hidden
type: boolean

Method used to save pipeline results to output directory.

hidden
type: string

The Nextflow publishDir option specifies which intermediate files should be saved to the output directory. This option tells the pipeline what method should be used to move these files. See Nextflow docs for details.

Email address for completion summary, only when pipeline fails.

hidden
type: string
pattern: ^([a-zA-Z0-9_\-\.]+)@([a-zA-Z0-9_\-\.]+)\.([a-zA-Z]{2,5})$

An email address to send a summary email to when the pipeline is completed - ONLY sent if the pipeline does not exit successfully.

Send plain-text email instead of HTML.

hidden
type: boolean

File size limit when attaching MultiQC reports to summary emails.

hidden
type: string
default: 25.MB
pattern: ^\d+(\.\d+)?\.?\s*(K|M|G|T)?B$

Do not use coloured log outputs.

hidden
type: boolean

Custom config file to supply to MultiQC.

hidden
type: string

Directory to keep pipeline Nextflow logs and reports.

hidden
type: string
default: ${params.outdir}/pipeline_info

Whether to validate parameters against the schema at runtime.

hidden
type: boolean
default: true

Show all params when using --help

hidden
type: boolean

By default, parameters set as hidden in the schema are not shown on the command line when a user runs with --help. Specifying this option will tell the pipeline to show all parameters.

Run this workflow with Conda. You can also use '-profile conda' instead of providing this parameter.

hidden
type: boolean