rnaseq: Parameters

Define where the pipeline should find input data and save output data.

Path to comma-separated file containing information about the samples in the experiment.

type: string

pattern: ^\S+\.csv$
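The pattern above can be checked before launching a run. What follows is a minimal Python sketch, not part of the pipeline: the helper name `is_valid_samplesheet_path` and the example paths are hypothetical, and only the regular expression is taken from the parameter definition.

```python
import re

# Pattern copied from the parameter definition above.
SAMPLESHEET_PATTERN = re.compile(r"^\S+\.csv$")

def is_valid_samplesheet_path(path: str) -> bool:
    """Return True if the path contains no whitespace and ends in .csv."""
    return SAMPLESHEET_PATTERN.match(path) is not None

print(is_valid_samplesheet_path("samplesheet.csv"))  # True
print(is_valid_samplesheet_path("my samples.csv"))   # False: contains a space
print(is_valid_samplesheet_path("samplesheet.tsv"))  # False: wrong extension
```

Note that the pattern rejects any path containing whitespace, so directory names with spaces will fail validation.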

The output directory where the results will be saved. Use absolute paths when writing to storage on cloud infrastructure.

required

type: string

Email address for completion summary.

type: string

pattern: ^([a-zA-Z0-9_\-\.]+)@([a-zA-Z0-9_\-\.]+)\.([a-zA-Z]{2,5})$
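To see which addresses the email pattern accepts, the following Python sketch applies it to a few made-up addresses. The helper name `is_valid_email` and the addresses are hypothetical; only the regular expression comes from the definition above.

```python
import re

# Pattern copied from the parameter definition above.
EMAIL_PATTERN = re.compile(
    r"^([a-zA-Z0-9_\-\.]+)@([a-zA-Z0-9_\-\.]+)\.([a-zA-Z]{2,5})$"
)

def is_valid_email(address: str) -> bool:
    """Return True if the address matches the schema pattern."""
    return EMAIL_PATTERN.match(address) is not None

print(is_valid_email("user@example.org"))              # True
print(is_valid_email("first.last@lab.example.ac.uk"))  # True
print(is_valid_email("user+tag@example.org"))          # False: '+' is not allowed
print(is_valid_email("user@example.technology"))       # False: final label exceeds 5 letters
```

The pattern is stricter than general email syntax: it allows only letters, digits, underscores, hyphens and dots before the final label, and limits that label to two to five letters.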

MultiQC report title. Printed as page header, used for filename if not otherwise specified.

type: string

Save FastQ files after merging re-sequenced libraries in the results directory.

type: boolean

Options for processing reads with unique molecular identifiers.

Enable UMI-based read deduplication.

type: boolean

UMI pattern to use. Can be either ‘string’ (default) or ‘regex’.

type: string

default: string

Skip the UMI extraction from the read in case the UMIs have been moved to the headers in advance of the pipeline run.

type: boolean

The UMI barcode pattern to use e.g. ‘NNNNNN’ indicates that the first 6 nucleotides of the read are from the UMI.

type: string
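The meaning of a barcode pattern such as ‘NNNNNN’ can be illustrated with a short Python sketch. This is not how UMI-tools is implemented; it only mirrors the description above for patterns made up entirely of `N` characters. The function name `split_umi` and the example read are hypothetical.

```python
def split_umi(read: str, bc_pattern: str) -> tuple[str, str]:
    """Split a read into (UMI, remaining sequence) for an all-N pattern.

    Assumption: the pattern consists only of 'N' characters, each marking
    one leading nucleotide of the read as part of the UMI.
    """
    if not bc_pattern or set(bc_pattern) != {"N"}:
        raise ValueError("this sketch only handles patterns made of 'N'")
    umi_length = len(bc_pattern)
    return read[:umi_length], read[umi_length:]

umi, remainder = split_umi("ACGTTAGGCCATTGCA", "NNNNNN")
print(umi)        # ACGTTA
print(remainder)  # GGCCATTGCA
```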

Generate output stats when running “umi_tools dedup”.

type: boolean

After UMI barcode extraction discard either R1 or R2 by setting this parameter to 1 or 2, respectively.

type: integer

If this option is specified, intermediate FastQ and BAM files produced by UMI-tools are also saved in the results directory.

type: boolean

Options for filtering reads prior to alignment.

Path to comma-separated file containing a list of reference genomes to filter reads against with BBSplit. To use BBSplit, you must also explicitly set --skip_bbsplit false.

type: string

Path to directory or tar.gz archive for pre-built BBSplit index.

type: string

If this option is specified, FastQ files split by reference will be saved in the results directory.

type: boolean

Skip BBSplit for removal of non-reference genome reads.

type: boolean

default: true

Enable the removal of reads derived from ribosomal RNA using SortMeRNA.

type: boolean

Text file containing paths to fasta files (one per line) that will be used to create the database for SortMeRNA.

type: string

default: ${projectDir}/assets/rrna-db-defaults.txt

If this option is specified, intermediate FastQ files containing non-rRNA reads will be saved in the results directory.

type: boolean

Reference genome related files and options required for the workflow.

Name of iGenomes reference.

type: string

Path to FASTA genome file.

type: string

pattern: ^\S+\.fn?a(sta)?(\.gz)?$
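The FASTA pattern accepts several extensions, which is easier to see by testing a few file names. A minimal Python sketch follows; the file names and the helper `is_fasta_path` are hypothetical, and the regular expression is the one given above.

```python
import re

# Pattern copied from the parameter definition above.
FASTA_PATTERN = re.compile(r"^\S+\.fn?a(sta)?(\.gz)?$")

def is_fasta_path(path: str) -> bool:
    """Return True if the path matches the FASTA file name pattern."""
    return FASTA_PATTERN.match(path) is not None

for name in ["genome.fa", "genome.fna", "genome.fasta", "genome.fa.gz",
             "genome.fas", "genome.fasta.bz2"]:
    print(name, is_fasta_path(name))
# genome.fa True
# genome.fna True
# genome.fasta True
# genome.fa.gz True
# genome.fas False
# genome.fasta.bz2 False
```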

Path to GTF annotation file.

type: string

pattern: ^\S+\.gtf(\.gz)?$

Path to GFF3 annotation file.

type: string

pattern: ^\S+\.gff(\.gz)?$

Path to BED file containing gene intervals. This will be created from the GTF file if not specified.

type: string

pattern: ^\S+\.bed(\.gz)?$
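The three annotation patterns above (GTF, GFF and BED) share the same shape, so they can be checked together. The Python sketch below uses hypothetical file names and a hypothetical helper `matches`. As written, the GFF pattern accepts names ending in `.gff` or `.gff.gz`, so a file named with a `.gff3` extension does not match.

```python
import re

# Patterns copied from the parameter definitions above.
ANNOTATION_PATTERNS = {
    "gtf": re.compile(r"^\S+\.gtf(\.gz)?$"),
    "gff": re.compile(r"^\S+\.gff(\.gz)?$"),
    "bed": re.compile(r"^\S+\.bed(\.gz)?$"),
}

def matches(kind: str, path: str) -> bool:
    """Return True if the path matches the pattern for the given file kind."""
    return ANNOTATION_PATTERNS[kind].match(path) is not None

print(matches("gtf", "genes.gtf.gz"))  # True
print(matches("gff", "genes.gff"))     # True
print(matches("gff", "genes.gff3"))    # False: '.gff3' is not covered by the pattern
print(matches("bed", "genes.bed"))     # True
```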

Path to FASTA transcriptome file.

type: string

pattern: ^\S+\.fn?a(sta)?(\.gz)?$

FASTA file to concatenate to genome FASTA file e.g. containing spike-in sequences.

type: string

pattern: ^\S+\.fn?a(sta)?(\.gz)?$

Splice sites file required for HISAT2.

type: string

Path to directory or tar.gz archive for pre-built STAR index.

type: string

Path to directory or tar.gz archive for pre-built HISAT2 index.

type: string

Path to directory or tar.gz archive for pre-built RSEM index.

type: string

Path to directory or tar.gz archive for pre-built Salmon index.

type: string

Minimum memory required to use splice sites and exons in the HISAT2 index build process.

type: string

default: 200.GB

pattern: ^\d+(\.\d+)?\.?\s*(K|M|G|T)?B$
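The memory pattern is case-sensitive and requires a trailing `B`. The Python sketch below tests a few candidate values against it; the helper name `is_valid_memory` and the candidate values other than the default `200.GB` are hypothetical.

```python
import re

# Pattern copied from the parameter definition above.
MEMORY_PATTERN = re.compile(r"^\d+(\.\d+)?\.?\s*(K|M|G|T)?B$")

def is_valid_memory(value: str) -> bool:
    """Return True if the value matches the memory-size pattern."""
    return MEMORY_PATTERN.match(value) is not None

for value in ["200.GB", "128 GB", "8GB", "1.5.GB", "8G", "8gb"]:
    print(value, is_valid_memory(value))
# 200.GB True
# 128 GB True
# 8GB True
# 1.5.GB True
# 8G False
# 8gb False
```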

Specify if your GTF annotation is in GENCODE format.

type: boolean

By default, the pipeline uses the gene_name field to obtain additional gene identifiers from the input GTF file when running Salmon.

type: string

default: gene_name

Define the attribute type used to group features in the GTF file when running Salmon.

type: string

default: gene_id

The attribute type used to group feature types in the GTF file when generating the biotype plot with featureCounts.

type: string

default: gene_biotype

By default, the pipeline assigns reads based on the ‘exon’ attribute within the GTF file.

type: string

default: exon

If generated by the pipeline, save the STAR index in the results directory.

type: boolean

Directory / URL base for iGenomes references.

hidden

type: string

default: s3://ngi-igenomes/igenomes

Do not load the iGenomes reference config.

hidden

type: boolean

Options to adjust read trimming criteria.

Instructs Trim Galore to remove bp from the 5’ end of read 1 (or single-end reads).

type: integer

Instructs Trim Galore to remove bp from the 5’ end of read 2 (paired-end reads only).

type: integer

Instructs Trim Galore to remove bp from the 3’ end of read 1 AFTER adapter/quality trimming has been performed.

type: integer

Instructs Trim Galore to remove bp from the 3’ end of read 2 AFTER adapter/quality trimming has been performed.

type: integer

Instructs Trim Galore to apply the --nextseq=X option, to trim based on quality after removing poly-G tails.

type: integer

Minimum number of trimmed reads below which samples are removed from further processing. Some downstream steps in the pipeline will fail if this threshold is too low.

type: integer

default: 10000
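The effect of this threshold can be sketched in Python. This illustrates the rule as described above (samples below the minimum are dropped) and is not the pipeline's own code; the sample names, read counts and the function name `filter_by_trimmed_reads` are hypothetical.

```python
# Default threshold from the parameter definition above.
MIN_TRIMMED_READS = 10000

def filter_by_trimmed_reads(read_counts: dict[str, int],
                            min_reads: int = MIN_TRIMMED_READS) -> list[str]:
    """Return the names of samples whose trimmed read count is not below the threshold."""
    return [name for name, count in read_counts.items() if count >= min_reads]

# Hypothetical read counts after trimming.
counts = {"sample_A": 2_500_000, "sample_B": 9_999, "sample_C": 10_000}
print(filter_by_trimmed_reads(counts))  # ['sample_A', 'sample_C']
```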

Skip the adapter trimming step.

type: boolean

Save the trimmed FastQ files in the results directory.

type: boolean

Options to adjust parameters and filtering criteria for read alignments.

Specifies the alignment algorithm to use - available options are ‘star_salmon’, ‘star_rsem’ and ‘hisat2’.

type: string

Specifies the pseudo-aligner to use. The only available option is ‘salmon’. Runs in addition to ‘--aligner’.

type: string

Create a CSI index for BAM files instead of the traditional BAI index. This will be required for genomes with larger chromosome sizes.

type: boolean

When using pre-built STAR indices, do not re-extract and use splice junctions from the GTF file.

type: boolean

Override Salmon library type inferred based on strandedness defined in meta object.

type: string

Minimum percentage of uniquely mapped reads below which samples are removed from further processing.

type: number

default: 5
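As a rough illustration of the mapping-rate threshold, the Python sketch below computes the percentage of uniquely mapped reads per sample and keeps those at or above the default of 5. The sample names, counts and function names are hypothetical, and the way the pipeline derives the percentage from aligner logs is not shown here.

```python
# Default threshold (percent) from the parameter definition above.
MIN_MAPPED_READS = 5.0

def percent_uniquely_mapped(unique_reads: int, total_reads: int) -> float:
    """Return uniquely mapped reads as a percentage of total input reads."""
    return 100.0 * unique_reads / total_reads if total_reads else 0.0

def passing_samples(stats: dict[str, tuple[int, int]],
                    min_percent: float = MIN_MAPPED_READS) -> list[str]:
    """Return samples whose uniquely mapped percentage is not below the threshold."""
    return [name for name, (unique, total) in stats.items()
            if percent_uniquely_mapped(unique, total) >= min_percent]

# Hypothetical (uniquely mapped, total) read counts.
stats = {"sample_A": (850_000, 1_000_000), "sample_B": (30_000, 1_000_000)}
print(passing_samples(stats))  # ['sample_A']
```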

Sequencing center information to be added to read group of BAM files.

type: string

Perform reference-guided de novo assembly of transcripts using StringTie, i.e. do not restrict assembly to the transcripts in the GTF file.

type: boolean

Where possible, save unaligned reads from either STAR, HISAT2 or Salmon to the results directory.

type: boolean

Save the intermediate BAM files from the alignment step.

type: boolean

Skip the Picard MarkDuplicates step.

type: boolean

Skip all of the alignment-based processes within the pipeline.

type: boolean

Options to skip various steps within the workflow.

Specify the RSeQC modules to run.

type: string

default: bam_stat,inner_distance,infer_experiment,junction_annotation,junction_saturation,read_distribution,read_duplication
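The module list is a single comma-separated string. The Python sketch below splits the default value and reports which default modules a narrower selection leaves out. The helper name `parse_modules` and the example selection are hypothetical; only the default string is taken from the definition above.

```python
# Default value copied from the parameter definition above.
DEFAULT_RSEQC_MODULES = (
    "bam_stat,inner_distance,infer_experiment,junction_annotation,"
    "junction_saturation,read_distribution,read_duplication"
)

def parse_modules(value: str) -> list[str]:
    """Split a comma-separated module string into a list of module names."""
    return [name.strip() for name in value.split(",") if name.strip()]

default_modules = parse_modules(DEFAULT_RSEQC_MODULES)
requested = parse_modules("bam_stat,infer_experiment,read_distribution")
left_out = [name for name in default_modules if name not in requested]

print(len(default_modules))  # 7
print(left_out)
# ['inner_distance', 'junction_annotation', 'junction_saturation', 'read_duplication']
```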

Use vst transformation instead of rlog with DESeq2.

type: boolean

Skip bigWig file creation.

type: boolean

Skip StringTie.

type: boolean

Skip FastQC.

type: boolean

Skip Preseq.

type: boolean

Skip dupRadar.

type: boolean

Skip Qualimap.

type: boolean

Skip RSeQC.

type: boolean

Skip additional featureCounts process for biotype QC.

type: boolean

Skip DESeq2 PCA and heatmap plotting.

type: boolean

Skip MultiQC.

type: boolean

Skip all QC steps except for MultiQC.

type: boolean

Parameters used to describe centralised config profiles. These should not be edited.

Git commit id for Institutional configs.

hidden

type: string

default: master

Base directory for Institutional configs.

hidden

type: string

default: https://raw.githubusercontent.com/nf-core/configs/master

Institutional config name.

hidden

type: string

Institutional config description.

hidden

type: string

Institutional config contact information.

hidden

type: string

Institutional config URL link.

hidden

type: string

Set the top limit for requested resources for any single job.

Maximum number of CPUs that can be requested for any single job.

hidden

type: integer

default: 16

Maximum amount of memory that can be requested for any single job.

hidden

type: string

default: 128.GB

pattern: ^\d+(\.\d+)?\.?\s*(K|M|G|T)?B$

Maximum amount of time that can be requested for any single job.

hidden

type: string

default: 240.h

pattern: ^(\d+\.?\s*(s|m|h|day)\s*)+$
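The time pattern allows one or more number-and-unit pairs, with the units `s`, `m`, `h` and `day`. The Python sketch below checks a few values against it; the helper name `is_valid_time` and all values other than the default `240.h` are hypothetical.

```python
import re

# Pattern copied from the parameter definition above.
TIME_PATTERN = re.compile(r"^(\d+\.?\s*(s|m|h|day)\s*)+$")

def is_valid_time(value: str) -> bool:
    """Return True if the value matches the time-limit pattern."""
    return TIME_PATTERN.match(value) is not None

for value in ["240.h", "90m", "1day 6h", "2 days", "1.5h"]:
    print(value, is_valid_time(value))
# 240.h True
# 90m True
# 1day 6h True
# 2 days False
# 1.5h False
```

The plural `days` and fractional values such as `1.5h` are rejected because the pattern only admits whole numbers followed by one of the four listed units.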

Less common options for the pipeline, typically set in a config file.

Display help text.

hidden

type: boolean

Method used to save pipeline results to output directory.

hidden

type: string

Email address for completion summary, only when pipeline fails.

hidden

type: string

pattern: ^([a-zA-Z0-9_\-\.]+)@([a-zA-Z0-9_\-\.]+)\.([a-zA-Z]{2,5})$

Send plain-text email instead of HTML.

hidden

type: boolean

File size limit when attaching MultiQC reports to summary emails.

hidden

type: string

default: 25.MB

Do not use coloured log outputs.

hidden

type: boolean

Custom config file to supply to MultiQC.

hidden

type: string

Directory to keep pipeline Nextflow logs and reports.

hidden

type: string

default: ${params.outdir}/pipeline_info

Whether to validate parameters against the schema at runtime.

hidden

type: boolean

default: true

Show all params when using --help.

hidden

type: boolean

Run this workflow with Conda. You can also use ‘-profile conda’ instead of providing this parameter.

hidden

type: boolean

nf-core/rnaseq