Define where the pipeline should find input data and save output data.

Path to comma-separated file containing information about the samples in the experiment. Mutually exclusive with infolder.

type: string
pattern: ^\S+\.csv$

Path to the folder with FASTQ files. Mutually exclusive with input.

type: string

Number of underscore-separated fields that make up the sample name when infolder is used.

type: integer
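
As a rough illustration of how such a field count could be applied, the sketch below derives a sample name from a FASTQ file name. It assumes that the leading fields form the name, which the description above does not state explicitly. The function name `sample_name` and the example file name are hypothetical and not taken from the pipeline.

```python
from pathlib import Path


def sample_name(fastq_path: str, n_fields: int) -> str:
    """Join the first n_fields underscore-separated fields of the file name.

    Assumption: the sample name is built from the leading fields.
    """
    # Drop the directory, then cut at the first dot to remove
    # extensions such as ".fastq.gz".
    stem = Path(fastq_path).name.split(".", 1)[0]
    return "_".join(stem.split("_")[:n_fields])


# Hypothetical file name with five underscore-separated fields.
name = sample_name("reads/liver_rep1_S3_L001_R1.fastq.gz", 2)
```

With a field count of 2, the hypothetical file above would be assigned the sample name liver_rep1.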

The output directory where the results will be saved. You have to use absolute paths to storage on Cloud infrastructure.

required
type: string

Email address for completion summary.

type: string
pattern: ^([a-zA-Z0-9_\-\.]+)@([a-zA-Z0-9_\-\.]+)\.([a-zA-Z]{2,5})$
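
To see what this pattern accepts, it can be tried outside the pipeline. The sketch below copies the regular expression verbatim; the helper name `is_valid_email` and the example addresses are made up for illustration.

```python
import re

# Pattern copied from the parameter definition above.
EMAIL_PATTERN = re.compile(
    r"^([a-zA-Z0-9_\-\.]+)@([a-zA-Z0-9_\-\.]+)\.([a-zA-Z]{2,5})$"
)


def is_valid_email(address: str) -> bool:
    """Return True if the address matches the schema pattern."""
    return EMAIL_PATTERN.match(address) is not None


for address in ("user@example.com", "user+tag@example.com", "user@example.museum"):
    print(address, is_valid_email(address))
```

Note that the pattern is stricter than general email syntax: it rejects a plus sign in the local part and top-level domains longer than five letters.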

MultiQC report title. Printed as page header, used for filename if not otherwise specified.

type: string

Reference genome related files and options required for the workflow.

Name of iGenomes reference.

type: string

Path to FASTA genome file.

type: string
pattern: ^\S+\.fn?a(sta)?(\.gz)?$
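
The pattern restricts which file names are accepted. The following sketch, with a hypothetical helper `is_fasta_path` and made-up file names, shows which extensions pass.

```python
import re

# Pattern copied from the parameter definition above.
FASTA_PATTERN = re.compile(r"^\S+\.fn?a(sta)?(\.gz)?$")


def is_fasta_path(path: str) -> bool:
    """Return True if the path has an accepted FASTA extension and no whitespace."""
    return FASTA_PATTERN.match(path) is not None


for path in ("genome.fa", "genome.fna", "genome.fasta.gz", "genome.fastq", "my genome.fa"):
    print(path, is_fasta_path(path))
```

Paths ending in .fa, .fna, or .fasta, optionally gzipped, are accepted; FASTQ files and paths containing whitespace are rejected.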

Do not load the iGenomes reference config.

hidden
type: boolean

The base path to the igenomes reference files

hidden
type: string
default: s3://ngi-igenomes/igenomes/

Parameters used to describe centralised config profiles. These should not be edited.

Git commit id for Institutional configs.

hidden
type: string
default: master

Base directory for Institutional configs.

hidden
type: string
default: https://raw.githubusercontent.com/nf-core/configs/master

Institutional config name.

hidden
type: string

Institutional config description.

hidden
type: string

Institutional config contact information.

hidden
type: string

Institutional config URL link.

hidden
type: string

Less common options for the pipeline, typically set in a config file.

Display version and exit.

hidden
type: boolean

Method used to save pipeline results to output directory.

hidden
type: string

Email address for completion summary, only when pipeline fails.

hidden
type: string
pattern: ^([a-zA-Z0-9_\-\.]+)@([a-zA-Z0-9_\-\.]+)\.([a-zA-Z]{2,5})$

Send plain-text email instead of HTML.

hidden
type: boolean

File size limit when attaching MultiQC reports to summary emails.

hidden
type: string
default: 25.MB
pattern: ^\d+(\.\d+)?\.?\s*(K|M|G|T)?B$
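
The size pattern can be checked and converted to bytes as in the sketch below. The function name `parse_size` is hypothetical, and the use of 1024-based multiples is an assumption about how the unit is interpreted, not something stated above.

```python
import re

# Pattern copied from the parameter definition above.
SIZE_PATTERN = re.compile(r"^\d+(\.\d+)?\.?\s*(K|M|G|T)?B$")

# Assumption: units are binary multiples (1 KB = 1024 bytes).
UNIT_FACTORS = {None: 1, "K": 1024, "M": 1024**2, "G": 1024**3, "T": 1024**4}


def parse_size(text: str) -> int:
    """Validate a size string against the pattern and return the size in bytes."""
    match = SIZE_PATTERN.match(text)
    if match is None:
        raise ValueError(f"not a valid size: {text!r}")
    # The leading number, with an optional decimal part.
    number = float(re.match(r"\d+(\.\d+)?", text).group(0))
    return int(number * UNIT_FACTORS[match.group(2)])


limit_bytes = parse_size("25.MB")
```

The pattern is case-sensitive, so a value such as 25mb would be rejected.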

Do not use coloured log outputs.

hidden
type: boolean

Incoming hook URL for messaging service

hidden
type: string

Custom config file to supply to MultiQC.

hidden
type: string

Custom logo file to supply to MultiQC. File name must also be set in the MultiQC config file

hidden
type: string

Custom MultiQC yaml file containing HTML including a methods description.

type: string

Whether to validate parameters against the schema at runtime.

hidden
type: boolean
default: true

Base URL or local path to location of pipeline test dataset files

hidden
type: string
default: https://raw.githubusercontent.com/nf-core/test-datasets/

Suffix to add to the trace report filename. Default is the date and time in the format yyyy-MM-dd_HH-mm-ss.

hidden
type: string

Display the help message.

type: boolean,string

Display the full detailed help message.

type: boolean

Display hidden parameters in the help message (only works when --help or --help_full are provided).

type: boolean

Parameters for options prior to mapping

Whether to save merged fasta files

type: boolean
default: true

Whether to keep only reads that start with a G base

type: boolean

Additional parameters that can be passed to TrimGalore!

type: string

Makes the pipeline skip the G-trimming step in preprocessing

type: boolean

Parameters for mapping

Sequencing platform used. Required for mapping with STAR

type: string

Name of the sequencing center. Required for mapping with STAR

type: boolean

Whether only uniquely mapped reads should be considered for downstream analysis.

type: boolean
default: true

Switches the aligner from STAR to bowtie2

type: boolean

Switches on PCR duplicate removal

type: boolean

Sets an optical duplicate distance, used together with dedup

type: integer

Parameters for creating references

Genome annotation file in GTF format

required
type: string

Specifies a directory with a genome index

type: string

Name of the reference genome. It is used as meta information

type: string

Seed file for BSgenome forging

type: string

Directory containing either a set of FASTA files, one per reference chromosome, or a 2bit file for the whole reference genome. Used for BSgenome forging

type: string

BSgenome R package to use (if not forged)

type: string

Parameters for the underlying CAGEr package

The input CSV samplesheet including the name of the samples, their pairedness status, and the location of bigwig or bam files. Required when cageronly is true.

type: string

Format of the mapping data file passed to the TSS analysis part when STAR is used (either ‘bam’ or ‘bigwig’).

type: string
default: bigwig

Threshold above which raw and normalized CTSS are considered for the correlation plot

type: integer
default: 1

Method used for normalizing the samples: powerLaw, simpleTpm, and none are supported

type: string
default: powerLaw
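
Of the listed methods, simpleTpm is the easiest to illustrate. The sketch below shows plain tags-per-million scaling, which is what the name suggests; it is an illustration of the idea and not the package's actual implementation. The function name `simple_tpm` and the counts are made up.

```python
def simple_tpm(counts):
    """Scale raw tag counts of one sample so that they sum to one million."""
    total = sum(counts)
    if total == 0:
        # Nothing to scale; avoid division by zero.
        return [0.0 for _ in counts]
    return [count * 1_000_000 / total for count in counts]


# Three CTSS with 1, 3 and 6 tags in a sample of 10 tags in total.
normalized = simple_tpm([1, 3, 6])
```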

Defines the lower threshold for fitting the power-law distribution

type: integer
default: 5

Defines the upper threshold for fitting the power-law distribution

type: integer
default: 10000

User-specified alpha, that is, the negative of the fitted slope in the log-log representation of the power-law distribution. If not set, the average across samples is calculated and used.

type: string
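
The relationship between alpha, the fitting range, and the log-log slope can be sketched as follows. The code assumes that the fit is an ordinary least-squares line through the reverse cumulative distribution (number of CTSS with at least x tags) within the lower and upper thresholds; the function name `fit_alpha` and the synthetic data are hypothetical, and the actual package may fit differently.

```python
import math
from collections import Counter


def fit_alpha(tag_counts, lower=5, upper=10000):
    """Return the negative slope of log10(reverse cumulative count) vs log10(tag count)."""
    freq = Counter(tag_counts)
    points = []
    running = 0
    # Walk from the highest tag count down, accumulating how many
    # CTSS have at least x tags.
    for x in sorted(freq, reverse=True):
        running += freq[x]
        if lower <= x <= upper:
            points.append((math.log10(x), math.log10(running)))
    if len(points) < 2:
        raise ValueError("need at least two distinct tag counts in range")
    n = len(points)
    mean_x = sum(px for px, _ in points) / n
    mean_y = sum(py for _, py in points) / n
    sxx = sum((px - mean_x) ** 2 for px, _ in points)
    sxy = sum((px - mean_x) * (py - mean_y) for px, py in points)
    return -sxy / sxx


# Synthetic counts whose reverse cumulative distribution halves
# whenever the tag count doubles, so the slope is -1 within the range.
counts = [40] * 10 + [20] * 10 + [10] * 20 + [5] * 40 + [1] * 500
alpha = fit_alpha(counts)
```

Values below the lower threshold (here the 500 single-tag CTSS) still contribute to the cumulative counts but are excluded from the fit itself.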

Total number of CAGE tags in the reference power-law distribution. Setting it to 1,000,000 results in normalized tags per million (TPM) values.

type: integer
default: 1000000

Parameters for filtering low expressed CTSS before clustering. ctss_thr specifies the lower threshold above which CTSS are considered, and sample_num_thr specifies the number of samples where this threshold should be passed.

type: integer
default: 1

Parameters for filtering low expressed CTSS before clustering. ctss_thr specifies the lower threshold above which CTSS are considered, and sample_num_thr specifies the number of samples where this threshold should be passed.

type: integer
default: 1
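
How the two filtering thresholds interact can be sketched as below. The code assumes that both boundaries are inclusive, which the description does not specify; the function name `filter_ctss` and the example values are hypothetical.

```python
def filter_ctss(tpm_by_ctss, ctss_thr=1, sample_num_thr=1):
    """Keep CTSS that reach ctss_thr in at least sample_num_thr samples.

    tpm_by_ctss maps a CTSS identifier to its values across samples.
    Assumption: both thresholds are inclusive.
    """
    kept = {}
    for ctss, values in tpm_by_ctss.items():
        passing = sum(1 for value in values if value >= ctss_thr)
        if passing >= sample_num_thr:
            kept[ctss] = values
    return kept


# Hypothetical values for three CTSS across three samples.
data = {
    "chr1:100:+": [0.5, 0.2, 0.0],
    "chr1:250:+": [3.0, 0.4, 1.0],
    "chr2:75:-": [1.2, 2.5, 4.0],
}
kept = filter_ctss(data, ctss_thr=1, sample_num_thr=2)
```

Raising sample_num_thr makes the filter stricter: with a value of 3, only the CTSS expressed in every sample would remain.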

Maximum distance for distance-based clustering (distclu)

type: integer
default: 20

The TPM threshold above which even a single CTSS is kept during clustering

type: integer
default: 5
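
The two clustering parameters above can be illustrated with a minimal distance-based clustering sketch. It assumes that neighbouring CTSS at most the maximum distance apart join the same cluster and that a singleton is kept when its signal reaches the threshold; whether these boundaries are inclusive in the real implementation is not stated here. The function and argument names (`distclu`, `max_dist`, `keep_singletons_above`) are chosen for illustration.

```python
def distclu(ctss, max_dist=20, keep_singletons_above=5):
    """Group CTSS on one chromosome and strand into tag clusters.

    ctss is a list of (position, tpm) tuples.
    """
    clusters = []
    current = []
    for pos, tpm in sorted(ctss):
        # Start a new cluster when the gap to the previous CTSS is too large.
        if current and pos - current[-1][0] > max_dist:
            clusters.append(current)
            current = []
        current.append((pos, tpm))
    if current:
        clusters.append(current)
    # Drop single-CTSS clusters unless their signal reaches the threshold.
    return [
        cluster
        for cluster in clusters
        if len(cluster) > 1 or cluster[0][1] >= keep_singletons_above
    ]


# Hypothetical CTSS: three close together, one weak and one strong singleton.
example = [(100, 2.0), (110, 1.0), (125, 3.0), (200, 1.5), (300, 8.0)]
clusters = distclu(example)
```

In this example the weak singleton at position 200 is discarded, while the strong singleton at position 300 survives as its own cluster.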

Defines the lower quantile boundary of the interquantile range

type: number
default: 0.1

Defines the upper quantile boundary of the interquantile range

type: number
default: 0.9
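
The lower and upper quantile boundaries translate into positions within a tag cluster. The sketch below assumes a quantile position is the first position at which the cumulative signal reaches the given fraction of the cluster total, and that the width counts both end positions. The function names and the example cluster are hypothetical.

```python
def quantile_position(positions, tpm, q):
    """Return the first position where the cumulative signal reaches q of the total."""
    total = sum(tpm)
    running = 0.0
    for pos, value in zip(positions, tpm):
        running += value
        if running >= q * total:
            return pos
    return positions[-1]


def quantile_width(positions, tpm, q_low=0.1, q_up=0.9):
    """Width in bases between the lower and upper quantile positions, inclusive."""
    low = quantile_position(positions, tpm, q_low)
    up = quantile_position(positions, tpm, q_up)
    return up - low + 1


# Hypothetical cluster: sorted positions and their signal.
positions = [100, 101, 103, 104, 110]
tpm = [1, 2, 10, 4, 3]
width = quantile_width(positions, tpm)
```

Compared with the full span of the cluster, a width computed this way is less sensitive to a few weak CTSS at the cluster edges.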

Threshold above which tag clusters are considered for the interquantile width distribution plot

type: integer
default: 3

Upstream distance to consider into TSS region for ChIPseeker annotation. Should be negative.

type: integer
default: -3000

Downstream distance to consider into TSS region for ChIPseeker annotation. Should be positive.

type: integer
default: 3000

The number of bases to include upstream of the TSS for TSS logos

type: integer
default: 35

Used for defining the consensus clusters. consensus_thr specifies the TPM threshold above which tag clusters are considered for consensus clusters, and consensus_dist defines the maximum distance between the interquantile ranges of tag clusters to be joined together into consensus clusters.

type: integer
default: 2

Used for defining the consensus clusters. consensus_thr specifies the TPM threshold above which tag clusters are considered for consensus clusters, and consensus_dist defines the maximum distance between the interquantile ranges of tag clusters to be joined together into consensus clusters.

type: integer
default: 100
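
The interplay of consensus_thr and consensus_dist can be sketched as an interval merge. The code assumes each tag cluster is represented by the range between its quantile positions, that clusters from all samples on one strand are pooled, and that both thresholds are inclusive; none of these details are confirmed by the description. The function name `consensus_clusters` and the example ranges are hypothetical.

```python
def consensus_clusters(tag_clusters, consensus_thr=2, consensus_dist=100):
    """Merge tag cluster ranges into consensus clusters.

    tag_clusters is a list of (start, end, tpm) tuples on one strand.
    """
    # Keep only clusters whose signal reaches the TPM threshold.
    kept = sorted((start, end) for start, end, tpm in tag_clusters if tpm >= consensus_thr)
    merged = []
    for start, end in kept:
        # Join with the previous range when the gap is small enough.
        if merged and start - merged[-1][1] <= consensus_dist:
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged


# Hypothetical tag clusters; the third one is below the TPM threshold.
tag_clusters = [(100, 150, 5.0), (200, 240, 3.0), (400, 420, 1.0), (500, 530, 10.0)]
consensus = consensus_clusters(tag_clusters)
```

Here the first two ranges are 50 bases apart and are joined, whereas the last range is too far from them and forms its own consensus cluster.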

Parameters for the underlying CAGEfightR package from which enhancer calling is used

Defines the balance threshold above which bidirectionality is considered balanced and enhancers are called

type: number
default: 0.95

Used for selecting only supported enhancers. unexpressed is a non-inclusive lower TPM boundary for expression when calculating support of enhancers. minSamples is a non-inclusive lower boundary for the number of samples where the clusters should show bidirectionality.

type: integer

Used for selecting only supported enhancers. unexpressed is a non-inclusive lower TPM boundary for expression when calculating support of enhancers. minSamples is a non-inclusive lower boundary for the number of samples where the clusters should show bidirectionality.

type: integer
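
Because both boundaries are non-inclusive, strict comparisons apply at each step. The sketch below is a simplification: it counts samples by a single expression value per cluster, whereas real support for bidirectionality would require signal on both strands. The function name `supported_enhancers` and the values are hypothetical.

```python
def supported_enhancers(tpm_by_cluster, unexpressed, min_samples):
    """Return clusters expressed above `unexpressed` in more than `min_samples` samples.

    Simplification: one expression value per sample stands in for
    the per-strand signal used to assess bidirectionality.
    """
    kept = []
    for cluster, values in tpm_by_cluster.items():
        # Strict comparisons: both boundaries are non-inclusive.
        support = sum(1 for value in values if value > unexpressed)
        if support > min_samples:
            kept.append(cluster)
    return kept


# Hypothetical candidate enhancers across three samples.
candidates = {
    "enh1": [0.0, 0.0, 0.3],
    "enh2": [1.2, 0.8, 0.0],
    "enh3": [0.0, 0.0, 0.0],
}
kept = supported_enhancers(candidates, unexpressed=0, min_samples=1)
```

With min_samples set to 1, a candidate seen in exactly one sample is dropped, because the boundary itself is excluded.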

Which mode the pipeline should run in

Run the whole pipeline

type: boolean
default: true

Run only the mapping part, up to the bigWig or BAM files

type: boolean

Run only the CAGEr and CAGEfightR processing parts, starting from bigWig or BAM files

type: boolean