nf-core/differentialabundance
Differential abundance analysis for feature/observation matrices from platforms such as RNA-seq
Define where the pipeline should find input data and save output data.
A string to identify results in the output directory
string
study
Also used as an identifier in some processes
A string identifying the technology used to produce the data
string
Currently 'rnaseq' or 'affy_array' may be specified.
Path to comma-separated file containing information about the samples in the experiment.
string
^\S+\.(csv|tsv|txt)$
You will need to create a design file with information about the samples in your experiment before running the pipeline. Use this parameter to specify its location. It has to be a comma-separated file with 3 columns, and a header row. See usage docs.
A CSV file describing sample contrasts
string
^\S+\.(csv|tsv|txt)$
This file is used to define groups of samples from 'input' to compare. It must contain at least the columns 'variable', 'reference', 'target' and 'blocking', where 'variable' is a column in the input sample sheet, 'reference' and 'target' are values in that column, and blocking is a colon-separated list of additional 'blocking' variables (can be an empty string)
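As a rough illustration of the layout described above, the following sketch parses a contrasts file and splits the colon-separated 'blocking' field. The function name `parse_contrasts`, the extra `id` column and the example values are assumptions made for illustration; the pipeline's own parsing may differ.

```python
import csv
import io

REQUIRED_COLUMNS = ("variable", "reference", "target", "blocking")


def parse_contrasts(text):
    """Parse contrasts CSV text into dicts, splitting 'blocking' on colons."""
    reader = csv.DictReader(io.StringIO(text))
    missing = [c for c in REQUIRED_COLUMNS if c not in (reader.fieldnames or [])]
    if missing:
        raise ValueError(f"contrasts file is missing columns: {missing}")
    contrasts = []
    for row in reader:
        # An empty string means "no blocking variables".
        blocking = row["blocking"] or ""
        row["blocking"] = [b for b in blocking.split(":") if b]
        contrasts.append(row)
    return contrasts


# Hypothetical example: 'treatment', 'batch' and 'sex' would be sample sheet columns.
example = """id,variable,reference,target,blocking
treated_vs_control,treatment,control,treated,
treated_vs_control_blocked,treatment,control,treated,batch:sex
"""
parsed = parse_contrasts(example)
```

Here the first contrast has no blocking variables and the second blocks on two sample sheet columns.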
The output directory where the results will be saved. You have to use absolute paths to storage on Cloud infrastructure.
string
Type of abundance measure used, platform-dependent
string
counts
Ways of providing your abundance values
TSV-format abundance matrix
string
^\S+\.(tsv|csv|txt)$
For example an expression matrix output from the nf-core/rnaseq workflow. There must be a column in this matrix for every row in the input sample sheet.
Not a required input if providing CEL files for affymetrix preprocessing.
(RNA-seq only): optional transcript length matrix with the same samples and genes as the abundance matrix
string
If provided, this file will be used to provide transcript lengths to DESeq2 to model length bias across samples
Alternative to matrix: a compressed CEL files archive such as often found in GEO
string
null
Use this option to provide a raw archive of CEL files from Affymetrix arrays. Will be ignored if a matrix is specified.
Use SOFT files from GEO by providing the GSE study identifier
string
null
Use this option to provide a GSE study identifier.
Column in the samples sheet to be used as the primary sample identifier
string
sample
Type of observation
string
sample
This is used in reporting to refer to the observations. Frequently this is 'sample' (e.g. in RNA-seq experiments), but it may also be desirable to refer to 'pool', or 'individual'.
Column in the sample sheet to be used as the display identifier for observations. If unset, will use value of --observations_id_col.
string
Options related to features
Feature ID attribute in the abundance table as well as in the GTF file (e.g. the gene_id field)
string
gene_id
Feature name attribute in the abundance table as well as in the GTF file (e.g. the gene symbol field)
string
gene_name
Type of feature we have, often 'gene'
string
gene
When set, use the control features in scaling/normalisation
boolean
Use supplied control features in normalisation/scaling operations?
A text file listing technical features (e.g. spikes)
string
One feature per row. Note that by default these features will just be stripped from matrices prior to internal processing. To actually use them in e.g. normalisation, set --sizefactors_from_controls
Comma-separated string, specifies feature metadata columns to be used for exploratory analysis, platform-specific
string
gene_id,gene_name,gene_biotype
This parameter allows you to supply your own feature annotations. These can often be automatically derived from the GTF used upstream for RNA-seq, or from the Bioconductor annotation package (for affy arrays).
string
^\S+\.(csv|tsv|txt)$
Where a GTF file is supplied, which feature type to use
string
transcript
Where a GTF file is supplied, which field should go first in the converted output table
string
gene_id
Options for processing of affy arrays with justRMA()
Column of the sample sheet containing the Affymetrix CEL file name
string
file
logical value. If TRUE, then background correct using RMA background correction.
boolean
true
integer value indicating which RMA background to use
integer
2
1: use background similar to pure R rma background given in affy version 1.0 - 1.0.2
2: use background similar to pure R rma background given in affy version 1.1 and above
logical value. If TRUE, then works on the PM matrix in place as much as possible, good for large datasets.
boolean
Used to specify the name of an alternative cdf package. If set to NULL, then the usual cdf package based on Affymetrix' mappings will be used.
string
null
logical value. If TRUE, a matrix of probe annotations will be derived.
boolean
true
should the spots marked as 'MASKS' be set to NA?
boolean
should the spots marked as 'OUTLIERS' be set to NA?
boolean
if TRUE, then overrides what is in rm.mask and rm.outliers.
boolean
Options for processing of proteomics MaxQuant tables with the Proteus R package
Prefix of the column names of the MaxQuant proteingroups table in which the intensity values are saved; the prefix has to be followed by the sample names that are also found in the samplesheet. Default: 'LFQ intensity'; will search for both the prefix as entered and the prefix followed by one whitespace.
string
LFQ intensity
If the sample columns are e.g. called 'LFQ intensity sample1', 'LFQ intensity sample2' etc., please set this parameter to 'LFQ intensity'.
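The prefix matching described above can be sketched as follows. This is an assumption-laden illustration, not the module's code: `intensity_columns` is a hypothetical helper that tries the prefix as entered and then the prefix followed by one space.

```python
def intensity_columns(columns, samples, prefix="LFQ intensity"):
    """Map each sample name to its intensity column, or raise if none is found."""
    available = set(columns)
    mapping = {}
    for sample in samples:
        # Try the prefix exactly as entered, then the prefix plus one space.
        candidates = (prefix + sample, prefix + " " + sample)
        match = next((c for c in candidates if c in available), None)
        if match is None:
            raise KeyError(f"no intensity column found for sample {sample!r}")
        mapping[sample] = match
    return mapping


# Hypothetical proteinGroups header and sample names.
columns = ["Protein IDs", "LFQ intensity sample1", "LFQ intensity sample2"]
mapping = intensity_columns(columns, ["sample1", "sample2"])
```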
Normalization function to use on the MaxQuant intensities.
string
'normalizeMedian' or 'normalizeQuantiles'
Which method to use for plotting sample distributions of the MaxQuant intensities; one of 'violin', 'dist', 'box'.
string
'violin', 'dist' or 'box'
Should a loess line be added to the plot of mean-variance relationship of the conditions? Default: true.
boolean
true
Valid R palette name
string
Set1
Check the content of RColorBrewer::brewer.pal.info from an R terminal for valid palette names.
Options related to filtering upstream of differential analysis
Minimum abundance value
number
1
Minimum observations that must pass the threshold to retain the row/ feature (e.g. gene).
number
1
A minimum proportion of observations, given as a number between 0 and 1, that must pass the threshold. Overrides minimum_samples
number
An optional grouping variable to be used to calculate a min_samples value
string
The variable can be used to define groups and derive a minimum group size upon which to base minimum observation numbers. The rationale for this is to allow retention of features that might be present in only one group. Note that this is consciously NOT filtering with an explicit awareness of groups ("feature must be present in all samples of group A"), since this is known to create biases towards discovery of differential features.
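A minimal sketch of the group-derived threshold, assuming that "pass the threshold" means an abundance greater than or equal to the minimum and that the minimum number of observations is taken from the smallest group. Both helper names are hypothetical.

```python
from collections import Counter


def min_samples_from_groups(groups):
    """Smallest group size, given one group label per observation."""
    return min(Counter(groups).values())


def keep_feature(values, min_abundance=1, min_samples=1):
    """True if enough observations reach the abundance threshold (>= is an assumption)."""
    passing = sum(1 for v in values if v is not None and v >= min_abundance)
    return passing >= min_samples


groups = ["control", "control", "control", "treated", "treated"]
min_samples = min_samples_from_groups(groups)  # smallest group has 2 observations

matrix = {
    "geneA": [0, 0, 0, 12, 9],  # present only in the treated group
    "geneB": [0, 0, 0, 0, 3],   # present in a single observation
}
kept = [gene for gene, values in matrix.items() if keep_feature(values, 1, min_samples)]
```

The filter never asks which group the passing observations belong to; it only borrows the group size, so a feature present in just one group (geneA) is retained without group-aware selection.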
A minimum proportion of observations, given as a number between 0 and 1, that must have a value (not NA) to retain the row/ feature (e.g. gene).
number
0.5
Minimum observations that must have a value (not NA) to retain the row/ feature (e.g. gene). Overrides filtering_min_proportion_not_na.
number
Options related to data exploration
Clustering method used in dendrogram creation
string
ward.D2
Correlation method used in dendrogram creation
string
spearman
Number of features selected before certain exploratory analyses. If -1, will use all features.
integer
500
Length of the whiskers in boxplots as multiple of IQR. Defaults to 1.5.
number
1.5
Threshold on MAD score for outlier identification
integer
-5
MAD = median absolute deviation. A threshold on this value is used to define observations (samples) as outliers, or not, in exploratory plots. Based on the definition at https://wiki.arrayserver.com/wiki/index.php?title=CorrelationQC.pdf.
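For orientation only, a generic MAD-based score can be sketched as below. This is one common formulation (deviation from the median in units of the MAD) and is not necessarily the exact formula in the linked definition; the input values and function names are hypothetical.

```python
from statistics import median


def mad_scores(values):
    """Deviation of each value from the median, in units of the MAD."""
    centre = median(values)
    mad = median(abs(v - centre) for v in values)
    if mad == 0:
        raise ValueError("MAD is zero; scores are undefined")
    return [(v - centre) / mad for v in values]


def flag_outliers(values, threshold=-5):
    """Flag observations whose score falls below the (negative) threshold."""
    return [score < threshold for score in mad_scores(values)]


# Hypothetical per-sample summary values, e.g. median correlation with other samples.
correlations = [0.98, 0.97, 0.99, 0.98, 0.96, 0.80]
flags = flag_outliers(correlations)
```

With a negative threshold such as -5, only observations far below the bulk of the data are flagged.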
How should the main grouping variable be selected? 'auto_pca', 'contrasts', or a valid column name from the observations table.
string
auto_pca
Some plots are only generated once, with a single sample grouping; this option defines how that sample grouping is selected. It should be 'auto_pca' (variable selected from the sample sheet with the most association with the first principal component), 'contrasts' (pick the variable associated with the first contrast), or a value specifying a specific column in the observations.
Specifies assay names to be used for matrices, platform-specific.
string
raw,normalised,variance_stabilised
Specifies final assay to be used for exploratory analysis, platform-specific
string
variance_stabilised
Assays for which to compute log2 during exploratory analysis. Not necessary for maxquant data as this is controlled by the pipeline.
string
Either a comma-separated list of assay positions, e.g. '[1,2,3]', or an empty list '[]' to not log any assay. If not set, will guess which assays need to be logged (those with a maximum > 20).
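The guessing rule (log any assay whose maximum exceeds 20) can be sketched like this. The function name and the toy matrices are illustrative assumptions.

```python
def assays_to_log(assays, max_threshold=20):
    """Return the names of assays whose maximum value exceeds the threshold."""
    return [
        name
        for name, matrix in assays.items()
        if max(max(row) for row in matrix) > max_threshold
    ]


# Hypothetical assays: each matrix is a list of rows (features) of numeric values.
assays = {
    "raw": [[0, 150, 3200], [12, 0, 98]],                        # count-like values
    "variance_stabilised": [[4.1, 7.3, 11.6], [5.0, 3.9, 6.7]],  # already log-like
}
to_log = assays_to_log(assays)
```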
Valid R palette name
string
Set1
Check the content of RColorBrewer::brewer.pal.info from an R terminal for valid palette names.
Options related to differential operations
Advanced option: the suffix associated with tabular differential results tables. Will by default use the appropriate suffix according to the study_type.
string
The feature identifier column in differential results tables
string
gene_id
The fold change column in differential results tables
string
log2FoldChange
The p value column in differential results tables
string
pvalue
The q value column in differential results tables.
string
padj
Minimum fold change used to calculate differential feature numbers
number
2
Maximum p value used to calculate differential feature numbers
number
1
Maximum q value used to calculate differential feature numbers
number
0.05
Where a features file (GTF) has been provided, which attribute to use to name features
string
gene_name
Indicate whether or not fold changes are on the log scale (default is to assume they are)
boolean
true
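A sketch of how the thresholds above could be combined to count differential features, using the default column names 'log2FoldChange', 'pvalue' and 'padj'. The comparison operators (>= and <=), the handling of missing q values and the function name are assumptions for illustration.

```python
import math


def count_differential(rows, min_fold_change=2, max_pval=1, max_qval=0.05, logged=True):
    """Count up- and down-regulated features that pass all thresholds."""
    # On the log2 scale a minimum fold change of 2 becomes a threshold of 1.
    threshold = math.log2(min_fold_change)
    counts = {"up": 0, "down": 0}
    for row in rows:
        pval, qval = row["pvalue"], row["padj"]
        if pval is None or qval is None or pval > max_pval or qval > max_qval:
            continue
        fold_change = row["log2FoldChange"]
        # Unlogged fold changes are assumed to be positive ratios.
        lfc = fold_change if logged else math.log2(fold_change)
        if lfc >= threshold:
            counts["up"] += 1
        elif lfc <= -threshold:
            counts["down"] += 1
    return counts


# Hypothetical differential results rows.
rows = [
    {"gene_id": "g1", "log2FoldChange": 2.5, "pvalue": 0.0001, "padj": 0.001},
    {"gene_id": "g2", "log2FoldChange": -1.2, "pvalue": 0.002, "padj": 0.01},
    {"gene_id": "g3", "log2FoldChange": 0.4, "pvalue": 0.001, "padj": 0.02},
    {"gene_id": "g4", "log2FoldChange": 3.0, "pvalue": 0.04, "padj": 0.2},
]
counts = count_differential(rows)
```

In this example g3 fails the fold change threshold and g4 fails the q value threshold.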
Valid R palette name
string
Set1
Check the content of RColorBrewer::brewer.pal.info from an R terminal for valid palette names.
In differential analysis (DESeq2 or limma), subset to the contrast samples before modelling variance?
boolean
'test' parameter passed to DESeq()
string
either "Wald" or "LRT", which will then use either Wald significance tests (defined by nbinomWaldTest), or the likelihood ratio test on the difference in deviance between a full and reduced model formula (defined by nbinomLRT)
'fitType' parameter passed to DESeq()
string
either "parametric", "local", "mean", or "glmGamPoi" for the type of fitting of dispersions to the mean intensity. See estimateDispersions for description.
'sfType' parameter passed to DESeq()
string
either "ratio", "poscounts", or "iterate" for the type of size factor estimation. See estimateSizeFactors for description.
'minReplicatesForReplace' parameter passed to DESeq()
integer
7
the minimum number of replicates required in order to use replaceOutliers on a sample. If there are samples with so many replicates, the model will be refit after replacing these outliers, flagged by Cook's distance. Set to Inf in order to never replace outliers. It is set to Inf for fitType="glmGamPoi".
'useT' parameter passed to DESeq()
boolean
logical, passed to nbinomWaldTest, default is FALSE, where Wald statistics are assumed to follow a standard Normal
'independentFiltering' parameter passed to results()
boolean
true
logical, whether independent filtering should be applied automatically
'lfcThreshold' parameter passed to results()
integer
a non-negative value which specifies a log2 fold change threshold. The default value is 0, corresponding to a test that the log2 fold changes are equal to zero. The user can specify the alternative hypothesis using the altHypothesis argument, which defaults to testing for log2 fold changes greater in absolute value than a given threshold. If lfcThreshold is specified, the results are for Wald tests, and LRT p-values will be overwritten.
'altHypothesis' parameter passed to results()
string
greaterAbs
character which specifies the alternative hypothesis, i.e. those values of log2 fold change which the user is interested in finding. The complement of this set of values is the null hypothesis which will be tested. If the log2 fold change specified by 'name' or by 'contrast' is written as beta, then the possible values for 'altHypothesis' represent the following alternative hypotheses: 1) greaterAbs: |beta| > lfcThreshold, and p-values are two-tailed; 2) lessAbs: |beta| < lfcThreshold, p-values are the maximum of the upper and lower tests, and the Wald statistic given is positive, an SE-scaled distance from the closest boundary; 3) greater: beta > lfcThreshold; 4) less: beta < -lfcThreshold
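The four regions can be written out as simple predicates on beta. This sketch only describes which fold changes belong to each alternative hypothesis; it does not compute Wald statistics or p-values, and `in_alternative` is a hypothetical helper, not a DESeq2 function.

```python
def in_alternative(beta, lfc_threshold=0.0, alt_hypothesis="greaterAbs"):
    """True if a log2 fold change lies in the region named by alt_hypothesis."""
    if lfc_threshold < 0:
        raise ValueError("lfc_threshold must be non-negative")
    regions = {
        "greaterAbs": abs(beta) > lfc_threshold,
        "lessAbs": abs(beta) < lfc_threshold,
        "greater": beta > lfc_threshold,
        "less": beta < -lfc_threshold,
    }
    if alt_hypothesis not in regions:
        raise ValueError(f"unknown alternative hypothesis: {alt_hypothesis!r}")
    return regions[alt_hypothesis]


# A fold change of -1.5 with a threshold of 1:
examples = {
    name: in_alternative(-1.5, 1.0, name)
    for name in ("greaterAbs", "lessAbs", "greater", "less")
}
```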
'pAdjustMethod' parameter passed to results()
string
BH
the method to use for adjusting p-values, see help in R for the p.adjust() function (via ?p.adjust). At time of writing available values were "holm", "hochberg", "hommel", "bonferroni", "BH", "BY", "fdr", "none".
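To make the default "BH" option concrete, here is a small pure-Python sketch of the Benjamini-Hochberg adjustment. It is an illustration of the procedure, not the R implementation, and it does not handle missing values.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    n = len(pvalues)
    # Walk from the largest p-value to the smallest, keeping a running minimum
    # so that adjusted values never increase as the raw p-value decreases.
    order = sorted(range(n), key=lambda i: pvalues[i], reverse=True)
    adjusted = [0.0] * n
    running_min = 1.0
    for position, index in enumerate(order):
        rank = n - position  # rank of this p-value in ascending order
        running_min = min(running_min, pvalues[index] * n / rank)
        adjusted[index] = running_min
    return adjusted


adjusted = benjamini_hochberg([0.01, 0.04, 0.03, 0.005])
```

For these four p-values the adjusted values are 0.02, 0.04, 0.04 and 0.02 (up to floating point rounding).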
'alpha' parameter passed to results()
number
0.1
the significance cutoff used for optimizing the independent filtering (by default 0.1). If the adjusted p-value cutoff (FDR) will be a value other than 0.1, alpha should be set to that value.
'minmu' parameter passed to results()
number
0.5
lower bound on the estimated count (used when calculating contrasts)
variance stabilisation method to use when making a variance stabilised matrix
string
'rlog', 'vst' or 'rlog,vst'
Shrink fold changes in results?
boolean
true
'ashr' method is the only method currently implemented
Number of cores
integer
1
Number of cores to use with DESeq()
'blind' parameter for rlog() and/or vst()
boolean
true
logical, whether to blind the transformation to the experimental design
'nsub' parameter passed to vst()
integer
1000
the number of genes to subset to (default 1000)
passed to lmFit(), positive integer giving the number of times each distinct probe is printed on each array.
number
passed to lmFit(), positive integer giving the spacing between duplicate occurrences of the same probe, spacing=1 for consecutive rows.
string
null
Sample sheet column to be used to derive a vector or factor specifying a blocking variable on the arrays
string
null
passed to lmFit(), the inter-duplicate or inter-technical replicate correlation
string
null
passed to lmFit(), the fitting method
string
"ls" for least squares or "robust" for robust regression
passed to eBayes(), a numeric value between 0 and 1, assumed proportion of genes which are differentially expressed
number
0.01
passed to eBayes(), logical, should an intensity-dependent trend be allowed for the prior variance?
boolean
If FALSE then the prior variance is constant. Alternatively, trend can be a row-wise numeric vector, which will be used as the covariate for the prior variance.
passed to eBayes(), logical, should the estimation of df.prior and var.prior be robustified against outlier sample variances?
boolean
passed to eBayes, comma separated string of two values, assumed lower and upper limits for the standard deviation of log2-fold-changes for differentially expressed genes
string
0.1,4
passed to eBayes, comma separated string of length 1 or 2, giving left and right tail proportions of x to Winsorize. Used only when robust=TRUE.
string
0.05,0.1
passed to topTable(), minimum absolute log2-fold-change required
integer
topTable and topTableF include only genes with (at least one) absolute log-fold-change greater than lfc. topTreat does not remove genes but ranks genes by evidence that their log-fold-change exceeds lfc.
passed to topTable(), logical, should 95% confidence intervals be output for logFC? Alternatively, can take a numeric value between zero and one specifying the confidence level required.
boolean
passed to topTable(), method used to adjust the p-values for multiple testing.
string
cutoff value for adjusted p-values. Only genes with lower p-values are listed.
number
1
Set to run GSEA to infer differential gene sets in contrasts
boolean
Permutation type
string
Select the type of permutation to perform in assessing the statistical significance of the enrichment score. (See 'required fields' at https://www.gsea-msigdb.org/gsea/doc/GSEAUserGuideFrame.html?Run_GSEA_Page for more info)
Number of permutations
integer
1000
Specify the number of permutations to perform in assessing the statistical significance of the enrichment score. (See 'required fields' at https://www.gsea-msigdb.org/gsea/doc/GSEAUserGuideFrame.html?Run_GSEA_Page)
Enrichment statistic
string
See 'basic fields' at https://www.gsea-msigdb.org/gsea/doc/GSEAUserGuideFrame.html?Run_GSEA_Page for a detailed explanation.
Metric for ranking genes
string
See https://www.gsea-msigdb.org/gsea/doc/GSEAUserGuideTEXT.htm#_Metrics_for_Ranking for a detailed explanation.
Gene list sorting mode
string
GSEA ranks the genes in the expression dataset and then analyzes that ranked list of genes. Use this parameter to determine whether to sort the genes using the real (default) or absolute value of the ranking metric.
See 'basic fields' at https://www.gsea-msigdb.org/gsea/doc/GSEAUserGuideFrame.html?Run_GSEA_Page
Gene list ordering mode
string
GSEA ranks the genes in the expression dataset and then analyzes that ranked list of genes. Use this parameter to determine whether to sort the genes in descending (default) or ascending order. Ascending order is usually applicable when the ranking metric is a measure of nearness (how close the genes are to one another) rather than distance.
See 'basic fields' at https://www.gsea-msigdb.org/gsea/doc/GSEAUserGuideFrame.html?Run_GSEA_Page
Max size: exclude larger sets
integer
500
After filtering from the gene sets any gene not in the expression dataset, gene sets larger than this are excluded from the analysis.
See 'basic fields' at https://www.gsea-msigdb.org/gsea/doc/GSEAUserGuideFrame.html?Run_GSEA_Page
Min size: exclude smaller sets
integer
15
After filtering from the gene sets any gene not in the expression dataset, gene sets smaller than this are excluded from the analysis.
See 'basic fields' at https://www.gsea-msigdb.org/gsea/doc/GSEAUserGuideFrame.html?Run_GSEA_Page
Normalisation mode
string
Normalization mode. Method used to normalize the enrichment scores across analyzed gene sets: 'meandiv' (default, GSEA normalizes the enrichment scores as described in Normalized Enrichment Score (NES)) OR 'null' (GSEA does not normalize the enrichment scores).
See 'advanced fields' at https://www.gsea-msigdb.org/gsea/doc/GSEAUserGuideFrame.html?Run_GSEA_Page
Randomization mode
string
Method used to randomly assign phenotype labels to samples for phenotype permutations. Not used for gene_set permutations.
See 'advanced fields' at https://www.gsea-msigdb.org/gsea/doc/GSEAUserGuideFrame.html?Run_GSEA_Page
Make detailed geneset report?
boolean
true
Use median for class metrics
boolean
Set to true (default=false) to use the median of each class, instead of the mean, in the metrics for ranking genes
See 'advanced fields' at https://www.gsea-msigdb.org/gsea/doc/GSEAUserGuideFrame.html?Run_GSEA_Page.
Number of markers
integer
100
Number of features (gene or probes) to include in the butterfly plot in the Gene Markers section of the gene set enrichment report.
See 'advanced fields' at https://www.gsea-msigdb.org/gsea/doc/GSEAUserGuideFrame.html?Run_GSEA_Page.
Plot graphs for the top sets of each phenotype
integer
20
Generates summary plots and detailed analysis results for the top x genes in each phenotype, where x is 20 by default. The top genes are those with the largest normalized enrichment scores.
See 'advanced fields' at https://www.gsea-msigdb.org/gsea/doc/GSEAUserGuideFrame.html?Run_GSEA_Page.
Seed for permutation
string
timestamp
Seed used to generate a random number for phenotype and gene_set permutations: timestamp (default), 149, or user input. The specific seed value (149) generates consistent results, which is useful when testing software.
Save random ranked lists
boolean
Set to 'true' (default=false) to save the random ranked lists of genes created by phenotype permutations. When you save random ranked lists, for each permutation, GSEA saves the rank metric score for each gene (the score used to position the gene in the ranked list). Saving random ranked lists is memory intensive; therefore, this parameter is set to false by default.
Make a zipped file with all reports
boolean
Set to true (default=false) to create a zip file of the analysis results. The zip file is saved to the output folder with all of the other files generated by the analysis. This is useful for sharing analysis results.
Set to run gprofiler2 and do a pathway enrichment analysis.
boolean
Short name of the organism that is analyzed, e.g. hsapiens for Homo sapiens.
string
Set this to the short organism name consisting of the first letter of the genus and the full species name, e.g. hsapiens for Homo sapiens, mmusculus for Mus musculus. This has second priority and will be overridden by --gprofiler2_token.
Should only significant enrichment results be considered?
boolean
true
Default true; if false, will consider all enrichment results regardless of significance.
Should underrepresentation be measured instead of overrepresentation?
boolean
Default false; if true, will measure underrepresentation.
The method that should be used for multiple testing correction.
string
One of gSCS (synonyms: analytical, g_SCS), fdr (synonyms: false_discovery_rate), bonferroni.
On which source databases to run the gprofiler query
string
GO, GO:MF, GO:BP, GO:CC, KEGG, REAC, WP, TF, MIRNA, HPA, CORUM, HP, or any comma-separated combination thereof, e.g. 'KEGG,REAC'. This works if --gprofiler2_organism is used; if a GMT file is provided with --gene_sets_files, should also work; the module will then remove any lines not starting with any of the source names. Does not work for --gprofiler2_token as g:Profiler will not filter such a run.
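The GMT filtering step mentioned above (dropping lines that do not start with one of the requested source names) could look roughly like this. The helper name and the example gene set identifiers are made up for illustration.

```python
def filter_gmt_lines(lines, sources):
    """Keep only GMT lines that start with one of the comma-separated source names."""
    prefixes = tuple(s.strip() for s in sources.split(",") if s.strip())
    # str.startswith accepts a tuple of prefixes; an empty tuple matches nothing.
    return [line for line in lines if line.startswith(prefixes)]


# Hypothetical GMT content: identifier, description, then member genes.
gmt = [
    "KEGG:00010\tGlycolysis\tHK1\tPFKM",
    "REAC:R-HSA-1\tSignalling\tEGFR",
    "WP:WP1\tOther pathway\tTP53",
]
kept = filter_gmt_lines(gmt, "KEGG,REAC")
```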
Whether to include evcodes in the results.
boolean
This can decrease performance and make the query slower. See https://rdrr.io/cran/gprofiler2/man/gost.html
Maximum q value used for significance testing.
number
0.05
Token that should be used as a query.
string
For reproducibility, instead of querying the online databases, you can provide a token, e.g. from a previous pipeline run or from a manual query on https://biit.cs.ut.ee/gprofiler/gost. This has highest priority and will override --gprofiler2_organism and --gene_sets_files.
Path to CSV/TSV/TXT file that should be used as a background for the query; alternatively, 'auto' (default) or 'false'.
string
^\S+\.(csv|tsv|txt)$|auto|false
It is advisable to run pathway analysis with a set of background genes describing which genes exist in the target organism in the first place so that other genes are not at all considered. This parameter is by default set to 'auto', meaning that the filtered input abundance matrix will be used. Alternatively, you can provide a CSV/TSV table where one column contains gene IDs and the other columns contain abundance values, or a TXT file that simply contains one gene ID per line. If a custom CSV/TSV is used, all genes will be considered which had at least some abundance (i.e. sum of all abundance values in a row > 0). Set to 'false' if you do not want to use a background.
Which column to use as gene IDs in the background matrix.
string
If a background matrix is provided but this parameter is not set, will assume that the first matrix column contains the IDs.
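A sketch of deriving the background from a custom table: genes are kept when their abundance values sum to more than zero, and the first column is used for IDs unless a column name is given. It assumes every non-ID column is numeric; `background_genes` is a hypothetical helper.

```python
import csv
import io


def background_genes(table_text, id_column=None, delimiter=","):
    """Return IDs of genes whose abundance values sum to more than zero."""
    reader = csv.reader(io.StringIO(table_text), delimiter=delimiter)
    header = next(reader)
    # Fall back to the first column when no ID column is named.
    id_index = header.index(id_column) if id_column else 0
    genes = []
    for row in reader:
        values = [float(v) for i, v in enumerate(row) if i != id_index and v != ""]
        if sum(values) > 0:
            genes.append(row[id_index])
    return genes


# Hypothetical background matrix.
table = "gene_id,sample1,sample2\nENSG1,10,0\nENSG2,0,0\nENSG3,0.5,2\n"
background = background_genes(table)
```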
How to calculate the statistical domain size.
string
One of annotated (default), known, custom or custom_annotated; see https://rdrr.io/cran/gprofiler2/man/gost.html
How many genes must be differentially expressed in a pathway for it to be considered enriched? Default 1.
integer
1
Valid R palette name
string
Blues
Check the content of RColorBrewer::brewer.pal.info from an R terminal for valid palette names.
Should a Shiny app be built?
boolean
true
At a minimum this will trigger generation of files you can quickly use to spin up a shiny app locally. But you can also use the 'shinyapps' settings to deploy an app straight to shinyapps.io.
Should the app be deployed to shinyapps.io?
boolean
Your shinyapps.io account name
string
null
The name of the app to push to in your shinyapps.io account
string
null
Should we guess the log status of matrices and unlog for the app?
boolean
true
In the app context, it's usually helpful if things are not in log scale, so that e.g. fold changes make some sense with respect to observed values. This flag will cause the shinyngs app-building script to make a guess based on observed values as to the log status of input matrices, and adjust the loading accordingly.
Files and options used by gene set analysis modules.
Gene sets in GMT or GMX-format; for GSEA: multiple comma-separated input files in either format are possible. For gprofiler2: A single file in GMT format is possible; this has lowest priority and will be overridden by --gprofiler2_token and --gprofiler2_organism.
string
null
Rmd report template from which to create the pipeline report
string
${projectDir}/assets/differentialabundance_report.Rmd
^\S+\.Rmd$
The pipeline will always generate a default report which gives a good overview of the analysis results. Should this default report not suit your needs, you can provide the path to a custom report instead.
Email address for completion summary.
string
^([a-zA-Z0-9_\-\.]+)@([a-zA-Z0-9_\-\.]+)\.([a-zA-Z]{2,5})$
Set this parameter to your e-mail address to get a summary e-mail with details of the run sent to you when the workflow exits. If set in your user config file (~/.nextflow/config) then you don't need to specify this on the command line for every run.
A logo to display in the report instead of the generic pipeline logo
string
${projectDir}/docs/images/nf-core-differentialabundance_logo_light.png
CSS to use to style the output, in lieu of the default nf-core styling
string
${projectDir}/assets/nf-core_style.css
A markdown file containing citations to include in the final report
string
${projectDir}/CITATIONS.md
A title for reporting outputs
string
null
An author for reporting outputs
string
null
Semicolon-separated string of contributor info that should be listed in the report.
string
List here names, roles, affiliations, contact info etc. of contributors to your project. Entries of different contributors are separated by semicolons, linebreaks within a contributor are separated by <br>. The first line of each contributor will be bold in the report. E.g.: 'Jane Doe<br>Director of Institute of Microbiology<br>University of Smallville;John Smith<br>PhD student<br>University of Smallville'
A description for reporting outputs
string
null
Whether to generate a scree plot in the report
boolean
true
To how many digits should numeric output in different modules be rounded? If -1, will not round.
integer
4
This affects output from the following modules (both their tabular output and their result sections in the report): proteus, gprofiler2.
Reference genome related files and options required for the workflow.
Name of iGenomes reference.
string
If using a reference genome configured in the pipeline using iGenomes, use this parameter to give the ID for the reference. This is then used to build the full paths for all required reference genome files e.g. --genome GRCh38. See the nf-core website docs for more details.
Genome annotation file in GTF format
string
^\S+\.gtf(\.gz)?
"This parameter is mandatory if --genome
is not specified."
Do not load the iGenomes reference config.
boolean
Do not load igenomes.config when running the pipeline. You may choose this option if you observe clashes between custom parameters and those supplied in igenomes.config.
Parameters used to describe centralised config profiles. These should not be edited.
Git commit id for Institutional configs.
string
master
Base directory for Institutional configs.
string
https://raw.githubusercontent.com/nf-core/configs/master
If you're running offline, Nextflow will not be able to fetch the institutional config files from the internet. If you don't need them, then this is not a problem. If you do need them, you should download the files from the repo and tell Nextflow where to find them with this parameter.
Institutional config name.
string
Institutional config description.
string
Institutional config contact information.
string
Institutional config URL link.
string
Set the top limit for requested resources for any single job.
Maximum number of CPUs that can be requested for any single job.
integer
16
Use to set an upper-limit for the CPU requirement for each process. Should be an integer e.g. --max_cpus 1
Maximum amount of memory that can be requested for any single job.
string
128.GB
^\d+(\.\d+)?\.?\s*(K|M|G|T)?B$
Use to set an upper-limit for the memory requirement for each process. Should be a string in the format integer-unit e.g. --max_memory '8.GB'
Maximum amount of time that can be requested for any single job.
string
240.h
^(\d+\.?\s*(s|m|h|d|day)\s*)+$
Use to set an upper-limit for the time requirement for each process. Should be a string in the format integer-unit e.g. --max_time '2.h'
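The memory and time patterns listed above can be checked before launching a run. This sketch simply applies the two regular expressions as written in this document; the function names are illustrative.

```python
import re

# Patterns copied from the max_memory and max_time entries above.
MEMORY_PATTERN = re.compile(r"^\d+(\.\d+)?\.?\s*(K|M|G|T)?B$")
TIME_PATTERN = re.compile(r"^(\d+\.?\s*(s|m|h|d|day)\s*)+$")


def is_valid_memory(value):
    """True if the value looks like '8.GB' or '128.GB'."""
    return MEMORY_PATTERN.match(value) is not None


def is_valid_time(value):
    """True if the value looks like '2.h' or '240.h'."""
    return TIME_PATTERN.match(value) is not None


checks = {
    "128.GB": is_valid_memory("128.GB"),
    "8 gigabytes": is_valid_memory("8 gigabytes"),
    "240.h": is_valid_time("240.h"),
    "two hours": is_valid_time("two hours"),
}
```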
Less common options for the pipeline, typically set in a config file.
Display help text.
boolean
Display version and exit.
boolean
Method used to save pipeline results to output directory.
string
The Nextflow publishDir option specifies which intermediate files should be saved to the output directory. This option tells the pipeline what method should be used to move these files. See Nextflow docs for details.
Email address for completion summary, only when pipeline fails.
string
^([a-zA-Z0-9_\-\.]+)@([a-zA-Z0-9_\-\.]+)\.([a-zA-Z]{2,5})$
An email address to send a summary email to when the pipeline is completed - ONLY sent if the pipeline does not exit successfully.
Send plain-text email instead of HTML.
boolean
Do not use coloured log outputs.
boolean
Incoming hook URL for messaging service
string
Incoming hook URL for messaging service. Currently, MS Teams and Slack are supported.
Boolean whether to validate parameters against the schema at runtime
boolean
true
Show all params when using --help
boolean
By default, parameters set as hidden in the schema are not shown on the command line when a user runs with --help. Specifying this option will tell the pipeline to show all parameters.
Validation of parameters fails when an unrecognised parameter is found.
boolean
By default, when an unrecognised parameter is found, it returns a warning.
Validation of parameters in lenient mode.
boolean
Allows string values that are parseable as numbers or booleans. For further information see JSONSchema docs.