Define where the pipeline should find input data and save output data.

Path to comma-separated file containing information about the samples in the experiment.

required
type: string
pattern: ^\S+\.csv$
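
As an illustration, the pattern above can be checked in Python before launching a run. This is a minimal sketch, not part of the pipeline; the helper name `is_valid_samplesheet` is hypothetical.

```python
import re

# Pattern copied from the parameter definition above.
SAMPLESHEET_PATTERN = re.compile(r"^\S+\.csv$")


def is_valid_samplesheet(path: str) -> bool:
    """Return True if the path contains no whitespace and ends in .csv."""
    return SAMPLESHEET_PATTERN.fullmatch(path) is not None


for candidate in ["samplesheet.csv", "my samples.csv", "samples.tsv"]:
    print(candidate, is_valid_samplesheet(candidate))
```

Paths containing spaces are rejected because `\S+` matches only non-whitespace characters.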

Mode of the pipeline.

required
type: string

Tools used for hash demultiplexing.

type: string
default: gmm-demux

Tools used for genetic demultiplexing.

type: string
default: vireo

Perform BAM QC.

type: boolean
default: true

File with common variants. If provided, the BAM files will be filtered to only include reads that overlap with the common variants.

type: string
pattern: ^\S+\.vcf(\.gz)?$

The output directory where the results will be saved. You have to use absolute paths to storage on Cloud infrastructure.

required
type: string

Save intermediate files.

type: boolean

Email address for completion summary.

type: string
pattern: ^([a-zA-Z0-9_\-\.]+)@([a-zA-Z0-9_\-\.]+)\.([a-zA-Z]{2,5})$

MultiQC report title. Printed as page header, used for filename if not otherwise specified.

type: string

Reference genome related files and options required for the workflow.

Name of iGenomes reference.

type: string

Path to FASTA genome file.

type: string
pattern: ^\S+\.fn?a(sta)?(\.gz)?$
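
To see which file names the FASTA pattern above accepts, it can be tried against a few candidates. This is an illustrative sketch only; the helper `is_fasta_path` and the example file names are hypothetical.

```python
import re

# Pattern copied from the parameter definition above.
FASTA_PATTERN = re.compile(r"^\S+\.fn?a(sta)?(\.gz)?$")


def is_fasta_path(path: str) -> bool:
    """Return True if the path matches the FASTA file-name pattern."""
    return FASTA_PATTERN.fullmatch(path) is not None


for name in ["genome.fa", "genome.fna", "genome.fasta", "genome.fa.gz", "genome.fastq"]:
    print(name, is_fasta_path(name))
```

The `.fa`, `.fna`, and `.fasta` extensions match, each with an optional `.gz` suffix, while `.fastq` does not.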

Do not load the iGenomes reference config.

hidden
type: boolean

The base path to the iGenomes reference files.

hidden
type: string
default: s3://ngi-igenomes/igenomes/

Options for matching donors between different demultiplexing methods.

Match donor between different demultiplexing methods.

type: boolean
default: true

Path to demultiplexing result CSV file (necessary only for donor_match mode).

type: string,null

Path to Vireo filtered variants file (necessary only for donor_match mode).

type: string,null

Path to cell genotype file (necessary only for donor_match mode).

type: string,null

First method to use for donor matching.

type: string,null

Second method to use for donor matching.

type: string,null

Find variants for donor matching.

type: boolean
default: true

Option to subset the donor genotype based on detected variants.

type: boolean
default: true

Minimum variant count threshold.

type: integer
default: 10

Minimum variant frequency used for filtering. Must be in the range [0.5, 1). For example, 0.9 keeps only variants with a frequency higher than 90% or lower than 10%.

type: number
default: 0.9
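
The filtering rule described above can be sketched as follows. This is an assumption-based illustration of the stated behaviour (keep variants with a frequency above the threshold or below one minus the threshold), not the pipeline's actual implementation; the function name `keep_variant` is hypothetical.

```python
def keep_variant(frequency: float, threshold: float = 0.9) -> bool:
    """Keep a variant whose frequency is clearly high or clearly low.

    With threshold=0.9, variants above 90% or below 10% are kept.
    """
    # The threshold must lie between 0.5 (inclusive) and 1 (exclusive).
    if not 0.5 <= threshold < 1.0:
        raise ValueError("threshold must be >= 0.5 and < 1")
    return frequency > threshold or frequency < 1.0 - threshold


for freq in [0.02, 0.5, 0.95]:
    print(freq, keep_variant(freq))
```

Raising the threshold toward 1 keeps fewer, more extreme variants.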

Path to cell genotype file (necessary only for donor_match mode).

type: string,null

Options specific to the demuxEM tool for cell hashing demultiplexing.

Generate diagnostic plots.

type: boolean
default: true

The Dirichlet prior concentration parameter (alpha) on samples. An alpha value < 1.0 will make the prior sparse.

type: number

Only demultiplex cells/nuclei with at least this number of expressed genes.

type: integer
default: 100

Only demultiplex cells/nuclei with at least this number of UMIs.

type: integer
default: 100

Any cell/nucleus with fewer than this count of hashtags from the signal will be marked as unknown.

type: number
default: 10

The random seed used in the KMeans algorithm to separate empty ADT droplets from others.

type: integer

Comma-separated list of gender-specific genes (e.g. Xist) for generating violin plots.

type: string

Options specific to the BFF cell hashing demultiplexing.

Method(s) to use within BFF.

type: string
default: COMBINED

Whether to run preprocessing steps for BFF.

type: boolean
default: true

Path to barcode whitelist for preprocessing.

type: string,null

Path to cell barcode whitelist for GenerateCellHashingCalls().

type: string,null

Methods to use for consensus calling.

type: string,null

Optional metrics file path.

type: string,null

Whether to compute tSNE visualization in BFF.

type: boolean

Whether to generate heatmaps in BFF.

type: boolean
default: true

Per-cell saturation value.

type: number,null

Majority consensus threshold.

type: number,null

Library chemistry (e.g., 10xV3).

type: string
default: 10xV3

Threshold for caller disagreement.

type: number,null

Options specific to the GMM-Demux tool for cell hashing demultiplexing.

Comma-separated list of HTO names, without whitespace. If null, hto_names are extracted from features.tsv.gz in the input HTO matrix.

type: string,null

If specified, it will generate the statistic summary of the dataset, including MSM and SSM rates. This requires an estimated total number of cells in the assay as input.

type: integer,null

If true, full classification report is generated, otherwise the simplified classification report.

type: boolean
default: true

If true, summary report is generated.

type: boolean
default: true

Load a full classification report and skip the mtx folder as input. Requires a file path argument.

type: string,null

Provide the cell list. Requires a file path argument. Only executes if -u is set.

type: string,null

Names of the HTO tag(s) to extract, separated by ','. Joint HTO samples are combined with '+', such as 'HTO_1+HTO_2'.

type: string

The confidence threshold value for classification. A higher value leads to more stringent classification.

type: number
default: 0.8

The random seed used in the GaussianMixture algorithm.

type: integer

Options specific to the Scanpy Hashsolo demultiplexing module.

Groovy list (['hash_1', 'hash_2']) of .obs columns that contain cell hashing counts. Can be null if the data is in 10x Genomics format, as the columns are derived from the input.

type: array,null

List of comma-separated priors for each hypothesis: NEGATIVE, SINGLET, DOUBLET.

type: string
default: 0.01,0.8,0.19
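
A comma-separated priors string like the default above can be split into one value per hypothesis. This is a minimal sketch; the helper `parse_priors` is hypothetical, and the check that the priors sum to 1 is an assumption about sensible input rather than a documented requirement.

```python
import math

HYPOTHESES = ("NEGATIVE", "SINGLET", "DOUBLET")


def parse_priors(value: str) -> dict[str, float]:
    """Split 'neg,singlet,doublet' into a mapping of hypothesis -> prior."""
    parts = [float(item) for item in value.split(",")]
    if len(parts) != len(HYPOTHESES):
        raise ValueError("expected exactly three comma-separated priors")
    # Assumption: priors are probabilities and should sum to 1.
    if not math.isclose(sum(parts), 1.0):
        raise ValueError("priors should sum to 1")
    return dict(zip(HYPOTHESES, parts))


print(parse_priors("0.01,0.8,0.19"))
```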

Column in cell_hashing_adata.obs for how to break up demultiplexing.

type: string,null

Input directory containing transcriptomic data in 10x mtx format.

type: string,null

Number of barcodes to use to create noise distribution.

type: integer,null

Number of decimal places to round numeric values in cell_hashing_data.obs before saving. If omitted, no rounding is applied.

type: integer
default: 10

Options specific to the HTODemux tool for cell hashing demultiplexing.

The quantile to use for thresholding.

type: number
default: 0.99

Initialization method for clustering.

type: string
default: NULL

Number of starts for clustering.

type: integer
default: 100

Clustering function to use.

type: string
default: clara

Number of samples for clustering.

type: integer
default: 100

Random seed for reproducibility.

type: integer
default: 42

Whether to print verbose output.

type: boolean
default: true

Options specific to the HTODemux visualization tool for generating plots and visualizations.

Generate ridge plot.

type: boolean
default: true

The number of plots that are displayed next to each other in one row. The number of plots corresponds to the number of Hash Tag Oligo (HTO) identifiers.

type: integer
default: 2

Generate feature scatter plot. If no features are provided (one of them is null), the first two features from the assay will be used.

type: boolean
default: true

Name of a Hash Tag Oligo (HTO) identifier, usually defined in the feature.tsv of the HTO matrix folder.

type: string

Name of a Hash Tag Oligo (HTO) identifier, usually defined in the feature.tsv of the HTO matrix folder.

type: string

Generate violin plot.

type: boolean
default: true

Features to plot (gene expression, metrics, PC scores, anything that can be retrieved by FetchData).

type: string
default: nCount_RNA

Plot the feature axis on log scale.

type: boolean
default: true

Generate a two dimensional tSNE embedding for HTOs.

type: boolean
default: true

Which classification to remove from the object: Singlet, Doublet, or Negative.

type: string

Invert tSNE selection.

type: boolean
default: true

Verbose tSNE.

type: boolean

Approximate tSNE.

type: boolean

Max number of donors.

type: integer
default: 2

Value for perplexity.

type: integer
default: 100

Generate heatmap.

type: boolean
default: true

Number of cells for heatmap.

type: integer
default: 500

Options specific to the MultiSeqDemux tool for cell hashing demultiplexing.

The quantile to use for thresholding.

type: number
default: 0.7

Whether to automatically determine thresholds.

type: boolean
default: true

Maximum number of iterations.

type: integer
default: 5

Start of quantile range.

type: number
default: 0.1

End of quantile range.

type: number
default: 0.9

Step size for quantile range.

type: number
default: 0.05
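
The start, end, and step values above together define a grid of candidate quantiles. The sketch below shows the grid produced by the defaults; it is an illustration under the assumption that both endpoints are included, and the function name `quantile_grid` is hypothetical.

```python
def quantile_grid(start: float = 0.1, end: float = 0.9, step: float = 0.05) -> list[float]:
    """Return evenly spaced quantiles from start to end, endpoints included."""
    if step <= 0 or end < start:
        raise ValueError("step must be positive and end must be >= start")
    # Count steps with integers to avoid accumulating floating-point error.
    n_steps = round((end - start) / step)
    return [round(start + i * step, 10) for i in range(n_steps + 1)]


grid = quantile_grid()
print(len(grid), grid[0], grid[-1])
```

With the defaults this yields 17 candidate quantiles between 0.1 and 0.9.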

Whether to print verbose output.

type: boolean
default: true

Options specific to the HashedDrops tool for cell hashing demultiplexing.

Lower bound on total UMI count for empty droplets.

type: integer
default: 100

Number of iterations for Monte Carlo p-value calculations.

type: integer
default: 10000

Whether to test ambient RNA.

type: boolean
default: true

Whether to round non-integer values.

type: boolean
default: true

Alternative method for identifying empty droplets.

type: integer,null

FDR threshold for cell filtering.

type: number
default: 0.01

Column to use for gene names.

type: integer
default: 2

Lower bound for ignoring barcodes.

type: number,null

Scaling parameter for Dirichlet-multinomial sampling.

type: number,null

Whether to use ambient solution abundance.

type: boolean
default: true

Minimum proportion for ambient profile inference.

type: number
default: 0.05

Minimum pseudo-count for log-fold change computation.

type: integer
default: 5

Whether to use constant ambient contamination level.

type: boolean

Number of MADs to identify doublets.

type: integer
default: 3

Minimum threshold for doublet identification.

type: integer
default: 2

Whether to use 2-component mixture model for doublets.

type: boolean

Number of MADs to identify confident singlets.

type: integer
default: 3

Minimum threshold for confident singlet identification.

type: integer
default: 2

An integer matrix specifying valid combinations of HTOs. Number of items in each row has to be the same.

type: array
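
The constraint above (integer entries, equal row lengths) can be checked before passing the matrix on. This is a hedged sketch; the function name `validate_combinations` and the example values are hypothetical.

```python
def validate_combinations(matrix: list[list[int]]) -> list[list[int]]:
    """Check that the matrix is non-empty, rectangular, and integer-valued."""
    if not matrix:
        raise ValueError("matrix must contain at least one row")
    width = len(matrix[0])
    for row in matrix:
        if len(row) != width:
            raise ValueError("all rows must have the same number of items")
        # bool is a subclass of int in Python, so exclude it explicitly.
        if not all(isinstance(v, int) and not isinstance(v, bool) for v in row):
            raise ValueError("all entries must be integers")
    return matrix


# Hypothetical example: three valid pairs of HTO indices.
print(validate_combinations([[1, 2], [3, 4], [5, 6]]))
```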

Whether to run EmptyDrops analysis as part of HashedDrops.

type: boolean

Options for preprocessing data for HTODemux and MultiSeq demultiplexing.

Method for feature selection.

type: string
default: mean.var.plot

Delimiter for parsing feature names.

type: string
default: _

Number of features to select.

type: integer
default: 2000

Assay type for preprocessing.

type: string
default: HTO

Margin parameter for preprocessing.

type: integer
default: 2

Normalization method to use.

type: string
default: CLR

Column containing gene information.

type: integer
default: 2

Options controlling hash summary and downstream exported formats.

Generate AnnData (.h5ad) outputs for hashing results.

type: boolean
default: true

Generate MuData outputs for hashing results.

type: boolean
default: true

Options specific to the CellSNP-lite tool for genotyping bi-allelic SNPs on single cells.

Tag for cell barcodes, e.g., CB for 10x Genomics. Set to ‘None’ for bulk RNA-seq or SMART-seq2.

type: string
default: CB

Tag for UMI barcodes, e.g., UB for 10x Genomics. Set to ‘None’ for bulk RNA-seq or SMART-seq2 without UMIs.

type: string
default: Auto

Minimum aggregated count (across cells) for SNPs to be included in the output.

type: integer
default: 20

Minimum minor allele frequency (MAF) for SNPs to be included in the output.

type: number

Required flags in SAM/BAM: skip reads that don’t have ALL of these flags. See SAM format specification for details.

type: string

Excluding flags in SAM/BAM: skip reads that have ANY of these flags. See SAM format specification for details.

type: string

Minimum read length (after clipping) for a read to be included.

type: integer
default: 30

Minimum mapping quality for a read to be included.

type: integer
default: 20

Maximum read depth at a position per input file. Set to 0 for highest possible value.

type: integer

If true, do not skip anomalous read pairs (i.e., count orphan reads).

type: boolean

Options specific to the Vireo tool for donor demultiplexing from single-cell RNA-seq data.

The tag for donor genotype in VCF file. Options: GT, GP, PL.

type: string

If true, do not check for doublets during demultiplexing.

type: boolean

Number of random initializations when GT needs to be learned.

type: integer
default: 50

Number of extra donors in pre-cluster, when GT needs to be learned.

type: integer

Method for searching from extra donors. ‘size’: n_cell per donor; ‘distance’: GT distance between donors.

type: string

If true, treat donor GT as prior only and learn genotypes from data.

type: boolean
default: true

If true, turn on SNP specific allelic ratio (ASE mode).

type: boolean

If true, turn off plotting GT distance.

type: boolean

Random seed for initialization.

type: integer

Range of cells to process, e.g., ‘0-10000’. Default is ‘all’.

type: string
default: all
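
A range string of the form shown above can be parsed into numeric bounds. This is a minimal sketch of the string format only; whether the end bound is inclusive is not stated here, and the helper `parse_cell_range` is hypothetical.

```python
from typing import Optional


def parse_cell_range(value: str = "all") -> Optional[tuple[int, int]]:
    """Parse 'start-end' into (start, end); return None for 'all'."""
    if value == "all":
        return None
    start_text, separator, end_text = value.partition("-")
    if not separator:
        raise ValueError("expected 'all' or a range such as '0-10000'")
    start, end = int(start_text), int(end_text)
    if end < start:
        raise ValueError("range end must not be smaller than range start")
    return start, end


print(parse_cell_range("0-10000"))
print(parse_cell_range("all"))
```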

If true, detect ambient RNAs in each cell (experimental feature).

type: boolean

Options specific to the DSC-Pileup tool for pileup generation from single-cell BAM files.

Tag representing readgroup or cell barcodes to partition the BAM file into multiple groups. For 10x Genomics, use CB.

type: string
default: CB

Tag representing UMIs. For 10x Genomics, use UB.

type: string
default: UB

Maximum base quality (higher BQ will be capped).

type: integer
default: 40

Minimum base quality to consider (lower BQ will be skipped).

type: integer
default: 13

Minimum mapping quality to consider (lower MQ will be ignored).

type: integer
default: 20

Minimum distance to the tail (lower will be ignored).

type: integer

SAM/BAM FLAGs to be excluded.

type: integer
default: 3844
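
The default of 3844 is a sum of individual SAM flag bits. The sketch below decodes it using the bit values from the SAM format specification; the short flag names and the helper `decode_flags` are informal labels chosen for this example.

```python
# Bit values as defined in the SAM format specification.
SAM_FLAGS = {
    0x1: "paired",
    0x2: "proper_pair",
    0x4: "unmapped",
    0x8: "mate_unmapped",
    0x10: "reverse",
    0x20: "mate_reverse",
    0x40: "read1",
    0x80: "read2",
    0x100: "secondary",
    0x200: "qc_fail",
    0x400: "duplicate",
    0x800: "supplementary",
}


def decode_flags(value: int) -> list[str]:
    """Return the names of all flag bits set in value."""
    return [name for bit, name in SAM_FLAGS.items() if value & bit]


print(decode_flags(3844))
```

So 3844 (4 + 256 + 512 + 1024 + 2048) covers reads that are unmapped, secondary, QC-failed, duplicate, or supplementary.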

Minimum number of total reads for a droplet/cell to be considered.

type: integer

Minimum number of unique reads (determined by UMI/SNP pair) for a droplet/cell to be considered.

type: integer

Minimum number of SNPs with coverage for a droplet/cell to be considered.

type: integer

Options specific to the Demuxlet tool for genotype-based demultiplexing of single-cell RNA-seq data.

FORMAT field to extract the genotype, likelihood, or posterior from.

type: string
default: GT

Offset of genotype error rate. [error] = [offset] + [1-offset]*[coeff]*[1-r2]

type: number
default: 0.1

Slope of genotype error rate. [error] = [offset] + [1-offset]*[coeff]*[1-r2]

type: number
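
The error-rate formula shared by the offset and slope parameters can be worked through numerically. The sketch below reads the bracketed terms as multiplied together, which is an assumption about the intended notation; the function name `genotype_error` and the example values are hypothetical.

```python
def genotype_error(offset: float, coeff: float, r2: float) -> float:
    """Compute error = offset + (1 - offset) * coeff * (1 - r2)."""
    return offset + (1.0 - offset) * coeff * (1.0 - r2)


# With a slope of zero, the error rate equals the offset regardless of r2.
print(genotype_error(offset=0.1, coeff=0.0, r2=0.5))

# With a non-zero slope, lower imputation quality (r2) raises the error rate.
print(genotype_error(offset=0.1, coeff=0.5, r2=0.8))
```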

INFO field name representing R2 value. Used for representing imputation quality.

type: string
default: R2

Minimum minor allele frequency.

type: integer
default: 1

Minimum call rate.

type: number
default: 0.5

Grid of alpha to search for.

type: string
default: 0.1,0.2,0.3,0.4,0.5

Prior probability of doublet.

type: number
default: 0.5

Options specific to the Freemuxlet tool for reference-free genotype-based demultiplexing.

Prior probability of doublet.

type: number
default: 0.5

Genotype error parameter per cluster.

type: number
default: 0.1

Bayes Factor Threshold used in the initial clustering.

type: number
default: 5.41

Fraction of droplets to be clustered in the very first round of initial clustering procedure.

type: number
default: 1

Iteration for initial cluster assignment (set to zero to skip the iterations).

type: integer
default: 10

Keep missing cluster assignment as missing in the initial iteration.

type: boolean

Randomize the singlet scores to test its effect.

type: boolean

Seed for random number (use clocks if not set).

type: integer

Options specific to the Souporcell tool for clustering mixed-genotype scRNAseq experiments by individual.

Ploidy, must be 1 or 2.

type: integer

Min alt to use locus.

type: integer
default: 10

Min ref to use locus.

type: integer
default: 10

Max loci per cell, affects speed.

type: integer
default: 2048

Number of restarts in clustering. When there are more than 12 clusters, we recommend increasing this to avoid local minima.

type: integer
default: 100

VCF of common or known variant loci. Must be based on the same reference FASTA.

type: string
pattern: ^\S+\.vcf(\.gz)?$

Known variants per clone in population VCF mode. Must be an uncompressed .vcf file; gzip and BCF are not currently accepted.

type: string
pattern: ^\S+\.vcf$

Which samples in population vcf from known genotypes option represent the donors in your sample. Provide space-separated sample names for multiple donors.

type: string

Don’t remap with minimap2 (not recommended unless in conjunction with --common_variants).

type: boolean

Set to True to ignore data error assertions.

type: boolean

Parameters used to describe centralised config profiles. These should not be edited.

Git commit id for Institutional configs.

hidden
type: string
default: master

Base directory for Institutional configs.

hidden
type: string
default: https://raw.githubusercontent.com/nf-core/configs/master

Institutional config name.

hidden
type: string

Institutional config description.

hidden
type: string

Institutional config contact information.

hidden
type: string

Institutional config URL link.

hidden
type: string

Less common options for the pipeline, typically set in a config file.

Display version and exit.

hidden
type: boolean

Method used to save pipeline results to output directory.

hidden
type: string

Email address for completion summary, only when pipeline fails.

hidden
type: string
pattern: ^([a-zA-Z0-9_\-\.]+)@([a-zA-Z0-9_\-\.]+)\.([a-zA-Z]{2,5})$

Send plain-text email instead of HTML.

hidden
type: boolean

File size limit when attaching MultiQC reports to summary emails.

hidden
type: string
default: 25.MB
pattern: ^\d+(\.\d+)?\.?\s*(K|M|G|T)?B$
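
The size pattern above accepts several spellings of the same limit. This is an illustrative check only; the helper `is_valid_size` is hypothetical.

```python
import re

# Pattern copied from the parameter definition above.
SIZE_PATTERN = re.compile(r"^\d+(\.\d+)?\.?\s*(K|M|G|T)?B$")


def is_valid_size(value: str) -> bool:
    """Return True if value looks like a file size such as '25.MB'."""
    return SIZE_PATTERN.fullmatch(value) is not None


for candidate in ["25.MB", "25 MB", "1.5GB", "100B", "MB", "25"]:
    print(candidate, is_valid_size(candidate))
```

A number is required and the value must end in `B`, so `MB` and `25` are both rejected.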

Do not use coloured log outputs.

hidden
type: boolean

Incoming hook URL for messaging service.

hidden
type: string

Custom config file to supply to MultiQC.

hidden
type: string

Custom logo file to supply to MultiQC. The file name must also be set in the MultiQC config file.

hidden
type: string

Custom MultiQC yaml file containing HTML including a methods description.

type: string

Whether to validate parameters against the schema at runtime.

hidden
type: boolean
default: true

Base URL or local path to the location of pipeline test dataset files.

hidden
type: string
default: https://raw.githubusercontent.com/nf-core/test-datasets/

Suffix to add to the trace report filename. Default is the date and time in the format yyyy-MM-dd_HH-mm-ss.

hidden
type: string

Display the help message.

type: boolean,string

Display the full detailed help message.

type: boolean

Display hidden parameters in the help message (only works when --help or --help_full are provided).

type: boolean