hadge: Parameters

Define where the pipeline should find input data and save output data.

Path to comma-separated file containing information about the samples in the experiment.

required

type: string

pattern: ^\S+\.csv$
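The pattern above can be checked locally before launching a run. The following is a minimal sketch in Python; the helper name `is_valid_samplesheet_path` is hypothetical and not part of the pipeline, and only the regular expression is taken from the description above.

```python
import re

# Pattern copied from the parameter description above.
SAMPLESHEET_PATTERN = re.compile(r"^\S+\.csv$")


def is_valid_samplesheet_path(path: str) -> bool:
    """Return True if the path contains no whitespace and ends in .csv."""
    return SAMPLESHEET_PATTERN.search(path) is not None


if __name__ == "__main__":
    for candidate in ("samplesheet.csv", "my samples.csv", "samplesheet.tsv"):
        print(candidate, is_valid_samplesheet_path(candidate))
```

Paths containing whitespace fail the check because `\S+` must cover everything before the `.csv` suffix.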

Mode of the pipeline.

required

type: string

Tools used for hash demultiplexing.

type: string

default: gmm-demux

Tools used for genetic demultiplexing.

type: string

default: vireo

Perform BAM QC.

type: boolean

default: true

File with common variants. If provided, the BAM files will be filtered to only include reads that overlap with the common variants.

type: string

pattern: ^\S+\.vcf(\.gz)?$

The output directory where the results will be saved. Use absolute paths when writing to storage on cloud infrastructure.

required

type: string

Save intermediate files.

type: boolean

Email address for completion summary.

type: string

pattern: ^([a-zA-Z0-9_\-\.]+)@([a-zA-Z0-9_\-\.]+)\.([a-zA-Z]{2,5})$

MultiQC report title. Printed as page header, used for filename if not otherwise specified.

type: string

Reference genome related files and options required for the workflow.

Name of iGenomes reference.

type: string

Path to FASTA genome file.

type: string

pattern: ^\S+\.fn?a(sta)?(\.gz)?$

Do not load the iGenomes reference config.

hidden

type: boolean

The base path to the iGenomes reference files.

hidden

type: string

default: s3://ngi-igenomes/igenomes/

Options for matching donors between different demultiplexing methods.

Match donor between different demultiplexing methods.

type: boolean

default: true

Path to demultiplexing result CSV file (necessary only for donor_match mode).

type: string,null

Path to Vireo filtered variants file (necessary only for donor_match mode).

type: string,null

Path to cell genotype file (necessary only for donor_match mode).

type: string,null

First method to use for donor matching.

type: string,null

Second method to use for donor matching.

type: string,null

Find variants for donor matching.

type: boolean

default: true

Option to subset the donor genotype based on detected variants.

type: boolean

default: true

Minimum variant count threshold.

type: integer

default: 10

The minimal percentage of a variant for filtering. Must be in the range [0.5, 1). For example, 0.9 means that only variants with a frequency higher than 90% or lower than 10% are kept.

type: number

default: 0.9
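The filtering rule described above can be illustrated with a short Python sketch. This only restates the rule from the description (keep a variant when its frequency is above the threshold or below one minus the threshold); the function name `keep_variant` is hypothetical, and how the pipeline actually computes the frequency is not shown here.

```python
def keep_variant(frequency: float, threshold: float = 0.9) -> bool:
    """Apply the described rule: keep variants that are nearly fixed either way.

    `frequency` is assumed to be a fraction between 0 and 1.
    """
    if not 0.5 <= threshold < 1.0:
        raise ValueError("threshold must be in the range [0.5, 1)")
    return frequency > threshold or frequency < 1.0 - threshold


if __name__ == "__main__":
    for freq in (0.95, 0.50, 0.05):
        print(freq, keep_variant(freq))
```

With the default of 0.9, a variant seen at 95% or 5% frequency is kept, while one at 50% is dropped.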

Path to cell genotype file (necessary only for donor_match mode).

type: string,null

Options specific to the demuxEM tool for cell hashing demultiplexing.

Generate diagnostic plots.

type: boolean

default: true

The Dirichlet prior concentration parameter (alpha) on samples. An alpha value < 1.0 will make the prior sparse.

type: number

Only demultiplex cells/nuclei with at least this number of expressed genes.

type: integer

default: 100

Only demultiplex cells/nuclei with at least this number of UMIs.

type: integer

default: 100

Any cell/nucleus with fewer than this count of hashtags from the signal will be marked as unknown.

type: number

default: 10

The random seed used in the KMeans algorithm to separate empty ADT droplets from others.

type: integer

Comma-separated list of gender-specific genes (e.g. Xist) for generating violin plots.

type: string

Options specific to the BFF cell hashing demultiplexing.

Method(s) to use within BFF.

type: string

default: COMBINED

Whether to run preprocessing steps for BFF.

type: boolean

default: true

Path to barcode whitelist for preprocessing.

type: string,null

Path to cell barcode whitelist for GenerateCellHashingCalls().

type: string,null

Methods to use for consensus calling.

type: string,null

Optional metrics file path.

type: string,null

Whether to compute tSNE visualization in BFF.

type: boolean

Whether to generate heatmaps in BFF.

type: boolean

default: true

Per-cell saturation value.

type: number,null

Majority consensus threshold.

type: number,null

Library chemistry (e.g., 10xV3).

type: string

default: 10xV3

Threshold for caller disagreement.

type: number,null

Options specific to the GMM-Demux tool for cell hashing demultiplexing.

Comma-separated list of HTO names, without whitespace. If null, the HTO names are extracted from features.tsv.gz in the input HTO matrix.

type: string,null

If specified, it will generate the statistic summary of the dataset, including MSM and SSM rates. This requires an estimated total number of cells in the assay as input.

type: integer,null

If true, the full classification report is generated; otherwise, the simplified classification report is generated.

type: boolean

default: true

If true, summary report is generated.

type: boolean

default: true

Load a full classification report and skip the mtx folder as input. Requires a file path argument.

type: string,null

Provide the cell list. Requires a file path argument. Only executes if -u is set.

type: string,null

Names of the HTO tag(s) to extract, separated by ','. Joint HTO samples are combined with '+', such as 'HTO_1+HTO_2'.

type: string
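To make the syntax described above concrete, the sketch below splits such a string into groups, one per comma-separated entry, with joint samples split on the plus sign. The function name `parse_hto_selection` and the tag name `HTO_3` are hypothetical and used only for illustration; the tool's own parsing may differ.

```python
from typing import List


def parse_hto_selection(value: str) -> List[List[str]]:
    """Split 'A,B+C' into [['A'], ['B', 'C']]."""
    groups = []
    for entry in value.split(","):
        tags = [tag.strip() for tag in entry.split("+")]
        if any(not tag for tag in tags):
            raise ValueError(f"empty HTO name in entry {entry!r}")
        groups.append(tags)
    return groups


if __name__ == "__main__":
    print(parse_hto_selection("HTO_1+HTO_2,HTO_3"))
```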

The confidence threshold value for classification. A higher value leads to more stringent classification.

type: number

default: 0.8

The random seed used in the GaussianMixture algorithm.

type: integer

Options specific to the Scanpy Hashsolo demultiplexing module.

Groovy list (['hash_1', 'hash_2']) of .obs columns that contain cell hashing counts. Can be null if the data is in 10x Genomics format, as the columns are derived from the input.

type: array,null

List of comma-separated priors for each hypothesis: NEGATIVE, SINGLET, DOUBLET.

type: string

default: 0.01,0.8,0.19
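A priors string in this format can be parsed and sanity-checked as sketched below. The helper name `parse_priors` is hypothetical, and the requirement that the three values sum to 1 is an assumption made here (they are priors over three mutually exclusive hypotheses), not something stated in the description.

```python
import math
from typing import Dict

HYPOTHESES = ("NEGATIVE", "SINGLET", "DOUBLET")


def parse_priors(priors: str) -> Dict[str, float]:
    """Map a 'neg,singlet,doublet' string onto the three hypothesis names."""
    values = [float(v) for v in priors.split(",")]
    if len(values) != len(HYPOTHESES):
        raise ValueError("expected exactly three comma-separated priors")
    # Assumption: priors over exclusive hypotheses should sum to 1.
    if not math.isclose(sum(values), 1.0):
        raise ValueError("priors should sum to 1")
    return dict(zip(HYPOTHESES, values))


if __name__ == "__main__":
    print(parse_priors("0.01,0.8,0.19"))
```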

Column in cell_hashing_adata.obs for how to break up demultiplexing.

type: string,null

Input directory containing transcriptomic data in 10x mtx format.

type: string,null

Number of barcodes to use to create noise distribution.

type: integer,null

Number of decimal places to round numeric values in cell_hashing_data.obs before saving. If omitted, no rounding is applied.

type: integer

default: 10

Options specific to the HTODemux tool for cell hashing demultiplexing.

The quantile to use for thresholding.

type: number

default: 0.99

Initialization method for clustering.

type: string

default: NULL

Number of starts for clustering.

type: integer

default: 100

Clustering function to use.

type: string

default: clara

Number of samples for clustering.

type: integer

default: 100

Random seed for reproducibility.

type: integer

default: 42

Whether to print verbose output.

type: boolean

default: true

Options specific to the HTODemux visualization tool for generating plots and visualizations.

Generate ridge plot.

type: boolean

default: true

The number of plots that are displayed next to each other in one row. The number of plots corresponds to the number of Hash Tag Oligo (HTO) identifiers.

type: integer

default: 2

Generate feature scatter plot. If no features are provided (one of them is null), the first two features from the assay will be used.

type: boolean

default: true

Name of a Hash Tag Oligo (HTO) identifier, usually defined in the feature.tsv of the hto matrix folder.

type: string

Name of a Hash Tag Oligo (HTO) identifier, usually defined in the feature.tsv of the hto matrix folder.

type: string

Generate violin plot.

type: boolean

default: true

Features to plot (gene expression, metrics, PC scores, anything that can be retrieved by FetchData).

type: string

default: nCount_RNA

Plot the feature axis on log scale.

type: boolean

default: true

Generate a two dimensional tSNE embedding for HTOs.

type: boolean

default: true

What should we remove from the object (we have Singlet, Doublet and Negative).

type: string

Invert tSNE selection.

type: boolean

default: true

Verbose tSNE.

type: boolean

Approximate tSNE.

type: boolean

Max number of donors.

type: integer

default: 2

Value for perplexity.

type: integer

default: 100

Generate heatmap.

type: boolean

default: true

Number of cells for heatmap.

type: integer

default: 500

Options specific to the MultiSeqDemux tool for cell hashing demultiplexing.

The quantile to use for thresholding.

type: number

default: 0.7

Whether to automatically determine thresholds.

type: boolean

default: true

Maximum number of iterations.

type: integer

default: 5

Start of quantile range.

type: number

default: 0.1

End of quantile range.

type: number

default: 0.9

Step size for quantile range.

type: number

default: 0.05
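The three quantile-range parameters above (start, end, step) together describe a grid of candidate quantiles. The sketch below enumerates that grid for the default values, assuming the end point is inclusive, as with R's `seq()`; that inclusivity and the helper name `quantile_grid` are assumptions for illustration.

```python
import math
from typing import List


def quantile_grid(start: float = 0.1, end: float = 0.9, step: float = 0.05) -> List[float]:
    """List start, start+step, ... up to and including end."""
    if step <= 0 or end < start:
        raise ValueError("need step > 0 and end >= start")
    # Count the steps with a small tolerance so float error does not drop the end point.
    n_steps = math.floor((end - start) / step + 1e-9)
    # Round to hide binary floating-point noise such as 0.15000000000000002.
    return [round(start + i * step, 10) for i in range(n_steps + 1)]


if __name__ == "__main__":
    grid = quantile_grid()
    print(len(grid), grid[0], grid[-1])
```

With the defaults shown above, the grid contains 17 values from 0.1 to 0.9.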

Whether to print verbose output.

type: boolean

default: true

Options specific to the HashedDrops tool for cell hashing demultiplexing.

Lower bound on total UMI count for empty droplets.

type: integer

default: 100

Number of iterations for Monte Carlo p-value calculations.

type: integer

default: 10000

Whether to test ambient RNA.

type: boolean

default: true

Whether to round non-integer values.

type: boolean

default: true

Alternative method for identifying empty droplets.

type: integer,null

FDR threshold for cell filtering.

type: number

default: 0.01

Column to use for gene names.

type: integer

default: 2

Lower bound for ignoring barcodes.

type: number,null

Scaling parameter for Dirichlet-multinomial sampling.

type: number,null

Whether to use ambient solution abundance.

type: boolean

default: true

Minimum proportion for ambient profile inference.

type: number

default: 0.05

Minimum pseudo-count for log-fold change computation.

type: integer

default: 5

Whether to use constant ambient contamination level.

type: boolean

Number of MADs to identify doublets.

type: integer

default: 3

Minimum threshold for doublet identification.

type: integer

default: 2

Whether to use 2-component mixture model for doublets.

type: boolean

Number of MADs to identify confident singlets.

type: integer

default: 3

Minimum threshold for confident singlet identification.

type: integer

default: 2

An integer matrix specifying valid combinations of HTOs. Number of items in each row has to be the same.

type: array
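The constraint above (every row must have the same number of items) can be checked before a run with a sketch like the following. The function name `check_combinations` and the example values are hypothetical placeholders; what the integers refer to is defined by the tool, not by this example.

```python
from typing import List


def check_combinations(matrix: List[List[int]]) -> int:
    """Validate a rectangular integer matrix and return its row width."""
    if not matrix:
        raise ValueError("matrix must contain at least one row")
    width = len(matrix[0])
    for index, row in enumerate(matrix):
        if len(row) != width:
            raise ValueError(f"row {index} has {len(row)} items, expected {width}")
        # bool is a subclass of int in Python, so exclude it explicitly.
        if not all(isinstance(v, int) and not isinstance(v, bool) for v in row):
            raise ValueError(f"row {index} contains a non-integer value")
    return width


if __name__ == "__main__":
    print(check_combinations([[1, 2], [3, 4], [5, 6]]))
```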

Whether to run EmptyDrops analysis as part of HashedDrops.

type: boolean

Options for preprocessing data for HTODemux and MultiSeq demultiplexing.

Method for feature selection.

type: string

default: mean.var.plot

Delimiter for parsing feature names.

type: string

default: _

Number of features to select.

type: integer

default: 2000

Assay type for preprocessing.

type: string

default: HTO

Margin parameter for preprocessing.

type: integer

default: 2

Normalization method to use.

type: string

default: CLR

Column containing gene information.

type: integer

default: 2

Options controlling hash summary and downstream exported formats.

Generate AnnData (.h5ad) outputs for hashing results.

type: boolean

default: true

Generate MuData outputs for hashing results.

type: boolean

default: true

Options specific to the CellSNP-lite tool for genotyping bi-allelic SNPs on single cells.

Tag for cell barcodes, e.g., CB for 10x Genomics. Set to ‘None’ for bulk RNA-seq or SMART-seq2.

type: string

default: CB

Tag for UMI barcodes, e.g., UB for 10x Genomics. Set to ‘None’ for bulk RNA-seq or SMART-seq2 without UMIs.

type: string

default: Auto

Minimum aggregated count (across cells) for SNPs to be included in the output.

type: integer

default: 20

Minimum minor allele frequency (MAF) for SNPs to be included in the output.

type: number

Required flags in SAM/BAM: skip reads that don’t have ALL of these flags. See SAM format specification for details.

type: string

Excluding flags in SAM/BAM: skip reads that have ANY of these flags. See SAM format specification for details.

type: string

Minimum read length (after clipping) for a read to be included.

type: integer

default: 30

Minimum mapping quality for a read to be included.

type: integer

default: 20

Maximum read depth at a position per input file. Set to 0 for highest possible value.

type: integer

If true, do not skip anomalous read pairs (i.e., count orphan reads).

type: boolean

Options specific to the Vireo tool for donor demultiplexing from single-cell RNA-seq data.

The tag for donor genotype in VCF file. Options: GT, GP, PL.

type: string

If true, do not check for doublets during demultiplexing.

type: boolean

Number of random initializations when GT needs to be learned.

type: integer

default: 50

Number of extra donors in pre-cluster, when GT needs to be learned.

type: integer

Method for searching from extra donors. ‘size’: n_cell per donor; ‘distance’: GT distance between donors.

type: string

If true, treat donor GT as prior only and learn genotypes from data.

type: boolean

default: true

If true, turn on SNP specific allelic ratio (ASE mode).

type: boolean

If true, turn off plotting GT distance.

type: boolean

Random seed for initialization.

type: integer

Range of cells to process, e.g., ‘0-10000’. Default is ‘all’.

type: string

default: all
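The two accepted forms of this value ('all' or a 'start-end' range) can be told apart as in the sketch below. The helper name `parse_cell_range` is hypothetical, and the sketch deliberately does not decide whether the end of the range is inclusive, since the description does not say.

```python
from typing import Optional, Tuple


def parse_cell_range(value: str) -> Optional[Tuple[int, int]]:
    """Return None for 'all', or the (start, end) integers of a 'start-end' string."""
    if value == "all":
        return None
    start_text, sep, end_text = value.partition("-")
    if not sep or not start_text.isdigit() or not end_text.isdigit():
        raise ValueError(f"expected 'all' or 'start-end', got {value!r}")
    start, end = int(start_text), int(end_text)
    if start > end:
        raise ValueError("range start must not exceed range end")
    return start, end


if __name__ == "__main__":
    print(parse_cell_range("all"))
    print(parse_cell_range("0-10000"))
```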

If true, detect ambient RNAs in each cell (experimental feature).

type: boolean

Options specific to the DSC-Pileup tool for pileup generation from single-cell BAM files.

Tag representing readgroup or cell barcodes to partition the BAM file into multiple groups. For 10x Genomics, use CB.

type: string

default: CB

Tag representing UMIs. For 10x Genomics, use UB.

type: string

default: UB

Maximum base quality (higher BQ will be capped).

type: integer

default: 40

Minimum base quality to consider (lower BQ will be skipped).

type: integer

default: 13

Minimum mapping quality to consider (lower MQ will be ignored).

type: integer

default: 20

Minimum distance to the tail (lower will be ignored).

type: integer

SAM/BAM FLAGs to be excluded.

type: integer

default: 3844
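The default value is a bitwise combination of SAM FLAG bits. The sketch below decodes an integer into its component bits using the bit values from the SAM format specification; the short labels and the function name `decode_flags` are informal names chosen for this example.

```python
from typing import List

# Bit values as defined in the SAM format specification; labels are informal.
SAM_FLAGS = {
    0x1: "paired",
    0x2: "proper_pair",
    0x4: "unmapped",
    0x8: "mate_unmapped",
    0x10: "reverse_strand",
    0x20: "mate_reverse_strand",
    0x40: "first_in_pair",
    0x80: "second_in_pair",
    0x100: "secondary",
    0x200: "qc_fail",
    0x400: "duplicate",
    0x800: "supplementary",
}


def decode_flags(value: int) -> List[str]:
    """List the labels of all FLAG bits set in `value`, lowest bit first."""
    return [label for bit, label in SAM_FLAGS.items() if value & bit]


if __name__ == "__main__":
    print(decode_flags(3844))
```

For 3844 (4 + 256 + 512 + 1024 + 2048), the set bits are unmapped, secondary, QC fail, duplicate, and supplementary, so reads carrying any of these are the ones excluded.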

Minimum number of total reads for a droplet/cell to be considered.

type: integer

Minimum number of unique reads (determined by UMI/SNP pair) for a droplet/cell to be considered.

type: integer

Minimum number of SNPs with coverage for a droplet/cell to be considered.

type: integer

Options specific to the Demuxlet tool for genotype-based demultiplexing of single-cell RNA-seq data.

FORMAT field to extract the genotype, likelihood, or posterior from.

type: string

default: GT

Offset of genotype error rate. [error] = [offset] + [1-offset] * [coeff] * [1-r2]

type: number

default: 0.1

Slope of genotype error rate. [error] = [offset] + [1-offset] * [coeff] * [1-r2]

type: number
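The error-rate expression shared by the offset and slope parameters can be evaluated as in the sketch below. It reads the bracketed terms as being multiplied together, which is an assumption about the formula as displayed here; the function name `genotype_error` and the example coefficient of 0.5 are hypothetical and chosen only to show the arithmetic.

```python
def genotype_error(offset: float, coeff: float, r2: float) -> float:
    """Evaluate error = offset + (1 - offset) * coeff * (1 - r2)."""
    return offset + (1.0 - offset) * coeff * (1.0 - r2)


if __name__ == "__main__":
    # Illustrative inputs only: offset 0.1 (the default shown above),
    # a hypothetical slope of 0.5, and an imputation R2 of 0.8.
    print(genotype_error(offset=0.1, coeff=0.5, r2=0.8))
```

With these inputs the result is approximately 0.19. With a slope of 0 the error rate reduces to the offset alone, and with perfect imputation quality (R2 = 1) the slope term also vanishes.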

INFO field name representing R2 value. Used for representing imputation quality.

type: string

default: R2

Minimum minor allele frequency.

type: integer

default: 1

Minimum call rate.

type: number

default: 0.5

Grid of alpha to search for.

type: string

default: 0.1,0.2,0.3,0.4,0.5

Prior probability of doublet.

type: number

default: 0.5

Options specific to the Freemuxlet tool for reference-free genotype-based demultiplexing.

Prior probability of doublet.

type: number

default: 0.5

Genotype error parameter per cluster.

type: number

default: 0.1

Bayes Factor Threshold used in the initial clustering.

type: number

default: 5.41

Fraction of droplets to be clustered in the very first round of initial clustering procedure.

type: number

default: 1

Iteration for initial cluster assignment (set to zero to skip the iterations).

type: integer

default: 10

Keep missing cluster assignment as missing in the initial iteration.

type: boolean

Randomize the singlet scores to test its effect.

type: boolean

Seed for random number (use clocks if not set).

type: integer

Options specific to the Souporcell tool for clustering mixed-genotype scRNAseq experiments by individual.

Ploidy, must be 1 or 2.

type: integer

Min alt to use locus.

type: integer

default: 10

Min ref to use locus.

type: integer

default: 10

Max loci per cell, affects speed.

type: integer

default: 2048

Number of restarts in clustering. When there are more than 12 clusters, we recommend increasing this to avoid local minima.

type: integer

default: 100

VCF of common variant loci or known variant loci. Must be based on the same reference FASTA.

type: string

pattern: ^\S+\.vcf(\.gz)?$

Known variants per clone in population VCF mode. Must be an uncompressed .vcf file; gzip and BCF are not currently accepted.

type: string

pattern: ^\S+\.vcf$

Which samples in population vcf from known genotypes option represent the donors in your sample. Provide space-separated sample names for multiple donors.

type: string

Don’t remap with minimap2 (not recommended unless used in conjunction with --common_variants).

type: boolean

Set to True to ignore data error assertions.

type: boolean

Parameters used to describe centralised config profiles. These should not be edited.

Git commit id for Institutional configs.

hidden

type: string

default: master

Base directory for Institutional configs.

hidden

type: string

default: https://raw.githubusercontent.com/nf-core/configs/master

Institutional config name.

hidden

type: string

Institutional config description.

hidden

type: string

Institutional config contact information.

hidden

type: string

Institutional config URL link.

hidden

type: string

Less common options for the pipeline, typically set in a config file.

Display version and exit.

hidden

type: boolean

Method used to save pipeline results to output directory.

hidden

type: string

Email address for completion summary, only when pipeline fails.

hidden

type: string

pattern: ^([a-zA-Z0-9_\-\.]+)@([a-zA-Z0-9_\-\.]+)\.([a-zA-Z]{2,5})$

Send plain-text email instead of HTML.

hidden

type: boolean

File size limit when attaching MultiQC reports to summary emails.

hidden

type: string

default: 25.MB

pattern: ^\d+(\.\d+)?\.?\s*(K|M|G|T)?B$
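The size pattern above accepts a number, an optional dot, optional whitespace, and a unit. The sketch below tries a few strings against it; the helper name `is_valid_size` is hypothetical, and only the regular expression comes from the parameter description.

```python
import re

# Pattern copied from the parameter description above.
SIZE_PATTERN = re.compile(r"^\d+(\.\d+)?\.?\s*(K|M|G|T)?B$")


def is_valid_size(value: str) -> bool:
    """Return True if the string matches the documented size pattern."""
    return SIZE_PATTERN.search(value) is not None


if __name__ == "__main__":
    for candidate in ("25.MB", "25 MB", "1.5GB", "25MiB", "MB"):
        print(candidate, is_valid_size(candidate))
```

The default `25.MB` matches, as do `25 MB` and `1.5GB`; binary-style units such as `25MiB` and strings without a number do not.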

Do not use coloured log outputs.

hidden

type: boolean

Incoming hook URL for messaging service.

hidden

type: string

Custom config file to supply to MultiQC.

hidden

type: string

Custom logo file to supply to MultiQC. File name must also be set in the MultiQC config file.

hidden

type: string

Custom MultiQC yaml file containing HTML including a methods description.

type: string

Whether to validate parameters against the schema at runtime.

hidden

type: boolean

default: true

Base URL or local path to location of pipeline test dataset files.

hidden

type: string

default: https://raw.githubusercontent.com/nf-core/test-datasets/

Suffix to add to the trace report filename. Default is the date and time in the format yyyy-MM-dd_HH-mm-ss.

hidden

type: string

Display the help message.

type: boolean,string

Display the full detailed help message.

type: boolean

Display hidden parameters in the help message (only works when --help or --help_full are provided).

type: boolean
