nf-core/variantbenchmarking

Pipeline to evaluate and validate the accuracy of variant calling methods in genomic research

benchmark, small-variants, structural-variants, variant-calling

These pages are for an old version of the pipeline (1.2.0). The latest stable release is 1.4.0.

Launch version 1.2.0 https://github.com/nf-core/variantbenchmarking

Define where the pipeline should find input data and save output data.

Path to comma-separated file containing information about the samples in the experiment.

required

type: string

pattern: ^\S+\.(csv|tsv|yaml|yml|json)$
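The pattern above can be checked before launching a run. The following is a minimal Python sketch, not part of the pipeline itself; the function name `is_valid_samplesheet_path` is hypothetical. Note that although the description mentions a comma-separated file, the pattern as written also accepts `.tsv`, `.yaml`, `.yml`, and `.json` extensions, and rejects any path containing whitespace.

```python
import re

# Pattern copied verbatim from the parameter definition above.
SAMPLESHEET_PATTERN = re.compile(r"^\S+\.(csv|tsv|yaml|yml|json)$")


def is_valid_samplesheet_path(path: str) -> bool:
    """Return True if the path matches the documented pattern.

    Hypothetical helper for illustration; it checks the file name only,
    not whether the file exists or has valid contents.
    """
    return SAMPLESHEET_PATTERN.match(path) is not None


if __name__ == "__main__":
    for candidate in ["samplesheet.csv", "samples.yaml", "my samples.csv", "samplesheet.txt"]:
        print(candidate, is_valid_samplesheet_path(candidate))
```

Paths with spaces fail because `\S+` must cover everything before the extension.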

The output directory where the results will be saved. Use absolute paths for storage on cloud infrastructure.

required

type: string

Truth ID: the sample name used to identify the truth VCF.

required

type: string

The analysis type used by the input files

required

type: string

Variant types to benchmark

required

type: string

The benchmarking methods to use. For germline small variants (SNV and INDEL), use happy and/or rtgtools. For somatic small variants (SNV and INDEL), use sompy and/or rtgtools. For structural variants, use wittyer, truvari and/or svanalyzer. For copy number variations, use wittyer and/or truvari. Use intersect to intersect BED files. Should be a comma-separated list of one or more of the following options: truvari, svanalyzer, happy, sompy, rtgtools, wittyer, intersect

required

type: string
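The comma-separated method list described above can be sanity-checked with a short script. This is a hedged sketch rather than the pipeline's own validation: `parse_methods` is a hypothetical helper, and the keys of `RECOMMENDED_METHODS` are descriptive labels taken from the text, not values the pipeline accepts.

```python
# Allowed values as listed in the parameter description above.
ALLOWED_METHODS = {
    "truvari", "svanalyzer", "happy", "sompy", "rtgtools", "wittyer", "intersect",
}

# Recommendations restated from the description. The keys are plain-English
# labels chosen for this sketch (assumption), not pipeline parameter values.
RECOMMENDED_METHODS = {
    "germline small variants": {"happy", "rtgtools"},
    "somatic small variants": {"sompy", "rtgtools"},
    "structural variants": {"wittyer", "truvari", "svanalyzer"},
    "copy number variations": {"wittyer", "truvari"},
}


def parse_methods(value: str) -> list[str]:
    """Split a comma-separated method string and reject unknown entries."""
    methods = [item.strip() for item in value.split(",") if item.strip()]
    if not methods:
        raise ValueError("at least one benchmarking method is required")
    unknown = [m for m in methods if m not in ALLOWED_METHODS]
    if unknown:
        raise ValueError(f"unknown benchmarking method(s): {', '.join(unknown)}")
    return methods


if __name__ == "__main__":
    chosen = parse_methods("happy,rtgtools")
    # Check the choice against the recommendation for germline small variants.
    print(set(chosen) <= RECOMMENDED_METHODS["germline small variants"])
```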

Path to regions BED or VCF files. Works similarly to bcftools -R.

type: string

pattern: ^\S+\.(bed|vcf)?(\.gz)?$

Path to targets BED. Works similarly to bcftools -T. Only used with happy, sompy or rtgtools.

type: string

pattern: ^\S+\.(bed|vcf)?(\.gz)?$

Path to false positive BED. Only applicable to the happy and sompy tools.

type: string

pattern: ^\S+\.(bed)?(\.gz)?$

Path to ambiguous BED. Only applicable to the sompy tool.

type: string

pattern: ^\S+\.(bed)?(\.gz)?$

Path to the golden set VCF files.

type: string

pattern: ^\S+\.vcf(\.gz)?$

The preprocessing steps to perform on the input files. Should be a comma-separated list of one or more of the following options: split_multiallelic, normalizate, deduplicate, prepy, filter_contigs

type: string

The standardization methods to perform on the input files. Should be a comma-separated list of one or more of the following options: homogenize, svync

type: string

Minimum SV size of variants to benchmark. Use 0 to disable.

type: integer

Maximum SV size of variants to benchmark. Use -1 to disable.

type: integer

default: -1

Minimum allele frequency of variants to benchmark. Use -1 to disable.

type: number

default: -1

Minimum number of reads supporting variants to benchmark. Use -1 to disable.

type: integer

default: -1
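The four size, frequency, and read-support thresholds above each use a sentinel value to mean "disabled". The sketch below illustrates that convention in Python. It is not the pipeline's implementation: the field names, the default of 0 for the minimum SV size, and the choice to treat thresholds as inclusive bounds are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Thresholds:
    # Field names are hypothetical; sentinels follow the descriptions above.
    min_sv_size: int = 0            # 0 disables (default of 0 is an assumption)
    max_sv_size: int = -1           # -1 disables
    min_allele_freq: float = -1.0   # -1 disables
    min_num_reads: int = -1         # -1 disables


def keep_variant(t: Thresholds, sv_size: int, allele_freq: float, num_reads: int) -> bool:
    """Return True if a variant passes every enabled threshold.

    Bounds are treated as inclusive here, which is an assumption.
    """
    if t.min_sv_size > 0 and sv_size < t.min_sv_size:
        return False
    if t.max_sv_size != -1 and sv_size > t.max_sv_size:
        return False
    if t.min_allele_freq != -1 and allele_freq < t.min_allele_freq:
        return False
    if t.min_num_reads != -1 and num_reads < t.min_num_reads:
        return False
    return True


if __name__ == "__main__":
    # With every filter disabled, any variant is kept.
    print(keep_variant(Thresholds(), sv_size=10, allele_freq=0.01, num_reads=1))
    # With a minimum SV size of 50, a 30 bp variant is dropped.
    print(keep_variant(Thresholds(min_sv_size=50), sv_size=30, allele_freq=0.5, num_reads=20))
```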

Use bcftools expressions (https://samtools.github.io/bcftools/bcftools.html#expressions) to exclude variants.

type: string

Use bcftools expressions (https://samtools.github.io/bcftools/bcftools.html#expressions) to include variants.

type: string

Email address for completion summary.

type: string

pattern: ^([a-zA-Z0-9_\-\.]+)@([a-zA-Z0-9_\-\.]+)\.([a-zA-Z]{2,5})$

MultiQC report title. Printed as page header, used for filename if not otherwise specified.

type: string

Reference genome related files and options required for the workflow.

Name of iGenomes reference.

type: string

Path to FASTA genome file.

required

type: string

pattern: ^\S+\.fn?a(sta)?(\.gz)?$

Path to FAI genome file.

required

type: string

pattern: ^\S+\.fai$

The SDF file needed to run rtgtools vcfeval

type: string

pattern: ^\S+\.sdf$

Path to a directory containing the stratification BED files. This directory must be given together with stratification_tsv, which lists the BED files. Only applicable to the happy tool.

type: string

File listing the stratification BED files, to be used together with stratification_bed.

type: string

pattern: ^\S+\.tsv$
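Because stratification_bed and stratification_tsv must agree with each other, it can help to confirm that every BED file listed in the TSV is present in the directory. The sketch below assumes the two-column layout used by hap.py stratification lists (region name, then BED file name, tab-separated); that layout and the helper name `missing_stratification_beds` are assumptions, not something this page specifies.

```python
import csv
from pathlib import Path


def missing_stratification_beds(stratification_tsv: str, stratification_bed: str) -> list[str]:
    """Return BED file names listed in the TSV but absent from the directory.

    Assumes each non-empty, non-comment row is: <region name> TAB <BED file name>.
    """
    bed_dir = Path(stratification_bed)
    missing = []
    with open(stratification_tsv, newline="") as handle:
        for row in csv.reader(handle, delimiter="\t"):
            # Skip blank lines and comment lines.
            if not row or row[0].startswith("#"):
                continue
            if len(row) < 2:
                raise ValueError(f"expected two tab-separated columns, got: {row}")
            bed_name = row[1].strip()
            if not (bed_dir / bed_name).is_file():
                missing.append(bed_name)
    return missing
```

An empty list means every listed file was found; anything else names the files to fix before starting the run.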

Do not load the iGenomes reference config.

hidden

type: boolean

The base path to the igenomes reference files

hidden

type: string

default: s3://ngi-igenomes/igenomes/

Run the liftover workflow on the given files. Options: test, truth

hidden

type: string

Path to the chain file required for liftover.

type: string

pattern: ^\S+\.(chain|bed)?(\.gz)?$

Path to the chromosome renaming file used for liftover.

type: string

pattern: ^\S+\.txt$

The dictionary file required for the liftover process. It must be the .dict file of the genome used in the workflow.

type: string

pattern: ^\S+\.dict$

Parameters used to describe centralized config profiles. These should not be edited.

Git commit id for Institutional configs.

hidden

type: string

default: master

Base directory for Institutional configs.

hidden

type: string

default: https://raw.githubusercontent.com/nf-core/configs/master

Institutional config name.

hidden

type: string

Institutional config description.

hidden

type: string

Institutional config contact information.

hidden

type: string

Institutional config URL link.

hidden

type: string

Base path / URL for data used in the test profiles

hidden

type: string

default: https://raw.githubusercontent.com/nf-core/test-datasets/variantbenchmarking

Less common options for the pipeline, typically set in a config file.

Display version and exit.

hidden

type: boolean

Method used to save pipeline results to output directory.

hidden

type: string

Email address for completion summary, only when pipeline fails.

hidden

type: string

pattern: ^([a-zA-Z0-9_\-\.]+)@([a-zA-Z0-9_\-\.]+)\.([a-zA-Z]{2,5})$

Send plain-text email instead of HTML.

hidden

type: boolean

File size limit when attaching MultiQC reports to summary emails.

hidden

type: string

default: 25.MB

pattern: ^\d+(\.\d+)?\.?\s*(K|M|G|T)?B$
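To show which size strings the pattern above accepts, here is a small Python sketch that parses a value such as the default `25.MB` into a byte count. The helper name `size_to_bytes` is hypothetical, and the use of 1024-based multipliers is an assumption made for this example.

```python
import re

# Same language as the documented pattern, rewritten with named groups so the
# number and the unit prefix can be extracted.
SIZE_PATTERN = re.compile(r"^(?P<number>\d+(?:\.\d+)?)\.?\s*(?P<prefix>[KMGT]?)B$")

# 1024-based multipliers (assumption for this sketch).
MULTIPLIERS = {"": 1, "K": 1024, "M": 1024**2, "G": 1024**3, "T": 1024**4}


def size_to_bytes(value: str) -> int:
    """Convert a size string such as '25.MB' or '1.5 GB' to bytes."""
    match = SIZE_PATTERN.match(value)
    if match is None:
        raise ValueError(f"size does not match the documented pattern: {value!r}")
    return int(float(match.group("number")) * MULTIPLIERS[match.group("prefix")])


if __name__ == "__main__":
    for candidate in ["25.MB", "25MB", "1.5 GB", "10B"]:
        print(candidate, size_to_bytes(candidate))
```

The pattern is case-sensitive and requires the trailing `B`, so values such as `25 mb` or a bare `25` are rejected.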

Do not use coloured log outputs.

hidden

type: boolean

Do not use coloured log outputs.

hidden

type: boolean

Incoming hook URL for messaging service

hidden

type: string

Custom config file to supply to MultiQC.

hidden

type: string

Custom logo file to supply to MultiQC. The file name must also be set in the MultiQC config file.

hidden

type: string

Custom MultiQC yaml file containing HTML including a methods description.

type: string

Whether to validate parameters against the schema at runtime.

hidden

type: boolean

default: true

Base URL or local path to location of pipeline test dataset files

hidden

type: string

default: https://raw.githubusercontent.com/nf-core/test-datasets/

Suffix to add to the trace report filename. Default is the date and time in the format yyyy-MM-dd_HH-mm-ss.

hidden

type: string
