Description

Annotate a VCF file with ensemblvep and/or snpeff, then bgzip and tabix-index the resulting VCF file. This subworkflow uses the scatter-gather method to run VEP/snpEff in parallel and increase throughput: the input VCF is split into multiple smaller VCFs of fixed size, which are annotated separately and then concatenated back into a single output file per sample. Only VCF/BCF outputs are currently supported.
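The scatter-gather idea described above can be sketched in a few lines of Python. This is only an illustration of the pattern, not the subworkflow's actual implementation: the function names (`scatter`, `annotate_chunk`, `scatter_gather`) and the placeholder annotation string are hypothetical, and the variant records are plain strings rather than real VCF lines.

```python
from typing import Callable, Iterator, List


def scatter(records: List[str], sites_per_chunk: int) -> Iterator[List[str]]:
    """Yield consecutive chunks holding at most `sites_per_chunk` records."""
    for start in range(0, len(records), sites_per_chunk):
        yield records[start:start + sites_per_chunk]


def annotate_chunk(chunk: List[str]) -> List[str]:
    """Hypothetical stand-in for one VEP/snpEff run: tags every record."""
    return [f"{record};ANN=placeholder" for record in chunk]


def scatter_gather(
    records: List[str],
    sites_per_chunk: int,
    annotate: Callable[[List[str]], List[str]] = annotate_chunk,
) -> List[str]:
    """Split the records, annotate each chunk, and concatenate in order."""
    chunks = list(scatter(records, sites_per_chunk))
    # Each chunk is independent, so these calls could run in parallel.
    annotated = [annotate(chunk) for chunk in chunks]
    # Gather: flatten the chunks back into one list, preserving order.
    return [record for chunk in annotated for record in chunk]


# Seven toy variants split into chunks of three (sizes 3, 3 and 1).
variants = [f"chr1:{pos}" for pos in range(1, 8)]
result = scatter_gather(variants, sites_per_chunk=3)
```

Because the chunks are gathered in the order they were produced, the concatenated output keeps the record order of the input.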

Input

Name (Type)
Description
Pattern

ch_vcf

VCF file to annotate, with its tabix index
Structure: [ val(meta), path(vcf), path(tbi) ]

ch_fasta

Reference genome FASTA file (optional)
Structure: [ val(meta2), path(fasta) ]

val_vep_genome (string)

Genome to use for ensemblvep

val_vep_species (string)

Species to use for ensemblvep

val_vep_cache_version (integer)

Cache version to use for ensemblvep

ch_vep_cache

The root cache folder for ensemblvep (optional)
Structure: [ path(cache) ]

ch_vep_extra_files

Any extra files needed by ensemblvep plugins (optional)
Structure: [ path(file1), path(file2)… ]

val_snpeff_db (string)

Database to use for snpeff

ch_snpeff_cache

The root cache folder for snpeff (optional)
Structure: [ path(cache) ]

val_tools_to_use (list)

The tools to use. Options: ["ensemblvep", "snpeff"]

val_sites_per_chunk (integer)

The number of variants per scattered VCF.
Set this value to null, [] or false to disable scattering.
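As a rough illustration of how val_sites_per_chunk affects the amount of work, the sketch below estimates how many VCFs would be annotated for one sample. The helper name `plan_chunks` is hypothetical, and it assumes that the last chunk simply holds the remaining variants; Python's `None`, `[]` and `False` stand in for null, [] and false.

```python
import math


def plan_chunks(total_sites: int, sites_per_chunk) -> int:
    """Estimate how many VCFs are annotated for one sample."""
    # None, [] and False are all falsy, which mirrors the documented
    # "null, [] or false" values that disable scattering.
    if not sites_per_chunk:
        return 1
    # Assumption: the final chunk holds whatever variants remain.
    return max(1, math.ceil(total_sites / sites_per_chunk))


scattered = plan_chunks(10_001, 2_500)    # four full chunks plus one extra
unscattered = plan_chunks(10_001, None)   # scattering disabled
```

A smaller chunk size gives more parallel annotation jobs but also more files to concatenate afterwards.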

Output

Name (Type)
Description
Pattern

vcf_tbi

Bgzip-compressed VCF file and its tabix index
Structure: [ val(meta), path(vcf), path(tbi) ]

reports (file)

HTML reports

*.html

versions (file)

File containing software versions

versions.yml