Description

Generates SNP tables from GATK UnifiedGenotyper VCF files, with functionality geared towards ancient DNA (aDNA)

Input

name:type
description
pattern

meta

:map

Groovy Map containing sample information e.g. [ id:'sample1', single_end:false ]

vcfs

:file

One or more gzipped or uncompressed VCF files

*.vcf

meta2

:map

Groovy Map containing sample information e.g. [ id:'sample1', single_end:false ]

fasta

:file

Reference genome that the VCF files were generated against

*.{fasta,fna,fa}

meta3

:map

Groovy Map containing sample information e.g. [ id:'sample1', single_end:false ]

snpeff_results

:file

Results from snpEff in txt format (Optional)

*.txt

meta4

:map

Groovy Map containing sample information e.g. [ id:'sample1', single_end:false ]

gff

:file

GFF file corresponding to reference genome fasta (Optional)

*.gff

allele_freqs

:boolean

Whether to include in the SNP table the percentage of reads in which a given allele is present.

genotype_quality

:integer

Minimum GATK genotyping quality threshold; SNP calls falling below it are ‘discarded’

coverage

:integer

Minimum number of reads that must cover a position for it to be reported

homozygous_freq

:integer

Minimum fraction of reads that must support a base for it to be called ‘homozygous’

heterozygous_freq

:integer

Minimum fraction of reads that must support a base for it to be called ‘heterozygous’; applies to calls that fall above this value but below the homozygous threshold.

meta5

:map

Groovy Map containing sample information e.g. [ id:'sample1', single_end:false ]

gff_exclude

:file

File listing positions that will be ‘filtered’ (i.e. ignored) (Optional)

*.vcf
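
The four threshold inputs above (genotype_quality, coverage, homozygous_freq, heterozygous_freq) interact, so a small sketch may help. It is not MultiVCFAnalyzer's implementation. The function name `classify_call`, the default values, the "uncalled" label, the treatment of the frequencies as fractions between 0 and 1, and the handling of values exactly on a threshold are all assumptions made for illustration.

```python
def classify_call(allele_fraction, coverage, genotype_quality,
                  min_coverage=5, min_genotype_quality=30,
                  homozygous_freq=0.9, heterozygous_freq=0.1):
    """Hypothetical helper mirroring the threshold descriptions above.

    allele_fraction: fraction of reads supporting the base (0.0 to 1.0).
    All default values are illustrative, not the tool's defaults.
    """
    # Positions below the coverage or genotype quality minimum are not reported.
    if coverage < min_coverage or genotype_quality < min_genotype_quality:
        return "discarded"
    # At or above the homozygous threshold: homozygous call (boundary is an assumption).
    if allele_fraction >= homozygous_freq:
        return "homozygous"
    # Above the heterozygous threshold but below the homozygous one.
    if allele_fraction > heterozygous_freq:
        return "heterozygous"
    # Below both thresholds; the label used here is an assumption.
    return "uncalled"


if __name__ == "__main__":
    for fraction in (0.95, 0.5, 0.05):
        print(fraction, classify_call(fraction, coverage=10, genotype_quality=40))
```

With these illustrative defaults, a base supported by half of the reads at a well-covered, high-quality position falls between the two frequency thresholds and is labelled heterozygous.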

Output

name:type
description
pattern

full_alignment

meta

:map

Groovy Map containing sample information

*fullAlignment.fasta.gz

:file

A fasta file of all positions contained in the VCF files, i.e. including ref calls

.fasta.gz

info_txt

meta

:map

Groovy Map containing sample information

*info.txt

:file

Information about the run

.txt

snp_alignment

meta

:map

Groovy Map containing sample information

*snpAlignment.fasta.gz

:file

A fasta file of just SNP positions with samples only

.fasta.gz

snp_genome_alignment

meta

:map

Groovy Map containing sample information

*snpAlignmentIncludingRefGenome.fasta.gz

:file

A fasta file of just SNP positions with reference genome

.fasta.gz

snpstatistics

meta

:map

Groovy Map containing sample information

*snpStatistics.tsv

:file

Some basic statistics about the SNP calls of each sample

.tsv

snptable

meta

:map

Groovy Map containing sample information

*snpTable.tsv

:file

Basic SNP table of combined positions taken from each VCF file

.tsv

snptable_snpeff

meta

:map

Groovy Map containing sample information

*snpTableForSnpEff.tsv

:file

Input file for SnpEff

.tsv

snptable_uncertainty

meta

:map

Groovy Map containing sample information

*snpTableWithUncertaintyCalls.tsv

:file

Same as the basic SNP table, but with lower-case characters indicating uncertain calls

.tsv

structure_genotypes

meta

:map

Groovy Map containing sample information

*structureGenotypes.tsv

:file

Input file for STRUCTURE

.tsv

structure_genotypes_nomissing

meta

:map

Groovy Map containing sample information

*structureGenotypes_noMissingData-Columns.tsv

:file

Alternate input file for STRUCTURE

.tsv

json

meta

:map

Groovy Map containing sample information

*MultiVCFAnalyzer.json

:file

Summary statistics in MultiQC JSON format

.json

versions_multivcfanalyzer

${task.process}

:string

The name of the process

multivcfanalyzer

:string

The name of the tool

multivcfanalyzer -h | head -n 1 | cut -f 3 -d " "

:eval

The expression to obtain the version of the tool

versions_tabix

${task.process}

:string

The name of the process

tabix

:string

The name of the tool

tabix -h 2>&1 | grep Version | cut -f 2 -d " "

:eval

The expression to obtain the version of the tool
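
The two version expressions above take one line of help output and select a single space-delimited field with `cut`. A minimal sketch of that field selection follows, assuming the tools print a line shaped like the placeholders used here. The banner strings and version numbers are hypothetical and are not real output from either tool, and `cut_field` is a helper name introduced only for this example.

```python
def cut_field(line, field=3, delimiter=" "):
    """Mimic `cut -f FIELD -d DELIMITER` for a single line (1-based field)."""
    # Like cut, pass the line through unchanged when it has no delimiter.
    if delimiter not in line:
        return line
    # Split on every single delimiter, so repeated spaces yield empty fields.
    parts = line.split(delimiter)
    return parts[field - 1] if field <= len(parts) else ""


if __name__ == "__main__":
    # Hypothetical first line of `multivcfanalyzer -h` (placeholder version).
    print(cut_field("MultiVCFAnalyzer - 0.0.0", field=3))
    # Hypothetical line matched by `grep Version` in the tabix help text.
    print(cut_field("Version: 1.0", field=2))
```

If a tool changes the layout of its help banner, the selected field would no longer be the version string, which is why the expression is recorded alongside the tool name.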

Topics

name:type
description
pattern

versions

${task.process}

:string

The name of the process

multivcfanalyzer

:string

The name of the tool

multivcfanalyzer -h | head -n 1 | cut -f 3 -d " "

:eval

The expression to obtain the version of the tool

${task.process}

:string

The name of the process

tabix

:string

The name of the tool

tabix -h 2>&1 | grep Version | cut -f 2 -d " "

:eval

The expression to obtain the version of the tool

Tools

multivcfanalyzer
GPL >=3

MultiVCFAnalyzer is a VCF file post-processing tool tailored for aDNA. The license is stated in the GitHub repository.