Description

Hap.py is a tool for comparing diploid genotypes at the haplotype level. Rather than comparing VCF records row by row, hap.py generates and matches alternate sequences within a superlocus. A superlocus is a small region of the genome (between 1 and roughly 1000 bp) that contains one or more variants.
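To illustrate why matching alternate sequences differs from row-by-row comparison, here is a minimal Python sketch. It is not hap.py's implementation: the function `apply_variants`, the toy reference string, and the variant tuples are all hypothetical, and the sketch handles a single haplotype with non-overlapping variants only.

```python
def apply_variants(reference, variants):
    """Apply non-overlapping (pos, ref, alt) edits to a reference string.

    `pos` is 0-based within `reference`. Returns the alternate sequence.
    """
    pieces = []
    cursor = 0
    for pos, ref, alt in sorted(variants):
        if pos < cursor:
            raise ValueError("overlapping variants are not supported in this sketch")
        if reference[pos:pos + len(ref)] != ref:
            raise ValueError(f"REF allele does not match the reference at {pos}")
        pieces.append(reference[cursor:pos])  # unchanged sequence before the variant
        pieces.append(alt)                    # substituted allele
        cursor = pos + len(ref)
    pieces.append(reference[cursor:])         # unchanged tail
    return "".join(pieces)


# Hypothetical superlocus: a short stretch of reference sequence.
superlocus = "ACGTTTGA"

# The truth set encodes the change as one multi-nucleotide variant ...
truth = [(2, "GT", "CA")]
# ... while the query set reports the same change as two separate SNVs.
query = [(2, "G", "C"), (3, "T", "A")]

truth_hap = apply_variants(superlocus, truth)
query_hap = apply_variants(superlocus, query)
same_haplotype = truth_hap == query_hap
print(truth_hap, query_hap, same_haplotype)
```

The two record lists differ, so a row-by-row comparison would report a mismatch, yet both produce the same alternate sequence.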

Input

name:type
description
pattern

meta:map

Groovy Map containing sample information e.g. [ id:'test', single_end:false ]

query_vcf:file

VCF/GVCF file to query

*.{gvcf,vcf}.gz

truth_vcf:file

Gold-standard (truth) VCF file

*.{gvcf,vcf}.gz

regions_bed:file

Sparse regions to restrict the analysis to

*.bed

targets_bed:file

Dense regions to restrict the analysis to

*.bed

meta2:map

Groovy Map containing fasta file information e.g. [ id:'test2' ]

fasta:file

FASTA file of the reference genome

*.{fa,fasta}

meta3:map

Groovy Map containing fai file information e.g. [ id:'test3' ]

fasta_fai:file

The index of the reference FASTA

*.fai

meta4:map

Groovy Map containing false_positives_bed file information e.g. [ id:'test4' ]

false_positives_bed:file

False positive / confident call regions. Calls outside these regions will be labelled as UNK.

*.{bed,bed.gz}

meta5:map

Groovy Map containing stratification_tsv file information e.g. [ id:'test5' ]

stratification_tsv:file

Stratification file list in TSV format

*.tsv

meta6:map

Groovy Map containing stratification_beds file information e.g. [ id:'test6' ]

stratification_beds:file

One or more BED files used for stratification (these should be referenced in the stratification TSV)

*.bed
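The stratification_tsv input lists the BED files passed as stratification_beds. The Python sketch below writes such a list, assuming the two-column layout described in hap.py's help text (first column region name, second column file name). The region names, file names, and the helper `stratification_tsv` are hypothetical.

```python
import csv
import io


def stratification_tsv(regions):
    """Render {region name: BED file name} as tab-separated lines."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, delimiter="\t", lineterminator="\n")
    for name, bed in regions.items():
        if not bed.endswith(".bed"):
            raise ValueError(f"expected a .bed file for region {name!r}, got {bed!r}")
        writer.writerow([name, bed])
    return buffer.getvalue()


# Hypothetical stratification regions and the BED files that define them.
regions = {
    "exons": "exons.bed",
    "low_complexity": "low_complexity.bed",
}

tsv_text = stratification_tsv(regions)
print(tsv_text, end="")
```

Every file name in the second column should correspond to one of the BED files supplied as stratification_beds.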

Output

name:type
description
pattern

summary_csv

meta:map

Groovy Map containing sample information e.g. [ id:'test', single_end:false ]

*.summary.csv:file

A CSV file containing the summary of the benchmarking

*.summary.csv

roc_all_csv

meta:map

Groovy Map containing sample information e.g. [ id:'test', single_end:false ]

*.roc.all.csv.gz:file

A CSV file containing ROC values for all variants

*.roc.all.csv.gz

roc_indel_locations_csv

meta:map

Groovy Map containing sample information e.g. [ id:'test', single_end:false ]

*.roc.Locations.INDEL.csv.gz:file

A CSV file containing ROC values for all indels

*.roc.Locations.INDEL.csv.gz

roc_indel_locations_pass_csv

meta:map

Groovy Map containing sample information e.g. [ id:'test', single_end:false ]

*.roc.Locations.INDEL.PASS.csv.gz:file

A CSV file containing ROC values for all indels that passed all filters

*.roc.Locations.INDEL.PASS.csv.gz

roc_snp_locations_csv

meta:map

Groovy Map containing sample information e.g. [ id:'test', single_end:false ]

*.roc.Locations.SNP.csv.gz:file

A CSV file containing ROC values for all SNPs

*.roc.Locations.SNP.csv.gz

roc_snp_locations_pass_csv

meta:map

Groovy Map containing sample information e.g. [ id:'test', single_end:false ]

*.roc.Locations.SNP.PASS.csv.gz:file

A CSV file containing ROC values for all SNPs that passed all filters

*.roc.Locations.SNP.PASS.csv.gz

extended_csv

meta:map

Groovy Map containing sample information e.g. [ id:'test', single_end:false ]

*.extended.csv:file

A CSV file containing extended info of the benchmarking

*.extended.csv

runinfo

meta:map

Groovy Map containing sample information e.g. [ id:'test', single_end:false ]

*.runinfo.json:file

A JSON file containing the run info

*.runinfo.json

metrics_json

meta:map

Groovy Map containing sample information e.g. [ id:'test', single_end:false ]

*.metrics.json.gz:file

A gzipped JSON file containing the benchmarking metrics

*.metrics.json.gz

vcf

meta:map

Groovy Map containing sample information e.g. [ id:'test', single_end:false ]

*.vcf.gz:file

An annotated VCF

*.vcf.gz

tbi

meta:map

Groovy Map containing sample information e.g. [ id:'test', single_end:false ]

*.tbi:file

The index of the annotated VCF

*.tbi

versions

versions.yml:file

File containing software versions

versions.yml
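As a rough illustration of how the summary_csv output might be consumed downstream, the Python sketch below computes recall per variant type. The column names (`Type`, `Filter`, `TRUTH.TP`, `TRUTH.FN`) are assumed from typical hap.py summary files and should be checked against your own output. The inline numbers are made up for illustration and are not real benchmarking results.

```python
import csv
import io

# Made-up numbers for illustration only; not real benchmarking results.
example_summary = """Type,Filter,TRUTH.TP,TRUTH.FN
INDEL,PASS,90,10
SNP,PASS,950,50
"""


def recall_by_type(handle, filter_name="PASS"):
    """Return {variant type: TP / (TP + FN)} for rows with the given filter."""
    recalls = {}
    for row in csv.DictReader(handle):
        if row["Filter"] != filter_name:
            continue
        tp = int(row["TRUTH.TP"])
        fn = int(row["TRUTH.FN"])
        total = tp + fn
        # Leave recall undefined (None) when there are no truth variants.
        recalls[row["Type"]] = tp / total if total else None
    return recalls


recalls = recall_by_type(io.StringIO(example_summary))
print(recalls)
```

For a real run, replace the `io.StringIO` handle with `open("<prefix>.summary.csv", newline="")`, where `<prefix>` is whatever prefix your pipeline assigns to the module's outputs.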