Description

Detecting and estimating inter-sample DNA contamination has become a crucial quality assessment step for ensuring high-quality sequence reads and reliable downstream analysis.

Input

Name (Type)
Description
Pattern

meta (map)

Groovy Map containing sample information
e.g. [ id:'test', single_end:false ]

bam (file)

BAM/CRAM/SAM file

*.{bam,cram,sam}

bai (file)

BAI/CRAI/CSI index file

*.{bai,crai,csi}

svd_ud (file)

.UD matrix file from SVD result of genotype matrix

*.UD

svd_mu (file)

.mu matrix file of genotype matrix

*.mu

svd_bed (file)

.bed file of the markers used in this analysis, format: chr\tpos-1\tpos\trefAllele\taltAllele [Required]

*.bed

references (file)

Reference FASTA file [Required]

*.fasta

refvcf (file)

Reference panel VCF with genotype information, used to generate the .UD, .mu and .bed files [Optional]

*.vcf
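
The marker layout described for svd_bed (chr, pos-1, pos, refAllele, altAllele, tab-separated) can be sanity-checked before a run. The following is a minimal Python sketch under that stated layout only; the function name parse_marker_line is hypothetical and is not part of VerifyBamID2, and no checks are made on the allele strings themselves.

```python
def parse_marker_line(line):
    """Parse one marker line: chr, pos-1, pos, refAllele, altAllele (tab-separated)."""
    fields = line.rstrip("\n").split("\t")
    if len(fields) != 5:
        raise ValueError(f"expected 5 tab-separated fields, got {len(fields)}")
    chrom, start, end, ref, alt = fields
    start, end = int(start), int(end)
    # The second column is pos-1 and the third is pos, so they must differ by one.
    if end != start + 1:
        raise ValueError(f"expected end == start + 1, got {start} and {end}")
    return {"chrom": chrom, "pos": end, "ref": ref, "alt": alt}


# Illustrative line only, not taken from a real marker panel.
print(parse_marker_line("chr1\t999\t1000\tA\tG\n"))
```

Raising on the first malformed line keeps the sketch short; a real check might instead collect all offending lines and report them together.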

Output

Name (Type)
Description
Pattern

meta (map)

Groovy Map containing sample information
e.g. [ id:'test', single_end:false ]

mu (file)

.mu matrix file of the genotype matrix, generated from a custom reference VCF input

*.mu

ud (file)

.UD matrix file generated from a custom reference VCF input

*.UD

bed (file)

.bed file generated from a custom reference marker VCF input

*.bed

versions (file)

File containing software versions

versions.yml

log (file)

Detailed summary of the VerifyBamID2 results

*.log

self_sm (file)

Shares the same format as the legacy VerifyBamID (VB1) output; the key field, FREEMIX, gives the estimated contamination level.

*.selfSM

ancestry (file)

PC coordinates for both the intended sample and the contaminating sample, one PC per row.

*.Ancestry
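
Since FREEMIX is the key value in the *.selfSM output, a downstream step will often just pull that column out. The following is a minimal Python sketch, assuming the file is a tab-delimited table with a single header row that contains a FREEMIX column and carries the sample identifier in its first column. The function name read_freemix is hypothetical, and the example input is a made-up, reduced set of columns for illustration only.

```python
import csv
import io


def read_freemix(handle):
    """Return {sample_id: FREEMIX} from a tab-delimited table with a header row."""
    reader = csv.DictReader(handle, delimiter="\t")
    result = {}
    for row in reader:
        # Assumption: the first column holds the sample identifier.
        sample = row[reader.fieldnames[0]]
        result[sample] = float(row["FREEMIX"])
    return result


# Made-up example content; a real *.selfSM file has more columns.
example = io.StringIO("#SEQ_ID\tRG\tFREEMIX\nsample1\tNA\t0.0123\n")
print(read_freemix(example))
```

With a real file, the same function can be called on an open file handle, for example `with open(path) as fh: read_freemix(fh)`, where path is whatever the self_sm output resolves to in the pipeline.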

Tools

verifybamid2
MIT

A robust tool for estimating DNA contamination from sequence reads using an ancestry-agnostic method.