Description
Detecting and estimating inter-sample DNA contamination has become a crucial quality assessment step for ensuring high-quality sequence reads and reliable downstream analysis.
Input
A sorted, indexed, base-quality-recalibrated, and duplicate-marked BAM file.
It must also contain “@RG” header lines that annotate the different readGroups (sequencing runs and lanes).
The SM tag in each “@RG” header line should match one of the genotyped samples.
*.bam
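The “@RG” and SM requirements above can be checked before running the tool. The following is a minimal sketch, assuming the BAM header has already been exported as text (for example with `samtools view -H`). The function names, read-group IDs, and sample names are hypothetical and are not part of the tool itself.

```python
def read_group_samples(header_text):
    """Map each @RG ID to its SM tag, given SAM/BAM header text."""
    samples = {}
    for line in header_text.splitlines():
        if not line.startswith("@RG"):
            continue
        # Header fields are tab-separated TAG:VALUE pairs after the record type.
        fields = dict(
            field.split(":", 1) for field in line.split("\t")[1:] if ":" in field
        )
        samples[fields.get("ID", "")] = fields.get("SM")
    return samples


def unmatched_read_groups(header_text, genotyped_samples):
    """Return read-group IDs whose SM tag is missing or not a genotyped sample."""
    return sorted(
        rg_id
        for rg_id, sample in read_group_samples(header_text).items()
        if sample not in genotyped_samples
    )


# Hypothetical header with two read groups; only SAMPLE_A is genotyped.
header = (
    "@HD\tVN:1.6\tSO:coordinate\n"
    "@RG\tID:run1.lane1\tSM:SAMPLE_A\tPL:ILLUMINA\n"
    "@RG\tID:run1.lane2\tSM:SAMPLE_X\tPL:ILLUMINA\n"
)
print(unmatched_read_groups(header, {"SAMPLE_A", "SAMPLE_B"}))
```

An empty list would suggest that every readGroup carries an SM tag matching a genotyped sample. Any IDs returned point to header lines worth fixing first.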
Output
Per-sample statistics describing how well the sample matches the annotated sample.
*.selfSM
Per-readGroup statistics describing how well each lane matches the annotated sample. (available only without the --ignoreRG option)
*.selfRG
The depth distribution of the sequence reads per readGroup. (available only without the --ignoreRG option)
*.depthRG
Per-sample best-match statistics, reporting the best-matching sample among the genotyped samples. (available only with the --best option)
*.bestSM