Description

Determines whether sequencing data come from the same individual by matching SNPs. Designed for human data in VCF or BAM files.

Input

Name (Type)
Description
Pattern

meta (map)

Groovy Map containing sample information
e.g. [ id:'test' ]

files (file)

VCF files (optionally gzipped) or BAM files for each sample, in a merged channel. BAM files must be accompanied by their index (.bai) files.

*.{vcf,vcf.gz,bam,bai}

meta2 (map)

Groovy Map containing SNP information
e.g. [ id:'test' ]

snp_bed (file)

BED file containing the SNPs to analyse

*.{bed}

meta3 (map)

Groovy Map containing reference FASTA information
e.g. [ id:'test' ]

fasta (file)

FASTA file for the reference genome; only used in BAM mode

*.{fa,fasta}

Output

Name (Type)
Description
Pattern

versions (file)

File containing software versions

versions.yml

pdf (file)

A PDF containing a dendrogram showing how the samples match each other

*.{pdf}

corr_matrix (file)

A text file containing the pairwise correlation matrix between all samples

*corr_matrix.txt

matched (file)

A text file containing only the samples that match each other

*matched.txt

all (file)

A text file containing all sample comparisons, whether they match or not

*all.txt

vcf (file)

If run in BAM mode, VCF files for each sample containing the SNP calls used

*.vcf
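
The inputs and outputs listed above could be wired together roughly as in the following sketch of VCF mode. Several things here are assumptions rather than facts from this page: the process name `NGSCHECKMATE_NCM`, the include path, the file locations, the meta ids, and the use of an empty list `[]` to stand in for the unused FASTA input. The output names (`matched`, `corr_matrix`) are taken from the Output table, on the assumption that they are also the names of the emitted channels.

```nextflow
// Assumed process name and include path; adjust to your pipeline layout.
include { NGSCHECKMATE_NCM } from './modules/nf-core/ngscheckmate/ncm/main'

workflow {
    // Gather every sample VCF into one list under a single meta map,
    // i.e. the "merged channel" the files input expects.
    // 'data/*.vcf.gz' is a hypothetical location.
    ch_files = Channel
        .fromPath('data/*.vcf.gz')
        .collect()
        .map { files -> [ [ id:'cohort' ], files ] }

    // BED file with the SNPs to analyse (hypothetical path).
    ch_snp_bed = Channel.value([ [ id:'snps' ], file('snps.bed') ])

    // The FASTA is only used in BAM mode, so an empty list is passed here
    // (assumption: the module accepts an empty value for this input).
    ch_fasta = Channel.value([ [ id:'genome' ], [] ])

    NGSCHECKMATE_NCM(ch_files, ch_snp_bed, ch_fasta)

    // Inspect two of the outputs described in the Output table.
    NGSCHECKMATE_NCM.out.matched.view()
    NGSCHECKMATE_NCM.out.corr_matrix.view()
}
```

For BAM mode, the same sketch would instead collect the `*.bam` files together with their `*.bai` indexes into `ch_files` and pass a real reference FASTA in `ch_fasta`.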

Tools

ngscheckmate
MIT

NGSCheckMate is a software package for identifying next-generation sequencing (NGS) data files from the same individual, including matching between DNA and RNA.