Description

Summarizes counts of reads that support reference, alternate and other alleles for given sites. Results can be used with CalculateContamination. Requires a common germline variant sites file, such as from gnomAD.
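As a rough sketch of how the resulting table is handed to CalculateContamination, the snippet below assembles the downstream command and prints it instead of running it. The file names are hypothetical placeholders; `-I` and `-O` are the GATK short options for input and output.

```shell
# Hypothetical file names, not taken from this module
pileups="test.pileups.table"
contamination="test.contamination.table"

# Build the downstream command as a string (dry run: nothing is executed)
cmd="gatk CalculateContamination -I $pileups -O $contamination"
echo "$cmd"
```

Dropping the `echo` indirection and running the command directly would require a working GATK installation and a real pileups table.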

Input

name:type
description
pattern

meta

:map

Groovy Map containing sample information, e.g. [ id:'test' ]

input

:file

BAM/CRAM file to be summarised.

*.{bam,cram}

index

:file

Index file for the input BAM/CRAM file.

*.{bam.bai,cram.crai}

intervals

:file

File listing the sites to summarise. If it is not provided, the variants file is automatically used as the sites file instead.

*.interval_list

meta2

:map

Groovy Map containing reference information, e.g. [ id:'genome' ]

fasta

:file

Reference FASTA file.

*.fasta

meta3

:map

Groovy Map containing reference information, e.g. [ id:'genome' ]

fai

:file

Index of the reference FASTA file.

*.fasta.fai

meta4

:map

Groovy Map containing reference information, e.g. [ id:'genome' ]

dict

:file

GATK sequence dictionary

*.dict

variants

:file

Population VCF from germline sequencing, containing allele fractions. It is also used as the sites file if no separate intervals file is specified.

*.vcf.gz

variants_tbi

:file

Index file for the germline resource.

*.vcf.gz.tbi
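The fallback from intervals to variants described above can be sketched in shell as follows. This is a dry run under assumed, hypothetical file names and is not necessarily how the module's own script is written; `--input`, `--variant`, `--intervals`, `--reference` and `--output` are the long forms of the GetPileupSummaries options.

```shell
# Hypothetical inputs; leave intervals empty to trigger the fallback
input="test.bam"
fasta="genome.fasta"
variants="gnomad.common.vcf.gz"
intervals=""
prefix="test"

# Use the intervals file when one is given, otherwise reuse the variants file as sites
sites="${intervals:-$variants}"

# Assemble the command as a string and print it (nothing is executed)
cmd="gatk GetPileupSummaries --input $input --variant $variants --intervals $sites --reference $fasta --output $prefix.pileups.table"
echo "$cmd"
```

Setting `intervals="sites.interval_list"` would make `--intervals` point at that file while `--variant` still supplies the allele fractions.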

Output

name:type
description
pattern

table

meta

:map

Groovy Map containing sample information, e.g. [ id:'test' ]

*.pileups.table

:file

Table containing read counts for each site.

*.pileups.table

versions_gatk4

${task.process}

:string

The name of the process

gatk4

:string

The name of the tool

gatk --version | sed -n '/GATK.*v/s/.*v//p'

:eval

The expression to obtain the version of the tool

Topics

name:type
description
pattern

versions

${task.process}

:string

The name of the process

gatk4

:string

The name of the tool

gatk --version | sed -n '/GATK.*v/s/.*v//p'

:eval

The expression to obtain the version of the tool
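To illustrate what the version expression does, the sketch below runs the same `sed` filter over sample text instead of calling `gatk`. The sample text and its version numbers are made up for illustration and only assume that the banner line contains "GATK" followed by a "v"-prefixed version.

```shell
# Made-up text standing in for the output of `gatk --version` (assumed layout)
sample_output='The Genome Analysis Toolkit (GATK) v4.5.0.0
HTSJDK Version: 4.1.0
Picard Version: 3.1.1'

# Keep only the line matching "GATK...v" and strip everything up to the last "v"
version=$(printf '%s\n' "$sample_output" | sed -n '/GATK.*v/s/.*v//p')
echo "$version"   # prints 4.5.0.0
```

The `-n` flag with the trailing `p` means only the substituted line is printed, so the other lines are dropped.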

Tools

gatk4
Apache-2.0

Developed in the Data Sciences Platform at the Broad Institute, the toolkit offers a wide variety of tools with a primary focus on variant discovery and genotyping. Its powerful processing engine and high-performance computing features make it capable of taking on projects of any size.