Description

This tool locates and tags duplicate reads in a BAM or SAM file, where duplicate reads are defined as originating from a single fragment of DNA.
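As a rough illustration of what "locating and tagging" duplicates means, the sketch below groups reads that share a chromosome, start position, and strand, keeps the read with the highest base-quality sum, and sets the SAM duplicate flag bit (0x400) on the rest. This is a deliberately simplified model and not the GATK implementation: the `Read` class, the `mark_duplicates` function, and the grouping key are hypothetical names and assumptions made for this example, and the real tool also accounts for mate positions, clipping, and optical duplicates.

```python
from collections import defaultdict
from dataclasses import dataclass

DUPLICATE_FLAG = 0x400  # SAM flag bit: read is a PCR or optical duplicate
REVERSE_FLAG = 0x10     # SAM flag bit: read is mapped to the reverse strand


@dataclass
class Read:
    """Hypothetical, minimal stand-in for an aligned read."""
    name: str
    chrom: str
    pos: int               # alignment start (simplified; no clipping handled)
    flag: int              # SAM bitwise flag
    base_quality_sum: int  # score used to pick the representative read


def mark_duplicates(reads):
    """Set DUPLICATE_FLAG on all but the best-scoring read in each group."""
    groups = defaultdict(list)
    for read in reads:
        # Assumption: reads with the same chromosome, start, and strand
        # are treated as coming from the same original DNA fragment.
        key = (read.chrom, read.pos, bool(read.flag & REVERSE_FLAG))
        groups[key].append(read)

    for group in groups.values():
        best = max(group, key=lambda r: r.base_quality_sum)
        for read in group:
            if read is not best:
                read.flag |= DUPLICATE_FLAG
    return reads


reads = [
    Read("a", "chr1", 100, 0, 300),
    Read("b", "chr1", 100, 0, 250),   # same position and strand as "a"
    Read("c", "chr1", 200, 0, 100),   # different position
    Read("d", "chr1", 100, 16, 100),  # same position, reverse strand
]
mark_duplicates(reads)
flagged = [r.name for r in reads if r.flag & DUPLICATE_FLAG]
```

Note that duplicates are tagged rather than removed, so downstream tools can decide whether to ignore reads carrying the flag.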

Input

Name (Type)        Description                                                                  Pattern
meta (map)         Groovy Map containing sample information, e.g. [ id:'test', single_end ]
bam (file)         Sorted BAM file                                                              *.{bam}
fasta (file)       Fasta file                                                                   *.{fasta}
fasta_fai (file)   Fasta index file                                                             *.{fai}

Output

Name (Type)        Description                                                                  Pattern
meta (map)         Groovy Map containing sample information, e.g. [ id:'test', single_end ]
versions (file)    File containing software versions                                            versions.yml
bam (file)         Marked duplicates BAM file                                                   *.{bam}
cram (file)        Marked duplicates CRAM file                                                  *.{cram}
bai (file)         BAM index file                                                               *.{bam.bai}
crai (file)        CRAM index file                                                              *.{cram.crai}
metrics (file)     Duplicate metrics file generated by GATK                                     *.{metrics.txt}

Tools

gatk4
MIT

Developed in the Data Sciences Platform at the Broad Institute, the toolkit offers a wide variety of tools with a primary focus on variant discovery and genotyping. Its powerful processing engine and high-performance computing features make it capable of taking on projects of any size.