Description

This tool locates and tags duplicate reads in a BAM or SAM file, where duplicate reads are defined as originating from a single fragment of DNA.
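To make the idea of "tagging" duplicates concrete, here is a minimal Python sketch, not the tool's actual algorithm. It assumes a deliberately simplified rule: reads sharing the same chromosome, start position, and strand are treated as copies of one fragment, and all but the one with the highest summed base quality get the SAM duplicate flag bit (0x400). The `Read` class and `mark_duplicates` function are hypothetical names introduced for illustration; the real tool also accounts for details such as mate positions and clipping.

```python
from collections import defaultdict
from dataclasses import dataclass

DUPLICATE_FLAG = 0x400  # SAM FLAG bit for "PCR or optical duplicate"


@dataclass
class Read:
    """Hypothetical, stripped-down stand-in for an aligned read."""
    name: str
    chrom: str
    pos: int        # alignment start (simplification of the real 5' position logic)
    reverse: bool   # True if the read maps to the reverse strand
    quals: list     # per-base quality scores
    flag: int = 0


def mark_duplicates(reads):
    """Set the duplicate bit on all but the best-scoring read in each group."""
    groups = defaultdict(list)
    for read in reads:
        groups[(read.chrom, read.pos, read.reverse)].append(read)
    for group in groups.values():
        # Keep the read with the highest summed base quality as the representative.
        best = max(group, key=lambda r: sum(r.quals))
        for read in group:
            if read is not best:
                read.flag |= DUPLICATE_FLAG
    return reads


reads = [
    Read("r1", "chr1", 100, False, [30, 30, 30]),
    Read("r2", "chr1", 100, False, [20, 20, 20]),
    Read("r3", "chr1", 250, True, [35, 35, 35]),
]
mark_duplicates(reads)
flagged = [r.name for r in reads if r.flag & DUPLICATE_FLAG]
print(flagged)  # ['r2']
```

Note that duplicates are flagged rather than removed, which matches the "locates and tags" wording above: downstream tools can decide whether to skip flagged reads.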

Input

| name:type | description | pattern |
|---|---|---|
| meta:map | Groovy Map containing sample information e.g. [ id:'test', single_end:false ] | |
| bam:file | Sorted BAM file | *.{bam} |
| fasta:file | The reference fasta file | *.fasta |
| fasta_fai:file | Index of reference fasta file | *.fai |
| dict:file | GATK sequence dictionary | *.dict |

Output

| channel | name:type | description | pattern |
|---|---|---|---|
| output | meta:map | Groovy Map containing sample information e.g. [ id:'test', single_end:false ] | |
| output | ${prefix}:file | Marked duplicates BAM/CRAM file | *.{bam,cram} |
| bam_index | meta:map | Groovy Map containing sample information e.g. [ id:'test', single_end:false ] | |
| bam_index | ${prefix}.bai:file | Optional BAM index file | *.bai |
| metrics | meta:map | Groovy Map containing sample information e.g. [ id:'test', single_end:false ] | |
| metrics | *.metrics:file | Metrics file | *.metrics |
| versions | versions.yml:file | File containing software versions | versions.yml |
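
The metrics file listed among the outputs can be inspected programmatically. The sketch below assumes the file follows the Picard-style text layout (a `## METRICS CLASS` line, then a tab-separated header row, then data rows up to a blank line); this page does not state that layout, so treat it as an assumption. `parse_metrics` is a hypothetical helper, and the example values are invented for illustration.

```python
def parse_metrics(text):
    """Return the first metrics table as a list of dicts (column name -> string value).

    Assumes a Picard-style layout: a '## METRICS CLASS' line, followed by a
    tab-separated header row, followed by data rows until a blank or '#' line.
    """
    lines = text.splitlines()
    rows = []
    for i, line in enumerate(lines):
        if line.startswith("## METRICS CLASS"):
            header = lines[i + 1].split("\t")
            for data in lines[i + 2:]:
                if not data.strip() or data.startswith("#"):
                    break  # end of the metrics table
                rows.append(dict(zip(header, data.split("\t"))))
            break
    return rows


# Invented example content, shaped like the assumed layout.
example = "\n".join([
    "## METRICS CLASS\tpicard.sam.DuplicationMetrics",
    "LIBRARY\tREAD_PAIRS_EXAMINED\tPERCENT_DUPLICATION",
    "lib1\t1000\t0.125",
    "",
    "## HISTOGRAM\tjava.lang.Double",
])
rows = parse_metrics(example)
print(rows[0]["PERCENT_DUPLICATION"])  # 0.125
```

Values are left as strings so the caller can decide how to convert each column.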

Tools

gatk4
MIT

Developed in the Data Sciences Platform at the Broad Institute, the toolkit offers a wide variety of tools with a primary focus on variant discovery and genotyping. Its powerful processing engine and high-performance computing features make it capable of taking on projects of any size.