Description

This tool locates and tags duplicate reads in a BAM or SAM file, where duplicate reads are defined as originating from a single fragment of DNA.

Input

name:type
description
pattern

meta:map
Groovy Map containing sample information e.g. [ id:'test', single_end:false ]

bam:file
Sorted BAM file
*.{bam}

fasta:file
The reference fasta file
*.fasta

fasta_fai:file
Index of reference fasta file
*.fai

dict:file
GATK sequence dictionary
*.dict
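
As a rough sketch of how the inputs above could map onto a direct command-line call, the snippet below only assembles and prints a `gatk MarkDuplicates` command line; it does not run it. The helper name `mark_duplicates_cmd` and all file names are hypothetical, and the arguments the module actually passes may differ. `--INPUT`, `--OUTPUT`, `--METRICS_FILE` and `--REFERENCE_SEQUENCE` are standard MarkDuplicates arguments.

```shell
# Hypothetical helper: print a MarkDuplicates command line for the given inputs.
#   $1 = sorted BAM, $2 = reference FASTA, $3 = output prefix
# The .fai index and .dict dictionary are assumed to sit next to the FASTA.
mark_duplicates_cmd() {
    printf 'gatk MarkDuplicates --INPUT %s --OUTPUT %s.bam --METRICS_FILE %s.metrics --REFERENCE_SEQUENCE %s\n' \
        "$1" "$3" "$3" "$2"
}

# Placeholder file names, used for illustration only.
cmd=$(mark_duplicates_cmd sample.sorted.bam genome.fasta sample.md)
echo "$cmd"
```

Printing the command rather than executing it keeps the sketch safe to run on a machine without GATK installed.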

Output

name:type
description
pattern

output

meta:map
Groovy Map containing sample information e.g. [ id:'test', single_end:false ]

${prefix}:file
Marked duplicates BAM/CRAM file
*.{bam,cram}

bam_index

meta:map
Groovy Map containing sample information e.g. [ id:'test', single_end:false ]

${prefix}.bai:file
Optional BAM index file
*.bai

metrics

meta:map
Groovy Map containing sample information e.g. [ id:'test', single_end:false ]

*.metrics:file
Metrics file
*.metrics

versions_gatk4

${task.process}:string
The name of the process

gatk4:string
The name of the tool

gatk --version | sed -n '/GATK.*v/s/.*v//p':eval
The expression to obtain the version of the tool
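
To illustrate what the version expression above does, the following sketch applies the same `sed` command to a sample string. The sample text is an assumption about what `gatk --version` prints, and the version numbers in it are placeholders; real output varies by release.

```shell
# Assumed sample of `gatk --version` output (placeholder version numbers).
sample_output='The Genome Analysis Toolkit (GATK) v4.5.0.0
HTSJDK Version: 4.1.0
Picard Version: 3.1.1'

# -n suppresses default printing. On lines matching /GATK.*v/, the greedy
# substitution s/.*v// removes everything up to the last "v", and the p flag
# prints what remains. Lines without "GATK" are skipped entirely.
version=$(printf '%s\n' "$sample_output" | sed -n '/GATK.*v/s/.*v//p')
echo "$version"   # prints 4.5.0.0
```

Only the GATK line is selected, so the HTSJDK and Picard version lines do not leak into the reported value.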

Topics

name:type
description
pattern

versions

${task.process}:string
The name of the process

gatk4:string
The name of the tool

gatk --version | sed -n '/GATK.*v/s/.*v//p':eval
The expression to obtain the version of the tool

Tools

gatk4
MIT

Developed in the Data Sciences Platform at the Broad Institute, the toolkit offers a wide variety of tools with a primary focus on variant discovery and genotyping. Its powerful processing engine and high-performance computing features make it capable of taking on projects of any size.