Description

Create a FASTA consensus sequence with the TOPAS toolkit, with options to penalize substitutions typical of the DNA damage found in ancient DNA

Input

name:type
description
pattern

meta:map
Groovy Map containing sample information, e.g. [ id:'test', single_end:false ]

vcf:file
Gzipped VCF file generated with GATK UnifiedGenotyper containing the called SNPs
*.vcf.gz

meta2:map
Groovy Map containing sample information, e.g. [ id:'test', single_end:false ]

vcf_indels:file
Optional gzipped VCF file generated with GATK UnifiedGenotyper containing the called indels
*.vcf.gz

meta3:map
Groovy Map containing sample information, e.g. [ id:'test', single_end:false ]

reference:file
FASTA file of the reference genome
*.fasta

meta4:map
Groovy Map containing sample information, e.g. [ id:'test', single_end:false ]

fai:file
Optional index for the FASTA file of the reference genome
*.fai

vcf_output:boolean
Boolean value indicating whether a compressed VCF file with the consensus calls included as SNPs should be produced

Output

name:type
description
pattern

fasta

meta:map
Groovy Map containing sample information, e.g. [ id:'test', single_end:false ]

*.fasta.gz:file
Gzipped consensus FASTA file with bases under threshold replaced with Ns
*.fasta.gz

vcf

meta:map
Groovy Map containing sample information, e.g. [ id:'test', single_end:false ]

*.vcf.gz:file
Gzipped VCF file with updated calls for the SNPs used in the consensus generation and for bases under threshold replaced with Ns
*.vcf.gz

ccf

meta:map
Groovy Map containing sample information, e.g. [ id:'test', single_end:false ]

*.ccf:file
Statistics file containing information about the consensus calls in the FASTA file
*.ccf

log

meta:map
Groovy Map containing sample information, e.g. [ id:'test', single_end:false ]

*.log:file
Log file
*.log

versions

versions.yml:file
File containing software versions
versions.yml
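
Because the fasta output replaces bases under threshold with Ns, a quick downstream check is to measure how much of each consensus sequence was masked. The following is a minimal sketch in Python, not part of the module itself; the function name n_fraction and the file name in the usage note are hypothetical, and it assumes only that the output is a standard gzipped FASTA file.

```python
import gzip


def n_fraction(fasta_gz_path):
    """Return {sequence_name: fraction of bases that are N} for a gzipped FASTA.

    Hypothetical helper for illustration; it only assumes a standard
    FASTA layout (header lines starting with '>', sequence lines below).
    """
    counts = {}  # name -> [number of Ns, total bases]
    name = None
    with gzip.open(fasta_gz_path, "rt") as handle:
        for line in handle:
            line = line.strip()
            if not line:
                continue
            if line.startswith(">"):
                # Use the first whitespace-delimited token as the name.
                parts = line[1:].split()
                name = parts[0] if parts else ""
                counts[name] = [0, 0]
            elif name is not None:
                seq = line.upper()
                counts[name][0] += seq.count("N")
                counts[name][1] += len(seq)
    return {
        key: (n_count / total if total else 0.0)
        for key, (n_count, total) in counts.items()
    }
```

For example, calling n_fraction("sample.fasta.gz") on a consensus file (the file name here is a placeholder) returns one value between 0 and 1 per sequence; values close to 1 suggest that most positions fell under the calling threshold.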

Tools

topas
CC-BY

This toolkit allows the efficient manipulation of sequence data in various ways. It is organized into modules: the FASTA processing modules, the FASTQ processing modules, the GFF processing modules, and the VCF processing modules.