Description

This workflow uses the fgbio suite to identify and remove UMI tags from FASTQ reads, convert the reads to an unmapped BAM file, map them to the reference genome, and finally use the mapping information to group reads by UMI and generate a consensus read for each group.

Input

name
description
pattern

meta

Groovy Map containing sample information
e.g. [ id:'test' ]

reads

List of UMI-tagged reads

[ *.{fastq.gz,fq.gz} ]

fasta

The reference fasta file

*.fasta

read_structure

A read structure must be provided for each of the FASTQ files.
For single-end data, the string contains a single structure (e.g. "2M11S+T"); for paired-end data, the string
contains two structures separated by a blank space (e.g. "2M11S+T 2M11S+T").
If a read does not contain any UMI, its structure is +T (i.e. template only, of any length).
https://github.com/fulcrumgenomics/fgbio/wiki/Read-Structures

groupreadsbyumi_strategy
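
The read_structure rules above can be checked before launching the workflow. The following is a minimal sketch in Python, not part of the workflow or of fgbio: the function names parse_read_structure and check_read_structures are hypothetical, and the parser assumes only the four operators T (template), B (sample barcode), M (UMI) and S (skip), with "+" allowed only on the last segment. The linked wiki page remains the authoritative description.

```python
import re

# One segment = a length (positive integer, or "+" for "all remaining bases")
# followed by an operator. Assumed operators: T, B, M, S.
_SEGMENT = re.compile(r"(\d+|\+)([TBMS])")


def parse_read_structure(structure):
    """Split one read structure such as '2M11S+T' into (length, operator) pairs.

    A length of None stands for '+' (any number of remaining bases).
    """
    segments = []
    pos = 0
    while pos < len(structure):
        match = _SEGMENT.match(structure, pos)
        if match is None:
            raise ValueError(f"invalid read structure {structure!r} at offset {pos}")
        length, operator = match.groups()
        segments.append((None if length == "+" else int(length), operator))
        pos = match.end()
    if not segments:
        raise ValueError("empty read structure")
    # '+' is only meaningful on the final segment.
    if any(length is None for length, _ in segments[:-1]):
        raise ValueError(f"'+' must be the last segment in {structure!r}")
    # Fixed-length segments must cover at least one base.
    if any(length == 0 for length, _ in segments):
        raise ValueError(f"zero-length segment in {structure!r}")
    return segments


def check_read_structures(read_structure, n_fastq):
    """Check that a space-separated string holds one valid structure per FASTQ file."""
    structures = read_structure.split()
    if len(structures) != n_fastq:
        raise ValueError(
            f"expected {n_fastq} read structure(s), got {len(structures)}"
        )
    return [parse_read_structure(s) for s in structures]
```

For a paired-end sample, check_read_structures("2M11S+T 2M11S+T", 2) returns two parsed structures, while passing the same string with n_fastq=1 raises a ValueError.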

Output

name
description
pattern

versions

File containing software versions

versions.yml

ubam

Unmapped BAM file

*.bam

groupbam

Mapped BAM file, where reads are grouped by UMI tag

*.bam

consensusbam

Mapped BAM file containing consensus reads, each built from the
reads belonging to the same UMI group

*.bam