Description

This workflow uses the FGBIO suite to identify and remove UMI tags from FASTQ reads, convert the reads to an unmapped BAM file, map them to the reference genome, and finally use the mapping information to group reads by UMI and generate a consensus read for each group.

Input

Name (Type)
Description
Pattern

meta (map)

Groovy Map containing sample information
e.g. [ id:'test' ]

reads (list)

List of UMI-tagged FASTQ reads

[ *.{fastq.gz,fq.gz} ]

fasta (file)

The reference FASTA file

*.fasta

read_structure (string)

A read structure must always be provided for each of the FASTQ files.
For single-end data, the string contains only one structure (e.g. "2M11S+T"); for paired-end data, the string
contains two structures separated by a blank space (e.g. "2M11S+T 2M11S+T").
If a read does not contain any UMI, its structure is +T (i.e. only template of any length).
See https://github.com/fulcrumgenomics/fgbio/wiki/Read-Structures

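groupreadsbyumi_strategy (string)

The UMI assignment strategy used when grouping reads by UMI. One of: Identity, Edit, Adjacency, Paired.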
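
The read_structure input described above can be sanity-checked before it is passed to the workflow. The following is a minimal sketch, not part of the subworkflow or of fgbio: the function names (`parse_structure`, `parse_read_structure`, `has_umi`) are hypothetical, and it assumes only the four operators T (template), B (sample barcode), M (molecular barcode, i.e. UMI) and S (skip), with `+` allowed as the length of the final segment only.

```python
import re

# One segment is "<length><operator>", where length is a number or "+".
# Assumption: only the operators T, B, M and S are handled in this sketch.
SEGMENT_RE = re.compile(r"(\d+|\+)([TBMS])")


def parse_structure(structure):
    """Split one read structure such as '2M11S+T' into (length, operator) pairs.

    A length of None stands for '+', meaning "all remaining bases".
    """
    segments = []
    pos = 0
    for match in SEGMENT_RE.finditer(structure):
        if match.start() != pos:
            raise ValueError(f"unexpected characters in {structure!r}")
        length, operator = match.groups()
        segments.append((None if length == "+" else int(length), operator))
        pos = match.end()
    if not segments or pos != len(structure):
        raise ValueError(f"invalid read structure {structure!r}")
    # '+' is only meaningful on the last segment.
    if any(length is None for length, _ in segments[:-1]):
        raise ValueError(f"'+' must be the last segment in {structure!r}")
    return segments


def parse_read_structure(read_structure):
    """Parse the whitespace-separated string: one structure per FASTQ file."""
    return [parse_structure(s) for s in read_structure.split()]


def has_umi(segments):
    """True if the structure declares at least one molecular barcode segment."""
    return any(operator == "M" for _, operator in segments)


paired = parse_read_structure("2M11S+T 2M11S+T")
no_umi = parse_structure("+T")
```

Here `paired` holds two identical structures (a 2-base UMI, 11 skipped bases, then template), while `has_umi(no_umi)` is false because `+T` declares template only.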

Output

Name (Type)
Description
Pattern

versions (file)

File containing software versions

versions.yml

ubam (file)

Unmapped BAM file

*.bam

groupbam (file)

Mapped BAM file in which reads are grouped by UMI tag

*.bam

consensusbam (file)

Mapped BAM file containing consensus reads, each built from the
reads belonging to the same UMI group

*.bam
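
To illustrate the idea behind the groupbam and consensusbam outputs, the toy sketch below groups reads that share a mapping position and an identical UMI, then takes a per-base majority vote within each group. It is a deliberately simplified illustration under stated assumptions: the read tuples are invented example data, the function names are hypothetical, and the majority vote stands in for whatever consensus model fgbio actually applies (it ignores base qualities, UMI mismatches and read pairing).

```python
from collections import Counter, defaultdict


def group_by_umi(reads):
    """Group sequences by (mapping position, UMI); identical UMIs only."""
    groups = defaultdict(list)
    for position, umi, sequence in reads:
        groups[(position, umi)].append(sequence)
    return dict(groups)


def majority_consensus(sequences):
    """Per-column majority vote over equal-length sequences."""
    return "".join(
        Counter(column).most_common(1)[0][0] for column in zip(*sequences)
    )


# Invented example data: (mapping position, UMI, read sequence).
reads = [
    (100, "AC", "GATTACA"),
    (100, "AC", "GATTACA"),
    (100, "AC", "GATCACA"),  # one read carries a sequencing error
    (100, "GT", "GATTACA"),
]

groups = group_by_umi(reads)
consensus = {key: majority_consensus(seqs) for key, seqs in groups.items()}
```

In this example the three reads tagged "AC" collapse into a single consensus read in which the isolated error is outvoted, while the read tagged "GT" forms its own group.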