This workflow uses the FGBIO suite to identify and remove UMI tags from FASTQ reads, convert the reads to an unmapped BAM file, map them to the reference genome, and finally use the mapping information to group reads by UMI and generate a consensus read for each group.


Name (Type)

meta (map)

Groovy Map containing sample information
e.g. [ id:'test' ]

reads (list)

List of UMI-tagged reads

[ *.{fastq.gz/fq.gz} ]

fasta (file)

The reference FASTA file


read_structure (string)

A read structure must be provided for each FASTQ file.
For single-end data, the string contains a single structure (e.g. “2M11S+T”); for paired-end data, it
contains two structures separated by a single space (e.g. “2M11S+T 2M11S+T”).
If a read does not contain a UMI, its structure is +T (i.e. template only, of any length).

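groupreadsbyumi_strategy (string)

UMI assignment strategy used when grouping reads; fgbio GroupReadsByUmi supports identity, edit, adjacency and paired.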
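The read_structure format described above can be illustrated with a small parser. This is a minimal sketch, not part of the workflow: the function names `parse_read_structure` and `split_read_structures` are hypothetical, and the operator set (T template, B sample barcode, M molecular barcode/UMI, S skip, C cell barcode) is assumed from the fgbio read structure convention. In “2M11S+T”, the first 2 bases are the UMI, the next 11 are skipped, and the remainder is template.

```python
import re

# One segment = a positive length (or '+' for "all remaining bases") plus an operator.
# Operators are assumed from the fgbio convention: T, B, M, S, C.
SEGMENT = re.compile(r"(\d+|\+)([TBMSC])")


def parse_read_structure(structure):
    """Split one structure such as '2M11S+T' into (length, operator) pairs.

    A length of None stands for '+', i.e. the rest of the read.
    """
    segments = []
    pos = 0
    for match in SEGMENT.finditer(structure):
        if match.start() != pos:
            raise ValueError(f"unexpected characters in {structure!r} at offset {pos}")
        length, op = match.groups()
        if length != "+" and int(length) == 0:
            raise ValueError(f"zero-length segment in {structure!r}")
        segments.append((None if length == "+" else int(length), op))
        pos = match.end()
    if not segments or pos != len(structure):
        raise ValueError(f"invalid read structure {structure!r}")
    # '+' is only meaningful on the final segment.
    if any(length is None for length, _ in segments[:-1]):
        raise ValueError(f"'+' is only allowed on the last segment of {structure!r}")
    return segments


def split_read_structures(read_structure):
    """Parse the whitespace-separated string: one structure per FASTQ file."""
    return [parse_read_structure(s) for s in read_structure.split()]


if __name__ == "__main__":
    print(split_read_structures("2M11S+T 2M11S+T"))
```

A check of this kind can catch a malformed string, such as a paired-end sample given only one structure, before any reads are processed.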


Name (Type)

versions (file)

File containing software versions


ubam (file)

Unmapped BAM file


groupbam (file)

Mapped BAM file in which reads are grouped by UMI tag


consensusbam (file)

Mapped BAM file containing consensus reads, each built from the reads
belonging to the same UMI group
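
The idea behind groupbam and consensusbam can be sketched with a toy example. This is a deliberately simplified illustration and not what FGBIO does internally: it groups reads by exact UMI match only (ignoring mapping position) and calls the consensus by a per-position majority vote, whereas the real tools also use alignment coordinates and base qualities. The function names and the `min_reads` parameter are hypothetical.

```python
from collections import Counter, defaultdict


def group_by_umi(reads):
    """Group (umi, sequence) pairs by exact UMI match."""
    groups = defaultdict(list)
    for umi, seq in reads:
        groups[umi].append(seq)
    return dict(groups)


def majority_consensus(seqs, min_reads=1):
    """Per-position majority vote over equal-length sequences.

    Returns None when the group has fewer than min_reads reads.
    Positions without a strict majority are reported as 'N'.
    """
    if len(seqs) < min_reads:
        return None
    consensus = []
    for column in zip(*seqs):
        base, count = Counter(column).most_common(1)[0]
        consensus.append(base if count * 2 > len(column) else "N")
    return "".join(consensus)


if __name__ == "__main__":
    # Toy input: three reads share UMI 'AC' (one with a sequencing error),
    # one read carries UMI 'GG'.
    reads = [("AC", "ACGT"), ("AC", "ACGT"), ("AC", "ACTT"), ("GG", "TTTT")]
    for umi, seqs in group_by_umi(reads).items():
        print(umi, majority_consensus(seqs))
```

In this toy input the single mismatching base in the 'AC' group is outvoted, which is the error-correcting effect that UMI consensus calling aims for.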