Description

CGC annotation module for the dbcan pipeline. It uses the dbCAN annotation tool to annotate carbohydrate-active enzymes (CAZymes) in protein sequences and to identify CAZyme gene clusters (CGCs) from the accompanying GFF file.

Input

name:type
description
pattern

meta:map

Groovy Map containing sample information e.g. [ id:'sample1' ]

input_raw_data:file

FASTA file of protein sequences.

*.{fasta,fa,faa}

meta2:map

Groovy Map containing sample information e.g. [ id:'sample1' ]

input_gff:file

GFF file describing the genes that encode the protein sequences in the FASTA file.

gff_type:string

Source format of the GFF file. Must be one of NCBI_prok, JGI, NCBI_euk, or prodigal. The value determines how the GFF file is parsed.

dbcan_db:directory

Path to the dbCAN database directory.
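
The input constraints above can be checked before the module is launched. The following is a minimal sketch, not part of the module itself: the helper name `check_inputs` is hypothetical, and it assumes that file extensions and `gff_type` values are matched case-sensitively, exactly as they are written in this section.

```python
from pathlib import Path

# Values copied from the gff_type description and the input_raw_data pattern above.
ALLOWED_GFF_TYPES = {"NCBI_prok", "JGI", "NCBI_euk", "prodigal"}
FASTA_SUFFIXES = {".fasta", ".fa", ".faa"}


def check_inputs(fasta: str, gff_type: str) -> list[str]:
    """Hypothetical pre-flight check: return a list of problems (empty if none)."""
    problems = []
    # Assumption: only the final extension is compared, case-sensitively.
    if Path(fasta).suffix not in FASTA_SUFFIXES:
        problems.append(f"unexpected FASTA extension: {fasta}")
    if gff_type not in ALLOWED_GFF_TYPES:
        problems.append(f"unknown gff_type: {gff_type}")
    return problems
```

For example, `check_inputs("sample1.faa", "prodigal")` returns an empty list, while a `.txt` file combined with an unlisted `gff_type` yields two messages. Because only the listed extensions are accepted, a compressed file such as `sample1.faa.gz` would be reported as a problem.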

Output

name:type
description
pattern

cazyme_annotation

meta:map

Groovy Map containing sample information e.g. [ id:'sample1']

${prefix}_overview.tsv:file

TSV file containing the results of dbCAN CAZyme annotation.

dbcanhmm_results

meta:map

Groovy Map containing sample information e.g. [ id:'sample1']

${prefix}_dbCAN_hmm_results.tsv:file

TSV file containing the detailed dbCAN HMM results for CAZyme annotation.

dbcansub_results

meta:map

Groovy Map containing sample information e.g. [ id:'sample1']

${prefix}_dbCANsub_hmm_results.tsv:file

TSV file containing the detailed dbCAN subfamily results for CAZyme annotation.

dbcandiamond_results

meta:map

Groovy Map containing sample information e.g. [ id:'sample1']

${prefix}_diamond.out:file

TSV file containing the detailed dbCAN DIAMOND results for CAZyme annotation.

cgc_gff

meta:map

Groovy Map containing sample information e.g. [ id:'sample1']

${prefix}_cgc.gff:file

GFF file containing the CAZyme gene clusters (CGC) identified by dbCAN. This file is generated from the dbCAN annotation and contains the locations of CAZyme gene clusters in the genome.

cgc_standard_out

meta:map

Groovy Map containing sample information e.g. [ id:'sample1']

${prefix}_cgc_standard_out.tsv:file

Standard output file from dbCAN for CAZyme gene clusters (CGC) in a tabular format. This file summarizes the CAZyme gene clusters identified in the genome.

diamond_out_tc

meta:map

Groovy Map containing sample information e.g. [ id:'sample1']

${prefix}_diamond.out.tc:file

TSV file containing the DIAMOND output for transporter annotation.

tf_hmm_results

meta:map

Groovy Map containing sample information e.g. [ id:'sample1']

${prefix}_TF_hmm_results.tsv:file

TSV file containing the results of transcription factor (TF) annotation.

stp_hmm_results

meta:map

Groovy Map containing sample information e.g. [ id:'sample1']

${prefix}_STP_hmm_results.tsv:file

TSV file containing the results of signal transduction protein (STP) annotation.

versions

versions.yml:file

File containing software versions

versions.yml
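
Every per-sample output listed above follows the same `${prefix}` naming scheme, so the expected file names can be derived from the prefix alone. The sketch below is illustrative only: the function name `expected_outputs` is hypothetical, and how `prefix` is derived (for example from the sample id in `meta`) is an assumption that depends on the pipeline configuration.

```python
# Output channel name -> file name suffix, copied from the Output section above.
OUTPUT_SUFFIXES = {
    "cazyme_annotation": "_overview.tsv",
    "dbcanhmm_results": "_dbCAN_hmm_results.tsv",
    "dbcansub_results": "_dbCANsub_hmm_results.tsv",
    "dbcandiamond_results": "_diamond.out",
    "cgc_gff": "_cgc.gff",
    "cgc_standard_out": "_cgc_standard_out.tsv",
    "diamond_out_tc": "_diamond.out.tc",
    "tf_hmm_results": "_TF_hmm_results.tsv",
    "stp_hmm_results": "_STP_hmm_results.tsv",
}


def expected_outputs(prefix: str) -> dict[str, str]:
    """Hypothetical helper: map each output channel to its expected file name."""
    return {
        channel: f"{prefix}{suffix}"
        for channel, suffix in OUTPUT_SUFFIXES.items()
    }
```

Calling `expected_outputs("sample1")` gives, among others, `sample1_cgc.gff` for the `cgc_gff` channel. The `versions.yml` file is not included because its name does not carry the prefix.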

Tools

dbcan
GPL v3-or-later

Standalone version of dbCAN annotation tool for automated CAZyme annotation.