Description

CGC annotation module for the dbCAN pipeline. This module annotates carbohydrate-active enzymes (CAZymes) and identifies CAZyme gene clusters (CGCs) from genomic data using the dbCAN annotation tool.

Input

name:type
description
pattern

meta:map

Groovy Map containing sample information e.g. [ id:'sample1' ]

input_raw_data:file

FASTA file for protein sequences.

*.{fasta,fa,faa}

meta2:map

Groovy Map containing sample information e.g. [ id:'sample1' ]

input_gff:file

GFF annotation file giving the genomic locations of the protein sequences.

gff_type:string

Type of GFF file. Options are NCBI_prok, JGI, NCBI_euk, and prodigal. This is used to parse the GFF file correctly.

dbcan_db:directory

Path to the dbCAN database directory.
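
The input constraints listed above can be checked before the module is run. The following is a minimal sketch, not part of the module itself: the helper name check_inputs is hypothetical, and the allowed values are copied from the gff_type description and the input_raw_data pattern in this section.

```python
from pathlib import Path

# Values copied from the gff_type description in this section.
GFF_TYPES = {"NCBI_prok", "JGI", "NCBI_euk", "prodigal"}
# Extensions copied from the input_raw_data pattern *.{fasta,fa,faa}.
FASTA_SUFFIXES = {".fasta", ".fa", ".faa"}


def check_inputs(fasta, gff_type):
    """Hypothetical helper: return a list of problems (empty if none found)."""
    problems = []
    if Path(fasta).suffix not in FASTA_SUFFIXES:
        problems.append(f"unexpected FASTA extension: {fasta}")
    if gff_type not in GFF_TYPES:
        problems.append(f"unknown gff_type: {gff_type}")
    return problems


if __name__ == "__main__":
    print(check_inputs("sample1.faa", "prodigal"))
    print(check_inputs("sample1.txt", "ncbi"))
```

The check is case-sensitive on purpose, since the option names are listed with mixed case (NCBI_prok, prodigal).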

Output

name:type
description
pattern

cazyme_annotation

meta:map

Groovy Map containing sample information e.g. [ id:'sample1' ]

${prefix}_overview.tsv:file

TSV file containing the results of dbCAN CAZyme annotation.

dbcanhmm_results

meta:map

Groovy Map containing sample information e.g. [ id:'sample1' ]

${prefix}_dbCAN_hmm_results.tsv:file

TSV file containing the detailed dbCAN HMM results for CAZyme annotation.

dbcansub_results

meta:map

Groovy Map containing sample information e.g. [ id:'sample1' ]

${prefix}_dbCANsub_hmm_results.tsv:file

TSV file containing the detailed dbCAN subfamily results for CAZyme annotation.

dbcandiamond_results

meta:map

Groovy Map containing sample information e.g. [ id:'sample1' ]

${prefix}_diamond.out:file

TSV file containing the detailed dbCAN DIAMOND results for CAZyme annotation.

cgc_gff

meta:map

Groovy Map containing sample information e.g. [ id:'sample1' ]

${prefix}_cgc.gff:file

GFF file containing the CAZyme gene clusters (CGC) identified by dbCAN. This file is generated from the dbCAN annotation and contains the locations of CAZyme gene clusters in the genome.

cgc_standard_out

meta:map

Groovy Map containing sample information e.g. [ id:'sample1' ]

${prefix}_cgc_standard_out.tsv:file

Standard output file from dbCAN for CAZyme gene clusters (CGC) in a tabular format. This file summarizes the CAZyme gene clusters identified in the genome.

diamond_out_tc

meta:map

Groovy Map containing sample information e.g. [ id:'sample1' ]

${prefix}_diamond.out.tc:file

TSV file containing the DIAMOND output for transporter annotation.

tf_hmm_results

meta:map

Groovy Map containing sample information e.g. [ id:'sample1' ]

${prefix}_TF_hmm_results.tsv:file

TSV file containing the results of transcription factor (TF) annotation.

stp_hmm_results

meta:map

Groovy Map containing sample information e.g. [ id:'sample1' ]

${prefix}_STP_hmm_results.tsv:file

TSV file containing the results of signal transduction protein (STP) annotation.

versions

versions.yml:file

File containing software versions

versions.yml
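
To make the naming scheme of the output channels concrete, the sketch below maps each channel listed above to the file name it is documented to emit for a given prefix. The function name expected_outputs is hypothetical, and using the sample id as the prefix is an assumption based on common nf-core convention, not something this page states.

```python
# Channel names and file name templates copied from the Output section.
OUTPUT_TEMPLATES = {
    "cazyme_annotation": "{prefix}_overview.tsv",
    "dbcanhmm_results": "{prefix}_dbCAN_hmm_results.tsv",
    "dbcansub_results": "{prefix}_dbCANsub_hmm_results.tsv",
    "dbcandiamond_results": "{prefix}_diamond.out",
    "cgc_gff": "{prefix}_cgc.gff",
    "cgc_standard_out": "{prefix}_cgc_standard_out.tsv",
    "diamond_out_tc": "{prefix}_diamond.out.tc",
    "tf_hmm_results": "{prefix}_TF_hmm_results.tsv",
    "stp_hmm_results": "{prefix}_STP_hmm_results.tsv",
}


def expected_outputs(prefix):
    """Hypothetical helper: map each output channel to its expected file name."""
    return {
        channel: template.format(prefix=prefix)
        for channel, template in OUTPUT_TEMPLATES.items()
    }


if __name__ == "__main__":
    # Assumption: the prefix equals the sample id from the meta map.
    for channel, filename in expected_outputs("sample1").items():
        print(f"{channel}\t{filename}")
```

A mapping like this can be useful for checking that a finished run produced every per-sample file; versions.yml is left out because its name does not depend on the prefix.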

Tools

dbcan
GPL v3-or-later

Standalone version of dbCAN annotation tool for automated CAZyme annotation.