modules/rundbcan_easycgc

CGC annotation module for the dbCAN pipeline. It annotates carbohydrate-active enzymes (CAZymes) in genomic data with the dbCAN annotation tool and identifies CAZyme gene clusters (CGCs).

dbCAN, download, CAZyme, CAZyme gene Cluster, genomes

https://github.com/nf-core/modules/[...]/modules/nf-core/rundbcan/easycgc

Input

| name | type | description | pattern |
| --- | --- | --- | --- |
| `meta` | map | Groovy Map containing sample information, e.g. `[ id:'sample1' ]` | |
| `input_raw_data` | file | FASTA file of protein sequences. | `*.{fasta,fa,faa}` |
| `meta2` | map | Groovy Map containing sample information, e.g. `[ id:'sample1' ]` | |
| `input_gff` | file | GFF annotation file corresponding to the protein sequences. | |
| `gff_type` | string | Type of GFF file, used to parse the GFF file correctly. Options are `NCBI_prok`, `JGI`, `NCBI_euk`, and `prodigal`. | |
| `dbcan_db` | directory | Path to the dbCAN database directory. | |
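
The documented input constraints (the FASTA file pattern and the `gff_type` options) can be checked before a run is launched. The following is a minimal sketch, not part of the module: the helper name `check_inputs` is hypothetical, and treating `gff_type` as case-sensitive is an assumption.

```python
from pathlib import Path

# Accepted gff_type values, copied from the input description.
GFF_TYPES = {"NCBI_prok", "JGI", "NCBI_euk", "prodigal"}
# Extensions from the documented pattern *.{fasta,fa,faa}.
FASTA_SUFFIXES = {".fasta", ".fa", ".faa"}


def check_inputs(fasta: str, gff_type: str) -> list[str]:
    """Return a list of problems; an empty list means the inputs look plausible.

    Hypothetical helper: it only checks names and option values,
    not file contents. Case-sensitive matching is an assumption.
    """
    problems = []
    if Path(fasta).suffix not in FASTA_SUFFIXES:
        problems.append(f"{fasta}: expected one of {sorted(FASTA_SUFFIXES)}")
    if gff_type not in GFF_TYPES:
        problems.append(f"gff_type {gff_type!r}: expected one of {sorted(GFF_TYPES)}")
    return problems
```

For example, `check_inputs("sample1.faa", "prodigal")` returns an empty list, while a `.txt` file name or an unlisted `gff_type` each add one message.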

Output

| channel | name | type | description | pattern |
| --- | --- | --- | --- | --- |
| `cazyme_annotation` | `meta` | map | Groovy Map containing sample information, e.g. `[ id:'sample1' ]` | |
| | `${prefix}_overview.tsv` | file | TSV file containing the dbCAN CAZyme annotation results. | |
| `dbcanhmm_results` | `meta` | map | Groovy Map containing sample information, e.g. `[ id:'sample1' ]` | |
| | `${prefix}_dbCAN_hmm_results.tsv` | file | TSV file containing the detailed dbCAN HMM results for CAZyme annotation. | |
| `dbcansub_results` | `meta` | map | Groovy Map containing sample information, e.g. `[ id:'sample1' ]` | |
| | `${prefix}_dbCANsub_hmm_results.tsv` | file | TSV file containing the detailed dbCAN subfamily results for CAZyme annotation. | |
| `dbcandiamond_results` | `meta` | map | Groovy Map containing sample information, e.g. `[ id:'sample1' ]` | |
| | `${prefix}_diamond.out` | file | TSV file containing the detailed dbCAN DIAMOND results for CAZyme annotation. | |
| `cgc_gff` | `meta` | map | Groovy Map containing sample information, e.g. `[ id:'sample1' ]` | |
| | `${prefix}_cgc.gff` | file | GFF file containing the CAZyme gene clusters (CGCs) identified by dbCAN. It is generated from the dbCAN annotation and gives the locations of the clusters in the genome. | |
| `cgc_standard_out` | `meta` | map | Groovy Map containing sample information, e.g. `[ id:'sample1' ]` | |
| | `${prefix}_cgc_standard_out.tsv` | file | Standard dbCAN output for CAZyme gene clusters (CGCs) in tabular format, summarizing the clusters identified in the genome. | |
| `diamond_out_tc` | `meta` | map | Groovy Map containing sample information, e.g. `[ id:'sample1' ]` | |
| | `${prefix}_diamond.out.tc` | file | TSV file containing the DIAMOND output for transporter annotation. | |
| `tf_hmm_results` | `meta` | map | Groovy Map containing sample information, e.g. `[ id:'sample1' ]` | |
| | `${prefix}_TF_hmm_results.tsv` | file | TSV file containing the transcription factor (TF) annotation results. | |
| `stp_hmm_results` | `meta` | map | Groovy Map containing sample information, e.g. `[ id:'sample1' ]` | |
| | `${prefix}_STP_hmm_results.tsv` | file | TSV file containing the signal transduction protein (STP) annotation results. | |
| `versions` | `versions.yml` | file | File containing software versions. | `versions.yml` |
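
Because every per-sample output is named from `${prefix}` plus a fixed suffix, the expected file names for a sample can be derived up front, for instance to check that a run produced everything. This is a minimal sketch under stated assumptions: `expected_outputs` is a hypothetical helper, and the prefix is taken as given (nf-core modules commonly default it to the sample `id`, but this page does not say how this module sets it).

```python
# Output channel -> suffix appended to the sample prefix,
# copied from the output listing.
OUTPUT_SUFFIXES = {
    "cazyme_annotation": "_overview.tsv",
    "dbcanhmm_results": "_dbCAN_hmm_results.tsv",
    "dbcansub_results": "_dbCANsub_hmm_results.tsv",
    "dbcandiamond_results": "_diamond.out",
    "cgc_gff": "_cgc.gff",
    "cgc_standard_out": "_cgc_standard_out.tsv",
    "diamond_out_tc": "_diamond.out.tc",
    "tf_hmm_results": "_TF_hmm_results.tsv",
    "stp_hmm_results": "_STP_hmm_results.tsv",
}


def expected_outputs(prefix: str) -> dict[str, str]:
    """Map each output channel to the file name expected for `prefix`.

    Hypothetical helper; `prefix` is whatever value the module
    substitutes for ${prefix} (assumed here, not documented).
    """
    files = {channel: f"{prefix}{suffix}" for channel, suffix in OUTPUT_SUFFIXES.items()}
    # versions.yml is the one output that does not carry the prefix.
    files["versions"] = "versions.yml"
    return files
```

With a prefix of `sample1`, the `cgc_gff` entry resolves to `sample1_cgc.gff`, and the mapping holds ten entries in total.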

Tools

dbcan
License: GPL v3-or-later

Standalone version of the dbCAN annotation tool for automated CAZyme annotation.

Homepage: bcb.unl.edu/dbCAN2
Documentation: run-dbcan.readthedocs.io/en/latest
Source code: https://github.com/bcb-unl/run_dbcan
DOI: 10.1093/nar/gkad328