Description

GECCO is a fast and scalable method for identifying putative novel Biosynthetic Gene Clusters (BGCs) in genomic and metagenomic data using Conditional Random Fields (CRFs).

Input

name:type
description
pattern

meta

:map

Groovy Map containing sample information, e.g. [ id:'test', single_end:false ]

input

:file

A genomic file containing one or more sequences. Any format supported by Biopython is accepted (FASTA, GenBank, etc.).

*

hmm

:file

Alternative HMM file(s) to use in HMMER format

*.hmm

model_dir

:directory

Path to an alternative CRF (Conditional Random Fields) model to use

Output

name:type
description
pattern

genes

meta

:map

Groovy Map containing sample information, e.g. [ id:'test', single_end:false ]

*.genes.tsv

:file

TSV file containing detected/predicted genes with BGC probability scores. Will not be generated if no hits are found.

*.genes.tsv

features

meta

:map

Groovy Map containing sample information, e.g. [ id:'test', single_end:false ]

*.features.tsv

:file

TSV file containing identified domains

*.features.tsv

clusters

meta

:map

Groovy Map containing sample information, e.g. [ id:'test', single_end:false ]

*.clusters.tsv

:file

TSV file containing coordinates of predicted clusters and BGC types. Will not be generated if no hits are found.

*.clusters.tsv

gbk

meta

:map

Groovy Map containing sample information, e.g. [ id:'test', single_end:false ]

*_cluster_*.gbk

:file

Per-cluster GenBank file containing the cluster sequence with annotations. Will not be generated if no hits are found.

*.gbk

json

meta

:map

Groovy Map containing sample information, e.g. [ id:'test', single_end:false ]

*.json

:file

AntiSMASH v6 sideload JSON file (only if --antismash-sideload is supplied). Will not be generated if no hits are found.

*.json
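
Several of the outputs above are not generated when no hits are found, so downstream steps may want to check for them before use. The following is a minimal sketch of such a check, not part of the module itself: `has_output` is a hypothetical helper, and the temporary directory and the file name `sample.features.tsv` are stand-ins for a real GECCO output directory.

```shell
# Hypothetical helper: succeed if any file matching a glob exists in a directory.
has_output() {
    dir=$1
    pattern=$2
    for f in "$dir"/$pattern; do
        # An unmatched glob is left as a literal string, so confirm the path exists.
        [ -e "$f" ] && return 0
    done
    return 1
}

# Temporary directory standing in for a GECCO output directory (assumption).
outdir=$(mktemp -d)
# Simulate a run without hits: only a features table is present (hypothetical file name).
touch "$outdir/sample.features.tsv"

if has_output "$outdir" '*.clusters.tsv'; then
    echo "clusters found"
else
    echo "no clusters predicted"
fi
```

The same helper can be pointed at the other patterns listed above, such as `*.genes.tsv` or `*_cluster_*.gbk`.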

versions_gecco

${task.process}

:string

The name of the process

gecco

:string

The name of the tool

gecco -V |& sed 's/gecco //'

:eval

The expression to obtain the version of the tool
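
As an illustration of what the version expression does, the sketch below applies the same `sed` substitution to a stand-in string. It assumes `gecco -V` prints a line of the form `gecco <version>`; the version number used here is a hypothetical example, not a statement about any particular release. In Bash, `|&` is shorthand for `2>&1 |`, so the real expression also captures anything written to standard error.

```shell
# Stand-in for the output of `gecco -V`; the version number is hypothetical.
raw="gecco 0.9.8"

# Same substitution as the module expression: strip the leading "gecco " prefix.
version=$(printf '%s\n' "$raw" | sed 's/gecco //')

echo "$version"
```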

Topics

name:type
description
pattern

versions

${task.process}

:string

The name of the process

gecco

:string

The name of the tool

gecco -V |& sed 's/gecco //'

:eval

The expression to obtain the version of the tool

Tools

gecco
GPL v3

Biosynthetic Gene Cluster prediction with Conditional Random Fields.