Description

Cluster sequences using a single-pass, greedy centroid-based clustering algorithm.
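To make "single-pass, greedy centroid-based" concrete, here is a minimal sketch of the general idea. It is not vsearch's implementation. The `identity` function is a stand-in built on Python's `difflib` (vsearch derives identity from a pairwise alignment), and `greedy_cluster` and `threshold` are hypothetical names. As a simplification, each sequence joins the first centroid that meets the threshold.

```python
from difflib import SequenceMatcher


def identity(a: str, b: str) -> float:
    """Stand-in similarity in [0, 1]; an assumption, not vsearch's identity definition."""
    return SequenceMatcher(None, a, b, autojunk=False).ratio()


def greedy_cluster(seqs: dict, threshold: float = 0.9) -> dict:
    """Map each centroid label to the labels of its cluster members.

    seqs maps label -> sequence; sequences are visited once, in input order.
    """
    centroids = {}  # centroid label -> centroid sequence
    clusters = {}   # centroid label -> member labels (centroid included)
    for label, seq in seqs.items():
        for c_label, c_seq in centroids.items():
            if identity(seq, c_seq) >= threshold:
                # Close enough to an existing centroid: join that cluster.
                clusters[c_label].append(label)
                break
        else:
            # No centroid matched: this sequence founds a new cluster.
            centroids[label] = seq
            clusters[label] = [label]
    return clusters
```

Because the pass is greedy, the result depends on input order: a sequence seen early can become a centroid even if a later sequence would represent the cluster better.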

Input

name:type
description
pattern

meta

:map

Groovy Map containing sample information e.g. [ id:'test' ]

fasta

:file

Sequences to cluster in FASTA format

*.{fasta,fa,fasta.gz,fa.gz}

Output

name:type
description
pattern

aln

meta

:map

Groovy Map containing sample information e.g. [ id:'test' ]

*.aln.gz

:file

Results in pairwise alignment format

*.aln.gz

biom

meta

:map

Groovy Map containing sample information e.g. [ id:'test' ]

*.biom.gz

:file

Results in an OTU table in the biom version 1.0 file format

*.biom.gz

mothur

meta

:map

Groovy Map containing sample information e.g. [ id:'test' ]

*.mothur.tsv.gz

:file

Results in an OTU table in the mothur 'shared' tab-separated plain text file format

*.mothur.tsv.gz

otu

meta

:map

Groovy Map containing sample information e.g. [ id:'test' ]

*.otu.tsv.gz

:file

Results in an OTU table in the classic tab-separated plain text format

*.otu.tsv.gz

bam

meta

:map

Groovy Map containing sample information e.g. [ id:'test' ]

*.bam

:file

Results written in BAM format

*.bam

out

meta

:map

Groovy Map containing sample information e.g. [ id:'test' ]

*.out.tsv.gz

:file

Results in tab-separated output, columns defined by user

*.out.tsv.gz

blast

meta

:map

Groovy Map containing sample information e.g. [ id:'test' ]

*.blast.tsv.gz

:file

Tab-delimited results in blast-like tabular format

*.blast.tsv.gz

uc

meta

:map

Groovy Map containing sample information e.g. [ id:'test' ]

*.uc.tsv.gz

:file

Tab-delimited results in a uclust-like format with 10 columns

*.uc.tsv.gz
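As a hedged illustration of how the 10-column uclust-like output might be consumed downstream, the sketch below reads a gzipped file and groups query labels by centroid. The column names in `UC_COLUMNS` and the function name `read_uc_clusters` are illustrative labels, not names defined by this module. The record-type meanings (S = centroid, H = hit, C = cluster summary) are assumed from the uclust-style layout described in the vsearch manual, so check them against the vsearch version in use.

```python
import csv
import gzip
from collections import defaultdict

# Illustrative names for the 10 tab-separated columns (an assumption).
UC_COLUMNS = [
    "record_type", "cluster_number", "length_or_size", "percent_identity",
    "strand", "unused_1", "unused_2", "alignment", "query_label", "target_label",
]


def read_uc_clusters(path: str) -> dict:
    """Return {centroid_label: [member labels]} from a gzipped uc-style file."""
    clusters = defaultdict(list)
    with gzip.open(path, "rt", newline="") as handle:
        for row in csv.reader(handle, delimiter="\t"):
            if len(row) != len(UC_COLUMNS):
                continue  # skip lines that do not have the expected 10 fields
            rec = dict(zip(UC_COLUMNS, row))
            if rec["record_type"] == "S":
                # Centroid record: the sequence is a member of its own cluster.
                clusters[rec["query_label"]].append(rec["query_label"])
            elif rec["record_type"] == "H":
                # Hit record: the query is assigned to the target centroid.
                clusters[rec["target_label"]].append(rec["query_label"])
            # Other record types (e.g. "C" summaries) are ignored here.
    return dict(clusters)
```

Calling `read_uc_clusters("sample.uc.tsv.gz")`, where the file name is hypothetical, would return a plain dictionary that can be inspected or joined against the centroid FASTA.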

centroids

meta

:map

Groovy Map containing sample information e.g. [ id:'test' ]

*.centroids.fasta.gz

:file

Centroid sequences in FASTA format

*.centroids.fasta.gz

clusters

meta

:map

Groovy Map containing sample information e.g. [ id:'test' ]

*.clusters.fasta*.gz

:file

Clustered sequences in FASTA format

*.clusters.fasta*.gz

profile

meta

:map

Groovy Map containing sample information e.g. [ id:'test' ]

*.profile.txt.gz

:file

Profile of the clustering results

*.profile.txt.gz

msa

meta

:map

Groovy Map containing sample information e.g. [ id:'test' ]

*.msa.fasta.gz

:file

Multiple sequence alignment of the centroids

*.msa.fasta.gz

versions

versions.yml

:file

File containing software versions

versions.yml

Tools

vsearch
GPL v3-or-later OR BSD-2-clause

VSEARCH is a versatile open-source tool for microbiome analysis, including chimera detection, clustering, dereplication and rereplication, extraction, FASTA/FASTQ/SFF file processing, masking, orienting, pairwise alignment, restriction site cutting, searching, shuffling, sorting, subsampling, and taxonomic classification of amplicon sequences for metagenomics, genomics, and population genetics. It is an open-source alternative to USEARCH.