Description

Cluster sequences using a single-pass, greedy centroid-based clustering algorithm.
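To make "single-pass, greedy centroid-based" concrete, here is a minimal Python sketch of the general idea. It is not vsearch's implementation: the `identity` function (built on `difflib`), the length-based ordering, and the 0.9 threshold are illustrative assumptions, and the function names are hypothetical.

```python
from difflib import SequenceMatcher


def identity(a: str, b: str) -> float:
    """Toy similarity in [0, 1]; a stand-in for a real pairwise alignment identity."""
    return SequenceMatcher(None, a, b, autojunk=False).ratio()


def greedy_cluster(seqs: dict[str, str], threshold: float = 0.9) -> dict[str, list[str]]:
    """Visit each sequence once: join the first centroid that is similar
    enough, otherwise become a new centroid."""
    clusters: dict[str, list[str]] = {}  # centroid label -> member labels
    # Assumption: longer sequences are visited first, so they tend to become centroids.
    for label, seq in sorted(seqs.items(), key=lambda kv: -len(kv[1])):
        for centroid in clusters:
            if identity(seq, seqs[centroid]) >= threshold:
                clusters[centroid].append(label)
                break
        else:
            # No existing centroid was close enough: start a new cluster.
            clusters[label] = [label]
    return clusters


example = {
    "seq1": "ACGTACGTACGTACGTACGT",
    "seq2": "ACGTACGTACGTACGTACGA",  # one mismatch against seq1
    "seq3": "TTTTGGGGCCCCAAAATTTT",
}
clusters = greedy_cluster(example, threshold=0.9)
print(clusters)  # {'seq1': ['seq1', 'seq2'], 'seq3': ['seq3']}
```

Because each sequence is compared only against the centroids found so far, the result depends on the input order; that is the trade-off a single-pass greedy scheme makes for speed.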

Input

name:type
description
pattern

meta

:map

Groovy Map containing sample information, e.g. [ id:'test' ]

fasta

:file

Sequences to cluster in FASTA format

*.{fasta,fa,fasta.gz,fa.gz}
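The patterns in this reference use shell-style globs with a brace group. As a rough sketch of how a file name could be checked against such a pattern outside Nextflow, the Python below expands a single brace group and then applies `fnmatch`. The helpers `expand_braces` and `matches` are hypothetical names introduced for illustration, and the sketch assumes at most one, non-nested brace group.

```python
import re
from fnmatch import fnmatchcase

INPUT_PATTERN = "*.{fasta,fa,fasta.gz,fa.gz}"


def expand_braces(pattern: str) -> list[str]:
    """Expand one {a,b,c} group; patterns without braces are returned unchanged."""
    m = re.search(r"\{([^{}]*)\}", pattern)
    if not m:
        return [pattern]
    head, tail = pattern[:m.start()], pattern[m.end():]
    return [head + alt + tail for alt in m.group(1).split(",")]


def matches(filename: str, pattern: str) -> bool:
    """True if the file name matches any expansion of the pattern."""
    return any(fnmatchcase(filename, p) for p in expand_braces(pattern))


print(matches("sample.fa.gz", INPUT_PATTERN))  # True
print(matches("sample.fastq", INPUT_PATTERN))  # False
```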

Output

name:type
description
pattern

aln

meta

:map

Groovy Map containing sample information, e.g. [ id:'test' ]

*.aln.gz

:file

Results in pairwise alignment format

*.aln.gz

biom

meta

:map

Groovy Map containing sample information, e.g. [ id:'test' ]

*.biom.gz

:file

Results as an OTU table in the BIOM version 1.0 file format

*.biom.gz

mothur

meta

:map

Groovy Map containing sample information, e.g. [ id:'test' ]

*.mothur.tsv.gz

:file

Results as an OTU table in the mothur 'shared' tab-separated plain text file format

*.mothur.tsv.gz

otu

meta

:map

Groovy Map containing sample information, e.g. [ id:'test' ]

*.otu.tsv.gz

:file

Results as an OTU table in the classic tab-separated plain text format

*.otu.tsv.gz

bam

meta

:map

Groovy Map containing sample information, e.g. [ id:'test' ]

*.bam

:file

Results written in BAM format

*.bam

out

meta

:map

Groovy Map containing sample information, e.g. [ id:'test' ]

*.out.tsv.gz

:file

Results in tab-separated format, with columns defined by the user

*.out.tsv.gz

blast

meta

:map

Groovy Map containing sample information, e.g. [ id:'test' ]

*.blast.tsv.gz

:file

Tab-delimited results in BLAST-like tabular format

*.blast.tsv.gz

uc

meta

:map

Groovy Map containing sample information, e.g. [ id:'test' ]

*.uc.tsv.gz

:file

Tab-delimited results in a uclust-like format with 10 columns

*.uc.tsv.gz
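As an illustration of working with the 10-column uclust-like table, the Python sketch below reads such rows and groups member sequences under their centroid. The field names in `UC_FIELDS` are labels chosen here, not official names, and the column meanings (record type `S` for centroid, `H` for hit, `C` for cluster summary; query label in column 9, centroid label in column 10 for hits) follow my reading of the vsearch manual, so check them against the version you run. The sample rows are made up for the example.

```python
import csv
import io

# Hypothetical field names for the 10 columns of a uc row.
UC_FIELDS = [
    "record_type", "cluster", "length_or_size", "pct_identity", "strand",
    "unused_1", "unused_2", "alignment", "query_label", "target_label",
]

# Made-up rows for illustration only.
SAMPLE = (
    "S\t0\t20\t*\t*\t*\t*\t*\tseq1\t*\n"
    "H\t0\t20\t95.0\t+\t0\t0\t20M\tseq2\tseq1\n"
    "C\t0\t2\t*\t*\t*\t*\t*\tseq1\t*\n"
)


def read_uc(handle):
    """Yield one dict per well-formed (10-column) tab-separated row."""
    for row in csv.reader(handle, delimiter="\t", quoting=csv.QUOTE_NONE):
        if len(row) == len(UC_FIELDS):
            yield dict(zip(UC_FIELDS, row))


def members_by_centroid(records):
    """Map each centroid label to the labels of the sequences that hit it."""
    members = {}
    for rec in records:
        if rec["record_type"] == "S":
            # A centroid record: make sure the centroid has an entry.
            members.setdefault(rec["query_label"], [])
        elif rec["record_type"] == "H":
            # A hit record: the query joins the target centroid's cluster.
            members.setdefault(rec["target_label"], []).append(rec["query_label"])
    return members


records = list(read_uc(io.StringIO(SAMPLE)))
print(members_by_centroid(records))  # {'seq1': ['seq2']}
```

For the gzip-compressed file emitted by this module, the same `read_uc` function can be given a handle from `gzip.open(path, "rt")`.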

centroids

meta

:map

Groovy Map containing sample information, e.g. [ id:'test' ]

*.centroids.fasta.gz

:file

Centroid sequences in FASTA format

*.centroids.fasta.gz

clusters

meta

:map

Groovy Map containing sample information, e.g. [ id:'test' ]

*.clusters.fasta*.gz

:file

Clustered sequences in FASTA format

*.clusters.fasta*.gz

profile

meta

:map

Groovy Map containing sample information, e.g. [ id:'test' ]

*.profile.txt.gz

:file

Profile of the clustering results

*.profile.txt.gz

msa

meta

:map

Groovy Map containing sample information, e.g. [ id:'test' ]

*.msa.fasta.gz

:file

Multiple sequence alignment of the centroids

*.msa.fasta.gz

versions_vsearch

${task.process}

:string

The process the versions were collected from

vsearch

:string

The tool name

vsearch --version 2>&1 | sed -n "1s/.*v\([0-9.]*\).*/\\1/p"

:eval

The expression to obtain the version of the tool

versions_samtools

${task.process}

:string

The process the versions were collected from

samtools

:string

The tool name

samtools version | sed '1!d;s/.* //'

:eval

The expression to obtain the version of the tool
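The two `sed` expressions above can be hard to read at a glance. The Python sketch below mirrors what each one does to the first line of the tool's version output. The function names are hypothetical, and the sample lines are illustrative stand-ins, not captured output from either tool.

```python
import re


def vsearch_version(first_line: str) -> str:
    # Mirrors: sed -n "1s/.*v\([0-9.]*\).*/\1/p"
    # The greedy '.*v' anchors on the last 'v' in the line, then the group
    # captures the run of digits and dots that follows it.
    m = re.match(r".*v([0-9.]*).*", first_line)
    return m.group(1) if m else ""  # sed -n ... p prints nothing when there is no match


def samtools_version(first_line: str) -> str:
    # Mirrors: sed '1!d;s/.* //'
    # Keep only line 1, then delete everything up to and including the last space.
    return re.sub(r".* ", "", first_line, count=1)


# Illustrative input lines (assumed shapes, not real captured output).
print(vsearch_version("vsearch v2.21.1_linux_x86_64, 125.8GB RAM, 16 cores"))  # 2.21.1
print(samtools_version("samtools 1.17"))  # 1.17
```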

Topics

name:type
description
pattern

versions

${task.process}

:string

The process the versions were collected from

vsearch

:string

The tool name

vsearch --version 2>&1 | sed -n "1s/.*v\([0-9.]*\).*/\\1/p"

:eval

The expression to obtain the version of the tool

${task.process}

:string

The process the versions were collected from

samtools

:string

The tool name

samtools version | sed '1!d;s/.* //'

:eval

The expression to obtain the version of the tool

Tools

vsearch
GPL v3-or-later OR BSD-2-clause

VSEARCH is a versatile open-source tool for microbiome analysis and an alternative to USEARCH. It supports chimera detection, clustering, dereplication and rereplication, extraction, FASTA/FASTQ/SFF file processing, masking, orienting, pairwise alignment, restriction site cutting, searching, shuffling, sorting, subsampling, and taxonomic classification of amplicon sequences for metagenomics, genomics, and population genetics.