modules/vsearch_cluster

Cluster sequences using a single-pass, greedy centroid-based clustering algorithm.

Keywords: vsearch, clustering, microbiome

https://github.com/nf-core/modules/[...]/modules/nf-core/vsearch/cluster

Description

Cluster sequences using a single-pass, greedy centroid-based clustering algorithm.
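To make "single-pass, greedy centroid-based" concrete, here is a minimal Python sketch of the general idea. It is not vsearch's implementation: the `identity` function is a hypothetical stand-in built on `difflib`, whereas vsearch derives identity from pairwise alignments and applies its own input ordering and heuristics. The function names and the threshold value are assumptions chosen for illustration.

```python
from difflib import SequenceMatcher


def identity(a: str, b: str) -> float:
    """Crude similarity stand-in (hypothetical); not an alignment-based identity."""
    return SequenceMatcher(None, a, b, autojunk=False).ratio()


def greedy_cluster(seqs, threshold=0.9):
    """Single pass over the input: each sequence joins the first centroid it
    matches at or above the threshold, otherwise it founds a new cluster."""
    centroids = []  # centroid sequences, in order of creation
    clusters = []   # clusters[i] holds the members assigned to centroids[i]
    for seq in seqs:
        for i, centroid in enumerate(centroids):
            if identity(seq, centroid) >= threshold:
                clusters[i].append(seq)
                break
        else:
            # No existing centroid was similar enough: seq becomes a centroid.
            centroids.append(seq)
            clusters.append([seq])
    return centroids, clusters


seqs = ["ACGTACGTAC", "ACGTACGTAA", "TTTTGGGGCC", "TTTTGGGGCA"]
centroids, clusters = greedy_cluster(seqs, threshold=0.8)
```

Because each sequence is compared only against centroids that already exist, the result depends on input order. This is the usual trade-off of greedy clustering: one pass keeps it fast, at the cost of order sensitivity.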

Input

| name:type | description | pattern |
| --- | --- | --- |
| `meta:map` | Groovy Map containing sample information, e.g. `[ id:'test' ]` | |
| `fasta:file` | Sequences to cluster in FASTA format | `*.{fasta,fa,fasta.gz,fa.gz}` |

Output

| name:type | description | pattern |
| --- | --- | --- |
| `aln` | | |
| `meta:map` | Groovy Map containing sample information, e.g. `[ id:'test' ]` | |
| `*.aln.gz:file` | Results in pairwise alignment format | `*.aln.gz` |
| `biom` | | |
| `meta:map` | Groovy Map containing sample information, e.g. `[ id:'test' ]` | |
| `*.biom.gz:file` | Results in an OTU table in the biom version 1.0 file format | `*.biom.gz` |
| `mothur` | | |
| `meta:map` | Groovy Map containing sample information, e.g. `[ id:'test' ]` | |
| `*.mothur.tsv.gz:file` | Results in an OTU table in the mothur 'shared' tab-separated plain text file format | `*.mothur.tsv.gz` |
| `otu` | | |
| `meta:map` | Groovy Map containing sample information, e.g. `[ id:'test' ]` | |
| `*.otu.tsv.gz:file` | Results in an OTU table in the classic tab-separated plain text format | `*.otu.tsv.gz` |
| `bam` | | |
| `meta:map` | Groovy Map containing sample information, e.g. `[ id:'test' ]` | |
| `*.bam:file` | Results written in BAM format | `*.bam` |
| `out` | | |
| `meta:map` | Groovy Map containing sample information, e.g. `[ id:'test' ]` | |
| `*.out.tsv.gz:file` | Results in tab-separated output, with columns defined by the user | `*.out.tsv.gz` |
| `blast` | | |
| `meta:map` | Groovy Map containing sample information, e.g. `[ id:'test' ]` | |
| `*.blast.tsv.gz:file` | Tab-delimited results in BLAST-like tabular format | `*.blast.tsv.gz` |
| `uc` | | |
| `meta:map` | Groovy Map containing sample information, e.g. `[ id:'test' ]` | |
| `*.uc.tsv.gz:file` | Tab-delimited results in a uclust-like format with 10 columns | `*.uc.tsv.gz` |
| `centroids` | | |
| `meta:map` | Groovy Map containing sample information, e.g. `[ id:'test' ]` | |
| `*.centroids.fasta.gz:file` | Centroid sequences in FASTA format | `*.centroids.fasta.gz` |
| `clusters` | | |
| `meta:map` | Groovy Map containing sample information, e.g. `[ id:'test' ]` | |
| `*.clusters.fasta*.gz:file` | Clustered sequences in FASTA format | `*.clusters.fasta*.gz` |
| `profile` | | |
| `meta:map` | Groovy Map containing sample information, e.g. `[ id:'test' ]` | |
| `*.profile.txt.gz:file` | Profile of the clustering results | `*.profile.txt.gz` |
| `msa` | | |
| `meta:map` | Groovy Map containing sample information, e.g. `[ id:'test' ]` | |
| `*.msa.fasta.gz:file` | Multiple sequence alignment of the centroids | `*.msa.fasta.gz` |
| `versions` | | |
| `versions.yml:file` | File containing software versions | `versions.yml` |
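The `uc` output is described above only as "a uclust-like format with 10 columns", so a small reader may help when consuming it downstream. The Python sketch below parses one such line into named fields. The field names are my own labels, and the column meanings in the comments reflect my reading of the vsearch manual; verify them against the manual for the vsearch version you run. The `read_uc` helper and the example line are hypothetical.

```python
import gzip
from typing import Iterator, NamedTuple


class UcRecord(NamedTuple):
    record_type: str     # col 1: S (centroid), H (hit) or C (cluster summary)
    cluster: int         # col 2: zero-based cluster number
    size_or_length: int  # col 3: sequence length (S, H) or cluster size (C)
    identity: str        # col 4: percent identity to the centroid for H, else "*"
    strand: str          # col 5: "+" or "-" for H, else "*"
    unused_6: str        # col 6: not used
    unused_7: str        # col 7: not used
    cigar: str           # col 8: compact alignment (CIGAR-like), "=" or "*"
    query: str           # col 9: query label (H) or centroid label (S, C)
    target: str          # col 10: centroid label for H, else "*"


def parse_uc_line(line: str) -> UcRecord:
    """Split one tab-delimited line and check that it has exactly 10 columns."""
    fields = line.rstrip("\n").split("\t")
    if len(fields) != 10:
        raise ValueError(f"expected 10 columns, got {len(fields)}")
    return UcRecord(fields[0], int(fields[1]), int(fields[2]), *fields[3:])


def read_uc(path: str) -> Iterator[UcRecord]:
    """Yield records from a gzip-compressed uc file, skipping blank lines."""
    with gzip.open(path, "rt") as handle:
        for line in handle:
            if line.strip():
                yield parse_uc_line(line)


# Hand-written example line, not real vsearch output.
example = parse_uc_line("H\t0\t250\t98.4\t+\t0\t0\t250M\tseq2\tseq1\n")
```

Keeping `identity` as a string avoids a conversion error on the `*` placeholder used in non-hit records; convert it with `float()` only after checking `record_type == "H"`.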

Tools

vsearch
GPL v3-or-later OR BSD-2-clause

VSEARCH is a versatile open-source tool for microbiome analysis, including chimera detection, clustering, dereplication and rereplication, extraction, FASTA/FASTQ/SFF file processing, masking, orienting, pair-wise alignment, restriction site cutting, searching, shuffling, sorting, subsampling, and taxonomic classification of amplicon sequences for metagenomics, genomics, and population genetics. (USEARCH alternative)

Homepage: https://github.com/torognes/vsearch
Documentation: https://github.com/torognes/vsearch/releases/download/v2.21.1/vsearch_manual.pdf
Tool development URL: https://github.com/torognes/vsearch
DOI: 10.7717/peerj.2584
