Description

Accurate assembly of segmental duplications, satellites, and allelic variants from high-fidelity long reads.

Input

name:type
description
pattern

meta:map

Groovy Map containing sample information e.g. [ id:'test', single_end:true ]

reads:file

fasta/fastq file

*.{fasta,fastq}

mode:string

Canu mode depending on the input data (source and error rate)

-pacbio|-nanopore|-pacbio-hifi

genomesize:string

An estimate of the size of the genome. Common suffixes are allowed, for example, 3.7m or 2.8g.

<number>[g|m|k]
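The mode and genomesize patterns above can be checked before a run is launched. The following is a minimal Python sketch, not part of the module itself: the helper names `parse_genome_size` and `check_mode` are hypothetical, the suffix is treated as optional (one reading of the square brackets in the pattern), and the multipliers are assumed to be decimal (k = 10^3, m = 10^6, g = 10^9).

```python
import re
from decimal import Decimal

# Values copied from the "pattern" column for `mode`.
ALLOWED_MODES = {"-pacbio", "-nanopore", "-pacbio-hifi"}

# Assumption: suffixes are decimal multipliers (k = 1e3, m = 1e6, g = 1e9).
SUFFIX_MULTIPLIER = {"k": 10**3, "m": 10**6, "g": 10**9}

# <number>[g|m|k]: a decimal number followed by an optional one-letter suffix.
# Assumption: the square brackets in the pattern mean the suffix is optional.
GENOME_SIZE_RE = re.compile(r"^(\d+(?:\.\d+)?)([gmk]?)$")


def parse_genome_size(text: str) -> int:
    """Convert a genome size string such as '3.7m' into a number of bases."""
    match = GENOME_SIZE_RE.match(text.strip())
    if match is None:
        raise ValueError(f"genomesize does not match <number>[g|m|k]: {text!r}")
    number, suffix = match.groups()
    # Decimal avoids floating-point rounding when scaling values like 3.7.
    return int(Decimal(number) * SUFFIX_MULTIPLIER.get(suffix, 1))


def check_mode(mode: str) -> str:
    """Return the mode unchanged if it is one of the documented values."""
    if mode not in ALLOWED_MODES:
        raise ValueError(f"mode must be one of {sorted(ALLOWED_MODES)}: {mode!r}")
    return mode
```

With these helpers, `parse_genome_size("3.7m")` yields 3700000 and `parse_genome_size("2.8g")` yields 2800000000, while a string such as `3.7x` raises a `ValueError`.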

Output

name:type
description
pattern

report

meta:map

Groovy Map containing sample information e.g. [ id:'test', single_end:false ]

*.report:file

Most of the analysis reported during assembly

*.report

assembly

meta:map

Groovy Map containing sample information e.g. [ id:'test', single_end:false ]

*.contigs.fasta.gz:file

Everything that could be assembled; this is the full assembly, including unique, repetitive, and bubble elements.

*.contigs.fasta.gz

contigs

meta:map

Groovy Map containing sample information e.g. [ id:'test', single_end:false ]

*.unassembled.fasta.gz:file

Reads and low-coverage contigs that could not be incorporated into the primary assembly.

*.unassembled.fasta.gz

corrected_reads

meta:map

Groovy Map containing sample information e.g. [ id:'test', single_end:false ]

*.correctedReads.fasta.gz:file

The reads after correction.

*.correctedReads.fasta.gz

corrected_trimmed_reads

meta:map

Groovy Map containing sample information e.g. [ id:'test', single_end:false ]

*.trimmedReads.fasta.gz:file

The corrected reads after overlap-based trimming.

*.trimmedReads.fasta.gz

metadata

meta:map

Groovy Map containing sample information e.g. [ id:'test', single_end:false ]

*.contigs.layout:file

(undocumented)

*.contigs.layout

contig_position

meta:map

Groovy Map containing sample information e.g. [ id:'test', single_end:false ]

*.contigs.layout.readToTig:file

The position of each read in a contig.

*.contigs.layout.readToTig

contig_info

meta:map

Groovy Map containing sample information e.g. [ id:'test', single_end:false ]

*.contigs.layout.tigInfo:file

A list of the contigs, lengths, coverage, number of reads and other metadata. Essentially the same information provided in the FASTA header line.

*.contigs.layout.tigInfo

versions

versions.yml:file

File containing software versions

versions.yml
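When sorting files from a finished run, it can help to map each file name back to the output it belongs to. The sketch below is an illustration in Python rather than anything the module provides: the `OUTPUT_PATTERNS` table and the `output_name_for` helper are hypothetical names, and the patterns are taken from the file names in the name:type column of the Output table (including the `.gz` extension where it is listed).

```python
from fnmatch import fnmatchcase
from typing import Optional

# Output name -> file name pattern, following the Output table.
OUTPUT_PATTERNS = {
    "report": "*.report",
    "assembly": "*.contigs.fasta.gz",
    "contigs": "*.unassembled.fasta.gz",
    "corrected_reads": "*.correctedReads.fasta.gz",
    "corrected_trimmed_reads": "*.trimmedReads.fasta.gz",
    "metadata": "*.contigs.layout",
    "contig_position": "*.contigs.layout.readToTig",
    "contig_info": "*.contigs.layout.tigInfo",
    "versions": "versions.yml",
}


def output_name_for(filename: str) -> Optional[str]:
    """Return the output name whose pattern matches filename, or None."""
    for name, pattern in OUTPUT_PATTERNS.items():
        # fnmatchcase matches the whole name case-sensitively, so
        # '*.contigs.layout' does not also match '*.contigs.layout.tigInfo'.
        if fnmatchcase(filename, pattern):
            return name
    return None
```

For example, `output_name_for("sample.contigs.layout.tigInfo")` returns `"contig_info"`, and a file that matches none of the patterns returns `None`.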

Tools

canu
GPL v2 and others

Canu is a fork of the Celera Assembler designed for high-noise single-molecule sequencing.