Description

Accurate assembly of segmental duplications, satellites, and allelic variants from high-fidelity long reads.

Input

Name (Type)
Description
Pattern

meta (map)

Groovy Map containing sample information
e.g. [ id:'test', single_end ]

reads (file)

Input reads in FASTA or FASTQ format

*.{fasta,fastq}

mode (value)

Canu mode depending on the input data (source and error rate)

-pacbio|-nanopore|-pacbio-hifi

genomesize (value)

An estimate of the genome size. Common suffixes are allowed, for example 3.7m or 2.8g

<number>[g|m|k]
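
The input patterns above can be checked before a run is launched. The following is a minimal sketch, not part of the module itself: the helper names (`parse_genome_size`, `check_mode`, `check_reads`) are hypothetical, and it assumes that the `k`, `m`, and `g` suffixes stand for 10^3, 10^6, and 10^9 bases and that only lowercase suffixes are accepted, as the pattern shows.

```python
import re
from decimal import Decimal

# Mode flags listed in the Input table.
CANU_MODES = {"-pacbio", "-nanopore", "-pacbio-hifi"}

# Assumption: k/m/g mean thousand, million, and billion bases.
SUFFIX_MULTIPLIER = {"k": 10**3, "m": 10**6, "g": 10**9}

# Mirrors the documented pattern <number>[g|m|k].
GENOME_SIZE_RE = re.compile(r"(\d+(?:\.\d+)?)([gmk])")


def parse_genome_size(value: str) -> int:
    """Convert a string such as '3.7m' into a number of bases."""
    match = GENOME_SIZE_RE.fullmatch(value)
    if match is None:
        raise ValueError(f"genomesize must look like <number>[g|m|k], got {value!r}")
    number, suffix = match.groups()
    # Decimal avoids floating-point rounding on values like 3.7.
    return int(Decimal(number) * SUFFIX_MULTIPLIER[suffix])


def check_mode(mode: str) -> str:
    """Return the mode unchanged if it is one of the documented flags."""
    if mode not in CANU_MODES:
        raise ValueError(f"mode must be one of {sorted(CANU_MODES)}, got {mode!r}")
    return mode


def check_reads(filename: str) -> str:
    """Return the filename unchanged if it matches *.{fasta,fastq}."""
    if not filename.endswith((".fasta", ".fastq")):
        raise ValueError(f"reads must match *.{{fasta,fastq}}, got {filename!r}")
    return filename


if __name__ == "__main__":
    print(parse_genome_size("3.7m"))   # 3700000
    print(check_mode("-pacbio-hifi"))  # -pacbio-hifi
```

Validating these values up front gives a clear error message instead of a failure partway through an assembly.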

Output

Name (Type)
Description
Pattern

meta (map)

Groovy Map containing sample information
e.g. [ id:'test', single_end ]

versions (file)

File containing software versions

versions.yml

report (file)

Most of the analysis reported during assembly

*.report

assembly (file)

The full assembly: everything that could be assembled, including unique, repetitive, and bubble elements.

*.contigs.fasta

contigs (file)

Reads and low-coverage contigs which could not be incorporated into the primary assembly.

*.unassembled.fasta

corrected_reads (file)

The reads after correction.

*.correctedReads.fasta.gz

corrected_trimmed_reads (file)

The corrected reads after overlap-based trimming.

*.trimmedReads.fasta.gz

metadata (file)

(undocumented)

*.contigs.layout

contig_position (file)

The position of each read in a contig

*.contigs.layout.readToTig

contig_info (file)

A list of the contigs with their lengths, coverage, number of reads, and other metadata. Essentially the same information as in the FASTA header line.

*.contigs.layout.tigInfo
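
The output patterns above can be used to sort the files in a results directory by output name. This is a minimal sketch, assuming the glob patterns match whole file names; `classify_outputs` and the `test` prefix in the example are hypothetical and not part of the module.

```python
from fnmatch import fnmatchcase

# Output names and glob patterns copied from the Output table.
OUTPUT_PATTERNS = {
    "versions": "versions.yml",
    "report": "*.report",
    "assembly": "*.contigs.fasta",
    "contigs": "*.unassembled.fasta",
    "corrected_reads": "*.correctedReads.fasta.gz",
    "corrected_trimmed_reads": "*.trimmedReads.fasta.gz",
    "metadata": "*.contigs.layout",
    "contig_position": "*.contigs.layout.readToTig",
    "contig_info": "*.contigs.layout.tigInfo",
}


def classify_outputs(filenames):
    """Group file names by the output whose pattern they match."""
    grouped = {name: [] for name in OUTPUT_PATTERNS}
    for filename in filenames:
        for name, pattern in OUTPUT_PATTERNS.items():
            # fnmatchcase matches the whole name, case-sensitively.
            if fnmatchcase(filename, pattern):
                grouped[name].append(filename)
    return grouped


if __name__ == "__main__":
    files = ["test.contigs.fasta", "test.unassembled.fasta", "test.report"]
    for name, matched in classify_outputs(files).items():
        if matched:
            print(name, matched)
```

Because each pattern must match the full file name, `*.contigs.layout` does not also capture the `.readToTig` and `.tigInfo` files.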

Tools

canu
GPL v2 and others

Canu is a fork of the Celera Assembler designed for high-noise single-molecule sequencing.