Description

Compare k-mer frequencies in the reads and the assembly to derive the metrics K* and QV*

Input

Name (Type)
Description
Pattern

meta (map)

Groovy Map containing sample information
e.g. [ id:'sample1', single_end:false ]

fasta_assembly (file)

Genome assembly in FASTA format, either uncompressed or gzip-compressed [REQUIRED]

*.{fasta,fasta.gz}

meta1 (map)

Groovy Map containing sample read information
e.g. [ id:'sample1', single_end:false ]

meryl_db_reads (file)

K-mer database produced from raw reads using Meryl [REQUIRED]

*.{meryl_db}

lookup_table (file)

Input vector of k-mer probabilities (obtained from genomescope2 with the parameter --fitted_hist) [OPTIONAL]

lookup_table.txt

seqmers (file)

Pre-built Meryl k-mer database of the assembly sequence. By default, this database is generated from the input genome assembly [OPTIONAL]

*.{meryl_db}

peak (float)

Value used to hard-set copy 1 and to infer copy number from k-mer multiplicity. Can be calculated using genomescope2 [REQUIRED]

Output

Name (Type)
Description
Pattern

meta (map)

Groovy Map containing sample information
e.g. [ id:'sample1', single_end:false ]

versions (file)

File containing software versions

versions.yml

hist (file)

The generated 0-centered K* histogram for the sequences in <fasta_assembly.fasta>. Positive K* values indicate expected collapsed copies; negative K* values indicate expected expanded copies. Values closer to 0 mean the expected and found k-mers are well balanced, 1:1.

*.{hist}

log_stderr (file)

Log (stderr) of the hist tool execution. The QV and QV* metrics are reported at the end.

*.{hist.stderr.log}
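
As a rough illustration of how the K* values in the hist output can be read, the sketch below assumes that K* is computed per k-mer as (expected - found) / min(expected, found), with the expected copy number taken as the read multiplicity divided by peak and rounded. The formula, the rounding, and the function names are assumptions made for illustration and are not Merfin's actual code; when a lookup_table is supplied, copy numbers may be assigned differently.

```python
from typing import Optional


def expected_copy_number(read_multiplicity: int, peak: float) -> int:
    """Assumed conversion: read k-mer multiplicity / peak, rounded to an integer."""
    if peak <= 0:
        raise ValueError("peak must be positive")
    return round(read_multiplicity / peak)


def k_star(expected: int, found: int) -> Optional[float]:
    """Assumed K* for one k-mer.

    expected: copy number predicted from the reads
    found:    copy number observed in the assembly
    Returns None when either value is zero, since the ratio is undefined
    there and such k-mers would need separate handling.
    """
    if expected <= 0 or found <= 0:
        return None
    return (expected - found) / min(expected, found)


if __name__ == "__main__":
    peak = 30.0  # hypothetical value, e.g. taken from a genomescope2 run
    for multiplicity, found in [(61, 1), (29, 2), (31, 1)]:
        exp = expected_copy_number(multiplicity, peak)
        print(multiplicity, found, exp, k_star(exp, found))
```

Under these assumptions, a positive result means the reads support more copies than the assembly contains (collapsed), a negative result means the assembly contains more copies than the reads support (expanded), and 0 means the two agree.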

Tools

merfin
Apache-2.0

Merfin (k-mer based finishing tool) is a suite of subtools for variant filtering, assembly evaluation and polishing via k-mer validation. The subtool -hist estimates the QV (the quality value defined by [Merqury](https://github.com/marbl/merqury)) for each scaffold/contig and as a genome-wide average. In addition, Merfin produces a QV* estimate, which also accounts for k-mers that are seen in excess of the multiplicity predicted from the reads.
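
For orientation, the following sketch computes a Merqury-style QV from k-mer counts: the fraction of assembly k-mers not supported by the reads is converted to a per-base error rate and then to a Phred-scaled value. The function and parameter names are hypothetical. Treating QV* as the same calculation with excess k-mers added to the error count is an assumption based on the description above, not a statement of how Merfin implements it.

```python
import math


def kmer_qv(error_kmers: int, total_kmers: int, k: int) -> float:
    """Phred-scaled consensus quality from k-mer counts (Merqury-style).

    error_kmers: assembly k-mers considered erroneous (for QV, those absent
                 from the reads; for a QV*-like value, assumed to also
                 include k-mers in excess of their expected multiplicity)
    total_kmers: all k-mers counted in the assembly
    k:           k-mer size
    """
    if total_kmers <= 0 or k <= 0:
        raise ValueError("total_kmers and k must be positive")
    if not 0 <= error_kmers <= total_kmers:
        raise ValueError("error_kmers must be between 0 and total_kmers")
    if error_kmers == 0:
        return math.inf  # no erroneous k-mers observed

    # Probability that a single base is correct, assuming a k-mer is
    # correct only if all k of its bases are correct.
    base_correct = (1.0 - error_kmers / total_kmers) ** (1.0 / k)
    error_rate = 1.0 - base_correct
    return -10.0 * math.log10(error_rate)


if __name__ == "__main__":
    # Hypothetical counts, for illustration only.
    print(round(kmer_qv(error_kmers=2_100, total_kmers=1_000_000, k=21), 2))
```

Because the excess k-mers can only add to the error count in this sketch, the QV*-like value is never higher than the plain QV for the same assembly.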