modules/merfin_hist

Compare k-mer frequencies in reads and assembly to derive the metrics K* and QV*

assembly, evaluation, quality, completeness

https://github.com/nf-core/modules/[...]/modules/nf-core/merfin/hist

Description

Compare k-mer frequencies in reads and assembly to derive the metrics K* and QV*

Input

| name:type | description | pattern |
| --- | --- | --- |
| `meta:map` | Groovy Map containing sample information, e.g. `[ id:'sample1', single_end:false ]` | |
| `fasta_assembly:file` | Genome assembly in FASTA format, uncompressed or gzip-compressed [REQUIRED] | `*.{fasta, fasta.gz}` |
| `meta1:map` | Groovy Map containing sample read information, e.g. `[ id:'sample1', single_end:false ]` | |
| `meryl_db_reads:file` | K-mer database produced from raw reads using Meryl [REQUIRED] | `*.{meryl_db}` |
| `lookup_table:file` | Input vector of k-mer probabilities (obtained from genomescope2 with the `--fitted_hist` parameter) [OPTIONAL] | `lookup_table.txt` |
| `seqmers:file` | Pre-built sequence meryl db. By default, the sequence meryl db is generated from the input genome assembly [OPTIONAL] | `*.{meryl_db}` |
| `peak:float` | Peak value used to hard-set copy 1 and to infer copy number from multiplicity. Can be calculated using genomescope2 [REQUIRED] | |
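To make the required/optional split of the inputs concrete, here is a minimal sketch of how they might map onto a `merfin -hist` command line. The flag names (`-sequence`, `-readmers`, `-peak`, `-prob`, `-seqmers`, `-output`) are assumed from Merfin's command-line usage and should be checked against the help output of your installed version; the function name and the `prefix` parameter are hypothetical and are not part of the module.

```python
from typing import List, Optional


def build_merfin_hist_cmd(
    fasta_assembly: str,
    meryl_db_reads: str,
    peak: float,
    prefix: str,
    lookup_table: Optional[str] = None,
    seqmers: Optional[str] = None,
) -> List[str]:
    """Assemble an argument list for `merfin -hist` (flag names are assumptions)."""
    # Required inputs: assembly, read k-mer database, and the peak value.
    cmd = [
        "merfin", "-hist",
        "-sequence", fasta_assembly,
        "-readmers", meryl_db_reads,
        "-peak", str(peak),
    ]
    # Optional inputs are only added when they are provided.
    if lookup_table is not None:
        cmd += ["-prob", lookup_table]
    if seqmers is not None:
        cmd += ["-seqmers", seqmers]
    # Output name chosen to match the `*.hist` output pattern.
    cmd += ["-output", f"{prefix}.hist"]
    return cmd


if __name__ == "__main__":
    # Example values are made up for illustration.
    print(" ".join(build_merfin_hist_cmd("asm.fasta.gz", "reads.meryl_db", 23.5, "sample1")))
```

The list could be passed to `subprocess.run` with stderr redirected to a `<prefix>.hist.stderr.log` file to mirror the module's log output.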

Output

| name:type | description | pattern |
| --- | --- | --- |
| `hist` | | |
| `meta:map` | Groovy Map containing sample information, e.g. `[ id:'sample1', single_end:false ]` | |
| `*.hist:file` | The generated 0-centered K* histogram for sequences in `<fasta_assembly.fasta>`. Positive K* values indicate expected collapsed copies; negative K* values indicate expected expanded copies. Values closer to 0 mean the expected and found k-mers are well balanced (1:1). | `*.{hist}` |
| `log_stderr` | | |
| `meta:map` | Groovy Map containing sample information, e.g. `[ id:'sample1', single_end:false ]` | |
| `*.hist.stderr.log:file` | Log (stderr) of the hist tool execution. The QV and QV* metrics are reported at the end. | `*.{hist.stderr.log}` |
| `versions` | | |
| `versions.yml:file` | File containing software versions | `versions.yml` |
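The sign convention of the K* histogram (positive for collapsed, negative for expanded, 0 for balanced) can be summarised with a short script. This is a sketch only: it assumes the `.hist` file holds two whitespace-separated columns, a K* value followed by a count, which should be verified against a real output file. The function names are hypothetical.

```python
from typing import Dict, Iterable, List, Tuple


def parse_kstar_hist(lines: Iterable[str]) -> List[Tuple[float, int]]:
    """Parse (K* value, count) rows; the two-column layout is an assumption."""
    rows = []
    for line in lines:
        fields = line.split()
        if len(fields) < 2:
            continue
        try:
            rows.append((float(fields[0]), int(float(fields[1]))))
        except ValueError:
            # Skip header or comment lines that are not numeric.
            continue
    return rows


def summarise_kstar(rows: List[Tuple[float, int]]) -> Dict[str, float]:
    """Return the fraction of k-mers with positive, zero, and negative K*."""
    total = sum(count for _, count in rows)
    if total == 0:
        return {"collapsed": 0.0, "balanced": 0.0, "expanded": 0.0}
    collapsed = sum(count for kstar, count in rows if kstar > 0)
    expanded = sum(count for kstar, count in rows if kstar < 0)
    balanced = total - collapsed - expanded
    return {
        "collapsed": collapsed / total,
        "balanced": balanced / total,
        "expanded": expanded / total,
    }


if __name__ == "__main__":
    # Made-up rows for illustration, not real Merfin output.
    example = ["-2\t10", "-1\t20", "0\t140", "1\t20", "2\t10"]
    print(summarise_kstar(parse_kstar_hist(example)))
```

A large "balanced" fraction corresponds to the well-balanced, 1:1 case described for the `*.hist` output.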

Tools

merfin
Apache-2.0

Merfin (k-mer based finishing tool) is a suite of subtools for variant filtering, assembly evaluation, and polishing via k-mer validation. The `-hist` subtool estimates the QV (the quality value defined by [Merqury](https://github.com/marbl/merqury)) for each scaffold/contig, along with genome-wide averages. In addition, Merfin produces a QV* estimate, which also accounts for k-mers seen in excess of their expected multiplicity as predicted from the reads.

github.com/arangrhie/merfin
github.com/arangrhie/merfin/wiki/Best-practices-for-Merfin
DOI: 10.1038/s41592-022-01445-y
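For readers unfamiliar with the QV that the tool description refers to, the sketch below computes a Merqury-style consensus quality value from k-mer counts: the per-base error rate is E = 1 - (1 - K_err / K_total)^(1/k) and QV = -10 log10(E). The function and argument names are hypothetical. For QV*, the description above implies that the error count would additionally include k-mers seen in excess of their expected multiplicity; how Merfin counts those is not shown here.

```python
import math


def kmer_qv(error_kmers: int, total_kmers: int, k: int) -> float:
    """Merqury-style QV from the number of erroneous and total assembly k-mers."""
    if k <= 0 or total_kmers <= 0:
        raise ValueError("k and total_kmers must be positive")
    if not 0 <= error_kmers <= total_kmers:
        raise ValueError("error_kmers must lie between 0 and total_kmers")
    if error_kmers == 0:
        # No erroneous k-mers observed: the estimate is unbounded.
        return math.inf
    # Probability that a single base is correct, derived from k-mer survival.
    base_correct = (1.0 - error_kmers / total_kmers) ** (1.0 / k)
    error_rate = 1.0 - base_correct
    return -10.0 * math.log10(error_rate)


if __name__ == "__main__":
    # Made-up counts for illustration.
    print(round(kmer_qv(1_000, 1_000_000_000, 21), 2))
```

Higher values indicate fewer erroneous k-mers; counting more k-mers as errors, as QV* does for excess k-mers, can only lower the estimate.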
