Description

A program that counts sequence occurrences in FASTQ files.

Input

name:type
description
pattern

meta:map

Groovy Map containing output name, e.g. [ id:'test' ]

fastq:directory

Folder with FASTQ file(s). 2FAST2Q automatically picks up all the FASTQ files inside the provided folder.

*.{fastq,gz}

meta2:map

Groovy Map containing sample information, e.g. [ id:'library_name', multiple_features_per_read ]

library:file

.csv library file following the 'Feature_name,sequence' or 'Feature_name,sequence1' format. See the 2FAST2Q instructions for more information.

*.csv
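
As a minimal sketch of preparing these two inputs, the snippet below writes a library file in the plain 'Feature_name,sequence' layout and lists the files in a folder that match the *.{fastq,gz} pattern. The feature names, sequences, and helper functions are hypothetical, and it is an assumption that no header row is needed; check the 2FAST2Q instructions before relying on it.

```python
import csv
import tempfile
from pathlib import Path


def write_library(path, features):
    """Write one 'Feature_name,sequence' row per (name, sequence) pair."""
    with open(path, "w", newline="") as handle:
        writer = csv.writer(handle)
        for name, sequence in features:
            writer.writerow([name, sequence])


def list_fastq_files(folder):
    """Return the files in `folder` whose names end in .fastq or .gz."""
    return sorted(p for p in Path(folder).iterdir() if p.suffix in {".fastq", ".gz"})


# Hypothetical features, for illustration only
features = [
    ("geneA_sg1", "ACGTACGTACGTACGTACGT"),
    ("geneB_sg1", "TTGACCTGAAGCTAGCTAGG"),
]

workdir = Path(tempfile.mkdtemp())
library_path = workdir / "library.csv"
write_library(library_path, features)
library_rows = library_path.read_text().splitlines()
```

Checking the folder contents this way before a run is a cheap guard against passing an empty or mislabelled directory.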

Output

name:type
description
pattern

count_matrix

meta:map

Groovy Map containing output name, e.g. [ id:'test' ]

${prefix}.csv:file

Count matrix CSV file.

stats

meta:map

Groovy Map containing output name, e.g. [ id:'test' ]

${prefix}_stats.csv:file

File containing all the relevant statistics, such as quality-passing reads, aligned reads, total reads, and sample run times.

distribution_plot

meta:map

Groovy Map containing output name, e.g. [ id:'test' ]

${prefix}_distribution_plot.png:file

Violin plot of the distribution of reads per feature across all samples.

reads_plot

meta:map

Groovy Map containing output name, e.g. [ id:'test' ]

${prefix}_reads_plot.png:file

Bar plot of the distribution of reads, in absolute numbers, binned by the quality metrics reported in the statistics file.

reads_plot_percentage

meta:map

Groovy Map containing output name, e.g. [ id:'test' ]

${prefix}_reads_plot_percentage.png:file

Bar plot of the distribution of reads, as percentages, binned by the quality metrics reported in the statistics file.

versions

versions.yml:file

File containing software versions

versions.yml

Tools

2FAST2Q
GPL-3.0-or-later

2FAST2Q is ideal for CRISPRi-Seq, and for extracting and counting any kind of information from reads in the FASTQ format, such as barcodes in Bar-seq experiments. It can handle sequence mismatches and Phred scores, and can find and extract unknown sequences delimited by known sequences. It can also extract multiple features per read, using either fixed positions or delimiting search sequences.
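
To make the idea of delimiter-based extraction with mismatch tolerance concrete, here is a small illustrative sketch. It is not 2FAST2Q's implementation: the function names, the toy reads, the delimiter sequences, and the `max_mismatches` parameter are all hypothetical, and Phred-score handling is left out for brevity.

```python
from collections import Counter


def hamming(a, b):
    """Count mismatching positions between two equal-length sequences."""
    return sum(x != y for x, y in zip(a, b))


def extract_between(read, upstream, downstream):
    """Return the sequence between the first upstream match and the next downstream match."""
    start = read.find(upstream)
    if start == -1:
        return None
    start += len(upstream)
    end = read.find(downstream, start)
    if end == -1:
        return None
    return read[start:end]


def count_features(reads, library, upstream, downstream, max_mismatches=1):
    """Count reads whose extracted sequence matches a library feature within max_mismatches."""
    counts = Counter()
    for read in reads:
        candidate = extract_between(read, upstream, downstream)
        if candidate is None:
            continue
        for name, sequence in library.items():
            if len(sequence) == len(candidate) and hamming(sequence, candidate) <= max_mismatches:
                counts[name] += 1
                break  # assign each read to at most one feature
    return counts


# Toy data, for illustration only
library = {"geneA_sg1": "ACGTACGT", "geneB_sg1": "TTGACCTG"}
reads = [
    "GGGCATACGTACGTTTAGCC",  # exact match to geneA_sg1
    "GGGCATACGTACCTTTAGCC",  # geneA_sg1 with one mismatch
    "GGGCATTTGACCTGTTAGCC",  # exact match to geneB_sg1
    "GGGCATAAAAAAAATTAGCC",  # no library match
]
counts = count_features(reads, library, upstream="GCAT", downstream="TTAG")
```

With these toy reads, `counts` holds two reads for `geneA_sg1` and one for `geneB_sg1`; the fourth read is extracted but matches no feature. The linear scan over the library keeps the sketch short and would be slow for large libraries.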