nf-core/ribomsqc
Edit

QC pipeline that monitors mass spectrometer performance in ribonucleoside analysis

This is the development version of the pipeline.

Launch development version https://github.com/nf-core/ribomsqc

Introduction

nf-core/ribomsqc is a quality control (QC) pipeline designed for monitoring mass spectrometry performance in ribonucleoside analysis. It parses RAW files, performs XIC extraction using analyte definitions, and generates summary plots with MultiQC.

Samplesheet input

Before running the pipeline, you must provide a samplesheet CSV file containing metadata about the RAW files to be processed. The pipeline supports one or multiple samples, each defined on a separate line in the CSV.

--input '[path to samplesheet.csv]'

The samplesheet must contain a header and two columns:

samplesheet.csv

id,raw_file
Day_5,/full/path/to/Day_5.raw
Sample_XYZ,/another/path/to/Sample_XYZ.raw

Column descriptions

id: Unique identifier for the sample. This can be any descriptive label and is used to name output files.
raw_file: Full path to the corresponding RAW instrument file. Multiple entries can be included to analyze several samples in one pipeline run.

Note

The Day_5 row above is just an example. Replace it with your actual sample IDs and file paths.

Analytes TSV input

To define which compounds should be targeted for chromatographic extraction (XIC), the pipeline requires a tab-delimited TSV file passed with:

--analytes_tsv '[path to analytes.tsv]'

This file must follow a strict format and include the following columns in this exact order:

short_name	long_name	mz_M0	ms2_mz	rt_teoretical
C	Cytidine 50 μg/mL	244.0928	112.0505	555
U	Uridine 25 μg/mL	245.0768	113.0346	1566
m3C	3-Methylcytidine methosulfate 100 μg/mL	258.1084	126.0662	508
m5C	5-Methylcytidine 100 μg/mL	258.1084	126.0662	655
Cm	2-O-Methylcytidine 20 μg/mL	258.1084	112.0505	883
m5U	5-Methyluridine 50 μg/mL	259.0925	127.0502	1866
I	Inosine 25 μg/mL	269.088	137.0458	1741
m1A	1-Methyladenosine 25 μg/mL	282.1197	150.0774	523
G	Guanosine 25 μg/mL	284.0989	152.0567	1726
m7G	7-Methylguanosine 25 μg/mL	298.1146	166.0723	554

Column Descriptions

Column	Description
`short_name`	Unique short identifier for the compound (used internally by the pipeline).
`long_name`	Full descriptive name of the analyte, optionally including concentration.
`mz_M0`	Monoisotopic mass-to-charge ratio. Required.
`mz_M1` / `mz_M2`	Optional isotopic variant m/z values. Reserved for future versions.
`ms2_mz`	Fragment ion used for MS2-level extraction, if applicable.
`rt_teoretical`	Expected retention time (in seconds). You must customize this value.

Notes

The retention times in rt_teoretical must reflect your instrument’s chromatography performance.
If multiple transitions are known, you can fill in mz_M1, mz_M2, and ms2_mz for more targeted detection.
At least one compound must be selected at runtime using the --analyte parameter.

Output directory (`--outdir`)

The --outdir parameter specifies where the pipeline output files will be stored. Its behavior depends on how the path is defined:

If a relative folder name is provided (e.g., results), the directory will be created in the current working directory from which the pipeline is launched.
If an absolute path is given (e.g., /home/user/project/ribomsqc_output), the output will be created exactly at the specified location.

Tip

Use absolute paths in scripts or production workflows to ensure consistent and predictable file placement, especially when running from different directories or via automation.

Running the pipeline

Typical command:

nextflow run nf-core/ribomsqc \
  --input /home/proteomics/mydata/csv/samplesheet.csv \
  --analytes_tsv /home/proteomics/mydata/tsv/qcn1.tsv \
  --analyte m3C \
  --rt_tolerance 150 \
  --mz_tolerance 20 \
  --ms_level 2 \
  --plot_xic_ms1 false \
  --plot_xic_ms2 false \
  --plot_output_path xic_plot \
  --overwrite_tsv true \
  --outdir results \
  -profile singularity

Alternatively, you can define parameters in a separate file:

Minimal `params.yaml` example

params.yaml

input: /home/proteomics/mydata/csv/samplesheet.csv
analytes_tsv: /home/proteomics/mydata/tsv/qcn1.tsv
analyte: m3C
rt_tolerance: 150
mz_tolerance: 20
ms_level: 2
plot_xic_ms1: false
plot_xic_ms2: false
plot_xic_ms2: false
plot_output_path: xic_plot
overwrite_tsv: true
outdir: results

Run with:

nextflow run nf-core/ribomsqc -profile singularity -params-file params.yaml

MultiQC Integration

The pipeline integrates MultiQC. It collects the consolidated .json files generated by the MERGEJSONS module to summarise QC metrics. Output is stored in ${params.outdir}.

Reproducibility

Use -r to specify a pipeline version:

nextflow run nf-core/ribomsqc -r 1.0.1dev ...

Update the pipeline:

nextflow pull nf-core/ribomsqc

Core Nextflow options

-profile docker|singularity|conda|podman|...: Choose execution environment
-resume: Resume from a previous run
-params-file: Load parameters from a YAML/JSON file
-c: Load additional config for cluster resources, etc.

Tips

Use -profile singularity for reproducibility
Compatible with Wave containers for dynamic container resolution
For cluster configs, see nf-core/configs

Example output

results/
├── thermorawfileparser/
│   └── *.mzML
├── msnbasexic/
│   └── *.json
├── mergejsons/
│   └── *_merged_mqc.json
├── multiqc/
│   └── multiqc_report.html
└── pipeline_info/
    └── nf_core_ribomsqc_software_versions.yml

Generated with ❤️ by nf-core and adapted for custom analyte QC workflows.

On this page

nf-core/ribomsqc Edit

Introduction

Samplesheet input

Column descriptions

Analytes TSV input

Column Descriptions

Column Descriptions

Notes

Output directory (--outdir)

Running the pipeline

Minimal params.yaml example

MultiQC Integration

Reproducibility

Core Nextflow options

Tips

Example output

nf-core/ribomsqc
Edit

Output directory (`--outdir`)

Minimal `params.yaml` example