Introduction

nf-core/ribomsqc is a quality control (QC) pipeline for monitoring mass spectrometry performance in ribonucleoside analysis. It converts RAW files to mzML, extracts ion chromatograms (XICs) for the analytes you define, and generates summary plots with MultiQC.

Samplesheet input

Before running the pipeline, you must provide a samplesheet CSV file containing metadata about the RAW files to be processed. The pipeline supports one or multiple samples, each defined on a separate line in the CSV.

--input '[path to samplesheet.csv]'

The samplesheet must contain a header and two columns:

samplesheet.csv
id,raw_file
Day_5,/full/path/to/Day_5.raw
Sample_XYZ,/another/path/to/Sample_XYZ.raw

Column descriptions

  • id: Unique identifier for the sample. This can be any descriptive label and is used to name output files.
  • raw_file: Full path to the corresponding RAW instrument file. Multiple entries can be included to analyze several samples in one pipeline run.

Note

The Day_5 and Sample_XYZ rows above are only examples. Replace them with your actual sample IDs and file paths.
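
For runs with many RAW files, the samplesheet can be generated rather than typed by hand. The following is a minimal sketch, not part of the pipeline: the helper name build_samplesheet is hypothetical, and it assumes that all RAW files sit in one directory, end in a lowercase .raw extension, and that the file name without extension is an acceptable sample id.

```python
import csv
from pathlib import Path


def build_samplesheet(raw_dir, out_csv):
    """Write an id,raw_file samplesheet for every .raw file in raw_dir."""
    raw_files = sorted(Path(raw_dir).glob("*.raw"))
    if not raw_files:
        raise FileNotFoundError(f"no .raw files found in {raw_dir}")

    with open(out_csv, "w", newline="") as handle:
        writer = csv.writer(handle)
        writer.writerow(["id", "raw_file"])  # header row with the two required columns
        for raw in raw_files:
            # Assumption: the file name without extension is used as the sample id.
            writer.writerow([raw.stem, str(raw.resolve())])
    return len(raw_files)
```

Calling build_samplesheet("/full/path/to", "samplesheet.csv") would write one row per RAW file with its absolute path, which can then be passed to --input.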

Analytes TSV input

To define which compounds are targeted for XIC extraction, the pipeline requires a tab-delimited (TSV) file, passed with:

--analytes_tsv '[path to analytes.tsv]'

This file must follow a strict format and include the following columns in this exact order:

short_name  long_name                                mz_M0     mz_M1  mz_M2  ms2_mz    rt_teoretical
C           Cytidine 50 μg/mL                        244.0928                112.0505  555
U           Uridine 25 μg/mL                         245.0768                113.0346  1566
m3C         3-Methylcytidine methosulfate 100 μg/mL  258.1084                126.0662  508
m5C         5-Methylcytidine 100 μg/mL               258.1084                126.0662  655
Cm          2-O-Methylcytidine 20 μg/mL              258.1084                112.0505  883
m5U         5-Methyluridine 50 μg/mL                 259.0925                127.0502  1866
I           Inosine 25 μg/mL                         269.088                 137.0458  1741
m1A         1-Methyladenosine 25 μg/mL               282.1197                150.0774  523
G           Guanosine 25 μg/mL                       284.0989                152.0567  1726
m7G         7-Methylguanosine 25 μg/mL               298.1146                166.0723  554

Column Descriptions

Column          Description
short_name      Unique short identifier for the compound (used internally by the pipeline).
long_name       Full descriptive name of the analyte, optionally including concentration.
mz_M0           Monoisotopic mass-to-charge ratio. Required.
mz_M1 / mz_M2   Optional isotopic variant m/z values. Reserved for future versions.
ms2_mz          Fragment ion used for MS2-level extraction, if applicable.
rt_teoretical   Expected retention time (in seconds). You must customize this value.

Notes

  • The retention times in rt_teoretical must reflect your instrument’s chromatography performance.
  • If multiple transitions are known, you can fill in mz_M1, mz_M2, and ms2_mz for more targeted detection.
  • At least one compound must be selected at runtime using the --analyte parameter.
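
Because the column order is strict and the --analyte value must match an entry in the file, it can help to check the TSV before launching a run. The sketch below is an illustration only: check_analytes and EXPECTED_COLUMNS are hypothetical names, and the checks (exact header, unique short_name, numeric mz_M0 and rt_teoretical) are assumptions drawn from the column descriptions above rather than the pipeline's own validation.

```python
import csv

# Column names in the order required by the analytes TSV.
EXPECTED_COLUMNS = [
    "short_name", "long_name", "mz_M0", "mz_M1", "mz_M2", "ms2_mz", "rt_teoretical",
]


def check_analytes(tsv_path, analyte):
    """Return the row for `analyte` after basic format checks."""
    with open(tsv_path, newline="", encoding="utf-8") as handle:
        reader = csv.DictReader(handle, delimiter="\t")
        if reader.fieldnames != EXPECTED_COLUMNS:
            raise ValueError(f"unexpected header: {reader.fieldnames}")
        rows = {}
        for row in reader:
            name = row["short_name"]
            if name in rows:
                raise ValueError(f"duplicate short_name: {name}")
            float(row["mz_M0"])          # required; raises if empty or non-numeric
            float(row["rt_teoretical"])  # expected retention time in seconds
            rows[name] = row
    if analyte not in rows:
        raise KeyError(f"{analyte} is not defined in {tsv_path}")
    return rows[analyte]
```

For example, check_analytes("analytes.tsv", "m3C") returns the m3C row as a dictionary, or raises an error if the header, a numeric field, or the analyte name does not pass these checks.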

Output directory (--outdir)

The --outdir parameter specifies where the pipeline output files will be stored. Its behavior depends on how the path is defined:

  • If a relative folder name is provided (e.g., results), the directory will be created in the current working directory from which the pipeline is launched.
  • If an absolute path is given (e.g., /home/user/project/ribomsqc_output), the output will be created exactly at the specified location.

Tip

Use absolute paths in scripts or production workflows to ensure consistent and predictable file placement, especially when running from different directories or via automation.
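
The two cases described above can be illustrated with a small helper. This is only a sketch of the path logic, not pipeline code, and resolve_outdir is a hypothetical name.

```python
from pathlib import Path


def resolve_outdir(outdir, launch_dir):
    """Return the location an --outdir value points to for a given launch directory."""
    path = Path(outdir)
    # Absolute paths are used as given; relative ones are placed under the launch directory.
    return path if path.is_absolute() else Path(launch_dir) / path
```

With a launch directory of /home/user/project, the value results maps to /home/user/project/results, while /home/user/project/ribomsqc_output is returned unchanged wherever the pipeline is launched from.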

Running the pipeline

Typical command:

nextflow run nf-core/ribomsqc \
  --input /home/proteomics/mydata/csv/samplesheet.csv \
  --analytes_tsv /home/proteomics/mydata/tsv/qcn1.tsv \
  --analyte m3C \
  --rt_tolerance 150 \
  --mz_tolerance 20 \
  --ms_level 2 \
  --plot_xic_ms1 false \
  --plot_xic_ms2 false \
  --plot_output_path xic_plot \
  --overwrite_tsv true \
  --outdir results \
  -profile singularity

Alternatively, you can define parameters in a separate file:

Minimal params.yaml example

params.yaml
input: /home/proteomics/mydata/csv/samplesheet.csv
analytes_tsv: /home/proteomics/mydata/tsv/qcn1.tsv
analyte: m3C
rt_tolerance: 150
mz_tolerance: 20
ms_level: 2
plot_xic_ms1: false
plot_xic_ms2: false
plot_output_path: xic_plot
overwrite_tsv: true
outdir: results

Run with:

nextflow run nf-core/ribomsqc -profile singularity -params-file params.yaml
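
When runs are launched from a script, the parameters file can be written programmatically. The sketch below writes JSON instead of YAML (the -params-file option accepts either) so that only the Python standard library is needed. The helper names write_params and nextflow_command are hypothetical.

```python
import json


def write_params(params, path="params.json"):
    """Save pipeline parameters as JSON for use with -params-file."""
    with open(path, "w", encoding="utf-8") as handle:
        json.dump(params, handle, indent=2)
    return path


def nextflow_command(params_file, profile="singularity"):
    """Build the argument list for launching the pipeline."""
    return ["nextflow", "run", "nf-core/ribomsqc",
            "-profile", profile, "-params-file", params_file]
```

A dictionary with the same keys as the params.yaml example (input, analytes_tsv, analyte, and so on) can be passed to write_params, and the list returned by nextflow_command can then be handed to subprocess.run(..., check=True) to start the run.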

MultiQC Integration

The pipeline integrates MultiQC, which collects the consolidated .json files generated by the MERGEJSONS module to summarise QC metrics in a single report. The report is written to ${params.outdir}/multiqc/.

Reproducibility

Use -r to specify a pipeline version:

nextflow run nf-core/ribomsqc -r 1.0.1dev ...

Update the pipeline:

nextflow pull nf-core/ribomsqc

Core Nextflow options

  • -profile docker|singularity|conda|podman|...: Choose execution environment
  • -resume: Resume from a previous run
  • -params-file: Load parameters from a YAML/JSON file
  • -c: Load additional config for cluster resources, etc.

Tips

  • Use -profile singularity for reproducibility
  • Compatible with Wave containers for dynamic container resolution
  • For cluster configs, see nf-core/configs

Example output

results/
├── thermorawfileparser/
│   └── *.mzML
├── msnbasexic/
│   └── *.json
├── mergejsons/
│   └── *_merged_mqc.json
├── multiqc/
│   └── multiqc_report.html
└── pipeline_info/
    └── nf_core_ribomsqc_software_versions.yml
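
A quick way to confirm that a run produced the layout shown above is to test each expected pattern against the output directory. This is a minimal sketch: missing_outputs is a hypothetical helper, and the patterns are copied from the example tree, so they may need adjusting if your pipeline version names files differently.

```python
from pathlib import Path

# Glob patterns taken from the example output tree.
EXPECTED_OUTPUTS = [
    "thermorawfileparser/*.mzML",
    "msnbasexic/*.json",
    "mergejsons/*_merged_mqc.json",
    "multiqc/multiqc_report.html",
    "pipeline_info/nf_core_ribomsqc_software_versions.yml",
]


def missing_outputs(outdir):
    """Return the expected patterns that match no file under outdir."""
    root = Path(outdir)
    return [pattern for pattern in EXPECTED_OUTPUTS if not any(root.glob(pattern))]
```

An empty list from missing_outputs("results") means every expected output was found. Any returned pattern points to the step whose output is missing.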

Generated with ❤️ by nf-core and adapted for custom analyte QC workflows.