Introduction

This document describes the output produced by the pipeline.

The directories listed below will be created in the results directory after the pipeline has finished. All paths are relative to the top-level results directory.
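As a quick sanity check after a run, the directory layout documented in the sections below can be inspected with a short script. This is only a sketch: the directory names and file patterns are copied from this document, while the `results` location and the helper name `summarize_outputs` are assumptions made for illustration. A missing directory is reported rather than treated as an error.

```python
from pathlib import Path

# Directory name -> file pattern, copied from the sections of this document.
EXPECTED_OUTPUTS = {
    "centroided": "*.mzML",
    "quantification": "*.featureXML",
    "requantification": "*.featureXML",
    "requantification_merged": "*.featureXML",
    "annotation": "*.featureXML",
    "alignment": "*.featureXML",
    "alignment_mzml": "*.mzML",
    "linking": "*.consensusXML",
}


def summarize_outputs(results_dir):
    """Return {directory: matching file count, or None if the directory is absent}."""
    results_dir = Path(results_dir)
    summary = {}
    for name, pattern in EXPECTED_OUTPUTS.items():
        directory = results_dir / name
        if directory.is_dir():
            summary[name] = len(list(directory.glob(pattern)))
        else:
            summary[name] = None
    return summary


if __name__ == "__main__":
    # "results" is an assumed path; replace it with your own results directory.
    for name, count in summarize_outputs("results").items():
        status = "missing" if count is None else f"{count} file(s)"
        print(f"{name}/: {status}")
```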

Pipeline overview

The pipeline consists of several steps, each of which produces multiple file types.

Preprocessing

Centroiding

Output files
  • centroided/
    • *.mzML: Centroided mzML files.

Quantification

Quantification

Output files
  • quantification/
    • *.featureXML: Mass traces in featureXML format.

Requantification

Output files
  • requantification/
    • *.featureXML: Mass traces in featureXML format.
  • requantification_merged/
    • *.featureXML: Mass traces in featureXML format. These files merge the features from quantification and requantification.

Annotation

Output files
  • annotation/
    • *.featureXML: Mass traces in featureXML format including adduct information.

Alignment and linking

Alignment

Output files
  • alignment/
    • *.featureXML: Time-aligned mass traces in featureXML format.
  • alignment_mzml/
    • *.mzML: Time-aligned mzML files.

Linking

Output files
  • linking/
    • *.consensusXML: Linked consensus traces in consensusXML format.

Expression output

The most important outputs are the TSV files produced by the quantification and identification steps.

These can be found under TABLE_OUTPUT.

Depending on the pipeline parameters, this directory contains between one and four files:

  • the file starting with output_sirius_ contains the formula identification by SIRIUS
  • the file starting with output_fingerid_ contains the structural identification by FINGERID
  • the file starting with output_ms2query_ contains the analogue identification by MS2Query
  • the last file starting with output_quantification_ contains quantification information

All files are tab-separated values (TSV) files and have a column called “id”. This ID can be used to match rows across different files.
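The matching on “id” can be sketched with the Python standard library. The example below is a minimal illustration, not part of the pipeline: the `results/TABLE_OUTPUT` location, the helper names and the `sirius_` column prefix are assumptions, and it also assumes that each “id” value appears at most once per file (if an ID is repeated, only the last row is kept).

```python
import csv
from pathlib import Path


def read_tsv_by_id(path):
    """Read a tab-separated file and index its rows by the "id" column.

    Assumes each id occurs at most once; later duplicates overwrite earlier rows.
    """
    with open(path, newline="", encoding="utf-8") as handle:
        reader = csv.DictReader(handle, delimiter="\t")
        return {row["id"]: row for row in reader}


def merge_on_id(quant_rows, annotation_rows, prefix):
    """Left-join annotation columns onto quantification rows using "id"."""
    merged = {}
    for row_id, quant in quant_rows.items():
        combined = dict(quant)
        for column, value in annotation_rows.get(row_id, {}).items():
            if column != "id":
                # Prefix annotation columns so they cannot clash with existing ones.
                combined[f"{prefix}{column}"] = value
        merged[row_id] = combined
    return merged


def find_table(table_dir, prefix):
    """Return the first file in table_dir whose name starts with prefix, or None."""
    table_dir = Path(table_dir)
    if not table_dir.is_dir():
        return None
    matches = sorted(table_dir.glob(f"{prefix}*"))
    return matches[0] if matches else None


if __name__ == "__main__":
    table_dir = "results/TABLE_OUTPUT"  # assumed location of the TSV files
    quant_file = find_table(table_dir, "output_quantification_")
    sirius_file = find_table(table_dir, "output_sirius_")
    if quant_file and sirius_file:
        merged = merge_on_id(
            read_tsv_by_id(quant_file), read_tsv_by_id(sirius_file), "sirius_"
        )
        print(f"{len(merged)} quantified rows, with SIRIUS columns where available")
    else:
        print("Quantification or SIRIUS table not found")
```

A left join from the quantification table is used here so that every quantified row is kept, whether or not an identification exists for it.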

Pipeline information

Output files
  • pipeline_info/
    • Reports generated by Nextflow: execution_report.html, execution_timeline.html, execution_trace.txt and pipeline_dag.dot/pipeline_dag.svg.
    • Reports generated by the pipeline: pipeline_report.html, pipeline_report.txt and software_versions.yml. The pipeline_report* files will only be present if the --email / --email_on_fail parameters are used when running the pipeline.
    • Reformatted samplesheet files used as input to the pipeline: samplesheet.valid.csv.

Nextflow generates several reports about the running and execution of the pipeline. These allow you to troubleshoot errors, and they also provide other information such as launch commands, run times and resource usage.
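For troubleshooting, the trace file can also be summarised programmatically. The sketch below assumes that execution_trace.txt is tab-separated with a header row and contains Nextflow's default `name` and `status` columns; the `results/pipeline_info` path and the function names are illustrative assumptions.

```python
import csv
from collections import Counter
from pathlib import Path


def read_trace(trace_path):
    """Read a Nextflow trace file (tab-separated, with a header row)."""
    with open(trace_path, newline="", encoding="utf-8") as handle:
        return list(csv.DictReader(handle, delimiter="\t"))


def count_task_statuses(trace_path):
    """Count tasks per value of the "status" column (for example COMPLETED, FAILED)."""
    return Counter(row["status"] for row in read_trace(trace_path))


def failed_task_names(trace_path):
    """Return the names of tasks whose status is FAILED, in file order."""
    return [row["name"] for row in read_trace(trace_path) if row["status"] == "FAILED"]


if __name__ == "__main__":
    # Assumed path, relative to the directory the pipeline was launched from.
    trace = Path("results/pipeline_info/execution_trace.txt")
    if trace.is_file():
        for status, count in count_task_statuses(trace).items():
            print(f"{status}: {count}")
        for name in failed_task_names(trace):
            print(f"failed task: {name}")
    else:
        print(f"{trace} not found")
```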