nf-core/mhcquant
Identify and quantify MHC eluted peptides from mass spectrometry raw data
1.2.3). The latest stable release is 3.2.0.
Output
This document describes the output produced by the pipeline.
Pipeline overview
The final output of the pipeline should include the following files:
- mzTab - the community standard format for sharing mass spectrometry search results
- csv - aggregate csv report, containing all information about peptide identification and quantification results
mzTab
mzTab is a lightweight format for reporting mass spectrometry search results. It provides all important information about identified peptide hits and is compatible with the PRIDE Archive proteomics data repository:
Griss, J. et al. The mzTab Data Exchange Format: Communicating Mass-spectrometry-based Proteomics and Metabolomics Experimental Results to a Wider Audience. Mol Cell Proteomics 13, 2765–2775 (2014).
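As a rough illustration of how such a file can be read without dedicated tooling: mzTab is tab-separated, and each line starts with a short prefix (for example MTD for metadata, PSH for the PSM header and PSM for the PSM rows). The sketch below collects PSM rows into dictionaries. The function name read_psms and the sample lines are made up for illustration; real files contain many more columns and sections.

```python
import csv
import io


def read_psms(handle):
    """Return the PSM rows of an mzTab file as dicts keyed by the PSH header."""
    header = None
    psms = []
    for fields in csv.reader(handle, delimiter="\t", quoting=csv.QUOTE_NONE):
        if not fields:
            continue  # blank lines separate mzTab sections
        if fields[0] == "PSH":
            header = fields[1:]  # column names for the PSM section
        elif fields[0] == "PSM" and header is not None:
            psms.append(dict(zip(header, fields[1:])))
    return psms


# Tiny made-up example, not real pipeline output.
example = (
    "MTD\tmzTab-version\t1.0.0\n"
    "\n"
    "PSH\tsequence\tPSM_ID\tcharge\n"
    "PSM\tSIINFEKL\t1\t2\n"
    "PSM\tGILGFVFTL\t2\t2\n"
)
psms = read_psms(io.StringIO(example))
print([p["sequence"] for p in psms])
```

All values are kept as strings here; converting numeric columns is left to the caller.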
csv
The csv output file is a table containing all information extracted from a database search throughout the pipeline. See the OpenMS or PSI documentation for more information about annotated scores and format (http://ftp.mi.fu-berlin.de/pub/OpenMS/release1.9-documentation/html/TOPP_TextExporter.html).
Each row begins with a label describing its content:

#MAP id filename label size
#MAP contains information about the different mzML files that were provided initially.

#RUN run_id score_type score_direction date_time search_engine_version parameters
#RUN contains information about the search that was performed on each run.

#PROTEIN score rank accession protein_description coverage sequence
#PROTEIN contains information about the protein ids corresponding to the peptides that were detected (no protein inference was performed).

#UNASSIGNEDPEPTIDE rt mz score rank sequence charge aa_before aa_after score_type search_identifier accessions FFId_category feature_id file_origin map_index spectrum_reference COMET:IonFrac COMET:deltCn COMET:deltLCn COMET:lnExpect COMET:lnNumSP COMET:lnRankSP MS:1001491 MS:1001492 MS:1001493 MS:1002252 MS:1002253 MS:1002254 MS:1002255 MS:1002256 MS:1002257 MS:1002258 MS:1002259 num_matched_peptides protein_references target_decoy
#UNASSIGNEDPEPTIDE contains information about PSMs that were identified but could not be assigned to a precursor feature on MS level 1 for quantification.

#CONSENSUS rt_cf mz_cf intensity_cf charge_cf width_cf quality_cf rt_0 mz_0 intensity_0 charge_0 width_0 rt_1 mz_1 intensity_1 charge_1 width_1 rt_2 mz_2 intensity_2 charge_2 width_2 rt_3 mz_3 intensity_3 charge_3 width_3
#CONSENSUS contains information about precursor features that were identified in multiple runs (e.g. run 1-3 in this case).

#PEPTIDE rt mz score rank sequence charge aa_before aa_after score_type search_identifier accessions FFId_category fea
#PEPTIDE contains information about peptide hits that were identified and correspond to the consensus features described one row above.
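Because several row types with different columns share one file, it can help to split the file by label before analysis. The following is a minimal sketch under two assumptions that should be checked against an actual output file: that fields are tab-separated (exposed here as the sep parameter), and that data rows carry the same label as their header row without the leading #. The function name parse_labelled_rows and the sample values are hypothetical.

```python
import csv
import io
from collections import defaultdict


def parse_labelled_rows(handle, sep="\t"):
    """Group rows by their leading label, keyed by the matching '#LABEL' header."""
    headers = {}
    rows = defaultdict(list)
    for fields in csv.reader(handle, delimiter=sep, quoting=csv.QUOTE_NONE):
        if not fields:
            continue  # skip empty lines
        label = fields[0]
        if label.startswith("#"):
            # Header row: remember the column names for this label.
            headers[label[1:]] = fields[1:]
        elif label in headers:
            # Data row: pair each value with its column name.
            rows[label].append(dict(zip(headers[label], fields[1:])))
    return dict(rows)


# Made-up miniature example with only a few of the columns listed above.
example = (
    "#MAP\tid\tfilename\tlabel\tsize\n"
    "MAP\t0\tsample_a.mzML\t\t120\n"
    "#PEPTIDE\trt\tmz\tscore\tsequence\n"
    "PEPTIDE\t1520.3\t482.27\t0.01\tSIINFEKL\n"
)
tables = parse_labelled_rows(io.StringIO(example))
print(tables["MAP"][0]["filename"])
print(tables["PEPTIDE"][0]["sequence"])
```

Rows whose label has no preceding header are ignored in this sketch, so a wrong separator simply yields an empty result and not a wrong table.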
Predictions
The prediction output is a comma-separated table (csv) for each allele, listing each peptide sequence and its corresponding predicted affinity scores:
peptide allele prediction prediction_low prediction_high prediction_percentile
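A table with these columns can be filtered with the standard library alone. The sketch below keeps peptides whose prediction_percentile is at or below a cutoff. It assumes that a lower percentile indicates stronger predicted binding, which should be confirmed for the predictor in use. The cutoff of 2.0, the function name filter_by_percentile and the sample rows are illustrative choices, not values taken from the pipeline.

```python
import csv
import io


def filter_by_percentile(handle, max_percentile=2.0):
    """Keep rows whose prediction_percentile is <= max_percentile (hypothetical cutoff)."""
    kept = []
    for row in csv.DictReader(handle):
        if float(row["prediction_percentile"]) <= max_percentile:
            kept.append(row)
    return kept


# Made-up rows using the column names listed above.
example = (
    "peptide,allele,prediction,prediction_low,prediction_high,prediction_percentile\n"
    "SIINFEKL,A*02:01,12000.0,9000.0,15000.0,8.5\n"
    "GILGFVFTL,A*02:01,25.0,18.0,33.0,0.1\n"
)
binders = filter_by_percentile(io.StringIO(example))
print([(row["peptide"], row["prediction"]) for row in binders])
```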