nf-core/diaproteomics
Automated quantitative analysis of DIA proteomics mass spectrometry measurements.
22.10.6
Define where the pipeline should find input data and save output data.
Input sample sheet (containing path and meta data of raw or mzML files)
string
Use this to specify a sample sheet table listing your input raw or mzML files together with their metainformation, such as BatchID, MSstats_Condition and MSstats_BioReplicate (note that the MSstats_BioReplicate column is optional). For example:
| Sample | BatchID | MSstats_Condition | MSstats_BioReplicate | Spectra_Filepath |
| -----|:------------:| ----------:|----------:|------------------------------------------:|
| 1 | MelanomaStudy | Malignant | BioReplicate1 | data/Melanoma_DIA_standard_rep1.raw |
| 2 | MelanomaStudy | Malignant | BioReplicate1 | data/Melanoma_DIA_standard_rep2.raw |
| 3 | MelanomaStudy | Benign | BioReplicate2 | data/SkinTissue_DIA_standard_rep1.raw |
| 4 | MelanomaStudy | Benign | BioReplicate2 | data/SkinTissue_DIA_standard_rep2.raw |
| 5 | BreastCancerStudy | Malignant | BioReplicate1 | data/BreastCancer_DIA_standard_rep1.raw |
| 6 | BreastCancerStudy | Malignant | BioReplicate1 | data/BreastCancer_DIA_standard_rep2.raw |
| 7 | BreastCancerStudy | Benign | BioReplicate2 | data/BreastTissue_DIA_standard_rep1.raw |
| 8 | BreastCancerStudy | Benign | BioReplicate2 | data/BreastTissue_DIA_standard_rep2.raw |
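A sheet like the one above can be checked before the pipeline is launched. The following is a minimal sketch, assuming the sheet is stored as tab-separated text; the helper `check_sample_sheet` is hypothetical and not part of the pipeline.

```python
import csv
import io

# Column names taken from the example table above; MSstats_BioReplicate is optional.
REQUIRED_COLUMNS = ["Sample", "BatchID", "MSstats_Condition", "Spectra_Filepath"]

def check_sample_sheet(handle):
    """Return the parsed rows, raising ValueError if a required column or value is missing."""
    reader = csv.DictReader(handle, delimiter="\t")
    header = reader.fieldnames or []
    missing = [name for name in REQUIRED_COLUMNS if name not in header]
    if missing:
        raise ValueError(f"missing columns: {missing}")
    rows = list(reader)
    for line_number, row in enumerate(rows, start=2):
        for name in REQUIRED_COLUMNS:
            # Short rows yield None for absent fields, so guard before stripping.
            if not (row[name] or "").strip():
                raise ValueError(f"line {line_number}: empty value for {name}")
    return rows

# Two rows from the example table, written as tab-separated text (assumed layout).
example = (
    "Sample\tBatchID\tMSstats_Condition\tMSstats_BioReplicate\tSpectra_Filepath\n"
    "1\tMelanomaStudy\tMalignant\tBioReplicate1\tdata/Melanoma_DIA_standard_rep1.raw\n"
    "3\tMelanomaStudy\tBenign\tBioReplicate2\tdata/SkinTissue_DIA_standard_rep1.raw\n"
)
rows = check_sample_sheet(io.StringIO(example))
print(len(rows), "rows OK")
```

The check only looks at the header and at empty cells; it does not verify that the listed files exist.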
Input sample sheet of spectral libraries (tsv, pqp, TraML)
string
Use this to specify a sample sheet table listing your input spectral library files together with their BatchID metainformation. For example:
| Sample | BatchID | Library_Filepath |
| -----|:------------:|------------------------------------------:|
| 1 | MelanomaStudy | data/Melanoma_library.tsv |
| 2 | BreastCancerStudy | data/BraCa_library.tsv |
Path to internal retention time standard sample sheet (tsv, pqp, TraML)
string
Use this to specify a sample sheet table listing your input internal retention time (iRT) standard library files together with their BatchID metainformation. For example:
| Sample | BatchID | irt_Filepath |
| -----|:------------:|------------------------------------------:|
| 1 | MelanomaStudy | data/Melanoma_irt_library.tsv |
| 2 | BreastCancerStudy | data/BraCa_irt_library.tsv |
The output directory where the results will be saved.
string
./results
Email address for completion summary.
string
^([a-zA-Z0-9_\-\.]+)@([a-zA-Z0-9_\-\.]+)\.([a-zA-Z]{2,5})$
Set this parameter to your e-mail address to get a summary e-mail with details of the run sent to you when the workflow exits. If set in your user config file (~/.nextflow/config) then you don't need to specify this on the command line for every run.
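The pattern listed for this parameter can be tried out before a run. This is a small sketch using Python's `re` module for illustration only; the addresses are made up, and the pipeline's own schema validator may behave differently in edge cases.

```python
import re

# Pattern copied verbatim from the parameter definition above.
EMAIL_PATTERN = re.compile(r"^([a-zA-Z0-9_\-\.]+)@([a-zA-Z0-9_\-\.]+)\.([a-zA-Z]{2,5})$")

def is_valid_email(address):
    """True if the address matches the pattern given for the email parameter."""
    return EMAIL_PATTERN.match(address) is not None

print(is_valid_email("jane.doe@example.org"))      # plain address
print(is_valid_email("jane.doe@example"))          # no top-level domain
print(is_valid_email("jane@example.technology"))   # top-level domain longer than 5 letters
```

Note that the pattern limits the top-level domain to between two and five letters, so some otherwise valid addresses are rejected.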
Set this flag if the spectral library should be generated using EasyPQP from provided DDA data - identification search results and corresponding raw data.
boolean
Input sample sheet to use for library generation, e.g. DDA raw data (mzML) and DDA identification data (pepXML, mzid, idXML)
string
Use this to specify a sample sheet table listing your input DDA raw or mzML files together with their corresponding peptide identification files and BatchID metainformation. For example:
| Sample | BatchID | Spectra_Filepath | Id_Filepath |
| -----|:------------:| ----------:|------------------------------------------:|
| 1 | MelanomaStudy | data/Melanoma_DDA_rep1.mzML | data/Melanoma_DDA_rep1.pepXML |
| 2 | MelanomaStudy | data/Melanoma_DDA_rep2.mzML | data/Melanoma_DDA_rep2.pepXML |
| 3 | BreastCancerStudy | data/BraCa_DDA_rep1.mzML | data/BraCa_DDA_rep1.pepXML |
| 4 | BreastCancerStudy | data/BraCa_DDA_rep2.mzML | data/BraCa_DDA_rep2.pepXML |
PSM FDR threshold to align peptide IDs with the reference run.
number
0.01
Minimum number of transitions for assay
integer
4
Maximum number of transitions for assay
integer
6
Method for generating decoys
string
Set this flag if you are using a spectral library that already includes decoy sequences, so that assay and decoy generation are skipped.
boolean
Path to the Unimod file (must be provided)
string
https://raw.githubusercontent.com/nf-core/test-datasets/diaproteomics/unimod.xml
Example file:
https://raw.githubusercontent.com/nf-core/test-datasets/diaproteomics/unimod.xml
Set this flag if you only want to generate spectral libraries from DDA data
boolean
Set this flag if pseudo internal retention time standards should be generated using EasyPQP from provided DDA data - identification search results and corresponding raw data.
boolean
Number of pseudo iRTs selected from DDA identifications based on the best q-value
integer
250
Set this flag if pseudo iRTs should be selected from the 1st and 4th RT quantile only
boolean
Set this flag if the libraries defined in the input or by generation should be merged according to the BatchID
boolean
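To illustrate what merging "according to the BatchID" groups together, here is a minimal sketch that collects the samples of the DDA sample sheet example by batch. The function `group_by_batch` is hypothetical; how the pipeline actually performs the merge is not shown.

```python
from collections import defaultdict

# (Sample, BatchID) pairs from the DDA sample sheet example above.
samples = [
    ("1", "MelanomaStudy"),
    ("2", "MelanomaStudy"),
    ("3", "BreastCancerStudy"),
    ("4", "BreastCancerStudy"),
]

def group_by_batch(pairs):
    """Collect sample identifiers under their BatchID, preserving input order."""
    groups = defaultdict(list)
    for sample, batch_id in pairs:
        groups[batch_id].append(sample)
    return dict(groups)

merge_groups = group_by_batch(samples)
for batch_id, members in merge_groups.items():
    print(batch_id, "->", members)
```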
Set this flag if pairwise RT alignment should be applied to libraries when merging.
boolean
Minimum number of peptides to compute RT alignment during pairwise merging of libraries
integer
100
Mass tolerance for transition extraction (ppm)
integer
30
Unit for m/z window
string
ppm
Mass tolerance for precursor transition extraction (ppm)
integer
10
Unit for m/z window
string
ppm
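Because both tolerances are relative (ppm), the absolute width in Thomson depends on the m/z at which they are applied. The sketch below shows only the unit conversion; the m/z values are illustrative assumptions, and whether the tool treats the value as the full window or as a half-width is not stated here.

```python
def ppm_to_thomson(mz, ppm):
    """Convert a relative tolerance in ppm to an absolute m/z difference at the given m/z."""
    return mz * ppm / 1_000_000

# Illustrative m/z values (assumptions, not taken from the pipeline).
fragment_mz = 500.0
precursor_mz = 800.0

print(ppm_to_thomson(fragment_mz, 30))    # default transition tolerance of 30 ppm
print(ppm_to_thomson(precursor_mz, 10))   # default precursor tolerance of 10 ppm
```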
RT window for transition extraction (seconds)
integer
600
Minimum R-squared (coefficient of determination) required for the iRT RT alignment
number
0.95
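Assuming this threshold is the coefficient of determination (R²) of the fit between library iRT values and observed retention times, which the default of 0.95 suggests, the check can be sketched as follows. The data points are made up, and a simple least-squares line stands in for whatever alignment method is configured.

```python
def r_squared(xs, ys):
    """Coefficient of determination of an ordinary least-squares line fitted to (xs, ys)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    ss_res = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return 1.0 - ss_res / ss_tot

# Made-up library iRT values and observed retention times in seconds.
library_irt = [0.0, 20.0, 40.0, 60.0, 80.0, 100.0]
observed_rt = [610.0, 1195.0, 1830.0, 2390.0, 3020.0, 3600.0]

MIN_RSQ = 0.95  # default threshold listed above
rsq = r_squared(library_irt, observed_rt)
print(rsq >= MIN_RSQ)
```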
Number of bins defined for the RT Normalization
integer
10
Number of bins that have to be covered for the RT Normalization
integer
8
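The two bin parameters together express a coverage requirement: the iRT peptides should be spread across the gradient rather than clustered. A minimal sketch, assuming equal-width bins over the gradient and made-up retention times (the exact binning used by the underlying tool may differ):

```python
def covered_bins(rts, rt_min, rt_max, n_bins):
    """Count how many of n_bins equal-width bins over [rt_min, rt_max] contain at least one value."""
    width = (rt_max - rt_min) / n_bins
    hit = set()
    for rt in rts:
        if rt_min <= rt <= rt_max:
            # Clamp so that rt == rt_max falls into the last bin.
            hit.add(min(int((rt - rt_min) / width), n_bins - 1))
    return len(hit)

N_BINS = 10       # default number of bins listed above
MIN_COVERED = 8   # default number of bins that must be covered

# Made-up retention times (seconds) of detected iRT peptides on a 0-6000 s gradient.
irt_rts = [150, 700, 1300, 1900, 2500, 3100, 3700, 4900, 5500]
n_covered = covered_bins(irt_rts, 0, 6000, N_BINS)
print(n_covered, n_covered >= MIN_COVERED)
```

In this example one bin (4200 to 4800 s) is empty, so nine of ten bins are covered and the requirement of eight is met.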
Method for irt RT alignment for example
string
Force the analysis of the OpenSwathWorkflow despite severe warnings
boolean
Whether to use ms1 information for scoring and extraction
boolean
true
Minimal distance to the upper edge of a Swath window to still consider a precursor, in Thomson
integer
Set mode whether to work in memory or to store data as cache first
string
Machine learning classifier used for pyprophet target / decoy separation
string
MS Level of pyprophet FDR calculation
string
Abstraction level of pyprophet FDR calculation
string
Threshold for pyprophet FDR filtering on peakgroup abstraction level
number
0.01
Threshold for pyprophet FDR filtering on peptide abstraction level
number
0.01
Threshold for pyprophet FDR filtering on protein abstraction level
number
0.01
Start for pyprophet non-parametric pi0 estimation
number
0.1
End for pyprophet non-parametric pi0 estimation
number
0.5
Steps for pyprophet non-parametric pi0 estimation
number
0.05
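The start, end and step values define a grid of lambda values at which the proportion of null hypotheses (pi0) is estimated. The sketch below builds that grid and evaluates Storey's estimator at each point. The p-values are made up, treating the end value as inclusive is an assumption, and the way pyprophet combines the per-lambda estimates into a single pi0 is not shown.

```python
def lambda_grid(start, end, step):
    """Evenly spaced lambda values from start to end (end treated as inclusive here)."""
    n_steps = round((end - start) / step)
    return [round(start + i * step, 10) for i in range(n_steps + 1)]

def pi0_at(p_values, lam):
    """Storey's estimate of the null proportion at one lambda: #{p > lambda} / (m * (1 - lambda))."""
    above = sum(1 for p in p_values if p > lam)
    return above / (len(p_values) * (1.0 - lam))

grid = lambda_grid(0.1, 0.5, 0.05)   # defaults listed above
print(grid)

# Made-up p-values: several small ones plus a roughly uniform tail.
p_values = [0.001, 0.002, 0.004, 0.01, 0.02, 0.03, 0.15, 0.35, 0.55, 0.75, 0.95]
for lam in grid:
    print(lam, round(pi0_at(p_values, lam), 3))
```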
DIAlignR global alignment FDR threshold: After the chromatogram alignment all peaks should still satisfy the global alignment FDR threshold.
number
0.01
DIAlignR analyte FDR threshold: Before the chromatogram alignment only peaks satisfying this threshold will be matched across runs.
number
0.01
DIAlignR unalignment FDR threshold: XICs below this threshold will be considered valid without any alignment.
number
0.01
DIAlignR alignment FDR threshold: After the chromatogram alignment aligned peaks should satisfy this threshold.
number
0.05
DIAlignR query FDR threshold: During the chromatogram alignment only peaks satisfying this maximum FDR threshold will be considered as potential matches.
number
0.05
DIAlignR XICfilter parameter
string
Whether DIAlignR should be executed using multithreading (may cause errors)
boolean
true
Set this flag if statistical normalization and visualizations should be generated using MSstats
boolean
true
Set this flag if output plots should be generated.
boolean
true
- Bar Chart: Protein/Peptide Counts
- Pie Chart: Peptide Charge distribution
- Density Scatter: Library vs run RT deviations for all identifications
- Heatmap: Peptide quantities across MS runs
- Pyprophet score plots
In addition, MSstats will run and export comparative protein statistics plots, such as volcano plots, if protein level is specified.
Optional mzTab export (Warning: the mzTab format is not yet well supported for DIA)
boolean
Less common options for the pipeline, typically set in a config file.
Display help text.
boolean
Method used to save pipeline results to output directory.
string
The Nextflow publishDir option specifies which intermediate files should be saved to the output directory. This option tells the pipeline what method should be used to move these files. See Nextflow docs for details.
Whether to validate parameters against the schema at runtime.
boolean
true
Email address for completion summary, only when pipeline fails.
string
^([a-zA-Z0-9_\-\.]+)@([a-zA-Z0-9_\-\.]+)\.([a-zA-Z]{2,5})$
This works exactly as with --email, except emails are only sent if the workflow is not successful.
Send plain-text email instead of HTML.
boolean
Set to receive plain-text e-mails instead of HTML formatted.
Do not use coloured log outputs.
boolean
Set to disable colourful command line output and live life in monochrome.
Directory to keep pipeline Nextflow logs and reports.
string
${params.outdir}/pipeline_info
Show all params when using --help
boolean
Set the top limit for requested resources for any single job.
Maximum number of CPUs that can be requested for any single job.
integer
16
Use to set an upper-limit for the CPU requirement for each process. Should be an integer e.g. --max_cpus 1
Maximum amount of memory that can be requested for any single job.
string
128.GB
^\d+(\.\d+)?\.?\s*(K|M|G|T)?B$
Use to set an upper-limit for the memory requirement for each process. Should be a string in the format integer-unit e.g. --max_memory '8.GB'
Maximum amount of time that can be requested for any single job.
string
240.h
^(\d+\.?\s*(s|m|h|day)\s*)+$
Use to set an upper-limit for the time requirement for each process. Should be a string in the format integer-unit e.g. --max_time '2.h'
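The patterns listed for the memory and time limits can be tried out on candidate values before a run. This sketch uses Python's `re` module for illustration only; the pipeline's own schema validator may differ in edge cases, and the test values other than the defaults and documented examples are made up.

```python
import re

# Patterns copied verbatim from the parameter definitions above.
MEMORY_PATTERN = re.compile(r"^\d+(\.\d+)?\.?\s*(K|M|G|T)?B$")
TIME_PATTERN = re.compile(r"^(\d+\.?\s*(s|m|h|day)\s*)+$")

def matches(pattern, value):
    """True if the whole value conforms to the given compiled pattern."""
    return pattern.match(value) is not None

for value in ["128.GB", "8.GB", "8 GB", "8G"]:
    print(value, matches(MEMORY_PATTERN, value))

for value in ["240.h", "2.h", "1day 6h", "2 hours"]:
    print(value, matches(TIME_PATTERN, value))
```

Of these values, only `8G` (missing the trailing `B`) and `2 hours` (unit spelled out) fail to match.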
Parameters used to describe centralised config profiles. These should not be edited.
Git commit id for Institutional configs.
string
master
Provide git commit id for custom Institutional configs hosted at nf-core/configs. This was implemented for reproducibility purposes. Default: master.
## Download and use config file with following git commit id
--custom_config_version d52db660777c4bf36546ddb188ec530c3ada1b96
Base directory for Institutional configs.
string
https://raw.githubusercontent.com/nf-core/configs/master
If you're running offline, Nextflow will not be able to fetch the institutional config files from the internet. If you don't need them, then this is not a problem. If you do need them, you should download the files from the repo and tell Nextflow where to find them with the custom_config_base option. For example:
## Download and unzip the config files
cd /path/to/my/configs
wget https://github.com/nf-core/configs/archive/master.zip
unzip master.zip
## Run the pipeline
cd /path/to/my/data
nextflow run /path/to/pipeline/ --custom_config_base /path/to/my/configs/configs-master/
Note that the nf-core/tools helper package has a download command to download all required pipeline files + singularity containers + institutional configs in one go for you, to make this process easier.
Institutional configs hostname.
string
Institutional config name.
string
Institutional config description.
string
Institutional config contact information.
string
Institutional config URL link.
string