drugresponseeval: Parameters

Define the models and baselines to be tested.

Model to be tested.

type: string

default: NaiveDrugMeanPredictor

Model to be tested. See the documentation for a list of pre-implemented models. Can be multiple models separated by ','.

Baselines to be tested.

type: string

default: NaiveMeanEffectsPredictor

Baselines to be tested. See the documentation for a list of available models. Randomization and robustness tests are not run for baselines. The NaiveMeanEffectsPredictor is always included.
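
As a rough illustration of the comma-separated list format and the "always included" rule, the following sketch parses such a string and appends the NaiveMeanEffectsPredictor when it is missing. The helper names `parse_name_list` and `resolve_baselines` are hypothetical and not part of the pipeline; how the pipeline itself handles whitespace or duplicates is not stated here, so that part is an assumption.

```python
# Hypothetical helpers; they only illustrate the documented list format.
ALWAYS_INCLUDED = "NaiveMeanEffectsPredictor"


def parse_name_list(value: str) -> list[str]:
    """Split a comma-separated string, dropping blanks and duplicates (order kept)."""
    names: list[str] = []
    for part in value.split(","):
        name = part.strip()
        if name and name not in names:
            names.append(name)
    return names


def resolve_baselines(value: str) -> list[str]:
    """Return the requested baselines, adding the always-included one if absent."""
    names = parse_name_list(value)
    if ALWAYS_INCLUDED not in names:
        names.append(ALWAYS_INCLUDED)
    return names


print(resolve_baselines("NaiveDrugMeanPredictor"))
```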

Define where the pipeline should find input data and save output data.

Run name for the pipeline. The subdirectory in results will be named like this.

type: string

default: my_run

You will need to set a run identifier for the pipeline. This is used to create a unique output directory for each run.

Name of the dataset. Pre-supplied datasets are CTRPv2, CTRPv1, CCLE, GDSC1, GDSC2, TOYv1, TOYv2.

type: string

default: CTRPv2

Name of the dataset used for the pipeline. This can be either one of the provided datasets ('GDSC1', 'GDSC2', 'CCLE', 'CTRPv1', 'CTRPv2', 'TOYv1', 'TOYv2'), in which case the dataset with the fitted curves is downloaded, or a custom dataset name pointing either to raw viability measurements for automatic curve fitting or to pre-fit data (see the no_refitting option; pre-fit data is not recommended because differences in fitting procedures can reduce comparability between datasets).

The output directory where the results will be saved. Default is results/

type: string

default: results

Email address for completion summary.

type: string

pattern: ^([a-zA-Z0-9_\-\.]+)@([a-zA-Z0-9_\-\.]+)\.([a-zA-Z]{2,5})$

Set this parameter to your e-mail address to get a summary e-mail with details of the run sent to you when the workflow exits. If set in your user config file (~/.nextflow/config) then you don't need to specify this on the command line for every run.
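
To see which addresses the pattern above accepts, it can be tried out directly. This is only a sketch using Python's `re` module with the pattern copied from above; the function name `is_accepted_email` is hypothetical, and the pipeline's own schema validator may behave differently in edge cases.

```python
import re

# Pattern copied verbatim from the parameter description above.
EMAIL_PATTERN = re.compile(r"^([a-zA-Z0-9_\-\.]+)@([a-zA-Z0-9_\-\.]+)\.([a-zA-Z]{2,5})$")


def is_accepted_email(address: str) -> bool:
    """Return True if the whole address matches the documented pattern."""
    return EMAIL_PATTERN.fullmatch(address) is not None


for address in ["first.last@example.org", "user@example", "user+tag@example.org"]:
    print(address, is_accepted_email(address))
```

Note that the pattern is stricter than general e-mail syntax: it rejects '+' in the local part and top-level domains longer than five letters.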

Define the mode in which the pipeline will be run.

Run the pipeline in test mode LPO (Leave-random-Pairs-Out), LCO (Leave-Cell-line-Out), LTO (Leave-Tissue-Out), or LDO (Leave-Drug-Out).

type: string

default: LCO

pattern: ^((LPO|LCO|LTO|LDO)?,?)*(?<!,)$

Which tests to run (LPO=Leave-random-Pairs-Out, LCO=Leave-Cell-line-Out, LTO=Leave-Tissue-Out, LDO=Leave-Drug-Out). Can be a list of test runs e.g. 'LPO,LCO,LTO,LDO' to run all tests. Default is LCO.
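
The pattern above can be checked against candidate values before launching a run. The sketch below uses Python's `re` module with the pattern copied from above; `is_valid_test_mode` is a hypothetical helper name, and the pipeline's own validator may differ in detail.

```python
import re

# Pattern copied verbatim from the parameter description above.
TEST_MODE_PATTERN = re.compile(r"^((LPO|LCO|LTO|LDO)?,?)*(?<!,)$")


def is_valid_test_mode(value: str) -> bool:
    """Return True if the whole value matches the documented pattern."""
    return TEST_MODE_PATTERN.fullmatch(value) is not None


for candidate in ["LCO", "LPO,LCO,LTO,LDO", "LPO,", "LXO"]:
    print(repr(candidate), is_valid_test_mode(candidate))
```

The negative lookbehind `(?<!,)` is what rejects a trailing comma, as in 'LPO,'.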

Options for randomization.

Randomization mode for the pipeline.

type: string

default: None

pattern: ^(None|(?:SVR[CD]|SVC[CD])(,(?:SVR[CD]|SVC[CD]))*)$

Which randomization tests to run in addition to the normal run. Default is None, which means no randomization tests are run. Modes: SVCC, SVRC, SVCD, SVRD. Can be a list of randomization tests, e.g. 'SVCC,SVCD' to run two tests. SVCC (Single View Constant for Cell Lines): in this mode, one experiment is done for every cell line view the model uses (e.g. gene expression, mutation, ...). For each experiment, one cell line view is held constant while the others are randomized. SVRC (Single View Random for Cell Lines): in this mode, one experiment is done for every cell line view the model uses (e.g. gene expression, mutation, ...).
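
The randomization pattern above only accepts 'None' or a comma-separated list of the four mode names, so a misspelled mode is rejected. A minimal sketch of such a check, using Python's `re` module with the pattern copied from above (the function name `parse_randomization_modes` is hypothetical):

```python
import re

# Pattern copied verbatim from the parameter description above.
RANDOMIZATION_PATTERN = re.compile(r"^(None|(?:SVR[CD]|SVC[CD])(,(?:SVR[CD]|SVC[CD]))*)$")


def parse_randomization_modes(value: str) -> list[str]:
    """Validate the value and return the list of requested modes ([] for 'None')."""
    if RANDOMIZATION_PATTERN.fullmatch(value) is None:
        raise ValueError(f"invalid randomization mode string: {value!r}")
    return [] if value == "None" else value.split(",")


print(parse_randomization_modes("None"))
print(parse_randomization_modes("SVCC,SVCD"))
try:
    parse_randomization_modes("SCVC,SCVD")  # letters transposed, so not a valid mode
except ValueError as err:
    print(err)
```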

Randomization type for the pipeline.

type: string

Type of randomization to use. Choose from "permutation" or "invariant". Default is "permutation".

Options for robustness.

Number of trials to run for the robustness test

type: integer

Number of trials to run for the robustness test. Default is 0, which means no robustness test is run. The robustness test is a test where the model is trained with varying seeds. This is done multiple times to see how stable the model is.
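
The idea of a robustness test can be sketched as follows: repeat training with different seeds and look at the spread of the resulting scores. Everything in this example is hypothetical; `train_and_score` is a stand-in that simulates a validation RMSE instead of training a real model, and using the trial index as the seed is an assumption, not necessarily what the pipeline does.

```python
import random
import statistics


def train_and_score(seed: int) -> float:
    """Hypothetical stand-in for training with a given seed; returns a simulated RMSE."""
    rng = random.Random(seed)
    return 1.0 + rng.gauss(0.0, 0.05)


def robustness_scores(n_trials: int) -> list[float]:
    """Run one simulated training per trial; zero trials means no robustness test."""
    return [train_and_score(seed) for seed in range(n_trials)]


scores = robustness_scores(5)
print(f"mean RMSE = {statistics.mean(scores):.3f}")
print(f"spread (stdev) = {statistics.stdev(scores):.3f}")
```

A small spread across seeds suggests that the model's performance does not depend strongly on random initialization.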

Options for data input.

Path to the data directory.

type: string

default: data

Path to the data directory. The downloaded data will be exported here. If you supply custom data, it goes here, too.

The name of the drug response measure to use.

type: string

Column of the response dataset in which the drug response is stored.

Datasets for cross-study prediction.

type: string

List of datasets to use to evaluate predictions across studies. Can be a combination like 'CTRPv1,CCLE'. Default is empty string which means no cross-study datasets are used.

Link to the latest Zenodo version of the dataset.

type: string

default: https://zenodo.org/records/15533857/files/

pattern: ^https://zenodo.org/records/[0-9]+/files/$

Link to the Zenodo dataset from where pre-supplied datasets like CTRPv2 are downloaded.

Additional options for the pipeline.

Disable refitting of response curves (false by default, i.e. refitting is on). By default, measures calculated with CurveCurator are used for the available datasets instead of the original measures reported by the authors, and custom raw viability data is fitted automatically with CurveCurator. Set this flag to disable this behavior.

type: boolean

By default, measures calculated by CurveCurator (by re-fitting the response curves, see 'measure' option for details) are used for available datasets, which allows better comparability between datasets. When providing a custom dataset (see 'dataset_name' option), we expect a csv-formatted file at <path_data>/<dataset_name>/<dataset_name>_raw.csv (also see 'path_data' option) containing the raw response data. We fit the curves by default with CurveCurator to provide fair comparison to our other available datasets. The fitted data will then be stored at <path_data>/<dataset_name>/<dataset_name>.csv. If you want to disable this option, set the flag.
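
The file layout described above for custom datasets can be written out with `pathlib`. This is a minimal sketch; the helper name `custom_dataset_paths` and the dataset name 'MyScreen' are made up for illustration.

```python
from pathlib import Path


def custom_dataset_paths(path_data: str, dataset_name: str) -> tuple[Path, Path]:
    """Return (raw input file, fitted output file) following the documented layout."""
    base = Path(path_data) / dataset_name
    raw_file = base / f"{dataset_name}_raw.csv"   # raw viability data you provide
    fitted_file = base / f"{dataset_name}.csv"    # written after curve fitting
    return raw_file, fitted_file


raw, fitted = custom_dataset_paths("data", "MyScreen")
print(raw.as_posix())     # data/MyScreen/MyScreen_raw.csv
print(fitted.as_posix())  # data/MyScreen/MyScreen.csv
```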

Optimization metric for the pipeline.

type: string

Optimization metric for the pipeline. All models will minimize (MSE, RMSE, MAE)/maximize (R^2, Pearson, Spearman, Kendall) this metric calculated on the validation set. Default is RMSE.
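
The minimize/maximize distinction matters when comparing hyperparameter configurations. The sketch below shows one way to encode it; the helper names and the exact metric strings are assumptions based on the list above, not the pipeline's API.

```python
# Metric names taken from the description above; exact accepted spellings are an assumption.
MINIMIZE = {"MSE", "RMSE", "MAE"}
MAXIMIZE = {"R^2", "Pearson", "Spearman", "Kendall"}


def is_improvement(metric: str, candidate: float, best: float) -> bool:
    """Return True if `candidate` is better than `best` under the given metric."""
    if metric in MINIMIZE:
        return candidate < best
    if metric in MAXIMIZE:
        return candidate > best
    raise ValueError(f"unknown metric: {metric!r}")


def pick_best(metric: str, scores: dict[str, float]) -> str:
    """Return the key of the configuration with the best validation score."""
    best_key = None
    for key, score in scores.items():
        if best_key is None or is_improvement(metric, score, scores[best_key]):
            best_key = key
    if best_key is None:
        raise ValueError("no scores given")
    return best_key


validation_scores = {"config_a": 0.82, "config_b": 0.75}
print(pick_best("RMSE", validation_scores))     # lower is better
print(pick_best("Pearson", validation_scores))  # higher is better
```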

Number of cross-validation splits.

type: integer

default: 10

Number of cross-validation splits. Default is 10.
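
To illustrate what the number of splits means in a leave-group-out setting such as LCO, the following sketch assigns each cell line to exactly one test fold, so no cell line appears in both the training and test data of a fold. It is a simplified illustration with made-up data and a hypothetical helper, `leave_group_out_folds`; the pipeline's actual splitting logic (for example, how validation sets are drawn) may differ.

```python
def leave_group_out_folds(groups: list[str], n_splits: int) -> list[list[int]]:
    """Distribute sample indices over n_splits test folds, keeping each group in one fold."""
    unique_groups = sorted(set(groups))
    if n_splits > len(unique_groups):
        raise ValueError("more splits requested than distinct groups available")
    # Round-robin assignment of groups to folds.
    fold_of_group = {group: i % n_splits for i, group in enumerate(unique_groups)}
    folds: list[list[int]] = [[] for _ in range(n_splits)]
    for index, group in enumerate(groups):
        folds[fold_of_group[group]].append(index)
    return folds


# Toy (cell line, drug) pairs, invented for this example.
pairs = [("A549", "drug1"), ("A549", "drug2"),
         ("HeLa", "drug1"), ("HeLa", "drug2"),
         ("MCF7", "drug1"), ("MCF7", "drug2")]
cell_lines = [cell_line for cell_line, _ in pairs]

folds = leave_group_out_folds(cell_lines, n_splits=3)
for k, test_indices in enumerate(folds):
    test_cells = {cell_lines[i] for i in test_indices}
    train_cells = {cell_lines[i] for i in range(len(pairs)) if i not in test_indices}
    print(k, sorted(test_cells), "disjoint from training:", test_cells.isdisjoint(train_cells))
```

Grouping by drug instead of cell line would give the analogous leave-drug-out behavior.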

Response transformation

type: string

Transformation to apply to the response variable. Possible values: None, standard, minmax, robust.
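
The sketch below shows what these transformations commonly mean, assuming the names follow the usual scaler conventions: standard = zero mean and unit variance, minmax = rescaling to [0, 1], robust = centering on the median and scaling by the interquartile range. That mapping is an assumption, and `transform_response` is a hypothetical helper, not the pipeline's implementation. In practice, the statistics would typically be estimated on the training data only.

```python
import statistics


def transform_response(values: list[float], method: str = "None") -> list[float]:
    """Apply one of the named transformations to a list of response values.

    Note: constant input would cause a division by zero for all three scalers.
    """
    if method == "None":
        return list(values)
    if method == "standard":
        mean = statistics.fmean(values)
        std = statistics.pstdev(values)  # population standard deviation
        return [(v - mean) / std for v in values]
    if method == "minmax":
        low, high = min(values), max(values)
        return [(v - low) / (high - low) for v in values]
    if method == "robust":
        q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
        return [(v - median) / (q3 - q1) for v in values]
    raise ValueError(f"unknown transformation: {method!r}")


responses = [1.0, 2.0, 3.0, 4.0, 5.0]
for method in ["None", "standard", "minmax", "robust"]:
    print(method, [round(v, 3) for v in transform_response(responses, method)])
```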

Model checkpoint directory

type: string

default: TEMPORARY

Directory to save model checkpoints.

Disable hyperparameter tuning.

type: boolean

Set this flag to disable hyperparameter tuning. If set, the pipeline will not perform hyperparameter tuning and will use the default parameters for the models (more meant for quick runs or debugging).

Train final model on full data.

type: boolean

If True, saves a final model, trained/tuned on the union of all folds after CV. This is useful if you want to use the model for predictions on new data after running the pipeline.

Parameters used to describe centralised config profiles. These should not be edited.

Git commit id for Institutional configs.

hidden

type: string

default: master

Base directory for Institutional configs.

hidden

type: string

default: https://raw.githubusercontent.com/nf-core/configs/master

If you're running offline, Nextflow will not be able to fetch the institutional config files from the internet. If you don't need them, then this is not a problem. If you do need them, you should download the files from the repo and tell Nextflow where to find them with this parameter.

Institutional config name.

hidden

type: string

Institutional config description.

hidden

type: string

Institutional config contact information.

hidden

type: string

Institutional config URL link.

hidden

type: string

Less common options for the pipeline, typically set in a config file.

Display version and exit.

hidden

type: boolean

Method used to save pipeline results to output directory.

hidden

type: string

The Nextflow publishDir option specifies which intermediate files should be saved to the output directory. This option tells the pipeline what method should be used to move these files. See Nextflow docs for details.

Email address for completion summary, only when pipeline fails.

hidden

type: string

pattern: ^([a-zA-Z0-9_\-\.]+)@([a-zA-Z0-9_\-\.]+)\.([a-zA-Z]{2,5})$

An email address to send a summary email to when the pipeline is completed - ONLY sent if the pipeline does not exit successfully.

Send plain-text email instead of HTML.

hidden

type: boolean

Do not use coloured log outputs.

hidden

type: boolean

Incoming hook URL for messaging service

hidden

type: string

Incoming hook URL for messaging service. Currently, MS Teams and Slack are supported.

Whether to validate parameters against the schema at runtime.

hidden

type: boolean

default: true

Base URL or local path to location of pipeline test dataset files

hidden

type: string

default: https://raw.githubusercontent.com/nf-core/test-datasets/

Suffix to add to the trace report filename. Default is the date and time in the format yyyy-MM-dd_HH-mm-ss.

hidden

type: string
