Define the models and baselines to be tested.

Model to be tested.

Baselines to be tested.

Define where the pipeline should find input data and save output data.

Run ID for the pipeline.

You will need to set a run identifier for the pipeline. This is used to create a unique output directory for each run.

Name of the dataset.

Name of the dataset used for the pipeline. Allowed values are GDSC1, GDSC2, and Custom.

The output directory where the results will be saved. Default is results/

Email address for completion summary.

Set this parameter to your e-mail address to get a summary e-mail with details of the run sent to you when the workflow exits. If set in your user config file (~/.nextflow/config) then you don't need to specify this on the command line for every run.

Define the mode in which the pipeline will be run.

Run the pipeline in test mode LPO, LCO, or LDO.

Which tests to run (LPO=Leave-random-Pairs-Out, LCO=Leave-Cell-line-Out, LDO=Leave-Drug-Out). Can be a list of test runs e.g. 'LPO LCO LDO' to run all tests. Default is LPO.
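
The three test modes differ in what is held out of training. The following is a minimal sketch of that idea on a toy list of (cell line, drug) pairs; the function name split_pairs, the test_fraction argument, and the toy data are assumptions made for illustration and are not taken from the pipeline.

```python
import random

def split_pairs(pairs, mode, test_fraction=0.25, seed=0):
    """Split (cell_line, drug) pairs into train and test lists.

    LPO holds out random pairs, LCO holds out whole cell lines,
    LDO holds out whole drugs. Hypothetical helper, for illustration only.
    """
    rng = random.Random(seed)
    if mode == "LPO":
        units = sorted(set(pairs))
        key = lambda pair: pair
    elif mode == "LCO":
        units = sorted({cell for cell, _ in pairs})
        key = lambda pair: pair[0]
    elif mode == "LDO":
        units = sorted({drug for _, drug in pairs})
        key = lambda pair: pair[1]
    else:
        raise ValueError(f"Unknown test mode: {mode}")

    rng.shuffle(units)
    n_test = max(1, round(len(units) * test_fraction))
    held_out = set(units[:n_test])

    train = [p for p in pairs if key(p) not in held_out]
    test = [p for p in pairs if key(p) in held_out]
    return train, test

# Toy data: 4 cell lines x 3 drugs (made-up names).
toy_pairs = [(c, d) for c in ("C1", "C2", "C3", "C4") for d in ("D1", "D2", "D3")]

# A space-separated list such as 'LPO LCO LDO' runs every mode in turn.
for mode in "LPO LCO LDO".split():
    train, test = split_pairs(toy_pairs, mode)
    print(mode, len(train), len(test))
```

In LCO no cell line in the test set appears in the training set, and in LDO the same holds for drugs, which is what makes these settings harder than LPO.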

Options for randomization.

Randomization mode for the pipeline.

Which randomization tests to run in addition to the normal run. Default is None, which means no randomization tests are run. Modes: SVCC, SVRC, SVCD, SVRD. Can be a list of randomization tests, e.g. 'SVCC SVCD' to run two tests. SVCC (Single View Constant for Cell Lines): in this mode, one experiment is done for every cell line view the model uses (e.g. gene expression, mutation, ...). For each experiment, one cell line view is held constant while the others are randomized. SVRC (Single View Random for Cell Lines): in this mode, one experiment is done for every cell line view the model uses (e.g. gene expression, mutation, ...).
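
The SVCC mode described above can be sketched as follows. This is an illustration only: it assumes that randomizing a view means permuting its rows across cell lines, and the function name svcc_experiments and the toy views are hypothetical.

```python
import random

def svcc_experiments(views, seed=0):
    """Build one experiment per cell line view (SVCC-style sketch).

    `views` maps a view name to a list of per-cell-line rows, with all
    views in the same cell line order. In each experiment one view is
    kept as is and every other view is shuffled across cell lines.
    """
    rng = random.Random(seed)
    experiments = {}
    for constant_view in views:
        experiment = {}
        for name, rows in views.items():
            rows_copy = list(rows)
            if name != constant_view:
                rng.shuffle(rows_copy)  # assumption: permutation-style randomization
            experiment[name] = rows_copy
        experiments[constant_view] = experiment
    return experiments

# Toy views for four cell lines (made-up values).
toy_views = {
    "gene_expression": [0.1, 0.2, 0.3, 0.4],
    "mutation": [0, 1, 0, 1],
    "methylation": [5, 6, 7, 8],
}
experiments = svcc_experiments(toy_views)
print(sorted(experiments))
```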

Randomization type for the pipeline.

Type of randomization to use. Choose from "permutation" or "invariant". Default is "permutation".

Options for robustness.

Number of trials to run for the robustness test

Number of trials to run for the robustness test. Default is 0, which means no robustness test is run. The robustness test is a test where the model is trained with varying seeds. This is done multiple times to see how stable the model is.
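
As a rough sketch of what such a test measures, the snippet below repeats a training run with different seeds and summarizes the spread of the resulting scores. The function toy_train_and_score is a hypothetical stand-in that returns a fake error value; it is not the pipeline's training code.

```python
import random
import statistics

def toy_train_and_score(seed):
    """Hypothetical stand-in for one training run; returns a fake error score."""
    rng = random.Random(seed)
    return 1.0 + rng.uniform(-0.05, 0.05)

def robustness_test(n_trials, train_and_score=toy_train_and_score):
    """Run `n_trials` trainings with seeds 0..n_trials-1 and summarize the scores.

    Returns None when n_trials is 0, mirroring "no robustness test is run".
    """
    if n_trials == 0:
        return None
    scores = [train_and_score(seed) for seed in range(n_trials)]
    spread = statistics.stdev(scores) if n_trials > 1 else 0.0
    return {"scores": scores, "mean": statistics.mean(scores), "stdev": spread}

summary = robustness_test(5)
print(summary["mean"], summary["stdev"])
```

A small standard deviation across seeds suggests the model is stable; a large one suggests its performance depends strongly on initialization.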

Additional options for the pipeline.

Run the curve curator.

Whether to run CurveCurator to sort out non-reactive curves.

Overwrite existing results.

Whether to overwrite existing results.

Optimization metric for the pipeline.

Optimization metric for the pipeline. Default is RMSE.

Number of cross-validation splits.

Number of cross-validation splits. Default is 5.
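
For readers unfamiliar with the term, the sketch below shows how a dataset can be divided into a given number of cross-validation splits. It is a generic, unshuffled k-fold scheme written for illustration; the pipeline's actual splitting logic may differ, and the name k_fold_indices is hypothetical.

```python
def k_fold_indices(n_samples, n_splits=5):
    """Return a list of (train_indices, test_indices), one per split.

    Every sample index appears in exactly one test fold.
    """
    if not 2 <= n_splits <= n_samples:
        raise ValueError("n_splits must be between 2 and n_samples")
    # Distribute the remainder over the first folds so sizes differ by at most 1.
    base, extra = divmod(n_samples, n_splits)
    folds = []
    start = 0
    for i in range(n_splits):
        size = base + (1 if i < extra else 0)
        stop = start + size
        test = list(range(start, stop))
        train = [j for j in range(n_samples) if j < start or j >= stop]
        folds.append((train, test))
        start = stop
    return folds

folds = k_fold_indices(12)
print([len(test) for _, test in folds])
```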

Response transformation

Transformation to apply to the response variable. Possible values: standard, minmax, robust.
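
The source does not define the three transformations. The sketch below assumes the common conventions: "standard" subtracts the mean and divides by the standard deviation, "minmax" rescales to the range 0 to 1, and "robust" subtracts the median and divides by the interquartile range. The function name transform_response is hypothetical.

```python
import statistics

def transform_response(values, method):
    """Apply one of the assumed response transformations to a list of numbers.

    Note: constant input would cause a division by zero in every method.
    """
    if method == "standard":
        mean = statistics.mean(values)
        sd = statistics.pstdev(values)
        return [(v - mean) / sd for v in values]
    if method == "minmax":
        lo, hi = min(values), max(values)
        return [(v - lo) / (hi - lo) for v in values]
    if method == "robust":
        q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
        return [(v - median) / (q3 - q1) for v in values]
    raise ValueError(f"Unknown transformation: {method}")

response = [1.0, 2.0, 3.0, 4.0, 5.0]
for method in ("standard", "minmax", "robust"):
    print(method, transform_response(response, method))
```

The robust variant is less sensitive to outlying response values, since the median and interquartile range change little when a few extreme values are added.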

Path to the data directory.

Path to the data directory.

Datasets for cross-study prediction.

List of datasets to use to evaluate predictions across studies. Default is empty string which means no cross-study datasets are used.

Parameters used to describe centralised config profiles. These should not be edited.

Git commit id for Institutional configs.

Base directory for Institutional configs.

If you're running offline, Nextflow will not be able to fetch the institutional config files from the internet. If you don't need them, then this is not a problem. If you do need them, you should download the files from the repo and tell Nextflow where to find them with this parameter.

Institutional config name.

Institutional config description.

Institutional config contact information.

Institutional config URL link.

Set the top limit for requested resources for any single job.

Maximum number of CPUs that can be requested for any single job.

Use to set an upper-limit for the CPU requirement for each process. Should be an integer e.g. --max_cpus 1

Maximum amount of memory that can be requested for any single job.

Use to set an upper-limit for the memory requirement for each process. Should be a string in the format integer-unit e.g. --max_memory '8.GB'

Maximum amount of time that can be requested for any single job.

Use to set an upper-limit for the time requirement for each process. Should be a string in the format integer-unit e.g. --max_time '2.h'
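
To illustrate what an upper limit means in practice, the sketch below parses memory strings in the integer-unit format shown above and caps a request at the configured maximum. It is a simplified illustration: the helper names are hypothetical, only a few units are handled, and units are assumed to be binary multiples.

```python
# Assumed binary multiples; only a handful of units are supported in this sketch.
UNITS = {"KB": 1024, "MB": 1024**2, "GB": 1024**3, "TB": 1024**4}

def parse_memory(text):
    """Parse an 'integer.unit' string such as '8.GB' into a number of bytes."""
    number, unit = text.split(".")
    return int(number) * UNITS[unit.upper()]

def cap_memory(requested, maximum):
    """Return the request unchanged if it fits, otherwise the maximum."""
    if parse_memory(requested) <= parse_memory(maximum):
        return requested
    return maximum

print(cap_memory("16.GB", "8.GB"))
print(cap_memory("512.MB", "8.GB"))
```

The same capping idea applies to the CPU and time limits: a process that asks for more than the limit receives the limit instead.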

Less common options for the pipeline, typically set in a config file.

Display help text.

Display version and exit.

Method used to save pipeline results to output directory.

The Nextflow publishDir option specifies which intermediate files should be saved to the output directory. This option tells the pipeline what method should be used to move these files. See Nextflow docs for details.

Email address for completion summary, only when pipeline fails.

An email address to send a summary email to when the pipeline completes. The email is ONLY sent if the pipeline does not exit successfully.

Send plain-text email instead of HTML.

Do not use coloured log outputs.

Incoming hook URL for messaging service

Incoming hook URL for messaging service. Currently, MS Teams and Slack are supported.

Whether to validate parameters against the schema at runtime (boolean).

Show all params when using --help

By default, parameters set as hidden in the schema are not shown on the command line when a user runs with --help. Specifying this option will tell the pipeline to show all parameters.

Validation of parameters fails when an unrecognised parameter is found.

By default, an unrecognised parameter only produces a warning.
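
The difference between the two behaviours can be sketched as below. The function check_params and its fail_unrecognised argument are hypothetical names chosen for this illustration; they do not reproduce the pipeline's validation code.

```python
import warnings

def check_params(params, known, fail_unrecognised=False):
    """Report parameters that are not in `known`.

    Warns by default; raises ValueError when fail_unrecognised is True.
    Returns the sorted list of unrecognised names.
    """
    unknown = sorted(set(params) - set(known))
    if unknown and fail_unrecognised:
        raise ValueError(f"Unrecognised parameter(s): {', '.join(unknown)}")
    for name in unknown:
        warnings.warn(f"Unrecognised parameter: {name}")
    return unknown

known_params = {"max_cpus", "max_memory", "max_time"}
# 'max_cpu' is a deliberate misspelling, so it is reported.
print(check_params({"max_cpus": 1, "max_cpu": 2}, known_params))
```

Failing on unrecognised parameters is useful for catching misspelled options that would otherwise be ignored silently.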

Validation of parameters in lenient mode.

Allows string values that are parseable as numbers or booleans. For further information see JSONSchema docs.
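
As an illustration of what lenient handling implies, the sketch below converts strings that look like booleans or numbers into the corresponding types and leaves everything else untouched. The function coerce_lenient is a hypothetical helper, not the validator's actual implementation.

```python
def coerce_lenient(value):
    """Convert 'true'/'false' and numeric strings to bool, int, or float.

    Non-string values and strings that do not parse are returned unchanged.
    """
    if not isinstance(value, str):
        return value
    lowered = value.strip().lower()
    if lowered in ("true", "false"):
        return lowered == "true"
    try:
        return int(value)
    except ValueError:
        pass
    try:
        return float(value)
    except ValueError:
        return value

for raw in ("5", "0.5", "true", "results"):
    print(repr(raw), "->", repr(coerce_lenient(raw)))
```

In strict mode, by contrast, a value such as the string "5" would be rejected for a parameter declared as an integer.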