nf-core/tumourevo
Analysis pipeline to model tumour clonal evolution from WGS data (driver annotation, quality control of copy number calls, subclonal and mutational signature deconvolution)
Define where the pipeline should find input data and save output data.
Path to comma-separated file containing information about the samples in the experiment.
string
^\S+\.csv$
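The pattern above can be checked before a run is launched. The following is a minimal sketch in Python, assuming the pattern is applied to the whole path string; the helper name `is_valid_samplesheet_path` is hypothetical and not part of the pipeline.

```python
import re

# Pattern copied from the schema entry above: no whitespace, ends in ".csv".
SAMPLESHEET_PATTERN = re.compile(r"^\S+\.csv$")


def is_valid_samplesheet_path(path: str) -> bool:
    """Return True if the path matches the samplesheet pattern (hypothetical helper)."""
    return SAMPLESHEET_PATTERN.match(path) is not None
```

For example, `data/samples.csv` passes, while `my samples.csv` (contains a space) and `samples.tsv` (wrong extension) do not.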
The output directory where the results will be saved. You have to use absolute paths to storage on Cloud infrastructure.
string
Most common options used for the pipeline.
Path to reference fasta file.
string
Reference genome name.
string
List of tools for running the pipeline.
string
mobster,viber,pyclone-vi,sparsesignatures,sigprofiler
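The tool list is a single comma-separated string. As an illustration only, the sketch below splits such a string and reports names that are not in the default list; treating the default list as the full set of accepted names is an assumption, and both helper names are hypothetical.

```python
# Default value copied from the schema entry above.
DEFAULT_TOOLS = "mobster,viber,pyclone-vi,sparsesignatures,sigprofiler"


def parse_tools(value: str) -> list[str]:
    """Split a comma-separated tool string into lower-case names, dropping blanks."""
    return [name.strip().lower() for name in value.split(",") if name.strip()]


def unknown_tools(value: str, known: str = DEFAULT_TOOLS) -> list[str]:
    """Return requested tools that are absent from the known list.

    Assumption: the default list enumerates every accepted tool name.
    """
    known_set = set(parse_tools(known))
    return [name for name in parse_tools(value) if name not in known_set]
```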
Flag controlling whether mutations are filtered based on QC results.
boolean
true
Method used to save pipeline results to output directory.
string
copy
Variant Annotation parameters.
Parameter for downloading VEP cache.
string
Path to VEP cache.
string
VEP cache version.
string
VEP species.
string
VEP reference genome name.
string
Add an extra custom argument to VEP.
string
--everything --filter_common --per_gene --total_length --offline --format vcf
Driver Annotation parameters.
Path to driver table.
string
https://raw.githubusercontent.com/nf-core/test-datasets/refs/heads/tumourevo/data/DRIVER_ANNOTATION/ANNOTATE_DRIVER/Compendium_Cancer_Genes.tsv
Filtering parameters from vcf file.
Flag for filtering mutations from vcf.
boolean
true
CNAqc tool parameters.
For clonal simple CNAs, the list of segments to test.
string
c(\'1:0\', \'1:1\', \'2:0\', \'2:1\', \'2:2\')
For clonal simple CNAs, a filter for the segments to test.
integer
For clonal simple CNAs, as min_karyotype_size but with a cut measured on absolute mutation counts.
integer
100
For clonal simple CNAs, peaks detected will be filtered if, in a peak, we map less than p_binsize_peaks * N mutations.
number
0.005
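To make the threshold concrete: with the default of 0.005 and N = 2000 mutations, a peak needs at least 10 mapped mutations to be kept. The sketch below only reproduces this arithmetic; the function names are hypothetical, and treating a peak that sits exactly at the threshold as kept follows the "less than" wording above.

```python
def min_mutations_per_peak(n_mutations: int, p_binsize_peaks: float = 0.005) -> float:
    """Smallest mutation count a peak needs in order not to be filtered."""
    return p_binsize_peaks * n_mutations


def keep_peak(mutations_in_peak: int, n_mutations: int,
              p_binsize_peaks: float = 0.005) -> bool:
    """A peak is filtered only if it maps fewer than p_binsize_peaks * N mutations."""
    return mutations_in_peak >= min_mutations_per_peak(n_mutations, p_binsize_peaks)
```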
Deprecated parameter.
string
NULL
For clonal simple CNAs, the purity error tolerance to determine QC pass or fail.
number
0.05
For clonal simple CNAs, a tolerance in comparing bands overlaps which is applied to the raw VAF values.
number
0.015
For clonal simple CNAs, the number of times peak detection is bootstrapped (by default 1).
integer
1
For KDE-based matches, the adjust parameter passed to the density function; see density.
integer
1
For clonal simple CNAs, if “closest” the closest peak will be used to match the expected peak. If “rightmost” peaks are matched prioritizing right to left peaks (the higher-VAF gets matched first); this strategy is more correct in principle but works only if there are no spurious peaks in the estimated density.
string
rightmost
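The difference between the two strategies can be illustrated with a small sketch. This is not the CNAqc implementation: it assumes that "rightmost" pairs the highest expected peak with the highest detected peak and proceeds downwards, using each detected peak at most once. The function name `match_peaks` is hypothetical.

```python
def match_peaks(expected, detected, strategy="rightmost"):
    """Map each expected VAF peak to a detected peak (illustrative sketch only)."""
    if not detected:
        raise ValueError("no detected peaks to match")
    if strategy == "closest":
        # Each expected peak takes the nearest detected peak.
        return {e: min(detected, key=lambda d: abs(d - e)) for e in expected}
    if strategy == "rightmost":
        # Assumption: pair peaks from high VAF to low VAF, one detected peak each.
        exp_sorted = sorted(expected, reverse=True)
        det_sorted = sorted(detected, reverse=True)
        return dict(zip(exp_sorted, det_sorted))
    raise ValueError(f"unknown strategy: {strategy}")
```

With expected peaks `[0.25, 0.5]` and detected peaks `[0.27, 0.48]`, both strategies agree. Adding a spurious detected peak at 0.7 leaves "closest" unchanged but shifts every "rightmost" match upwards, which is the failure mode the description warns about.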
Deprecated parameter.
string
TRUE
For subclonal simple CNAs, the starting state to determine linear versus branching evolutionary models.
string
1:1
For subclonal simple CNAs, the starting state to determine linear versus branching evolutionary models.
string
FALSE
Minimum number of mutations that are required to be mapped to a karyotype in order to compute CCF values (default 25).
integer
25
For the entropy-based method, percentage of mutations that can be not-assigned (NA) in a karyotype.
number
0.1
Either “ENTROPY” (default) or “ROUGH”, to reflect the two different algorithms to compute CCF.
string
ENTROPY
string
absolute
joinCNAqc
If TRUE the mutations flagged as FAILED by CNAqc are discarded while building the joinCNAqc segmentation, if FALSE they are kept in the new object.
string
FALSE
If TRUE the original CNAqc object is kept in the joinCNAqc object, otherwise it is lost.
string
TRUE
The probability density used to model the read count data. Choices are beta-binomial and binomial.
string
beta-binomial
Number of random restarts of variational inference.
integer
100
Number of grid points used for approximating the posterior distribution.
integer
100
The number of clusters to use while fitting.
integer
20
A vector with the number of Beta components to use. All values of K must be strictly greater than 0.
string
1:5
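The default `1:5` is an R-style inclusive range. As a sketch of how such a value could be expanded and checked outside R, the hypothetical helpers below turn the string into a list of integers and enforce the positivity constraint stated above.

```python
def parse_r_range(spec: str) -> list[int]:
    """Expand an R-style 'a:b' range (inclusive on both ends) or a single integer."""
    if ":" in spec:
        start, end = (int(part) for part in spec.split(":"))
        step = 1 if end >= start else -1  # R ranges may also count downwards
        return list(range(start, end + step, step))
    return [int(spec)]


def validate_k(spec: str) -> list[int]:
    """Return the expanded K values, raising if any value is not strictly positive."""
    values = parse_r_range(spec)
    if any(k <= 0 for k in values):
        raise ValueError(f"all values of K must be > 0, got {values}")
    return values
```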
Boolean value indicating whether to use tail mutations for subclonal deconvolution.
string
TRUE
The minimum mixing proportion of a cluster to be returned as output.
number
0.02
The minimum number of mutations assigned to a cluster to be returned as output.
integer
10
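The two cutoffs above act as an output filter on fitted clusters. A minimal sketch follows, assuming a cluster must satisfy both cutoffs to be returned; the names `pi_cutoff` and `n_cutoff` and the dictionary layout are hypothetical, chosen for illustration only.

```python
def filter_clusters(clusters: dict, pi_cutoff: float = 0.02, n_cutoff: int = 10) -> dict:
    """Keep clusters whose mixing proportion and size both reach the cutoffs.

    `clusters` maps a cluster name to {"pi": mixing proportion, "n": mutation count}.
    Assumption: the two criteria are combined with a logical AND.
    """
    return {
        name: info
        for name, info in clusters.items()
        if info["pi"] >= pi_cutoff and info["n"] >= n_cutoff
    }
```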
The maximum number of clusters returned.
integer
10
The number of fits to be computed.
integer
10
The concentration parameter of the Dirichlet mixture.
number
0.000001
The prior Beta hyperparameter a for each Binomial component.
integer
1
The prior Beta hyperparameter b for each Binomial component.
integer
1
The maximum number of fit iterations.
integer
5000
The epsilon to measure convergence as ELBO absolute difference.
number
1e-10
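The iteration cap and the epsilon together define a stopping rule. The sketch below shows one way such a rule can be written, assuming convergence is declared when the absolute difference between consecutive ELBO values drops below epsilon; `elbo_step` is a hypothetical callable standing in for one fit iteration.

```python
def run_until_converged(elbo_step, max_iter: int = 5000, epsilon: float = 1e-10):
    """Iterate until |ELBO_t - ELBO_{t-1}| < epsilon or max_iter is reached.

    `elbo_step(iteration)` is a hypothetical callable returning the ELBO
    after the given iteration. Returns (iterations_run, converged).
    """
    previous = None
    for iteration in range(1, max_iter + 1):
        current = elbo_step(iteration)
        if previous is not None and abs(current - previous) < epsilon:
            return iteration, True
        previous = current
    return max_iter, False
```

A very small epsilon such as the default makes the iteration cap the effective limit for slowly improving fits.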
Initialization of the q-distribution to compute the approximation of the posterior distributions.
string
prior
Boolean value indicating whether to return the trace of the model fit.
string
FALSE
The minimum Binomial success probability when applying a heuristic procedure to filter clusters after Variational Inference.
number
0.05
The minimum size of the mixture component when applying a heuristic procedure to filter clusters after Variational Inference.
number
0.02
Boolean value indicating whether points assigned to a cluster that is filtered out are re-assigned from the density function.
string
FALSE
The minimum number of dimensions where we want to detect a Binomial component when applying a heuristic procedure to filter clusters after Variational Inference.
integer
1
The number of signatures (min. value = 2) to be fit to the dataset, including the background signature.
string
2:10
The number of iterations of every single run of NMF LASSO.
integer
30
Number of iterations to estimate the length(K) matrices beta (including the background signature) in case the argument beta is NULL. Ignored if beta is given.
integer
10
The number of sub-iterations involved in the sparsification phase, within a full NMF LASSO iteration.
integer
10000
The number of requested NMF worker subprocesses to spawn. If Inf, an adaptive maximum number is automatically chosen. If NA or NULL, the function is run as a single process.
string
all
The cross-validation test size, i.e., the percentage of entries set to zero during NMF and used for validation.
number
0.01
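As an illustration of what the test size means, the sketch below zeroes a given fraction of the entries of a count matrix and records which cells were held out. It uses only the standard library and a plain list-of-lists matrix; the function name and the rounding of the held-out count are assumptions, not the SparseSignatures implementation.

```python
import random


def mask_entries(matrix, test_size: float = 0.01, seed: int = 42):
    """Return a copy of `matrix` with a fraction of entries set to zero.

    Also returns the (row, column) positions that were held out, so they can
    be compared against the model's predictions afterwards.
    """
    rng = random.Random(seed)
    cells = [(i, j) for i, row in enumerate(matrix) for j in range(len(row))]
    n_held_out = round(test_size * len(cells))  # assumption: round to nearest cell
    held_out = rng.sample(cells, n_held_out)
    masked = [row[:] for row in matrix]  # leave the input untouched
    for i, j in held_out:
        masked[i][j] = 0
    return masked, held_out
```

With the default of 0.01, a matrix of 20 samples by 10 categories would have 2 of its 200 entries held out per repetition.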
The number of repetitions of the cross-validation procedure.
integer
50
The number of randomized restarts of a single cross-validation repetition, in case of poor fits.
integer
5
The candidate values of the sparsity parameter for the signature matrix ‘beta’ whose goodness of fit is assessed by cross-validation.
string
c(0.01, 0.05, 0.1, 0.2)
The candidate values of the sparsity parameter for the exposure-matrix entries alpha whose goodness of fit is assessed by cross-validation.
integer
Seed for the random number generation. To be set for reproducibility.
integer
42
The candidate values of the sparsity parameter for the exposure-matrix entries alpha whose goodness of fit is assessed by cross-validation.
string
c(0.00, 0.01, 0.05, 0.10)
Mode of publishing the SigProfiler genome.
string
move
Specify True if the reference genome should be downloaded.
boolean
true
Specify the path to the reference genome (if downloaded by the user), e.g. path/to/genome/tsb
string
‘matrix’ is used for table format inputs using a tab separated file.
string
matrix
The minimum number of signatures to be extracted.
integer
1
The maximum number of signatures to be extracted.
integer
25
Mutation context name(s), separated by commas (,), that define the mutational contexts for signature extraction. In the default value, 96 represents the SBS96 context, DINUC represents the dinucleotide context, and ID represents the indel context.
string
96,DINUC,ID
The number of iterations to be performed to extract each number of signatures.
integer
100
Value defines the minimum number of iterations to be completed before NMF converges.
integer
10000
Value defines the maximum number of iterations to be completed before NMF converges.
integer
1000000
Value defines the number of iterations to be done between convergence checks.
integer
10000
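The minimum, maximum and check-interval settings above interact as an iteration budget. The following sketch shows one plausible reading, assuming convergence is tested only once the minimum has been reached and then only at multiples of the check interval; it is not SigProfiler code, and `has_converged` is a hypothetical callable.

```python
def nmf_iteration_budget(has_converged, min_iterations: int = 10000,
                         max_iterations: int = 1000000,
                         test_interval: int = 10000) -> int:
    """Return the iteration at which the loop stops (illustrative sketch).

    `has_converged(iteration)` is a hypothetical callable standing in for
    the convergence test on the current factorisation.
    """
    for iteration in range(1, max_iterations + 1):
        # One NMF update step would run here.
        due_for_check = iteration >= min_iterations and iteration % test_interval == 0
        if due_for_check and has_converged(iteration):
            return iteration
    return max_iterations
```

Under these assumptions, a factorisation that would satisfy the test from iteration 25000 onwards stops at iteration 30000, the first scheduled check after that point.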
A number that represents the normal contamination level used to determine whether the sample passes or fails.
integer
3
Email address for completion summary.
boolean
Email address for completion summary, only when pipeline fails.
boolean
Do not use coloured log outputs.
boolean
Send plain-text email instead of HTML.
boolean
Git commit id for Institutional configs.
string
master
Base directory for Institutional configs.
string
https://raw.githubusercontent.com/nf-core/configs/master
Institutional config description.
boolean
Institutional config contact information.
boolean
string
s3://ngi-igenomes/igenomes/
Institutional config URL link.
boolean
boolean
string
null/pipeline_info
boolean
boolean
string
string
https://raw.githubusercontent.com/nf-core/test-datasets/
string
boolean
true
string