PDC Configuration

nf-core pipelines have been successfully configured for use on the PDC cluster dardel. No other clusters have yet been tested, but support can be added if needed.

Getting started

The base Java installation on dardel is Java 11. Loading the PDC and Java modules makes other versions (e.g. 17) available.

To pull new Singularity images, singularity must be available (e.g. through the module system) to the Nextflow monitoring process. Suggested preparation before launching Nextflow:

 
module load PDC Java singularity

(For reproducibility, it may be a good idea to check which versions you have loaded with module list and to load those exact versions afterwards, e.g. module load PDC/22.06 singularity/3.10.4-cpeGNU-22.06 Java/17.0.4.)
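One way to keep track of the loaded versions is to save the output of module list next to your run. The following is a minimal sketch, not part of the provided configuration: the function name record_modules and the file name modules-loaded.txt are made up for illustration, and it assumes that module list may write to stderr, so both streams are captured.

```shell
# Hypothetical helper: save the currently loaded modules to a file.
# Usage: record_modules [output-file]
record_modules() {
    out=${1:-modules-loaded.txt}
    if command -v module >/dev/null 2>&1; then
        # Capture stdout and stderr, since module list may print to stderr.
        module list > "$out" 2>&1
    else
        # Outside the cluster environment there is no module command.
        echo "module command not available" > "$out"
    fi
    # Print the name of the file that was written.
    echo "$out"
}

record_modules modules-loaded.txt
```

The versions listed in the saved file can then be passed explicitly to module load in later sessions.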

No Singularity images or Nextflow versions are currently preloaded on dardel. To get started, you can for example download Nextflow with:

wget https://raw.githubusercontent.com/nextflow-io/nextflow/master/nextflow && \
  chmod a+x nextflow

The profile pdc_kth is provided for convenience. It expects you to pass the project used for Slurm accounting through --project, e.g. --project=nais2023-22-1027.
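As an illustration, a launch with this profile might look like the sketch below. The pipeline name nf-core/rnaseq, the test profile, and the output directory results are assumptions chosen for the example. Only pdc_kth and --project come from this page. The function launch_cmd is a hypothetical helper that prints the command as a dry run instead of executing it.

```shell
# Example values; replace with your own allocation and pipeline.
PROJECT="nais2023-22-1027"   # project ID used as an example on this page
PIPELINE="nf-core/rnaseq"    # assumption: any nf-core pipeline could go here
OUTDIR="results"             # assumption: output directory of your choice

# Hypothetical helper: print the launch command without running it.
# Remove the leading 'echo' to actually start the pipeline.
launch_cmd() {
    echo ./nextflow run "$PIPELINE" -profile test,pdc_kth \
        --project="$PROJECT" --outdir "$OUTDIR"
}

launch_cmd
```

Combining the pipeline's own test profile with pdc_kth is a quick way to check that job submission and accounting work before starting a full run.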

Due to how partitions are set up on dardel, in particular the lack of long-runtime nodes with more memory, some runs may be difficult to get through.

Note that node-local scratch is not available, and SNIC_TMP as well as PDC_TMP point to a cluster-scratch area that will have similar performance characteristics to your project storage. /tmp points to a local tmpfs, which uses RAM to store its contents. Since nodes don't have swap space, anything stored in /tmp means less memory is available for your job.
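To see where these locations point on the node you are on, a small check like the following can help. It is a sketch only: report_tmp is a hypothetical function name, and the variables are printed as unset if they are not defined in your environment.

```shell
# Hypothetical helper: show the temporary-storage settings discussed above.
report_tmp() {
    # Cluster-scratch locations (printed as <unset> if not defined).
    echo "SNIC_TMP=${SNIC_TMP:-<unset>}"
    echo "PDC_TMP=${PDC_TMP:-<unset>}"
    # The filesystem column shows whether /tmp is backed by tmpfs (RAM).
    df -h /tmp
}

report_tmp
```

If /tmp is reported as tmpfs, the space used there counts against the memory available to your job.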

Config file

See config file on GitHub

pdc_kth.config
// Nextflow config for use with PDC at KTH
 
def cluster = "unknown"
 
try {
    // Ask Slurm for the cluster name; backslashes are doubled so the shell receives \s
    cluster = ['/bin/bash', '-c', 'sacctmgr show cluster -n | grep -o "^\\s*[^ ]*\\s*"'].execute().text.trim()
} catch (java.io.IOException e) {
    System.err.println("WARNING: Could not run sacctmgr, defaulting to unknown")
}
 
params {
    project = null // NAISS project allocation
 
    config_profile_description = 'PDC profile.'
    config_profile_contact = 'Pontus Freyhult (@pontus)'
    config_profile_url = "https://www.pdc.kth.se/"
 
    max_memory = 1790.GB
    max_cpus = 256
    max_time = 7.d
 
    schema_ignore_params = "genomes,input_paths,cluster-options,clusterOptions,project,validationSchemaIgnoreParams"
    validationSchemaIgnoreParams = "genomes,input_paths,cluster-options,clusterOptions,project,schema_ignore_params"
}
 
 
def containerOptionsCreator = {
    switch(cluster) {
        case "dardel":
            return '-B /cfs/klemming/'
    }
 
    return ''
}
 
def clusterOptionsCreator = { mem, time, cpus ->
    String base = "-A $params.project ${params.clusterOptions ?: ''}"
 
    switch(cluster) {
        case "dardel":
            String extra = ''
 
            if (time <= 7.d && mem <= 111.GB && cpus <= 256) {
                extra += ' -p shared '
            }
            else if (time < 1.d) {
                // Shortish
                if (mem > 222.GB) {
                    extra += ' -p memory,main '
                } else {
                    extra += ' -p main '
                }
            } else {
                // Not shortish
                if (mem > 222.GB) {
                    extra += ' -p memory '
                } else {
                    extra += ' -p long '
                }
            }
 
            if (!mem || mem < 6.GB) {
                // Impose minimum memory if request is below
                extra += ' --mem=6G '
            }
 
            return base+extra
    }
 
    return base
}
 
 
singularity {
    enabled = true
    runOptions = containerOptionsCreator()
}
 
process {
    resourceLimits = [
        memory: 1790.GB,
        cpus: 256,
        time: 7.d
    ]
    // Should we lock these to specific versions?
    beforeScript = 'module load PDC apptainer'
 
    executor = 'slurm'
    clusterOptions = { clusterOptionsCreator(task.memory, task.time, task.cpus) }
}
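
To get a feel for which partition a given request ends up in, the dardel branch of clusterOptionsCreator can be mirrored in a few lines of shell. This is an illustrative sketch only, and the config above remains authoritative. The function name pick_partition and its integer arguments (memory in GB, time in hours, CPUs) are assumptions made for the example, and the 6 GB minimum-memory rule is left out.

```shell
# Hypothetical helper mirroring the partition choice in the config above.
# Usage: pick_partition MEM_GB TIME_HOURS CPUS   (integers only)
pick_partition() {
    mem_gb=$1
    time_h=$2
    cpus=$3
    if [ "$time_h" -le 168 ] && [ "$mem_gb" -le 111 ] && [ "$cpus" -le 256 ]; then
        # Up to 7 days (168 h), 111 GB and 256 CPUs: shared partition
        echo "shared"
    elif [ "$time_h" -lt 24 ]; then
        # Shorter than one day
        if [ "$mem_gb" -gt 222 ]; then echo "memory,main"; else echo "main"; fi
    else
        # One day or longer
        if [ "$mem_gb" -gt 222 ]; then echo "memory"; else echo "long"; fi
    fi
}

pick_partition 64 12 16     # prints: shared
pick_partition 200 12 16    # prints: main
pick_partition 500 12 16    # prints: memory,main
pick_partition 200 48 16    # prints: long
pick_partition 500 48 16    # prints: memory
```

Jobs that need more than 222 GB for a day or longer can only go to the memory partition, which is where the scheduling difficulties mentioned earlier are most likely to show up.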