KAUST Configuration

Manage nf-core pipeline jobs on the KAUST Ibex cluster via the KAUST configuration.

The purpose of this custom configuration is to streamline executing nf-core pipelines on the KAUST Ibex cluster.

Getting help

We have a wiki page dedicated to the Bioinformatics team at KAUST to help users: Bioinformatics Workflows.

Using the KAUST config profile

The recommended way to activate Nextflow, which is needed to run the nf-core workflows on Ibex, is to use the module system:

# Log in to the desired cluster
ssh <USER>@ilogin.kaust.edu.sa
 
# Load the Nextflow module. You can also choose a specific version, e.g. `Nextflow/24.04.4`.
module load nextflow

Launch the pipeline with -profile kaust (one hyphen) to run the workflows using the KAUST profile. This downloads and applies kaust.config, which is pre-configured for the KAUST servers. It lets Nextflow manage the pipeline jobs via the Slurm job scheduler and run the tasks with Singularity. With the KAUST profile, Docker images containing the required software are downloaded and, if needed, converted to Singularity images before the pipeline executes. To avoid multiple users downloading the same images, the profile sets a Singularity libraryDir that points to images already downloaded to our central container library. Images missing from the library are downloaded to the path in your home directory defined by cacheDir.
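The lookup order described above (central library first, then the per-user cache, otherwise a fresh download into the cache) can be illustrated with a small shell sketch. This is not how Nextflow is implemented; resolve_image is a hypothetical helper and example.img is a made-up file name, used only to show the order in which the two directories from kaust.config are consulted.

```shell
# Hypothetical helper (not part of Nextflow): report where an image file
# would be picked up from, given a library directory and a cache directory.
# Args: image_file library_dir cache_dir
resolve_image() {
    image="$1"
    library_dir="$2"
    cache_dir="$3"
    if [ -f "$library_dir/$image" ]; then
        # Already present in the shared, centrally managed library
        printf '%s\n' "$library_dir/$image"
    elif [ -f "$cache_dir/$image" ]; then
        # Previously downloaded into the user's own cache
        printf '%s\n' "$cache_dir/$image"
    else
        # Missing from both: it would be downloaded into the cache
        printf 'download -> %s\n' "$cache_dir/$image"
    fi
}

# Example call with the directories from kaust.config
# (example.img is an assumed file name for illustration only).
resolve_image example.img \
    /biocorelab/BIX/resources/singularity/images \
    "/home/${USER:-unknown}/.singularity/nf_images"
```

On a machine where neither directory holds the file, the call reports that the image would be downloaded into the cache directory.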

The KAUST profile makes running the nf-core workflows as simple as:

 
module load nextflow
module load singularity
 
# Launch nf-core pipeline with the kaust profile, e.g. for analyzing human data:
nextflow run nf-core/<PIPELINE> -profile kaust -r <PIPELINE_VERSION> --genome GRCh38.p14 --samplesheet input.csv [...]

Here, input.csv contains information about the samples and the paths to their data files.

Remember to use -bg to launch Nextflow in the background, so that the pipeline doesn’t exit if you leave your terminal session. Alternatively, you can also launch Nextflow in a tmux or a screen session.

Config file

See config file on GitHub

kaust.config
// KAUST Config Profile
params {
    config_profile_name = 'KAUST'
    config_profile_description = 'Profile for use on King Abdullah University of Science and Technology (KAUST) Ibex Cluster.'
    config_profile_contact = 'Husen Umer (@kaust.edu.sa)'
    config_profile_url = 'https://docs.hpc.kaust.edu.sa/quickstart/ibex.html'
    save_reference = false
    igenomes_ignore = true
}
 
// Load genome resources and assets hosted by the Bioinformatics team on IBEX cluster
// includeConfig '/biocorelab/BIX/resources/configs/genomes.yaml'
 
singularity {
    enabled = true
    autoMounts = true
    pullTimeout = '60 min'
    // Use existing images from the centralized library, if available
    libraryDir = "/biocorelab/BIX/resources/singularity/images/"
    // Download images that are missing from the library to user space
    cacheDir = "/home/$USER/.singularity/nf_images/"
}
 
process {
    executor = 'slurm'
    clusterOptions = "-p batch"
    maxRetries = 5
    errorStrategy = { task.exitStatus in [143,137,104,134,139,151,140,247,12] ? 'retry' : 'finish' }
    beforeScript = 'module load singularity'
    // Max allowed resources per process on Ibex
    resourceLimits = [
        memory: 1600.GB,
        cpus: 200,
        time: 10.d
    ]
}
 
process {
 
    withLabel:process_single {
        time = 20.h
    }
 
    withLabel:process_low {
        cpus   = { 4 * task.attempt }
        memory = { 16.GB * task.attempt }
        time   = { 6.h  * task.attempt }
    }
 
    withLabel:process_medium {
        cpus   = { 20 * task.attempt }
        memory = { 96.GB * task.attempt }
        time   = { 12.h  * task.attempt }
    }
 
    withLabel:process_high {
        cpus   = { 40 * task.attempt }
        memory = { 256.GB * task.attempt }
        time   = { 20.h  * task.attempt }
    }
 
    withLabel:process_long {
        cpus   = { 12 * task.attempt }
        memory = { 128.GB * task.attempt }
        time   = { 96.h  * task.attempt }
    }
}
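
The label blocks above multiply each request by task.attempt, and resourceLimits caps the result at the Ibex maximums (200 CPUs, 1600 GB, 10 days). The following shell sketch reproduces that arithmetic to show what a retried task would ask Slurm for. It is an illustration only: min and scaled_request are hypothetical helpers, and time is expressed in whole hours (10.d = 240 h) for simplicity.

```shell
# Hypothetical helper: print the smaller of two integers.
min() {
    if [ "$1" -lt "$2" ]; then echo "$1"; else echo "$2"; fi
}

# Hypothetical helper: effective request for one label at a given attempt.
# Args: base_cpus base_memory_gb base_time_hours attempt
scaled_request() {
    cpus=$(min $(( $1 * $4 )) 200)     # cap from resourceLimits: cpus 200
    mem=$(min $(( $2 * $4 )) 1600)     # cap from resourceLimits: memory 1600.GB
    hours=$(min $(( $3 * $4 )) 240)    # cap from resourceLimits: time 10.d = 240 h
    echo "cpus=$cpus memory=${mem}GB time=${hours}h"
}

# process_long starts at 12 CPUs, 128 GB and 96 h.
for attempt in 1 2 3; do
    echo "process_long attempt $attempt: $(scaled_request 12 128 96 "$attempt")"
done
```

For process_long, the third attempt would request 288 h uncapped, so the sketch reports the 240 h limit instead, while CPUs and memory still scale linearly.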