University of Ghent High Performance Computing Infrastructure (VSC)

NB: You will need an account to use the HPC cluster to run the pipeline.

Regarding environment variables in ~/.bashrc, make sure you have a setup similar to the one below. If you are not part of a VO, ask for one or use VSC_DATA_USER instead of VSC_DATA_VO_USER.

# Needed for Tier1 accounts, not for Tier2
export SLURM_ACCOUNT={FILL_IN_NAME_OF_YOUR_ACCOUNT}
export SALLOC_ACCOUNT=$SLURM_ACCOUNT
export SBATCH_ACCOUNT=$SLURM_ACCOUNT
# Needed for running Nextflow jobs
export NXF_HOME=$VSC_DATA_VO_USER/.nextflow
# Needed for running Apptainer containers
export APPTAINER_CACHEDIR=$VSC_DATA_VO_USER/.apptainer/cache
export APPTAINER_TMPDIR=$VSC_DATA_VO_USER/.apptainer/tmp
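
After editing ~/.bashrc, reload it and check that the variables resolve as expected; a minimal sanity check using only the variables defined above:

source ~/.bashrc
echo "NXF_HOME=$NXF_HOME"
echo "APPTAINER_CACHEDIR=$APPTAINER_CACHEDIR"
echo "APPTAINER_TMPDIR=$APPTAINER_TMPDIR"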

First, switch to the cluster you want to run the pipeline on. You can check which clusters have the most free space via this link. Use the following commands to easily switch between clusters:

module purge
module swap cluster/<CLUSTER>
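
For example, to switch to the doduo cluster (doduo is used here purely as an illustration) and confirm which cluster module is active:

module swap cluster/doduo
module list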

Before running the pipeline you will need to create a PBS script to submit as a job.

#!/bin/bash
 
module load Nextflow
 
nextflow run <pipeline> -profile vsc_ugent,<CLUSTER> <Add your other parameters>

All of the intermediate files required to run the pipeline are stored in the work/ directory, which can grow quite large; the main output files are saved in the results/ directory. The config enables automatic cleanup, so the work/ directory is removed once the pipeline has completed successfully. If a run does not complete successfully, remove the work/ directory manually to free up storage space. The default work directory is $VSC_SCRATCH_VO_USER/work, as set by this configuration.
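
If a run fails and you want to reclaim the space immediately, the work directory can be removed by hand; a one-liner assuming the default location set by this configuration:

rm -rf "$VSC_SCRATCH_VO_USER/work"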

You can also add several TORQUE options to the PBS script. More information about these options can be found via this link.
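
For example, a sketch of the PBS script above extended with a few commonly used TORQUE directives (job name, walltime, cores and memory; the resource values are purely illustrative):

#!/bin/bash
#PBS -N nextflow_pipeline
#PBS -l walltime=72:00:00
#PBS -l nodes=1:ppn=4
#PBS -l mem=16gb

module load Nextflow

nextflow run <pipeline> -profile vsc_ugent,<CLUSTER> <Add your other parameters>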

To submit your job to the cluster, use the following command:

qsub <script name>.pbs

The VSC does not support Apptainer containers provided via a URL (e.g., shub://… or docker://…). One solution is to download all the containers beforehand, like in this pipeline.

First get the containers.json file from the pipeline you want to run:

nextflow inspect main.nf -profile vsc_ugent,<CLUSTER> > containers.json

Then run a build script (script appended below) to build all the containers. This can take a long time and a lot of space, but it is a one-time cost. For many large images, consider running this as a job.

bash build_all_containers.sh containers.json
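
Because the builds can take hours, you may want to submit the build script itself as a job; a hedged sketch of such a submission script, assuming apptainer is available on the compute node (the walltime and core count are only examples):

#!/bin/bash
#PBS -N build_containers
#PBS -l walltime=24:00:00
#PBS -l nodes=1:ppn=2

cd $PBS_O_WORKDIR
bash build_all_containers.sh containers.json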

Overwrite the container in your nextflow.config. If you need GPU support, also apply the label 'use_gpu':

process {
    withName: DEEPCELL_MESMER {
        label = 'use_gpu'
        // container "docker.io/vanvalenlab/deepcell-applications:0.4.1"
        container = "./DEEPCELL_MESMER_GPU.sif"
    }
}
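
If you prefer to keep these overrides in a separate file rather than the pipeline's nextflow.config, you can pass it on the command line with Nextflow's -c option (custom_containers.config is a hypothetical file name):

nextflow run <pipeline> -profile vsc_ugent,<CLUSTER> -c custom_containers.config <Add your other parameters>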

NB: The profile only works for the clusters skitty, swalot, victini, kirlia and doduo.

NB: By default, the work/ directory and the singularity/ directory (the cache directory for images) are located in $VSC_SCRATCH_VO_USER.

build_all_containers.sh:

#!/usr/bin/env bash
 
# prevent Apptainer from using $HOME/.cache
export APPTAINER_CACHEDIR=/tmp/$USER/apptainer/cache
# instruct Apptainer to use a temp dir on the local filesystem
export APPTAINER_TMPDIR=/tmp/$USER/apptainer/tmpdir
# specified temp dir must exist, so create it
mkdir -p $APPTAINER_TMPDIR
 
# pull all containers from the given JSON file
# usage: build_all_containers.sh containers.json [FORCE]
JSON=$1
FORCE=${2:-false}
 
echo "Building containers from $JSON"
NAMES=$(sed -nE 's/.*"name": "([^"]*)".*/\1/p' "$JSON")
CONTAINERS=$(sed -nE 's/.*"container": "([^"]*)".*/\1/p' "$JSON")
# pair each process name with its container and loop over them
paste <(echo "$NAMES") <(echo "$CONTAINERS") | while IFS=$'\t' read -r name container; do
    # if the .sif is already present, skip it unless FORCE is true
    if [ -f "$name.sif" ] && [ "$FORCE" != "true" ]; then
        continue
    fi
 
    # if container is null, skip
    if [ -z "$container" ]; then
        continue
    fi
 
    # if not docker://, add docker://
    if [[ $container != docker://* ]]; then
        container="docker://$container"
    fi
    echo "Building $container"
    # build the image in /tmp, then overwrite any existing .sif
    apptainer build --fakeroot "/tmp/$USER/$name.sif" "$container"
    mv "/tmp/$USER/$name.sif" "$name.sif"
done

Config file

See config file on GitHub

vsc_ugent.config
// Get the hostname and check some values for tier1
def hostname = "doduo"
try {
    hostname = ['/bin/bash', '-c', 'sinfo --local -N -h | head -n 1 | cut -d " " -f1'].execute().text.trim()
} catch (java.io.IOException e) {
    System.err.println("WARNING: Could not run sinfo to determine current cluster, defaulting to doduo")
}
 
def tier1_project = System.getenv("SBATCH_ACCOUNT") ?: System.getenv("SLURM_ACCOUNT")
 
if (! tier1_project && hostname.contains("dodrio")) {
    // Hard-code that Tier 1 cluster dodrio requires a project account
    System.err.println("Please specify your VSC project account with environment variable SBATCH_ACCOUNT or SLURM_ACCOUNT.")
    System.exit(1)
}
 
// Define the Scratch directory
def scratch_dir =   System.getenv("VSC_SCRATCH_PROJECTS_BASE") ? "${System.getenv("VSC_SCRATCH_PROJECTS_BASE")}/$tier1_project" : // Tier 1 scratch
                    System.getenv("VSC_SCRATCH_VO_USER") ?: // VO scratch
                    System.getenv("VSC_SCRATCH") // user scratch
 
// Specify the work directory
workDir = "$scratch_dir/work"
 
// Perform work directory cleanup when the run has successfully completed
cleanup = true
 
// Reduce the job submission rate to about 30 per minute so the scheduler is not flooded with jobs
// Limit queueSize to keep the number of queued jobs under control and avoid timeouts
// Extend the exit read timeout to 3 days to avoid timeouts on Tier 1 clusters
executor {
    submitRateLimit = '30/1min'
    queueSize = 100
    exitReadTimeout = "3day"
}
 
// Add an exponential backoff retry strategy to catch cluster timeouts, and stage files between scratch and the work directory via symlinks (stage-in) and rsync (stage-out)
process {
    stageInMode = "symlink"
    stageOutMode = "rsync"
    errorStrategy = { sleep(Math.pow(2, task.attempt ?: 1) * 200 as long); return 'retry' }
    maxRetries    = 5
    // add GPU support with GPU label
    // Adapted from https://github.com/nf-core/configs/blob/76970da5d4d7eadd8354ef5c5af2758ce187d6bc/conf/leicester.config#L26
    // More info on GPU SLURM options: https://hpc.vub.be/docs/job-submission/gpu-job-types/#gpu-job-types
    withLabel: use_gpu {
        // works on all GPU clusters of Tier 1 and Tier 2
        beforeScript = 'module load cuDNN/8.4.1.50-CUDA-11.7.0'
        // TODO: Support multi-GPU configurations with e.g. ${task.ext.gpus}
        // only add account if present
        clusterOptions = {"--gpus=1" + (tier1_project ? " --account=$tier1_project" : "")}
        containerOptions = {
                // Ensure that the container has access to the GPU
                workflow.containerEngine == "singularity" ? '--nv':
                ( workflow.containerEngine == "docker" ? '--gpus all': null )
            }
    }
}
 
// Specify that singularity should be used and where the cache dir will be for the images
// containerOptions --containall or --no-home can break e.g. downloading big models to ~/.cache
// solutions to error 'no disk space left':
//   1. remove --no-home using NXF_APPTAINER_HOME_MOUNT=true
//   2. increase the memory of the job.
//   3. change the script so the tool does not use the home folder.
//   4. increase the Singularity memory limit using --memory.
singularity {
    enabled = true
    autoMounts = true
    cacheDir = "$scratch_dir/singularity"
}
 
env {
    APPTAINER_TMPDIR="$scratch_dir/.apptainer/tmp"
    APPTAINER_CACHEDIR="$scratch_dir/.apptainer/cache"
}
 
// AWS maximum retries for errors (This way the pipeline doesn't fail if the download fails one time)
aws {
    maxErrorRetry = 3
}
 
// Define profiles for each cluster
profiles {
    skitty {
        params {
            config_profile_description = 'HPC_SKITTY profile for use on the Skitty cluster of the VSC HPC.'
            config_profile_contact = 'ict@cmgg.be'
            config_profile_url = 'https://www.ugent.be/hpc/en'
            max_memory = 177.GB
            max_cpus = 36
            max_time = "3day"
        }
 
        process {
            executor = 'slurm'
            queue = 'skitty'
        }
    }
 
    swalot {
        params {
            config_profile_description = 'HPC_SWALOT profile for use on the Swalot cluster of the VSC HPC.'
            config_profile_contact = 'ict@cmgg.be'
            config_profile_url = 'https://www.ugent.be/hpc/en'
            max_memory = 116.GB
            max_cpus = 20
            max_time = "3day"
        }
 
        process {
            executor = 'slurm'
            queue = 'swalot'
        }
    }
 
    victini {
        params {
            config_profile_description = 'HPC_VICTINI profile for use on the Victini cluster of the VSC HPC.'
            config_profile_contact = 'ict@cmgg.be'
            config_profile_url = 'https://www.ugent.be/hpc/en'
            max_memory = 88.GB
            max_cpus = 36
            max_time = "3day"
        }
 
        process {
            executor = 'slurm'
            queue = 'victini'
        }
    }
 
    kirlia {
        params {
            config_profile_description = 'HPC_KIRLIA profile for use on the Kirlia cluster of the VSC HPC.'
            config_profile_contact = 'ict@cmgg.be'
            config_profile_url = 'https://www.ugent.be/hpc/en'
            max_memory = 738.GB
            max_cpus = 36
            max_time = "3day"
        }
 
        process {
            executor = 'slurm'
            queue = 'kirlia'
        }
    }
 
    doduo {
        params {
            config_profile_description = 'HPC_DODUO profile for use on the Doduo cluster of the VSC HPC.'
            config_profile_contact = 'ict@cmgg.be'
            config_profile_url = 'https://www.ugent.be/hpc/en'
            max_memory = 250.GB
            max_cpus = 96
            max_time = "3day"
        }
 
        process {
            executor = 'slurm'
            queue = 'doduo'
        }
    }
 
    cpu_rome {
        params {
            config_profile_description = 'HPC_DODRIO_cpu_rome profile for use on the Dodrio/cpu_rome cluster of the VSC HPC.'
            config_profile_contact = 'ict@cmgg.be'
            config_profile_url = 'https://www.ugent.be/hpc/en'
            max_memory = 256.GB
            max_cpus = 128
            max_time = "3day"
        }
 
        process {
            executor = 'slurm'
            queue = 'dodrio/cpu_rome'
            clusterOptions = "-A ${tier1_project}"
        }
    }
 
    cpu_rome_512 {
        params {
            config_profile_description = 'HPC_DODRIO_cpu_rome_512 profile for use on the Dodrio/cpu_rome_512 cluster of the VSC HPC.'
            config_profile_contact = 'ict@cmgg.be'
            config_profile_url = 'https://www.ugent.be/hpc/en'
            max_memory = 512.GB
            max_cpus = 128
            max_time = "3day"
        }
 
        process {
            executor = 'slurm'
            queue = 'dodrio/cpu_rome_512'
            clusterOptions = "-A ${tier1_project}"
        }
    }
 
    cpu_milan {
        params {
            config_profile_description = 'HPC_DODRIO_cpu_milan profile for use on the Dodrio/cpu_milan cluster of the VSC HPC.'
            config_profile_contact = 'ict@cmgg.be'
            config_profile_url = 'https://www.ugent.be/hpc/en'
            max_memory = 256.GB
            max_cpus = 128
            max_time = "3day"
        }
 
        process {
            executor = 'slurm'
            queue = 'dodrio/cpu_milan'
            clusterOptions = "-A ${tier1_project}"
        }
    }
 
    gpu_rome_a100_40 {
        params {
            config_profile_description = 'HPC_DODRIO_gpu_rome_a100_40 profile for use on the Dodrio/gpu_rome_a100_40 cluster of the VSC HPC.'
            config_profile_contact = 'ict@cmgg.be'
            config_profile_url = 'https://www.ugent.be/hpc/en'
            max_memory = 256.GB
            max_cpus = 48
            max_time = "3day"
        }
 
        process {
            executor = 'slurm'
            queue = 'dodrio/gpu_rome_a100_40'
            clusterOptions = "-A ${tier1_project}"
        }
    }
 
    gpu_rome_a100_80 {
        params {
            config_profile_description = 'HPC_DODRIO_gpu_rome_a100_80 profile for use on the Dodrio/gpu_rome_a100_80 cluster of the VSC HPC.'
            config_profile_contact = 'ict@cmgg.be'
            config_profile_url = 'https://www.ugent.be/hpc/en'
            max_memory = 512.GB
            max_cpus = 48
            max_time = "3day"
        }
 
        process {
            executor = 'slurm'
            queue = 'dodrio/gpu_rome_a100_80'
            clusterOptions = "-A ${tier1_project}"
        }
    }
 
    debug_rome {
        params {
            config_profile_description = 'HPC_DODRIO_debug_rome profile for use on the Dodrio/debug_rome cluster of the VSC HPC.'
            config_profile_contact = 'ict@cmgg.be'
            config_profile_url = 'https://www.ugent.be/hpc/en'
            max_memory = 256.GB
            max_cpus = 48
            max_time = "3day"
        }
 
        process {
            executor = 'slurm'
            queue = 'dodrio/debug_rome'
            clusterOptions = "-A ${tier1_project}"
        }
    }
 
    cpu_rome_all {
        params {
            config_profile_description = 'HPC_DODRIO_cpu_rome_all profile for use on the Dodrio/cpu_rome_all cluster of the VSC HPC.'
            config_profile_contact = 'ict@cmgg.be'
            config_profile_url = 'https://www.ugent.be/hpc/en'
            max_memory = 250.GB
            max_cpus = 128
            max_time = "3day"
        }
 
        process {
            executor = 'slurm'
            queue = 'dodrio/cpu_rome_all'
            clusterOptions = "-A ${tier1_project}"
        }
    }
 
    gpu_rome_a100 {
        params {
            config_profile_description = 'HPC_DODRIO_gpu_rome_a100 profile for use on the Dodrio/gpu_rome_a100 cluster of the VSC HPC.'
            config_profile_contact = 'ict@cmgg.be'
            config_profile_url = 'https://www.ugent.be/hpc/en'
            max_memory = 384.GB
            max_cpus = 48
            max_time = "3day"
        }
 
        process {
            executor = 'slurm'
            queue = 'dodrio/gpu_rome_a100'
            clusterOptions = "-A ${tier1_project}"
        }
    }
 
    stub {
        params {
            config_profile_description = 'Stub profile for the VSC HPC. Please also specify the `-stub` argument when using this profile.'
            config_profile_contact = 'ict@cmgg.be'
            config_profile_url = 'https://www.ugent.be/hpc/en'
            max_memory = 2.GB
            max_cpus = 1
            max_time = 1.h
        }
    }
}