CRA HPC Configuration

The nf-core pipelines sarek and rnaseq have been tested on the CRA HPC.

Before running the pipeline

  • You need an account on the CRA HPC cluster to run the pipeline.
  • Make sure that Singularity and Nextflow are installed.
  • Download the pipeline Singularity images to the HPC system using nf-core tools:
$ conda install nf-core
$ nf-core download
  • Specify a Singularity cache directory in your ~/.bashrc. Container images are then stored in this directory and reused instead of being downloaded every time you run a pipeline. Because space in the home directory is limited, placing the cache on the Lustre file system is recommended.
export NXF_SINGULARITY_CACHEDIR="/lustre/fs0/storage/yourCRAAccount/cache_dir"
  • Download the iGenomes reference so that the pipeline can use a local copy.
$ aws s3 --no-sign-request --region eu-west-1 sync s3://ngi-igenomes/igenomes/Homo_sapiens/GATK/GRCh38/ /lustre/fs0/storage/yourCRAAccount/references/Homo_sapiens/GATK/GRCh38/
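
The setup steps above can be sketched as two small shell helpers. This is only an illustration: the function names `check_tools` and `cache_export_line` are hypothetical, and the default path reuses the `yourCRAAccount` placeholder from this page, which you would replace with your own account directory.

```shell
# Sketch only: CACHE_DIR reuses the placeholder path from this page.
CACHE_DIR="${CACHE_DIR:-/lustre/fs0/storage/yourCRAAccount/cache_dir}"

# check_tools (hypothetical helper): report every command that is not on PATH
# and return non-zero if at least one is missing.
check_tools() {
  local missing=0
  for tool in "$@"; do
    if ! command -v "$tool" >/dev/null 2>&1; then
      echo "missing: $tool"
      missing=1
    fi
  done
  return "$missing"
}

# cache_export_line (hypothetical helper): print the line to append to ~/.bashrc.
# Note that there are no spaces around '=' in a shell assignment.
cache_export_line() {
  printf 'export NXF_SINGULARITY_CACHEDIR="%s"\n' "$1"
}

# Possible usage (commented out so that sourcing this file changes nothing):
#   check_tools singularity nextflow
#   mkdir -p "$CACHE_DIR"
#   cache_export_line "$CACHE_DIR" >> ~/.bashrc
```

Keeping the helpers free of side effects lets you check the generated line before it is written to ~/.bashrc.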

Running the pipeline using the adcra config profile

  • Run the pipeline within a screen or tmux session.
  • Specify the config profile with -profile adcra.
  • Storing results (--outdir) and intermediate files (-work-dir) on the Lustre file system is recommended.
nextflow run /path/to/nf-core/<pipeline-name> -profile adcra \
--genome GRCh38 \
--igenomes_base /path/to/genome_references/ \
... # the rest of pipeline flags
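
As a sketch of how the recommendations above combine, the helper below assembles a command that keeps both results and work directory under one Lustre location and prints it instead of running it. The function name `build_run_cmd` and the `run1` directory layout are assumptions for illustration; `--outdir` and `-work-dir` are the options named above, and any pipeline-specific flags still have to be added.

```shell
# build_run_cmd (hypothetical helper): print a nextflow command for the adcra profile.
# $1 = path to the pipeline, $2 = base directory on Lustre for this run.
build_run_cmd() {
  local pipeline="$1" base="$2"
  printf 'nextflow run %s -profile adcra --outdir %s/results -work-dir %s/work\n' \
    "$pipeline" "$base" "$base"
}

# Print the command for inspection; run it inside a screen or tmux session.
build_run_cmd /path/to/nf-core/sarek /lustre/fs0/storage/yourCRAAccount/run1
```

Printing the command first makes it easy to confirm that neither directory points at the space-limited home directory.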

Config file

See config file on GitHub

adcra.config
/*
* --------------------------------------------------------------
*   nf-core pipelines config file for AD project using CRA HPC
* --------------------------------------------------------------
*/
 
params {
  config_profile_name = 'adcra'
  config_profile_description = 'CRA HPC profile provided by nf-core/configs'
  config_profile_contact = 'Kalayanee Chairat (@kalayaneech)'
  config_profile_url = 'https://bioinformatics.kmutt.ac.th/'
}
  
params {  
  max_cpus = 16
  max_memory = 128.GB
  max_time = 120.h
}
 
// Specify the job scheduler
executor {  
  name = 'slurm'
  queueSize = 20
  submitRateLimit = '6/1min'
}
 
singularity {
  enabled = true
  autoMounts = true
}
 
process {
  scratch = true
  queue = 'unlimit'
  queueStatInterval = '10 min'
  maxRetries = 3
  errorStrategy = { task.attempt <= 3 ? 'retry' : 'finish' }
  cache = 'lenient'
  exitStatusReadTimeoutMillis = '2700000'
}
 