nf-core/configs: sage

The Sage Bionetworks Nextflow Config Profile

nf-core/configs: Sage Bionetworks Global Configuration

To use this custom configuration, run the pipeline with -profile sage. This will download and load the sage.config, which contains a number of optimizations relevant to Sage employees running workflows on AWS (e.g. using Nextflow Tower). This profile will also load any applicable pipeline-specific configuration.

This global configuration includes the following tweaks:

- Update the default value for igenomes_base to s3://sage-igenomes/igenomes
- Enable retries for all failures
- Allow pending jobs to finish once the retries are exhausted
- Increase resource allocations for specific resource-related exit codes
- Optimize resource allocations to better “fit” EC2 instance types
- Slow the increase in the number of allocated CPU cores on retries
- Increase the default time limits because we run pipelines on AWS
- Increase the amount of time allowed for file transfers
- Improve reliability of file transfers with retries and reduced concurrency
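
To make the retry-related tweaks concrete, the sketch below reproduces the profile's factor() helper as a standalone Groovy script and prints what a process_medium task (8 CPUs and 32 GB on the first attempt) would request on later attempts. The task object is mocked as a plain map, and mediumRequest is a hypothetical helper that only approximates how the profile's resourceLimits (32 CPUs, 128 GB) cap a request; neither is part of the profile itself.

```groovy
// Standalone copy of the profile's factor() helper, runnable with plain Groovy.
// `task` is mocked as a Map here; in Nextflow it is the real task object.
def factor(task, slow_factor = 1) {
    if (task.exitStatus in [143, 137, 104, 134, 139, 247]) {
        // Resource-related exit code: scale with the attempt number,
        // divided by slow_factor so that CPUs grow more slowly than memory
        return Math.ceil(task.attempt / slow_factor) as int
    } else {
        // Any other failure is retried with the base request
        return 1 as int
    }
}

// Hypothetical helper (not in the profile): requests for a process_medium task,
// capped at the profile's resourceLimits of 32 CPUs and 128 GB.
def mediumRequest(task) {
    [
        cpus    : Math.min(8 * factor(task, 2), 32),
        memoryGB: Math.min(32 * factor(task, 1), 128)
    ]
}

// Retries that follow a resource-related exit code (137 is used as the example)
(2..5).each { attempt ->
    def req = mediumRequest([attempt: attempt, exitStatus: 137])
    println "attempt ${attempt} after exit 137: ${req.cpus} CPUs, ${req.memoryGB} GB"
}

// A retry that follows an unrelated exit code keeps the base request
def other = mediumRequest([attempt: 3, exitStatus: 1])
println "attempt 3 after exit 1: ${other.cpus} CPUs, ${other.memoryGB} GB"
```

Under these assumptions, memory grows by one base unit per attempt (64, 96, then 128 GB), while the CPU count only steps up on every second attempt (8, 16, 16, 24), which is the "slow" CPU increase listed above. On attempt 5 the uncapped memory request would be 160 GB, so the cap holds it at 128 GB.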

Additional information about iGenomes

The following iGenomes prefixes have been copied from s3://ngi-igenomes/ (eu-west-1) to s3://sage-igenomes (us-east-1). See this script for more information. The sage-igenomes S3 bucket is configured to be openly available, but files cannot be downloaded outside of us-east-1, which avoids egress charges. To find the mapping between genome IDs (i.e. the values accepted by --genome) and iGenomes prefixes, check the conf/igenomes.config file in each nf-core pipeline (example).

Human Genome Builds
- Homo_sapiens/Ensembl/GRCh37
- Homo_sapiens/GATK/GRCh37
- Homo_sapiens/UCSC/hg19
- Homo_sapiens/GATK/GRCh38
- Homo_sapiens/NCBI/GRCh38
- Homo_sapiens/UCSC/hg38
Mouse Genome Builds
- Mus_musculus/Ensembl/GRCm38
- Mus_musculus/UCSC/mm10
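
As a rough illustration of the genome ID mapping, nf-core pipelines typically define each genome ID in conf/igenomes.config as a block of paths built from params.igenomes_base. The fragment below sketches that layout and is not a copy of any particular pipeline's file: the two genome IDs shown and their fasta paths are assumptions that should be checked against the pipeline you actually run.

```groovy
// Illustrative sketch only: verify the entries against your pipeline's conf/igenomes.config.
// params.igenomes_base is expected to be set already (the sage profile points it at the mirror).
params {
    genomes {
        // Assumed mapping: --genome GRCh38 resolves to the Homo_sapiens/NCBI/GRCh38 prefix
        'GRCh38' {
            fasta = "${params.igenomes_base}/Homo_sapiens/NCBI/GRCh38/Sequence/WholeGenomeFasta/genome.fa"
        }
        // Assumed mapping: --genome GRCm38 resolves to the Mus_musculus/Ensembl/GRCm38 prefix
        'GRCm38' {
            fasta = "${params.igenomes_base}/Mus_musculus/Ensembl/GRCm38/Sequence/WholeGenomeFasta/genome.fa"
        }
    }
}
```

If a genome ID in your pipeline resolves to a prefix that is not in the lists above, the corresponding files are not available in the us-east-1 mirror.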

Config file

See config file on GitHub

// Config profile metadata
params {
    config_profile_description = 'The Sage Bionetworks Nextflow Config Profile'
    config_profile_contact     = 'Bruno Grande (@BrunoGrandePhD)'
    config_profile_url         = 'https://github.com/Sage-Bionetworks-Workflows'

    // Leverage us-east-1 mirror of select human and mouse genomes
    igenomes_base              = 's3://sage-igenomes/igenomes'
    cpus                       = 4
    max_cpus                   = 32
    max_memory                 = 128.GB
    max_time                   = 240.h
    single_cpu_mem             = 6.GB
}

// Increase time limit to allow file transfers to finish
// The default is 12 hours, which results in timeouts
threadPool.FileTransfer.maxAwait = '24 hour'

// Configure Nextflow to be more reliable on AWS
aws {
    region = "us-east-1"
    client {
        uploadMaxThreads = 4
    }
    batch {
        retryMode            = 'built-in'
        maxParallelTransfers = 1
        maxTransferAttempts  = 10
        delayBetweenAttempts = '60 sec'
    }
}

// Adjust default resource allocations (see `../docs/sage.md`)

process {

    resourceLimits = [
        memory: 128.GB,
        cpus: 32,
        time: 240.h
    ]

    maxErrors      = '-1'
    maxRetries     = 5
    // Retry every failed task for the first 5 attempts, then let pending jobs finish
    errorStrategy  = { task.attempt <= 5 ? 'retry' : 'finish' }

    cpus           = { 1 * factor(task, 2) }
    memory         = { 6.GB * factor(task, 1) }
    time           = { 24.h * factor(task, 1) }

    // Process-specific resource requirements
    withLabel: process_single {
        cpus   = { 1 * factor(task, 2) }
        memory = { 6.GB * factor(task, 1) }
        time   = { 24.h * factor(task, 1) }
    }
    withLabel: process_low {
        cpus   = { 2 * factor(task, 2) }
        memory = { 12.GB * factor(task, 1) }
        time   = { 24.h * factor(task, 1) }
    }
    withLabel: process_medium {
        cpus   = { 8 * factor(task, 2) }
        memory = { 32.GB * factor(task, 1) }
        time   = { 48.h * factor(task, 1) }
    }
    withLabel: process_high {
        cpus   = { 16 * factor(task, 2) }
        memory = { 64.GB * factor(task, 1) }
        time   = { 96.h * factor(task, 1) }
    }
    withLabel: process_long {
        time = { 96.h * factor(task, 1) }
    }
    withLabel: 'process_high_memory|memory_max' {
        memory = { 128.GB * factor(task, 1) }
    }
    withLabel: cpus_max {
        cpus = { 32 * factor(task, 2) }
    }
}

// Function to finely control the increase of the resource allocation
def factor(task, slow_factor = 1) {
    if ( task.exitStatus in [143,137,104,134,139,247] ) {
        return Math.ceil( task.attempt / slow_factor) as int
    } else {
        return 1 as int
    }
}
