Nextflow supports fetching nearly everything it needs to run a pipeline over the web automatically: pipeline code, software requirements, reference genomes, and even remote data sources.
Nextflow can run your analysis on a system that has no internet connection. There are just a few extra steps required to get everything you need available locally.
You will need to fetch everything on a system that does have an internet connection (typically your personal computer), then transfer the files to your offline system using whatever method you have available.
First of all, you need Nextflow installed on your offline system. Do this by installing it on a machine that does have an internet connection, and then transferring it to the offline system.
- Start by installing Nextflow locally. Do not use the `-all` package, as this does not allow the use of custom plugins.
- Kick off a pipeline locally so that Nextflow fetches the required plugins. It does not need to run to completion.
- Copy the Nextflow binary and the `$HOME/.nextflow` folder to your offline environment.
- In your Nextflow configuration file, specify each plugin that you downloaded, both name and version, including default plugins. This will prevent Nextflow from trying to download newer versions of plugins.
- Add the following environment variable to your `~/.bashrc` (or equivalent) on the offline system, so that Nextflow does not attempt to connect to the internet: `export NXF_OFFLINE='true'`
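The plugin-pinning step above might look like the following `nextflow.config` fragment. The plugin names and versions here are illustrative; list the exact plugins and versions your trial run fetched, which you can find under `$HOME/.nextflow/plugins`.

```groovy
// Illustrative nextflow.config fragment: pin every plugin (including
// defaults) to the exact versions you copied across, so Nextflow never
// tries to resolve newer ones online. Names and versions are examples.
plugins {
    id 'nf-amazon@2.1.4'
    id 'nf-tower@1.6.3'
}
```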
To run a pipeline offline, you need the pipeline code, the software dependencies, and the shared nf-core/configs configuration profiles. We have created a helper tool as part of the nf-core package to automate this for you.
On a computer with an internet connection, run `nf-core download <pipeline>` to download the pipeline and config profiles.
Add the argument `--container singularity` to also fetch the Singularity container(s).
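A full invocation might look like this; the pipeline name and revision below are examples only, and the `--revision` flag is how a specific release is selected.

```bash
# Run on the machine with internet access.
# Pipeline and revision are illustrative examples.
nf-core download nf-core/rnaseq \
    --revision 3.14.0 \
    --container singularity
```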
The pipeline and requirements will be downloaded, configured with their relative paths, and packaged into a `.tar.gz` file by default.
This can then be transferred to your offline system and unpacked.
Inside, you will see directories called `workflow` (the pipeline files), `config` (a copy of nf-core/configs), and (if you used `--container singularity`) a directory containing the Singularity images.
The pipeline code is adjusted by the download tool to expect these relative paths, so as long as you keep them together it should work as is.
To run the pipeline, simply use `nextflow run <download_directory>/workflow [pipeline flags]`.
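Putting these steps together, a run on the offline system might look like the sketch below. The archive name, profile, and pipeline parameters are all assumptions for illustration; use the ones for your actual download and pipeline.

```bash
# Unpack the transferred archive (name is an example), then launch
# the workflow directory with example nf-core-style parameters.
tar -xzf nf-core-rnaseq-3.14.0.tar.gz
nextflow run nf-core-rnaseq-3.14.0/workflow \
    -profile singularity \
    --input samplesheet.csv \
    --outdir results
```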
If you are downloading directly to the offline storage (e.g. a head node with internet access whilst compute nodes are offline), you can use the `--singularity-cache-only` option for `nf-core download` and set the `$NXF_SINGULARITY_CACHEDIR` environment variable.
This downloads the Singularity images to the `$NXF_SINGULARITY_CACHEDIR` folder and does not copy them into the target downloaded pipeline folder. This reduces total disk space usage and is faster.
For more information, see the documentation for `nf-core download`.
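On a head node, setting up the shared cache might look like this sketch. The cache path is an example; pick a location that your compute nodes can read.

```shell
# Create a shared Singularity image cache and tell Nextflow about it.
# The path below is illustrative.
export NXF_SINGULARITY_CACHEDIR="$HOME/singularity-cache"
mkdir -p "$NXF_SINGULARITY_CACHEDIR"

# Images then land in the cache rather than in each pipeline download:
#   nf-core download <pipeline> --container singularity --singularity-cache-only
```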
Some pipelines require reference genomes and have built-in integration with AWS-iGenomes. If you wish to use these references, you must download and transfer them to your offline cluster. Once transferred, follow the reference genomes documentation to configure the base path for the references.
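Once the references are transferred, the base path can typically be set in a custom config file. The sketch below assumes the `igenomes_base` parameter used by nf-core pipelines with iGenomes support; the path is an example for your own storage layout.

```groovy
// Custom config fragment: point the pipeline at the local iGenomes copy.
// The path is an example; use wherever you placed the references.
params {
    igenomes_base = '/data/igenomes'
}
```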
Here is a bytesize talk explaining the necessary steps to run pipelines offline.