Material originally written for the Nextflow Camp 2019, Barcelona 2019-09-19: “Getting started with nf-core” (see programme).
Updated for the nf-core Hackathon 2020, London 2020-03 (see event).
Updated for the Elixir workshop in November 2021 (see event).
Updated during the March 2022 hackathon.
The nf-core community provides a range of tools to help new users get to grips with Nextflow - both by providing complete pipelines that can be used out of the box, and also by helping developers with best practices.
Companion tools can create a bare-bones pipeline from a template scattered with TODO pointers, and CI with linting tools checks code quality.
Guidelines and documentation help to get Nextflow newbies on their feet in no time.
Best of all, the nf-core community is always on hand to help.
In this tutorial, we discuss the best-practice guidelines developed by the nf-core community, explain why they’re important, and give insight into the best tips and tricks for budding Nextflow pipeline users. ✨
What is nf-core
nf-core is a community-led project to develop a set of best-practice pipelines built using Nextflow. Pipelines are governed by a set of guidelines, enforced by community code reviews and automatic linting (code testing). A suite of helper tools aims to help people run and develop pipelines.
What this tutorial will cover
This tutorial attempts to give an overview of how nf-core works:
- What are the most commonly used nf-core tools.
- Listing pipelines available in the nf-core project.
- How to run nf-core pipelines.
- How to troubleshoot nf-core pipelines.
Where to get help
The beauty of nf-core is that there is lots of help on offer! The main place for this is Slack - an instant messaging service.
One additional tool which we like a lot is TLDR - it gives concise command line reference through example commands for most Linux tools, including git and more.
There are many clients, but raylee/tldr is arguably the simplest - just a single bash script.
Installing the nf-core helper tools
Much of this tutorial will make use of the nf-core command line tool.
This has been developed to provide a range of additional functionality for the project such as pipeline creation, testing and more.
Or this command to install the
If using conda, first set up Bioconda as described in the bioconda docs (especially setting the channel order), create and activate an environment and then install nf-core:
To update the package you can run the following command
The nf-core/tools source code is available at https://github.com/nf-core/tools - if you prefer, you can clone this repository and install the code locally:
Once installed, you can check that everything is working by printing the help:
You will also need to install Prettier for formatting your code. To do so, you can either use the following command with conda:
Alternatively, you can add a comment containing @nf-core-bot fix linting to your pull request, and Prettier will be used to apply the required fixes to your code.
Exercise 1 (installation)
- Install nf-core/tools
- Use the help flag to list the available commands
Listing available nf-core pipelines
As you saw from the --help output, the tool has a range of sub-commands.
The simplest is nf-core list, which lists all available nf-core pipelines.
The output shows the latest version number and when it was released.
If the pipeline has been pulled locally using Nextflow, it tells you when that was and whether you have the latest version.
If you supply additional keywords after the command, the listed pipelines will be filtered.
Note that this searches more than just the displayed output, including keywords and description text.
The --sort flag allows you to sort the list (default is by most recently released), and --json returns the complete list, without any filtering, as JSON for programmatic use.
The nf-core pipelines currently available and under development are also listed on the nf-core website, in the pipelines page.
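As a rough illustration of what "programmatic use" of the JSON output could look like, the sketch below filters and sorts a list of pipeline records in Python. The record layout (name, stars, description) and the sample values are hypothetical stand-ins, not the actual schema written by nf-core list --json, so the field names would need adapting to the real output.

```python
import json

# Hypothetical sample data: the real `nf-core list --json` schema may differ,
# and the star counts below are made up for the example.
SAMPLE_JSON = """
[
  {"name": "rnaseq", "stars": 500, "description": "RNA sequencing analysis"},
  {"name": "sarek", "stars": 200, "description": "Variant calling from WGS data"},
  {"name": "smrnaseq", "stars": 40, "description": "Small RNA sequencing analysis"}
]
"""


def filter_pipelines(records, keyword):
    """Keep records whose name or description mentions the keyword (case-insensitive)."""
    needle = keyword.lower()
    return [
        rec for rec in records
        if needle in rec["name"].lower() or needle in rec["description"].lower()
    ]


def sort_by_stars(records):
    """Return records ordered from most to fewest stars."""
    return sorted(records, key=lambda rec: rec["stars"], reverse=True)


pipelines = json.loads(SAMPLE_JSON)
rna_pipelines = sort_by_stars(filter_pipelines(pipelines, "rna"))
print([rec["name"] for rec in rna_pipelines])
```

In practice the JSON text would come from a file saved from the command's output rather than from an inline string.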
Exercise 2 (listing pipelines)
- Use the help flag to print the list command usage
- List all available nf-core pipelines
- Sort pipelines alphabetically, then by popularity (stars)
- Fetch one of the pipelines, then use nf-core list to see if the pipeline you pulled is up to date
- Filter pipelines for those that work with RNA
- Save these pipeline details to a JSON file
Running nf-core pipelines
Software requirements for nf-core pipelines
In order to run nf-core pipelines, you will need to have Nextflow installed (https://www.nextflow.io). The only other requirement is a software packaging tool: Conda, Docker or Singularity. In theory it is possible to run the pipelines with software installed by other methods (e.g. environment modules, or manual installation), but this is not recommended. Most people find either Docker or Singularity containers the best options, as conda environments cannot guarantee 100% reproducibility.
Fetching pipeline code
Unless you are actively developing pipeline code, we recommend using the Nextflow built-in functionality to fetch nf-core pipelines.
Nextflow will automatically fetch the pipeline code when you run nextflow run nf-core/PIPELINE.
For the best reproducibility, it is good to explicitly reference the pipeline version number that you wish to use.
If not specified, Nextflow will fetch the default branch.
For pipelines with a stable release, the default branch is master; this branch contains code from the latest release.
For pipelines in early development that don’t have any releases, the default branch is
If you would like to run the latest development code, use
Note that once pulled, Nextflow will use the local cached version for subsequent runs.
Use the -latest flag when running the pipeline to always fetch the latest version.
Alternatively, you can force Nextflow to pull a pipeline again using the nextflow pull command.
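The fetching options described in this section can be sketched as a small Python helper that assembles the corresponding command lines. This is only an illustration: it assumes that Nextflow's -r (revision) option is the way to pin a version, the helper names are made up for this example, and "3.4" is a placeholder rather than a recommended release.

```python
def build_run_command(pipeline, revision=None, latest=False):
    """Assemble a `nextflow run` argument list for an nf-core pipeline."""
    command = ["nextflow", "run", f"nf-core/{pipeline}"]
    if revision is not None:
        # -r pins the run to a specific release tag or branch.
        command += ["-r", revision]
    if latest:
        # -latest asks Nextflow to refresh the cached copy before running.
        command.append("-latest")
    return command


def build_pull_command(pipeline):
    """Assemble the `nextflow pull` argument list that refreshes the local cache."""
    return ["nextflow", "pull", f"nf-core/{pipeline}"]


# "3.4" is a placeholder revision used only for illustration.
print(" ".join(build_run_command("rnaseq", revision="3.4")))
print(" ".join(build_pull_command("rnaseq")))
```

On a machine with Nextflow installed, either list could be handed to subprocess.run(command, check=True); the sketch itself only builds and prints the commands.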
Usage instructions and documentation
You can find general documentation and instructions for Nextflow and nf-core on the nf-core website: https://nf-co.re/.
Pipeline-specific documentation is bundled with each pipeline in the
This can be read either locally, on GitHub, or on the nf-core website.
Each pipeline has its own webpage at https://nf-co.re/<pipeline_name> (e.g. nf-co.re/rnaseq), including output documentation.
In addition to this documentation, each pipeline comes with a basic command-line reference.
This can be seen by running the pipeline with the --help flag.
Example results of a pipeline run on full-sized test data can be browsed on the pipeline page, under the aws results tab.
Nextflow can load pipeline configurations from multiple locations. To make it easy to apply a group of options on the command line, Nextflow uses the concept of config profiles. nf-core pipelines load configuration in the following order:
- Pipeline: Default ‘base’ config
  - Always loaded. Contains pipeline-specific parameters and “sensible defaults” for things like computational requirements.
  - Does not specify any method for software packaging. If nothing else is specified, Nextflow will expect all software to be available on the command line.
- Pipeline: Core config profiles
  - All nf-core pipelines come with some generic config profiles. The most commonly used ones are for software packaging, e.g. conda. To ensure reproducibility across different compute infrastructures, it is recommended to use containers instead of conda environments.
  - Other core profiles include test.
- nf-core/configs: Server profiles
  - At run time, nf-core pipelines fetch configuration profiles from the configs remote repository. The profiles here are specific to clusters at different institutions.
  - Because this is loaded at run time, anyone can add a profile here for their system and it will be immediately available for all nf-core pipelines.
- Personal configuration under ~/.nextflow/config
- Local config files given to Nextflow with the -c flag
- Command line configuration.
Multiple comma-separated config profiles can be specified in one go, e.g. -profile test,docker.
Note that the order in which config profiles are specified matters. Their priority increases from left to right.
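The left-to-right priority rule can be pictured with a toy model in Python, where each profile is a dictionary of settings and profiles listed later overwrite values set by earlier ones. The profile contents and the my_cluster name below are invented for illustration, and this is not how Nextflow implements configuration internally.

```python
# Hypothetical profile contents, for illustration only.
PROFILES = {
    "test": {"input": "test_samplesheet.csv", "max_cpus": 2},
    "docker": {"docker_enabled": True},
    "my_cluster": {"max_cpus": 16, "executor": "slurm"},
}


def resolve_profiles(profile_arg, profiles=PROFILES):
    """Merge comma-separated profiles; profiles further right take priority."""
    merged = {}
    for name in profile_arg.split(","):
        merged.update(profiles[name])  # later profiles overwrite earlier keys
    return merged


print(resolve_profiles("test,my_cluster")["max_cpus"])  # 16
print(resolve_profiles("my_cluster,test")["max_cpus"])  # 2
```

Swapping the order of the two profiles changes which max_cpus value wins, which is why the order given on the command line matters.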
Our tip: Be clever with multiple Nextflow configuration locations. For example, use -profile for your cluster configuration, ~/.nextflow/config for your personal config such as params.email, and a working directory custom.config file (provided to the run with -c custom.config) for reproducible run-specific configuration.
To know more about Nextflow configurations you can check the pipeline configuration tutorial.
Running pipelines with test data
The test config profile is a bit of a special case.
Whereas all other config profiles tell Nextflow how to run on different computational systems, the test profile configures each nf-core pipeline to run without any other command line flags.
It specifies URLs for test data and all required parameters.
Because of this, you can test any nf-core pipeline by running it with -profile test.
Note that you will typically still need to combine this with a configuration profile for your system, e.g. -profile test,docker. Running with the test profile is a great way to confirm that you have Nextflow configured properly for your system before attempting to run with real data.
The nf-core launch command
Most nf-core pipelines have a number of flags that need to be passed on the command line: some mandatory, some optional.
To make it easier to launch pipelines, these parameters are described in a JSON file bundled with the pipeline.
The nf-core launch command uses this to build an interactive command-line wizard which walks through the different options with descriptions of each, showing the default value and prompting for values.
Once all prompts have been answered, non-default values are saved to a params.json file which can be supplied to Nextflow to run the pipeline. Optionally, the Nextflow command can be launched there and then.
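The idea of saving only non-default values can be sketched in a few lines of Python. This is not the code used by nf-core launch; the parameter names, defaults and answers are hypothetical, and the file is written to a temporary directory purely for demonstration.

```python
import json
import tempfile
from pathlib import Path

# Hypothetical defaults and answers; real pipelines define their own parameters.
DEFAULTS = {"aligner": "star", "skip_qc": False, "email": None}
ANSWERS = {"aligner": "hisat2", "skip_qc": False, "email": "user@example.com"}


def non_default_params(answers, defaults):
    """Return only the answers that differ from the pipeline defaults."""
    return {key: value for key, value in answers.items() if defaults.get(key) != value}


def save_params(params, path):
    """Write the parameters as JSON and return the path that was written."""
    path = Path(path)
    path.write_text(json.dumps(params, indent=2))
    return path


with tempfile.TemporaryDirectory() as tmp_dir:
    params_file = save_params(non_default_params(ANSWERS, DEFAULTS), Path(tmp_dir) / "params.json")
    print(params_file.read_text())
```

A file like this can then be passed to a run with Nextflow's -params-file option, so that only the values that were changed need to be recorded alongside the command.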
To use the launch feature, just specify the pipeline name:
Using nf-core pipelines offline
Many of the techniques and resources described above require an active internet connection at run time - pipeline files, configuration profiles and software containers are all dynamically fetched when the pipeline is launched. This can be a problem for people using secure computing resources that do not have connections to the internet.
To help with this, the nf-core download command automates the fetching of required files for running nf-core pipelines offline.
The command can download a specific release of a pipeline with --release and fetch the Singularity container if --singularity is passed (this needs Singularity to be installed).
All files are saved to a single directory, ready to be transferred to the cluster where the pipeline will be executed.
To know more about running pipelines offline you can check the pipeline configuration tutorial.
Exercise 3 (using pipelines)
- Install required dependencies (Nextflow, Docker)
- Print the command-line usage instructions for the nf-core/rnaseq pipeline
- In a new directory, run the nf-core/rnaseq pipeline with the provided test data
- Try launching the RNA pipeline using the nf-core launch command
- Download the nf-core/rnaseq pipeline for offline use using the nf-core download command
Troubleshooting nf-core pipelines
Not everything always runs smoothly and you might be getting some errors when running nf-core pipelines. Here are some step-by-step tips that can help you troubleshoot your errors.
- Start small: each nf-core pipeline comes with small test data that are checked by continuous integration and for each pipeline release.
- Start by running the pipeline tests as described above. If these tests fail, there is a good chance that you are missing some of the components needed to run Nextflow pipelines.
- Nextflow: check that you have the latest version installed.
- Check that you have docker/singularity/conda installed and that you are using the right docker/singularity/conda/custom profile.
- Check the troubleshooting docs.
- Categorize the type of error. Check the Nextflow log to figure out if the error occurs:
- Before the first process
- In the first process
- During the pipeline run
- Problems with the process output
- Read the Nextflow log. Check the work directory for the .command.log files for more information.
- Search the nf-core Slack and Google. Ask for help in the corresponding nf-core Slack channel.
- Report a pipeline bug on the nf-core GitHub if none of the above steps helps.
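For the log-reading step in the list above, a small Python sketch can collect every .command.log file below a work directory and show the last lines of each. The helper names are made up for this example, and the demonstration runs on a throwaway directory whose layout only mimics a real work directory.

```python
import tempfile
from pathlib import Path


def find_command_logs(work_dir):
    """Return every .command.log file below the work directory, sorted by path."""
    return sorted(Path(work_dir).rglob(".command.log"))


def tail(path, n_lines=20):
    """Return the last n_lines lines of a text file."""
    lines = Path(path).read_text(errors="replace").splitlines()
    return lines[-n_lines:]


# Demonstration on a throwaway directory that mimics a work directory layout.
with tempfile.TemporaryDirectory() as tmp_dir:
    task_dir = Path(tmp_dir) / "ab" / "123456"
    task_dir.mkdir(parents=True)
    (task_dir / ".command.log").write_text("step 1\nstep 2\nerror: something failed\n")
    for log_file in find_command_logs(tmp_dir):
        print(log_file.name, tail(log_file, n_lines=1))
```

Pointing find_command_logs at the work directory of a failed run gives a quick overview of which task logs exist before opening them one by one.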
Here is a bytesize talk with a step-by-step explanation of how to troubleshoot failing pipelines.
We hope that this nf-core tutorial has been helpful! Remember that there is more in-depth documentation on many of these topics available on the nf-core website. If in doubt, please ask for help on Slack.
If you have any suggestions for how to improve this tutorial, or spot any mistakes, please create an issue or pull request on the nf-core/website repository.