Introduction

To keep all the nf-core pipelines up-to-date with the latest version of the community standards, we have implemented a synchronisation tool. This ensures that updates to the community standards are propagated to all nf-core pipelines.

There are three topics covered in this documentation page:

  1. Merging automated PRs
  2. Manual synchronisation
  3. Setting up a pipeline for syncing retrospectively

How template synchronisation works

The nf-core helper tools have a subcommand for synchronising a pipeline with the nf-core template (nf-core sync). Although this can be run manually, it is usually only used by the GitHub Actions automation: when a new version of nf-core/tools is released it runs for all nf-core pipelines and automatically opens pull-requests (PRs) with the necessary changes required to update every pipeline. These pull requests then need to be manually resolved and merged by the pipeline maintainers.

Behind the scenes, this synchronisation is done using git. Each repository has a special TEMPLATE branch which contains only the "vanilla" code made by the nf-core create tool. The synchronisation tool fetches the essential variables needed to recreate the pipeline and uses them to run an nf-core create --no-git command with the latest version of the template. The result is then compared against what is stored in the TEMPLATE branch and committed. When merging from the TEMPLATE branch back into the main dev branch of the pipeline, git should be able to work out what has changed since the template was first used, and therefore present only the relevant changes.

For this to work in practice, the TEMPLATE branch needs to have a shared git history with the master branch of the pipeline. The nf-core create command initially does this by enforcing a first commit to the master branch before any development has taken place. If the pipeline was not created by the nf-core create command, this has to be set up manually. For instructions on this, see Setting up a pipeline for syncing retrospectively.
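The effect of a shared history can be seen in a small throwaway repository. The following is a minimal sketch, not part of the nf-core tooling: the file contents, commit messages and the write_file helper are invented for illustration, and the only assumption is that git is installed.

```shell
# Sandbox sketch: a TEMPLATE branch that shares its first commit with the main branch.
set -eu
sandbox="$(mktemp -d)"
cd "$sandbox"
git init --quiet .
git config user.name "Example User"
git config user.email "user@example.com"

# Hypothetical helper: writes a six-line file whose first and last lines vary.
write_file() {
    printf '%s\nstep_1\nstep_2\nstep_3\nstep_4\n%s\n' "$1" "$2" > main.nf
}

# First commit: the "vanilla" template output, before any development.
write_file "process_a" "template_setting = 1"
git add main.nf
git commit --quiet -m "Initial template commit"
main_branch="$(git symbolic-ref --short HEAD)"  # master or main, depending on git config
git branch TEMPLATE                             # TEMPLATE starts from that same commit

# Pipeline development happens on the main branch.
write_file "my_custom_process" "template_setting = 1"
git commit --quiet -am "Pipeline development"

# A template update lands on the TEMPLATE branch only.
git checkout --quiet TEMPLATE
write_file "process_a" "template_setting = 2"
git commit --quiet -am "Template update"

# Merging back: the shared first commit is the merge base, so git applies
# only the template change and keeps the pipeline's own edit.
git checkout --quiet "$main_branch"
git merge --quiet --no-edit TEMPLATE
first_line="$(head -n 1 main.nf)"
last_line="$(tail -n 1 main.nf)"
echo "$first_line / $last_line"

cd /
rm -rf "$sandbox"
```

Because both branches descend from the same first commit, the merge keeps the pipeline's customisation and picks up the template update without any manual work.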

Merging automated PRs

When a new release of tools is created, each pipeline will get an automated pull-request (PR) opened to merge the changes to the template into the pipeline.

If there are no merge conflicts on the PR, then that's great! If you are happy with the changes, feel free to just merge it into the dev branch directly. However, it is quite likely that the PR is large and has a lot of merge conflicts. You will have to resolve and merge these manually. Sorry about this, but there's no way around it.

You should not be actively working on the main nf-core repository, so we need to bring these changes to your personal fork. The steps we need to do are:

  1. Pull the nf-core/<pipeline> TEMPLATE changes to your fork
  2. Resolve the merge conflicts
  3. Push these updates to your fork on GitHub
  4. Make a PR from your fork to the main nf-core repo

Once you have committed and pushed the updates to your fork and merged them into the nf-core repository, the automated PR will close itself and show as merged. You will not need to touch it.

Pull the changes to your fork

On the command line, go to the directory where you have checked out your fork of the pipeline repository. Add the main nf-core repository as a git remote called upstream:

git remote add upstream https://github.com/nf-core/<pipeline>.git

Next, check out a new branch to make these changes in:

git checkout -b merging-template-updates

Finally, pull the TEMPLATE branch from the upstream repo:

git pull upstream TEMPLATE

Resolving merge conflicts

You will probably get a tonne of log messages telling you about merge conflicts:

$ git pull upstream TEMPLATE

remote: Enumerating objects: 33, done.
remote: Counting objects: 100% (33/33), done.
remote: Compressing objects: 100% (18/18), done.
remote: Total 33 (delta 15), reused 33 (delta 15), pack-reused 0
Unpacking objects: 100% (33/33), done.
From github.com:nf-core/rnaseq
 * branch            TEMPLATE   -> FETCH_HEAD
   55d617e..2d7814a  TEMPLATE   -> upstream/TEMPLATE
Auto-merging nextflow.config
CONFLICT (content): Merge conflict in nextflow.config
Auto-merging main.nf
CONFLICT (content): Merge conflict in main.nf
Auto-merging environment.yml
CONFLICT (content): Merge conflict in environment.yml
...

If you look at the current status, you will see the files that have merge conflicts that need resolving (Unmerged paths):

$ git status

On branch merging-template-updates
You have unmerged paths.
  (fix conflicts and run "git commit")
  (use "git merge --abort" to abort the merge)

Changes to be committed:

    modified:   .github/ISSUE_TEMPLATE/bug_report.md
    modified:   .github/ISSUE_TEMPLATE/feature_request.md
    modified:   .github/markdownlint.yml
    modified:   .gitignore
    new file:   bin/markdown_to_html.py
    deleted:    bin/markdown_to_html.r
    deleted:    conf/awsbatch.config

Unmerged paths:
  (use "git add/rm <file>..." as appropriate to mark resolution)

    both modified:   .github/CONTRIBUTING.md
    both modified:   .github/PULL_REQUEST_TEMPLATE.md
    both added:      .github/workflows/branch.yml
    both added:      .github/workflows/ci.yml
    both added:      .github/workflows/linting.yml
    deleted by them: .travis.yml
    both modified:   CHANGELOG.md
    both modified:   CODE_OF_CONDUCT.md
    both modified:   Dockerfile
    both modified:   README.md
    both modified:   assets/multiqc_config.yaml
    both modified:   bin/scrape_software_versions.py
    both modified:   conf/base.config
    both modified:   conf/igenomes.config
    both modified:   conf/test.config
    both modified:   docs/output.md
    both modified:   docs/usage.md
    both modified:   environment.yml
    both modified:   main.nf
    both modified:   nextflow.config

You now need to go through each of these files to resolve every merge conflict. Most code editors have tools to help with this, for example Atom and VSCode have built-in support.

Be careful when resolving conflicts. Most of the time you will want to use the version from the TEMPLATE branch, but be aware that some of this new template code may need to be customised by your pipeline. In other words, you may need to manually combine the two versions into one new code block.
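For files where the TEMPLATE version can be taken wholesale, git can list the unresolved files and check out one side of the conflict directly. The sketch below reproduces a single conflict in a throwaway repository; the file contents and commit messages are invented for illustration. Note that during a merge or pull, "theirs" refers to the branch being merged in, which is TEMPLATE here.

```shell
# Sandbox sketch: list conflicted files, then take the TEMPLATE side for one of them.
set -eu
sandbox="$(mktemp -d)"
cd "$sandbox"
git init --quiet .
git config user.name "Example User"
git config user.email "user@example.com"

printf 'version = 1\n' > nextflow.config
git add nextflow.config
git commit --quiet -m "Initial template commit"
main_branch="$(git symbolic-ref --short HEAD)"
git branch TEMPLATE

# The pipeline and the template both change the same line.
printf 'version = 1-pipeline\n' > nextflow.config
git commit --quiet -am "Pipeline edit"
git checkout --quiet TEMPLATE
printf 'version = 2\n' > nextflow.config
git commit --quiet -am "Template update"

git checkout --quiet "$main_branch"
# The merge stops with a conflict, so a non-zero exit status is expected here.
git merge --no-edit TEMPLATE >/dev/null 2>&1 || true

# List only the files that still have unresolved conflicts.
conflicted="$(git diff --name-only --diff-filter=U)"
echo "Conflicted: $conflicted"

# Take the TEMPLATE side for this file and mark it as resolved.
git checkout --theirs -- nextflow.config
git add nextflow.config
remaining="$(git diff --name-only --diff-filter=U)"
git commit --quiet -m "Merged changes from nf-core template"
resolved="$(cat nextflow.config)"
echo "Resolved content: $resolved"

cd /
rm -rf "$sandbox"
```

This shortcut discards the pipeline's side of the file entirely, so it only suits files that carry no pipeline-specific customisation. Anything else still needs to be combined by hand.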

If you have any doubts, ask for help on the nf-core Slack.

Pushing the resolved changes to your fork

When all merge conflicts have been resolved and all files are staged, you can commit and push these changes as with any other new code:

git commit -m "Merged changes from nf-core template"
git push --set-upstream origin merging-template-updates

Merging to the nf-core repository

Once the changes are on your fork, you can make a pull request to the main nf-core repository for the pipeline. This should be reviewed and merged as usual. You should see in the commit history on the PR that there is a commit by the @nf-core-bot user, with the same commit hash found in the automated TEMPLATE PR.

Once the PR from your fork is merged, the automated PR will also show as merged and will close automatically.

Manual synchronisation

There are rare cases when the synchronisation needs to be triggered manually, e.g. when it was not executed during an nf-core/tools release on GitHub, or when you want to perform a targeted sync.

You can do so by running the nf-core sync command:

cd my_pipeline
git checkout dev # or your most up to date branch
nf-core sync .

Note that the sync command assumes that you have a branch called TEMPLATE, so you may need to pull this from the upstream nf-core repository if you are working on a fork:

git remote add upstream https://github.com/nf-core/PIPELINE.git
git fetch upstream
git checkout --track upstream/TEMPLATE

Remember to go back to your dev branch as above before running nf-core sync.
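Before running the sync, it can be worth confirming that a local TEMPLATE branch really exists. The following sketch shows one way to check, using git show-ref in a throwaway setup where a local directory stands in for the upstream nf-core repository. The paths, the has_template helper and the single-branch clone are assumptions made purely for illustration.

```shell
# Sandbox sketch: check for a local TEMPLATE branch before and after fetching it.
set -eu
sandbox="$(mktemp -d)"

# Stand-in for the upstream nf-core repository (hypothetical local path).
git init --quiet "$sandbox/upstream"
cd "$sandbox/upstream"
git config user.name "Example User"
git config user.email "user@example.com"
printf 'template\n' > README.md
git add README.md
git commit --quiet -m "Initial template commit"
git branch TEMPLATE
git checkout --quiet -b dev

# Stand-in for a fork that only has the dev branch.
git clone --quiet --single-branch --branch dev "$sandbox/upstream" "$sandbox/fork"
cd "$sandbox/fork"

# Hypothetical helper: succeeds only if a local branch called TEMPLATE exists.
has_template() {
    git show-ref --verify --quiet refs/heads/TEMPLATE
}

if has_template; then before=yes; else before=no; fi

git remote add upstream "$sandbox/upstream"
git fetch --quiet upstream
git checkout --quiet --track upstream/TEMPLATE
git checkout --quiet dev   # go back to dev before syncing

if has_template; then after=yes; else after=no; fi
current="$(git symbolic-ref --short HEAD)"
echo "TEMPLATE before: $before, after: $after, current branch: $current"

cd /
rm -rf "$sandbox"
```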

Much of the merging process should then be the same as described above with the automated pull requests.

Setting up a pipeline for syncing retrospectively

This section describes how to set up a correct TEMPLATE branch if your pipeline was not created with one from the beginning. If you created your pipeline with the nf-core create command, you should be all ready to go and can skip this step. Otherwise, proceed with caution: make sure that all your local changes are pushed to GitHub, and consider making a local backup clone of your repository before proceeding.

You should also consider the option to restart your pipeline project by running the nf-core create command and simply copy in the modifications you need into the newly created pipeline.

Step-by-step procedure

This walkthrough assumes that you are working directly with the main nf-core repository for the pipeline. It is possible (and potentially safer) to do this on your own fork instead; it's up to you.

First clone your pipeline into a new directory (in case we mess things up):

mkdir TMPDIR
cd TMPDIR
git clone https://github.com/nf-core/YOURPIPELINENAME.git

Then create the new TEMPLATE branch and delete all your files in order to have a completely empty branch:

cd pipeline_root_dir
git checkout --orphan TEMPLATE && git rm -rf '*'

Make sure your branch is completely empty by checking the output of git status:

$ git status
On branch TEMPLATE

No commits yet

nothing to commit (create/copy files and use "git add" to track)

Regenerate your pipeline from scratch using the most recent template:

Make sure you are within your pipeline root directory before running these commands.

nf-core create --no-git

If your pipeline already has versioned releases (e.g. you are not currently on 1.0dev), then specify the version number that you are currently on:

nf-core create --no-git --new-version 1.3dev

The version you choose should match the branch that you intend to merge with. If you already have a release, you should probably be merging into dev eventually, so use the version number specified there.

Follow the prompts to fill in the pipeline name, description and author(s). If these have already been written in your pipeline's nextflow.config file (manifest.name etc.), make sure that you use exactly the same text.

This creates a new directory called nf-core-YOURPIPELINENAME with the template pipeline files in it. Now move these files into your root git directory:

mv nf-core-YOURPIPELINENAME/* .
mv nf-core-YOURPIPELINENAME/.[!.]* .
rmdir nf-core-YOURPIPELINENAME
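Two mv commands are needed because a plain * glob skips hidden files, while .[!.]* matches names that start with a dot followed by a non-dot character (so the special entries . and .. are never matched). A minimal sketch in a scratch directory, using the made-up directory name nf-core-example:

```shell
# Sandbox sketch: move visible and hidden entries out of a subdirectory.
set -eu
sandbox="$(mktemp -d)"
cd "$sandbox"

# Fake template output containing one visible file and two hidden entries.
mkdir nf-core-example
touch nf-core-example/main.nf nf-core-example/.gitignore
mkdir nf-core-example/.github

mv nf-core-example/* .        # visible entries only
mv nf-core-example/.[!.]* .   # hidden entries, excluding "." and ".."
rmdir nf-core-example         # succeeds only if the directory is now empty

moved="$(LC_ALL=C ls -A | tr '\n' ' ')"
echo "Now in the root directory: $moved"

cd /
rm -rf "$sandbox"
```

The final rmdir doubles as a safety check: it fails if anything was left behind in the subdirectory.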

Now make sure the newly created files are in the correct place. It should look similar to this:

$ git status
On branch TEMPLATE

No commits yet

Untracked files:
  (use "git add <file>..." to include in what will be committed)

  .gitattributes
  .github/
  .gitignore
  .travis.yml
  CHANGELOG.md
  CODE_OF_CONDUCT.md
  Dockerfile
  LICENSE
  README.md
  assets/
  bin/
  conf/
  docs/
  environment.yml
  main.nf
  nextflow.config

nothing added to commit but untracked files present (use "git add" to track)

If it all looks good, then commit these files:

git add .
git commit -m "Initial template commit"

For the nf-core bot to be able to access your TEMPLATE branch, you need to push it to the upstream repository (https://github.com/nf-core).

git push --set-upstream origin TEMPLATE

Merge TEMPLATE into main branches

The only remaining step is unfortunately a rather tedious one. You have to merge the TEMPLATE branch into your main pipeline branches, manually resolving all merge conflicts.

If your pipeline is in early development, you can do this with the master branch directly. If not, it's better to do this in a separate branch and then make a pull-request to dev / master when ready.

git checkout dev
git checkout -b template_merge
git merge TEMPLATE --allow-unrelated-histories

You can try extra flags such as -Xignore-space-at-eol if you find that the merge command shows entire files as being new.
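The --allow-unrelated-histories flag is needed because an orphan TEMPLATE branch has no commit in common with the existing pipeline branches. The following sandbox sketch illustrates this with invented file contents, and assumes a git version recent enough to refuse unrelated merges by default (2.9 or later):

```shell
# Sandbox sketch: merging an orphan branch with and without --allow-unrelated-histories.
set -eu
sandbox="$(mktemp -d)"
cd "$sandbox"
git init --quiet .
git config user.name "Example User"
git config user.email "user@example.com"

printf 'pipeline code\n' > main.nf
git add main.nf
git commit --quiet -m "Existing pipeline"
main_branch="$(git symbolic-ref --short HEAD)"

# Orphan branch: starts with no parent commit, as in the walkthrough above.
git checkout --quiet --orphan TEMPLATE
git rm -rf --quiet '*'
printf 'template readme\n' > README.md
git add README.md
git commit --quiet -m "Initial template commit"

git checkout --quiet "$main_branch"

# With no common ancestor, merge-base exits with a non-zero status.
if git merge-base "$main_branch" TEMPLATE >/dev/null; then shared=yes; else shared=no; fi

# A plain merge is refused...
if git merge --no-edit TEMPLATE >/dev/null 2>&1; then plain=merged; else plain=refused; fi

# ...but goes through once unrelated histories are explicitly allowed.
git merge --quiet --no-edit --allow-unrelated-histories TEMPLATE
if git merge-base "$main_branch" TEMPLATE >/dev/null; then shared_after=yes; else shared_after=no; fi

echo "Shared history before: $shared, plain merge: $plain, shared history after: $shared_after"

cd /
rm -rf "$sandbox"
```

Once this first merge is in place, the two branches share history, which is what later template syncs rely on.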

You'll probably see a lot of merge conflicts:

Auto-merging nextflow.config
CONFLICT (add/add): Merge conflict in nextflow.config
Auto-merging main.nf
CONFLICT (add/add): Merge conflict in main.nf
Auto-merging environment.yml
CONFLICT (add/add): Merge conflict in environment.yml
Auto-merging docs/usage.md
CONFLICT (add/add): Merge conflict in docs/usage.md

Go through each file resolving the merge conflicts carefully. Many text editors have plugins to help with this task. The Atom GitHub package is one example of an excellent interface to manage merge conflicts (see the docs).

It's highly recommended to use a visual tool to help you with this, as it's easy to make mistakes if handling the merge markers manually when there are so many to deal with.

Once you have resolved all merge conflicts, you can commit the changes and push to the GitHub repo:

git commit -m "Merged vanilla TEMPLATE branch into main pipeline"
git push origin template_merge

The final task is to create a pull request with your changes so that they are included in the upstream repository. Once your commits are finally merged into the master branch, all future automatic template syncing should work.

When new versions of nf-core/tools and its associated template are released, pull-requests will automatically be created to merge the updates into your pipeline.

That's it, you're done! Congratulations!