Common utility functions for the nf-core python package.


Bases: CachedSession

Provides a single session for interacting with the GitHub API during a run. Inherits from requests_cache.CachedSession and adds functionality such as automatically setting up GitHub authentication where possible.

get(url, **kwargs)

Initialise the session if we haven’t already, then call the superclass get method.


Initialise the object.

Only do this when it’s actually being used (due to global import)
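The lazy-initialisation pattern described here can be sketched as follows. This is only an illustration under stated assumptions, not the nf-core implementation: the names `BaseSession`, `LazySession`, `lazy_init`, `has_init` and `init_count` are hypothetical, and the superclass is a stub so that the example runs without network access.

```python
class BaseSession:
    """Stand-in for a real HTTP session class (hypothetical)."""

    def get(self, url, **kwargs):
        return f"GET {url}"


class LazySession(BaseSession):
    """Defers expensive setup until the first request is made."""

    def __init__(self):
        super().__init__()
        self.has_init = False
        self.init_count = 0

    def lazy_init(self):
        # Expensive setup (cache configuration, authentication) would go here.
        self.init_count += 1
        self.has_init = True

    def get(self, url, **kwargs):
        # Initialise on first use only, then defer to the superclass.
        if not self.has_init:
            self.lazy_init()
        return super().get(url, **kwargs)


session = LazySession()  # cheap, so safe to create at import time
first = session.get("https://example.org/a")
second = session.get("https://example.org/b")
```

Because the setup runs inside `get`, creating the object at module import costs nothing until a request is actually made.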

log_content_headers(request, post_data=None)

Try to dump everything to the console, useful when things go wrong.

request_retry(url, post_data=None)

Try to fetch a URL, keep retrying if we get a certain return code.

Used in the nf-core sync code, where too many simultaneous requests cause 403 errors. See
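A retry loop of this kind might look like the sketch below. It is an assumption-laden illustration rather than the library's code: `request_with_retry`, `retry_codes`, `max_attempts`, `base_delay`, `FakeResponse` and `fake_fetch` are all hypothetical names, and the HTTP call is replaced by a fake so the example runs offline.

```python
import time


def request_with_retry(fetch, url, retry_codes=(403,), max_attempts=5, base_delay=0.0):
    """Call fetch(url) until the status code is not in retry_codes.

    `fetch` is any callable returning an object with a `status_code` attribute.
    """
    for attempt in range(1, max_attempts + 1):
        response = fetch(url)
        if response.status_code not in retry_codes:
            return response
        # Wait a little longer after each failed attempt.
        time.sleep(base_delay * attempt)
    raise RuntimeError(f"Gave up on {url} after {max_attempts} attempts")


class FakeResponse:
    """Minimal response object used only for this demonstration."""

    def __init__(self, status_code):
        self.status_code = status_code


# Simulated server: two 403 responses, then success.
_codes = iter([403, 403, 200])
calls = []


def fake_fetch(url):
    calls.append(url)
    return FakeResponse(next(_codes))


result = request_with_retry(fake_fetch, "https://example.org/api")
```

Capping the number of attempts keeps a persistent failure from looping forever; a real client would typically use a non-zero `base_delay`.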


Run a GET request, raise a nice exception with lots of logging if it fails.


Try to automatically set up GitHub authentication


Bases: object

Object to hold information about a local pipeline.

  • Parameters: path (str) – The path to the nf-core pipeline directory.


The parsed conda configuration file content (environment.yml).

  • Type: dict


The conda package(s) information, based on the API requests to Anaconda cloud.

  • Type: dict


The Nextflow pipeline configuration file content.

  • Type: dict


A list of files found during the linting process.

  • Type: list


The git sha for the repo commit / current GitHub pull-request ($GITHUB_PR_COMMIT)

  • Type: str


The minimum required Nextflow version to run the pipeline.

  • Type: str


Path to the pipeline directory.

  • Type: str


The pipeline name, without the nf-core tag, for example hlatyping.

  • Type: str


A PipelineSchema object

  • Type: obj


Convenience function to get full path to a file in the pipeline


Get a list of all files in the pipeline


Run core load functions


Try to load the pipeline environment.yml file, if it exists


Get the nextflow config for this pipeline

Once loaded, set a few convenience reference class attributes

nf_core.utils.anaconda_package(dep, dep_channels=None)

Query conda package information.

Sends an HTTP GET request to the Anaconda remote API.

  • Parameters:
    • dep (str) – A conda package name.
    • dep_channels (list) – List of conda channels to use.
  • Raises:
    • LookupError – If the connection fails, times out, or returns an unexpected status code.
    • ValueError – If the package name cannot be found (404).

nf_core.utils.check_if_outdated(current_version=None, remote_version=None, source_url='')

Check if the current version of nf-core is outdated
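The core of such a check is a version comparison, sketched below. This is an assumption about the approach, not the function's actual body: `version_tuple` and `is_outdated` are hypothetical helpers, and the step that fetches the remote version is left out so the example runs offline.

```python
import re


def version_tuple(version):
    """Turn '2.10.1' or 'v2.10.1dev' into (2, 10, 1) for comparison."""
    cleaned = re.sub(r"[^0-9.]", "", version)
    return tuple(int(part) for part in cleaned.split(".") if part)


def is_outdated(current_version, remote_version):
    """Return True if remote_version is newer than current_version."""
    return version_tuple(remote_version) > version_tuple(current_version)


needs_update = is_outdated("2.9", "2.10")
```

Comparing integer tuples avoids the string-ordering trap where "2.9" sorts after "2.10". Pre-release suffixes are simply dropped here, so a dedicated version parser would be a safer choice in production.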


Overwrite default PyYAML output to make Prettier YAML linting happy


nf_core.utils.fetch_wf_config(wf_path, cache_config=True)

Uses Nextflow to retrieve the configuration variables from a Nextflow workflow.

  • Parameters:
    • wf_path (str) – Nextflow workflow file system path.
    • cache_config (bool) – Whether to cache the configuration (default: True).
  • Returns: Workflow configuration settings.
  • Return type: dict


Calculates the md5sum for a file on the disk.

  • Parameters: fname (str) – Path to a local file.
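A chunked md5 calculation can be sketched with the standard library as follows. The function name `file_md5` and the `chunk_size` parameter are hypothetical, since the original signature is not shown here.

```python
import hashlib
import os
import tempfile


def file_md5(fname, chunk_size=4096):
    """Return the hex md5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(fname, "rb") as handle:
        # iter() with a sentinel stops when read() returns empty bytes.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


# Usage: hash a small temporary file, then clean up.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"hello world")
checksum = file_md5(tmp.name)
os.remove(tmp.name)
```

Reading in fixed-size chunks keeps memory use constant, which matters for large container images or reference files.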

nf_core.utils.get_biocontainer_tag(package, version)

Given a bioconda package and version, looks for Docker and Singularity containers using the BioContainers API, e.g.:{tool}/versions/{tool}-{version}

Returns the most recent container versions by default.

  • Parameters:
    • package (str) – A bioconda package name.
    • version (str) – Version of the bioconda package.
  • Raises:
    • LookupError – If the connection fails, times out, or returns an unexpected status code.
    • ValueError – If the package name cannot be found (404).

nf_core.utils.get_first_available_path(directory, paths)

nf_core.utils.get_repo_releases_branches(pipeline, wfs)

Fetches details of a nf-core workflow to download.

  • Parameters:
    • pipeline (str) – GitHub repo username/repo
    • wfs – A nf_core.list.Workflows() object, where get_remote_workflows() has been called.
  • Returns: Array of releases, Array of branches
  • Return type: wf_releases, wf_branches (tuple)
  • Raises: LookupError – If the pipeline cannot be found.


Check file path to see if it is a binary file


Checks if the specified directory has the minimum required files (‘’, ‘nextflow.config’) for a pipeline directory.

  • Parameters: wf_path (str) – The directory to be inspected
  • Raises: UserWarning – If one of the files is missing

nf_core.utils.is_relative_to(path1, path2)

Checks if a path is relative to another.

Should mimic Path.is_relative_to, which is not available in Python < 3.9.

  • Parameters:
    • path1 (Path | str) – The path that could be a subpath.
    • path2 (Path | str) – The path that could be the superpath.
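One way to get this behaviour on older Python versions is to lean on Path.relative_to, which raises ValueError when the first path is not located under the second. The sketch below is a possible implementation, not necessarily the one used in nf-core.

```python
from pathlib import Path


def is_relative_to(path1, path2):
    """Return True if path1 is located at or below path2."""
    try:
        Path(path1).relative_to(path2)
        return True
    except ValueError:
        return False


inside = is_relative_to("/data/pipeline/modules", "/data/pipeline")
outside = is_relative_to("/data/pipeline", "/data/pipeline/modules")
```

The comparison works on whole path components, so `/data/pipelines` is not treated as being under `/data/pipeline`.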


Parse the nf-core.yml configuration file

Look for a file called either .nf-core.yml or .nf-core.yaml

Also looks for the deprecated file .nf-core-lint.yml/yaml and issues a warning that this file will be deprecated in the future

Returns the loaded config dict, or False if the file couldn’t be loaded.


Run a Nextflow command and capture the output. Handle errors nicely

nf_core.utils.parse_anaconda_licence(anaconda_response, version=None)

Given a response from the anaconda API using anaconda_package, parse the software licences.

Returns: Set of licence types


Query PyPI package information.

Sends an HTTP GET request to the PyPI remote API.

  • Parameters: dep (str) – A PyPI package name.
  • Raises:
    • LookupError – If the connection fails or times out.
    • ValueError – If the package name cannot be found.


Return ‘es’ if the input is not one or does not have a length of one.


Return ‘s’ if the input is not one or does not have a length of one.


Return ‘ies’ if the input is not one or does not have a length of one, else ‘y’.
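The three pluralisation helpers above could be written along these lines. The function names (`plural_s`, `plural_es`, `plural_y`) and the shared `_is_one` helper are assumptions for illustration, because the signatures are not shown in this listing. Returning an empty string in the singular case for the first two is also assumed.

```python
def _is_one(value):
    """True if value is the number 1 or a sized object of length 1."""
    if isinstance(value, int):
        return value == 1
    return len(value) == 1


def plural_s(value):
    """Return 's' unless value is one or has a length of one."""
    return "" if _is_one(value) else "s"


def plural_es(value):
    """Return 'es' unless value is one or has a length of one."""
    return "" if _is_one(value) else "es"


def plural_y(value):
    """Return 'y' for one item, 'ies' otherwise."""
    return "y" if _is_one(value) else "ies"


files = ["main.nf", "nextflow.config", "README.md"]
message = f"Found {len(files)} file{plural_s(files)} in 1 director{plural_y(1)}"
```

Accepting either a count or a collection lets the helpers be dropped straight into log messages without a separate `len()` call.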

nf_core.utils.poll_nfcore_web_api(api_url, post_data=None)

Poll the nf-core website API

Takes argument api_url for URL

Expects the API response to be valid JSON and to contain a top-level ‘status’ key.
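The validation step described here can be sketched without any network access. The helper name `parse_api_response`, the choice of ValueError, and the status value "pending" are all made up for this example; only the requirement for valid JSON with a top-level ‘status’ key comes from the description.

```python
import json


def parse_api_response(text):
    """Parse an API response body and require a top-level 'status' key."""
    try:
        payload = json.loads(text)
    except json.JSONDecodeError as err:
        raise ValueError(f"Response is not valid JSON: {err}") from err
    if not isinstance(payload, dict) or "status" not in payload:
        raise ValueError("Response JSON has no top-level 'status' key")
    return payload


ok = parse_api_response('{"status": "pending"}')

try:
    parse_api_response('{"message": "hello"}')
    missing_key_rejected = False
except ValueError:
    missing_key_rejected = True
```

Failing early on a malformed body gives a clearer error than a KeyError raised later by code that assumes the key exists.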

nf_core.utils.prompt_pipeline_release_branch(wf_releases, wf_branches)

Prompt for pipeline release / branch

  • Parameters:
    • wf_releases (array) – Array of repo releases as returned by the GitHub API
    • wf_branches (array) – Array of repo branches, as returned by the GitHub API
  • Returns: Selected release / branch name
  • Return type: choice (str)


Prompt for the pipeline name with questionary

  • Parameters: wfs – A nf_core.list.Workflows() object, where get_remote_workflows() has been called.
  • Returns: GitHub repo - username/repo
  • Return type: pipeline (str)
  • Raises: AssertionError – If the pipeline cannot be found.


Check if any environment variables are set to force Rich to use coloured output


Creates a directory for files that need to be kept between sessions

Currently only used for keeping local copies of modules repos


Sets up local caching for faster remote HTTP requests.

Caching directory will be set up in the user’s home directory under a .config/nf-core/cache_* subdir.

Uses requests_cache monkey patching. Also returns the config dict so that we can use the same setup with a Session.


Sorts a nested dictionary recursively
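A recursive sort of this kind can be sketched in a few lines. The function name `sort_dictionary` is an assumption, and this version only descends into nested dicts, leaving lists and other values untouched.

```python
def sort_dictionary(data):
    """Return a copy of data with keys sorted at every nesting level."""
    result = {}
    for key in sorted(data):
        value = data[key]
        # Recurse into nested dictionaries, copy everything else as-is.
        result[key] = sort_dictionary(value) if isinstance(value, dict) else value
    return result


nested = {"b": {"z": 1, "a": 2}, "a": 0}
ordered = sort_dictionary(nested)
```

Since Python dictionaries preserve insertion order, rebuilding them in sorted key order is enough to get stable, diff-friendly output when the result is serialised.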

nf_core.utils.strip_ansi_codes(string, replace_with='')

Strip ANSI colouring codes from a string to return plain text.

From Stack Overflow:
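A simplified version of this idea is shown below. The regular expression here is a deliberately small pattern that covers common CSI colour sequences only; it is not the expression used by the library, which is not reproduced in this listing.

```python
import re

# Matches CSI escape sequences such as "\x1b[31m" (red) or "\x1b[0m" (reset).
ANSI_CSI = re.compile(r"\x1b\[[0-9;?]*[A-Za-z]")


def strip_ansi_codes(string, replace_with=""):
    """Replace ANSI colour sequences in string with replace_with."""
    return ANSI_CSI.sub(replace_with, string)


plain = strip_ansi_codes("\x1b[1;31mERROR\x1b[0m: failed")
```

Stripping the codes is useful when console output is written to a log file or compared in tests, where the escape bytes would otherwise show up as noise.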

nf_core.utils.validate_file_md5(file_name, expected_md5hex)

Validates the md5 checksum of a file on disk.

  • Parameters:
    • file_name (str) – Path to a local file.
    • expected_md5hex (str) – The expected md5sum.
  • Raises: IOError – If the md5sum does not match the remote sum.
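The check-and-raise behaviour can be sketched as follows. The name `validate_md5` is hypothetical and the file is read in one go for brevity; only the parameters and the IOError on mismatch are taken from the description above.

```python
import hashlib
import os
import tempfile


def validate_md5(file_name, expected_md5hex):
    """Raise IOError if the file's md5 digest differs from expected_md5hex."""
    with open(file_name, "rb") as handle:
        actual = hashlib.md5(handle.read()).hexdigest()
    if actual != expected_md5hex:
        raise IOError(f"{file_name}: md5 {actual} does not match {expected_md5hex}")
    return True


# Usage: one matching and one deliberately wrong checksum.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"nf-core")
good_sum = hashlib.md5(b"nf-core").hexdigest()

matched = validate_md5(tmp.name, good_sum)
try:
    validate_md5(tmp.name, "0" * 32)
    mismatch_detected = False
except IOError:
    mismatch_detected = True
os.remove(tmp.name)
```

Raising an exception rather than returning False means a corrupted download cannot be silently ignored by the caller.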

nf_core.utils.wait_cli_function(poll_func, refresh_per_second=20)

Display a command-line spinner while calling a function repeatedly.

Keep waiting until that function returns True

  • Parameters:
    • poll_func (function) – Function to call
    • refresh_per_second (int) – Refresh this many times per second. Default: 20.
  • Returns: None. Just sits in an infinite loop until the function returns True.
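The polling behaviour, without the spinner, can be sketched like this. The name `wait_until_true` and the use of `time.sleep` between calls are assumptions for illustration; the spinner display is omitted so the example needs only the standard library.

```python
import time


def wait_until_true(poll_func, refresh_per_second=20):
    """Call poll_func repeatedly until it returns True."""
    delay = 1.0 / refresh_per_second
    while not poll_func():
        time.sleep(delay)


attempts = {"count": 0}


def ready_after_three_calls():
    # Pretend the job finishes on the third poll.
    attempts["count"] += 1
    return attempts["count"] >= 3


wait_until_true(ready_after_three_calls, refresh_per_second=100)
```

Because the loop has no timeout, the polled function itself is responsible for eventually returning True or raising an error.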