nf-core/taxprofiler
Highly parallelised multi-taxonomic profiling of shotgun short- and long-read metagenomic data

classification, illumina, long-reads, metagenomics, microbiome, nanopore, pathogen, profiling, shotgun, taxonomic-classification, taxonomic-profiling

This is the development version of the pipeline.

Launch development version https://github.com/nf-core/taxprofiler

nf-core/taxprofiler developer checklists

This page is a reference for new developers who wish to contribute to the nf-core/taxprofiler code base, and a checklist for reviewers.

Adding a new profiler contribution workflow

Note: the steps do not have to be carried out in this precise order

Warning

Before starting, always make sure you’re on a fork and a new branch!

Outside of pipeline repository

Make full test database
- See the taxprofiler branch of test-datasets for instructions on how to get the raw FASTAs
- Test the database locally
- Once built and tested, upload to iGenomes s3 (ask James)
- Update database_full_vX.X.csv and README to include the new s3 URI and instructions
- Open PR against test-datasets, taxprofiler branch
Add a MultiQC module
Make a Taxpasta module
Add the database building module to nf-core/createtaxdb (where possible)

New nf-test procedure and specifications

Procedure

When writing a new pipeline-level nf-test for nf-core/taxprofiler, we recommend the following procedure:

Run the test profile locally to have a copy of the expected output files and the results directory structure
Check that the results directories and the contents of the files are as expected

Check that the results directory reflects parameters specified in the test config itself
One or two files per directory should be enough

Write the base nf-test file structure, assuming all files are stable (following the specifications below)
Run nf-test test --tag <test_name> --profile +docker once to write the first snapshot
Run the same command a second time to get the diff of unstable files (files whose contents differ between the two runs)
Update the assertions in each directory's .match() snapshot for unstable files

Specifications

Write the test files following the specifications below.

The necessary files are as follows:

New test files should go under tests/
Test file should be called <test_config_name>.nf.test
Snapshot file should be called <test_config_name>.nf.test.snap

nf-test file contents:

Test file header
- Specify name as: Test <config name>
- Specify two tags: pipeline and <config_name>
- Specify profile as <config_name>
Test block
- Specify the name of the test block as test("-profile <config name>")
When block
- Specify the when block with a single param, outdir
  - All other parameters should be specified in the config .conf file itself
Then block
- Specify on the first line, a stable_name_all variable to list all file names with the nft-utils getAllFilesFromDir function
- For each top-level output directory under results (typically, one per tool), specify a stable_content_<dir name> variable (exceptions: multiqc and pipeline_info) in alphabetical order
  - Use syntax: def stable_content_<dir name> = getAllFilesFromDir(params.outdir, relative: false, includeDir: false, include: ["<dir name>/**"], ignoreFile: 'tests/.nftignore')
  - If no stable files, leave comment for that directory
    - All files unstable: // <dir name>: all unstable files, see stable_name_all
    - Partly unstable (using custom assertions): // <dir name>: partly unstable files, see custom assertions
  - Make sure relative: false in function, to capture md5sums
assertAll block
- Use the removeNextflowVersion function
- Check existence of nf_core_taxprofiler_software_mqc_versions.yml file
- Check existence of multiqc_report.html file
- Snapshot stable_name_all with a .match() name of all_files
- For each results directory:
  - Add a comment of the directory name
  - Specify an assert closure with a snapshot and .match() name of directory_name (typically the tool name all lower case). Exceptions: MultiQC, Pipeline info.
  - Each .match() should be in alphabetical order (same as order in the --outdir)
  - Inside the .match() assertion, specify the stable_content_<dir name> variable (if available), then:
    - For each unstable file, specify specific path in .nftignore including the tool directory prefix
    - For each unstable file, if the contents are partly stable, specify an alternative method of file checks (e.g. sorted file, file size check, contains string, nft-plugin function)
    - For each unstable file assertion, include a string before the assertion itself with the file name and type of check (with what being checked for)
    - If there are multiple assertions in a snapshot, ensure the closing .match() and the closing {} are indented one and two levels less than the assertions, respectively
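
The specification above can be sketched as a single test file. This is a minimal template, not the pipeline's actual test: the profile name test_mytool, the directories mytool and otherdir, the file mytool/sample.log and the string 'Finished' are hypothetical placeholders, and the pipeline_info/ and multiqc/ locations, the script path, and the arguments passed to getAllFilesFromDir for stable_name_all are assumptions based on the usual nf-core layout rather than something this page states.

```groovy
// tests/test_mytool.nf.test (hypothetical file name, following <test_config_name>.nf.test)
nextflow_pipeline {

    name "Test test_mytool"
    script "../main.nf"        // assumed location of the pipeline entry script
    tag "pipeline"
    tag "test_mytool"
    profile "test_mytool"

    test("-profile test_mytool") {

        when {
            params {
                // The only param set here; everything else lives in the .conf file
                outdir = "$outputDir"
            }
        }

        then {
            // All file and directory names (arguments are an assumption, see lead-in)
            def stable_name_all = getAllFilesFromDir(params.outdir, relative: true, includeDir: true, ignore: ['pipeline_info/*.{html,json,txt}'])
            // One variable per top-level results directory, in alphabetical order
            def stable_content_mytool = getAllFilesFromDir(params.outdir, relative: false, includeDir: false, include: ['mytool/**'], ignoreFile: 'tests/.nftignore')
            // otherdir: all unstable files, see stable_name_all

            assertAll(
                { assert workflow.success },
                { assert snapshot(
                        removeNextflowVersion("$outputDir/pipeline_info/nf_core_taxprofiler_software_mqc_versions.yml")
                    ).match()
                },
                { assert new File("$outputDir/pipeline_info/nf_core_taxprofiler_software_mqc_versions.yml").exists() },
                { assert new File("$outputDir/multiqc/multiqc_report.html").exists() },
                { assert snapshot(stable_name_all).match("all_files") },
                // mytool
                { assert snapshot(
                        stable_content_mytool,
                        // Description string first, then the custom check for the unstable file
                        "mytool/sample.log contains 'Finished'",
                        new File("$outputDir/mytool/sample.log").text.contains("Finished")
                    ).match("mytool")
                }
            )
        }
    }
}
```

In this sketch mytool/sample.log would also be listed in tests/.nftignore, so that its md5sum is left out of stable_content_mytool and only the custom check covers it.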

Reviewing:

*.conf local test run
- File contents of all files are as expected
*.nf.test
- All test names, tags, profiles correct
- Structure matches structure described above
- All output directories of the -profile test_<name> are covered via a stable_content_* variable and an assertion
- Non-stable directories not using stable_content_* are replaced with a comment
- All unstable files covered in custom snapshots
- Each custom assertion specifies the right file name in both the description string and the file check itself
*.nf.test.snap
- All .match() sections defined in *.nf.test represented
- No empty match() blocks
- All files in --outdir listed in each .match() section (except if explicitly excluded due to completely unstable files)
  - Comparing with the output of tree <--outdir> can be helpful!
- No empty md5sums (d41d8cd98f00b204e9800998ecf8427e)
- No custom boolean assertions set as false
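
Some of the *.nf.test.snap checks above can be partly automated. The following plain Groovy script is a rough sketch, not part of the pipeline: it assumes the snapshot is a JSON object keyed by .match() name, each entry holding a content list, and the closure names leaves and checkSnapshot as well as the sample data are made up for illustration.

```groovy
import groovy.json.JsonSlurper

// md5sum of an empty file, as given in the checklist above
def EMPTY_MD5 = 'd41d8cd98f00b204e9800998ecf8427e'

// Recursively flatten a parsed JSON structure into its leaf values
def leaves
leaves = { node ->
    if (node instanceof Map)        return node.values().collectMany { leaves(it) }
    if (node instanceof Collection) return node.collectMany { leaves(it) }
    return [node]
}

// Return a list of human-readable problems found in the snapshot text
def checkSnapshot = { String snapText ->
    def problems = []
    new JsonSlurper().parseText(snapText).each { name, entry ->
        def content = entry.content
        if (!content) {
            problems << "${name}: empty match() block".toString()
            return // move on to the next snapshot entry
        }
        def values = leaves(content)
        if (values.any { it instanceof String && it.contains(EMPTY_MD5) }) {
            problems << "${name}: contains an empty-file md5sum".toString()
        }
        if (values.any { it instanceof Boolean && !it }) {
            problems << "${name}: contains a boolean assertion set to false".toString()
        }
    }
    return problems
}

// Made-up snapshot content for demonstration only
def sample = '''
{
  "all_files": { "content": [["mytool", "mytool/sample.tsv"]] },
  "mytool":    { "content": [["sample.tsv:md5,d41d8cd98f00b204e9800998ecf8427e"], false] },
  "othertool": { "content": [] }
}
'''

def problems = checkSnapshot(sample)
problems.each { println it }
```

To check a real snapshot, pass the file's text instead of the sample, for example checkSnapshot(new File('tests/<test_config_name>.nf.test.snap').text). A script like this does not replace comparing the snapshot against the output of tree on the results directory.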
