This project focuses on nf-core/cellpainting, a Nextflow pipeline for scalable, reproducible image-based profiling of Cell Painting assays. The pipeline takes high-content microscopy images, applies illumination correction, extracts morphological features with CellProfiler, and converts the results to analysis-ready Parquet files with CytoTable.
The pipeline already has a working core (illumination correction → assay-development QC → CellProfiler analysis → CytoTable conversion → MultiQC), but it is pre-1.0 and there is a long list of high-value features and polish work to do before a first release. The Boston hackathon is a great chance to broaden the contributor base and knock out a batch of well-scoped issues across the stack.
Goal
Make meaningful progress toward a 1.0 release of nf-core/cellpainting by closing a batch of open issues — adding new analysis modules from the cytomining ecosystem, tightening the test suite, and improving downstream usability of the pipeline outputs.
What participants will do
Each contributor will:
- Pick an issue from the nf-core/cellpainting issue tracker and assign themselves.
- Discuss the approach with the project lead and other contributors at the table.
- Implement the change on a feature branch (new module, test improvement, or documentation update).
- Run nf-test and lint locally with the test profile.
- Open a Pull Request for review.
Suggested tasks
Good first issue
- #40: Switch CytoTable grouping from per-site to per-plate. Self-contained Nextflow refactor: regroup the CytoTable invocation by (batch, plate) and update the test snapshot. Touches one module, one workflow file, and the docs.
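The regrouping in #40 can be pictured with a small, language-neutral sketch. It is written in Python rather than Nextflow, and the record fields (batch, plate, well, site, path) and file names are hypothetical, not the pipeline's actual channel structure. In the pipeline itself the same idea would most likely be expressed with a channel map followed by groupTuple.

```python
from collections import defaultdict


def group_by_plate(site_records):
    """Collect per-site records into one entry per (batch, plate) key."""
    grouped = defaultdict(list)
    for record in site_records:
        key = (record["batch"], record["plate"])
        grouped[key].append(record["path"])
    # Sort keys and paths so the result is deterministic between runs.
    return {key: sorted(paths) for key, paths in sorted(grouped.items())}


# Hypothetical per-site inputs; names and paths are illustrative only.
sites = [
    {"batch": "batch1", "plate": "BR00117035", "well": "A01", "site": 2, "path": "A01_s2.csv"},
    {"batch": "batch1", "plate": "BR00117035", "well": "A01", "site": 1, "path": "A01_s1.csv"},
    {"batch": "batch1", "plate": "BR00117036", "well": "B02", "site": 1, "path": "B02_s1.csv"},
]

plates = group_by_plate(sites)
for (batch, plate), paths in plates.items():
    print(batch, plate, paths)
```

Each key would then drive a single CytoTable invocation per plate instead of one per site, which is why the test snapshot needs updating alongside the refactor.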
Test infrastructure
- #39: Add a test_full profile that runs a full plate from the minimal test dataset. Generate a full-plate samplesheet for BR00117035, host it on the cellpainting branch of nf-core/test-datasets, and wire up the AWS megatest profile.
- #38: Replace coarse md5 snapshots with intelligent assertions for CellProfiler outputs. Move from blanket file-ignore lists to nf-test path/line assertions that allow non-deterministic fields (run timestamps, work-dir paths, PNG metadata) without losing coverage.
- #15: Add SQLite Cell Painting Gallery example data. Subset the BR00117035.sqlite from the Cell Painting Gallery and contribute it to nf-core/test-datasets.
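The idea behind #38, comparing content while tolerating fields that change on every run, can be sketched as follows. This is an illustration in Python, not nf-test code (real nf-test assertions are written in Groovy), and the regular expressions and sample log text are assumptions chosen for the example rather than patterns taken from actual CellProfiler output.

```python
import hashlib
import re

# Hypothetical patterns for values that differ between otherwise identical runs.
VOLATILE_PATTERNS = [
    # Timestamps such as 2024-05-01 10:00:00 or 2024-05-01T10:00:00
    (re.compile(r"\d{4}-\d{2}-\d{2}[ T]\d{2}:\d{2}:\d{2}"), "<TIMESTAMP>"),
    # Assumed Nextflow work-dir layout: .../work/<2 hex chars>/<30 hex chars>
    (re.compile(r"/\S*/work/[0-9a-f]{2}/[0-9a-f]{30}"), "<WORKDIR>"),
]


def stable_lines(text):
    """Return the lines of text with volatile fields replaced by placeholders."""
    cleaned = []
    for line in text.splitlines():
        for pattern, placeholder in VOLATILE_PATTERNS:
            line = pattern.sub(placeholder, line)
        cleaned.append(line)
    return cleaned


def stable_digest(text):
    """Hash only the stable content, so reruns give the same digest."""
    joined = "\n".join(stable_lines(text))
    return hashlib.md5(joined.encode("utf-8")).hexdigest()


# Two made-up outputs that differ only in timestamp and work directory.
run_a = "Run started 2024-05-01 10:00:00\nOutput: /tmp/nf/work/ab/" + "c" * 30 + "/Image.csv\nCount_Cells,42"
run_b = "Run started 2024-06-02 11:30:15\nOutput: /scratch/work/1f/" + "0" * 30 + "/Image.csv\nCount_Cells,42"
# A third output where a real measurement changed.
run_c = run_b.replace("Count_Cells,42", "Count_Cells,43")

print(stable_digest(run_a) == stable_digest(run_b))
print(stable_digest(run_a) == stable_digest(run_c))
```

Masking only the known volatile fields keeps the rest of each file under test, whereas an ignore list drops the whole file from the snapshot.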
New modules — cytomining downstream stack
- #9: pycytominer_annotate, #10: pycytominer_aggregate, #11: pycytominer_normalize, #12: pycytominer_featureselect. Add the pycytominer operations as nf-core modules and wire them into the analysis subworkflow downstream of CytoTable.
- #8: cosmicqc. Add coSMicQC as a single-cell QC step.
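For contributors new to the cytomining stack, the aggregate step is the easiest to picture: it collapses single-cell rows into one profile per well. The sketch below illustrates that idea in plain Python using a per-well median. It does not call pycytominer, the column names and values are made up, and the choice of median and of (plate, well) grouping is an assumption about how the module would be configured.

```python
from collections import defaultdict
from statistics import median


def aggregate_median(cells, strata, features):
    """Collapse single-cell rows to one median profile per strata group."""
    groups = defaultdict(list)
    for cell in cells:
        groups[tuple(cell[column] for column in strata)].append(cell)

    profiles = []
    for key, members in sorted(groups.items()):
        profile = dict(zip(strata, key))
        for feature in features:
            profile[feature] = median(member[feature] for member in members)
        profiles.append(profile)
    return profiles


# Hypothetical single-cell measurements; values are illustrative only.
cells = [
    {"Metadata_Plate": "BR00117035", "Metadata_Well": "A01", "Cells_AreaShape_Area": 100.0},
    {"Metadata_Plate": "BR00117035", "Metadata_Well": "A01", "Cells_AreaShape_Area": 120.0},
    {"Metadata_Plate": "BR00117035", "Metadata_Well": "A01", "Cells_AreaShape_Area": 110.0},
    {"Metadata_Plate": "BR00117035", "Metadata_Well": "B02", "Cells_AreaShape_Area": 90.0},
    {"Metadata_Plate": "BR00117035", "Metadata_Well": "B02", "Cells_AreaShape_Area": 95.0},
]

profiles = aggregate_median(
    cells,
    strata=["Metadata_Plate", "Metadata_Well"],
    features=["Cells_AreaShape_Area"],
)
for profile in profiles:
    print(profile)
```

The other three modules (annotate, normalize, feature selection) operate on tables of the same shape, which is why they chain naturally downstream of CytoTable.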
New modules — imaging
- #2: cellpose. Integrate Cellpose segmentation as an alternative to CellProfiler primary/secondary objects, feeding label matrices into a re-configured analysis pipeline.
- #5: fiji_stitchsegmentedimages. Stitch per-site assay-development overlays into a pseudo-plate view for at-a-glance visual QC.
Who should join?
This project is a good fit for people interested in:
- Nextflow / nf-core pipeline development (any experience level)
- Image-based profiling, high-content imaging, or Cell Painting assays
- The cytomining ecosystem (CellProfiler, CytoTable, pycytominer, coSMicQC)
- Test engineering with nf-test
- Bringing a new pipeline to its first release
You do not need a biology background — most of the open issues are pure pipeline-engineering tasks.
Recommended preparation
- Basic familiarity with Git and GitHub (forks, branches, pull requests)
- Some exposure to Nextflow and nf-core (the Hello Nextflow and Hello nf-core training courses are great primers)
- Working Docker or Singularity install for running -profile test,docker
- Optional reading: the Cell Painting protocol and the JUMP Cell Painting Consortium overview for context on the data