🇿🇦 Stellenbosch University

Hackathon - March 2025

This is a subpage of the main event page: Hackathon - March 2025.

See the main event page for event registration and information about the event.

Location
South Africa University

BMRI building, Francie Van Zijl Dr, Tygerberg, Cape Town, 7505

https://www.sun.ac.za/english/faculty/healthsciences/

Attendees from outside Stellenbosch University are welcome. If you have any questions, please send us an email.

Organizers:

Venue

BMRI building meeting room 1016, close to the stairs on the first floor.

Site-specific Instructions

Event overview

  1. Primary Goal

    • Introduce participants to intermediate topics in Nextflow, nf-core, and bioinformatics, with an emphasis on open and reproducible research.
  2. Focus Areas

    • Setting up and running pipelines
    • Reproducibility: version control, containerization, interoperability
    • Pipeline optimization and debugging
    • AI and collaboration in open science
    • Infrastructure and cloud computing
  3. Expected Outcomes

    • Gain skills in running and optimizing nf-core pipelines for your research needs
    • Receive an introduction to Seqera AI, Seqera Cloud, Seqera Data Studios, and Seqera Containers
    • Learn how to apply for Google Cloud credits

Hackathon Participant Guidelines

  1. 💻 Laptop (Mac, Linux, or Windows with Linux subsystem/dual-boot), charger & adapter if needed.
  2. Eduroam Wi-Fi access (visiting students will be accommodated).
  3. Basic bash scripting and command-line experience.
  4. Some knowledge of Nextflow and/or nf-core (experience running pipelines not required).
  5. Access to a server, cloud, or cluster (CHPC, Hemera, Khaos, Ilifu, etc.) OR a laptop with at least 8 GB RAM.
  6. Familiarity with GitHub and version control (must have a GitHub account).
  7. Basic understanding of containerization.
  8. Bringing your own study data is optional but useful for focused discussions.
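
For participants who want to check their laptop before arriving, the sketch below reports which command-line tools are on the PATH and how much memory the machine has. It is an illustration only: the tool list (`bash`, `git`, `java`, `nextflow`, `docker`) is an assumption, since the guidelines above do not name exact commands, and the function names are made up for this example.

```python
import os
import shutil

# Assumed tool list for illustration; the guidelines do not name exact commands.
ASSUMED_TOOLS = ("bash", "git", "java", "nextflow", "docker")
MIN_RAM_GB = 8  # from the guidelines: a laptop with at least 8 GB RAM


def find_tools(tools=ASSUMED_TOOLS):
    """Map each tool name to its location on PATH, or None if it is not found."""
    return {tool: shutil.which(tool) for tool in tools}


def total_ram_gb():
    """Return physical memory in GB, or None if the platform does not expose it."""
    try:
        page_size = os.sysconf("SC_PAGE_SIZE")
        page_count = os.sysconf("SC_PHYS_PAGES")
    except (AttributeError, ValueError, OSError):
        # os.sysconf is unavailable on some platforms (for example native Windows).
        return None
    return page_size * page_count / 1024**3


def report_lines(tools, ram_gb):
    """Build a human-readable report; nothing is treated as a hard failure."""
    lines = [f"{name}: {path or 'MISSING'}" for name, path in tools.items()]
    if ram_gb is None:
        lines.append("RAM: unknown")
    else:
        lines.append(f"RAM: {ram_gb:.1f} GB (guideline: at least {MIN_RAM_GB} GB)")
    return lines


if __name__ == "__main__":
    print("\n".join(report_lines(find_tools(), total_ram_gb())))
```

A machine sold as 8 GB usually reports slightly less than 8.0 here, so the script prints the value rather than enforcing a cutoff. A missing tool is not a blocker if you plan to work on a server, cloud, or cluster instead.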

Schedule

| TIME | 24-Mar-25 | 25-Mar-25 | 26-Mar-25 |
| --- | --- | --- | --- |
| 0900 - 1000 | Arrival, Setup and Introductions | Recap and Overview | Recap and Overview |
| 1000 - 1030 | Brief Intro to Nextflow and nf-core | Interest groups and relevant pipelines | The Seqera Platform: AI, cloud, containers, data studio |
| 1030 - 1100 | Possibilities of integration in projects | | |
| 1100 - 1200 | LUNCH | LUNCH | LUNCH |
| 1200 - 1300 | Setting up Nextflow and nf-core pipelines | Introduction to nf-core pipeline specific resources | Cloud computing and how to apply for Google Cloud credits |
| 1300 - 1400 | Q&A: Running generic pipelines | Q&A: Running specific pipelines | Q&A: Pipeline troubleshooting, closing of hackathon |
| 1400 - 1530 | Running generic pipelines | Running specific pipelines | Q&A: Closing of hackathon |

Interest Groups

💡 Why are the generic pipelines recommended for all participants? The generic pipelines provide foundational knowledge of nf-core infrastructure and are especially useful for participants who do not yet have access to their own data.

All participants will first run three generic nf-core pipelines (on Day 1) before choosing an interest group:

  1. Generic pipelines
  2. Transcriptomics pipelines
  3. Metagenomics pipelines
  4. Human Genomics pipelines

| Interest Group | Pipeline | Usage | Input | Function |
| --- | --- | --- | --- | --- |
| Generic | nf-core/demo | For testing Nextflow installation and nf-core configuration. | None (uses synthetic data bundled in the pipeline). | Validates the environment, Nextflow installation, and nf-core compatibility using small synthetic datasets. |
| Generic | nf-core/fetchngs | Downloads raw sequencing data from public repositories like ENA or SRA. | Accession IDs (e.g., SRR/ERR/DRR or project IDs). | Automates the retrieval of public sequencing datasets for downstream analysis. |
| Generic | nf-core/readsimulator | Simulates sequencing reads from reference genomes. | Reference genome sequences (.fasta) and optional parameters for sequencing simulation (e.g., error rates). | Generates synthetic sequencing datasets for testing or benchmarking bioinformatics workflows. |
| Transcriptomics | nf-core/rnaseq | Processes RNA-Seq data for gene and transcript quantification. | Raw RNA-Seq reads (.fastq.gz) and a reference genome or transcriptome (.fasta/.gtf). | Performs RNA-Seq alignment, quantification, and QC for transcriptomics studies. |
| Metagenomics | nf-core/taxprofiler | Performs taxonomic profiling of metagenomic datasets. | Metagenomic sequencing reads (.fastq.gz) and optionally a reference database. | Analyzes microbial community composition from metagenomic sequencing data. |
| Human Genomics | nf-core/sarek | Processes whole-genome or targeted sequencing data for germline or somatic variant analysis. | Raw sequencing reads (.fastq.gz), reference genome (.fasta), and optional panel of normals (PoN). | Supports cancer research and precision medicine through variant calling and annotation. |
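
As an illustration of the nf-core/fetchngs input listed above, the sketch below checks that a list of IDs looks like run accessions (SRR/ERR/DRR followed by digits) and renders them one per line, which is the layout the pipeline's ID file is assumed to use here. The accession values, the file name `ids.csv`, and the function names are placeholders for this example. Project-level IDs are also accepted by the pipeline but are deliberately not checked in this sketch.

```python
import re
from pathlib import Path

# Run-level accessions: SRR (SRA), ERR (ENA), DRR (DDBJ), followed by digits.
# Project IDs are valid pipeline input too, but this sketch does not cover them.
RUN_ACCESSION = re.compile(r"[SED]RR\d+")


def split_accessions(ids):
    """Separate IDs that look like run accessions from those that do not."""
    valid, rejected = [], []
    for raw in ids:
        accession = raw.strip()
        if not accession:
            continue  # ignore blank entries
        if RUN_ACCESSION.fullmatch(accession):
            valid.append(accession)
        else:
            rejected.append(accession)
    return valid, rejected


def render_id_file(ids):
    """Return file content with one accession per line."""
    return "".join(f"{accession}\n" for accession in ids)


if __name__ == "__main__":
    # Placeholder accessions for illustration only.
    valid, rejected = split_accessions(["SRR1234567", "ERR7654321", "not-an-id"])
    Path("ids.csv").write_text(render_id_file(valid))
    print(f"wrote {len(valid)} IDs to ids.csv; skipped: {rejected}")
```

The resulting file could then be passed to the pipeline, for example with `nextflow run nf-core/fetchngs --input ids.csv --outdir results -profile docker`. Treat that command as a starting point and check the pipeline documentation for the parameters of the version you run.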

Certificate Requirements

To receive a participation certificate, participants must:

  • Attend at least one full day of the hackathon
  • Actively engage in hackathon activities and provide a progress report