Introduction

nf-core/eager is a best-practice bioinformatics pipeline for analysing ancient DNA data.

The pipeline is built with Nextflow, a bioinformatics workflow tool. It pre-processes raw data from FastQ inputs, aligns the reads and performs extensive quality control on the results. It ships with Docker and Singularity containers, which make installation straightforward and results highly reproducible.
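
As a rough illustration of what launching the pipeline with one of these container engines might look like, the sketch below assembles a `nextflow run` command line in Python. The helper `build_command` is hypothetical, and the sketch assumes the conventional nf-core `docker` and `singularity` profiles. Pipeline-specific parameters (input reads, reference genome) are deliberately left out; see the documentation in docs/ for those.

```python
import shlex

# Container profiles assumed for this sketch (nf-core convention).
CONTAINER_PROFILES = ("docker", "singularity")


def build_command(profile: str = "docker") -> list:
    """Return an argv list that launches nf-core/eager with a container profile.

    Hypothetical helper for illustration only; input parameters are omitted.
    """
    if profile not in CONTAINER_PROFILES:
        raise ValueError(f"unknown container profile: {profile!r}")
    return ["nextflow", "run", "nf-core/eager", "-profile", profile]


if __name__ == "__main__":
    # Print the command instead of executing it, so the sketch has no side effects.
    print(shlex.join(build_command("singularity")))
```

Returning an argv list rather than a single string keeps the command safe to pass to `subprocess.run` without shell quoting issues.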

Pipeline steps

  • Create reference genome indices (optional)
    • BWA
    • Samtools Index
    • Sequence Dictionary
  • QC with FastQC
  • AdapterRemoval for read clipping and merging
  • Read mapping with BWA, BWA-MEM or CircularMapper
  • Samtools sort, index, stats & conversion to BAM
  • DeDup or MarkDuplicates read deduplication
  • QualiMap BAM quality control
  • Preseq Library Complexity Estimation
  • DamageProfiler damage profiling
  • BAM Clipping for UDG+/UDGhalf protocols
  • PMDTools damage filtering / assessment
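
The step order above can be sketched as a small data structure. This only illustrates the sequence listed in this README; it is not how the Nextflow workflow itself is implemented. The names `Step`, `STEPS` and `plan` are hypothetical, and only reference indexing is treated as optional because that is the only step the list marks as such.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    name: str
    tools: tuple
    optional: bool = False


# Steps in the order given in the README list.
STEPS = (
    Step("reference indexing", ("BWA", "Samtools Index", "Sequence Dictionary"), optional=True),
    Step("raw read QC", ("FastQC",)),
    Step("read clipping and merging", ("AdapterRemoval",)),
    Step("read mapping", ("BWA", "BWA-MEM", "CircularMapper")),
    Step("sort, index, stats and BAM conversion", ("Samtools",)),
    Step("read deduplication", ("DeDup", "MarkDuplicates")),
    Step("BAM QC", ("QualiMap",)),
    Step("library complexity estimation", ("Preseq",)),
    Step("damage profiling", ("DamageProfiler",)),
    Step("BAM clipping for UDG+/UDGhalf", ("BAM Clipping",)),
    Step("damage filtering / assessment", ("PMDTools",)),
)


def plan(have_indices: bool = False) -> list:
    """List step names in order, skipping optional steps when indices already exist."""
    return [s.name for s in STEPS if not (s.optional and have_indices)]


if __name__ == "__main__":
    for number, name in enumerate(plan(have_indices=True), start=1):
        print(f"{number}. {name}")
```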

Documentation

The nf-core/eager documentation is in the docs/ directory:

  1. Installation
  2. Pipeline configuration
  3. Running the pipeline
  4. Output and how to interpret the results
  5. Troubleshooting

Credits

This pipeline was written by Alexander Peltzer (apeltzer), with major contributions from Stephen Clayton, and ideas and documentation from James Fellows-Yates, Raphael Eisenhofer and Judith Neukamm. If you want to contribute, please open an issue and ask to be added to the project. We are happy to do so, and everyone is welcome to contribute!