Methylation (Bisulfite-Sequencing) analysis pipeline using Bismark or bwa-meth + MethylDackel
nf-core/methylseq is a bioinformatics analysis pipeline used for Methylation (Bisulfite) sequencing data. It pre-processes raw data from FastQ inputs, aligns the reads and performs extensive quality-control on the results.
The pipeline is built using Nextflow, a workflow tool to run tasks across multiple compute infrastructures in a very portable manner. It uses Docker / Singularity containers, making installation trivial and results highly reproducible.
On release, automated continuous integration tests run the pipeline on a full-sized dataset on the AWS cloud infrastructure. This ensures that the pipeline runs on AWS, has sensible resource allocation defaults set to run on real-world datasets, and permits the persistent storage of results to benchmark between pipeline releases and other analysis sources. The results obtained from the full-sized test can be viewed on the nf-core website.
The pipeline allows you to choose between running either Bismark or bwa-meth / MethylDackel.
Choose between workflows by using
--aligner bismark (default, uses bowtie2 for alignment),
--aligner bismark_hisat or
- Generate Reference Genome Index (optional)
- Merge re-sequenced FastQ files
- Raw data QC
- Adapter sequence trimming
- Extract methylation calls
First, prepare a samplesheet with your input data that looks as follows:
Each row represents a FastQ file (single-end) or a pair of FastQ files (paired-end).
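As an illustration of the row layout described above, here is a minimal Python sketch that writes such a samplesheet. The column names (`sample`, `fastq_1`, `fastq_2`) and the file names are assumptions made for this example; check the usage documentation for the exact header expected by the release you are running.

```python
import csv
import io

# Assumed column names; confirm them against the pipeline's usage documentation.
COLUMNS = ["sample", "fastq_1", "fastq_2"]


def build_samplesheet(rows):
    """Return samplesheet CSV text.

    Each row is a tuple (sample, fastq_1, fastq_2), where fastq_2 is None
    for single-end data.
    """
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(COLUMNS)
    for sample, fastq_1, fastq_2 in rows:
        # Single-end samples leave the second FastQ column empty.
        writer.writerow([sample, fastq_1, fastq_2 or ""])
    return buffer.getvalue()


# Hypothetical file names, for illustration only.
samplesheet = build_samplesheet([
    ("sample_se", "sample_se.fastq.gz", None),
    ("sample_pe", "sample_pe_R1.fastq.gz", "sample_pe_R2.fastq.gz"),
])
print(samplesheet)
```

Saving the returned text as `samplesheet.csv` gives one row per single-end file or per pair of paired-end files.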
Now, you can run the pipeline using:
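A minimal sketch of how such an invocation might be assembled, written in Python so the pieces are easy to vary. The `--aligner` parameter comes from this README; the `--input` and `--outdir` parameter names and the `docker` profile are assumptions based on common nf-core conventions, so verify them against the usage documentation before relying on them.

```python
import shlex


def build_run_command(samplesheet, outdir, profile="docker", aligner="bismark"):
    """Assemble a `nextflow run` argument list for nf-core/methylseq."""
    return [
        "nextflow", "run", "nf-core/methylseq",
        "-profile", profile,      # single dash: an option of Nextflow itself
        "--input", samplesheet,   # double dash: pipeline parameter (assumed name)
        "--outdir", outdir,       # pipeline parameter (assumed name)
        "--aligner", aligner,     # bismark is the documented default
    ]


# Hypothetical paths, for illustration only.
command = build_run_command("samplesheet.csv", "results")
print(shlex.join(command))
```

The argument list can be passed directly to `subprocess.run(command, check=True)` on a machine where Nextflow is installed.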
Please provide pipeline parameters via the CLI or the Nextflow -params-file option. Custom config files, including those provided by the -c Nextflow option, can be used to provide any configuration except for parameters.
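To make the parameters-file route concrete, the following Python sketch writes pipeline parameters to a JSON file (Nextflow's `-params-file` option accepts JSON or YAML) and builds the matching command. The parameter names `input` and `outdir` and all paths are assumptions for illustration; `aligner` is the parameter described in this README.

```python
import json
import os
import tempfile


def write_params_file(directory, params):
    """Write pipeline parameters as JSON and return the file path."""
    path = os.path.join(directory, "params.json")
    with open(path, "w", encoding="utf-8") as handle:
        json.dump(params, handle, indent=2)
    return path


with tempfile.TemporaryDirectory() as workdir:
    params_path = write_params_file(workdir, {
        "input": "samplesheet.csv",  # assumed parameter name
        "outdir": "results",         # assumed parameter name
        "aligner": "bismark",        # documented default aligner
    })
    # Parameters travel in the file; only Nextflow options stay on the CLI.
    command = ["nextflow", "run", "nf-core/methylseq", "-params-file", params_path]
    with open(params_path, encoding="utf-8") as handle:
        loaded = json.load(handle)

print(loaded)
```

Keeping parameters in a file like this leaves custom config files free for non-parameter settings such as resource limits.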
To see the results of an example test run with a full-size dataset, refer to the results tab on the nf-core website pipeline page. For more details about the output files and reports, please refer to the output documentation.
- Main author:
- Phil Ewels (@ewels)
Contributions and Support
If you would like to contribute to this pipeline, please see the contributing guidelines.
If you use nf-core/methylseq for your analysis, please cite it using the following doi: 10.5281/zenodo.1343417
An extensive list of references for the tools used by the pipeline can be found in the
You can cite the nf-core publication as follows:
The nf-core framework for community-curated bioinformatics pipelines.
Philip Ewels, Alexander Peltzer, Sven Fillinger, Harshil Patel, Johannes Alneberg, Andreas Wilm, Maxime Ulysse Garcia, Paolo Di Tommaso & Sven Nahnsen.
Nat Biotechnol. 2020 Feb 13. doi: 10.1038/s41587-020-0439-x.