nf-core/smrnaseq is a bioinformatics best-practice analysis pipeline used for small RNA sequencing data.

The pipeline is built using Nextflow, a workflow tool that runs tasks across multiple compute infrastructures in a portable manner. It comes with Docker containers, which make installation straightforward and results highly reproducible.

Pipeline summary

  1. Raw read QC (FastQC)
  2. Adapter trimming (Trim Galore!)
    1. Insert Size calculation
    2. Collapse reads (seqcluster)
  3. Alignment against miRBase mature miRNA (Bowtie1)
  4. Alignment against miRBase hairpin
    1. Unaligned reads from step 3 (Bowtie1)
    2. Collapsed reads from step 2.2 (Bowtie1)
  5. Post-alignment processing of miRBase hairpin
    1. Basic statistics from step 3 and step 4.1 (SAMtools)
    2. Analysis on miRBase hairpin counts (edgeR)
      • TMM normalization and a table of the most highly expressed hairpins
      • MDS plot clustering samples
      • Heatmap of sample similarities
    3. miRNA and isomiR annotation from step 4.1 (mirtop)
  6. Alignment against host reference genome (Bowtie1)
    1. Post-alignment processing of alignment against host reference genome (SAMtools)
  7. miRNA quality control (miRTrace)
  8. Present QC for raw read, alignment, and expression results (MultiQC)
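
To make step 2.2 concrete, the sketch below shows the general idea of read collapsing: identical sequences are merged into a single entry that carries a count, so later alignment steps handle each unique sequence once. This is only an illustration in plain Python, not seqcluster's actual implementation; the function name `collapse_reads` and the example reads are assumptions made for this sketch.

```python
from collections import Counter


def collapse_reads(sequences):
    """Collapse identical read sequences into (sequence, count) pairs.

    Hypothetical helper for illustration only; seqcluster's own
    collapsing logic and output format are not reproduced here.
    """
    counts = Counter(sequences)
    # Sort by descending count, then alphabetically, so the order is deterministic.
    return sorted(counts.items(), key=lambda item: (-item[1], item[0]))


# Made-up trimmed reads: three copies of one sequence and one copy of another.
reads = [
    "TGAGGTAGTAGGTTGTATAGTT",
    "TGAGGTAGTAGGTTGTATAGTT",
    "TAGCTTATCAGACTGATGTTGA",
    "TGAGGTAGTAGGTTGTATAGTT",
]

collapsed = collapse_reads(reads)
for sequence, count in collapsed:
    print(f"{sequence}\t{count}")
```

Because small RNA libraries typically contain many identical reads, collapsing them in this way reduces the number of sequences that need to be aligned while keeping the abundance information in the counts.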


The nf-core/smrnaseq pipeline comes with documentation about the pipeline, found in the docs/ directory:

  1. Installation
  2. Pipeline configuration
  3. Running the pipeline
  4. Output and how to interpret the results
  5. Troubleshooting


nf-core/smrnaseq was originally written for use at the National Genomics Infrastructure at SciLifeLab in Stockholm, Sweden, by Phil Ewels (@ewels), Chuan Wang (@chuan-wang) and Rickard Hammarén (@Hammarn). It was later updated by Lorena Pantano (@lpantano) from MIT.


You can cite the nf-core pre-print as follows:
Ewels PA, Peltzer A, Fillinger S, Alneberg JA, Patel H, Wilm A, Garcia MU, Di Tommaso P, Nahnsen S. nf-core: Community curated bioinformatics pipelines. bioRxiv. 2019. p. 610741. doi: 10.1101/610741.