nf-core/rnaseq
RNA sequencing analysis pipeline using STAR, RSEM, HISAT2 or Salmon with gene/isoform counts and extensive quality control.
Version history
What’s Changed
- Bump pipeline version to 3.15.0dev by @drpatelh in #1180
- update qualimap/rnaseq by @maxulysse in #1186
- Delete lib directory and replace with utils_* subworkflows by @drpatelh in #1197
- Replace modules.config with more modular config files per module/subworkflow/workflow by @drpatelh in #1199
- Important! Template update for nf-core/tools v2.12 by @nf-core-bot in #1201
- Remove lib directory and modules.config by @drpatelh in #1206
- CHORES: Update all modules by @maxulysse in #1212
- Use pseudoalignment subworkflow components from nf-core/modules by @pinin4fjords in #1210
- Pass transcriptome fasta through to samtools stats by @pinin4fjords in #1213
- Bump umitools + delocalise prepareforrsem by @pinin4fjords in #1214
- Delocalise catadditionalfasta by @pinin4fjords in #1216
- chore: Emiller88 => edmundmiller by @edmundmiller in #1217
- Important! Template update for nf-core/tools v2.13 by @nf-core-bot in #1218
- nf-test at the pipeline level by @adamrtalbot in #1220
- Reuse bbsplit index and don’t keep overwriting by @pinin4fjords in #1226
- Important! Template update for nf-core/tools v2.13.1 by @nf-core-bot in #1229
- Add all nf-core modules and CI/CD and all nf-tests to everything by @adamrtalbot in #1221
- Update sortmerna usage by @maxulysse in #1231
- Adding tests for star_genomegenerate_igenomes by @maxulysse in #1232
- Switch to genomecov from nf-core by @pinin4fjords in #1234
- Add nf-test to local module GTF_FILTER by @adamrtalbot in #1236
- nf-test: utils_nfcore_rnaseq_pipeline tests by @adamrtalbot in #1235
- add nf-tests for star_align_igenomes by @maxulysse in #1233
- nf-test for PREPROCESS_TRANSCRIPTS_FASTA_GENCODE by @adamrtalbot in #1238
- Fix concurrency error in Github CI workflow by @adamrtalbot in #1237
- ALIGN_STAR - add nf-test tests by @maxulysse in #1239
- Switch to dupradar from nf-core by @pinin4fjords in #1242
- Add nf-tests for deseq2_qc local module by @adamrtalbot in #1241
- Fix genomes params usage by @maxulysse in #1240
- Add gtf2bed tests by @pinin4fjords in #1244
- nf test quantify rsem by @adamrtalbot in #1245
- nf-test for module MULTIQC_CUSTOM_BIOTYPE by @adamrtalbot in #1243
- nf-test prepare_genome by @maxulysse in #1247
- Replace deseq2qc paths by @adamrtalbot in #1251
- Make README usage consistent with docs/usage.md by @pinin4fjords in #1228
- Remove all tags.yml files by @adamrtalbot in #1250
- Include nf-tests for rsem_merge_counts module by @robsyme in #1249
- nf-test quantify pseudoalignment by @adamrtalbot in #1246
- Update CHANGELOG by @drpatelh in #1260
- Try to fix CI pipeline AGAIN by @adamrtalbot in #1262
- Small updates noticed during code review by @drpatelh in #1265
- Delete unnecessary tags from nf.test files for modules and subworkflows by @drpatelh in #1266
- Add more tests for PREPARE_GENOME by @adamrtalbot in #1261
- Fixup: Add GHA files back into include statement of Github workflow change detection by @adamrtalbot in #1264
- Update trimming subworkflow to include more tests by @adamrtalbot in #1271
- Delocalise pseudo quant workflow by @pinin4fjords in #1278
- Update bam_markduplicates_picard subworkflow by @adamrtalbot in #1274
- Simple pipeline level nf-tests by @adamrtalbot in #1272
- Add new testing strategy based on nf-test files by @adamrtalbot in #1253
- Add pseudoaligner pipeline level tests to test suite by @adamrtalbot in #1279
- Reorganise pipeline tests into flat structure by @adamrtalbot in #1280
- Fix CHANGELOG error by @adamrtalbot in #1282
- Increase contents of default.main.nf.test.snap by @adamrtalbot in #1283
- Improved ext.args consolidation for STAR and TRIMGALORE by @MatthiasZepper in #1248
- Fix genomeAttribute usage by @maxulysse in #1252
- fix(subworkflow): update utils_nfcore_pipeline by @maxulysse in #1293
- Important! Template update for nf-core/tools v2.14.1 by @nf-core-bot in #1297
- Add missing files from Tximport processing by @pinin4fjords in #1302
- Remove redundant gene TPM outputs by @pinin4fjords in #1304
- Strip problematic ifEmpty() by @pinin4fjords in #1317
- Reinstate oncomplete error messages by @pinin4fjords in #1319
- Reinstate pseudoalignment subworkflow config by @pinin4fjords in #1310
- Document FASTP sampling by @pinin4fjords in #1309
- Fix issues with unzipping of GTF/ GFF files without absolute paths by @pinin4fjords in #1312
- Clarify infer strandedness step in subway map and text by @maxulysse in #1307
- Remove push and release triggers from CI by @adamrtalbot in #1321
- Use Github Action to detect file changes instead of custom Python code by @adamrtalbot in #1322
- Update actions/checkout to v4 by @adamrtalbot in #1323
- Overhaul strandedness detection / comparison by @pinin4fjords in #1306
- Move Conda dependencies for local modules to individual environment file by @drpatelh in #1326
- Minor fixes to strandedness settings and messaging by @pinin4fjords in #1325
- Fix tags entries and rename pipeline level tests by @drpatelh in #1324
- Remove tags from all nf-test files by @drpatelh in #1329
- Add nf-test for STAR-RSEM and HISAT2 aligners by @adamrtalbot in #1328
- Update all nf-core/modules and subworkflows by @drpatelh in #1330
- add stub for local modules by @maxulysse in #1331
- Adding stubs everywhere by @maxulysse in #1334
- Various MultiQC issues: FastQC sections for raw and trimmed reads // umi-tools dedup and extraction plots, custom content styling. by @MatthiasZepper in #1308
- Update Azure Batch guidance by @adamrtalbot in #1340
- Add reference recommendations to usage docs by @lazappi in #1314
- Add rename in the MultiQC report for samples without techreps by @pinin4fjords in #1341
- Use nf-core/setup-nf-test action for portability by @adamrtalbot in #1336
- Factor out preprocessing by @pinin4fjords in #1342
- Fix preprocessing call by @pinin4fjords in #1345
- Reduce resource usage for sort process in bedtools/genomecov by @pinin4fjords in #1350
- Correct conditional for salmon indexing in preprocessing workflow by @pinin4fjords in #1353
- Fix curves in subway map by @maxulysse in #1355
- Assorted fixes to MultiQC usage by @pinin4fjords in #1352
- Work around anchor issue in multiqc by @pinin4fjords in #1357
- Adding stubs at all level by @maxulysse in #1335
- Update test_full.config to restore a static URI for megatests by @pinin4fjords in #1358
- Revert multiqc workaround by @pinin4fjords in #1359
- Move multiqc module prefix for nf-test to module by @pinin4fjords in #1362
- Animate subway map by @maxulysse in #1361
- Snapshot all files (content or file name when not stable) by @maxulysse in #1360
- Fixing failing tests and updating modules once more by @maxulysse in #1363
- Clarify design formula and blind dispersion estimation by @pmoris in #1367
- Clarify docs on different tximport count files by @pmoris in #1366
- Bump versions for 3.15.0 by @pinin4fjords in #1370
- Apply Maxime’s changelog edits by @pinin4fjords in #1371
- Bump tximeta/tximport for gene table row names fix by @pinin4fjords in #1372
- [skip ci] Dev -> Master for 3.15.0 by @drpatelh in #1258
New Contributors
Full Changelog: 3.14.0…3.15.0
What’s Changed
- Bump versions for work on 3.14.0 milestone by @pinin4fjords in #1129
- Update action tower launch to v2 by @adamrtalbot in #1135
- Update FastQC and UMItools modules by @mahesh-panchal in #1138
- MultiQC dupRadar custom plot: specify plot_type explicitly by @vladsavelyev in #1137
- Important! Template update for nf-core/tools v2.11 by @nf-core-bot in #1141
- Revert “Update FastQC and UMItools modules” by @pinin4fjords in #1148
- Patch modules to fix #1103 by @drpatelh in #1149
- Interface to kmer size for pseudoaligners by @pinin4fjords in #1144
- Move fasta check back to Groovy by @pinin4fjords in #1143
- Be more flexible on attribute values in GTFs by @pinin4fjords in #1150
- fix to #1150: reinstate conditional by @pinin4fjords in #1151
- Bump container versions for tools using Docker V1 manifest by @drpatelh in #1152
- Prerelease 3.14.0 fixes by @pinin4fjords in #1154
- Add slash to outdir for cloud tests to fix Azure validation… by @drpatelh in #1157
- Bump MultiQC version from 1.17 -> 1.19 by @drpatelh in #1159
- Final prerelease fixes to fix Cloud CI by @drpatelh in #1160
- Dev -> master for 3.14.0 release by @drpatelh in #1156
New Contributors
- @vladsavelyev made their first contribution in #1137
Full Changelog: 3.13.2…3.14.0
What’s Changed
- Fix pipeline failure when transcript_fasta not provided and skip_gtf_filter is set to TRUE by @RHReynolds in #1126
- Enlarge the sampling range for column determination in FilterGTF script. by @MatthiasZepper in #1127
- Overhaul tximport.r, output length tables by @pinin4fjords in #1123
- Ensure pseudoaligner is set if pseudoalignment is not skipped by @pinin4fjords in #1124
- Dev -> master for 3.13.2 release by @pinin4fjords in #1128
New Contributors
- @RHReynolds made their first contribution in #1126
Full Changelog: 3.13.1…3.13.2
What’s Changed
- Changes for 3.13.1 patch release incl. igenomes star fix by @pinin4fjords in #1121
- Dev -> master for 3.13.1 release by @pinin4fjords in #1122
Full Changelog: 3.13.0…3.13.1
What’s Changed
- Display a warning when ‘--extra_star_align_args’ are used with RSEM by @MatthiasZepper in #1049
- Update public_aws_ecr.config by @maxulysse in #1048
- Remove public_aws_ecr profile by @adamrtalbot in #1051
- Important! Template update for nf-core/tools v2.9 by @nf-core-bot in #1053
- Update credits for subway map by @maxulysse in #1057
- Use nf-validation plugin for parameter and samplesheet validation by @drpatelh in #1058
- fix copy paste typo by @hmehlan in #1062
- Update untar by @pinin4fjords in #1068
- README.md: Added ref to downstream analyses by @smoe in #1060
- Update the CODE_OF_CONDUCT and CONTRIBUTING with nf-core template 2.10 by @adamrtalbot in #1088
- Reorganise arguments for clearer syntax by @adamrtalbot in #1091
- Reorganise local modules into subfolder/main.nf for consistency by @adamrtalbot in #1083
- Important! Template update for nf-core/tools v2.10 by @nf-core-bot in #1078
- Update usage.md for igenomes warning by @pinin4fjords in #1073
- Update all nf-core/modules in pipeline by @drpatelh in #1093
- update config to enable usage of a custom config by @maxulysse in #1108
- Kallisto quantification by @pinin4fjords in #1106
- Expand GTF filtering to remove rows with empty transcript ID when required, fix STAR GTF usage by @pinin4fjords in #1107
- Pre-release fixes for 3.13.0 by @pinin4fjords in #1114
- Maxime feedback by @pinin4fjords in #1116
- FIX: Subway map by @maxulysse in #1117
- final updates on subway map by @maxulysse in #1120
- Dev -> Master for 3.13.0 release by @drpatelh in #1113
New Contributors
- @hmehlan made their first contribution in #1062
- @pinin4fjords made their first contribution in #1068
- @smoe made their first contribution in #1060
Full Changelog: 3.12.0…3.13.0
[3.12.0] - 2023-06-02
Credits
Special thanks to the following for their contributions to the release:
Thank you to everyone else that has contributed by reporting bugs, enhancements or in any other way, shape or form.
Enhancements & fixes
- [#1011] - FastQ files from UMI-tools not being passed to fastp
- [#1018] - Ability to skip both alignment and pseudo-alignment to only run pre-processing QC steps.
- PR #1016 - Updated pipeline template to nf-core/tools 2.8
- PR #1025 - Add `public_aws_ecr.config` to source mulled containers when using `public.ecr.aws` Docker Biocontainer registry
- PR #1038 - Updated error log for count values when supplying `--additional_fasta`
- PR #1042 - revert samtools_sort modules to no memory assignment
Parameters
| Old parameter | New parameter |
|---|---|
| | `--skip_pseudo_alignment` |

NB: Parameter has been updated if both old and new parameter information is present.
NB: Parameter has been added if just the new parameter information is present.
NB: Parameter has been removed if new parameter information isn’t present.
Software dependencies
Dependency | Old version | New version |
---|---|---|
fastp | 0.23.2 | 0.23.4 |
samtools | 1.16.1 | 1.17 |
NB: Dependency has been updated if both old and new version information is present.
NB: Dependency has been added if just the new version information is present.
NB: Dependency has been removed if new version information isn’t present.
[3.11.2] - 2023-04-25
Credits
Special thanks to the following for their contributions to the release:
Thank you to everyone else that has contributed by reporting bugs, enhancements or in any other way, shape or form.
Enhancements & fixes
- [#1003] - `FASTQ_SUBSAMPLE_FQ_SALMON:SALMON_INDEX` is launched multiple times and fails
[3.11.1] - 2023-03-31
Credits
Special thanks to the following for their code contributions to the release:
Enhancements & fixes
- [#987] - Fix issue with incorrect caching of test datasets during CI/CD
- [#988] - `DESEQ2_QC_STAR_SALMON` fails when sample names have many components
- Remove `wait: false` option from Tower Actions which is the default
- Fix release trigger for full-sized multi-cloud tests
- Adding `[ci fast]` to commit message now skips all tests except for standard `-profile test` pipeline run
[3.11.0] - 2023-03-30
Credits
Special thanks to the following for their code contributions to the release:
Thank you to everyone else that has contributed by reporting bugs, enhancements or in any other way, shape or form.
Enhancements & fixes
- Add infrastructure and CI for multi-cloud full-sized tests run via Nextflow Tower (see #981)
- Added fastp support.
  - Users can now select between `--trimmer trimgalore` (default) and `--trimmer fastp`.
  - Trim Galore! specific pipeline parameters have been deprecated: `--clip_r1`, `--clip_r2`, `--three_prime_clip_r1`, `--three_prime_clip_r2` and `--trim_nextseq`
  - Any additional options can now be specified via the `--extra_trimgalore_args` and `--extra_fastp_args` parameters, respectively.
- [#663] - Alternative trimming step for polyA/T removal
- [#781] - Add Warning for poly(A) libraries
- [#878] - Allow tabs in fasta header when creating decoys for salmon index
- [#931] - Save transcriptome BAM files when using `--save_umi_intermeds` / `--save_align_intermeds`
- [#934] - Union of `ext.args` and `params.extra_star_align_args` prevents parameter clashes in the STAR module
- [#940] - Bugfix in `salmon_summarizedexperiment.r` to ensure `rbind` doesn’t fail when `rowdata` has no `tx` column.
- [#944] - Read clipping using clip_r1, clip_r2, three_prime_clip_r1, three_prime_clip_r2 disabled in 3.10
- [#956] - Implement ‘auto’ as default strandedness argument in `fastq_dir_to_samplesheet.py` script
- [#960] - Failure with awsbatch when running processes that are using `executor: local`
- [#961] - Add warnings to STDOUT for all skipped and failed strandedness check samples
- [#975] - `SALMON_INDEX` runs when using `--aligner star_rsem` even if samples have explicit strandedness
- Remove HISAT2 from automated AWS full-sized tests
Parameters
| Old parameter | New parameter |
|---|---|
| | `--trimmer` |
| | `--extra_trimgalore_args` |
| `--clip_r1` | |
| `--clip_r2` | |
| `--three_prime_clip_r1` | |
| `--three_prime_clip_r2` | |
| `--tracedir` | |
| `--trim_nextseq` | |

NB: Parameter has been updated if both old and new parameter information is present.
NB: Parameter has been added if just the new parameter information is present.
NB: Parameter has been removed if new parameter information isn’t present.
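
For anyone carrying a pre-3.11.0 params file forward, the deprecated clipping parameters can be folded into `--extra_trimgalore_args`. The following is a minimal sketch, not part of the pipeline: the helper name `migrate_trim_params` is hypothetical, and the mapping assumes each deprecated parameter corresponds to the Trim Galore! option of the same meaning (`--clip_R1`, `--clip_R2`, `--three_prime_clip_R1`, `--three_prime_clip_R2`, `--nextseq`).

```python
# Hypothetical migration helper (not shipped with nf-core/rnaseq).
# Assumption: each deprecated pipeline parameter maps onto the Trim Galore!
# command-line option listed on the right-hand side.
DEPRECATED_TO_TRIMGALORE = {
    "clip_r1": "--clip_R1",
    "clip_r2": "--clip_R2",
    "three_prime_clip_r1": "--three_prime_clip_R1",
    "three_prime_clip_r2": "--three_prime_clip_R2",
    "trim_nextseq": "--nextseq",
}


def migrate_trim_params(params: dict) -> dict:
    """Return a copy of params with deprecated keys folded into extra_trimgalore_args."""
    # Keep every parameter that is not deprecated.
    migrated = {k: v for k, v in params.items() if k not in DEPRECATED_TO_TRIMGALORE}
    # Start from any extra arguments the user already supplied.
    extra = [migrated["extra_trimgalore_args"]] if migrated.get("extra_trimgalore_args") else []
    for old_key, flag in DEPRECATED_TO_TRIMGALORE.items():
        value = params.get(old_key)
        if value:  # skip unset, None, or 0 values
            extra.append(f"{flag} {value}")
    if extra:
        migrated["extra_trimgalore_args"] = " ".join(extra)
    return migrated


old = {"trimmer": "trimgalore", "clip_r1": 10, "trim_nextseq": 20}
new = migrate_trim_params(old)
print(new)
```

The resulting string would then be passed as a single quoted value, e.g. `--extra_trimgalore_args "--clip_R1 10 --nextseq 20"`.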
Software dependencies
Note, since the pipeline is now using Nextflow DSL2, each process will be run with its own Biocontainer. This means that on occasion it is entirely possible for the pipeline to be using different versions of the same tool. However, the overall software dependency changes compared to the last release have been listed below for reference.
Dependency | Old version | New version |
---|---|---|
fastp | | 0.23.2 |
multiqc | 1.13 | 1.14 |
picard | 2.27.4 | 3.0.0 |
salmon | 1.9.0 | 1.10.1 |
umi_tools | 1.1.2 | 1.1.4 |
NB: Dependency has been updated if both old and new version information is present.
NB: Dependency has been added if just the new version information is present.
NB: Dependency has been removed if new version information isn’t present.
[3.10.1] - 2023-01-05
Enhancements & fixes
[3.10] - 2022-12-21
Enhancements & fixes
- Bump minimum Nextflow version from `21.10.3` -> `22.10.1`
- Updated pipeline template to nf-core/tools 2.7.2
- [#729] - Add ‘auto’ option to samplesheet to automatically detect strandedness for samples
- [#889] - Document valid options for `--genome` parameter
- [#891] - Skip MarkDuplicates when UMIs are used
- [#896] - Remove `copyTo` call for iGenomes README
- [#897] - Use `--skip_preseq` by default
- [#898] - Documentation on salmon decoy-aware index creation, gcbias and seqbias
- [#900] - Add `--recursive` option to `fastq_dir_to_samplesheet.py` script
- [#902] - `check_samplesheet.py` script doesn’t output optional columns in samplesheet
- [#907] - Add `--extra_star_align_args` and `--extra_salmon_quant_args` parameter
- [#912] - Add UMI deduplication before quantification in tube map
Parameters
| Old parameter | New parameter |
|---|---|
| `--enable_conda` | |
| | `--extra_star_align_args` |
| | `--extra_salmon_quant_args` |

NB: Parameter has been updated if both old and new parameter information is present.
NB: Parameter has been added if just the new parameter information is present.
NB: Parameter has been removed if new parameter information isn’t present.
Software dependencies
Note, since the pipeline is now using Nextflow DSL2, each process will be run with its own Biocontainer. This means that on occasion it is entirely possible for the pipeline to be using different versions of the same tool. However, the overall software dependency changes compared to the last release have been listed below for reference.
Dependency | Old version | New version |
---|---|---|
bbmap | 38.93 | 39.01 |
bioconductor-dupradar | 1.18.0 | 1.28.0 |
bioconductor-summarizedexperiment | 1.20.0 | 1.24.0 |
bioconductor-tximeta | 1.8.0 | 1.12.0 |
fq | | 0.9.1 |
salmon | 1.5.2 | 1.9.0 |
samtools | 1.15.1 | 1.16.1 |
NB: Dependency has been updated if both old and new version information is present. NB: Dependency has been added if just the new version information is present. NB: Dependency has been removed if version information isn’t present.
[3.9] - 2022-09-30
Enhancements & fixes
- [#746] - Add `tin.py` output to MultiQC report
- [#841] - Turn `--deseq2_vst` on by default
- [#853] - Pipeline fails at email step: Failed to invoke `workflow.onComplete` event handler
- [#857] - Missing parameter required by StringTie if using STAR as aligner
- [#862] - Filter samples that have no reads after trimming
- [#864] - Pre-process transcripts fasta when using `--gencode`
- Expose additional arguments to UMI-tools as pipeline params: `--umitools_bc_pattern2` is required if the UMI is located on read 2. `--umitools_umi_separator` will often be needed in conjunction with `--skip_umi_extract` as most other tools such as Illumina’s `BCL Convert` use a colon instead of an underscore to separate the UMIs. The `--umitools_grouping_method` parameter allows fine-tuning of how similar but non-identical UMIs are handled.
- Updated pipeline template to nf-core/tools 2.5.1
Parameters
| Old parameter | New parameter |
|---|---|
| | `--umitools_bc_pattern2` |
| | `--umitools_umi_separator` |
| | `--umitools_grouping_method` |
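
To illustrate why the separator matters, here is a simplified sketch of pulling a UMI off the end of a read name. It is not how UMI-tools is implemented; the function `split_umi` and both read names are hypothetical, chosen only to contrast an underscore-delimited UMI with a colon-delimited one.

```python
from __future__ import annotations


def split_umi(read_name: str, separator: str = "_") -> tuple[str, str]:
    """Split a read name into (name_without_umi, umi) at the last separator."""
    name, sep, umi = read_name.rpartition(separator)
    if not sep:
        raise ValueError(f"separator {separator!r} not found in {read_name!r}")
    return name, umi


# Underscore-delimited UMI at the end of the read name (hypothetical name).
print(split_umi("READ1_ACGTAC"))

# Colon-delimited UMI, as in the BCL Convert case described above
# (hypothetical read name).
print(split_umi("INSTR:1:FC:1:1101:1000:2000:ACGTAC", separator=":"))
```

With the wrong separator the colon-delimited name either fails to split or yields the wrong field, which is the situation `--umitools_umi_separator` is meant to address.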
Software dependencies
Note, since the pipeline is now using Nextflow DSL2, each process will be run with its own Biocontainer. This means that on occasion it is entirely possible for the pipeline to be using different versions of the same tool. However, the overall software dependency changes compared to the last release have been listed below for reference.
Dependency | Old version | New version |
---|---|---|
hisat2 | 2.2.0 | 2.2.1 |
multiqc | 1.11 | 1.13 |
picard | 2.26.10 | 2.27.4 |
NB: Dependency has been updated if both old and new version information is present. NB: Dependency has been added if just the new version information is present. NB: Dependency has been removed if version information isn’t present.
[3.8] - 2022-05-25
⚠️ Major enhancements
Fixed a well hidden bug in the UMI processing mode of the pipeline when using `--with_umi --aligner star_salmon` as reported by Lars Roed Ingerslev. Paired-end BAM files were not appropriately name sorted after `umi_tools dedup` which ultimately resulted in incorrect reading and quantification with Salmon. If you have used previous versions of the pipeline to analyse paired-end UMI data it will need to be reprocessed using this version of the pipeline. See #828 for more context.
Enhancements & fixes
- [#824] - Add explicit docs for usage of featureCounts in the pipeline
- [#825] - Pipeline fails due to trimming related removal of all reads from a sample
- [#827] - Control generation of `--output-stats` when running umi-tools dedup
- [#828] - Filter BAM output of UMI-tools dedup before passing to Salmon quant
- Updated pipeline template to nf-core/tools 2.4.1
Parameters
| Old parameter | New parameter |
|---|---|
| | `--min_trimmed_reads` |
| | `--umitools_dedup_stats` |
[3.7] - 2022-05-03
⚠️ Major enhancements
- Updated default STAR version to latest available (`2.7.10a`; see #808)
- Vanilla Linux Docker container changed from `biocontainers/biocontainers:v1.2.0_cv1` to `ubuntu:20.04` to fix issues observed on GCP (see #764)
Enhancements & fixes
- [#762] - Explicitly set `--skip_bbsplit false` with `--bbsplit_fasta_list` to use BBSplit
- [#764] - Test fails when using GCP due to missing tools in the basic biocontainer
- [#765] - Add docs for the usage of nf-core/rnaseq with prokaryotic data
- [#775] - Incorrect columns in Salmon transcript files
- [#791] - Add outputs for umitools dedup summary stats
- [#797] - Add `--skip_umi_extract` to account for pre-existing UMIs header embeddings.
- [#798] - Decompress transcript fasta error
- [#799] - Issue with using `--retain_unpaired` with the `FASTQC_UMITOOLS_TRIMGALORE:TRIMGALORE` module
- [#802] - `--bam_csi_index` error generated if `--skip_alignment` specified
- [#808] - Auto-detect usage of Illumina iGenomes reference
- [#809] - Add metro map for pipeline
- [#814] - Use decimal values for `--min_mapped_reads`
- Updated pipeline template to nf-core/tools 2.3.2
Parameters
| Old parameter | New parameter |
|---|---|
| | `--skip_umi_extract` |
Software dependencies
Note, since the pipeline is now using Nextflow DSL2, each process will be run with its own Biocontainer. This means that on occasion it is entirely possible for the pipeline to be using different versions of the same tool. However, the overall software dependency changes compared to the last release have been listed below for reference.
Dependency | Old version | New version |
---|---|---|
samtools | 1.14 | 1.15.1 |
star | 2.6.1d | 2.7.10a |
stringtie | 2.1.7 | 2.2.1 |
NB: Dependency has been updated if both old and new version information is present. NB: Dependency has been added if just the new version information is present. NB: Dependency has been removed if version information isn’t present.
[3.6] - 2022-03-04
Enhancements & fixes
- nf-core/tools#1415 - Make `--outdir` a mandatory parameter
- [#734] - Is a vulnerable picard still used ? log4j vulnerability
- [#744] - Auto-detect and raise error if CSI is required for BAM indexing
- [#750] - Optionally ignore R1 / R2 after UMI extraction process
- [#752] - How to set publishing mode for all processes?
- [#753] - Add warning when user provides `--transcript_fasta`
- [#754] - DESeq2 QC issue linked to `--count_col` parameter
- [#755] - Rename RSEM_PREPAREREFERENCE_TRANSCRIPTS process
- [#759] - Empty lines in samplesheet.csv cause a crash
- [#769] - Do not run RSeQC tin.py by default
Parameters
| Old parameter | New parameter |
|---|---|
| | `--publish_dir_mode` |
| | `--umi_discard_read` |

NB: Parameter has been updated if both old and new parameter information is present.
NB: Parameter has been added if just the new parameter information is present.
NB: Parameter has been removed if new parameter information isn’t present.
[3.5] - 2021-12-17
Enhancements & fixes
- Port pipeline to the updated Nextflow DSL2 syntax adopted on nf-core/modules
  - Removed `--publish_dir_mode` as it is no longer required for the new syntax
- Bump minimum Nextflow version from `21.04.0` -> `21.10.3`
- Updated pipeline template to nf-core/tools 2.2
- [#664] - Conflict of library names for technical replicates
- [#720] - KeyError ‘gene_id’ in salmon_tx2gene.py
- [#724] - Deal with warnings generated when native NF processes are used
- [#725] - Untar needs `--no-same-owner` on DNAnexus
- [#727] - Fix transcriptome staging issues on DNAnexus for rsem/prepareference
- [#728] - Add RSeQC TIN.py as a quality metric for the pipeline
[3.4] - 2021-10-05
Enhancements & fixes
- Software version(s) will now be reported for every module imported during a given pipeline execution
- Added `python3` shebang to appropriate scripts in `bin/` directory
- [#407] - Filter mouse reads from PDX samples
- [#570] - Update SortMeRNA to use SilvaDB 138 (for commercial use)
- [#690] - Error with post-trimmed read 2 sample names from FastQC in MultiQC
- [#693] - Cutadapt version missing from MultiQC report
- [#697] - pipeline_report.{txt,html} missing from pipeline_info directory
- [#705] - Sample sheet error check false positive
Parameters
| Old parameter | New parameter |
|---|---|
| | `--bbsplit_fasta_list` |
| | `--bbsplit_index` |
| | `--save_bbsplit_reads` |
| | `--skip_bbsplit` |

NB: Parameter has been updated if both old and new parameter information is present.
NB: Parameter has been added if just the new parameter information is present.
NB: Parameter has been removed if new parameter information isn’t present.
Software dependencies
Note, since the pipeline is now using Nextflow DSL2, each process will be run with its own Biocontainer. This means that on occasion it is entirely possible for the pipeline to be using different versions of the same tool. However, the overall software dependency changes compared to the last release have been listed below for reference.
Dependency | Old version | New version |
---|---|---|
bbmap | | 38.93 |
hisat2 | 2.2.0 | 2.2.1 |
picard | 2.23.9 | 2.25.7 |
salmon | 1.4.0 | 1.5.2 |
samtools | 1.12 | 1.13 |
sortmerna | 4.2.0 | 4.3.4 |
trim-galore | 0.6.6 | 0.6.7 |
NB: Dependency has been updated if both old and new version information is present. NB: Dependency has been added if just the new version information is present. NB: Dependency has been removed if version information isn’t present.
[3.3] - 2021-07-29
Enhancements & fixes
- Updated pipeline template to nf-core/tools 2.1
- [#556] - Genome index isn’t recreated with `--additional_fasta` unless `--star_index false`
- [#668] - Salmon quant with UMI-tools does not work
- [#674] - Launch pipeline regex fails
Software dependencies
Note, since the pipeline is now using Nextflow DSL2, each process will be run with its own Biocontainer. This means that on occasion it is entirely possible for the pipeline to be using different versions of the same tool. However, the overall software dependency changes compared to the last release have been listed below for reference.
Dependency | Old version | New version |
---|---|---|
samtools | 1.10 | 1.12 |
stringtie | 2.1.4 | 2.1.7 |
umi_tools | 1.1.1 | 1.1.2 |
NB: Dependency has been updated if both old and new version information is present. NB: Dependency has been added if just the new version information is present. NB: Dependency has been removed if version information isn’t present.
[3.2] - 2021-06-18
Enhancements & fixes
- Removed workflow to download data from public databases in favour of using nf-core/fetchngs
- Added a stand-alone Python script `bin/fastq_dir_to_samplesheet.py` to auto-create samplesheet from a directory of FastQ files
- Added docs about overwriting default container definitions to use latest versions e.g. Pangolin
- [#637] - Add `--salmon_quant_libtype` parameter to provide the `--libType` option to salmon quantification
- [#645] - Remove trailing slash from `params.igenomes_base`
- [#649] - DESeq2 fails with only one sample
- [#652] - Results files have incorrect file names
- [nf-core/viralrecon#201] - Conditional include are not expected to work
Parameters
| Old parameter | New parameter |
|---|---|
| `--public_data_ids` | |
| `--skip_sra_fastq_download` | |
| | `--salmon_quant_libtype` |

NB: Parameter has been updated if both old and new parameter information is present.
NB: Parameter has been added if just the new parameter information is present.
NB: Parameter has been removed if new parameter information isn’t present.
[3.1] - 2021-05-13
⚠️ Major enhancements
- Samplesheet format has changed from `group,replicate,fastq_1,fastq_2,strandedness` to `sample,fastq_1,fastq_2,strandedness`
  - This gives users the flexibility to name their samples however they wish (see #550)
  - PCA generated by DESeq2 will now be monochrome and will not be grouped by using the replicate id
- Updated Nextflow version to `v21.04.0` (see nextflow#572)
- Restructure pipeline scripts into `modules/`, `subworkflows/` and `workflows/` directories
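
An existing samplesheet in the old five-column layout can be rewritten into the new four-column layout with a few lines of Python. This is a minimal sketch under one assumption: the `sample` value is built as `<group>_R<replicate>`, which is just one possible naming choice, since the new format lets you name samples freely. The function name and file names are hypothetical.

```python
import csv
import io

NEW_COLUMNS = ["sample", "fastq_1", "fastq_2", "strandedness"]


def convert_samplesheet(old_csv: str) -> str:
    """Convert group,replicate,... rows into sample,... rows."""
    reader = csv.DictReader(io.StringIO(old_csv))
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=NEW_COLUMNS, lineterminator="\n")
    writer.writeheader()
    for row in reader:
        writer.writerow(
            {
                # Assumed naming scheme; any unique sample name would do.
                "sample": f"{row['group']}_R{row['replicate']}",
                "fastq_1": row["fastq_1"],
                "fastq_2": row["fastq_2"],  # stays empty for single-end data
                "strandedness": row["strandedness"],
            }
        )
    return out.getvalue()


old_sheet = (
    "group,replicate,fastq_1,fastq_2,strandedness\n"
    "control,1,c1_R1.fastq.gz,c1_R2.fastq.gz,reverse\n"
    "control,2,c2_R1.fastq.gz,,reverse\n"
)
new_sheet = convert_samplesheet(old_sheet)
print(new_sheet)
```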
Enhancements & fixes
- Updated pipeline template to nf-core/tools `1.14`
- Initial implementation of a standardised samplesheet JSON schema to use with user interfaces and for validation
- Only FastQ files that need to be concatenated will be passed to `CAT_FASTQ` process
- [#449] - `--genomeSAindexNbases` will now be auto-calculated before building STAR indices
- [#460] - Auto-detect and bypass featureCounts execution if biotype doesn’t exist in GTF
- [#544] - Update test-dataset for pipeline
- [#553] - Make tximport output files using all the samples; identified by @j-andrews7
- [#561] - Add gene symbols to merged output; identified by @grst
- [#563] - samplesheet.csv merge error
- [#567] - Update docs to mention trimgalore core usage nuances
- [#568] - `--star_index` argument is ignored with `--aligner star_rsem` option
- [#569] - nextflow edge release documentation for running 3.0
- [#575] - Remove duplicated salmon output files
- [#576] - umi_tools dedup : Run before salmon to dedup counts
- [#582] - Generate a separate bigwig tracks for each strand
- [#583] - Samtools error during run requires use of BAM CSI index
- [#585] - Clarify salmon uncertainty for some transcripts
- [#604] - Additional fasta with GENCODE annotation results in biotype error
- [#610] - save R objects as RDS
- [#619] - implicit declaration of the workflow in main
- [#629] - Add and fix EditorConfig linting in entire pipeline
- [nf-core/modules#423] - Replace `publish_by_id` module option to `publish_by_meta`
- [nextflow#2060] - Pipeline execution hang when native task fail to be submitted
Parameters
| Old parameter | New parameter |
|---|---|
| `--hisat_build_memory` | `--hisat2_build_memory` |
| `--gtf_count_type` | `--featurecounts_feature_type` |
| `--gtf_group_features_type` | `--featurecounts_group_type` |
| | `--bam_csi_index` |
| | `--schema_ignore_params` |
| | `--show_hidden_params` |
| | `--validate_params` |
| `--clusterOptions` | |

NB: Parameter has been updated if both old and new parameter information is present.
NB: Parameter has been added if just the new parameter information is present.
NB: Parameter has been removed if new parameter information isn’t present.
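
The three renames in this table can be applied mechanically to an existing set of parameters. The snippet below is a rough sketch rather than an official migration tool; `rename_params` is a hypothetical helper, and it only handles the renamed parameters, leaving everything else untouched.

```python
# Old name -> new name, taken from the renamed rows of the 3.1 parameter table.
RENAMED = {
    "hisat_build_memory": "hisat2_build_memory",
    "gtf_count_type": "featurecounts_feature_type",
    "gtf_group_features_type": "featurecounts_group_type",
}


def rename_params(params: dict) -> dict:
    """Return a copy of params with renamed keys replaced by their new names."""
    return {RENAMED.get(key, key): value for key, value in params.items()}


updated = rename_params({"gtf_count_type": "exon", "aligner": "star_rsem"})
print(updated)
```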
Software dependencies
Note, since the pipeline is now using Nextflow DSL2, each process will be run with its own Biocontainer. This means that on occasion it is entirely possible for the pipeline to be using different versions of the same tool. However, the overall software dependency changes compared to the last release have been listed below for reference.
Dependency | Old version | New version |
---|---|---|
bedtools | 2.29.2 | 2.30.0 |
multiqc | 1.9 | 1.10.1 |
preseq | 2.0.3 | 3.1.2 |
NB: Dependency has been updated if both old and new version information is present. NB: Dependency has been added if just the new version information is present. NB: Dependency has been removed if version information isn’t present.
[3.0] - 2020-12-15
⚠️ Major enhancements
- You will need to install Nextflow `>=20.11.0-edge` to run the pipeline. If you are using Singularity, then features introduced in that release now enable the pipeline to directly download Singularity images hosted by Biocontainers as opposed to performing a conversion from Docker images (see #496).
- The previous default of aligning BAM files using STAR and quantifying using featureCounts (`--aligner star`) has been removed. The new default is to align with STAR and quantify using Salmon (`--aligner star_salmon`).
  - This decision was made primarily because of the limitations of featureCounts to appropriately quantify gene expression data. Please see Zhao et al., 2015 and Soneson et al., 2015.
- For similar reasons, quantification will not be performed if using `--aligner hisat2` due to the lack of an appropriate option to calculate accurate expression estimates from HISAT2 derived genomic alignments.
  - This pipeline option is still available for those who have a preference for the alignment, QC and other types of downstream analysis compatible with the output of HISAT2. No gene-level quantification results will be generated.
  - In a future release we hope to add back quantitation for HISAT2 using different tools.
Enhancements & fixes
- Updated pipeline template to nf-core/tools `1.12.1`
- Bumped Nextflow version `20.07.1` -> `20.11.0-edge`
- Added UCSC `bedClip` module to restrict bedGraph file coordinates to chromosome boundaries
- Check if Bioconda and conda-forge channels are set-up correctly when running with `-profile conda`
- Use `rsem-prepare-reference` and not `gffread` to create transcriptome fasta file
- [#494] - Issue running rnaseq v2.0 (DSL2) with test profile
- [#496] - Direct download of Singularity images via HTTPS
- [#498] - Significantly different versions of STAR in star_rsem (2.7.6a) and star (2.6.1d)
- [#499] - Use of salmon counts for DESeq2
- [#500, #509] - Error with AWS batch params
- [#511] - rsem/star index fails with large genome
- [#515] - Add decoy-aware indexing for salmon
- [#516] - Unexpected error [InvocationTargetException]
- [#525] - sra_ids_to_runinfo.py UnicodeEncodeError
Parameters
| Old parameter | New parameter |
|---|---|
| --fc_extra_attributes | --gtf_extra_attributes |
| --fc_group_features | --gtf_group_features |
| --fc_count_type | --gtf_count_type |
| --fc_group_features_type | --gtf_group_features_type |
| | --singularity_pull_docker_container |
| --skip_featurecounts | |
NB: Parameter has been updated if both old and new parameter information is present.
NB: Parameter has been added if just the new parameter information is present.
NB: Parameter has been removed if parameter information isn’t present.
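The three NB rules amount to a small classification over (old, new) pairs. The following is a minimal sketch, assuming each table row is held as a tuple with an empty string for a missing cell. The `classify` helper is hypothetical and not part of the pipeline, and the column assigned to each single-entry row is an assumption based on the rest of these release notes.

```python
def classify(old: str, new: str) -> str:
    """Apply the NB legend to one table row (empty string = no information)."""
    if old and new:
        return "updated"
    if new:
        return "added"
    if old:
        return "removed"
    raise ValueError("row carries no parameter information")


# Example rows from the 3.0 parameter table. Treating
# --singularity_pull_docker_container as an addition and
# --skip_featurecounts as a removal is an assumption.
rows = [
    ("--fc_count_type", "--gtf_count_type"),
    ("", "--singularity_pull_docker_container"),
    ("--skip_featurecounts", ""),
]

for old, new in rows:
    print(f"{old or new}: {classify(old, new)}")
```

The same helper applies unchanged to the dependency tables, with version strings in place of parameter names.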
Software dependencies
Note, since the pipeline is now using Nextflow DSL2, each process will be run with its own Biocontainer. This means that on occasion it is entirely possible for the pipeline to be using different versions of the same tool. However, the overall software dependency changes compared to the last release have been listed below for reference.
| Dependency | Old version | New version |
|---|---|---|
| bioconductor-summarizedexperiment | 1.18.1 | 1.20.0 |
| bioconductor-tximeta | 1.6.3 | 1.8.0 |
| picard | 2.23.8 | 2.23.9 |
| requests | | 2.24.0 |
| salmon | 1.3.0 | 1.4.0 |
| ucsc-bedclip | | 377 |
| umi_tools | 1.0.1 | 1.1.1 |
NB: Dependency has been updated if both old and new version information is present.
NB: Dependency has been added if just the new version information is present.
NB: Dependency has been removed if version information isn’t present.
[2.0] - 2020-11-12
Major enhancements
- Pipeline has been re-implemented in Nextflow DSL2
- All software containers are now exclusively obtained from Biocontainers
- Added a separate workflow to download FastQ files via SRA, ENA or GEO ids and to auto-create the input samplesheet (`ENA FTP`; see `--public_data_ids` parameter)
- Added and refined a Groovy `lib/` of functions that include the automatic rendering of parameters defined in the JSON schema for the help and summary log information
- Replace edgeR with DESeq2 for the generation of PCA and heatmaps (also included in the MultiQC report)
- Creation of bigWig coverage files using BEDTools and bedGraphToBigWig
- [#70] - Added new genome mapping and quantification route with RSEM via the `--aligner star_rsem` parameter
- [#72] - Samples skipped due to low alignment reported in the MultiQC report
- [#73, #435] - UMI barcode support
- [#91] - Ability to concatenate multiple runs of the same samples via the input samplesheet
- [#123] - The primary input for the pipeline has changed from `--reads` glob to samplesheet `--input`. See usage docs.
- [#197] - Samples failing strand-specificity checks reported in the MultiQC report
- [#227] - Removal of ribosomal RNA via SortMeRNA
- [#419] - Add `--additional_fasta` parameter to provide ERCC spike-ins, transgenes such as GFP or CAR-T as additional sequences to align to
Other enhancements & fixes
- Updated pipeline template to nf-core/tools `1.11`
- Optimise MultiQC configuration for faster run-time on huge sample numbers
- Add information about SILVA licensing when removing rRNA to `usage.md`
- Fixed ansi colours for pipeline summary, added summary logs of alignment results
- [#281] - Add nag to cite the pipeline in summary
- [#302] - Fixed MDS plot axis labels
- [#338] - Add option for turning on/off STAR command line option (`--sjdbGTFfile`)
- [#344] - Added multi-core TrimGalore support
- [#351] - Fixes missing Qualimap parameter `-p`
- [#353] - Fixes an issue where MultiQC fails to run with `--skip_biotype_qc` option
- [#357] - Fixes broken links
- [#362] - Fix error with gzipped annotation file
- [#384] - Changed SortMeRNA reference dbs path to use stable URLs (v4.2.0)
- [#396] - Deterministic mapping for STAR aligner
- [#412] - Fix Qualimap not being passed on correct strand-specificity parameter
- [#413] - Fix STAR unmapped reads not output
- [#434] - Fix typo reported for work-dir
- [#437] - FastQC uses correct number of threads now
- [#440] - Fixed issue where featureCounts process fails when setting `--fc_count_type` to gene
- [#452] - Fix `--gff` input bug
- [#345] - Fixes label name in FastQC process
- [#391] - Make publishDir mode configurable
- [#431] - Update AWS GitHub actions workflow with organization level secrets
- [#435] - Fix a bug where gzipped references were not extracted when `--additional_fasta` was not specified
- [#435] - Fix a bug where merging of RSEM output would fail if only one fastq provided as input
- [#435] - Correct RSEM output name (was saving counts but calling them TPMs; now saving both properly labelled)
- [#436] - Fix a bug where the RSEM reference could not be built
- [#458] - Fix `TMP_DIR` for process MarkDuplicates and Qualimap
Parameters
Updated
Old parameter | New parameter |
---|---|
--reads | --input |
--igenomesIgnore | --igenomes_ignore |
--removeRiboRNA | --remove_ribo_rna |
--rRNA_database_manifest | --ribo_database_manifest |
--save_nonrRNA_reads | --save_non_ribo_reads |
--saveAlignedIntermediates | --save_align_intermeds |
--saveReference | --save_reference |
--saveTrimmed | --save_trimmed |
--saveUnaligned | --save_unaligned |
--skipAlignment | --skip_alignment |
--skipBiotypeQC | --skip_biotype_qc |
--skipDupRadar | --skip_dupradar |
--skipFastQC | --skip_fastqc |
--skipMultiQC | --skip_multiqc |
--skipPreseq | --skip_preseq |
--skipQC | --skip_qc |
--skipQualimap | --skip_qualimap |
--skipRseQC | --skip_rseqc |
--skipTrimming | --skip_trimming |
--stringTieIgnoreGTF | --stringtie_ignore_gtf |
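When updating an old launch command, the renames above can be applied mechanically. The following is a minimal sketch, assuming the command is available as a list of arguments. The `RENAMED` dictionary holds only a subset of the table, and `migrate_args` is a hypothetical helper, not something shipped with the pipeline. `--reads` is deliberately left out because the switch to `--input` also changes the expected value from a glob to a samplesheet, so renaming the flag alone would not be enough.

```python
# Subset of the rename table above; extend with the remaining rows as needed.
RENAMED = {
    "--igenomesIgnore": "--igenomes_ignore",
    "--removeRiboRNA": "--remove_ribo_rna",
    "--saveAlignedIntermediates": "--save_align_intermeds",
    "--saveReference": "--save_reference",
    "--skipQC": "--skip_qc",
    "--skipTrimming": "--skip_trimming",
}


def migrate_args(args: list[str]) -> list[str]:
    """Return a copy of args with old flag names replaced by their 2.0 names.

    Handles both `--flag value` and `--flag=value` spellings; anything not
    listed in RENAMED is passed through untouched.
    """
    migrated = []
    for arg in args:
        flag, sep, value = arg.partition("=")
        migrated.append(RENAMED.get(flag, flag) + sep + value)
    return migrated


old_cmd = ["--skipQC", "--saveReference=true", "--aligner", "hisat2"]
print(migrate_args(old_cmd))
```

Flags that were removed outright in 2.0 are not handled here and would need to be dropped by hand.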
Added
- `--additional_fasta` - FASTA file to concatenate to genome FASTA file e.g. containing spike-in sequences
- `--deseq2_vst` - Use vst transformation instead of rlog with DESeq2
- `--enable_conda` - Run this workflow with Conda. You can also use ‘-profile conda’ instead of providing this parameter
- `--min_mapped_reads` - Minimum percentage of uniquely mapped reads below which samples are removed from further processing
- `--multiqc_title` - MultiQC report title. Printed as page header, used for filename if not otherwise specified
- `--public_data_ids` - File containing SRA/ENA/GEO identifiers one per line in order to download their associated FastQ files
- `--publish_dir_mode` - Method used to save pipeline results to output directory
- `--rsem_index` - Path to directory or tar.gz archive for pre-built RSEM index
- `--rseqc_modules` - Specify the RSeQC modules to run
- `--save_merged_fastq` - Save FastQ files after merging re-sequenced libraries in the results directory
- `--save_umi_intermeds` - If this option is specified, intermediate FastQ and BAM files produced by UMI-tools are also saved in the results directory
- `--skip_bigwig` - Skip bigWig file creation
- `--skip_deseq2_qc` - Skip DESeq2 PCA and heatmap plotting
- `--skip_featurecounts` - Skip featureCounts
- `--skip_markduplicates` - Skip picard MarkDuplicates step
- `--skip_sra_fastq_download` - Only download metadata for public data database ids and don’t download the FastQ files
- `--skip_stringtie` - Skip StringTie
- `--star_ignore_sjdbgtf` - See #338
- `--umitools_bc_pattern` - The UMI barcode pattern to use e.g. ‘NNNNNN’ indicates that the first 6 nucleotides of the read are from the UMI
- `--umitools_extract_method` - UMI pattern to use. Can be either ‘string’ (default) or ‘regex’
- `--with_umi` - Enable UMI-based read deduplication
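To illustrate what a barcode pattern such as ‘NNNNNN’ means for `--umitools_bc_pattern`, the sketch below moves the first six bases of a read into its name. It is a simplified illustration and not the UMI-tools implementation: `extract_umi` is a hypothetical helper, it only understands patterns made entirely of `N`, and appending the UMI to the read name with an underscore is an assumption about the output convention.

```python
def extract_umi(read_name: str, sequence: str, quality: str, bc_pattern: str = "NNNNNN"):
    """Move the leading UMI bases from the read into the read name.

    Only the simplest case is covered: a pattern made purely of 'N'
    characters, each marking one UMI base at the start of the read.
    """
    if set(bc_pattern) != {"N"}:
        raise ValueError("this sketch only understands all-'N' patterns")
    n = len(bc_pattern)
    if len(sequence) < n:
        raise ValueError("read is shorter than the barcode pattern")
    umi = sequence[:n]
    # Trim the UMI bases (and their qualities) off the read itself.
    return f"{read_name}_{umi}", sequence[n:], quality[n:]


name, seq, qual = extract_umi("read1", "ACGTTAGGCATCGA", "IIIIIIIIIIIIII")
print(name, seq, qual)
```

Reads that share a mapping position and the same extracted UMI can then be treated as duplicates of one original molecule, which is the idea behind `--with_umi`.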
Removed
- `--awsqueue` can now be provided via nf-core/configs if using AWS
- `--awsregion` can now be provided via nf-core/configs if using AWS
- `--compressedReference` now auto-detected
- `--markdup_java_options` in favour of updating centrally on nf-core/modules
- `--project` parameter from old NGI template
- `--readPaths` is not required since these are provided from the input samplesheet
- `--sampleLevel` not required
- `--singleEnd` is now auto-detected from the input samplesheet
- `--skipEdgeR` QC is now performed by DESeq2 instead
- `--star_memory` in favour of updating centrally on nf-core/modules if required
- Strandedness is now specified at the sample-level via the input samplesheet
  - `--forwardStranded`
  - `--reverseStranded`
  - `--unStranded`
  - `--pico`
Software dependencies
Note, since the pipeline is now using Nextflow DSL2, each process will be run with its own Biocontainer. This means that on occasion it is entirely possible for the pipeline to be using different versions of the same tool. However, the overall software dependency changes compared to the last release have been listed below for reference.
Dependency | Old version | New version |
---|---|---|
bioconductor-dupradar | 1.14.0 | 1.18.0 |
bioconductor-summarizedexperiment | 1.14.0 | 1.18.1 |
bioconductor-tximeta | 1.2.2 | 1.6.3 |
fastqc | 0.11.8 | 0.11.9 |
gffread | 0.11.4 | 0.12.1 |
hisat2 | 2.1.0 | 2.2.0 |
multiqc | 1.7 | 1.9 |
picard | 2.21.1 | 2.23.8 |
qualimap | 2.2.2c | 2.2.2d |
r-base | 3.6.1 | 4.0.3 |
salmon | 0.14.2 | 1.3.0 |
samtools | 1.9 | 1.10 |
sortmerna | 2.1b | 4.2.0 |
stringtie | 2.0 | 2.1.4 |
subread | 1.6.4 | 2.0.1 |
trim-galore | 0.6.4 | 0.6.6 |
bedtools | - | 2.29.2 |
bioconductor-biocparallel | - | 1.22.0 |
bioconductor-complexheatmap | - | 2.4.2 |
bioconductor-deseq2 | - | 1.28.0 |
bioconductor-tximport | - | 1.16.0 |
perl | - | 5.26.2 |
python | - | 3.8.3 |
r-ggplot2 | - | 3.3.2 |
r-optparse | - | 1.6.6 |
r-pheatmap | - | 1.0.12 |
r-rcolorbrewer | - | 1.1_2 |
rsem | - | 1.3.3 |
ucsc-bedgraphtobigwig | - | 377 |
umi_tools | - | 1.0.1 |
bioconductor-edger | - | - |
deeptools | - | - |
matplotlib | - | - |
r-data.table | - | - |
r-gplots | - | - |
r-markdown | - | - |
NB: Dependency has been updated if both old and new version information is present.
NB: Dependency has been added if just the new version information is present.
NB: Dependency has been removed if version information isn’t present.
Version 1.4.2
- Minor version release for keeping Git History in sync
- No changes with respect to 1.4.1 on pipeline level
Version 1.4.1
Major novel changes include:
- Update `igenomes.config` with NCBI `GRCh38` and most recent UCSC genomes
- Set `autoMounts = true` by default for `singularity` profile
Pipeline enhancements & fixes
Major novel changes include:
- Support for Salmon as an alternative method to STAR and HISAT2
- Several improvements in `featureCounts` handling of types other than `exon`. It is possible now to handle nuclear RNA-seq data. Nuclear RNA has un-spliced RNA, and the whole transcript, including the introns, needs to be counted, e.g. by specifying `--fc_count_type transcript`.
- Support for outputting unaligned data to results folders.
- Added options to skip several steps
  - Skip trimming using `--skipTrimming`
  - Skip BiotypeQC using `--skipBiotypeQC`
  - Skip Alignment using `--skipAlignment` to only use pseudo-alignment using Salmon
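The difference between counting over `exon` features and counting over the whole `transcript` can be shown with a toy example. This is a sketch with a hypothetical gene model and made-up read positions. It assigns reads by start position only, which is a simplification and not how featureCounts assigns reads.

```python
# Hypothetical gene model: two exons separated by one intron
# (0-based coordinates, end-exclusive).
exons = [(100, 200), (800, 1000)]
transcript = (exons[0][0], exons[-1][1])  # full span, introns included

# Hypothetical read start positions; 350 and 600 fall inside the intron,
# as expected for un-spliced nuclear RNA.
read_starts = [120, 150, 350, 600, 820, 950]


def count_in(intervals, positions):
    """Count positions that fall inside any of the given intervals."""
    return sum(any(start <= p < end for start, end in intervals) for p in positions)


exon_count = count_in(exons, read_starts)
transcript_count = count_in([transcript], read_starts)
print(exon_count, transcript_count)
```

With these numbers the exon-level count misses the two intronic reads that the transcript-level count picks up.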
Documentation updates
- Adjust wording of skipped samples in pipeline output
- Fixed link to guidelines #203
- Add `Citation` and `Quick Start` section to `README.md`
- Add in Documentation of the `--gff` parameter
Reporting Updates
- Generate MultiQC plots in the results directory #200
- Get MultiQC to save plots as standalone files
- Get MultiQC to write out the software versions in a `.csv` file #185
- Use `file` instead of `new File` to create `pipeline_report.{html,txt}` files, and properly create subfolders
Pipeline enhancements & fixes
- Restore `SummarizedExperiment` object creation in the salmon_merge process avoiding increasing memory with sample size.
- Fix sample names in feature counts and dupRadar to remove suffixes added in other processes
- Removed `genebody_coverage` process #195
- Implemented Pearson's correlation instead of Euclidean distance #146
- Add `--stringTieIgnoreGTF` parameter #206
- Removed unused `stringtie` channels for `MultiQC`
- Integrate changes in `nf-core/tools v1.6` template which resolved #90
- Moved process `convertGFFtoGTF` before `makeSTARindex` #215
- Change all boolean parameters from `snake_case` to `camelCase` and vice versa for value parameters
- Add SM ReadGroup info for QualiMap compatibility #238
- Obtain edgeR + dupRadar version information #198 and #112
- Add `--gencode` option for compatibility of Salmon and featureCounts biotypes with GENCODE gene annotations
- Added functionality to accept compressed reference data in the pipeline
- Check that gtf features are on chromosomes that exist in the genome fasta file #274
- Maintain all gff features upon gtf conversion (keeps `gene_biotype` or `gene_type` to make `featureCounts` happy)
- Add SortMeRNA as an optional step to allow rRNA removal #280
- Minimal adjustment of memory and CPU constraints for clusters with locked memory / CPU relation
- Cleaned up usage, `parameters.settings.json` and the `nextflow.config`
Dependency Updates
- Dependency list is now sorted appropriately
- Force matplotlib=3.0.3
Updated Packages
- Picard 2.20.0 -> 2.21.1
- bioconductor-dupradar 1.12.1 -> 1.14.0
- bioconductor-edger 3.24.3 -> 3.26.5
- gffread 0.9.12 -> 0.11.4
- trim-galore 0.6.1 -> 0.6.4
- rseqc 3.0.0 -> 3.0.1
- R-Base 3.5 -> 3.6.1
Added / Removed Packages
Pipeline Updates
- Added configurable options to specify group attributes for featureCounts #144
- Added support for RSeqC 3.0 #148
- Added a `parameters.settings.json` file for use with the new `nf-core launch` helper tool.
- Centralized all configuration profiles using nf-core/configs
- Fixed all centralized configs for offline usage
- Hide %dup in multiqc report
Bug fixes
- Fixing HISAT2 Index Building for large reference genomes #153
- Fixing HISAT2 BAM sorting using more memory than available on the system
- Fixing MarkDuplicates memory consumption issues following #179
Dependency Updates
- RSeQC 2.6.4 -> 3.0.0
- Picard 2.18.15 -> 2.18.23
- r-data.table 1.11.4 -> 1.12.0
- r-markdown 0.8 -> 0.9
- csvtk 0.15.0 -> 0.17.0
- stringtie 1.3.4 -> 1.3.5
- subread 1.6.2 -> 1.6.4
- gffread 0.9.9 -> 0.9.12
- multiqc 1.6 -> 1.7
Pipeline updates
- Removed some outdated documentation about non-existent features
- Config refactoring and code cleaning
- Added a `--fcExtraAttributes` option to specify more than ENSEMBL gene names in `featureCounts`
- Remove legacy rseqc `strandRule` config code. #119
- Added STRINGTIE ballgown output to results folder #125
- HiSAT index build now requests `200GB` memory, enough to use the exons / splice junction option for building.
  - Added documentation about the `--hisatBuildMemory` option.
- BAM indices are stored and re-used between processes #71
Bug Fixes
Pipeline updates
- Wrote docs and made minor tweaks to the `--skip_qc` and associated options
- Removed the deprecated `uppmax-modules` config profile
- Updated the `hebbe` config profile to use the new `withName` syntax too
- Use new `workflow.manifest` variables in the pipeline script
- Updated minimum nextflow version to `0.32.0`
Software updates
- FastQC `0.11.7` > `0.11.8`
- STAR `2.6.1a` > `2.6.1b`
- Picard `2.18.11` > `2.18.14`
- Deeptools `3.1.1` > `3.1.4`
Bug Fixes
Initial release of nf-core/rnaseq! 🎉
This release marks the point where the pipeline was moved from SciLifeLab/NGI-RNAseq over to the new nf-core community, at nf-core/rnaseq. You can view the previous changelog at SciLifeLab/NGI-RNAseq/CHANGELOG.md
In addition to porting to the new nf-core community, the pipeline has had a number of major changes in this version. There have been 157 commits by 16 different contributors covering 70 different files in the pipeline: 7,357 additions and 8,236 deletions!
In summary, the main changes are:
- Rebranding and renaming throughout the pipeline to nf-core
- Updating many parts of the pipeline config and style to meet nf-core standards
- Support for GFF files in addition to GTF files
  - Just use `--gff` instead of `--gtf` when specifying a file path
- New command line options to skip various quality control steps
- More safety checks when launching a pipeline
- Several new sanity checks - for example, that the specified reference genome exists
- Improved performance with memory usage (especially STAR and Picard)
- New BigWig file outputs for plotting coverage across the genome
- Refactored gene body coverage calculation, now much faster and using much less memory
- Bugfixes in the MultiQC process to avoid edge cases where it wouldn’t run
- MultiQC report now automatically attached to the email sent when the pipeline completes
- New testing method, with data on GitHub
  - Now run pipeline with `-profile test` instead of using bash scripts
- Rewritten continuous integration tests with Travis CI
- New explicit support for Singularity containers
- Improved MultiQC support for DupRadar and featureCounts
- Now works for all users instead of just NGI Stockholm
- New configuration for use on AWS batch
- Updated config syntax to support latest versions of Nextflow
- Built-in support for a number of new local HPC systems
  - CCGA, GIS, UCT HEX, updates to UPPMAX, CFC, BINAC, Hebbe, c3se
- Slightly improved documentation (more updates to come)
- Updated software packages
…and many more minor tweaks.
Thanks to everyone who has worked on this release!