nf-core/spatialvi

Pipeline for processing spatially-resolved gene counts with spatial coordinates and image data. Designed for 10x Genomics Visium transcriptomics.

10x-genomics, 10xgenomics, image-processing, microscopy, rna-seq, single-cell, spatial, spatial-transcriptomics, st, transcriptomics, visium

This is the development version of the pipeline.

Launch development version https://github.com/nf-core/spatialvi

Introduction

This document describes the output produced by the pipeline. Most of the output is contained in HTML reports rendered with Quarto, but the pipeline also writes data files that you can analyse further yourself or explore interactively with tools such as TissUUmaps.

The directories listed below will be created in the results directory after the pipeline has finished. Results for individual samples will be created in subdirectories following the <OUTDIR>/<SAMPLE>/ structure. All paths are relative to the top-level results directory.

The pipeline is built using Nextflow and processes data using the following steps:

Space Ranger
Data
Reports
Workflow reporting
- Pipeline information - Report metrics generated during the workflow execution

Space Ranger

Output files

<SAMPLE>/spaceranger/
- outs/spatial/tissue_[hi/low]res_image.png: High- and low-resolution images of the tissue.
- outs/spatial/tissue_positions_list.csv: Spot barcodes and their array positions.
- outs/spatial/scalefactors_json.json: Scale conversion factors for the spots.
- outs/filtered_feature_bc_matrix/barcodes.tsv.gz: List of barcode IDs.
- outs/filtered_feature_bc_matrix/features.tsv.gz: List of feature IDs.
- outs/filtered_feature_bc_matrix/matrix.mtx.gz: Sparse matrix of UMI counts per feature and barcode, in Matrix Market format.

All files produced by Space Ranger are currently published as output of this pipeline, regardless of whether they are used downstream; you can find more information about these files on the 10x Genomics website.
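As an illustration of how the spatial files relate to each other, the sketch below reads the spot positions and scale factors with the Python standard library and converts full-resolution pixel coordinates to coordinates on the high-resolution image. It assumes that tissue_positions_list.csv has no header row and uses the column order barcode, in_tissue, array_row, array_col, pxl_row_in_fullres, pxl_col_in_fullres, and that scalefactors_json.json contains a tissue_hires_scalef key; check both against your Space Ranger version. The function names are hypothetical and not part of the pipeline.

```python
import csv
import json
from pathlib import Path

# Assumed column order of the header-less tissue_positions_list.csv.
POSITION_COLUMNS = [
    "barcode", "in_tissue", "array_row", "array_col",
    "pxl_row_in_fullres", "pxl_col_in_fullres",
]


def load_scalefactors(spatial_dir):
    """Read scalefactors_json.json from an outs/spatial/ directory."""
    with open(Path(spatial_dir) / "scalefactors_json.json") as handle:
        return json.load(handle)


def load_spot_positions(spatial_dir):
    """Read tissue_positions_list.csv into a list of dicts, one per spot."""
    spots = []
    path = Path(spatial_dir) / "tissue_positions_list.csv"
    with open(path, newline="") as handle:
        for row in csv.reader(handle):
            if not row:
                continue  # skip blank lines
            spot = dict(zip(POSITION_COLUMNS, row))
            for key in ("in_tissue", "array_row", "array_col"):
                spot[key] = int(spot[key])
            for key in ("pxl_row_in_fullres", "pxl_col_in_fullres"):
                spot[key] = float(spot[key])
            spots.append(spot)
    return spots


def spots_in_hires_pixels(spots, scalefactors):
    """Map in-tissue barcodes to (row, col) on the high-resolution image."""
    scale = scalefactors["tissue_hires_scalef"]
    return {
        spot["barcode"]: (
            spot["pxl_row_in_fullres"] * scale,
            spot["pxl_col_in_fullres"] * scale,
        )
        for spot in spots
        if spot["in_tissue"] == 1
    }
```

A call such as spots_in_hires_pixels(load_spot_positions(d), load_scalefactors(d)), with d pointing at <SAMPLE>/spaceranger/outs/spatial/, would give coordinates that can be overlaid on tissue_hires_image.png.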

Data

Output files

<SAMPLE>/data/
- sdata_processed.zarr: Processed data in SpatialData format.
- adata_processed.h5ad: Processed data in AnnData format.
- spatially_variable_genes.csv: List of spatially variable genes.

These files contain the data as processed by the pipeline, in .zarr and .h5ad formats, and can be used for further downstream analyses; the unprocessed data is also present in these files. They can also be opened in the browser-based tool TissUUmaps for interactive visualisation and exploration. The list of spatially variable genes is added as a convenience if you want to explore it in e.g. Excel.
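If you would rather inspect the list of spatially variable genes programmatically than in a spreadsheet, a minimal sketch using only the Python standard library could look like the following. The column layout of spatially_variable_genes.csv is not described here, so the score column name is passed in explicitly and the gene identifier is assumed to be in the first column unless stated otherwise; the function name is hypothetical.

```python
import csv


def top_spatially_variable_genes(csv_path, score_column, n=10, gene_column=None):
    """Return the n genes with the highest value in score_column.

    gene_column defaults to the first column of the file (an assumption).
    """
    with open(csv_path, newline="") as handle:
        reader = csv.DictReader(handle)
        # Accessing fieldnames reads the header row.
        gene_key = gene_column if gene_column is not None else reader.fieldnames[0]
        rows = [(row[gene_key], float(row[score_column])) for row in reader]
    # Highest score first.
    rows.sort(key=lambda pair: pair[1], reverse=True)
    return rows[:n]
```

For example, top_spatially_variable_genes("<SAMPLE>/data/spatially_variable_genes.csv", score_column="I", n=20) would list the 20 highest-scoring genes, assuming the score column is called "I" in your file.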

Reports

Output files

<SAMPLE>/reports/
- _extensions/: Quarto nf-core extension, common to all reports.

Quality controls and filtering

Output files

<SAMPLE>/reports/
- quality_controls.html: Rendered HTML report.
- quality_controls.yml: YAML file containing parameters used in the report.
- quality_controls.qmd: Quarto document used for rendering the report.

Report containing analyses related to quality controls and filtering of spatial data. Spots are filtered based on total counts, number of expressed genes as well as presence in tissue; you can find more details in the report itself.
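The filtering logic described above can be sketched in plain Python as follows. This is not the report's own code: the Spot record, the function name and the threshold values are illustrative assumptions, and the actual thresholds are those recorded in quality_controls.yml.

```python
from dataclasses import dataclass


@dataclass
class Spot:
    barcode: str
    total_counts: int   # total UMI counts in the spot
    n_genes: int        # number of genes with at least one count
    in_tissue: bool     # whether the spot lies under the tissue


def filter_spots(spots, min_counts, min_genes):
    """Keep in-tissue spots that pass both minimum thresholds."""
    return [
        spot for spot in spots
        if spot.in_tissue
        and spot.total_counts >= min_counts
        and spot.n_genes >= min_genes
    ]


# Example with made-up spots and thresholds.
example_spots = [
    Spot("AAAC-1", total_counts=5000, n_genes=1800, in_tissue=True),
    Spot("AAAG-1", total_counts=120, n_genes=60, in_tissue=True),     # too few counts
    Spot("AAAT-1", total_counts=7000, n_genes=2500, in_tissue=False), # off tissue
]
kept = filter_spots(example_spots, min_counts=500, min_genes=250)
print([spot.barcode for spot in kept])  # prints ['AAAC-1']
```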

Clustering

Output files

<SAMPLE>/reports/
- clustering.html: Rendered HTML report.
- clustering.yml: YAML file containing parameters used in the report.
- clustering.qmd: Quarto document used for rendering the report.

Report containing analyses related to normalisation, dimensionality reduction, clustering and spatial visualisation. Leiden clustering is currently the only option; you can find more details in the report itself.

Spatially variable genes

Output files

<SAMPLE>/reports/
- spatially_variable_genes.html: Rendered HTML report.
- spatially_variable_genes.yml: YAML file containing parameters used in the report.
- spatially_variable_genes.qmd: Quarto document used for rendering the report.

Report containing analyses related to differential expression testing and spatially variable genes. Moran's I is currently the only option for spatial testing; you can find more details in the report itself.
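For readers unfamiliar with the statistic, the sketch below computes the textbook form of Moran's I for one gene, I = (N / W) * sum_ij w_ij (x_i - m)(x_j - m) / sum_i (x_i - m)^2, where m is the mean expression, w_ij are spatial weights and W is their sum. It is a plain-Python illustration, not the report's implementation; the function names and the toy chain-shaped neighbourhood are assumptions made for the example.

```python
def morans_i(values, weights):
    """Moran's I for a list of values and a dict {(i, j): w_ij} of weights."""
    n = len(values)
    mean = sum(values) / n
    deviations = [value - mean for value in values]
    denominator = sum(d * d for d in deviations)
    total_weight = sum(weights.values())
    if denominator == 0 or total_weight == 0:
        raise ValueError("Moran's I is undefined for constant values or empty weights")
    numerator = sum(
        w * deviations[i] * deviations[j] for (i, j), w in weights.items()
    )
    return (n / total_weight) * (numerator / denominator)


def chain_weights(n):
    """Binary symmetric weights linking each spot to its neighbours on a line."""
    weights = {}
    for i in range(n - 1):
        weights[(i, i + 1)] = 1.0
        weights[(i + 1, i)] = 1.0
    return weights
```

With four spots on a line, a smooth gradient such as [1, 2, 3, 4] gives a positive value (1/3), while an alternating pattern such as [1, -1, 1, -1] gives -1, matching the interpretation that positive scores indicate spatially clustered expression.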

Workflow reporting

Pipeline information

Output files

pipeline_info/
- Reports generated by Nextflow: execution_report.html, execution_timeline.html, execution_trace.txt and pipeline_dag.dot/pipeline_dag.svg.
- Reports generated by the pipeline: pipeline_report.html, pipeline_report.txt and software_versions.yml. The pipeline_report* files will only be present if the --email / --email_on_fail parameters are used when running the pipeline.
- Reformatted samplesheet files used as input to the pipeline: samplesheet.valid.csv.
- Parameters used by the pipeline run: params.json.
multiqc/
- Report generated by MultiQC: multiqc_report.html.
- Data and plots generated by MultiQC: multiqc_data/ and multiqc_plots/.

Nextflow provides excellent functionality for generating various reports relevant to the running and execution of the pipeline. These reports help you troubleshoot errors in a run and also provide other information such as launch commands, run times and resource usage.
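As a starting point for troubleshooting, the trace file can be summarised with a few lines of standard-library Python. The sketch assumes that execution_trace.txt is tab-separated with a header row that includes the name and status columns, as in Nextflow's default trace layout; the function names are hypothetical.

```python
import csv
from collections import Counter


def summarise_trace(trace_path):
    """Count tasks per status (e.g. COMPLETED, CACHED, FAILED)."""
    with open(trace_path, newline="") as handle:
        reader = csv.DictReader(handle, delimiter="\t")
        return Counter(row["status"] for row in reader)


def failed_tasks(trace_path):
    """Return the names of all tasks whose status is FAILED."""
    with open(trace_path, newline="") as handle:
        reader = csv.DictReader(handle, delimiter="\t")
        return [row["name"] for row in reader if row["status"] == "FAILED"]
```

Calling failed_tasks("pipeline_info/execution_trace.txt") would then list the processes to look at first in execution_report.html.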
