nf-core/spatialxe

10x-genomics, image-processing, spatial, spatial-data-analysis, spatial-transcriptomics, transcriptomics, xenium

This is the development version of the pipeline.

Launch development version https://github.com/nf-core/spatialxe

Define where the pipeline should find input data and save output data.

Path to a comma-separated file containing information about the Xenium experiment (e.g., meta,path-to-xenium-bundle,path-to-morphology.ome.tif).

required

type: string

pattern: ^\S+\.csv$
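As a rough illustration of the samplesheet constraint, the sketch below checks a path against the pattern above and builds a one-row CSV in memory. The header names (`meta`, `bundle`, `image`) and the example paths are placeholders chosen for illustration, not the pipeline's documented column names.

```python
import csv
import io
import re

# Pattern copied from the schema entry above: no whitespace, must end in .csv.
SAMPLESHEET_PATTERN = re.compile(r"^\S+\.csv$")


def is_valid_samplesheet_path(path: str) -> bool:
    """Return True if the path satisfies the schema pattern."""
    return SAMPLESHEET_PATTERN.match(path) is not None


def build_samplesheet(rows, header=("meta", "bundle", "image")) -> str:
    """Serialise rows to CSV text. The header names are hypothetical."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(header)
    writer.writerows(rows)
    return buffer.getvalue()


if __name__ == "__main__":
    print(is_valid_samplesheet_path("samplesheet.csv"))
    print(build_samplesheet([("s1", "/data/bundle", "/data/morphology.ome.tif")]))
```

Note that the pattern rejects any path containing whitespace, so a file named `my sheet.csv` would fail validation even though it is a CSV.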

The output directory where the results will be saved. Use absolute paths when writing to storage on cloud infrastructure.

required

type: string

Mode in which the pipeline is to be run. Either image-based segmentation, coordinate-based segmentation, segmentation-free analysis or data preview.

required

type: string

Segmentation method to run.

type: string

Path to gene panel JSON file to use for relabeling transcripts with the correct gene.

type: string

Path to the QuPath segmentation file in GeoJSON format.

type: string

Image alignment file containing the similarity transform matrix (e.g., the _imagealignment.csv file exported from Xenium Explorer).

type: string

Model to use for running or starting training.

type: string

StarDist pretrained model for cell segmentation (e.g., ‘2D_versatile_fluo’, ‘2D_versatile_he’).

type: string

default: 2D_versatile_fluo

StarDist pretrained model for nuclei segmentation.

type: string

default: 2D_versatile_fluo

StarDist object probability threshold. Lower values detect more objects.

type: number

StarDist non-maximum suppression threshold. Lower values reduce overlapping detections.

type: number

StarDist tiling for large images (e.g., ‘4 4’). Reduces memory usage.

type: string

default: 8 8
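The tiling value is a space-separated string such as `8 8`. A minimal sketch of turning it into a tuple of per-axis tile counts follows; the helper name `parse_n_tiles` is hypothetical, and how the pipeline actually forwards the value to StarDist is not stated here.

```python
from typing import Tuple


def parse_n_tiles(value: str) -> Tuple[int, ...]:
    """Parse a string like '8 8' into a tuple of positive tile counts."""
    tiles = tuple(int(part) for part in value.split())
    if not tiles:
        raise ValueError("tiling string is empty")
    if any(t < 1 for t in tiles):
        raise ValueError(f"tile counts must be >= 1, got {tiles}")
    return tiles


if __name__ == "__main__":
    print(parse_n_tiles("8 8"))
```

More tiles mean smaller chunks per prediction step, which is why a higher count lowers peak memory at the cost of extra overhead.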

Prior segmentation mask from other segmentation methods.

type: string

FASTA file of the probe sequences used in the Xenium experiment.

type: string

Path to the directory containing the genomic feature (.gff) and FASTA (.fa) files used as reference annotations.

type: string

Gene synonyms that may have been counted as off-targets but simply differ in name.

type: string

Email address for completion summary.

type: string

pattern: ^([a-zA-Z0-9_\-\.]+)@([a-zA-Z0-9_\-\.]+)\.([a-zA-Z]{2,5})$
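The email pattern is stricter than general email syntax. The sketch below applies it verbatim with Python's `re` module to show which addresses pass; the function name is hypothetical, and the pipeline's own validator may differ from Python's regex engine in edge cases.

```python
import re

# Pattern copied verbatim from the schema entry above.
EMAIL_PATTERN = re.compile(
    r"^([a-zA-Z0-9_\-\.]+)@([a-zA-Z0-9_\-\.]+)\.([a-zA-Z]{2,5})$"
)


def is_valid_email(address: str) -> bool:
    """Return True if the address satisfies the schema pattern."""
    return EMAIL_PATTERN.match(address) is not None


if __name__ == "__main__":
    for candidate in ("user@example.org", "user+tag@example.org", "user@example.museum"):
        print(candidate, is_valid_email(candidate))
```

Two consequences are worth knowing: a `+` in the local part is rejected, and so is any top-level domain longer than five letters.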

MultiQC report title. Printed as page header, used for filename if not otherwise specified.

type: string

Options for the segmentation layer of the spatialxe pipeline.

Whether to run the QC layer in the pipeline.

type: boolean

default: true

Whether to run the off-target probe tracking.

type: boolean

Whether to run refinement on the image-based segmentation methods. Runs coordinate-based methods after the initial image-based segmentation run.

type: boolean

Whether to relabel genes with gene_panel.json file. True when gene_panel is provided.

type: boolean

Whether to run vanilla xeniumranger workflow.

type: boolean

Whether to only run nucleus segmentation.

type: boolean

Whether to only run nucleus segmentation.

type: boolean

Nuclei boundary expansion distance in µm. Default: 5. Minimum: 0; maximum: 15 if either boundary-stain or interior-stain is enabled, or 100 for nucleus-expansion only.

type: integer

default: 5

Minimum intensity in photoelectrons (pe) used to filter nuclei. Default: 100. The appropriate range is 0 to the 99th percentile of the image stack or 1000, whichever is larger.

type: integer

default: 100

Specify the name of the interior stain to use or disable. Supported for cell segmentation staining workflow output bundles. Possible options are: “18S” (default) or “disable”

type: boolean

default: true

Specify the name of the boundary stain to use or disable. Supported for cell segmentation staining workflow output bundles. Possible options are: “ATP1A1/CD45/E-Cadherin” (default) or “disable”

type: boolean

default: true

Enable GPU acceleration (set automatically by the gpu profile).

type: boolean

AWS Batch queue for GPU tasks (e.g., Segger, ProSeg).

type: string

AWS Batch queue for Cellpose (single large GPU).

type: string

Pre-downscale morphology image to avoid Cellpose OOM on large images.

type: boolean

Whether to enhance the morphology.ome.tif file.

type: boolean

Device used for training (e.g., cuda for GPU, or cpu).

type: string

Method for KNN computation (e.g., cuda for GPU-based computation).

type: string

Number of data-loader workers for Segger.

type: integer

default: 4

Path to a pre-trained Segger model checkpoint.

type: string

Preset value for the ProSeg segmentation method.

type: string

default: xenium

List of image-based segmentation methods.

type: array

List of transcript-based segmentation methods.

type: array

List of segmentation-free methods.

type: array

Regex used to identify or match negative control samples in a dataset.

type: string

List of features to be passed to the FICTURE method (e.g., TP53,OCIAD1,BCAS3,SOX).

type: string

Whether to filter the transcripts.parquet file before running Baysor segmentation.

type: boolean

Baysor --scale parameter for non-tiled runs.

type: integer

default: 30

Path to Baysor config TOML file (optional).

type: string

Enable tiled Baysor segmentation (divide transcripts into patches, run Baysor per patch, stitch results).

type: boolean

default: true

Tile width in microns for Baysor tiling.

type: integer

default: 1200

Overlap between Baysor patches in microns.

type: integer

default: 200

Balance transcripts across tiles by merging sparse tiles.

type: boolean

default: true

Baysor --scale for tiled runs (larger to compensate for EM on smaller tiles).

type: integer

default: 39

Minimum molecules per cell (--min-molecules-per-cell) for tiled Baysor.

type: integer

default: 120

Post-stitch cell filtering threshold: minimum transcripts per cell.

type: integer

default: 50

Prior segmentation type for Baysor. ‘cells’ uses Xenium bundle cell_id column; ‘cellpose’ uses Cellpose mask as image prior.

type: string

Baysor prior-segmentation-confidence (0-1).

type: number

default: 0.2

Minimum Q-Score to pass filtering.

type: number

default: 20

Only keep transcripts whose x-coordinate is greater than the specified limit. If no limit is specified, the minimum defaults to 0.0.

type: number

Only keep transcripts whose x-coordinate is less than the specified limit. If no limit is specified, the default (24000.0) retains all transcripts, since a Xenium slide is <24000 microns in x and y.

type: number

Only keep transcripts whose y-coordinate is greater than the specified limit. If no limit is specified, the minimum defaults to 0.0.

type: number

Only keep transcripts whose y-coordinate is less than the specified limit. If no limit is specified, the default (24000.0) retains all transcripts, since a Xenium slide is <24000 microns in x and y.

type: number
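The Q-Score and coordinate limits above combine into a single per-transcript filter. The sketch below shows one way this could look, assuming strict comparisons for the coordinate bounds (following the "greater than" / "less than" wording) and an inclusive Q-Score threshold. The `Transcript` record and its field names are hypothetical stand-ins for rows of transcripts.parquet, and the pipeline's actual boundary handling may differ.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass
class Transcript:
    """Hypothetical minimal record for one transcript."""
    gene: str
    x: float   # x-coordinate in microns
    y: float   # y-coordinate in microns
    qv: float  # Q-Score


def filter_transcripts(
    transcripts: Iterable[Transcript],
    min_qv: float = 20.0,
    min_x: float = 0.0,
    max_x: float = 24000.0,
    min_y: float = 0.0,
    max_y: float = 24000.0,
) -> List[Transcript]:
    """Keep transcripts that pass the Q-Score and bounding-box limits."""
    return [
        t
        for t in transcripts
        if t.qv >= min_qv and min_x < t.x < max_x and min_y < t.y < max_y
    ]


if __name__ == "__main__":
    data = [
        Transcript("TP53", 100.0, 200.0, 35.0),
        Transcript("SOX2", 150.0, 250.0, 12.0),     # fails the Q-Score threshold
        Transcript("BCAS3", 25000.0, 300.0, 40.0),  # outside the x limit
    ]
    print([t.gene for t in filter_transcripts(data)])
```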

Enable tiled segmentation for large datasets. Divides transcripts into overlapping patches, runs segmentation in parallel per patch, then stitches results.

type: boolean

Grid layout for tiling (rows x cols), e.g. ‘3x3’, ‘4x4’.

type: string

default: 3x3

Overlap between adjacent patches in microns.

type: integer

default: 50
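To make the grid and overlap settings concrete, the sketch below parses a layout string such as `3x3` and computes overlapping patch bounds over a rectangular extent. It assumes each grid cell is widened by half the overlap on every side (clipped to the extent), so that adjacent patches share exactly `overlap` microns. That is one plausible reading, not necessarily how the pipeline places its patches, and the function names are hypothetical.

```python
from typing import List, Tuple

Patch = Tuple[float, float, float, float]  # (x0, y0, x1, y1) in microns


def parse_grid(grid: str) -> Tuple[int, int]:
    """Parse 'ROWSxCOLS' (e.g. '3x3') into (rows, cols)."""
    rows, cols = (int(part) for part in grid.lower().split("x"))
    if rows < 1 or cols < 1:
        raise ValueError(f"grid dimensions must be >= 1, got {grid!r}")
    return rows, cols


def make_patches(
    x_min: float, x_max: float, y_min: float, y_max: float,
    grid: str = "3x3", overlap: float = 50.0,
) -> List[Patch]:
    """Split the extent into a grid of patches that overlap their neighbours."""
    rows, cols = parse_grid(grid)
    width = (x_max - x_min) / cols
    height = (y_max - y_min) / rows
    half = overlap / 2.0
    patches = []
    for r in range(rows):
        for c in range(cols):
            x0 = max(x_min, x_min + c * width - half)
            x1 = min(x_max, x_min + (c + 1) * width + half)
            y0 = max(y_min, y_min + r * height - half)
            y1 = min(y_max, y_min + (r + 1) * height + half)
            patches.append((x0, y0, x1, y1))
    return patches


if __name__ == "__main__":
    for patch in make_patches(0.0, 3000.0, 0.0, 3000.0):
        print(patch)
```

Cells that straddle a patch boundary appear in both neighbouring patches, which is what the stitching step then has to reconcile.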

Post-stitch cell size filtering method. Options: ‘empirical’ (IQR-based), ‘distribution’ (z-score), ‘both’, or null to disable.

type: string

IQR multiplier for empirical cell size filtering during stitching.

type: number

default: 3

Z-score threshold for distribution-based cell size filtering during stitching.

type: number

default: 4
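The two size filters can be sketched as follows. This assumes the empirical method keeps sizes within `[Q1 - k*IQR, Q3 + k*IQR]` and the distribution method keeps sizes whose absolute z-score does not exceed the threshold; the exact definitions used during stitching are not given here, so treat both as illustrative. The function name is hypothetical.

```python
import statistics
from typing import List, Optional, Sequence


def size_filter_mask(
    sizes: Sequence[float],
    method: Optional[str] = "both",
    iqr_multiplier: float = 3.0,
    z_threshold: float = 4.0,
) -> List[bool]:
    """Return one keep/drop flag per cell size."""
    if method is None:  # filtering disabled
        return [True] * len(sizes)
    if method not in ("empirical", "distribution", "both"):
        raise ValueError(f"unknown method: {method!r}")

    keep = [True] * len(sizes)
    if method in ("empirical", "both"):
        q1, _, q3 = statistics.quantiles(sizes, n=4)
        iqr = q3 - q1
        low = q1 - iqr_multiplier * iqr
        high = q3 + iqr_multiplier * iqr
        keep = [flag and low <= s <= high for flag, s in zip(keep, sizes)]
    if method in ("distribution", "both"):
        mean = statistics.fmean(sizes)
        sd = statistics.pstdev(sizes)
        if sd > 0:  # all-equal sizes have no outliers
            keep = [
                flag and abs(s - mean) / sd <= z_threshold
                for flag, s in zip(keep, sizes)
            ]
    return keep


if __name__ == "__main__":
    areas = [100, 110, 105, 95, 102, 98, 104, 101, 99, 5000]
    print(size_filter_mask(areas, method="empirical"))
```

The IQR rule is robust to a single extreme value, whereas a z-score computed from the same data is inflated by that value, which is one reason to offer both.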

Number of tiles along the x axis for cell-type separability.

type: integer

default: 2

Number of tiles along the y axis for cell-type separability.

type: integer

default: 2

Width of the tiles in pixels

type: integer

default: 120

Height of the tiles in pixels

type: integer

default: 120

Number of samples to process per training batch

type: integer

default: 4

Number of devices (GPUs) to use during training

type: integer

default: 4

Number of training epochs

type: integer

default: 200

Number of samples to process per batch during prediction

type: integer

default: 1

Whether to use connected components for grouping transcripts without direct nucleus association

type: boolean

Process only one sample at a time from a multi-sample samplesheet.

type: boolean

Number of samples to process at a time from a multi-sample samplesheet. Takes effect only if buffered_samples is true.

type: integer

default: 1
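Conceptually, buffering splits the samplesheet into consecutive groups of the configured size. The sketch below shows that grouping on a plain list; the function name is hypothetical, and the pipeline's actual scheduling is handled by Nextflow rather than by code like this.

```python
from typing import List, Sequence, TypeVar

T = TypeVar("T")


def batch_samples(samples: Sequence[T], batch_size: int = 1) -> List[List[T]]:
    """Group samples into consecutive batches of at most batch_size items."""
    if batch_size < 1:
        raise ValueError("batch_size must be >= 1")
    return [
        list(samples[i:i + batch_size])
        for i in range(0, len(samples), batch_size)
    ]


if __name__ == "__main__":
    print(batch_samples(["s1", "s2", "s3"], batch_size=2))
```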

Restrict parallel execution of a process, e.g., prevent Cellpose cell and nuclei segmentation from running together when resources are limited.

type: boolean

Parameters used to describe centralised config profiles. These should not be edited.

Git commit id for Institutional configs.

hidden

type: string

default: master

Base directory for Institutional configs.

hidden

type: string

default: https://raw.githubusercontent.com/nf-core/configs/master

Institutional config name.

hidden

type: string

Institutional config description.

hidden

type: string

Institutional config contact information.

hidden

type: string

Institutional config URL link.

hidden

type: string

Base path / URL for data used in the test profiles.

type: string

default: https://raw.githubusercontent.com/nf-core/test-datasets/spatialxe

Less common options for the pipeline, typically set in a config file.

Display help text.

hidden

type: boolean

Display version and exit.

hidden

type: boolean

Method used to save pipeline results to output directory.

hidden

type: string

Email address for completion summary, only when pipeline fails.

hidden

type: string

pattern: ^([a-zA-Z0-9_\-\.]+)@([a-zA-Z0-9_\-\.]+)\.([a-zA-Z]{2,5})$

Send plain-text email instead of HTML.

hidden

type: boolean

File size limit when attaching MultiQC reports to summary emails.

hidden

type: string

default: 25.MB

pattern: ^\d+(\.\d+)?\.?\s*(K|M|G|T)?B$
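The size pattern accepts values such as `25.MB`, `1.5 GB`, or `10B`. The sketch below parses such strings into a byte count, assuming binary multiples (1 KB = 1024 bytes); that convention is an assumption of this example. Capture groups are added to the schema pattern only to extract the number and unit.

```python
import re

# Schema pattern from above, with capture groups added for number and unit.
SIZE_PATTERN = re.compile(r"^(\d+(?:\.\d+)?)\.?\s*(K|M|G|T)?B$")

# Assumed binary multipliers.
MULTIPLIERS = {None: 1, "K": 1024, "M": 1024**2, "G": 1024**3, "T": 1024**4}


def parse_size(text: str) -> int:
    """Convert a size string such as '25.MB' to a number of bytes."""
    match = SIZE_PATTERN.match(text)
    if match is None:
        raise ValueError(f"not a valid size: {text!r}")
    number, unit = match.groups()
    return int(float(number) * MULTIPLIERS[unit])


if __name__ == "__main__":
    print(parse_size("25.MB"))
```

A bare number without the trailing `B`, such as `25`, does not satisfy the pattern and raises `ValueError` here.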

Do not use coloured log outputs.

hidden

type: boolean

Incoming hook URL for the messaging service.

hidden

type: string

Custom config file to supply to MultiQC.

hidden

type: string

Custom logo file to supply to MultiQC. The file name must also be set in the MultiQC config file.

hidden

type: string

Custom MultiQC yaml file containing HTML including a methods description.

type: string

Whether to validate parameters against the schema at runtime.

hidden

type: boolean

default: true

Do not use coloured log outputs

hidden

type: boolean

Base URL or local path to location of pipeline test dataset files

hidden

type: string

default: https://raw.githubusercontent.com/nf-core/test-datasets/

Suffix to add to the trace report filename. Default is the date and time in the format yyyy-MM-dd_HH-mm-ss

hidden

type: string

Display the full detailed help message.

type: boolean

Display hidden parameters in the help message (only works when --help or --help_full are provided).

type: boolean

On this page

nf-core/spatialxe

Input/output options

Segmentation options

Institutional config options

Generic options