In this demo, we will rely on a dataset from Janesick et al. (2023), which includes same-section Visium and Xenium measurements on human breast cancer tissue.
32.2 Setup
We start out by retrieving these datasets from the OSF repository, and reading them into R as separate SpatialExperiment objects:
# also retrieve cell subpopulation labels
df <- read.csv(file.path(td, "annotation.csv"))
xen$anno <- df$Annotation[match(xen$cell_id, df$Barcode)]
We’ll also do some data wrangling to simplify spatial coordinate names, and use gene symbols (rather than Ensembl identifiers) as feature names for both datasets:
To align Xenium and Visium sections, we use the affine transformation matrix provided by 10x Genomics, which was obtained by registration of Xenium onto Visium in Python with the Fiji (10xGenomics 2023) Java plug-in.
An affine map is generally composed of a linear map (e.g., scaling and rotation) and a translation, and can be applied using basic matrix multiplication and vector addition, specifically:
\[\mathbf{y}=A\mathbf{x}+\mathbf{b}\]
where \(A\) denotes the linear map, \(\mathbf{b}\) the translation, and \(\mathbf{x}\) and \(\mathbf{y}\) correspond to original and transformed coordinates, respectively.
In R, this translates to the following operations:
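As a minimal sketch, assume a hypothetical 2x3 matrix `mtx` that stores \(A\) in its first two columns and \(\mathbf{b}\) in its third. The values below are made up for illustration and are not the matrix provided by 10x Genomics; `xy` stands in for a two-column matrix of spatial coordinates.

```r
# hypothetical affine matrix [A | b]; values are illustrative only
mtx <- matrix(c(
     0, 2, 10,
    -2, 0, 50),
    nrow=2, byrow=TRUE)
A <- mtx[, 1:2]  # linear part
b <- mtx[, 3]    # translation

# toy coordinates: one row per cell, columns x and y
xy <- cbind(x=c(0, 1, 3), y=c(0, 2, 4))

# y = A x + b for all points at once: A %*% t(xy) is 2 x n,
# and adding b recycles it down each column, i.e., once per point
xy_new <- t(A %*% t(xy) + b)
colnames(xy_new) <- colnames(xy)
xy_new
```

Transposing before and after the multiplication keeps the usual convention of one observation per row, while letting R's column-wise recycling add the translation to every point.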
Binning single-cell resolution spatial data into spots can be useful for checking correlations between technical replicates of the same technology, identifying artifacts across technologies, and checking cell density (number of cells per spot). In general, it is possible to bin at the transcript-level (sub-cellular) or cell-level.
To aggregate single cell-level data from Xenium at the spot level, we first carry out a fixed-radius neighborhood search using RANN (cf. Chapter 19) to identify, for every spot, cells whose centroid lies within a distance of \(\sim130\) px (Visium spot diameter of 55um, divided by 2, divided by 0.2125, the Xenium pixel size in um):
# do a fixed-radius search to get cell
# centroids that fall on a given spot
nns <- nn2(
    searchtype="radius",
    radius=55/2/0.2125, k=200,
    data=spatialCoords(xen),
    query=spatialCoords(vis))
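To make the radius arithmetic and the search itself concrete, here is a brute-force sketch in base R on toy coordinates. All values and the names `cells`, `spots`, and `hits` are hypothetical. It computes every pairwise distance, which is only sensible for small inputs; a tree-based search such as the one above is what scales to real data.

```r
# convert the spot radius from microns to Xenium pixels
spot_diam_um <- 55    # Visium spot diameter
px_size_um <- 0.2125  # Xenium pixel size
r <- spot_diam_um/2/px_size_um  # approx. 129.4 px

# toy centroids in pixel units (made-up values)
cells <- cbind(x=c(0, 100, 120, 500, 510), y=c(0, 0, 50, 500, 500))
spots <- cbind(x=c(0, 500, 1000), y=c(0, 500, 1000))

# pairwise Euclidean distances (spots in rows, cells in columns)
d <- sqrt(
    outer(spots[, "x"], cells[, "x"], "-")^2 +
    outer(spots[, "y"], cells[, "y"], "-")^2)

# for each spot, indices of cells within the radius
hits <- lapply(seq_len(nrow(d)), function(i) which(d[i, ] <= r))
n_cells <- lengths(hits)
n_cells
```

The third cell lies exactly 130 px from the first spot and is therefore excluded, which shows how close the rounded \(\sim130\) figure is to the actual cutoff.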
Let’s count and visualize the number of cells that overlap each spot:
# get cell indices and number of cells per spot
vis$n_cells <- rowSums((idx <- nns$nn.idx) > 0)
plotSpots(vis, annotate="n_cells")
Next, we can aggregate single cell-level Xenium data into pseudo-spots. In addition, we propagate the Visium data’s spatial coordinates, excluding spots without any overlapping cells:
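The aggregation step can be sketched in base R as follows. This uses a toy count matrix and a small index matrix laid out like `nns$nn.idx` (one row per spot, zeros where no further cell falls within the radius). All values are made up, and summing counts is an assumption about the aggregation function.

```r
# toy count matrix: 3 genes x 5 cells (made-up values)
counts <- matrix(
    c(1, 0, 2,
      0, 3, 1,
      4, 1, 0,
      2, 2, 2,
      0, 5, 1),
    nrow=3, dimnames=list(
        paste0("gene", 1:3),
        paste0("cell", 1:5)))

# neighbor indices, one row per spot; 0 = no cell
idx <- rbind(
    spot1=c(1, 2, 0),
    spot2=c(4, 5, 3),
    spot3=c(0, 0, 0))

# keep only spots with at least one overlapping cell
keep <- rowSums(idx > 0) > 0

# sum counts across the cells assigned to each retained spot
agg <- sapply(which(keep), function(i) {
    j <- idx[i, idx[i, ] > 0]
    rowSums(counts[, j, drop=FALSE])
})
agg
```

The result is a genes-by-spots matrix whose columns correspond to the retained Visium spots, so their spatial coordinates can be carried over by subsetting with `keep`.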
Because we’ve aligned the Xenium to the Visium data, we can also propagate the Visium data’s imgData (low resolution H&E staining) to the object containing pseudo-spot Xenium data:
Next, let’s compute some standard quality control metrics on both the Visium and pseudo-spot Xenium data. Besides dataset-specific metrics, we also specify the subset of genes that are shared between both datasets in order to obtain comparable metrics:
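To illustrate why the shared gene subset matters, the following base R sketch computes per-spot library size and number of detected genes on the intersection of features only. The matrices and the helper `qc()` are hypothetical; dedicated Bioconductor functions would normally be used on the actual objects.

```r
# toy count matrices with partially overlapping genes (made-up values)
vis_counts <- matrix(c(5, 0, 2, 1, 0, 3, 0, 4), nrow=4,
    dimnames=list(c("EPCAM", "KRT8", "ACTB", "GAPDH"), c("s1", "s2")))
xen_counts <- matrix(c(2, 1, 0, 0, 0, 6), nrow=3,
    dimnames=list(c("KRT8", "EPCAM", "CD3E"), c("s1", "s2")))

# genes measured in both datasets
shared <- intersect(rownames(vis_counts), rownames(xen_counts))

# hypothetical helper: per-spot total counts and detected genes,
# restricted to a gene set so that values are comparable
qc <- function(x, gs) {
    y <- x[gs, , drop=FALSE]
    data.frame(sum=colSums(y), detected=colSums(y > 0))
}
qc_vis <- qc(vis_counts, shared)
qc_xen <- qc(xen_counts, shared)
qc_vis; qc_xen
```

Without this restriction, Visium totals would include thousands of genes that the targeted Xenium panel cannot measure, making the two sets of metrics hard to compare.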
In order to perform joint spatial clustering of both modalities, we construct array coordinates for Xenium pseudo-spots that are offset from Visium spots:
# offset the spatial location for joint clustering
# of Visium and adjacent pseudo-spot Xenium data
obj$array_row <- c(ar <- vis[, cs]$array_row, 100+ar)
obj$array_col <- c(ac <- vis[, cs]$array_col, 100+ac)
Next, we run a standard pipeline to perform log-library size normalization (using scater), principal component analysis (PCA), harmony integration, and dimension reduction (UMAP). Notably, the feature selection step that typically precedes PCA is skipped here, as the Xenium experiment includes only a curated selection of \(\sim300\) targets by design.
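The first two steps can be sketched in base R on simulated counts, assuming library size factors scaled to a mean of 1, a log2 transform with a pseudocount of 1, and PCA on all features. This is a simplified stand-in for the package-based pipeline; Harmony integration and UMAP are omitted.

```r
set.seed(1)
# simulated counts: 20 genes x 30 spots
counts <- matrix(rpois(20*30, lambda=5), nrow=20)

# library size factors, scaled to have a mean of 1
sf <- colSums(counts)
sf <- sf/mean(sf)

# log2-transformed normalized counts with a pseudocount of 1
logcounts <- log2(t(t(counts)/sf)+1)

# PCA with spots as observations; all features are
# used, i.e., there is no prior feature selection
pca <- prcomp(t(logcounts), rank.=5)
dim(pca$x)
```

The matrix `pca$x` holds one row per spot and one column per principal component, which is the kind of low-dimensional representation that integration and clustering methods take as input.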
The spatialCluster() function from BayesSpace clusters the spots and adds the predicted cluster labels to the object. The BayesSpace authors recommend running at least 10,000 iterations (nrep=1e4); we use fewer iterations in this demo for the sake of runtime. Note that a random seed must be set with set.seed() for the results to be reproducible.
From here on out, both datasets could be analyzed together and/or independently, e.g., in order to identify cluster markers, annotate cell subpopulations etc.
Janesick, Amanda, Robert Shelansky, Andrew D. Gottscho, Florian Wagner, Stephen R. Williams, Morgane Rouault, Ghezal Beliakoff, et al. 2023. “High Resolution Mapping of the Tumor Microenvironment Using Integrated Single-Cell, Spatial and in Situ Analysis.” Nature Communications 14 (1): 8353.