Beyond this book

The analytical landscape of spatial omics is vast and continuously developing. Here, we outline additional analysis tasks and topics that have not been covered elsewhere.

Throughout this book, we have emphasized the reproducibility standards provided by the R/Bioconductor ecosystem. However, certain tasks – particularly those involving machine and deep learning or intensive image analysis – frequently leverage the strengths of existing Python infrastructure. We consider these ecosystems complementary, and hope to address omissions in the future.

Finally, we view this e-book as a living resource and a community effort. If you are a keen developer or researcher with expertise in these (or other) analytical tasks, we welcome contributions to expand these sections into full chapters, whether to provide deeper theory or to include code examples. Please refer to the contribution guidelines in Appendix A — Contributing.

Methodological foundations

OSTA is primarily intended as a practical guide to spatial omics analysis workflows using existing, maintained software tools. While we introduce key mathematical and statistical concepts needed to understand analysis choices, e.g. relating to spatial resolution, data representation, quality control, normalization, clustering, spatial statistics, and cross-platform integration, we do not attempt to provide detailed formal mathematical or statistical treatment of each method.

For readers interested in further details on mathematical and statistical foundations, e.g., for the purpose of new method development, we recommend consulting additional resources including primary method papers and more specialized resources on statistical modeling, spatial statistics, machine learning, and image analysis.

Further reading

An Introduction to Statistical Learning by G. James, D. Witten, T. Hastie, and R. Tibshirani (Springer (2013), ISBN: 978-0-387-84858-7) provides an introduction to statistical learning methods, including linear regression, classification, resampling methods, and unsupervised learning. It is freely available online with practical applications in R and, more recently, Python. For specialized resources relating to spatial statistics, see 32 Spatial statistics.
Modern Statistics for Modern Biology by S. Holmes and W. Huber (Cambridge University Press (2019), ISBN: 9781108705295) provides a practical introduction to the statistical thinking and computational methods needed to analyze modern biological data. Rather than developing theory from first principles, it equips readers with the concepts and tools they need to understand, perform, and interpret contemporary data analyses.
Deep Learning with Python by F. Chollet (Manning Publications (2021), ISBN: 9781617296864) provides a practical introduction to deep learning and generative AI, from core neural network concepts to modern architectures such as transformers and diffusion models. Through hands-on projects and code examples, it develops the intuition and skills needed to build and understand state-of-the-art AI systems.
Introduction to Data Science by R. Irizarry (Chapman & Hall/CRC (2024), ISBN: 9781032116556) consists of six parts: Summary Statistics, Probability, Statistical Inference, Linear Models, High-Dimensional Data, and Machine Learning. Foundational concepts are introduced through examples, followed by applied case studies that integrate these ideas. Chapters are designed for single-lecture delivery and include practical exercises in R; all data and the book’s source code are available online.

Integration

Reconciling molecular measurements across multiple tissue sections or diverse platforms is essential for atlas building and comparative studies. In R/Bioconductor, linear methods like harmony (Korsunsky et al. 2019) have been shown to perform well for scRNA-seq data (Luecken et al. 2022). In addition, the CellMixS (Lütge et al. 2021) package implements several metrics to evaluate batch effects and their correction.

In R/CRAN, seurat implements several options, including canonical correlation analysis (CCA) and harmony; note that parameter defaults have changed between major releases. The variational autoencoder-based Python tool scvi-tools is another popular choice. Many methods can also be adapted to multi-modal data, e.g., by combining low-dimensional embeddings across modalities.
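As a rough illustration of what combining low-dimensional embeddings across modalities can look like, the following sketch standardizes each modality's embedding and concatenates them per cell. It is a minimal toy example in plain Python, not the approach of any tool named above; the function names (`standardize_columns`, `combine_embeddings`) and the toy values are assumptions made for illustration.

```python
from statistics import mean, pstdev


def standardize_columns(embedding):
    """Scale every dimension of an embedding to zero mean and unit variance."""
    columns = list(zip(*embedding))
    scaled = []
    for column in columns:
        centre, spread = mean(column), pstdev(column)
        # Constant dimensions carry no information and are set to zero.
        scaled.append([(value - centre) / spread if spread else 0.0 for value in column])
    return [list(row) for row in zip(*scaled)]


def combine_embeddings(*embeddings):
    """Concatenate per-modality embeddings cell by cell after standardizing each one."""
    standardized = [standardize_columns(e) for e in embeddings]
    return [sum(rows, []) for rows in zip(*standardized)]


# Toy embeddings for the same four cells: two RNA dimensions, one protein dimension.
rna = [[0.0, 10.0], [1.0, 20.0], [2.0, 30.0], [3.0, 40.0]]
protein = [[100.0], [300.0], [200.0], [400.0]]

joint = combine_embeddings(rna, protein)
```

Standardizing first keeps a modality with large numeric values (here, the protein dimension) from dominating distances in the joint space; in practice, how modalities are weighted is a modeling choice.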

For ST data, BayesSpace (Zhao et al. 2021) integrates sections through joint spatial clustering. More recently, spatially aware integration frameworks such as PRECAST (W. Liu et al. 2023) have been developed to explicitly model spatial autocorrelation across slices, trading off batch correction and preservation of tissue architecture.

Further reading

Hu et al. (2024) present a multi-task benchmark that includes clustering, spatial alignment, and integration of ST data.
Luecken et al. (2022) systematically benchmark batch correction methods for scRNA-seq data; notably, the evaluation metrics presented there are also worth considering during day-to-day data analysis (not only benchmarking).
The OSCA chapter on correcting batch effects provides motivations and demonstrates the use of linear regression and MNN for correction; consecutive chapters cover diagnostics (removing technical vs. preserving biological variation) and downstream analyses (differential expression and abundance analysis).
The single-cell best practices Python book chapter on data integration provides more extensive theoretical background, including a formal categorization of methodologies (graph-based, deep learning, and more). The chapter also demonstrates popular Python tools, including scvi-tools’s scANVI (Xu et al. 2021).

Trajectory inference

Trajectory inference (TI) aims to reconstruct dynamic biological processes by ordering cells or spots along paths of minimal transcriptional change, inferring a continuous progression known as pseudotime.
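To make the notion of pseudotime concrete, the sketch below orders cells by their shortest-path (geodesic) distance from a chosen root cell on a k-nearest-neighbour graph built from a low-dimensional embedding. This is only one simple way of obtaining an ordering and is not the algorithm of any specific TI package; the function names, the choice of `k`, and the toy embedding are assumptions made for illustration.

```python
import heapq
import math


def knn_graph(points, k):
    """Build a symmetric k-nearest-neighbour graph with Euclidean edge weights."""
    n = len(points)
    graph = {i: {} for i in range(n)}
    for i in range(n):
        dists = sorted(
            (math.dist(points[i], points[j]), j) for j in range(n) if j != i
        )
        for d, j in dists[:k]:
            graph[i][j] = d
            graph[j][i] = d  # symmetrize so paths can be walked in both directions
    return graph


def pseudotime(points, root, k=2):
    """Shortest-path distance from the root cell, rescaled to the range [0, 1]."""
    graph = knn_graph(points, k)
    dist = {i: math.inf for i in graph}
    dist[root] = 0.0
    queue = [(0.0, root)]
    while queue:  # Dijkstra's algorithm
        d, i = heapq.heappop(queue)
        if d > dist[i]:
            continue
        for j, w in graph[i].items():
            if d + w < dist[j]:
                dist[j] = d + w
                heapq.heappush(queue, (d + w, j))
    finite = [d for d in dist.values() if math.isfinite(d)]
    longest = max(finite) or 1.0
    return [dist[i] / longest for i in range(len(points))]


# Toy embedding: five cells along a curved path in two dimensions.
embedding = [(0.0, 0.0), (1.0, 0.2), (2.0, 0.8), (3.0, 1.8), (4.0, 3.2)]
pt = pseudotime(embedding, root=0)
```

Cells that cannot be reached from the root keep an infinite value, which flags disconnected populations rather than forcing them onto the trajectory.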

Foundational R/Bioconductor methods like monocle (Trapnell et al. 2014; Qiu et al. 2017) and slingshot (Street et al. 2018) also remain popular for spatially resolved data. Alternatively, spatially aware DR (see 28 Dimensionality reduction) can be used to obtain smoothed embeddings for downstream TI.

By contrast, ST data offer a means to reconstruct pseudo-space-time, which reconciles transcriptional similarity with physical proximity. In Python, spatially aware frameworks include SpaceFlow (Ren et al. 2022), which uses spatially regularized graph networks to learn spatially coherent expression patterns for TI from ST data; stlearn (Pham et al. 2023), which penalizes transitions between physically distant points; and spaTrack (Shen et al. 2025), an optimal transport-based approach to identify plausible paths.
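The general idea of reconciling transcriptional similarity with physical proximity can be sketched as a combined cost that adds a penalty for spatial separation to an expression distance. The example below is a deliberately simplified toy and does not reproduce the formulation used by stlearn, SpaceFlow, or spaTrack; the function names and the `weight` parameter are hypothetical.

```python
import math


def spatially_penalized_distance(expr_a, expr_b, xy_a, xy_b, weight=0.5):
    """Expression distance plus a weighted penalty for physical separation.

    `weight` is a hypothetical tuning parameter: 0 ignores tissue coordinates,
    larger values increasingly discourage transitions between distant spots.
    """
    return math.dist(expr_a, expr_b) + weight * math.dist(xy_a, xy_b)


def nearest_neighbour(index, expr, xy, weight):
    """Return the index of the spot closest to `index` under the combined cost."""
    others = [j for j in range(len(expr)) if j != index]
    return min(
        others,
        key=lambda j: spatially_penalized_distance(
            expr[index], expr[j], xy[index], xy[j], weight
        ),
    )


# Toy data: spot 2 is transcriptionally closest to spot 0 but lies far away in the tissue.
expression = [(0.0, 0.0), (1.0, 0.0), (0.5, 0.0)]
coordinates = [(0.0, 0.0), (1.0, 0.0), (10.0, 0.0)]

without_space = nearest_neighbour(0, expression, coordinates, weight=0.0)
with_space = nearest_neighbour(0, expression, coordinates, weight=0.5)
```

With `weight=0.0` the transcriptionally most similar spot is selected regardless of location, whereas a positive weight favours the physically adjacent spot; such a cost could replace the plain Euclidean distance when building a neighbour graph for TI.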

Further reading

The OSCA book chapter on trajectory analysis demonstrates slingshot, how to identify genes that exhibit dynamic trends, and how to estimate trajectory roots.
The single-cell best practices Python book chapter on pseudotemporal ordering of scRNA-seq data uses diffusion maps for inference; subsequent chapters cover RNA velocity and lineage tracing.
Cannoodt et al. (2016) reviewed TI in its early days, yet their summary of key aspects of TI modeling, evaluation, and application remains useful.
Saelens et al. (2019) provide the most comprehensive benchmark of single-cell TI methods to date, including a wide variety of synthetic and real datasets, trajectory topologies, and both R and Python tools.

CNV inference

Copy number variation (CNV), or alteration (CNA), inference aims to computationally identify genomic segments that have been duplicated or deleted, based on transcriptomics data. This strategy is of particular interest in cancer, where it helps to identify malignant cells and, in spatially resolved data, to map subclonal architecture.

The Broad Institute’s inferCNV, an R package, is perhaps the most widely known tool for this task; while it has been discontinued, the Python package infercnvpy represents a replacement with improved scalability and scanpy interoperability. In R/Bioconductor, infercnv provides a reticulate-based interface.

Briefly, inferCNV’s approach is to compute, for every cell, the average gene expression over moving chromosomal windows, and to compare these averages to a set of ‘normal’ reference cells. Results are typically displayed as a heatmap with cells as rows and genomic regions as columns; the global loss/gain patterns can be compared with known driver mutations reported in the scientific literature.
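The moving-window idea described above can be sketched in a few lines. The example below is a simplified toy in plain Python and does not reproduce inferCNV's or infercnvpy's actual implementation (which involve additional normalization, much larger windows, and denoising steps); the function names, the window size, and the toy values are assumptions made for illustration.

```python
from statistics import mean


def smooth_along_chromosome(values, window=3):
    """Moving average over genes ordered by genomic position (window shrinks at the ends)."""
    half = window // 2
    return [
        mean(values[max(0, i - half): i + half + 1])
        for i in range(len(values))
    ]


def infer_cnv_profile(cell, reference_cells, window=3):
    """Smoothed expression of one cell relative to the mean of the reference cells."""
    n_genes = len(cell)
    reference = [mean(ref[g] for ref in reference_cells) for g in range(n_genes)]
    relative = [cell[g] - reference[g] for g in range(n_genes)]
    return smooth_along_chromosome(relative, window)


# Toy log-expression values for eight genes ordered along one chromosome.
normal_cells = [
    [1.0, 1.1, 0.9, 1.0, 1.0, 1.1, 0.9, 1.0],
    [1.0, 0.9, 1.1, 1.0, 1.0, 0.9, 1.1, 1.0],
]
# Hypothetical tumour cell with elevated expression across the last four genes (a putative gain).
tumour_cell = [1.0, 1.0, 1.0, 1.0, 2.0, 2.0, 2.0, 2.0]

profile = infer_cnv_profile(tumour_cell, normal_cells)
```

Values near zero suggest a copy-neutral region, while sustained positive or negative stretches hint at gains or losses; stacking such profiles across cells yields the cells-by-regions heatmap described above.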

Further reading

Erickson et al. (2022) have adapted scRNA-seq CNV inference to 10x Genomics Visium data, reconstructing CNV-based clonal evolution in prostate cancer.
Jensen et al. (2025) have adapted the approach to imaging-based ST data, demonstrating applicability to different platforms (Xenium, CosMx, etc.).
Schmid et al. (2025) benchmark CNV inference methods for scRNA-seq data.

Multi-modality

Paralleling past technological developments in single-cell omics, spatial multi-omics approaches are now emerging, including spatial co-profiling of RNA with proteins (e.g., spatial-CITE-seq (Y. Liu et al. 2023)) and with the epigenome (e.g., spatial-ATAC-RNA-seq (Zhang et al. 2023)).

Simultaneous capture is now also supported by commercial in situ platforms, such as 10x Genomics’ Xenium and Bruker’s CosMx, which enable co-detection of RNA and a curated set of protein targets. Microfluidic-based methods like DBiT-seq (Liu et al. 2020) and SPOTS (Ben-Chetrit et al. 2023) have further extended these capabilities to high-throughput sequencing.

Computationally, the challenge lies in the joint latent representation of disparate data types. In R/Bioconductor, MOFA2 (Velten et al. 2022) provides a factor analysis framework for multi-modal integration. The MultiAssayExperiment class offers foundational infrastructure to manage synchronicity between linked data layers. In Python, tools such as scvi-tools’s MultiVI (Ashuach et al. 2023) and SpatialGlue (Long et al. 2024) use deep learning (e.g., graph neural networks) to reconcile these layers while preserving spatial context.

Further reading

Vandereyken et al. (2023) review biotechnology for spatial multi-omics, and Liu et al. (2024) review multi-modal data integration. More recently, Isik et al. (2026) review the computational landscape and challenges around integrating multi-modal spatial omics and biological imaging data, from statistical to deep learning-based approaches.
Argelaguet et al. (2021) nicely summarize computational concepts of single-cell data integration, distinguishing between horizontal (same features), vertical (same observations), and diagonal (both different) tasks.
The single-cell best practices Python book provides two chapters on multi-omics, including integration of paired and unpaired datasets (same vs. different measurement entities).

Appendix

References

Argelaguet, Ricard, Anna S E Cuomo, Oliver Stegle, and John C Marioni. 2021. “Computational principles and challenges in single-cell data integration.” Nature Biotechnology 39: 1202–15. https://doi.org/10.1038/s41587-021-00895-7.

Ashuach, Tal, Mariano I Gabitto, Rohan V Koodli, Giuseppe-Antonio Saldi, Michael I Jordan, and Nir Yosef. 2023. “MultiVI: deep generative model for the integration of multimodal data.” Nature Methods 20 (8): 1222–31. https://doi.org/10.1038/s41592-023-01909-9.

Ben-Chetrit, Nir, Xiang Niu, Ariel D Swett, et al. 2023. “Integration of whole transcriptome spatial profiling with protein markers.” Nature Biotechnology 41 (6): 788–93. https://doi.org/10.1038/s41587-022-01536-3.

Cannoodt, Robrecht, Wouter Saelens, and Yvan Saeys. 2016. “Computational methods for trajectory inference from single-cell transcriptomics.” European Journal of Immunology 46 (11): 2496–506. https://doi.org/10.1002/eji.201646347.

Erickson, Andrew, Mengxiao He, Emelie Berglund, et al. 2022. “Spatially resolved clonal copy number alterations in benign and malignant tissue.” Nature 608 (7922): 360–67. https://doi.org/10.1038/s41586-022-05023-2.

Hu, Yunfei, Manfei Xie, Yikang Li, et al. 2024. “Benchmarking Clustering, Alignment, and Integration Methods for Spatial Transcriptomics.” Genome Biology 25 (212). https://doi.org/10.1186/s13059-024-03361-0.

Isik, Esra Busra, Yusuf Hakan Usta, Haozhe Liu, et al. 2026. “Multimodal spatial omics: From data acquisition to computational integration.” arXiv, ahead of print. https://doi.org/10.48550/arXiv.2601.12381.

Jensen, Augusta Elisabeth Vang, Helena Crowell, Anna Pascual Reguant, et al. 2025. “In situ inference of copy number variations in image-based spatial transcriptomics.” bioRxiv, 2025.07.02.662761. https://doi.org/10.1101/2025.07.02.662761.

Korsunsky, Ilya, Nghia Millard, Jean Fan, et al. 2019. “Fast, sensitive and accurate integration of single-cell data with Harmony.” Nature Methods 16 (12): 1289–96. https://doi.org/10.1038/s41592-019-0619-0.

Liu, Wei, Xu Liao, Ziye Luo, et al. 2023. “Probabilistic Embedding, Clustering, and Alignment for Integrating Spatial Transcriptomics Data with PRECAST.” Nature Communications 14 (296). https://doi.org/10.1038/s41467-023-35947-w.

Liu, Xiaojie, Ting Peng, Miaochun Xu, et al. 2024. “Spatial multi-omics: deciphering technological landscape of integration of multi-omics and its applications.” Journal of Hematology & Oncology 17 (1): 72. https://doi.org/10.1186/s13045-024-01596-9.

Liu, Yang, Marcello DiStasio, Graham Su, et al. 2023. “High-plex protein and whole transcriptome co-mapping at cellular resolution with spatial CITE-seq.” Nature Biotechnology 41 (10): 1405–9. https://doi.org/10.1038/s41587-023-01676-0.

Liu, Yang, Mingyu Yang, Yanxiang Deng, et al. 2020. “High-Spatial-Resolution Multi-Omics Sequencing via Deterministic Barcoding in Tissue.” Cell, ahead of print. https://doi.org/10.1016/j.cell.2020.10.026.

Long, Yahui, Kok Siong Ang, Raman Sethi, et al. 2024. “Deciphering spatial domains from spatial multi-omics with SpatialGlue.” Nature Methods 21 (9): 1658–67. https://doi.org/10.1038/s41592-024-02316-4.

Luecken, Malte D, M Büttner, K Chaichoompu, et al. 2022. “Benchmarking atlas-level data integration in single-cell genomics.” Nature Methods 19 (1): 41–50. https://doi.org/10.1038/s41592-021-01336-8.

Lütge, Almut, Joanna Zyprych-Walczak, Urszula Brykczynska Kunzmann, et al. 2021. “CellMixS: quantifying and visualizing batch effects in single-cell RNA-seq data.” Life Science Alliance 4 (6): e202001004. https://doi.org/10.26508/lsa.202001004.

Pham, Duy, Xiao Tan, Brad Balderson, et al. 2023. “Robust mapping of spatiotemporal trajectories and cell–cell interactions in healthy and diseased tissues.” Nature Communications 14 (1): 1–25. https://doi.org/10.1038/s41467-023-43120-6.

Qiu, Xiaojie, Qi Mao, Ying Tang, et al. 2017. “Reversed graph embedding resolves complex single-cell trajectories.” Nature Methods 14 (10): 979–82. https://doi.org/10.1038/nmeth.4402.

Ren, Honglei, Benjamin L Walker, Zixuan Cang, and Qing Nie. 2022. “Identifying multicellular spatiotemporal organization of cells with SpaceFlow.” Nature Communications 13 (1): 4076. https://doi.org/10.1038/s41467-022-31739-w.

Saelens, Wouter, Robrecht Cannoodt, Helena Todorov, and Yvan Saeys. 2019. “A comparison of single-cell trajectory inference methods.” Nature Biotechnology 37 (5): 547–54. https://doi.org/10.1038/s41587-019-0071-9.

Schmid, Katharina T, Aikaterini Symeonidi, Dmytro Hlushchenko, et al. 2025. “Benchmarking scRNA-seq copy number variation callers.” Nature Communications 16 (1): 8777. https://doi.org/10.1038/s41467-025-62359-9.

Shen, Xunan, Lulu Zuo, Zhongfei Ye, et al. 2025. “Inferring cell trajectories of spatial transcriptomics via optimal transport analysis.” Cell Systems 16 (2): 101194. https://doi.org/10.1016/j.cels.2025.101194.

Street, Kelly, Davide Risso, Russell B Fletcher, et al. 2018. “Slingshot: cell lineage and pseudotime inference for single-cell transcriptomics.” BMC Genomics 19 (1): 477. https://doi.org/10.1186/s12864-018-4772-0.

Trapnell, Cole, Davide Cacchiarelli, Jonna Grimsby, et al. 2014. “The dynamics and regulators of cell fate decisions are revealed by pseudotemporal ordering of single cells.” Nature Biotechnology 32 (4): 381–86. https://doi.org/10.1038/nbt.2859.

Vandereyken, Katy, Alejandro Sifrim, Bernard Thienpont, and Thierry Voet. 2023. “Methods and applications for single-cell and spatial multi-omics.” Nature Reviews Genetics 24 (8): 494–515. https://doi.org/10.1038/s41576-023-00580-2.

Velten, Britta, Jana M Braunger, Ricard Argelaguet, et al. 2022. “Identifying temporal and spatial patterns of variation from multimodal data using MEFISTO.” Nature Methods, 1–8. https://doi.org/10.1038/s41592-021-01343-9.

Xu, Chenling, Romain Lopez, Edouard Mehlman, Jeffrey Regier, Michael I Jordan, and Nir Yosef. 2021. “Probabilistic harmonization and annotation of single-cell transcriptomics data with deep generative models.” Molecular Systems Biology 17 (1): e9620. https://doi.org/10.15252/msb.20209620.

Zhang, Di, Yanxiang Deng, Petra Kukanja, et al. 2023. “Spatial epigenome–transcriptome co-profiling of mammalian tissues.” Nature 616: 113–22. https://doi.org/10.1038/s41586-023-05795-1.

Zhao, Edward, Matthew R. Stone, Xing Ren, et al. 2021. “Spatial Transcriptomics at Subspot Resolution with BayesSpace.” Nature Biotechnology 39: 1375–84. https://doi.org/10.1038/s41587-021-00935-2.