Beyond this book

The analytical landscape of spatial omics is vast and continuously developing. Here, we outline additional analysis tasks and topics that have not been covered elsewhere.

Throughout this book, we have emphasized the reproducibility standards provided by the R/Bioconductor ecosystem. However, certain tasks – particularly those involving machine and deep learning or intensive image analysis – frequently leverage the strengths of existing Python infrastructure. We consider these ecosystems complementary, and hope to address omissions in the future.

Finally, we view this e-book as a living resource and a community effort. If you are a keen developer or researcher with expertise in these (or other) analytical tasks, we welcome contributions to expand these sections into full chapters, whether to provide deeper theory or to include code examples. Please refer to our contribution guidelines.

Integration

Reconciling molecular measurements across multiple tissue sections or diverse platforms is essential for atlas building and comparative studies. In R/Bioconductor, linear methods like harmony (Korsunsky et al. 2019) have been shown to perform well for scRNA-seq data (Luecken et al. 2022). In addition, the CellMixS (Lütge et al. 2021) package implements several metrics to evaluate batch effects and their correction.

In R/CRAN, seurat implements different options, including canonical correlation analysis (CCA) and harmony (note that parameter defaults have changed between major releases). The variational autoencoder-based Python tool scvi-tools is another popular choice. Many of these methods can also be adapted to multi-modal data, e.g., by combining low-dimensional embeddings across modalities.
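As a rough illustration of what combining low-dimensional embeddings across modalities can look like, the sketch below standardizes each modality's embedding and concatenates them per cell. It is a toy example under stated assumptions, not the procedure of any package named above: the function names (`standardize_columns`, `combine_embeddings`), the per-modality scaling by the square root of the number of dimensions, and the toy values are all hypothetical choices made for illustration.

```python
import math

def standardize_columns(emb):
    """Center each column and scale it to unit (population) variance."""
    n, d = len(emb), len(emb[0])
    out = [[0.0] * d for _ in range(n)]
    for j in range(d):
        col = [row[j] for row in emb]
        mean = sum(col) / n
        sd = math.sqrt(sum((x - mean) ** 2 for x in col) / n)
        for i in range(n):
            out[i][j] = (col[i] - mean) / sd if sd > 0 else 0.0
    return out

def combine_embeddings(embeddings, weights=None):
    """Concatenate per-modality embeddings (same cells, same order) column-wise."""
    n = len(embeddings[0])
    if any(len(e) != n for e in embeddings):
        raise ValueError("all modalities must describe the same cells")
    weights = weights or [1.0] * len(embeddings)
    joint = [[] for _ in range(n)]
    for emb, w in zip(embeddings, weights):
        # Assumption: divide by sqrt(#dimensions) so that a modality with
        # more components does not dominate downstream distances.
        scale = w / math.sqrt(len(emb[0]))
        for i, row in enumerate(standardize_columns(emb)):
            joint[i].extend(scale * x for x in row)
    return joint

# Toy data: 4 cells, 3 RNA components and 2 protein components (made-up values).
rna_pcs = [[1.0, 0.5, 0.1], [0.8, 0.4, 0.0], [-1.2, 0.1, 0.3], [-0.9, -0.6, -0.2]]
protein_pcs = [[2.0, 1.0], [1.5, 0.5], [-2.0, -1.0], [-1.0, -0.5]]
joint = combine_embeddings([rna_pcs, protein_pcs])
print(len(joint), len(joint[0]))
```

The joint matrix could then be passed to any embedding-based integration or clustering step; the `weights` argument is one simple way to favor a modality that is considered more reliable.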

For ST data, BayesSpace (Zhao et al. 2021) integrates sections through joint spatial clustering. More recently, spatially aware integration frameworks such as PRECAST (W. Liu et al. 2023) have been developed to explicitly model spatial autocorrelation across slices, balancing batch correction against preservation of tissue architecture.

Further reading
  • Hu et al. (2024) present a multi-task benchmark that includes clustering, spatial alignment, and integration of ST data.

  • Luecken et al. (2022) have systematically benchmarked batch correction methods for scRNA-seq data; notably, the evaluation metrics presented there are also worth considering during day-to-day data analysis (not only benchmarking).

  • The OSCA chapter on correcting batch effects provides motivations and demonstrates the use of linear regression and MNN for correction; consecutive chapters cover diagnostics (removing technical vs. preserving biological variation) and downstream analyses (differential expression and abundance analysis).

  • The single-cell best practices Python book chapter on data integration provides more extensive theoretical background, including a formal categorization of methodologies (graph-based, deep learning, and more). The chapter also demonstrates popular Python tools, including scvi-tools’s scANVI (Xu et al. 2021).

Trajectory inference

Trajectory inference (TI) aims to reconstruct dynamic biological processes by ordering cells or spots along paths of minimal transcriptional change, inferring a continuous progression known as pseudotime.

Foundational R/Bioconductor methods like monocle (Trapnell et al. 2014; Qiu et al. 2017) and slingshot (Street et al. 2018) remain popular for spatially resolved data as well. Alternatively, spatially aware DR (see 28 Dimensionality reduction) can be used to obtain smoothed embeddings for downstream TI.

By contrast, ST data offer a means to reconstruct pseudo-space-time, which reconciles transcriptional similarity with physical proximity. In Python, spatially aware frameworks include SpaceFlow (Ren et al. 2022), which uses spatially regularized graph networks to learn spatially coherent expression patterns for TI from ST data; stlearn (Pham et al. 2023), which penalizes transitions between physically distant points; and spaTrack (Shen et al. 2025), an optimal transport-based approach to identify plausible paths.
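To make the idea of pseudo-space-time more concrete, the following toy sketch computes pseudotime as the shortest-path cost from a root cell on a k-nearest-neighbor graph, where each edge cost is the expression distance plus a penalty on physical distance. This is not how SpaceFlow, stlearn, or spaTrack are implemented; the function names, the `spatial_weight` parameter, and the data are hypothetical and serve only to show how a spatial penalty can change the inferred ordering.

```python
import heapq
import math

def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def knn_graph(expr, coords, k=2, spatial_weight=0.0):
    """Symmetric kNN graph; edge cost = expression distance
    + spatial_weight * physical distance (hypothetical penalty)."""
    n = len(expr)
    graph = {i: {} for i in range(n)}
    for i in range(n):
        costs = []
        for j in range(n):
            if j != i:
                cost = (euclidean(expr[i], expr[j])
                        + spatial_weight * euclidean(coords[i], coords[j]))
                costs.append((cost, j))
        for cost, j in sorted(costs)[:k]:
            graph[i][j] = cost
            graph[j][i] = cost
    return graph

def pseudotime(graph, root):
    """Dijkstra shortest-path cost from the root; unreachable cells stay at infinity."""
    dist = {node: math.inf for node in graph}
    dist[root] = 0.0
    heap = [(0.0, root)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist[u]:
            continue  # stale heap entry
        for v, w in graph[u].items():
            if d + w < dist[v]:
                dist[v] = d + w
                heapq.heappush(heap, (d + w, v))
    return dist

def ordering(pt):
    return sorted(pt, key=lambda cell: pt[cell])

# Toy data: cells 0-3 progress along one expression axis and sit side by side;
# cell 4 resembles cell 1 transcriptionally but lies far away in the tissue.
expr = [[0.0], [1.0], [2.0], [3.0], [1.1]]
coords = [[0.0, 0.0], [1.0, 0.0], [2.0, 0.0], [3.0, 0.0], [10.0, 0.0]]

order_expr_only = ordering(pseudotime(knn_graph(expr, coords), root=0))
order_spatial = ordering(pseudotime(knn_graph(expr, coords, spatial_weight=1.0), root=0))
print(order_expr_only)  # expression only
print(order_spatial)    # with spatial penalty
```

Without the penalty, the distant cell is placed early in the ordering because it looks like cell 1; with the penalty, it moves to the end because reaching it requires a long physical jump.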

Further reading
  • The OSCA book chapter on trajectory analysis demonstrates slingshot, how to identify genes that exhibit dynamic trends, and how to estimate trajectory roots.

  • The single-cell best practices Python book chapter on pseudotemporal ordering of scRNA-seq data uses diffusion maps for inference; subsequent chapters cover RNA velocity and lineage tracing.

  • Cannoodt et al. (2016) review TI in its early days, yet nicely summarize key aspects of TI modeling, evaluation, and application.

  • Saelens et al. (2019) provide the most comprehensive benchmark of single-cell TI methods to date, including a wide variety of synthetic and real datasets, trajectory topologies, and both R and Python tools.

CNV inference

Copy number variation (CNV), or copy number alteration (CNA), inference aims to computationally identify, from transcriptomics data, genomic segments that have been duplicated or deleted. This strategy is particularly interesting in cancer, where it helps to identify malignant cells and, in spatially resolved data, to map subclonal architecture.

The Broad Institute’s inferCNV is perhaps the most widely known tool for this task and is available in R/Bioconductor as infercnv. While it has been discontinued, the Python package infercnvpy represents a replacement with improved scalability and scanpy interoperability.

Briefly, inferCNV’s approach is to compute, for every cell, the average gene expression over moving chromosomal windows, and to compare these averages to a set of ‘normal’ reference cells. Results are typically displayed as a heatmap where rows = cells and columns = genomic regions; the global loss/gain patterns can be compared with known driver mutations from the scientific literature.
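The core idea described above can be sketched in a few lines. The example below is a deliberately simplified illustration, not inferCNV's or infercnvpy's actual implementation (both involve further normalization, denoising, and modeling steps): it handles a single chromosome, assumes genes are already sorted by genomic position, and uses hypothetical function names and made-up expression values.

```python
def moving_average(values, window):
    """Mean over a centered window of genes, truncated at the chromosome ends."""
    half = window // 2
    out = []
    for i in range(len(values)):
        lo, hi = max(0, i - half), min(len(values), i + half + 1)
        out.append(sum(values[lo:hi]) / (hi - lo))
    return out

def cnv_scores(expr, reference_cells, window=3):
    """expr maps cell -> log-expression values for genes on one chromosome,
    sorted by genomic position. Returns smoothed expression relative to the
    mean smoothed profile of the reference ('normal') cells."""
    smoothed = {cell: moving_average(vals, window) for cell, vals in expr.items()}
    n_genes = len(next(iter(smoothed.values())))
    baseline = [
        sum(smoothed[c][g] for c in reference_cells) / len(reference_cells)
        for g in range(n_genes)
    ]
    return {
        cell: [s - b for s, b in zip(vals, baseline)]
        for cell, vals in smoothed.items()
    }

# Toy data: two reference cells and one cell with elevated expression
# over the last three genes (mimicking a gain).
expr = {
    "normal_1": [1.0, 1.0, 1.0, 1.0, 1.0, 1.0],
    "normal_2": [1.0, 1.0, 1.0, 1.0, 1.0, 1.0],
    "tumor": [1.0, 1.0, 1.0, 2.0, 2.0, 2.0],
}
scores = cnv_scores(expr, reference_cells=["normal_1", "normal_2"])
for cell, vals in scores.items():
    print(cell, [round(v, 2) for v in vals])
```

Positive scores over a contiguous stretch of genes suggest a gain and negative scores a loss; stacking the per-cell score vectors as rows yields the kind of heatmap described above.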

Further reading
  • Erickson et al. (2022) have adapted scRNA-seq CNV inference to 10x Genomics Visium data, reconstructing CNV-based clonal evolution in prostate cancer.

  • Jensen et al. (2025) have adapted the approach to imaging-based ST data, demonstrating applicability to different platforms (Xenium, CosMx, etc.).

  • Schmid et al. (2025) benchmark CNV inference methods for scRNA-seq data.

Foundation models

Foundation models (FMs) represent a modern paradigm where massive deep learning architectures (e.g., transformers) are pre-trained on millions of single-cell profiles, biological images, etc., to learn generalizable representations. All of these models are developed and implemented in Python, although R infrastructure to interface with pre-trained models (i.e., to make use of model weights and embeddings for downstream analysis) is underway.

Examples (mostly) mentioned across different chapters include Prov-GigaPath (Xu et al. 2024) (histopathology; 33  Image analysis); Geneformer (Theodoris et al. 2023), scGPT (Cui et al. 2024), scFoundation (Hao et al. 2024), and Novae (Blampey et al. 2025) (omics; 29  Clustering & annotation). Not an FM but worth mentioning: CellPLM (Wen et al. 2023) is a pre-trained “cell language model” with cells = tokens and tissues = sentences (an idea inspired by large language models); the model supports several downstream tasks such as denoising and annotation of scRNA-seq data, imputation of ST data, and perturbation prediction.

Another popular task not mentioned in previous chapters is the prediction of molecular changes upon perturbation (e.g., disease, treatment response); however, deep learning-based approaches for this task have been shown to not (yet) outperform simple linear baselines (Ahlmann-Eltze et al. 2025). Outperformance by simpler methods has also been demonstrated by Kedzierska et al. (2025) in the context of cell type annotation.

Further reading
  • Szałata et al. (2024) describe, review, and discuss future directions of transformers, a popular architecture for FMs, in single-cell omics.

  • Ahlmann-Eltze et al. (2025) demonstrate that, for predicting gene perturbation effects, deep learning-based approaches do not yet outperform simple linear baselines.

  • Ahlmann-Eltze et al. (2026) review tasks and challenges around representation learning of scRNA-seq data, including transformer-based FMs, autoencoders, and more.

Multi-modality

Paralleling past technological developments around single-cell omics, spatial multi-omics approaches are now emerging, including spatial co-profiling of RNA with proteins (e.g., spatial-CITE-seq (Y. Liu et al. 2023)) and with the epigenome (e.g., spatial-ATAC-RNA-seq (Zhang et al. 2023)).

Simultaneous capture is now also supported by commercial in situ platforms, such as 10x Genomics’ Xenium and Bruker’s CosMx, which enable co-detection of RNA and a curated set of protein targets. Microfluidic-based methods like DBiT-seq (Liu et al. 2020) and SPOTS (Ben-Chetrit et al. 2023) have further expanded these capabilities to high-throughput sequencing.

Computationally, the challenge lies in the joint latent representation of disparate data types. In R/Bioconductor, MOFA2 (Velten et al. 2022) provides a factor analysis framework for multi-modal integration. The MultiAssayExperiment class offers foundational infrastructure to manage synchronicity between linked data layers. In Python, tools such as scvi-tools’s MultiVI (Ashuach et al. 2023) and SpatialGlue (Long et al. 2024) use deep learning (e.g., graph neural networks) to reconcile these layers while preserving spatial context.

Further reading
  • Vandereyken et al. (2023) review biotechnology for spatial multi-omics, and Liu et al. (2024) review multi-modal data integration. More recently, Isik et al. (2026) review the computational landscape and challenges around integrating multi-modal spatial omics and biological imaging data, from statistical to deep learning-based approaches.

  • Argelaguet et al. (2021) nicely summarize computational concepts of single-cell data integration, distinguishing between horizontal (same features), vertical (same observations), and diagonal (both different) tasks.

  • The single-cell best practices Python book provides two chapters on multi-omics, including integration of paired and unpaired datasets (same vs. different measurement entities).

Appendix

References

Ahlmann-Eltze, Constantin, Florian Barkmann, Jan Lause, Valentina Boeva, and Dmitry Kobak. 2026. “Representation learning of single-cell RNA-seq data.” RNA (New York, N.Y.), ahead of print. https://doi.org/10.1261/rna.080889.125.
Ahlmann-Eltze, Constantin, Wolfgang Huber, and Simon Anders. 2025. “Deep-learning-based gene perturbation effect prediction does not yet outperform simple linear baselines.” Nature Methods 22 (8): 1657–61. https://doi.org/10.1038/s41592-025-02772-6.
Argelaguet, Ricard, Anna S E Cuomo, Oliver Stegle, and John C Marioni. 2021. “Computational principles and challenges in single-cell data integration.” Nature Biotechnology 39: 1202–15. https://doi.org/10.1038/s41587-021-00895-7.
Ashuach, Tal, Mariano I Gabitto, Rohan V Koodli, Giuseppe-Antonio Saldi, Michael I Jordan, and Nir Yosef. 2023. “MultiVI: deep generative model for the integration of multimodal data.” Nature Methods 20 (8): 1222–31. https://doi.org/10.1038/s41592-023-01909-9.
Ben-Chetrit, Nir, Xiang Niu, Ariel D Swett, et al. 2023. “Integration of whole transcriptome spatial profiling with protein markers.” Nature Biotechnology 41 (6): 788–93. https://doi.org/10.1038/s41587-022-01536-3.
Blampey, Quentin, Hakim Benkirane, Nadège Bercovici, et al. 2025. “Novae: a graph-based foundation model for spatial transcriptomics data.” Nature Methods 22 (12): 2539–50. https://doi.org/10.1038/s41592-025-02899-6.
Cannoodt, Robrecht, Wouter Saelens, and Yvan Saeys. 2016. “Computational methods for trajectory inference from single-cell transcriptomics.” European Journal of Immunology 46 (11): 2496–506. https://doi.org/10.1002/eji.201646347.
Cui, Haotian, Chloe Wang, Hassaan Maan, et al. 2024. “scGPT: toward building a foundation model for single-cell multi-omics using generative AI.” Nature Methods 21 (8): 1470–80. https://doi.org/10.1038/s41592-024-02201-0.
Erickson, Andrew, Mengxiao He, Emelie Berglund, et al. 2022. “Spatially resolved clonal copy number alterations in benign and malignant tissue.” Nature 608 (7922): 360–67. https://doi.org/10.1038/s41586-022-05023-2.
Hao, Minsheng, Jing Gong, Xin Zeng, et al. 2024. “Large-scale foundation model on single-cell transcriptomics.” Nature Methods 21 (8): 1481–91. https://doi.org/10.1038/s41592-024-02305-7.
Hu, Yunfei, Manfei Xie, Yikang Li, et al. 2024. “Benchmarking Clustering, Alignment, and Integration Methods for Spatial Transcriptomics.” Genome Biology 25 (212). https://doi.org/10.1186/s13059-024-03361-0.
Isik, Esra Busra, Yusuf Hakan Usta, Haozhe Liu, et al. 2026. “Multimodal spatial omics: From data acquisition to computational integration.” arXiv, ahead of print. https://doi.org/10.48550/arXiv.2601.12381.
Jensen, Augusta Elisabeth Vang, Helena Crowell, Anna Pascual Reguant, et al. 2025. “In situ inference of copy number variations in image-based spatial transcriptomics.” bioRxiv, 2025.07.02.662761. https://doi.org/10.1101/2025.07.02.662761.
Kedzierska, Kasia Z, Lorin Crawford, Ava P Amini, and Alex X Lu. 2025. “Zero-shot evaluation reveals limitations of single-cell foundation models.” Genome Biology 26 (1): 101. https://doi.org/10.1186/s13059-025-03574-x.
Korsunsky, Ilya, Nghia Millard, Jean Fan, et al. 2019. “Fast, sensitive and accurate integration of single-cell data with Harmony.” Nature Methods 16 (12): 1289–96. https://doi.org/10.1038/s41592-019-0619-0.
Liu, Wei, Xu Liao, Ziye Luo, et al. 2023. “Probabilistic Embedding, Clustering, and Alignment for Integrating Spatial Transcriptomics Data with PRECAST.” Nature Communications 14 (296). https://doi.org/10.1038/s41467-023-35947-w.
Liu, Xiaojie, Ting Peng, Miaochun Xu, et al. 2024. “Spatial multi-omics: deciphering technological landscape of integration of multi-omics and its applications.” Journal of Hematology & Oncology 17 (1): 72. https://doi.org/10.1186/s13045-024-01596-9.
Liu, Yang, Marcello DiStasio, Graham Su, et al. 2023. “High-plex protein and whole transcriptome co-mapping at cellular resolution with spatial CITE-seq.” Nature Biotechnology 41 (10): 1405–9. https://doi.org/10.1038/s41587-023-01676-0.
Liu, Yang, Mingyu Yang, Yanxiang Deng, et al. 2020. “High-Spatial-Resolution Multi-Omics Sequencing via Deterministic Barcoding in Tissue.” Cell, ahead of print. https://doi.org/10.1016/j.cell.2020.10.026.
Long, Yahui, Kok Siong Ang, Raman Sethi, et al. 2024. “Deciphering spatial domains from spatial multi-omics with SpatialGlue.” Nature Methods 21 (9): 1658–67. https://doi.org/10.1038/s41592-024-02316-4.
Luecken, Malte D, M Büttner, K Chaichoompu, et al. 2022. “Benchmarking atlas-level data integration in single-cell genomics.” Nature Methods 19 (1): 41–50. https://doi.org/10.1038/s41592-021-01336-8.
Lütge, Almut, Joanna Zyprych-Walczak, Urszula Brykczynska Kunzmann, et al. 2021. “CellMixS: quantifying and visualizing batch effects in single-cell RNA-seq data.” Life Science Alliance 4 (6): e202001004. https://doi.org/10.26508/lsa.202001004.
Pham, Duy, Xiao Tan, Brad Balderson, et al. 2023. “Robust mapping of spatiotemporal trajectories and cell–cell interactions in healthy and diseased tissues.” Nature Communications 14 (1): 1–25. https://doi.org/10.1038/s41467-023-43120-6.
Qiu, Xiaojie, Qi Mao, Ying Tang, et al. 2017. “Reversed graph embedding resolves complex single-cell trajectories.” Nature Methods 14 (10): 979–82. https://doi.org/10.1038/nmeth.4402.
Ren, Honglei, Benjamin L Walker, Zixuan Cang, and Qing Nie. 2022. “Identifying multicellular spatiotemporal organization of cells with SpaceFlow.” Nature Communications 13 (1): 4076. https://doi.org/10.1038/s41467-022-31739-w.
Saelens, Wouter, Robrecht Cannoodt, Helena Todorov, and Yvan Saeys. 2019. “A comparison of single-cell trajectory inference methods.” Nature Biotechnology 37 (5): 547–54. https://doi.org/10.1038/s41587-019-0071-9.
Schmid, Katharina T, Aikaterini Symeonidi, Dmytro Hlushchenko, et al. 2025. “Benchmarking scRNA-seq copy number variation callers.” Nature Communications 16 (1): 8777. https://doi.org/10.1038/s41467-025-62359-9.
Shen, Xunan, Lulu Zuo, Zhongfei Ye, et al. 2025. “Inferring cell trajectories of spatial transcriptomics via optimal transport analysis.” Cell Systems 16 (2): 101194. https://doi.org/10.1016/j.cels.2025.101194.
Street, Kelly, Davide Risso, Russell B Fletcher, et al. 2018. “Slingshot: cell lineage and pseudotime inference for single-cell transcriptomics.” BMC Genomics 19 (1): 477. https://doi.org/10.1186/s12864-018-4772-0.
Szałata, Artur, Karin Hrovatin, Sören Becker, et al. 2024. “Transformers in single-cell omics: a review and new perspectives.” Nature Methods 21 (8): 1430–43. https://doi.org/10.1038/s41592-024-02353-z.
Theodoris, Christina V, Ling Xiao, Anant Chopra, et al. 2023. “Transfer learning enables predictions in network biology.” Nature 618 (7965): 616–24. https://doi.org/10.1038/s41586-023-06139-9.
Trapnell, Cole, Davide Cacchiarelli, Jonna Grimsby, et al. 2014. “The dynamics and regulators of cell fate decisions are revealed by pseudotemporal ordering of single cells.” Nature Biotechnology 32 (4): 381–86. https://doi.org/10.1038/nbt.2859.
Vandereyken, Katy, Alejandro Sifrim, Bernard Thienpont, and Thierry Voet. 2023. “Methods and applications for single-cell and spatial multi-omics.” Nature Reviews Genetics 24 (8): 494–515. https://doi.org/10.1038/s41576-023-00580-2.
Velten, Britta, Jana M Braunger, Ricard Argelaguet, et al. 2022. “Identifying temporal and spatial patterns of variation from multimodal data using MEFISTO.” Nature Methods, 1–8. https://doi.org/10.1038/s41592-021-01343-9.
Wen, Hongzhi, Wenzhuo Tang, Xinnan Dai, et al. 2023. “CellPLM: Pre-training of cell language model beyond single cells.” bioRxiv, 2023.10.03.560734. https://doi.org/10.1101/2023.10.03.560734.
Xu, Chenling, Romain Lopez, Edouard Mehlman, Jeffrey Regier, Michael I Jordan, and Nir Yosef. 2021. “Probabilistic harmonization and annotation of single-cell transcriptomics data with deep generative models.” Molecular Systems Biology 17 (1): e9620. https://doi.org/10.15252/msb.20209620.
Xu, Hanwen, Naoto Usuyama, Jaspreet Bagga, et al. 2024. “A Whole-Slide Foundation Model for Digital Pathology from Real-World Data.” Nature 630 (8015): 181–88. https://doi.org/10.1038/s41586-024-07441-w.
Zhang, Di, Yanxiang Deng, Petra Kukanja, et al. 2023. “Spatial epigenome–transcriptome co-profiling of mammalian tissues.” Nature 616: 113–22. https://doi.org/10.1038/s41586-023-05795-1.
Zhao, Edward, Matthew R. Stone, Xing Ren, et al. 2021. “Spatial Transcriptomics at Subspot Resolution with BayesSpace.” Nature Biotechnology 39: 1375–84. https://doi.org/10.1038/s41587-021-00935-2.