We develop statistical and computational methods for discovery-based analyses of single-cell and spatial transcriptomics data, including open-source software implementations. We focus especially on (i) methods for unsupervised analyses, for example feature selection methods to identify spatially variable genes (nnSVG) or clustering methods to identify spatial domains, and (ii) methods for differential analyses, including differential spatial localization between conditions. Computational scalability is a key priority, so that methods can be applied to large datasets consisting of gene expression measurements for thousands of genes in thousands of cells or spatial measurement locations.
We implement methods as open-source software packages, primarily as R packages through the Bioconductor project, with code examples, documentation, and tutorials. In addition, we develop resources for comprehensive analysis workflows, for example data structures for storing and manipulating spatial transcriptomics data in R (SpatialExperiment), and an online book containing reproducible code examples and discussion of integrated analysis workflows for spatial transcriptomics data in R/Bioconductor (OSTA).
Numerous methods have been developed for analyzing single-cell and spatial transcriptomics data and other types of high-throughput genomic data. This creates challenges for researchers analyzing these data, who need to select appropriate analysis methods and connect them into workflows. We perform benchmarking studies that rigorously evaluate the performance of methods for specific analysis tasks and data types, including comparisons against baseline methods and assessments of computational performance and scalability.
We are interested in collaborative analyses of single-cell and spatial transcriptomics data from experiments in areas of biological research including neuroscience, immunology, and cancer biology. These projects provide an opportunity to contribute to biological results, and they also guide future methodological development, for example by identifying steps in analysis workflows where existing methods are inefficient or computationally infeasible.
Throughout our research, we follow principles of reproducible research and open science. This includes developing open-source software packages, providing reproducible online resources such as tutorials and code examples, and releasing the code and data needed to reproduce results in published papers.