Chapter 1 Introduction
This book provides details on several key data preprocessing procedures for computational analyses of spatial transcriptomics data from the 10x Genomics Visium platform.
1.1 Spatial transcriptomics
Spatial transcriptomics refers to high-throughput genomic technologies that measure the expression of large sets of genes at a large number of spatial locations (e.g. up to thousands of genes at thousands of measurement locations), usually on two-dimensional tissue sections. Spatially-resolved transcriptomics was named the Method of the Year 2020 by Nature Methods, and has found widespread application in numerous biological systems.
Several different technological platforms are now available. These can be grouped into sequencing-based platforms and molecule-based (or imaging-based) platforms. For recent reviews of available platforms, computational analysis methods, and outstanding challenges, see Bressan et al. (2023) or Moses et al. (2022).
In this book, we focus on the 10x Genomics Visium platform, a sequencing-based platform that is currently one of the most widely used.
Each platform brings specific challenges regarding computational analyses, including data preprocessing procedures as well as downstream analyses. In this book, we provide detailed instructions on data preprocessing procedures for 10x Genomics Visium data.
1.2 10x Genomics Visium
The 10x Genomics Visium platform measures transcriptome-wide gene expression at a grid of spatial locations (referred to as spots) on a tissue capture area. Either fresh-frozen or formalin-fixed paraffin-embedded (FFPE) tissue may be used. Each spot contains millions of spatially-barcoded capture oligonucleotides, which bind to mRNAs from the tissue. A cDNA library is then generated for sequencing, which includes the spatial barcodes, allowing reads to be mapped back to their spatial locations.
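The role of the spatial barcodes can be illustrated with a minimal sketch in Python. Everything in it is hypothetical and chosen only for illustration: the barcode sequences, the gene names, the spot coordinates, and the function name count_reads_per_spot are not taken from any 10x Genomics software. It assumes that each read has already been assigned a gene and that each spatial barcode corresponds to exactly one spot. Real pipelines also handle details such as sequencing errors in barcodes and duplicate molecules, which are omitted here.

```python
from collections import Counter

# Hypothetical lookup table: spatial barcode -> (array_row, array_col) of its spot.
barcode_to_spot = {
    "AAACGT": (0, 0),
    "CCGTTA": (0, 2),
    "GGATCC": (1, 1),
}

# Hypothetical reads, each already assigned to a gene: (spatial barcode, gene).
reads = [
    ("AAACGT", "GENE_A"),
    ("AAACGT", "GENE_A"),
    ("CCGTTA", "GENE_B"),
    ("GGATCC", "GENE_A"),
    ("TTTTTT", "GENE_B"),  # barcode that is not in the lookup table
]


def count_reads_per_spot(reads, barcode_to_spot):
    """Tally reads per (spot, gene) pair and count reads with unknown barcodes."""
    counts = Counter()
    unassigned = 0
    for barcode, gene in reads:
        spot = barcode_to_spot.get(barcode)
        if spot is None:
            # The read cannot be placed on the array, so it is set aside.
            unassigned += 1
            continue
        counts[(spot, gene)] += 1
    return counts, unassigned


counts, unassigned = count_reads_per_spot(reads, barcode_to_spot)
for (spot, gene), n in sorted(counts.items()):
    print(spot, gene, n)
print("reads without a known barcode:", unassigned)
```

The result is a table of counts indexed by spot and gene, which is the basic shape of the data used in the rest of the book: one expression profile per spatial location.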
The array dimensions are 6.5 mm × 6.5 mm, with around 5000 barcoded spots. Spots are 55 µm in diameter and spaced 100 µm center-to-center in a hexagonal grid arrangement. The number of cells per spot depends on the tissue cell density, e.g. around 0-10 for human brain or around 50 for mouse brain. Each Visium slide contains four tissue capture areas. The following figure provides an illustration.
Histology images generated from hematoxylin and eosin (H&E) staining can be used to identify anatomical and cell morphological features, and to estimate the number of cells per spot.
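The array geometry described above can be sanity-checked with some simple arithmetic. The following Python sketch treats the capture area as an idealized 6.5 mm square tiled by a perfect hexagonal lattice with 100 µm spacing, and ignores edge effects and any area not occupied by spots. These simplifications, along with all function and variable names, are assumptions made for illustration. In a hexagonal lattice with center-to-center spacing d, each spot occupies a cell of area (√3/2)·d², so the approximate spot count is the capture area divided by this cell area.

```python
import math

# Values taken from the platform description above (in micrometers).
CAPTURE_SIDE_UM = 6500.0   # 6.5 mm side length of the capture area
SPOT_DIAMETER_UM = 55.0    # spot diameter
SPOT_PITCH_UM = 100.0      # center-to-center spacing


def hex_cell_area(pitch):
    """Area of the lattice cell belonging to one spot in a hexagonal grid."""
    return (math.sqrt(3) / 2) * pitch ** 2


def approx_spot_count(side, pitch):
    """Approximate number of spots in a square area, ignoring edge effects."""
    return side ** 2 / hex_cell_area(pitch)


def covered_fraction(diameter, pitch):
    """Fraction of the area that lies underneath a spot."""
    spot_area = math.pi * (diameter / 2) ** 2
    return spot_area / hex_cell_area(pitch)


n_spots = approx_spot_count(CAPTURE_SIDE_UM, SPOT_PITCH_UM)
coverage = covered_fraction(SPOT_DIAMETER_UM, SPOT_PITCH_UM)

print(f"approximate spots per capture area: {n_spots:.0f}")
print(f"fraction of area under spots: {coverage:.2f}")
```

Under these assumptions the estimate is roughly 4,900 spots, consistent with the figure of around 5000 quoted above. The spots cover only about 27% of the capture area, so mRNAs from tissue lying between spots are not captured.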