library(SpatialExperiment)
library(STexampleData)
# load object
spe <- Visium_humanDLPFC()
5 Load data
5.1 Overview
In the following chapters, we apply analysis methods to spatial transcriptomics datasets that are formatted as SpatialExperiment objects or objects from other Bioconductor data classes (see Chapter 3).
Here, we load a 10x Genomics Visium dataset that will be used in several of the following chapters.
This dataset was previously preprocessed with tools outside R and saved in SpatialExperiment format. For more details on preprocessing procedures for the 10x Genomics Visium platform, see the related online book Visium Data Preprocessing.
This dataset is available for download in SpatialExperiment format from the STexampleData Bioconductor package.
5.2 Dataset
This dataset consists of one sample (Visium capture area) of postmortem human brain tissue from one donor, taken from the dorsolateral prefrontal cortex (DLPFC) brain region and measured with the 10x Genomics Visium platform. The dataset is described in the original publication by Maynard et al. (2021).
More details on the dataset are also included in Chapter 17.
5.3 Load data
Download and load the dataset in SpatialExperiment format from the STexampleData Bioconductor package.
5.4 SpatialExperiment object
Check the structure of the SpatialExperiment object. For more details on the SpatialExperiment structure, see Chapter 3.
# check object
spe
class: SpatialExperiment
dim: 33538 4992
metadata(0):
assays(1): counts
rownames(33538): ENSG00000243485 ENSG00000237613 ... ENSG00000277475
ENSG00000268674
rowData names(3): gene_id gene_name feature_type
colnames(4992): AAACAACGAATAGTTC-1 AAACAAGTATCTCCCA-1 ...
TTGTTTGTATTACACG-1 TTGTTTGTGTAAATTC-1
colData names(7): barcode_id sample_id ... ground_truth cell_count
reducedDimNames(0):
mainExpName: NULL
altExpNames(0):
spatialCoords names(2) : pxl_col_in_fullres pxl_row_in_fullres
imgData names(4): sample_id image_id data scaleFactor
# number of genes (rows) and spots (columns)
dim(spe)
[1] 33538 4992
# names of 'assays'
assayNames(spe)
[1] "counts"
# row (gene) data
head(rowData(spe))
DataFrame with 6 rows and 3 columns
                        gene_id   gene_name    feature_type
                    <character> <character>     <character>
ENSG00000243485 ENSG00000243485 MIR1302-2HG Gene Expression
ENSG00000237613 ENSG00000237613     FAM138A Gene Expression
ENSG00000186092 ENSG00000186092       OR4F5 Gene Expression
ENSG00000238009 ENSG00000238009  AL627309.1 Gene Expression
ENSG00000239945 ENSG00000239945  AL627309.3 Gene Expression
ENSG00000239906 ENSG00000239906  AL627309.2 Gene Expression
# column (spot) data
head(colData(spe))
DataFrame with 6 rows and 7 columns
                           barcode_id     sample_id in_tissue array_row
                          <character>   <character> <integer> <integer>
AAACAACGAATAGTTC-1 AAACAACGAATAGTTC-1 sample_151673         0         0
AAACAAGTATCTCCCA-1 AAACAAGTATCTCCCA-1 sample_151673         1        50
AAACAATCTACTAGCA-1 AAACAATCTACTAGCA-1 sample_151673         1         3
AAACACCAATAACTGC-1 AAACACCAATAACTGC-1 sample_151673         1        59
AAACAGAGCGACTCCT-1 AAACAGAGCGACTCCT-1 sample_151673         1        14
AAACAGCTTTCAGAAG-1 AAACAGCTTTCAGAAG-1 sample_151673         1        43
                   array_col ground_truth cell_count
                   <integer>  <character>  <integer>
AAACAACGAATAGTTC-1        16           NA         NA
AAACAAGTATCTCCCA-1       102       Layer3          6
AAACAATCTACTAGCA-1        43       Layer1         16
AAACACCAATAACTGC-1        19           WM          5
AAACAGAGCGACTCCT-1        94       Layer3          2
AAACAGCTTTCAGAAG-1         9       Layer5          4
# spatial coordinates
head(spatialCoords(spe))
                   pxl_col_in_fullres pxl_row_in_fullres
AAACAACGAATAGTTC-1               3913               2435
AAACAAGTATCTCCCA-1               9791               8468
AAACAATCTACTAGCA-1               5769               2807
AAACACCAATAACTGC-1               4068               9505
AAACAGAGCGACTCCT-1               9271               4151
AAACAGCTTTCAGAAG-1               3393               7583
# image data
imgData(spe)
DataFrame with 2 rows and 4 columns
      sample_id    image_id   data scaleFactor
    <character> <character> <list>   <numeric>
1 sample_151673      lowres   ####   0.0450045
2 sample_151673       hires   ####   0.1500150
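A common next step with Visium data is to keep only the spots that overlap tissue, as recorded in the in_tissue column of colData (in the output above, the first spot has in_tissue equal to 0 and no ground_truth label). The following is a minimal sketch of that subsetting, not a step from this chapter. It assumes a small stand-in object named toy, with hypothetical values rather than the DLPFC data, so that it runs without downloading anything. The same subsetting expression should apply to the spe object loaded above.

```r
library(SpatialExperiment)

# small stand-in object (hypothetical values, not the DLPFC data)
set.seed(1)
n_genes <- 5
n_spots <- 6

toy <- SpatialExperiment(
  assays = list(
    counts = matrix(rpois(n_genes * n_spots, lambda = 2), nrow = n_genes)
  ),
  colData = DataFrame(
    sample_id = rep("sample01", n_spots),
    in_tissue = c(0L, 1L, 1L, 1L, 0L, 1L)
  ),
  spatialCoords = matrix(
    seq_len(n_spots * 2), ncol = 2,
    dimnames = list(NULL, c("x", "y"))
  )
)

# keep only spots that overlap tissue (in_tissue == 1)
toy_tissue <- toy[, colData(toy)$in_tissue == 1]

# counts, colData, and spatialCoords are subset together
dim(toy_tissue)
nrow(spatialCoords(toy_tissue))
```

Because subsetting acts on the whole object, the spatial coordinates and column data stay aligned with the remaining columns of the counts matrix.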
5.5 Build object
Alternatively, we can build a SpatialExperiment object directly from raw data.
Here, we provide a short example with an empty dataset (all counts and coordinates set to zero).
For more details, including how to load raw data from the 10x Genomics Space Ranger output files to build an object, or how to add image data to the object, see the SpatialExperiment documentation.
# create data
n_genes <- 200
n_spots <- 100

counts <- matrix(0, nrow = n_genes, ncol = n_spots)

row_data <- DataFrame(
  gene_name = paste0("gene", sprintf("%03d", seq_len(n_genes)))
)

col_data <- DataFrame(
  sample_id = rep("sample01", n_spots)
)

spatial_coords <- matrix(0, nrow = n_spots, ncol = 2)
colnames(spatial_coords) <- c("x", "y")

# create SpatialExperiment object
spe <- SpatialExperiment(
  assays = list(counts = counts),
  colData = col_data,
  rowData = row_data,
  spatialCoords = spatial_coords
)
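The same constructor call also works with non-empty inputs. As a hedged sketch, the example below fills the counts matrix with simulated Poisson values and places the spots on a regular grid, then checks that the pieces line up. The object name spe_sim, the grid layout, and all values are assumptions for illustration only and do not come from a real dataset.

```r
library(SpatialExperiment)

set.seed(123)
n_genes <- 20
n_spots <- 12

# simulated counts (hypothetical values)
counts_sim <- matrix(
  rpois(n_genes * n_spots, lambda = 5),
  nrow = n_genes, ncol = n_spots
)

# spot coordinates on a 4 x 3 grid (one row per spot)
coords_sim <- cbind(
  x = rep(1:4, times = 3),
  y = rep(1:3, each = 4)
)

spe_sim <- SpatialExperiment(
  assays = list(counts = counts_sim),
  rowData = DataFrame(
    gene_name = paste0("gene", sprintf("%03d", seq_len(n_genes)))
  ),
  colData = DataFrame(sample_id = rep("sample01", n_spots)),
  spatialCoords = coords_sim
)

# genes are rows, spots are columns
dim(spe_sim)

# one row of coordinates per spot
spatialCoordsNames(spe_sim)
stopifnot(nrow(spatialCoords(spe_sim)) == ncol(spe_sim))
```

The number of rows in spatialCoords has to match the number of columns (spots) in the counts matrix, which is why the final check compares the two.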
5.6 Molecule-based data
For more details on data classes for molecule-based platforms, e.g. 10x Genomics Xenium or Vizgen MERSCOPE, see Chapter 3.