5 Load data

5.1 Introduction

The following chapters provide examples demonstrating methods for individual analysis steps for spatial transcriptomics data from sequencing-based platforms.

In these chapters, we assume the datasets are formatted as SpatialExperiment objects (see Chapter 3).

Here, we load a 10x Genomics Visium dataset that will be used in several of the following chapters. The dataset was preprocessed with tools outside R and saved in SpatialExperiment format. (For more details on preprocessing procedures for the Visium platform, see the related online book Visium Data Preprocessing, also listed in Section B.2.) It is available for download in SpatialExperiment format from the STexampleData Bioconductor package.

5.2 Dataset

This dataset consists of one sample (Visium capture area) from one donor: postmortem human brain tissue from the dorsolateral prefrontal cortex (DLPFC), measured with the 10x Genomics Visium platform. The dataset is described in Maynard et al. (2021).

More details on the dataset are also included in Chapter 15.

5.3 Load data

Download and load the dataset in SpatialExperiment format from the STexampleData Bioconductor package.

# load packages
library(SpatialExperiment)
library(STexampleData)

# load object
spe <- Visium_humanDLPFC()
##  see ?STexampleData and browseVignettes('STexampleData') for documentation
##  downloading 1 resources
##  retrieving 1 resource
##  loading from cache

5.4 Save objects for later chapters

We also save the object(s) in .rds format for re-use within later chapters to speed up the build time of the book.

# save object(s)
saveRDS(spe, file = "spe_load.rds")
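
A later chapter can restore the saved object with readRDS(). The round trip is sketched below using base R only. The stand-in object obj and the temporary file path are assumptions made for illustration, so that the example runs without downloading the dataset; in the book, the object would be spe and the file "spe_load.rds".

```r
# stand-in object for illustration only; in the book this would be 'spe'
obj <- list(counts = matrix(0, nrow = 3, ncol = 2))

# write to a temporary location (hypothetical path) instead of the working directory
rds_file <- file.path(tempdir(), "spe_load.rds")
saveRDS(obj, file = rds_file)

# read the object back, as a later chapter would do
obj_reloaded <- readRDS(rds_file)

# check that the round trip preserved the object
identical(obj, obj_reloaded)
```

Unlike save() and load(), saveRDS() and readRDS() store a single object without its name, so the reloaded object can be assigned to any variable.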

5.5 SpatialExperiment object

Check the structure of the SpatialExperiment object. For more details on the SpatialExperiment structure, see Chapter 3.

# check object
spe
##  class: SpatialExperiment 
##  dim: 33538 4992 
##  metadata(0):
##  assays(1): counts
##  rownames(33538): ENSG00000243485 ENSG00000237613 ... ENSG00000277475
##    ENSG00000268674
##  rowData names(3): gene_id gene_name feature_type
##  colnames(4992): AAACAACGAATAGTTC-1 AAACAAGTATCTCCCA-1 ...
##    TTGTTTGTATTACACG-1 TTGTTTGTGTAAATTC-1
##  colData names(8): barcode_id sample_id ... reference cell_count
##  reducedDimNames(0):
##  mainExpName: NULL
##  altExpNames(0):
##  spatialCoords names(2) : pxl_col_in_fullres pxl_row_in_fullres
##  imgData names(4): sample_id image_id data scaleFactor
# number of genes (rows) and spots (columns)
dim(spe)
##  [1] 33538  4992
# names of 'assays'
assayNames(spe)
##  [1] "counts"
# row (gene) data
head(rowData(spe))
##  DataFrame with 6 rows and 3 columns
##                          gene_id   gene_name    feature_type
##                      <character> <character>     <character>
##  ENSG00000243485 ENSG00000243485 MIR1302-2HG Gene Expression
##  ENSG00000237613 ENSG00000237613     FAM138A Gene Expression
##  ENSG00000186092 ENSG00000186092       OR4F5 Gene Expression
##  ENSG00000238009 ENSG00000238009  AL627309.1 Gene Expression
##  ENSG00000239945 ENSG00000239945  AL627309.3 Gene Expression
##  ENSG00000239906 ENSG00000239906  AL627309.2 Gene Expression
# column (spot) data
head(colData(spe))
##  DataFrame with 6 rows and 8 columns
##                             barcode_id     sample_id in_tissue array_row
##                            <character>   <character> <integer> <integer>
##  AAACAACGAATAGTTC-1 AAACAACGAATAGTTC-1 sample_151673         0         0
##  AAACAAGTATCTCCCA-1 AAACAAGTATCTCCCA-1 sample_151673         1        50
##  AAACAATCTACTAGCA-1 AAACAATCTACTAGCA-1 sample_151673         1         3
##  AAACACCAATAACTGC-1 AAACACCAATAACTGC-1 sample_151673         1        59
##  AAACAGAGCGACTCCT-1 AAACAGAGCGACTCCT-1 sample_151673         1        14
##  AAACAGCTTTCAGAAG-1 AAACAGCTTTCAGAAG-1 sample_151673         1        43
##                     array_col ground_truth   reference cell_count
##                     <integer>  <character> <character>  <integer>
##  AAACAACGAATAGTTC-1        16           NA          NA         NA
##  AAACAAGTATCTCCCA-1       102       Layer3      Layer3          6
##  AAACAATCTACTAGCA-1        43       Layer1      Layer1         16
##  AAACACCAATAACTGC-1        19           WM          WM          5
##  AAACAGAGCGACTCCT-1        94       Layer3      Layer3          2
##  AAACAGCTTTCAGAAG-1         9       Layer5      Layer5          4
# spatial coordinates
head(spatialCoords(spe))
##                     pxl_col_in_fullres pxl_row_in_fullres
##  AAACAACGAATAGTTC-1               3913               2435
##  AAACAAGTATCTCCCA-1               9791               8468
##  AAACAATCTACTAGCA-1               5769               2807
##  AAACACCAATAACTGC-1               4068               9505
##  AAACAGAGCGACTCCT-1               9271               4151
##  AAACAGCTTTCAGAAG-1               3393               7583
# image data
imgData(spe)
##  DataFrame with 2 rows and 4 columns
##        sample_id    image_id   data scaleFactor
##      <character> <character> <list>   <numeric>
##  1 sample_151673      lowres   ####   0.0450045
##  2 sample_151673       hires   ####   0.1500150
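
The same accessors can be combined with standard R functions for quick summaries. The following is a minimal sketch, assuming the SpatialExperiment and STexampleData packages are installed and the dataset can be downloaded or read from the local cache. It tabulates the in_tissue column shown in the column data above and prints a small corner of the counts assay.

```r
library(SpatialExperiment)
library(STexampleData)

# load the example dataset (downloads on first use, then reads from cache)
spe <- Visium_humanDLPFC()

# number of spots that overlap tissue (1) versus spots that do not (0)
table(colData(spe)$in_tissue)

# first 3 genes and first 3 spots of the counts matrix
counts(spe)[1:3, 1:3]
```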

5.6 Build object

Alternatively, we can build a SpatialExperiment object directly from raw data.

Here, we provide a short example using a placeholder dataset in which all counts and spatial coordinates are set to zero.

For more details, including how to load raw data from the 10x Genomics Space Ranger output files to build an object, or how to add image data to the object, see the SpatialExperiment documentation.

# create data
n_genes <- 200
n_spots <- 100

counts <- matrix(0, nrow = n_genes, ncol = n_spots)

row_data <- DataFrame(
  gene_name = paste0("gene", sprintf("%03d", seq_len(n_genes)))
)

col_data <- DataFrame(
  sample_id = rep("sample01", n_spots)
)

spatial_coords <- matrix(0, nrow = n_spots, ncol = 2)
colnames(spatial_coords) <- c("x", "y")

# create SpatialExperiment object
spe <- SpatialExperiment(
  assays = list(counts = counts), 
  colData = col_data, 
  rowData = row_data, 
  spatialCoords = spatial_coords
)
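
The same constructor call can be checked on a smaller object with non-zero values. The following is a minimal sketch, assuming the SpatialExperiment package is installed. The simulated counts, the 2 x 2 grid of coordinates, and the object name spe_toy are hypothetical and chosen only for illustration.

```r
library(SpatialExperiment)

set.seed(1)
n_genes <- 5
n_spots <- 4

# simulated counts (hypothetical values, for illustration only)
counts <- matrix(rpois(n_genes * n_spots, lambda = 2),
                 nrow = n_genes, ncol = n_spots)

row_data <- DataFrame(
  gene_name = paste0("gene", seq_len(n_genes))
)

col_data <- DataFrame(
  sample_id = rep("sample01", n_spots)
)

# spot coordinates on a 2 x 2 grid, one row per spot
spatial_coords <- cbind(x = c(1, 2, 1, 2), y = c(1, 1, 2, 2))

# create SpatialExperiment object
spe_toy <- SpatialExperiment(
  assays = list(counts = counts),
  colData = col_data,
  rowData = row_data,
  spatialCoords = spatial_coords
)

# check dimensions (genes, spots) and coordinate names
dim(spe_toy)
spatialCoordsNames(spe_toy)
```

The number of rows in spatial_coords must match the number of columns (spots) in the counts matrix, since each spot has exactly one pair of coordinates.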

References

Maynard, Kristen R., Leonardo Collado-Torres, Lukas M. Weber, Cedric Uytingco, Brianna K. Barry, Stephen R. Williams, Joseph L. Catallini II, et al. 2021. “Transcriptome-Scale Spatial Gene Expression in the Human Dorsolateral Prefrontal Cortex.” Nature Neuroscience 24: 425–36. https://doi.org/10.1038/s41593-020-00787-0.