30 Differential colocalization
30.1 Preamble
30.1.1 Introduction
In this section we will explore tools to assess differential colocalization between conditions with the two packages spicyR and spatialFDA. Both packages are based the comparison of known spatial statistics functions implemented in the package spatstat.
30.1.2 Dependencies
30.1.3 Load data
For this example, we use a dataset on type I diabetes (T1D) (Damond et al. 2019). This dataset contains three conditions (non-diabetic, onset, and long-duration diabetes), several patients per condition, and several fields-of-view per patient. This hierarchical setup requires the modeler to account for correlation between fields-of-view, for example if they stem from the same patient. Such a setup can be modeled with mixed models.
Code
# load data from full dataset (not run due to computational limitations)
spe <- .loadExample(full = TRUE)
# subset data to smaller example
spe <- subset(spe, , patient_id %in% c(6089, 6180, 6126, 6134, 6228, 6414))
# remove assays (for smaller object size)
assays(spe) <- list()
Code
# load data from OSF repository (for smaller download)
osf_repo <- osf_retrieve_node("https://osf.io/5n4q3/")
osf_files <- osf_ls_files(osf_repo, path = "zzz",
pattern = "Damond_noAssays.rds", n_max = Inf)
dir.create(td <- tempfile())
osf_download(osf_files, path = td)
## # A tibble: 1 × 4
## name id local_path meta
## <chr> <chr> <chr> <list>
## 1 IMC_HumanPancreas_Damond_noAs… 682e2ea1abc2a7… /tmp/RtmpoPOas… <named list>
Code
spe <- readRDS(file.path(td, list.files(td, ".rds")))
# subset patient IDs
spe <- subset(spe, , patient_id %in% c(6089, 6180, 6126, 6134, 6228, 6414))
colData(spe)$cell_type <- as.factor(colData(spe)$cell_type)
colData(spe)$patient_stage <- as.factor(colData(spe)$patient_stage) |>
relevel("Non-diabetic")
30.2 spicyR
First, we show the functionality of spicyR. This package condenses the information of a spatial statistics curve to a scalar value and models this continuous value as the response in a linear (mixed) effects model (Canete et al. 2022).
Code
## intercept coefficient p.value adj.pvalue
## delta__acinar -17.7729395 9.004587 1.545527e-29 3.048377e-27
## acinar__delta -17.0881419 9.108776 2.400297e-29 3.048377e-27
## alpha__acinar -15.9964182 7.365380 4.260325e-23 3.607075e-21
## acinar__alpha -15.2344993 7.439930 8.719721e-23 5.537023e-21
## beta__beta 65.9550031 -122.621670 7.039075e-20 3.575850e-18
## ductal__delta -11.8704699 7.459358 4.866759e-16 2.060261e-14
## delta__ductal -12.4926633 7.319482 1.817831e-15 6.596128e-14
## ductal__alpha -10.3178141 5.931253 3.552279e-12 1.127848e-10
## alpha__ductal -10.9802210 5.822295 6.401701e-12 1.806702e-10
## alpha__beta 56.0456396 -51.376914 1.341019e-10 3.406189e-09
## beta__alpha 56.0301246 -50.922099 1.579354e-10 3.646872e-09
## beta__delta 60.5631571 -54.877724 6.425864e-10 1.360141e-08
## delta__beta 60.5656499 -53.513149 1.773028e-09 3.464225e-08
## alpha__stromal -4.0611492 -8.926843 3.469114e-09 6.293964e-08
## stromal__alpha -3.3719695 -8.705048 4.598295e-09 7.786447e-08
## acinar__gamma -18.0474567 12.786526 2.365790e-08 3.755692e-07
## gamma__acinar -18.3462631 12.337313 4.409971e-08 6.364668e-07
## ductal__ductal 6.1000393 -2.337259 4.510395e-08 6.364668e-07
## ductal__macrophage 2.5307434 -3.798188 3.331388e-07 4.453540e-06
## ductal__stromal 0.3581832 -1.906647 4.482314e-07 5.692539e-06
## macrophage__ductal 2.7423608 -3.697457 1.061539e-06 1.283957e-05
## delta__stromal -5.9495963 -8.351216 1.565689e-06 1.807659e-05
## stromal__delta -5.3354402 -8.090325 2.864985e-06 3.163940e-05
## stromal__ductal 0.2547199 -1.623678 2.359888e-05 2.497548e-04
## acinar__acinar 2.6329065 -1.314186 2.893433e-05 2.939728e-04
## ductal__gamma -12.8011642 12.300659 4.071268e-05 3.977316e-04
## gamma__ductal -12.9030183 11.710494 6.960197e-05 6.547741e-04
## from to
## delta__acinar delta acinar
## acinar__delta acinar delta
## alpha__acinar alpha acinar
## acinar__alpha acinar alpha
## beta__beta beta beta
## ductal__delta ductal delta
## delta__ductal delta ductal
## ductal__alpha ductal alpha
## alpha__ductal alpha ductal
## alpha__beta alpha beta
## beta__alpha beta alpha
## beta__delta beta delta
## delta__beta delta beta
## alpha__stromal alpha stromal
## stromal__alpha stromal alpha
## acinar__gamma acinar gamma
## gamma__acinar gamma acinar
## ductal__ductal ductal ductal
## ductal__macrophage ductal macrophage
## ductal__stromal ductal stromal
## macrophage__ductal macrophage ductal
## delta__stromal delta stromal
## stromal__delta stromal delta
## stromal__ductal stromal ductal
## acinar__acinar acinar acinar
## ductal__gamma ductal gamma
## gamma__ductal gamma ductal
We can plot the results as a heatmap.
Code
signifPlot(
spicyTestPair,
breaks = c(-3, 3, 1)
)
30.3 spatialFDA
The spatialFDA package computes spatial statistics functions and compares these with tools from functional data analysis (FDA). We will consider the colocalization of \(\alpha\) cells in the islet and cytoxic T-cells, which are important in the destruction of islet cells in T1D (Damond et al. 2019).
This example is adapted from the spatialFDA vignette.
Code
metricRes <- calcMetricPerFov(
spe = spe,
selection = c("alpha", "Tc"),
subsetby = "image_number",
fun = "Gcross",
marks = "cell_type",
rSeq = seq(0, 50, length.out = 50),
by = c("patient_stage", "patient_id", "image_number"),
ncores = 1
)
## [1] "Calculating Gcross from alpha to Tc"
First, we will plot the functional box plot of the square-root transformed spatial statistics functions (Sun and Genton 2011).
Code
metricRes$ID <- paste0(
metricRes$patient_stage, "x", metricRes$patient_id,
"x", metricRes$image_number
)
# removing field of views that have as a curve only zeros - these are cases
# where there are no cells of one type
metricRes <- metricRes %>% dplyr::group_by(ID) %>% dplyr::filter(sum(rs) >= 1)
# sqrt transform response
metricRes$rs <- sqrt(metricRes$rs)
collector <- plotFbPlot(metricRes, "r", "rs", "patient_stage")
The black line indicates the median curve, the purple region is the 50% central region, and the red lines indicate outliers.
From these plots we note that onset functions seem to qualitatively have a higher median curve than non-diabetic or long-duration diabetes functions.
Another exploratory technique is functional PCA (FPCA). FPCA is similar to standard PCA in that it decomposes the functional data into the main modes of variation. Here, we decompose into eigenfunctions (Goldsmith et al. 2024).
Code
# filter out all rows that have a constant zero part - all r < 10
metricRes <- metricRes %>% filter(r > 10)
# prepare data frame from calcMetricRes to be in the correct format for pffr
dat <- prepData(metricRes, "r", "rs")
# create meta info of the IDs
splitData <- dat$ID %>%
str_replace("-","_") %>%
str_split_fixed("x", 3) %>%
data.frame(stringsAsFactors = TRUE) %>%
setNames(c("condition", "patient_id", "imageId")) %>%
mutate(condition = relevel(condition,"Non_diabetic"))
dat <- cbind(dat, splitData)
# drop rows with NA
dat <- dat |> drop_na()
# calculate the fPCA
pca <- functionalPCA(dat = dat, r = metricRes$r |> unique(), pve = 0.995)
evalues <- pca$evalues
efunctions <- pca$efunctions
plotFpca(dat = dat, res = pca, colourby = "condition") +
labs(color = "condition")
From the FPCA we note that the variability in onset diabetes curves is larger than the other two categories.
Next, we will show how FDA techniques can be used to make inference on the spatial statistics curves. We use a generalized additive mixed model with a functional response (Scheipl, Staicu, and Greven 2015; Scheipl, Gertheiss, and Greven 2016; Goldsmith et al. 2024).
Code
# create design matrix
mm <- model.matrix(~ condition, data = dat)
colnames(mm)[1] <- "Intercept"
mm %>% head()
## Intercept conditionLong_duration conditionOnset
## 1 1 1 0
## 2 1 1 0
## 3 1 1 0
## 4 1 1 0
## 5 1 1 0
## 6 1 1 0
Code
r <- metricRes$r |> unique()
# fit the model
mdl <- functionalGam(
data = dat, x = r,
designmat = mm, weights = dat$npoints,
formula = formula(Y ~ 1 + conditionLong_duration +
conditionOnset + s(patient_id, bs = "re")),
family = mgcv::scat(link = "log"),
algorithm = "gam"
)
summary(mdl)
##
## Family: Scaled t(3.692,0.098)
## Link function: log
##
## Formula:
## Y ~ 1 + conditionLong_duration + conditionOnset + s(patient_id,
## bs = "re")
##
## Constant coefficients:
## Estimate Std. Error z value Pr(>|z|)
## (Intercept) -1.523 0.169 -9.013 <2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Smooth terms & functional coefficients:
## edf Ref.df Chi.sq p-value
## Intercept(x) 15.78 19.000 611.095 < 2e-16 ***
## conditionLong_duration(x) 1.00 1.000 0.067 0.796154
## conditionOnset(x) 2.76 2.926 20.457 0.000316 ***
## s(patient_id) 12.64 27.000 844.721 < 2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## R-sq.(adj) = 0.675 Deviance explained = 57.9%
## -REML score = -7849 Scale est. = 1 n = 12720 (318 x 40)
## using seWithMean for s(x.vec) .
## using seWithMean for s(x.vec) .
## using seWithMean for s(x.vec) .
Code
patchwork::wrap_plots(plotLs, nrow = 3, axes = "collect")
With the functional GAM we see not only the effect (increased or decreased colocalization in comparison to the reference category) but also at which scale this effect is strong.
The Q-Q plot indicates a reasonable model fit, although not perfect.