30  Differential colocalization

30.1 Preamble

30.1.1 Introduction

In this section we will explore tools to assess differential colocalization between conditions with the two packages spicyR and spatialFDA. Both packages are based the comparison of known spatial statistics functions implemented in the package spatstat.

30.1.2 Dependencies

30.1.3 Load data

For this example, we use a dataset on type I diabetes (T1D) (Damond et al. 2019). This dataset contains three conditions (non-diabetic, onset, and long-duration diabetes), several patients per condition, and several fields-of-view per patient. This hierarchical setup requires the modeler to account for correlation between fields-of-view, for example if they stem from the same patient. Such a setup can be modeled with mixed models.

Code
# load data from full dataset (not run due to computational limitations)
spe <- .loadExample(full = TRUE)
# subset data to smaller example
spe <- subset(spe, , patient_id %in% c(6089, 6180, 6126, 6134, 6228, 6414))
# remove assays (for smaller object size)
assays(spe) <- list()
Code
# load data from OSF repository (for smaller download)
osf_repo <- osf_retrieve_node("https://osf.io/5n4q3/")
osf_files <- osf_ls_files(osf_repo, path = "zzz", 
                          pattern = "Damond_noAssays.rds", n_max = Inf)
dir.create(td <- tempfile())
osf_download(osf_files, path = td)
##  # A tibble: 1 × 4
##    name                           id              local_path      meta        
##    <chr>                          <chr>           <chr>           <list>      
##  1 IMC_HumanPancreas_Damond_noAs… 682e2ea1abc2a7… /tmp/RtmpoPOas… <named list>
Code
spe <- readRDS(file.path(td, list.files(td, ".rds")))

# subset patient IDs
spe <- subset(spe, , patient_id %in% c(6089, 6180, 6126, 6134, 6228, 6414))

colData(spe)$cell_type <- as.factor(colData(spe)$cell_type)
colData(spe)$patient_stage <- as.factor(colData(spe)$patient_stage) |> 
                              relevel("Non-diabetic")

30.2 spicyR

First, we show the functionality of spicyR. This package condenses the information of a spatial statistics curve to a scalar value and models this continuous value as the response in a linear (mixed) effects model (Canete et al. 2022).

Code
colData(spe)$cellType <- colData(spe)$cell_type
colData(spe)$cell_type <- as.factor(colData(spe)$cell_type)

# note runtime: ~5 min
spicyTestPair <- spicy(
  spe, 
  condition = "patient_stage", 
  imageID = "image_number", 
  window = "square", 
  cores = 4
)

out <- topPairs(spicyTestPair, n = 27)
out
##                       intercept coefficient      p.value   adj.pvalue
##  delta__acinar      -17.7729395    9.004587 1.545527e-29 3.048377e-27
##  acinar__delta      -17.0881419    9.108776 2.400297e-29 3.048377e-27
##  alpha__acinar      -15.9964182    7.365380 4.260325e-23 3.607075e-21
##  acinar__alpha      -15.2344993    7.439930 8.719721e-23 5.537023e-21
##  beta__beta          65.9550031 -122.621670 7.039075e-20 3.575850e-18
##  ductal__delta      -11.8704699    7.459358 4.866759e-16 2.060261e-14
##  delta__ductal      -12.4926633    7.319482 1.817831e-15 6.596128e-14
##  ductal__alpha      -10.3178141    5.931253 3.552279e-12 1.127848e-10
##  alpha__ductal      -10.9802210    5.822295 6.401701e-12 1.806702e-10
##  alpha__beta         56.0456396  -51.376914 1.341019e-10 3.406189e-09
##  beta__alpha         56.0301246  -50.922099 1.579354e-10 3.646872e-09
##  beta__delta         60.5631571  -54.877724 6.425864e-10 1.360141e-08
##  delta__beta         60.5656499  -53.513149 1.773028e-09 3.464225e-08
##  alpha__stromal      -4.0611492   -8.926843 3.469114e-09 6.293964e-08
##  stromal__alpha      -3.3719695   -8.705048 4.598295e-09 7.786447e-08
##  acinar__gamma      -18.0474567   12.786526 2.365790e-08 3.755692e-07
##  gamma__acinar      -18.3462631   12.337313 4.409971e-08 6.364668e-07
##  ductal__ductal       6.1000393   -2.337259 4.510395e-08 6.364668e-07
##  ductal__macrophage   2.5307434   -3.798188 3.331388e-07 4.453540e-06
##  ductal__stromal      0.3581832   -1.906647 4.482314e-07 5.692539e-06
##  macrophage__ductal   2.7423608   -3.697457 1.061539e-06 1.283957e-05
##  delta__stromal      -5.9495963   -8.351216 1.565689e-06 1.807659e-05
##  stromal__delta      -5.3354402   -8.090325 2.864985e-06 3.163940e-05
##  stromal__ductal      0.2547199   -1.623678 2.359888e-05 2.497548e-04
##  acinar__acinar       2.6329065   -1.314186 2.893433e-05 2.939728e-04
##  ductal__gamma      -12.8011642   12.300659 4.071268e-05 3.977316e-04
##  gamma__ductal      -12.9030183   11.710494 6.960197e-05 6.547741e-04
##                           from         to
##  delta__acinar           delta     acinar
##  acinar__delta          acinar      delta
##  alpha__acinar           alpha     acinar
##  acinar__alpha          acinar      alpha
##  beta__beta               beta       beta
##  ductal__delta          ductal      delta
##  delta__ductal           delta     ductal
##  ductal__alpha          ductal      alpha
##  alpha__ductal           alpha     ductal
##  alpha__beta             alpha       beta
##  beta__alpha              beta      alpha
##  beta__delta              beta      delta
##  delta__beta             delta       beta
##  alpha__stromal          alpha    stromal
##  stromal__alpha        stromal      alpha
##  acinar__gamma          acinar      gamma
##  gamma__acinar           gamma     acinar
##  ductal__ductal         ductal     ductal
##  ductal__macrophage     ductal macrophage
##  ductal__stromal        ductal    stromal
##  macrophage__ductal macrophage     ductal
##  delta__stromal          delta    stromal
##  stromal__delta        stromal      delta
##  stromal__ductal       stromal     ductal
##  acinar__acinar         acinar     acinar
##  ductal__gamma          ductal      gamma
##  gamma__ductal           gamma     ductal

We can plot the results as a heatmap.

Code
signifPlot(
  spicyTestPair, 
  breaks = c(-3, 3, 1)
)

30.3 spatialFDA

The spatialFDA package computes spatial statistics functions and compares these with tools from functional data analysis (FDA). We will consider the colocalization of \(\alpha\) cells in the islet and cytoxic T-cells, which are important in the destruction of islet cells in T1D (Damond et al. 2019).

This example is adapted from the spatialFDA vignette.

Code
metricRes <- calcMetricPerFov(
  spe = spe, 
  selection = c("alpha", "Tc"), 
  subsetby = "image_number", 
  fun = "Gcross", 
  marks = "cell_type", 
  rSeq = seq(0, 50, length.out = 50), 
  by = c("patient_stage", "patient_id", "image_number"), 
  ncores = 1
)
##  [1] "Calculating Gcross from alpha to Tc"

First, we will plot the functional box plot of the square-root transformed spatial statistics functions (Sun and Genton 2011).

Code
metricRes$ID <- paste0(
  metricRes$patient_stage, "x", metricRes$patient_id, 
  "x", metricRes$image_number
)

# removing field of views that have as a curve only zeros - these are cases
# where there are no cells of one type
metricRes <- metricRes %>% dplyr::group_by(ID) %>% dplyr::filter(sum(rs) >= 1)

# sqrt transform response
metricRes$rs <- sqrt(metricRes$rs)

collector <- plotFbPlot(metricRes, "r", "rs", "patient_stage")

The black line indicates the median curve, the purple region is the 50% central region, and the red lines indicate outliers.

From these plots we note that onset functions seem to qualitatively have a higher median curve than non-diabetic or long-duration diabetes functions.

Another exploratory technique is functional PCA (FPCA). FPCA is similar to standard PCA in that it decomposes the functional data into the main modes of variation. Here, we decompose into eigenfunctions (Goldsmith et al. 2024).

Code
# filter out all rows that have a constant zero part - all r < 10
metricRes <- metricRes %>% filter(r > 10)

# prepare data frame from calcMetricRes to be in the correct format for pffr
dat <- prepData(metricRes, "r", "rs")

# create meta info of the IDs
splitData <- dat$ID %>% 
  str_replace("-","_") %>% 
  str_split_fixed("x", 3) %>% 
  data.frame(stringsAsFactors = TRUE) %>% 
  setNames(c("condition", "patient_id", "imageId")) %>% 
  mutate(condition = relevel(condition,"Non_diabetic"))

dat <- cbind(dat, splitData)

# drop rows with NA
dat <- dat |> drop_na()

# calculate the fPCA
pca <- functionalPCA(dat = dat, r = metricRes$r |> unique(), pve = 0.995)
evalues <- pca$evalues
efunctions <- pca$efunctions

plotFpca(dat = dat, res = pca, colourby = "condition") + 
  labs(color = "condition")

From the FPCA we note that the variability in onset diabetes curves is larger than the other two categories.

Next, we will show how FDA techniques can be used to make inference on the spatial statistics curves. We use a generalized additive mixed model with a functional response (Scheipl, Staicu, and Greven 2015; Scheipl, Gertheiss, and Greven 2016; Goldsmith et al. 2024).

Code
# create design matrix
mm <- model.matrix(~ condition, data = dat)
colnames(mm)[1] <- "Intercept"
mm %>% head()
##    Intercept conditionLong_duration conditionOnset
##  1         1                      1              0
##  2         1                      1              0
##  3         1                      1              0
##  4         1                      1              0
##  5         1                      1              0
##  6         1                      1              0
Code
r <- metricRes$r |> unique()

# fit the model
mdl <- functionalGam(
    data = dat, x = r, 
    designmat = mm, weights = dat$npoints, 
    formula = formula(Y ~ 1 + conditionLong_duration + 
                          conditionOnset + s(patient_id, bs = "re")), 
    family = mgcv::scat(link = "log"), 
    algorithm = "gam"
)

summary(mdl)
##  
##  Family: Scaled t(3.692,0.098) 
##  Link function: log 
##  
##  Formula:
##  Y ~ 1 + conditionLong_duration + conditionOnset + s(patient_id, 
##      bs = "re")
##  
##  Constant coefficients:
##              Estimate Std. Error z value Pr(>|z|)    
##  (Intercept)   -1.523      0.169  -9.013   <2e-16 ***
##  ---
##  Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##  
##  Smooth terms & functional coefficients:
##                              edf Ref.df  Chi.sq  p-value    
##  Intercept(x)              15.78 19.000 611.095  < 2e-16 ***
##  conditionLong_duration(x)  1.00  1.000   0.067 0.796154    
##  conditionOnset(x)          2.76  2.926  20.457 0.000316 ***
##  s(patient_id)             12.64 27.000 844.721  < 2e-16 ***
##  ---
##  Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##  
##  R-sq.(adj) =  0.675   Deviance explained = 57.9%
##  -REML score =  -7849  Scale est. = 1         n = 12720 (318 x 40)
Code
plotLs <- lapply(colnames(mm), plotMdl, mdl = mdl, 
                 shift = mdl$coefficients[["(Intercept)"]])
##  using seWithMean for  s(x.vec) .
##  using seWithMean for  s(x.vec) .
##  using seWithMean for  s(x.vec) .
Code
patchwork::wrap_plots(plotLs, nrow = 3, axes = "collect")

With the functional GAM we see not only the effect (increased or decreased colocalization in comparison to the reference category) but also at which scale this effect is strong.

The Q-Q plot indicates a reasonable model fit, although not perfect.

Code
qqnorm(resid(mdl), pch = 16)
qqline(resid(mdl))

30.4 Appendix

References

Canete, Nicolas P., Sourish S. Iyengar, John T. Ormerod, Heeva Baharlou, Andrew N. Harman, and Ellis Patrick. 2022. spicyR: Spatial Analysis of in Situ Cytometry Data in R.” Bioinformatics 38: 3099–3105. https://doi.org/10.1093/bioinformatics/btac268.
Damond, Nicolas, Stefanie Engler, Vito R. T. Zanotelli, Denis Schapiro, Clive H. Wasserfall, Irina Kusmartseva, Harry S. Nick, et al. 2019. “A Map of Human Type 1 Diabetes Progression by Imaging Mass Cytometry.” Cell Metabolism 29 (3): 755–768.e5. https://doi.org/10.1016/j.cmet.2018.11.014.
Goldsmith, Jeff, Fabian Scheipl, Lei Huang, Julia Wrobel, Chongzhi Di, Jonathan Gellar, Jaroslaw Harezlak, et al. 2024. “Refund: Regression with Functional Data.” R Package. https://cran.r-project.org/package=refund.
Scheipl, Fabian, Jan Gertheiss, and Sonja Greven. 2016. “Generalized Functional Additive Mixed Models.” Electronic Journal of Statistics 10: 1455–92. https://doi.org/10.1214/16-EJS1145.
Scheipl, Fabian, Ana-Maria Staicu, and Sonja Greven. 2015. “Functional Additive Mixed Models.” Journal of Computational and Graphical Statistics 24: 477–501. https://doi.org/10.1080/10618600.2014.901914.
Sun, Ying, and Marc G. Genton. 2011. “Functional Boxplots.” Journal of Computational and Graphical Statistics 20: 316–34. https://doi.org/10.1198/jcgs.2011.09224.
Back to top