library(tidyverse)
library(cowplot)
library(broom) 
library(modelr) 
library(viridis)
library(lubridate)
library(hms)
library(knitr)
library(kableExtra)
library(ggridges)
library(brms)
library(tidybayes)
library(mice)

knitr::opts_chunk$set(tidy.opts=list(width.cutoff=60),tidy=TRUE, echo = TRUE, message=FALSE, warning=FALSE, fig.align="center")


source("../../../../IDA/tools/plotting_tools.R")

theme_set(theme_1())

Intro

This notebook performs bayesian parameter estimation using brms on data that I acquired and processed in this notebook. Essentially I quantified extracellular DNA in WT P. aeruginosa colony biofilms using the dye TOTO-1. I measured the fluorescence in replicate suspensions of replicate colonies and then calculated DNA content with a calibration curve of TOTO with calf thymus DNA (in vitro).

This dataset is inherently hierarchical because we have one hyperparameter that we are interested in - the eDNA concentration in WT colony biofilms, with 6 biological replicates (separate colony biofilms), and each biological replicate was measured in technical triplicate. It is quite reasonable to think that variability occurs at each level, because the biofilm suspension is not very homogeneous (so each technical replicate probably varies), and naturally there’s biological variability as well.

Results

Let’s first take a look at the dataset:

wt_preds <- read_csv("07_23_19_wt_TOTO_calCurve_estimates.csv")

wt_preds %>% kable() %>% kable_styling(bootstrap_options = "condensed") %>% 
    scroll_box(width = "100%", height = "400px")
Wavelength Well FluorInt Condition Colony_number Replicate pred eDNA_ug_mL eDNA_dsDNA_uM
535 A2 22868 Control NA NA 1.5861354 84.593889 253.781666
535 D1 217 noDye 1 1 -0.0197467 -1.053155 -3.159467
535 D2 168 noDye 2 1 -0.0232206 -1.238432 -3.715297
535 D3 157 noDye 3 1 -0.0240005 -1.280025 -3.840075
535 D4 173 noDye 4 1 -0.0228661 -1.219526 -3.658580
535 D5 159 noDye 5 1 -0.0238587 -1.272463 -3.817388
535 D6 152 noDye 6 1 -0.0243550 -1.298931 -3.896793
535 E1 10665 TOTO 1 1 0.7209825 38.452399 115.357197
535 E2 9108 TOTO 2 1 0.6105963 32.565134 97.695401
535 E3 6415 TOTO 3 1 0.4196713 22.382471 67.147414
535 E4 7289 TOTO 4 1 0.4816351 25.687205 77.061614
535 E5 10798 TOTO 5 1 0.7304118 38.955293 116.865880
535 E6 6794 TOTO 6 1 0.4465412 23.815531 71.446592
535 F1 12517 TOTO 1 2 0.8522832 45.455107 136.365320
535 F2 6882 TOTO 2 2 0.4527801 24.148273 72.444818
535 F3 9006 TOTO 3 2 0.6033648 32.179456 96.538366
535 F4 6559 TOTO 4 2 0.4298805 22.926958 68.780874
535 F5 8397 TOTO 5 2 0.5601887 29.876729 89.630188
535 F6 6527 TOTO 6 2 0.4276118 22.805961 68.417883
535 G1 13607 TOTO 1 3 0.9295607 49.576571 148.729711
535 G2 8622 TOTO 2 3 0.5761404 30.727490 92.182471
535 G3 7151 TOTO 3 3 0.4718513 25.165405 75.496214
535 G4 8291 TOTO 4 3 0.5526736 29.475926 88.427779
535 G5 7998 TOTO 5 3 0.5319009 28.368047 85.104140
535 G6 8059 TOTO 6 3 0.5362256 28.598697 85.796092

So you can see that we have the raw data from the wells (from the platereader), as well as a prediction of the well concentration, and biofilm concentration in uM (eDNA_dsDNA_uM). Look back at the other notebook to see how this was done. Anyway, what we have is a dataset that looks like this:

ggplot(wt_preds, aes(x = Colony_number, y = eDNA_dsDNA_uM, shape = Condition)) + 
    geom_point()

You can see that the wells incubated with dye show an obvious signal. There is heterogeneity among the different colonies (biological replicates), and the technical replicates. Let’s see if we can use some hierarchical models to accurately calculate an estimate for the eDNA content of the “hyper-colony”!

Model 1: 3 Levels

Ok, so for this first model I wrote out the following in brms syntax: eDNA_dsDNA_uM ~ 1 + (1 | Colony_number / Replicate)

You’ll see as we work through the other models, but briefly this means estimate the intercept (e.g. the value) for this eDNA variable (eDNA_dsDNA_uM ~ 1), but group this intercept into a two level hierarchy with colony and then replicate (+ (1 | Colony_number / Replicate)). You can see that we only have to set the prior for the intercept hyperparameter \(\text{Normal}(100, 30)\). This model takes a while to run, but we’ll cache the results in a file. With multiple levels of hierarchy it can also be hard to estimate, so I’ve ramped up the iterations to 10,000 and adapt delta to 0.95.

wt_brm <- brm(eDNA_dsDNA_uM ~ 1 + (1 | Colony_number/Replicate), 
    data = wt_preds %>% filter(Condition == "TOTO"), prior = prior(normal(100, 
        30), class = Intercept), sample_prior = T, chains = 4, 
    cores = 2, iter = 10000, control = list(adapt_delta = 0.95), 
    thin = 5, inits = "random", file = "wt_brm_model_3")


summary(wt_brm)
##  Family: gaussian 
##   Links: mu = identity; sigma = identity 
## Formula: eDNA_dsDNA_uM ~ 1 + (1 | Colony_number/Replicate) 
##    Data: wt_preds %>% filter(Condition == "TOTO") (Number of observations: 18) 
## Samples: 4 chains, each with iter = 10000; warmup = 5000; thin = 5;
##          total post-warmup samples = 4000
## 
## Group-Level Effects: 
## ~Colony_number (Number of levels: 6) 
##               Estimate Est.Error l-95% CI u-95% CI Eff.Sample Rhat
## sd(Intercept)    20.49      8.68     6.45    41.55       3378 1.00
## 
## ~Colony_number:Replicate (Number of levels: 18) 
##               Estimate Est.Error l-95% CI u-95% CI Eff.Sample Rhat
## sd(Intercept)     9.89      5.74     0.53    21.82       1339 1.00
## 
## Population-Level Effects: 
##           Estimate Est.Error l-95% CI u-95% CI Eff.Sample Rhat
## Intercept    92.81      9.21    74.44   111.55       3680 1.00
## 
## Family Specific Parameters: 
##       Estimate Est.Error l-95% CI u-95% CI Eff.Sample Rhat
## sigma    11.08      5.37     2.05    22.10        861 1.00
## 
## Samples were drawn using sampling(NUTS). For each parameter, Eff.Sample 
## is a crude measure of effective sample size, and Rhat is the potential 
## scale reduction factor on split chains (at convergence, Rhat = 1).

With these settings it seems like we get nice convergence and reasonable parameter estimates. Here’s the chains and posteriors:

plot(wt_brm, N = 4, ask = F)

All the chains seem to be well mixed. b_Intercept is the hyperparameter for the intercept and the posterior looks reasonably good. The sd and sigma variables have some weirder distributions, but I think that’s to be expected for this dataset with relatively few datapoints.

prior_summary(wt_brm) %>% kable() %>% kable_styling()
prior class coef group resp dpar nlpar bound
normal(100, 30) Intercept
student_t(3, 0, 19) sd
sd Colony_number
sd Intercept Colony_number
sd Colony_number:Replicate
sd Intercept Colony_number:Replicate
student_t(3, 0, 19) sigma

Just for clarity, these were the prior distributions that brms assumed for this model. So you can see there’s the hyperprior I supplied, then it’s a little harder to undertand. There seems to be a std.dev prior from the hyperparameter, then at the colony level there’s a sd for the sd and the intercept and same thing for each replicate?

I still don’t fully understand what brms is doing here, but also notice that there’s a sigma at bottom. My understanding is that brms has assumed that this is a gaussian process and that sigma should connect any group level parameters to the actual datapoints. Therefore, I believe, this model actually has too many levels and that each datapoint is accidentally in its own group, which is not correct.

We’ll try a simpler version of this model next.

Here’s what the hierarchical estimates look like plotted:

# ggplot(data = posterior_samples(wt_brm)) +
# geom_density(aes(x = b_Intercept))

# as_tibble(coef(wt_brm)$Colony_number, rownames =
# 'Colony_number')


ggplot(wt_preds %>% filter(Condition == "TOTO"), aes(x = factor(Colony_number), 
    y = eDNA_dsDNA_uM)) + geom_hline(yintercept = as_tibble(fixef(wt_brm))$Estimate, 
    linetype = 2) + geom_hline(yintercept = as_tibble(fixef(wt_brm))$Q2.5, 
    linetype = 2, color = "gray") + geom_hline(yintercept = as_tibble(fixef(wt_brm))$Q97.5, 
    linetype = 2, color = "gray") + geom_jitter(height = 0, width = 0.1, 
    shape = 21) + geom_pointrange(data = as_tibble(coef(wt_brm)$Colony_number, 
    rownames = "Colony_number"), aes(x = factor(Colony_number), 
    y = Estimate.Intercept, ymin = Q2.5.Intercept, ymax = Q97.5.Intercept))

The point ranges are 95% CI for each colony, and the dotted lines are 95% CI for the hyperparameter.

Model 2: 2 Levels!

Ok, with my realization on the last model, I believe we can just rewrite this hierarchy in brms syntax as eDNA_dsDNA_uM ~ 1 + (1 | Colony_number). Comparing with above you can see that we have now told brms to only group the intercept values by colony number. Since brms assumes this is a gaussian model, it should connect the group level estimates from each colony to the observed datapoints with a normal distribution parameterized by sigma (as seen in the prior summary above). Let’s see how it works!

wt_brm_single_level <- brm(eDNA_dsDNA_uM ~ 1 + (1 | Colony_number), 
    data = wt_preds %>% filter(Condition == "TOTO"), prior = prior(normal(100, 
        30), class = Intercept), sample_prior = T, chains = 4, 
    cores = 2, iter = 10000, control = list(adapt_delta = 0.95), 
    thin = 5, inits = "random", file = "wt_brm_single_level_1")

summary(wt_brm_single_level)
##  Family: gaussian 
##   Links: mu = identity; sigma = identity 
## Formula: eDNA_dsDNA_uM ~ 1 + (1 | Colony_number) 
##    Data: wt_preds %>% filter(Condition == "TOTO") (Number of observations: 18) 
## Samples: 4 chains, each with iter = 10000; warmup = 5000; thin = 5;
##          total post-warmup samples = 4000
## 
## Group-Level Effects: 
## ~Colony_number (Number of levels: 6) 
##               Estimate Est.Error l-95% CI u-95% CI Eff.Sample Rhat
## sd(Intercept)    21.21      8.69     7.91    41.45       3039 1.00
## 
## Population-Level Effects: 
##           Estimate Est.Error l-95% CI u-95% CI Eff.Sample Rhat
## Intercept    92.34      9.27    73.56   111.27       3644 1.00
## 
## Family Specific Parameters: 
##       Estimate Est.Error l-95% CI u-95% CI Eff.Sample Rhat
## sigma    15.36      3.55    10.25    24.02       2799 1.00
## 
## Samples were drawn using sampling(NUTS). For each parameter, Eff.Sample 
## is a crude measure of effective sample size, and Rhat is the potential 
## scale reduction factor on split chains (at convergence, Rhat = 1).

Again, the chains seem to have mixed well. Notice that we have a lot more effective samples and fewer parameter estimates now - this looks more correct.

plot(wt_brm_single_level)

Again the chains look well mixed, and now the sd and sigma posteriors actually look unimodal, which is better. Let’s look at the prior summary:

# coef(wt_brm_single_level)

prior_summary(wt_brm_single_level) %>% kable() %>% kable_styling()
prior class coef group resp dpar nlpar bound
normal(100, 30) Intercept
student_t(3, 0, 19) sd
sd Colony_number
sd Intercept Colony_number
student_t(3, 0, 19) sigma

This looks more like what I would expect. I still need to understand exactly how brms is using the sd class priors though…

Anyway, let’s compare the prior and posterior for our hyperparameter:

priorPostSamples <- bind_rows(posterior_samples(wt_brm_single_level) %>% 
    select(b_Intercept) %>% mutate(dist = "posterior"), tibble(b_Intercept = rnorm(4000, 
    100, 30)) %>% mutate(dist = "prior"))

ggplot(priorPostSamples, aes(x = b_Intercept, y = dist)) + geom_density_ridges(jittered_points = T, 
    position = "raincloud", point_alpha = 0.02) + theme_ridges()

Not bad! The data definitely informed the estimate, and the estimate seems reasonable based on the data. Let’s look at the group vs overall estimates:

ggplot(wt_preds %>% filter(Condition == "TOTO"), aes(x = factor(Colony_number), 
    y = eDNA_dsDNA_uM)) + geom_hline(yintercept = as_tibble(fixef(wt_brm_single_level))$Estimate, 
    linetype = 2) + geom_hline(yintercept = as_tibble(fixef(wt_brm_single_level))$Q2.5, 
    linetype = 2, color = "gray") + geom_hline(yintercept = as_tibble(fixef(wt_brm_single_level))$Q97.5, 
    linetype = 2, color = "gray") + geom_jitter(height = 0, width = 0.1, 
    shape = 21) + geom_pointrange(data = as_tibble(coef(wt_brm_single_level)$Colony_number, 
    rownames = "Colony_number"), aes(x = factor(Colony_number), 
    y = Estimate.Intercept, ymin = Q2.5.Intercept, ymax = Q97.5.Intercept))

You can see that for colonies 1 and 5 the confidence interval is pulling toward the overall estimate. The effects of the hierarchy aren’t super dramatic, but I think this makes a lot of sense. Just to fully vizualize the effects of the hierarchy, why don’t we compare to a model with no hierarchy!?

Model 3: No hierarchy

Here the model is simply eDNA_dsDNA_uM ~ 1:

wt_brm_no_level <- brm(eDNA_dsDNA_uM ~ 1, data = wt_preds %>% 
    filter(Condition == "TOTO"), prior = prior(normal(100, 30), 
    class = Intercept), sample_prior = T, chains = 4, cores = 2, 
    iter = 10000, control = list(adapt_delta = 0.95), thin = 5, 
    inits = "random", file = "wt_brm_no_level_1")

summary(wt_brm_no_level)
##  Family: gaussian 
##   Links: mu = identity; sigma = identity 
## Formula: eDNA_dsDNA_uM ~ 1 
##    Data: wt_preds %>% filter(Condition == "TOTO") (Number of observations: 18) 
## Samples: 4 chains, each with iter = 10000; warmup = 5000; thin = 5;
##          total post-warmup samples = 4000
## 
## Population-Level Effects: 
##           Estimate Est.Error l-95% CI u-95% CI Eff.Sample Rhat
## Intercept    92.12      5.64    81.24   103.15       3832 1.00
## 
## Family Specific Parameters: 
##       Estimate Est.Error l-95% CI u-95% CI Eff.Sample Rhat
## sigma    24.37      4.37    17.55    34.42       3944 1.00
## 
## Samples were drawn using sampling(NUTS). For each parameter, Eff.Sample 
## is a crude measure of effective sample size, and Rhat is the potential 
## scale reduction factor on split chains (at convergence, Rhat = 1).

The sampling seems to have gone well and the CI looks tighter.

no_one_level_Samples <- bind_rows(posterior_samples(wt_brm_single_level) %>% 
    select(b_Intercept) %>% mutate(dist = "two level"), posterior_samples(wt_brm_no_level) %>% 
    select(b_Intercept) %>% mutate(dist = "no hier"), tibble(b_Intercept = rnorm(4000, 
    100, 30)) %>% mutate(dist = "prior"))

ggplot(no_one_level_Samples, aes(x = b_Intercept, y = dist)) + 
    geom_density_ridges(jittered_points = T, position = "raincloud", 
        point_alpha = 0.02) + theme_ridges()

Indeed, we can see that no hierarchy model is actually more confident in a tighter region than the hierarchy - why might that be?

Let’s fit models for each group individually (no hierarchy, but no pooling). These models get fit in parallel as a list, so it’s hard to show the output. I’ll just show the parameter estimate from colony 1 to convince you it worked:

wt_preds_list <- wt_preds %>% filter(Condition == "TOTO") %>% 
    group_split(Colony_number)

wt_brm_multi_flat <- brm_multiple(eDNA_dsDNA_uM ~ 1, data = wt_preds_list, 
    prior = prior(normal(100, 30), class = Intercept), sample_prior = T, 
    chains = 4, cores = 2, iter = 4000, control = list(adapt_delta = 0.95), 
    inits = "random", combine = F)

as_tibble(fixef(wt_brm_multi_flat[[1]]))
## # A tibble: 1 x 4
##   Estimate Est.Error  Q2.5 Q97.5
##      <dbl>     <dbl> <dbl> <dbl>
## 1     129.      11.7  102.  150.

Comparing Hierarchy vs. Flat

Now let’s do something cool Let’s overlay the 2 level hierarchical estimates with the no hierarchy pooled and unpooled estimates to vizualize the effect of the hierarchy:

wt_no_level_coef <- bind_rows(as_tibble(fixef(wt_brm_multi_flat[[1]])), 
    as_tibble(fixef(wt_brm_multi_flat[[2]]), ), as_tibble(fixef(wt_brm_multi_flat[[3]])), 
    as_tibble(fixef(wt_brm_multi_flat[[4]])), as_tibble(fixef(wt_brm_multi_flat[[5]])), 
    as_tibble(fixef(wt_brm_multi_flat[[6]]))) %>% mutate(Colony_number = 1:6)

ggplot(wt_preds %>% filter(Condition == "TOTO"), aes(x = factor(Colony_number), 
    y = eDNA_dsDNA_uM)) + geom_hline(yintercept = as_tibble(fixef(wt_brm_single_level))$Estimate, 
    linetype = 2) + geom_hline(yintercept = as_tibble(fixef(wt_brm_single_level))$Q2.5, 
    linetype = 2, color = "gray") + geom_hline(yintercept = as_tibble(fixef(wt_brm_single_level))$Q97.5, 
    linetype = 2, color = "gray") + geom_hline(yintercept = as_tibble(fixef(wt_brm_no_level))$Estimate, 
    linetype = 2, color = "red") + geom_hline(yintercept = as_tibble(fixef(wt_brm_no_level))$Q2.5, 
    linetype = 2, color = "red") + geom_hline(yintercept = as_tibble(fixef(wt_brm_no_level))$Q97.5, 
    linetype = 2, color = "red") + geom_jitter(height = 0, width = 0.1, 
    shape = 21) + geom_pointrange(data = wt_no_level_coef, aes(x = factor(Colony_number), 
    y = Estimate, ymin = Q2.5, ymax = Q97.5), color = "red", 
    position = position_nudge(x = -0.1)) + geom_pointrange(data = as_tibble(coef(wt_brm_single_level)$Colony_number, 
    rownames = "Colony_number"), aes(x = factor(Colony_number), 
    y = Estimate.Intercept, ymin = Q2.5.Intercept, ymax = Q97.5.Intercept), 
    position = position_nudge(x = +0.1))

The hierarchical model results are in black and dashed gray, and the no hier models are in red. You can see the hierarchy changes the group level estimates slightly - colony 1 and 6 are shifted down / up, colonies 2, 3, and 5 have more confident estimates. However, the hyperparameter estimates are nearly identical. In the end we had a pretty balanced dataset, so the hierarchy had a small but limited effect. However, this was a really interesting exercise that should work well on future data!


sessionInfo()
## R version 3.5.2 (2018-12-20)
## Platform: x86_64-apple-darwin15.6.0 (64-bit)
## Running under: macOS Mojave 10.14.6
## 
## Matrix products: default
## BLAS: /Library/Frameworks/R.framework/Versions/3.5/Resources/lib/libRblas.0.dylib
## LAPACK: /Library/Frameworks/R.framework/Versions/3.5/Resources/lib/libRlapack.dylib
## 
## locale:
## [1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
## 
## attached base packages:
## [1] stats     graphics  grDevices utils     datasets  methods   base     
## 
## other attached packages:
##  [1] mice_3.6.0        lattice_0.20-38   tidybayes_1.1.0  
##  [4] brms_2.9.0        Rcpp_1.0.1        ggridges_0.5.1   
##  [7] kableExtra_1.0.1  knitr_1.23        hms_0.4.2        
## [10] lubridate_1.7.4   viridis_0.5.1     viridisLite_0.3.0
## [13] modelr_0.1.2      broom_0.5.1       cowplot_0.9.4    
## [16] forcats_0.3.0     stringr_1.3.1     dplyr_0.8.1      
## [19] purrr_0.2.5       readr_1.3.1       tidyr_0.8.2      
## [22] tibble_2.1.3      ggplot2_3.2.0     tidyverse_1.2.1  
## 
## loaded via a namespace (and not attached):
##   [1] minqa_1.2.4               colorspace_1.4-0         
##   [3] rsconnect_0.8.13          ggstance_0.3.2           
##   [5] markdown_0.9              base64enc_0.1-3          
##   [7] rstudioapi_0.9.0          rstan_2.18.2             
##   [9] svUnit_0.7-12             DT_0.7                   
##  [11] mvtnorm_1.0-10            xml2_1.2.0               
##  [13] splines_3.5.2             bridgesampling_0.6-0     
##  [15] shinythemes_1.1.2         bayesplot_1.7.0          
##  [17] jsonlite_1.6              nloptr_1.2.1             
##  [19] shiny_1.2.0               compiler_3.5.2           
##  [21] httr_1.4.0                backports_1.1.3          
##  [23] assertthat_0.2.1          Matrix_1.2-15            
##  [25] lazyeval_0.2.1            cli_1.0.1                
##  [27] formatR_1.5               later_0.7.5              
##  [29] htmltools_0.3.6           prettyunits_1.0.2        
##  [31] tools_3.5.2               igraph_1.2.4.1           
##  [33] coda_0.19-2               gtable_0.2.0             
##  [35] glue_1.3.1                reshape2_1.4.3           
##  [37] cellranger_1.1.0          nlme_3.1-140             
##  [39] crosstalk_1.0.0           xfun_0.7                 
##  [41] ps_1.3.0                  lme4_1.1-21              
##  [43] rvest_0.3.2               mime_0.6                 
##  [45] miniUI_0.1.1.1            gtools_3.8.1             
##  [47] pan_1.6                   MASS_7.3-51.1            
##  [49] zoo_1.8-6                 scales_1.0.0             
##  [51] colourpicker_1.0          promises_1.0.1           
##  [53] Brobdingnag_1.2-6         parallel_3.5.2           
##  [55] inline_0.3.15             shinystan_2.5.0          
##  [57] yaml_2.2.0                gridExtra_2.3            
##  [59] loo_2.1.0                 StanHeaders_2.18.1-10    
##  [61] rpart_4.1-13              stringi_1.2.4            
##  [63] highr_0.7                 dygraphs_1.1.1.6         
##  [65] boot_1.3-20               pkgbuild_1.0.3           
##  [67] rlang_0.3.4               pkgconfig_2.0.2          
##  [69] matrixStats_0.54.0        evaluate_0.14            
##  [71] labeling_0.3              rstantools_1.5.1         
##  [73] htmlwidgets_1.3           tidyselect_0.2.5         
##  [75] processx_3.2.1            plyr_1.8.4               
##  [77] magrittr_1.5              R6_2.4.0                 
##  [79] generics_0.0.2            mitml_0.3-7              
##  [81] pillar_1.3.1              haven_2.0.0              
##  [83] withr_2.1.2               xts_0.11-2               
##  [85] nnet_7.3-12               survival_2.43-3          
##  [87] abind_1.4-5               crayon_1.3.4             
##  [89] jomo_2.6-9                arrayhelpers_1.0-20160527
##  [91] rmarkdown_1.13            grid_3.5.2               
##  [93] readxl_1.2.0              callr_3.1.1              
##  [95] threejs_0.3.1             digest_0.6.18            
##  [97] webshot_0.5.1             xtable_1.8-3             
##  [99] httpuv_1.4.5.1            stats4_3.5.2             
## [101] munsell_0.5.0             shinyjs_1.0