library(tidyverse) # for data wrangling
library(gridExtra) # for multi-panel plots
library(lme4) # for fitting random-effects models
library(nlme) # for fitting random-effect and gls models
library(glmmTMB) # for fitting random-effects models
library(modelsummary) # for creating output tables
library(kableExtra) # for tables
options(kableExtra.html.bsTable = T)
library(ggplot2)# for plotting
library(performance) # for model diagnostics
library(ggeffects) # for effect plots
19 Generalized linear mixed effects models (GLMMs)
Learning objectives
Learn how to implement mixed models in R for count and binary data using generalized linear mixed effect models (GLMMs)
Gain an appreciation for why generalized linear mixed effects can be difficult to fit
Understand why parameters in mixed-effects models may have a different interpretation when modeling using a non-linear link function
Learn how to estimate population-level response patterns for count and binary data using GLMMs by integrating over the distribution of the random effects
Be able to describe models and their assumptions using equations and text and match parameters in these equations to estimates in computer output.
We will begin by briefly reviewing modeling approaches that are appropriate for repeated measures data where the response variable, \(Y\), conditional on predictor variables, \(X\) is Normally distributed. These methods include linear mixed-effects models (Chapter 18) and generalized least squares (Section 18.14). We will then consider modeling approaches for repeated measures in cases where \(Y | X\) has a distribution other than a Normal distribution, including generalized linear mixed effect models (GLMMs) in this Section, and Generalized Estimating Equations (GEE) in Chapter 20. Throughout, we will draw heavily upon the material in Fieberg et al. (2009), which explores data from a study of mallard nesting structures, including the clutch size data from Chapter 4.
19.1 R packages
We begin by loading a few packages upfront:
In addition, we will use data and functions from the following packages:
Data4Ecologists
for theclutch
data setGLMMadaptive
for fitting generalized linear mixed effect models and estimating marginal response patterns from themcar
for multiple degree-of-freedom hypothesis tests using theAnova
function
19.2 Case study: A comparison of single and double-cylinder nest structures
In the late 90’s, the Minnesota Department of Natural Resources (MN DNR) conducted a study to compare the cost-effectiveness of single- and double-cylinder mallard nesting structures and to identify the best places to locate them in the landscape (Zicus, Fieberg, and Rave 2003; Zicus, Rave, and Fieberg 2006; Zicus et al. 2006). A drawing of a single cylinder structure by a former MN DNR employee, Ross Hier, who happens to be a fantastic artist is depicted below (Figure 19.1); a double-cylinder structure would have two cylinders associated with a single pole. To accomplish the project objectives, 110 nest structures (53 single-cylinder and 57 double-cylinder) were placed in 104 wetlands (the largest eight wetlands included two structures each). The structure type (single or double) was randomly chosen for each location. The nest structures were inspected \(\ge 4\) times annually from 1997-1999 to determine whether or not each nest cylinder was occupied. In addition, clutch sizes were recorded for 139 nests during the course of the study. Thus, we have multiple observations for each nest structure, and we might expect our response variables (structure occupancy or clutch size) from the same structure to be more similar than observations from different structures even after accounting for known covariates.
19.4 Extentions to Count and Binary Data
Unlike the multivariate Normal distribution, the Poisson, Negative Binomial, Bernoulli, and Binomial distributions we have used to model count and binary data are not parameterized in terms of separate mean and variance parameters. Recall from Chapter 9 that the variance of most commonly used distributions are a function of the same parameters that describe the mean. For example, the mean and variance of the Poisson distribution are both equal to \(\lambda\) and the mean and variance of the Bernoulli distribution are a function of \(p\) (the mean = \(p\) and the variance = \(p(1-p)\)). Thus, there are no multivariate analogues of these distributions that will allow us to construct marginal models with separate parameters describing the mean and the variance/covariance of binary or count data (as in the previous section).
We will explore 2 options for extending generalized linear models to data containing repeated measures:
- Add random effects to the linear predictor, leading to generalized linear mixed effect models (Section 19.5).
- Explicitly model the mean and variance/covariance of our data without assuming a particular distribution. We can then appeal to large sample (asymptotic) results to derive an appropriate sampling distribution of our regression parameter estimators under an assumption that our clusters are independent sample units. This is the Generalized Estimating Equations (GEE) approach (Chapter 20).
As we will see, the former approach leads to a model for the conditional (subject-specific) means whereas the latter approach is parameterized in terms of the population mean. These two means will not, in general, be equivalent when using a non-linear link function (e.g., logit or log).
19.5 Generalized linear mixed effects models
Recall from Chapter 14 that generalized linear models assume that some transformation of the the mean, \(g(\mu_i)\), can be modeled as a linear function of predictors and regression parameters:
\(g(\mu_i) = \eta_i = \beta_0 + \beta_1x_1 + \ldots \beta_px_p\)
Further, we assume that the response variable has a particular distribution (e.g., Normal, Poisson, Bernoulli, Gamma, etc), \(f(y_i; \theta)\), that describes unmodeled variation about \(\mu_i = E[Y_i|X_i]\) using one or more parameters, \(\theta\):
\(Y_i \sim f(y_i; \theta), i=1, \ldots, n\)
We can easily add random effects to the model for the transformed mean (e.g., random intercepts and random slopes). In this case, we will have two random components to the model:
- The random effects, \(b \sim N(0, \Sigma_b)\); and
- The conditional distribution of \(Y_i\) given the random effects, \(Y_i | b \sim f(y_i; \theta, b)\).
It is important to recognize that the marginal distribution of \(Y_i\), which is determined by integrating over the random effects, will not generally be the same as the conditional distribution. In fact, we may not even be able to write down an analytical expression for the marginal distribution. This contrasts with the case for linear mixed effects models in which both the conditional and marginal distributions turned out to be Normal (Section 18.14). These differences have important implications when it comes to model fitting (Section 19.7) and parameter interpretation (Section 19.8).
When describing generalized linear mixed effects models using equations, it is most appropriate to specify the model in terms of the conditional distribution of \(Y_i\) given the random effects and the random effects distribution. Examples are given below for a random intercept and slope model with a single covariate, \(X\):
Poisson-normal model:
\[\begin{gather} Y_{ij} | b_i \sim \text{ Poisson}(\lambda_{ij}) \nonumber \\ \log(\lambda_{ij}) = (\beta_0 + b_{0i}) + (\beta_1 + b_{1i})X_{ij} \nonumber \\ (b_{0i}, b_{1i}) \sim N(0, \Sigma_b) \nonumber \end{gather}\]
Logistic-normal model:
\[\begin{gather} Y_{ij} | b_i \sim \text{ Bernoulli}(p_{ij}) \nonumber \\ \text{logit}(p_{ij}) = (\beta_0 + b_{0i}) + (\beta_1 + b_{1i})X_{ij} \nonumber \\ (b_{0i}, b_{1i}) \sim N(0, \Sigma_b) \nonumber \end{gather}\]
19.6 Modeling nest occupancy using generalized linear mixed effects models
Zicus et al. (2006) modeled the probability that a nesting structure would be occupied as a function of the type of structure (single versus double cylinder; deply
), the size of the wetland holding the structure (wetsize
), a measure of the extent of surrounding cover affecting a nest’s visibility (vom
), and the year (year
) and observation period (period
). They also included interactions between year
and period
and between period
and vom
. Lastly, they included a random intercept for nest structure.
Below, we access the data from the Data4Ecologists
package, convert several variables to factors, and then fit this model using the glmer
function in the lme4
package, specifying that we want to use the bobyqa
optimizer to avoid warnings that appear when using the default optimizer:
# load data from Data4Ecologists package
data(nestocc)
# turn deply, period, year, strtno into factor variables
$deply <- as.factor(nestocc$deply)
nestocc$period <- as.factor(nestocc$period)
nestocc$year <- as.factor(nestocc$year)
nestocc$strtno <- as.factor(nestocc$strtno)
nestocc# remove observations with missing records and fit model
<- nestocc %>% filter(complete.cases(.))
nestocc <- glmer(nest ~ year + period + deply + vom +
nest.ri * period + vom * period + poly(wetsize, 2) + (1 | strtno),
year family = binomial(), data = nestocc, control = glmerControl(optimizer = "bobyqa"))
Given the large number of fixed effects parameters, it is easiest to describe the model using a combination of text (as in the introduction to this Section) and simplified equations that refer to the predictors that appear in the model’s design matrix:
\[\begin{gather} Y_{ij} | b_i \sim Bernoulli(p_{ij}) \nonumber \\ \log[p_{ij}/(1-p_{ij})] = X_{ij}\beta + b_{0i} \nonumber \\ b_{0i} \sim N(0, \sigma^2_{b_0}), \nonumber \end{gather}\]
where \(Y_{ij}\) is equal to 1 if the \(i^{th}\) nest structure was occupied during the \(j^{th}\) sampling period and 0 otherwise, \(p_{ij}\) represents the probability that structure \(i\) is occupied during the \(j^{th}\) sampling period, and \(X_{ij}\) contains the vector of predictors in the design matrix associated with structure \(i\) during the \(j^{th}\) sampling period; \(X_{ij}\) includes a set of 1’s associated with the overall intercept, an indicator variable for whether the structure was a double-cylinder structure (I(deply==2)
), basis vectors for linear and quadratic effects of wetland size, the level of visual obstruction (vom
; i.e., how easy it is for predators to see the nest), indicator variables for year and sampling period, and interactions between year and sampling period and between sampling period and visual obstruction. If we wanted to look at the first few observations in the design matrix, we could do so using the model.matrix
function:
1:3,] nestocc[
strtno year period vom deply nest wetsize
1 1 1997 1 0.2255628 2 0 24
2 1 1997 2 0.2274146 2 0 24
3 1 1997 3 0.8913011 2 0 24
model.matrix(nest.ri)[1:3,]
(Intercept) year1998 year1999 period2 period3 period4 deply2 vom poly(wetsize, 2)1 poly(wetsize, 2)2 year1998:period2 year1999:period2 year1998:period3
1 1 0 0 0 0 0 1 0.2255628 0.08133736 0.02148962 0 0 0
2 1 0 0 1 0 0 1 0.2274146 0.08133736 0.02148962 0 0 0
3 1 0 0 0 1 0 1 0.8913011 0.08133736 0.02148962 0 0 0
year1999:period3 year1998:period4 year1999:period4 period2:vom period3:vom period4:vom
1 0 0 0 0.0000000 0.0000000 0
2 0 0 0 0.2274146 0.0000000 0
3 0 0 0 0.0000000 0.8913011 0
Note that the model considers the effects of 3 different categorical predictors, year
, period
, and strtno
. The first two are modeled using fixed effects with separate intercept parameters to distinguish among the different year
and period
combinations. The column of 1’s in the design matrix is associated with the overall intercept, which reflects year
= 1997 and period
= 1. The columns in the design matrix associated with year1998
and year1999
capture differences between years 1998 and 1997 and 1999 and 1997, but only during the reference period, i.e., period
= 1, due to the interaction between year
and period
. Similarly, the columns in the design matrix associated with period2
, period3
, and period4
contrast these periods with period 1 during the reference year, 1997. The columns in the design matrix that code for the interactions capture differences in differences – e.g., year1998:period2
quantifies how the difference between period 2 and period 1 differ between 1998 and 1997 (the reference year). Or, we can view this parameter as quantifying how the difference between years 1998 and 1997 differ between period 2 and period 1 (the reference period). If you find this discussion confusing, it might help to review design matrices from Chapter 3, but it is not particularly important to understanding the key points of this Section.
Fixed or random effects: We model the effects of period
and year
using fixed effects, because there are only 4 periods of interest, and we only observed data from 3 years. Therefore, estimating a variance for period
makes little sense and would be extremely challenging for year
. By contrast, we observed data from 110 different nesting structures, making it possible to estimate variability among structures. In addition, we likely want to make inference to the population of structures that could be located in the landscape. Thus, we will model the effect of strtno
using random effects. We can think of the random intercepts for each structure as capturing unmeasured characteristics associated with the different structures that make them more or less attractive to mallards, and thus, allow the structures to have different propensities of being occupied.
Let’s now look at the summary output:
summary(nest.ri)
Generalized linear mixed model fit by maximum likelihood (Laplace Approximation) ['glmerMod']
Family: binomial ( logit )
Formula: nest ~ year + period + deply + vom + year * period + vom * period + poly(wetsize, 2) + (1 | strtno)
Data: nestocc
Control: glmerControl(optimizer = "bobyqa")
AIC BIC logLik deviance df.resid
778.0 878.8 -369.0 738.0 1120
Scaled residuals:
Min 1Q Median 3Q Max
-1.8896 -0.3290 -0.1902 -0.1138 5.8768
Random effects:
Groups Name Variance Std.Dev.
strtno (Intercept) 1.91 1.382
Number of obs: 1140, groups: strtno, 109
Fixed effects:
Estimate Std. Error z value Pr(>|z|)
(Intercept) -4.2418 0.7097 -5.977 2.27e-09 ***
year1998 1.2322 0.4409 2.795 0.005189 **
year1999 1.1736 0.4551 2.579 0.009917 **
period2 1.5982 0.8289 1.928 0.053834 .
period3 3.2272 2.2386 1.442 0.149415
period4 6.1326 2.7068 2.266 0.023474 *
deply2 0.1077 0.3650 0.295 0.767830
vom 5.0353 1.9255 2.615 0.008920 **
poly(wetsize, 2)1 19.5886 5.9114 3.314 0.000921 ***
poly(wetsize, 2)2 -16.4712 5.5945 -2.944 0.003238 **
year1998:period2 -2.2450 0.7006 -3.205 0.001353 **
year1999:period2 -1.4567 0.6602 -2.206 0.027362 *
year1998:period3 -1.7489 1.0569 -1.655 0.097975 .
year1999:period3 -1.3053 1.0435 -1.251 0.210953
year1998:period4 -1.8957 0.7725 -2.454 0.014133 *
year1999:period4 -1.5073 0.7327 -2.057 0.039665 *
period2:vom -2.9871 2.6549 -1.125 0.260524
period3:vom -6.6757 3.1628 -2.111 0.034802 *
period4:vom -7.7030 2.3906 -3.222 0.001272 **
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Correlation matrix not shown by default, as p = 19 > 12.
Use print(x, correlation=TRUE) or
vcov(x) if you need it
The summary output is similar to what we saw when using lmer
. We see that the variance of \(b_{0i}\) is given by 1.924 and the standard deviation of the \(b_{0i} = \sqrt{Var(b_{0i})} = 1.387\). We also see a list of fixed effects parameters, associated standard errors, and p-values for null hypothesis tests (that the parameters are 0) based on a Normality assumption, which may be reasonable for large samples due to using maximum likelihood (see Section 10.6). Other than the complications associated with interpreting parameters in models with interactions and the need to specify an optimizer other than the default choice, everything seems pretty straightforward on the surface.
19.7 Parameter estimation
In the last section, we specified a different optimizer than the default choice when using glmer
in order to estimate parameters without a convergence warning. It turns out that estimating parameters in generalized linear mixed effects models is much more challenging than when fitting generalized linear models (Chapter 14) or linear mixed effects models with Normal responses (Chapter 18). If you don’t believe me, try fitting the same model using lmer
(after dropping the family=binomial()
and control=glmerControl(optimizer="bobyqa")
) arguments; you will have results almost immediately. Why is it so much more difficult to fit GLMMs than LMMs?
Well, before we can estimate model parameters, we have to determine the marginal likelihood of the data. We specify the GLMM in terms of a conditional model, \(Y_{ij} | b_i\). To determine the marginal distribution of \(Y_{ij}\), we need to integrate over the random effects distribution:
\[ L(\beta, \Sigma_b; Y_{ij}) = \prod_{i=1}^n \int f(Y_{ij} | b_i)f(b_i)db_i \tag{19.8}\]
where \(\beta\) contains our fixed effects parameters, \(\Sigma_b\) are variance parameters associated with the random-effects distribution, \(f(Y_{ij} |b_i)\) is the conditional distribution of \(Y\) given the random effects, and \(f(b_i)\) is the distribution of the random effects; Typically, \(f(Y_{ij} |b_i)\) will be a distribution from the exponential family ( e.g., Bernoulli, Poisson), and \(f(b_i) \sim N(0, \Sigma_b)\).
Think-Pair-Share: write down the different components of the likelihood for the nest occupancy model.
For our nest occupancy model, we have:
- \(f(Y_{ij} |b_i) \sim \text{ Bernoulli}(p_{ij}) = p_{ij}^{Y_{ij}}(1-p_{ij})^{1-Y_{ij}}\), with \(\text{logit}(p_{ij}) = X_{ij}\beta+b_{0i} \Rightarrow p_{ij} = \frac{\exp(X_{ij}\beta+b_{0i})}{1+exp(X_{ij}\beta+b_{0i})}\)
- \(f(b_i)\) \(\sim N(0, \sigma^2_{b}) = \frac{1}{\sqrt{2\pi}\sigma_{b}}\exp\left(\frac{b_{0i}^2}{2\sigma^2_{b}}\right)\)
Importantly, if we plug these distributions into our equation for the likelihood (Equation 19.8), we find that there is no closed form, analytical solution – i.e., we can’t write down an expression for the likelihood that does not involve an intractable integral. So, how does R estimate the parameters of the model? We have 3 options:
- Approximate the integrand, i.e., the expression for the likelihood, \(L(Y_{ij} | \beta, D)\), by a multivariate Normal distribution. This leads to a closed form, known solution. This approach is referred to as the Laplace approximation.
- Approximate the integral using simulation-based or numerical integration techniques. These approaches can be slow, particularly with multiple random effects since the integral needs to be solved numerically for each iteration of the numerical optimization routine.
- Add priors, and use Bayesian techniques to approximate the posterior distribution of \(\beta\) and \(\Sigma_b\).
Most users of glmer
, and GLMMs more generally, are blissfully unaware of these challenges except when they try to fit models that return convergence warnings. If, however, you look at the help page for glmer
, you will see that there is an argument, nAGQ
, with the following description:
integer scalar - the number of points per axis for evaluating the adaptive Gauss-Hermite approximation to the log-likelihood. Defaults to 1, corresponding to the Laplace approximation. Values greater than 1 produce greater accuracy in the evaluation of the log-likelihood at the expense of speed. A value of zero uses a faster but less exact form of parameter estimation for GLMMs by optimizing the random effects and the fixed-effects coefficients in the penalized iteratively reweighted least squares step. (See Details.)
Further down in the details section, you will also see this point addressed:
The expression for the likelihood of a mixed-effects model is an integral over the random effects space. For a linear mixed-effects model (LMM), as fit by lmer, this integral can be evaluated exactly. For a GLMM the integral must be approximated. The most reliable approximation for GLMMs is adaptive Gauss-Hermite quadrature, at present implemented only for models with a single scalar random effect. The nAGQ argument controls the number of nodes in the quadrature formula. A model with a single, scalar random-effects term could reasonably use up to 25 quadrature points per scalar integral.
Again, the reason we did not encounter this issue with linear mixed effects models is that for a linear mixed-effects model (LMM), as fit by lmer, this integral can be evaluated exactly (the marginal distribution is, in this case, given by a Normal distribution - see Section 18.14).
19.8 Parameter interpretation
In Section 19.3.1, we discussed how fixed-effects parameters in linear mixed-effects models can be interpreted in terms of changes in subject-specific means, population-averaged means, or the mean associated with a “typical” subject/site/individual with all random effects set to 0 (Fieberg et al. 2009). In generalized linear mixed effects where the random effects enter on a transformed scale, this equivalence will no longer hold in general.
More specifically:
- We cannot determine population-averaged response patterns by setting all random effects to 0 and then solving for \(\mu_{ij}= g^{-1}(X\beta)\). This approach will, however, give us the mean for a “typical” individual, i.e., an individual with all random effects equal to 0.
- When interpreting parameters and their impact on means, we need to hold the random effects, \(b_i\), constant. With non-linear transformations of the mean, parameters will then only have a subject-specific interpretation.
- To determine population-averaged means, we need to integrate over the random-effects distribution. Similar to our discussion regarding the marginal likelihood (see Section 19.7), there will often be no analytical expression for the population mean.
Consider, for example, the simplified version of our nest occupancy model given below:
\[\begin{gather} Y_i | b_i \sim Bernoulli(p_i) \nonumber \\ \log[p_{ij}/(1-p_{ij})] = \beta_0 + b_{0i} + \beta_1VOM_{ij} + \beta_2I(deply=2)_i \nonumber \\ b_{0i} \sim N(0, \sigma_{b_0}^2) \nonumber \end{gather}\]
In this case, \(\exp(\beta_1)\) represents the change in the odds of structure occupancy if we change \(VOM\) by 1 unit at a particular structure while keeping the structure’s deployment type (single or double cylinder) constant. Similarly, \(\exp(\beta_2)\) estimates the change in the odds of structure occupancy if we were to change the structure from a single to double cylinder model at a particular location with \(VOM\) held constant. The interpretation here focuses on a particular structure, since we have to hold both \(b_{0i}\) and all other fixed effects constant. As we will see shortly, these conditional (i.e., subject-specific) odds ratios will be larger than marginal (population-averaged) odds ratios.
It is important to note that there may be cases where a subject-specific interpretation is not of interest or may not even make sense. Consider, e.g., models that include the sex of individual animals in a study. Although many fish species can change their sex, we may be more interested in comparing differences between males and females at the population rather than individual level. In Section 19.8.1, we will discuss methods for estimating marginal means using parameters from fitted generalized linear mixed effects models, including a few specific cases where we can derive analytical expressions for marginal means from fitted GLMMs. We will also discuss ways to approximate marginal means using numerical integration techniques.
First, let’s explore differences between conditional and marginal means using a simple simulation framework involving a binary regression model with a random intercept and a single covariate. Consider a group of first graders learning how to play hockey for the first time1. One of the key challenges they will face is learning how to skate backwards. The more times they skate, the more likely they will be able to skate backwards without falling, but individuals may have more or less experience with ice skating before they begin the season. Let \(Y_{it}\) = 1 if individual \(i\) can skate the width of the ice backwards without falling at time \(t\) and 0 otherwise. Assume these data were collected pre-covid and that all skaters attended 10 practices. Thus, let \(X_{it} = t\) be the number of practices that individual \(i\) has attended by time \(t\). We will assume the probability of successfully skating backwards depends on the number of practices and can be modeled as:
\[\begin{gather} Y_{it} | b_i \sim Bernoulli(p_{it}) \nonumber \\ \log[p_{it}/(1-p_{it})] = \beta_0 + b_{0i} + \beta_1X_{it} \nonumber \\ b_{0i} \sim N(0, \sigma_{b_0}^2) \nonumber \end{gather}\]
The random intercept, \(b_{0i}\), accounts for among-individual differences in their initial abilities and experiences with skating. Let’s simulate 10 observations for each of 100 individuals, setting:
- \(\beta_0 = -6\)
- \(\beta_1 = 1\)
- \(\sigma_{b_0} = 2\)
# ## Simulation set up
set.seed(1200)
# Sample size
<- 100 # number of individuals
nindividuals <- 10 # number of observations per individual
nperindividual <- nindividuals * nperindividual # total number of observations
ntotal
# Data generating parameters
<- rnorm(nindividuals, 0, 2) # deviations from mean intercept
b0_i <- -6 # mean intercept
beta_0 <- 1 # slope for number of practices
beta_1
# Generate 10 different observations for each individual
# for practices 1 through 10
<- seq(1, 10, length = 10)
practice.num <- rep(1:nindividuals, each = nperindividual) # individual id
individual
# Y|boi is Bernouli p[ij]
<- beta_0 + rep(b0_i, each = nperindividual) + practice.num * beta_1
logitp <- plogis(logitp)
p <- rbinom(ntotal, size = 1, prob = p)
Y
# Create data set for model fitting
<- data.frame(Y = Y, practice.num = practice.num, individual = individual) skatedat
We can then fit the GLMM used to generate the data and show that we can recover the simulation parameters:
<- glmer(Y ~ practice.num + (1 | individual), family = binomial(), data = skatedat)
glmermod summary(glmermod)
Generalized linear mixed model fit by maximum likelihood (Laplace Approximation) ['glmerMod']
Family: binomial ( logit )
Formula: Y ~ practice.num + (1 | individual)
Data: skatedat
AIC BIC logLik deviance df.resid
766.9 781.6 -380.5 760.9 997
Scaled residuals:
Min 1Q Median 3Q Max
-5.0829 -0.2764 -0.0439 0.2765 6.4474
Random effects:
Groups Name Variance Std.Dev.
individual (Intercept) 4.909 2.216
Number of obs: 1000, groups: individual, 100
Fixed effects:
Estimate Std. Error z value Pr(>|z|)
(Intercept) -5.94865 0.48504 -12.26 <2e-16 ***
practice.num 0.99702 0.07088 14.07 <2e-16 ***
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Correlation of Fixed Effects:
(Intr)
practice.nm -0.862
For example, the true parameters are contained within Wald-based confidence intervals formed using a Normality assumption.
confint(glmermod, method="Wald")
2.5 % 97.5 %
.sig01 NA NA
(Intercept) -6.8993143 -4.997981
practice.num 0.8580962 1.135940
Now, let’s plot the average value of \(Y\) at each time point – i.e., the proportion of the 100 skaters that can successfully skate backward at each practice (Figure 19.3). We will also plot the predicted mean response curve formed by setting the random effect, \(b_{0i} = 0\). The latter response curve corresponds to the predicted probability that a “typical skater” (i.e., the median individual) will be able to skate backwards at each practice.
# Predictions for a "typical subject" with b0i=0
<- seq(1, 10, length = 10)
practice.num <- data.frame(practice.num = practice.num)
mod $glmer.logit <- fixef(glmermod)[1] + practice.num * fixef(glmermod)[2]
mod$glmer.p <- plogis(mod$glmer.logit)
mod
# Summarize average Y (i.e., proportion of skaters that successfully
# skate backwards) at each practice
<- skatedat %>%
mdat group_by(practice.num) %>%
::summarize(PopProportion = mean(Y))
dplyr
# Combine two curves and plot
<- data.frame(
mdat2 practice.num = rep(practice.num, 2),
p = c(mod$glmer.p, mdat$PopProportion),
type = rep(c("Probability, typical individual", "Sample proportion"), each = 10)
)
# Plot
ggplot(mdat2, aes(practice.num, p, col=type)) +
geom_line() + geom_point() +
xlab("Number of practices") +
ylab("Proportion of skaters that can skate backwards")
We see that the two curves do not line up – the predicted probability of skating backwards for a “typical individual” (with \(b_{0i} = 0\)) differs from the proportion of the sample that can skate backwards. This occurs because of the non-linear logit transformation, \(E[Y | X] \ne E[Y | X, b_{0i} = 0]\). To understand this mismatch, we next compare:
- the mean response curve for each individual on both the logit and probability scales
- the mean of these individual-specific curves on both the logit and probability scales
- the overall mean population response curve on both the logit and probability scales
We begin by computing the individual-specific curves, below:
<- NULL
pdat for (i in 1:nindividuals) {
<- beta_0 + b0_i[i] + practice.num * beta_1 # individual i's curve on logit scale
logitp.indiv <- plogis(logitp.indiv) # individual i's curve on the probability scale
p.indiv <- data.frame(p = p.indiv,
tempdata logitp = logitp.indiv,
individual = i)
<- rbind(pdat, tempdata)
pdat
} $practice.num<-practice.num pdat
We then calculate the average of the individual-specific curves on both the logit and probability scales:
<-pdat %>% group_by(practice.num) %>%
pop.patterns::summarize(meanlogitp=mean(logitp), meanp=mean(p)) dplyr
Lastly, we compare the individual-specific (i.e., subject-specific or conditional means; in black) to their average (in red) and to the population-average response curves (in blue) on both logit and probability response scales (Figure 19.4).
<- ggplot(pdat, aes(practice.num, logitp)) +
m1 geom_line(aes(group = individual)) + ylab("logit p") +
geom_line(data = pop.patterns, aes(practice.num, meanlogitp), col = "blue", linewidth = 1.2) +
geom_line(data = mod, aes(practice.num, glmer.logit), col = "red", linewidth = 1.2, linetype = "dashed") +
theme(legend.position = 'none') + xlab("Practice Number") +
ylab(expression(Logit(p))) + ggtitle("A)")
<- ggplot(pdat, aes(practice.num, p)) +
m2 geom_line(aes(group = individual)) + ylab("p") +
geom_line(data = pop.patterns, aes(practice.num, meanp), col = "blue", linewidth = 1.2) +
geom_line(data = mod, aes(practice.num, glmer.p), col = "red", linewidth = 1.2, linetype = "dashed") +
theme(legend.position = 'none') + xlab("Practice Number") +
ylab(expression(p[i])) + ggtitle("B)")
grid.arrange(m1, m2, ncol = 2)
Let’s begin by looking at the predicted curves on the logit scale (Figure 19.4 A). As expected, the logit probabilities of skating backwards increase linearly with the number of practices. Further, when we average the individual curves on the logit scale, we find that average of the individual-specific curves (in blue) and the curve for a typical individual with \(b_{0i}=0\) (in red) are identical (Figure 19.4 A).
Now, let’s look at the same curves but on the probability scale (Figure 19.4 B). We see that all individuals improve at the same rate (governed by \(\beta_1\)), but that they have different initial abilities (determined by \(b_{0i}\)). Thus, some individuals are able to skate backwards successfully by the 5th practice while others still have a low probability of skating backwards after 10 practices (Figure 19.4 B). The average skater (with \(b_{0i} = 0\)), plotted in red, is extremely unlikely to skate backwards successfully after the first practice, but is nearly guaranteed to be skating backwards by the end of the season. When we average these curves, we find that there is a small, but non-negligible proportion of skaters that can skate backwards after the first practice and similarly, a small and non-negligible proportion of skaters that cannot skate backwards after 10 practices. The population mean curve (in blue) is flatter than the curve for a typical individual (in red). This will always be the case when fitting a logistic regression model with a random intercept – the population mean response curve will be attenuated with slope that is smaller in absolute value than the curve formed by setting all random effects equal to 0 (Zeger, Liang, and Albert 1988; Fieberg et al. 2009).
If we are interested in directly modeling how the population mean (or proportion)2 changes as a function of one or more variables, we can consider Generalized Estimating Equations (Chapter 20). Alternatively, we can use the methods in the next section to approximate population mean response curves from fitted GLMMs.
19.8.1 Approximating marginal (population-average) means from fitted GLMMs
As discussed in Section 19.7, to calculate the marginal distribution (and thus, likelihood) of the data, we have to integrate over the distribution of the random effects. Unfortunately, there will not typically be a closed form solution to the integral that allows us to derive an analytical expression for the marginal distribution. Similarly, we will need to integrate over the distribution of the random effects to determine the marginal mean, \(E[Y|X] = \sum_y yf_{Y,b}(y) = \sum_y y \int f_{Y|b}(y)f_b(b)db\). Except for a few special cases (see Section 19.8.2), there will not generally be a closed form solution for this expression. Thus, we must resort to the same set of approaches discussed in Section 19.7 to derive an expression for the marginal mean. Specifically, we can:
- Approximate the integrand, \(\int f_{Y|b}(y)f_b(b)db\), in a way that gives us a closed form or known solution.
- Use numerical integration techniques to solve the integral.
- Use simulation techniques to approximate the solution to the integral.
19.8.2 Special cases
There are a few special cases where we can derive expressions for the marginal (population) means from the subject-specific parameters estimated when fitting generalized linear mixed effects models (Ritz and Spiegelman 2004). We list a few of the more notable cases here. We already highlighted that setting \(b_{0i} = 0\) in linear mixed effects models will give us an expression for the marginal (population-average) mean (Section 19.3.1); this equivalence holds more generally for models with an identity link. Another important example is the case where a random intercept is added to a Poisson regression model:
\[\begin{gather} Y_{ij} |b_{0i} \sim Poisson(\lambda_{ij}) \nonumber \\ \log(\lambda_{ij}) = X_{ij}\beta+ b_{0i} \nonumber \\ b_{0i} \sim N(0, \sigma_{b_0}) \nonumber \end{gather}\]
In this case, the population-averaged response pattern is determined by adjusting the intercept: \(\beta^M_{0} = \beta^C_{0} + \sigma_{b_0}^2/2\), where \(\beta^M_{0}\) and \(\beta^C_{0}\) are the intercepts in the expression for the marginal and conditional means, respectively, and \(\sigma_{b_0}^2\) is the variance of the random intercept. The slope parameters will not need adjusting and have both conditional (subject-specific) and marginal (population-averaged) interpretations. We can demonstrate this result with another simple simulation example, below.
Assume 100 students in an ornithology class from the University of Minnesota go birding 20 times during the course of the semester. Each time out, they look for birds for between 30 min and 2 hours, recording the time they spent surveying to the nearest 5 minutes. Let \(Y_{ij}\) = the number of birds seen by student \(i\) during the \(j^{th}\) observation period and \(X_{ij}\) be the time (in hours) they spent looking for birds during the \(j^{th}\) observation period. Individuals will have different skill levels, which we represent using a Normally distributed random intercept, \(b_{0i}\). We will assume, conditional on \(b_{0i}\), that the number of birds observed can be described by a Poisson distribution with mean depending on the log number of hours of observation. Thus, we simulate data from the following model:
\[\begin{gather} Y_{ij} |b_{0i} \sim Poisson(\lambda_{ij}) \nonumber \\ \log(\lambda_{ij}) = \beta_0 + b_{0i} + \log(X_{ij})\beta_1 \nonumber \\ b_{0i} \sim N(0, \sigma_{b_0}), \nonumber \end{gather}\]
with parameters:
- \(\beta_0 = - 1.5\)
- \(\beta_1 = 0.2\)
- \(\sigma_{b_{0}} = 1.5\)
# Simulation set up
# Generate 100 random intercepts
set.seed(610)
# Sample size
<- 100 # number of individuals
nindividuals <- 10 # number of observations per individual
nperindivid <- nindividuals * nperindivid # total number of observations
ntotal
# Data generating parameters
<- rnorm(nindividuals, 0, 1.5) # deviations from mean intercept
b0_i <- -1.5 # mean intercept
beta_0 <- 0.2 # slope for time
beta_1
# Generate 10 different observations for each individual Here, x will
# record the observation time to the nearest 5 minutes
<- log(sample(seq(30, 120, by = 5), size = ntotal, replace = TRUE))
lobs.time
# Y|boi is Poisson lambda[ij]
<- beta_0 + rep(b0_i, each = nperindivid) + lobs.time * beta_1
loglam <- exp(loglam)
lam <- rpois(ntotal, lam)
Y
# Create data set for model fitting
<- rep(1:nindividuals, each = nperindivid) # Cluster id
individual <- data.frame(Y = Y, lobs.time = lobs.time, individual = individual) birdobs
We then fit the GLMM corresponding to the data generating model and show that we are able to recover the parameter values used to simulate the data.
# ## Fit glmer model
# Estimates are close to values used to simulate data
<- glmer(Y ~ lobs.time + (1 | individual), family = poisson(), data = birdobs)
fitsim summary(fitsim)
Generalized linear mixed model fit by maximum likelihood (Laplace Approximation) ['glmerMod']
Family: poisson ( log )
Formula: Y ~ lobs.time + (1 | individual)
Data: birdobs
AIC BIC logLik deviance df.resid
2382.9 2397.6 -1188.4 2376.9 997
Scaled residuals:
Min 1Q Median 3Q Max
-2.4449 -0.5947 -0.3465 0.3592 4.5086
Random effects:
Groups Name Variance Std.Dev.
individual (Intercept) 1.924 1.387
Number of obs: 1000, groups: individual, 100
Fixed effects:
Estimate Std. Error z value Pr(>|z|)
(Intercept) -1.95290 0.33671 -5.800 6.63e-09 ***
lobs.time 0.31214 0.07018 4.448 8.68e-06 ***
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Correlation of Fixed Effects:
(Intr)
lobs.time -0.895
cbind(confint(fitsim), trueparam = c(1.5, -1.5, 0.2))
Computing profile confidence intervals ...
2.5 % 97.5 % trueparam
.sig01 1.1759471 1.6614586 1.5
(Intercept) -2.6309957 -1.2907895 -1.5
lobs.time 0.1733807 0.4525831 0.2
Next, we compute and plot the following:
- the estimated mean response curve for a typical student, \(E[Y_{ij} | X_{ij}, b_{0i} = 0] = \exp(\beta_0 + \beta_1 \log(X_{ij}))\)
- the estimated mean response curve for the population of students, \(E[Y_{ij} | X_{ij}] = \exp(\beta_0 + \sigma^2_{b_0}/2 + \beta_1 \log(X_{ij}))\)
- the mean number of birds observed in the population of students as a function of log(observation time)
# Summarize the mean number of birds seen as a function of lobs.time at the population-level
<- birdobs %>% group_by(lobs.time) %>% dplyr::summarize(meanY = mean(Y))
mdatb
# Now, add on appropriate conditional and population average mean
# response curves
$ss <- exp(fixef(fitsim)[1] + mdatb$lobs.time * fixef(fitsim)[2])
mdatb$pa <- exp(fixef(fitsim)[1] + as.data.frame(VarCorr(fitsim))[1,4] / 2 +
mdatb$lobs.time * fixef(fitsim)[2])
mdatb
# plot
ggplot(mdatb, aes(lobs.time, meanY)) + geom_point() +
geom_line(aes(lobs.time, ss), color="red") +
geom_line(aes(lobs.time, pa), color="blue") +
xlab("log(Observation time)") + ylab("Number of birds seen")
Examining Figure 19.5, we see that the mean response curve for the “typical student”, formed by setting \(b_{0i} = 0\) (in red) underestimates the number of birds seen in the population of students. However, the mean response curve derived by adjusting the intercept (blue curve) fits the data well. This adjustment is only appropriate when fitting Poisson GLMMs that include only a random intercept but no random slope terms.
Another important case where we can derive expressions for the marginal mean from a fitted random-effects model is when using Probit regression Equation 19.9:
\[ Y_{ij} | b_{0i} \sim Bernoulli(p_i) \nonumber \\ p_i = \Phi(X_{ij}\beta + b_{0i}), \tag{19.9}\]
where \(\Phi\) is the cumulative distribution function of the standard Normal distribution (and can be calculated using the pnorm
function in R). In this case, marginal parameters, \(\beta_M\) can be derived from conditional parameters \(\beta_C\) and the random effects variance component, \(\sigma^2_{b_0}\) using:
\[\beta^M = \frac{\beta^C}{\sqrt{\sigma^2_{b_0} +1}}\]
Lastly, we note that Zeger, Liang, and Albert (1988) derived a useful approximation for describing marginal (i.e., population average) mean response patterns using the parameters estimated from a generalized linear mixed effects model in the case of logistic regression with a random intercept:
\[ \beta^M = (\sigma^2_{b_0}0.346 +1)^{-1/2}\beta^C \tag{19.10}\]
This approximation works well for our simulated hockey player data (Figure 19.6).
Again, this approximation only works for logistic regression models containing random intercepts but not random slopes. Alternatively, we can use numerical integration to average over the distribution of the random effects and thereby determine the marginal mean response curve.
19.8.3 Numerical integration
Recall, to determine the marginal mean response curve, we need to integrate over the distribution of the random effects:
\[ E[Y|X] = \sum_y y f_{Y,b}(y) = \sum_Y y\int f_{Y| b}(y)f_b(b)db \tag{19.11}\]
For binary regression models, note that \(Y\) can take only 1 of 2 values (0 and 1), and only the value of 1 contributes a term to the expected value (Equation 19.11). Furthermore, when \(Y =1\), \(f_{Y|b}(y) = \frac{e^{\beta_0 + b + \beta_1 X}}{1 + e^{\beta_0 + b + \beta_1 X}}\). Lastly, \(f_b(b)\) is a Normal distribution with mean 0 and variance \(\sigma^2_{b_0}\). Thus, we need to compute:
\[E[Y|X] = \sum_y yf_{Y,b}(y) =\int \frac{e^{\beta_0 + b + \beta_1 X}}{1 + e^{\beta_0 + b + \beta_1 X}}\frac{1}{\sqrt{2 \pi}\sigma}e^{-\frac{b^2}{2\sigma^2_{b_0}}}db\] We could make use of the integrate
function in R to approximate this integral, as we demonstrate below for our model describing how quickly young hockey players learn how to skate backwards. We then plot the result (Figure 19.7).
# Matrix to hold marginal mean response curve, E[Y|X]
<-matrix(NA,10,1)
pa.ratefor(i in 1:10){
<-function(x){
intfunplogis(fixef(glmermod)[1] + x + practice.num[i]*fixef(glmermod)[2])*dnorm(x,0,sqrt(sigma2b0))
}<-integrate(intfun,-Inf, Inf)[1]
pa.rate[i]
} <-rbind(mdat2,
mdat4data.frame(practice.num = practice.num,
p = unlist(pa.rate),
type = rep("Marginal mean by Integration", 10)))
# Plot
ggplot(mdat4, aes(practice.num, p, col=type)) +
geom_line() + geom_point() +
xlab("Number of practices") +
ylab("Proportion of skaters that can skate backwards")
Again, we see that we are able to approximate the population proportions well once we translate the fit of our GLMM to a prediction about the marginal mean response curve.
19.8.4 Simulation
Rather than use numerical integration, we could solve for marginal response patterns using a simulation approach, where we:
- generate a large sample of random effects from the fitted random-effects distribution, \(N(0, \hat{\sigma}^2_{b_0})\);
- generate the conditional, subject-specific curves for each of these randomly chosen individuals;
- average the subject-specific curves from step 2 to estimate the marginal mean response curve.
We leave this as a potential exercise.
19.8.5 Approximating marginal means using GLMMadpative
In Section 19.8.3 and Section 19.8.4, we discussed approaches for piecing together marginal response curves that essentially estimate \(E[Y_i | X_i]\) at a set of unique values of \(X_i\) chosen by the user. We can then plot \(E[Y_i | X_i]\) versus \(X_i\) to examine these marginal (population-average) response curves. However, these approaches do not give us a set of marginal coefficients for summarizing the effects of \(X_i\) at the population level. In Section 19.8.2, we saw a few special cases where we could derive or approximate marginal or population-level regression parameters from the fitted GLMM. In this section, we will explore the GLMMadaptive
package (Rizopoulos 2021), which provides a function, marginal_coefs
, that estimates coefficients describing marginal response patterns for a wider variety of GLMMS.
We begin by fitting the same GLMM to our simulated skating data, but using the mixed_model
function in the GLMMadaptive
package. Its syntax is similar to that of lme
in the nlme
package:
library(GLMMadaptive)
<- mixed_model(fixed = Y ~ practice.num,
fit.glmm random = ~ 1 | individual,
family = binomial(link = "logit"),
data = skatedat
)summary(fit.glmm)
Call:
mixed_model(fixed = Y ~ practice.num, random = ~1 | individual,
data = skatedat, family = binomial(link = "logit"))
Data Descriptives:
Number of Observations: 1000
Number of Groups: 100
Model:
family: binomial
link: logit
Fit statistics:
log.Lik AIC BIC
-379.886 765.772 773.5875
Random effects covariance matrix:
StdDev
(Intercept) 2.230948
Fixed effects:
Estimate Std.Err z-value p-value
(Intercept) -5.9444 0.4860 -12.2303 < 1e-04
practice.num 0.9960 0.0709 14.0464 < 1e-04
Integration:
method: adaptive Gauss-Hermite quadrature rule
quadrature points: 11
Optimization:
method: hybrid EM and quasi-Newton
converged: TRUE
We see that we get very similar answers as to when we fit the model using the glmer
function. The mixed_model
function uses numerical integration to approximate the likelihood (see Section 19.7), but uses more quadrature points than glmer
does by default, and thus, should result in a more accurate approximation of the likelihood. In addition, the glmer
function only uses numerical integration when fitting models with a single random intercept, whereas the mixed_model
function can perform numerical integration for models that include random intercepts and random slopes. Another advantage of mixed_model
over glmer
is that it can fit zero-inflated Poisson and Negative Binomial models that allow for random effects in the zero part of the model; glmmTMB
also has this capability, but it only uses the Laplace approximation when fitting models (see Section 19.7). On the other hand, the mixed_model
function only allows for a single grouping factor for the random effects whereas glmer
and glmmTMB
can fit models using multiple grouping factors (e.g., one could include (1 | individual)
and (1 | year)
for an analysis of data that included many individuals followed for many years).
The primary reason we will explore mixed_model
in the GLMMadaptive
package is because of its ability to provide coefficient estimates, \(\hat{\beta}^M\), for describing marginal response patterns. The marginal_coefs
function estimates these coefficients via the following method (Hedeker et al. 2018):
- Marginal means at the observed \(X_i\) are estimated using Monte Carlo simulations (Section 19.8.4). Specifically, a large number of random effects are generated from a Normal distribution, \(b^{sim} \sim N(0, \hat{\Sigma}_b)\). These simulated random effects are then used to estimate subject-specific means, \(\hat{E}[Y_i | X_i, b^{sim}]\). The subject-specific means are then averaged to form population-averaged means, \(\hat{E}[Y_i | X_i]\).
- Marginal coefficients, \(\beta^M\), are estimated by fitting a linear model with \(\text{logit}(\hat{E}[Y_i |X_i])\) as the response variable and \(X_i\) as the predictor variable. This results in the following estimator:
\[ \hat{\beta}^M = (X'X)^{-1}X' \text{logit}(\hat{E}[Y_i |X_i]), \tag{19.12}\]
Below, we calculate these marginal coefficients and compare them to our estimates using numerical integration, the estimated mean curve for a “typical skater” (formed by setting \(b_{0i}=0\)), and to the population proportions at each practice number (Figure 19.8).
<- marginal_coefs(fit.glmm, std_errors = FALSE)
marginal_beta <-rbind(mdat4,
mdat5data.frame(practice.num = practice.num,
p = plogis(marginal_beta$betas[1]+practice.num * marginal_beta$betas[2]),
type = rep("GLMMadaptive", 10)))
# Plot
ggplot(mdat5, aes(practice.num, p, col=type)) +
geom_line() + geom_point() +
xlab("Number of practices") +
ylab("Proportion of skaters that can skate backwards")
marginal_coef
function (red) from the GLMMadaptive
package versus the response curve for a typical individual (formed by setting \(b_{0i} = 0\); aqua blue). Population proportions are given in purple.
Again, we see that we are able to recover marginal (population-average) response patterns using this approach, but with the advantage of being able to report population-level odds ratios:
exp(marginal_beta[1]$betas[2])
practice.num
1.830493
I.e., we can report that the odds of skating backwards increases by a factor of 1.8 at the population level for each additional practice. This odds ratio will always be smaller than the corresponding subject-specific odds ratio that holds \(b_{0i}\) constant:
exp(fixef(glmermod)[2])
practice.num
2.710188
Again, the difference is expected since the subject-specific curves rise faster than the population average curve (Figure 19.4 B).
19.9 Hypothesis testing and modeling strategies
Methods for selecting an appropriate model and testing hypotheses involving regression parameters in generalized linear mixed effects models typically rely on an asymptotic Normality assumption. Thus, we do not have to worry about various degrees-of-freedom approximations, and there is no REML option to consider. On the other hand, inference is more reliant on large samples. In addition, the challenges associated with fitting GLMMS (see Section 19.7) often limits the complexity of the models we can fit, particularly with binary data which tend to be less informative than continuous data (e.g., see Section 8.5).
Below, we reconsider our model for the mallard nest occupancy data and show how the Anova
function can be used to test hypotheses regarding the fixed effects parameters:
library(car)
Anova(nest.ri)
Analysis of Deviance Table (Type II Wald chisquare tests)
Response: nest
Chisq Df Pr(>Chisq)
year 0.8803 2 0.64395
period 2.0194 3 0.56839
deply 0.0872 1 0.76783
vom 0.0028 1 0.95811
poly(wetsize, 2) 18.0552 2 0.00012 ***
year:period 13.3667 6 0.03757 *
period:vom 10.5896 3 0.01417 *
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Based on the significant interactions, it seems reasonable to stick with the full model for inference. We would conclude that nest occupancy rates depend on vom
, but that the relationship between voc
and nest occupancy varies by period
. We might also consider using functions in the ggeffects
package to explore the non-linear relationship between wetland size and nest occupancy (Figure 19.9).
plot(ggpredict(nest.ri, "wetsize [all]"))
plot(ggeffect(nest.ri, "wetsize [all]"))
ggpredict
(left panel) and ggeffect
(right panel) functions in the ggeffects
package (Lüdecke 2018) depicting the relationship between wetland size (wetsize
) on nest occupancy rates inferred from the generalized linear mixed effects model fit to the data from Zicus et al. (2006).
To understand what is plotted, we can look at the output from the ggpredict
and ggeffect
functions.
ggpredict(nest.ri, "wetsize [all]")
# Predicted probabilities of nest
wetsize | Predicted | 95% CI
--------------------------------
0.00 | 0.41 | 0.06, 0.88
0.23 | 0.43 | 0.06, 0.89
0.63 | 0.46 | 0.07, 0.90
1.37 | 0.51 | 0.09, 0.92
3.14 | 0.63 | 0.13, 0.95
5.05 | 0.72 | 0.19, 0.97
7.19 | 0.80 | 0.25, 0.98
27.39 | 0.69 | 0.11, 0.98
Adjusted for:
* year = 1997
* period = 1
* deply = 1
* vom = 0.89
* strtno = 0 (population-level)
Not all rows are shown in the output. Use `print(..., n = Inf)` to show all rows.
The output from ggpredict
is relatively easy to understand – these are predictions formed by setting all covariates other than the focal covariate (wetsize
) to specific values listed below the output. The random effect is also set to 0, implying that these are predictions for a “typical” subject when year
= 1997, period
= 1, deply
= 1 (single cylinder), and vom
= 0.89.
Now, let’s look at the output from ggeffect
.
ggeffect(nest.ri, "wetsize [all]")
# Predicted probabilities of nest
wetsize | Predicted | 95% CI
--------------------------------
0.00 | 0.20 | 0.06, 0.49
0.23 | 0.21 | 0.07, 0.50
0.63 | 0.23 | 0.08, 0.53
1.37 | 0.28 | 0.09, 0.58
3.14 | 0.38 | 0.14, 0.69
5.05 | 0.49 | 0.20, 0.79
7.19 | 0.59 | 0.26, 0.86
27.39 | 0.45 | 0.10, 0.86
Not all rows are shown in the output. Use `print(..., n = Inf)` to show all rows.
The output from ggeffect
is more difficult to decipher as it is averaging over the values of the other covariates in some way. I also suspect that it is setting the random effect to 0. We can see this if we compare our earlier plots for the ice skating example to the output from ggeffects
(Figure 19.10). The response pattern from ggeffects
most closely matches the response curve for a typical individual formed by setting the random effect equal to 0.
ggplot(mdat2, aes(practice.num, p, col=type)) +
geom_line() + geom_point() +
xlab("Number of practices") +
ylab("Proportion of skaters that can skate backwards") +
theme(legend.position = "bottom", legend.box = "horizontal")
plot(ggeffect(glmermod))
ggeffects
function (right panel) to the response curve for a typical individual, formed by setting \(b_{0i} = 0\) (left panel, red). Population proportions are indicated by the aqua blue curve in the left panel.