2025-09-30
Let’s start by loading the necessary packages for our analysis:
Extension of linear regression that accounts for:
lme4 Package (most common)
nlme Package (alternative with more covariance structures)
Let’s read the Stata dataset and explore its structure:
# A tibble: 6 × 9
caseid schoolid score cohort90 female sclass schtype schurban schdenom
<dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
1 18 1 0 -6 1 2 0 1 0
2 17 1 10 -6 1 2 0 1 0
3 19 1 0 -6 1 4 0 1 0
4 20 1 40 -6 1 3 0 1 0
5 21 1 42 -6 1 2 0 1 0
6 13 1 4 -6 1 2 0 1 0
Linear mixed model fit by REML. t-tests use Satterthwaite's method [
lmerModLmerTest]
Formula: score ~ 1 + (1 | schoolid)
Data: mydata
REML criterion at convergence: 286539.2
Scaled residuals:
Min 1Q Median 3Q Max
-2.9764 -0.7011 0.1017 0.7391 3.0819
Random effects:
Groups Name Variance Std.Dev.
schoolid (Intercept) 61.17 7.821
Residual 258.36 16.073
Number of obs: 33988, groups: schoolid, 508
Fixed effects:
Estimate Std. Error df t value Pr(>|t|)
(Intercept) 30.6006 0.3698 450.7146 82.74 <2e-16 ***
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Linear mixed model fit by REML. t-tests use Satterthwaite's method [
lmerModLmerTest]
Formula: score ~ cohort90 + (1 + cohort90 | schoolid)
Data: mydata
REML criterion at convergence: 280692.2
Scaled residuals:
Min 1Q Median 3Q Max
-3.1010 -0.7203 0.0387 0.7264 3.5217
Random effects:
Groups Name Variance Std.Dev. Corr
schoolid (Intercept) 42.9666 6.5549
cohort90 0.1614 0.4017 -0.39
Residual 215.7368 14.6880
Number of obs: 33988, groups: schoolid, 508
Fixed effects:
Estimate Std. Error df t value Pr(>|t|)
(Intercept) 30.60958 0.31381 425.94021 97.54 <2e-16 ***
cohort90 1.23388 0.02535 315.70740 48.67 <2e-16 ***
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Correlation of Fixed Effects:
(Intr)
cohort90 -0.266
Repeated Measures ANOVA
MANOVA (Multivariate ANOVA)
Generalized Estimating Equations (GEE)
Linear Mixed Effect Models
Working Correlation Structure:
Call:
geeglm(formula = Reaction ~ Days, family = gaussian, data = sleepstudy,
id = Subject, corstr = "exchangeable")
Coefficients:
Estimate Std.err Wald Pr(>|W|)
(Intercept) 251.405 6.632 1436.89 < 2e-16 ***
Days 10.467 1.502 48.55 3.22e-12 ***
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Correlation structure = exchangeable
Estimated Scale Parameters:
Estimate Std.err
(Intercept) 2251 536.6
Link = identity
Estimated Correlation Parameters:
Estimate Std.err
alpha 0.576 0.1114
Number of clusters: 18 Maximum cluster size: 10
Linear mixed model fit by REML. t-tests use Satterthwaite's method [
lmerModLmerTest]
Formula: Reaction ~ Days + (1 | Subject)
Data: sleepstudy
REML criterion at convergence: 1786
Scaled residuals:
Min 1Q Median 3Q Max
-3.226 -0.553 0.011 0.519 4.251
Random effects:
Groups Name Variance Std.Dev.
Subject (Intercept) 1378 37.1
Residual 960 31.0
Number of obs: 180, groups: Subject, 18
Fixed effects:
Estimate Std. Error df t value Pr(>|t|)
(Intercept) 251.405 9.747 22.810 25.8 <2e-16 ***
Days 10.467 0.804 161.000 13.0 <2e-16 ***
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Correlation of Fixed Effects:
(Intr)
Days -0.371
Linear mixed model fit by REML. t-tests use Satterthwaite's method [
lmerModLmerTest]
Formula: Reaction ~ Days + (1 + Days | Subject)
Data: sleepstudy
REML criterion at convergence: 1744
Scaled residuals:
Min 1Q Median 3Q Max
-3.954 -0.463 0.023 0.463 5.179
Random effects:
Groups Name Variance Std.Dev. Corr
Subject (Intercept) 612.1 24.74
Days 35.1 5.92 0.07
Residual 654.9 25.59
Number of obs: 180, groups: Subject, 18
Fixed effects:
Estimate Std. Error df t value Pr(>|t|)
(Intercept) 251.41 6.82 17.00 36.84 < 2e-16 ***
Days 10.47 1.55 17.00 6.77 3.3e-06 ***
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Correlation of Fixed Effects:
(Intr)
Days -0.138
Data: sleepstudy
Models:
model1: Reaction ~ Days + (1 | Subject)
model2: Reaction ~ Days + (1 + Days | Subject)
npar AIC BIC logLik -2*log(L) Chisq Df Pr(>Chisq)
model1 4 1802 1815 -897 1794
model2 6 1764 1783 -876 1752 42.1 2 7.1e-10 ***
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
(Intercept): Average reaction time at Day 0 across all subjectsDays: Average increase in reaction time per day of sleep deprivation(Intercept): Variability in individual baseline reaction timesDays: Variability in individual slopes (how much each person’s reaction time increases per day)Correlation: Relationship between individual baselines and slopes```{r}
# Compare models using likelihood ratio tests
model_simple <- lmer(outcome ~ time + (1|subject), data = longdata)
model_complex <- lmer(outcome ~ time + treatment + time:treatment +
(1 + time|subject), data = longdata)
anova(model_simple, model_complex)
# Check model assumptions
library(performance)
check_model(model_complex)
```GEE:
Linear Mixed Models:
Resources:
Linear Mixed Effect Models