Ling 204: session 06

Affiliations: UC Santa Barbara & JLU Giessen

Published: 11 Feb 2026

1 Session 06: Mixed-effects models 2a

In this session, we will look at another data set from a language acquisition paper published in Cognition in 2013, namely Pine et al. (2013). It, too, is not really ideal for the application of mixed-effects modeling, but it is again nicely small (so it is easy to look at comprehensively and fast to run) and also didactically interesting. Still, bear in mind that it is not the best example and that the analysis we will proceed with here is quite some overkill, given the size and complexity of this data set.

rm(list=ls(all.names=TRUE))
set.seed(sum(utf8ToInt("Gummibaerchen")))
pkgs2load <- c("dplyr", "car", "effects", "emmeans", "lme4", "magrittr", "multcomp", "MuMIn")
suppressMessages(sapply(pkgs2load, library, character.only=TRUE, logical.return=TRUE, quietly=TRUE))
rm(pkgs2load)
   dplyr      car  effects  emmeans     lme4 magrittr multcomp    MuMIn
    TRUE     TRUE     TRUE     TRUE     TRUE     TRUE     TRUE     TRUE 

This analysis will be based on the data in their Table 2, which we load from _input/pine-etal-table2.csv, and you can find information about its contents in _input/pine-etal-table2.r:

summary(d <- read.delim(
   file="_input/pine-etal-table2.csv",
   stringsAsFactors=TRUE))
      CASE          OVERLAP            NAME        SAMPLE       USED
 Min.   : 1.00   Min.   :0.0500   Anne   : 6   Min.   :100.0   no :36
 1st Qu.:18.75   1st Qu.:0.1000   Aran   : 6   1st Qu.:100.0   yes:36
 Median :36.50   Median :0.1900   Becky  : 6   Median :200.0
 Mean   :36.50   Mean   :0.2614   Carl   : 6   Mean   :266.7
 3rd Qu.:54.25   3rd Qu.:0.3475   Dominic: 6   3rd Qu.:500.0
 Max.   :72.00   Max.   :1.0000   Gail   : 6   Max.   :500.0
                                  (Other):36                           

They did this case study because they expected that the amount of overlap would be positively correlated with the sample size.

1.1 Exploration & preparation

As always, it is indispensable to do a decent amount of exploratory plotting before any statistical analysis is begun. If you do mixed-effects modeling, you really must look at

  • every variable in isolation (the response, predictors, controls, random effects, everything);
  • every rhs variable (fixed and random effects) with the response;
  • every combination of rhs variables that will be considered in the model on their own and with the response.

1.1.1 The response

Since we want to stay as close to their data as possible, we follow them in applying the arcsine transformation to the response variable:

d$OVERLAP.asin <- asin(d$OVERLAP)
summary(d <- d[,c(1,3,6,4,5,2)])
      CASE            NAME     OVERLAP.asin         SAMPLE       USED
 Min.   : 1.00   Anne   : 6   Min.   :0.05002   Min.   :100.0   no :36
 1st Qu.:18.75   Aran   : 6   1st Qu.:0.10017   1st Qu.:100.0   yes:36
 Median :36.50   Becky  : 6   Median :0.19116   Median :200.0
 Mean   :36.50   Carl   : 6   Mean   :0.28496   Mean   :266.7
 3rd Qu.:54.25   Dominic: 6   3rd Qu.:0.35494   3rd Qu.:500.0
 Max.   :72.00   Gail   : 6   Max.   :1.57080   Max.   :500.0
                 (Other):36
    OVERLAP
 Min.   :0.0500
 1st Qu.:0.1000
 Median :0.1900
 Mean   :0.2614
 3rd Qu.:0.3475
 Max.   :1.0000
                 

What do the response and its transformed version look like?

par(mfrow=c(1,3))
hist(d$OVERLAP, breaks="FD", main="")
hist(d$OVERLAP.asin, breaks="FD", main="")
plot(d$OVERLAP, d$OVERLAP.asin); grid(); abline(0, 1)
par(mfrow=c(1,1))
Figure 1: Descriptive exploration of the response

Not really clear to me that this transformation does so much good here …
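To see what the transformation actually does, here is a quick base-R look at asin on proportions. (Note that what the ANOVA literature usually calls the 'arcsine transformation' is asin(sqrt(p)); following Pine et al., we apply asin directly to the proportions.)

```r
p <- c(0, 0.05, 0.1, 0.5, 0.9, 1)
round(asin(p), 4)       # maps [0, 1] onto [0, pi/2]
round(asin(sqrt(p)), 4) # the more common variance-stabilizing variant
# asin stretches the scale near 1 much more than near 0, which is why
# OVERLAP = 1 ends up as 1.5708 (= pi/2) in the summary above
```

This also shows why the scatterplot of OVERLAP against OVERLAP.asin hugs the diagonal for small values: for small p, asin(p) is approximately p.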

1.1.2 The fixed-effects structure

Here’s the first predictor, SAMPLE, which we convert to a factor first because that’s what Pine et al. did; all levels are equally frequent, perfectly balanced:

d$SAMPLE <- factor(d$SAMPLE)
table(d$SAMPLE) # frequencies of levels

100 200 500
 24  24  24 
tapply(d$OVERLAP.asin, d$SAMPLE, mean) # means of response per level
      100       200       500
0.1852856 0.2780748 0.3915233 
boxplot(d$OVERLAP.asin ~ d$SAMPLE, notch=TRUE); grid()
Figure 2: Exploration of the response as a function of SAMPLE

Here’s the second predictor USED; same story: everything’s perfectly balanced and at least from what we see now, this should be highly significant:

table(d$USED) # frequencies of levels

 no yes
 36  36 
tapply(d$OVERLAP.asin, d$USED, mean) # means of response per level
       no       yes
0.1171094 0.4528131 
boxplot(d$OVERLAP.asin ~ d$USED, notch=TRUE); grid()
Figure 3: Exploration of the response as a function of USED

Now the combination of predictors, which looks like there could be an interaction between SAMPLE and USED:

table(d$SAMPLE, d$USED) # frequencies of combinations of levels

      no yes
  100 12  12
  200 12  12
  500 12  12
tapply(d$OVERLAP.asin, list(d$SAMPLE, d$USED), mean) # means of ...
            no       yes
100 0.06505012 0.3055211
200 0.10269516 0.4534543
500 0.18358289 0.5994637
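The impression of an interaction can be quantified directly from the cell means just printed: if the yes-minus-no gap changes across the levels of SAMPLE, the two predictors do not combine additively. A quick check using the means from the table above:

```r
cellmeans <- rbind( # the cell means of OVERLAP.asin from above
   `100` = c(no = 0.06505012, yes = 0.3055211),
   `200` = c(no = 0.10269516, yes = 0.4534543),
   `500` = c(no = 0.18358289, yes = 0.5994637))
(gaps <- cellmeans[, "yes"] - cellmeans[, "no"]) # yes-no gap per SAMPLE
# the gap grows with SAMPLE, i.e. the effects are not additive
```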

1.1.3 The random-effects structure

Every kid contributes the same (small) number of data points:

table(d$NAME)

   Anne    Aran   Becky    Carl Dominic    Gail    Joel    John     Liz  Nicole
      6       6       6       6       6       6       6       6       6       6
   Ruth  Warren
      6       6 

Every kid is recorded with each sample size, which means we need varying slopes of SAMPLE|NAME (and maybe the interaction w/ USED):

table(d$NAME, d$SAMPLE)

          100 200 500
  Anne      2   2   2
  Aran      2   2   2
  Becky     2   2   2
  Carl      2   2   2
  Dominic   2   2   2
  Gail      2   2   2
  Joel      2   2   2
  John      2   2   2
  Liz       2   2   2
  Nicole    2   2   2
  Ruth      2   2   2
  Warren    2   2   2

Every kid is recorded in each condition of USED (& even equally often), which means we need varying slopes of USED|NAME (and maybe the interaction w/ SAMPLE):

table(d$NAME, d$USED)

          no yes
  Anne     3   3
  Aran     3   3
  Becky    3   3
  Carl     3   3
  Dominic  3   3
  Gail     3   3
  Joel     3   3
  John     3   3
  Liz      3   3
  Nicole   3   3
  Ruth     3   3
  Warren   3   3

Every kid is recorded once in each combo of SAMPLE & USED (!)

ftable(d$NAME, d$USED, d$SAMPLE)
             100 200 500

Anne    no     1   1   1
        yes    1   1   1
Aran    no     1   1   1
        yes    1   1   1
Becky   no     1   1   1
        yes    1   1   1
Carl    no     1   1   1
        yes    1   1   1
Dominic no     1   1   1
        yes    1   1   1
Gail    no     1   1   1
        yes    1   1   1
Joel    no     1   1   1
        yes    1   1   1
John    no     1   1   1
        yes    1   1   1
Liz     no     1   1   1
        yes    1   1   1
Nicole  no     1   1   1
        yes    1   1   1
Ruth    no     1   1   1
        yes    1   1   1
Warren  no     1   1   1
        yes    1   1   1

1.2 Wrong & traditional analyses

One might be tempted to fit this model, but …

summary(lm(OVERLAP ~ NAME*SAMPLE*USED, data=d))

that would be wrong, because …? Because

  • NAME is not a fixed effect but a random effect;
  • this model doesn’t account for the repeated measurements;
  • this model would be saturated because the highest-level interaction has only 1 case per combination of levels, leaving 0 residual degrees of freedom.
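The last point can be verified without fitting anything: with 12 names, 3 sample sizes, and 2 conditions, the full-factorial model matrix has as many columns as the data have rows. A quick check on a hypothetical balanced design mirroring the data:

```r
# hypothetical balanced design: 12 names x 3 samples x 2 conditions
toy <- expand.grid(NAME   = factor(paste0("kid", 1:12)),
                   SAMPLE = factor(c(100, 200, 500)),
                   USED   = factor(c("no", "yes")))
X <- model.matrix(~ NAME * SAMPLE * USED, data = toy)
c(rows = nrow(toy), columns = ncol(X)) # 72 parameters for 72 observations
# -> the model is saturated: 0 residual degrees of freedom
```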

Here’s what Pine et al. did, a repeated-measures ANOVA with the 2 predictors SAMPLE and USED nested into NAME:

summary(aov(OVERLAP.asin ~ SAMPLE*USED + Error(NAME/(SAMPLE*USED)), data=d))

Error: NAME
          Df Sum Sq Mean Sq F value Pr(>F)
Residuals 11  1.566  0.1424

Error: NAME:SAMPLE
          Df Sum Sq Mean Sq F value   Pr(>F)
SAMPLE     2 0.5121 0.25606   67.12 4.31e-10 ***
Residuals 22 0.0839 0.00381
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Error: NAME:USED
          Df Sum Sq Mean Sq F value  Pr(>F)
USED       1  2.029  2.0285    15.8 0.00218 **
Residuals 11  1.412  0.1284
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Error: NAME:SAMPLE:USED
            Df  Sum Sq Mean Sq F value   Pr(>F)
SAMPLE:USED  2 0.09435 0.04717   13.94 0.000123 ***
Residuals   22 0.07444 0.00338
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
ez::ezANOVA(data=d, dv=.(OVERLAP.asin), wid=.(NAME), within=.(SAMPLE, USED))
$ANOVA
       Effect DFn DFd        F            p p<.05       ges
2      SAMPLE   2  22 67.12335 4.312508e-10     * 0.1403439
3        USED   1  11 15.79960 2.177850e-03     * 0.3927156
4 SAMPLE:USED   2  22 13.94226 1.227513e-04     * 0.0291981

$`Mauchly's Test for Sphericity`
       Effect         W            p p<.05
2      SAMPLE 0.2317863 6.690196e-04     *
4 SAMPLE:USED 0.1420874 5.791317e-05     *

$`Sphericity Corrections`
       Effect       GGe        p[GG] p[GG]<.05       HFe        p[HF] p[HF]<.05
2      SAMPLE 0.5655425 1.505157e-06         * 0.5863370 1.016140e-06         *
4 SAMPLE:USED 0.5382384 2.557495e-03         * 0.5500931 2.363317e-03         *

In their results, there is

  • a main effect of noun type (= USED);
  • a main effect of developmental stage (= SAMPLE);
  • an interaction of USED:SAMPLE.

BTW: note the violation of sphericity.
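The Greenhouse-Geisser correction reported above works by multiplying both degrees of freedom by the epsilon estimate before looking up the F-value; a sketch with the numbers copied from the SAMPLE row of the ez output:

```r
f_sample <- 67.12335; df_n <- 2; df_d <- 22
gge <- 0.5655425 # Greenhouse-Geisser epsilon for SAMPLE (from ez)
p_uncorrected <- pf(f_sample, df_n, df_d, lower.tail = FALSE)
p_gg <- pf(f_sample, gge * df_n, gge * df_d, lower.tail = FALSE)
c(uncorrected = p_uncorrected, gg = p_gg)
# the corrected p is larger, but SAMPLE remains highly significant
```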

1.3 Modeling

1.3.1 Exploring the random-effects structure

The first model one might fit is nonsense – why?

summary(m_01a <- lmer(
   OVERLAP.asin ~ 1 +         # intercept
      SAMPLE*USED +           # fixed effects
      (1 + SAMPLE*USED|NAME), # random effects
   data=d), correlation=FALSE)

So we eliminate the interaction in the ranef structure, but first we do something else that is potentially useful: we set successive-difference contrasts for SAMPLE:
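Before we use them, here is what successive-difference (a.k.a. backward-difference) contrasts buy us, sketched on hypothetical noise-free toy data with known group means (using MASS::contr.sdif, the function invoked below):

```r
# hypothetical toy data: 3 balanced groups with exact means 1, 3, 6
g <- factor(rep(c("100", "200", "500"), each = 4))
y <- rep(c(1, 3, 6), each = 4)
contrasts(g) <- MASS::contr.sdif(3) # successive-difference coding
m <- lm(y ~ g)
coef(m)
# intercept = mean of the group means (10/3);
# slopes = successive mean differences: 200-100 = 2, 500-200 = 3
```

This is why the fixef coefficients below are labeled SAMPLE200-100 and SAMPLE500-200: each one estimates the difference between adjacent levels.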

contrasts(d$SAMPLE) <- contr.sdif
summary(m_01b <- lmer(
   OVERLAP.asin ~ 1 +         # intercept
      SAMPLE*USED +           # fixed effects
      (1 + SAMPLE+USED|NAME), # random effects
   data=d), correlation=FALSE)
Linear mixed model fit by REML ['lmerMod']
Formula: OVERLAP.asin ~ 1 + SAMPLE * USED + (1 + SAMPLE + USED | NAME)
   Data: d

REML criterion at convergence: -158.9

Scaled residuals:
    Min      1Q  Median      3Q     Max
-3.4748 -0.3831 -0.0187  0.3337  2.9520

Random effects:
 Groups   Name          Variance  Std.Dev. Corr
 NAME     (Intercept)   5.961e-05 0.007721
          SAMPLE200-100 5.407e-03 0.073530  1.00
          SAMPLE500-200 3.087e-04 0.017569 -1.00 -1.00
          USEDyes       8.482e-02 0.291235  1.00  1.00 -1.00
 Residual               1.821e-03 0.042667
Number of obs: 72, groups:  NAME, 12

Fixed effects:
                      Estimate Std. Error t value
(Intercept)           0.117109   0.007452  15.714
SAMPLE200-100         0.037645   0.027459   1.371
SAMPLE500-200         0.080888   0.018142   4.459
USEDyes               0.335704   0.084672   3.965
SAMPLE200-100:USEDyes 0.110288   0.024634   4.477
SAMPLE500-200:USEDyes 0.065122   0.024634   2.644
optimizer (nloptwrap) convergence code: 0 (OK)
boundary (singular) fit: see help('isSingular')

Of course we get a singularity warning … We explore the ranef structure to see what we might end up with:

rePCA(m_01b)$NAME$sdev %>% round(4)
[1] 7.0542 0.0005 0.0001 0.0000

The output suggests we will end up with 1 source of ranef variation. We have two kinds of random slopes we might eliminate and we cannot see from the ranef output which one comes with the smaller variance/sd. Thus, we try eliminating both (separately). First, we drop the slopes for SAMPLE:

summary(m_02a <- update(
   m_01b, .~.
   - (1 + SAMPLE+USED|NAME)   # random effects dropped
   + (1 +        USED|NAME)), # random effects inserted: no slopes for SAMPLE
   correlation=FALSE)
Linear mixed model fit by REML ['lmerMod']
Formula: OVERLAP.asin ~ SAMPLE + USED + (1 + USED | NAME) + SAMPLE:USED
   Data: d

REML criterion at convergence: -131.6

Scaled residuals:
    Min      1Q  Median      3Q     Max
-4.9561 -0.3226 -0.0054  0.2551  4.0874

Random effects:
 Groups   Name        Variance  Std.Dev. Corr
 NAME     (Intercept) 6.006e-05 0.00775
          USEDyes     8.442e-02 0.29056  1.00
 Residual             3.003e-03 0.05480
Number of obs: 72, groups:  NAME, 12

Fixed effects:
                      Estimate Std. Error t value
(Intercept)           0.117109   0.009403  12.454
SAMPLE200-100         0.037645   0.022371   1.683
SAMPLE500-200         0.080888   0.022371   3.616
USEDyes               0.335704   0.084865   3.956
SAMPLE200-100:USEDyes 0.110288   0.031638   3.486
SAMPLE500-200:USEDyes 0.065122   0.031638   2.058
optimizer (nloptwrap) convergence code: 0 (OK)
boundary (singular) fit: see help('isSingular')
anova(m_01b, m_02a, test="Chisq", refit=FALSE) # note the refit=FALSE, we stick with REML
Data: d
Models:
m_02a: OVERLAP.asin ~ SAMPLE + USED + (1 + USED | NAME) + SAMPLE:USED
m_01b: OVERLAP.asin ~ 1 + SAMPLE * USED + (1 + SAMPLE + USED | NAME)
      npar     AIC     BIC logLik -2*log(L)  Chisq Df Pr(>Chisq)
m_02a   10 -111.64 -88.872 65.819   -131.64
m_01b   17 -124.93 -86.223 79.463   -158.93 27.287  7  0.0002958 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

We still get a singularity warning and the LR-test says we need subject-specific slopes for SAMPLE.

Second, we drop the slopes for USED:

summary(m_02b <- update(
   m_01b, .~.
   - (1 + SAMPLE+USED|NAME)   # random effects dropped
   + (1 + SAMPLE     |NAME)), # random effects inserted: no slopes for USED
   correlation=FALSE)
Linear mixed model fit by REML ['lmerMod']
Formula: OVERLAP.asin ~ SAMPLE + USED + (1 + SAMPLE | NAME) + SAMPLE:USED
   Data: d

REML criterion at convergence: -16.7

Scaled residuals:
    Min      1Q  Median      3Q     Max
-2.6928 -0.4206 -0.0201  0.2695  4.0133

Random effects:
 Groups   Name          Variance  Std.Dev. Corr
 NAME     (Intercept)   0.0193394 0.13907
          SAMPLE200-100 0.0044044 0.06637   1.00
          SAMPLE500-200 0.0002475 0.01573  -1.00 -1.00
 Residual               0.0273878 0.16549
Number of obs: 72, groups:  NAME, 12

Fixed effects:
                      Estimate Std. Error t value
(Intercept)            0.11711    0.04871   2.404
SAMPLE200-100          0.03765    0.07023   0.536
SAMPLE500-200          0.08089    0.06771   1.195
USEDyes                0.33570    0.03901   8.606
SAMPLE200-100:USEDyes  0.11029    0.09555   1.154
SAMPLE500-200:USEDyes  0.06512    0.09555   0.682
optimizer (nloptwrap) convergence code: 0 (OK)
boundary (singular) fit: see help('isSingular')
anova(m_01b, m_02b, test="Chisq", refit=FALSE) # note the refit=FALSE, we stick with REML
Data: d
Models:
m_02b: OVERLAP.asin ~ SAMPLE + USED + (1 + SAMPLE | NAME) + SAMPLE:USED
m_01b: OVERLAP.asin ~ 1 + SAMPLE * USED + (1 + SAMPLE + USED | NAME)
      npar     AIC     BIC logLik -2*log(L)  Chisq Df Pr(>Chisq)
m_02b   13    9.34  38.936  8.330    -16.66
m_01b   17 -124.93 -86.223 79.463   -158.93 142.27  4  < 2.2e-16 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

We still get a singularity warning and the LR-test says we need subject-specific slopes for USED. Just like last time,

  1. the model summaries suggest you should not trust these models (singularity warnings);
  2. the LR-tests say you cannot simplify the models’ ranef structure because the slopes are all significant.

So what do we do? We prioritize getting models without any convergence/singularity warnings, and of the two we just fit, m_02a is less different from m_01b than m_02b is, so that’s the one we continue to do model selection with:

summary(m_03 <- update(
   m_02a, .~.
   - (1 + USED|NAME)   # random effects dropped
   + (1       |NAME)), # random effects inserted: no slopes for USED
   correlation=FALSE)
Linear mixed model fit by REML ['lmerMod']
Formula: OVERLAP.asin ~ SAMPLE + USED + (1 | NAME) + SAMPLE:USED
   Data: d

REML criterion at convergence: -14.8

Scaled residuals:
    Min      1Q  Median      3Q     Max
-2.1555 -0.3730 -0.0158  0.2510  4.4267

Random effects:
 Groups   Name        Variance Std.Dev.
 NAME     (Intercept) 0.01897  0.1377
 Residual             0.02856  0.1690
Number of obs: 72, groups:  NAME, 12

Fixed effects:
                      Estimate Std. Error t value
(Intercept)            0.11711    0.04873   2.403
SAMPLE200-100          0.03765    0.06899   0.546
SAMPLE500-200          0.08089    0.06899   1.172
USEDyes                0.33570    0.03983   8.428
SAMPLE200-100:USEDyes  0.11029    0.09757   1.130
SAMPLE500-200:USEDyes  0.06512    0.09757   0.667
anova(m_02a, m_03, test="Chisq", refit=FALSE) # note the refit=FALSE, we stick with REML
Data: d
Models:
m_03: OVERLAP.asin ~ SAMPLE + USED + (1 | NAME) + SAMPLE:USED
m_02a: OVERLAP.asin ~ SAMPLE + USED + (1 + USED | NAME) + SAMPLE:USED
      npar      AIC     BIC logLik -2*log(L)  Chisq Df Pr(>Chisq)
m_03     8    1.197  19.410  7.401   -14.803
m_02a   10 -111.638 -88.872 65.819  -131.638 116.84  2  < 2.2e-16 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

The LR-test again says we shouldn’t simplify, but this model converges so we’re done with the ranef structure and begin the exploration of the fixef structure with (the ML version of) m_03.

1.3.2 Exploring the fixed-effects structure

First, we switch to an ML version of m_03:

summary(m_03.ml <- update( # update
   m_03, .~.,              # this to a model w/
   REML=FALSE),            # ML estimation
   correlation=FALSE)
Linear mixed model fit by maximum likelihood  ['lmerMod']
Formula: OVERLAP.asin ~ SAMPLE + USED + (1 | NAME) + SAMPLE:USED
   Data: d

      AIC       BIC    logLik -2*log(L)  df.resid
    -22.7      -4.5      19.3     -38.7        64

Scaled residuals:
    Min      1Q  Median      3Q     Max
-2.2514 -0.3896 -0.0165  0.2622  4.6236

Random effects:
 Groups   Name        Variance Std.Dev.
 NAME     (Intercept) 0.01739  0.1319
 Residual             0.02618  0.1618
Number of obs: 72, groups:  NAME, 12

Fixed effects:
                      Estimate Std. Error t value
(Intercept)            0.11711    0.04665   2.510
SAMPLE200-100          0.03765    0.06605   0.570
SAMPLE500-200          0.08089    0.06605   1.225
USEDyes                0.33570    0.03814   8.803
SAMPLE200-100:USEDyes  0.11029    0.09341   1.181
SAMPLE500-200:USEDyes  0.06512    0.09341   0.697

Then, we determine what, if anything, we can / need to drop:

drop1(m_03.ml, test="Chisq") %>% data.frame
            npar       AIC      LRT   Pr.Chi.
<none>        NA -22.67822       NA        NA
SAMPLE:USED    2 -23.17829 3.499933 0.1737798

The interaction of SAMPLE:USED is not significant, so we drop it:

summary(m_04.ml <- update( # update
   m_03.ml, .~.            # this to a model w/
   - SAMPLE:USED),         # fixef interaction dropped
   correlation=FALSE)
Linear mixed model fit by maximum likelihood  ['lmerMod']
Formula: OVERLAP.asin ~ SAMPLE + USED + (1 | NAME)
   Data: d

      AIC       BIC    logLik -2*log(L)  df.resid
    -23.2      -9.5      17.6     -35.2        66

Scaled residuals:
    Min      1Q  Median      3Q     Max
-2.1833 -0.4533 -0.0623  0.2739  4.5693

Random effects:
 Groups   Name        Variance Std.Dev.
 NAME     (Intercept) 0.01713  0.1309
 Residual             0.02775  0.1666
Number of obs: 72, groups:  NAME, 12

Fixed effects:
              Estimate Std. Error t value
(Intercept)    0.11711    0.04688   2.498
SAMPLE200-100  0.09279    0.04809   1.930
SAMPLE500-200  0.11345    0.04809   2.359
USEDyes        0.33570    0.03926   8.550
anova(m_03.ml, m_04.ml, test="Chisq") %>% data.frame
        npar       AIC       BIC   logLik X.2.log.L.    Chisq Df Pr..Chisq.
m_04.ml    6 -23.17829 -9.518289 17.58914  -35.17829       NA NA         NA
m_03.ml    8 -22.67822 -4.464889 19.33911  -38.67822 3.499933  2  0.1737798
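The LR-test in this anova output can be reproduced by hand from the two log-likelihoods it reports:

```r
ll_m_04 <- 17.58914 # logLik of the model without the interaction
ll_m_03 <- 19.33911 # logLik of the model with it
(chisq <- 2 * (ll_m_03 - ll_m_04)) # the LRT statistic
pchisq(chisq, df = 2, lower.tail = FALSE) # df = 2 coefficients dropped
# -> the same (n.s.) p-value as drop1/anova report
```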

Can/must we drop even more?

drop1(m_04.ml, test="Chisq") %>% data.frame
       npar       AIC      LRT      Pr.Chi.
<none>   NA -23.17829       NA           NA
SAMPLE    2 -11.08792 16.09036 3.206430e-04
USED      1  22.62703 47.80531 4.707142e-12

Nope, that’s it, we’re done: Our final model is the REML version of m_04.ml:

summary(m_final <- update( # update
   m_04.ml, .~.,           # this to a model w/
   REML=TRUE),             # REML estimation
   correlation=FALSE)
Linear mixed model fit by REML ['lmerMod']
Formula: OVERLAP.asin ~ SAMPLE + USED + (1 | NAME)
   Data: d

REML criterion at convergence: -17.4

Scaled residuals:
    Min      1Q  Median      3Q     Max
-2.1482 -0.4389 -0.0580  0.2695  4.4334

Random effects:
 Groups   Name        Variance Std.Dev.
 NAME     (Intercept) 0.01886  0.1373
 Residual             0.02921  0.1709
Number of obs: 72, groups:  NAME, 12

Fixed effects:
              Estimate Std. Error t value
(Intercept)    0.11711    0.04882   2.399
SAMPLE200-100  0.09279    0.04934   1.881
SAMPLE500-200  0.11345    0.04934   2.299
USEDyes        0.33570    0.04028   8.333

We first generate a summary output of the final model that provides p-values for the fixef coefficients. Note that I am again not fully loading the package lmerTest with library but am using explicit namespacing instead:

summary(lmerTest::lmer(formula(m_final), data=d), correlation=FALSE)
Linear mixed model fit by REML. t-tests use Satterthwaite's method [
lmerModLmerTest]
Formula: formula(m_final)
   Data: d

REML criterion at convergence: -17.4

Scaled residuals:
    Min      1Q  Median      3Q     Max
-2.1482 -0.4389 -0.0580  0.2695  4.4334

Random effects:
 Groups   Name        Variance Std.Dev.
 NAME     (Intercept) 0.01886  0.1373
 Residual             0.02921  0.1709
Number of obs: 72, groups:  NAME, 12

Fixed effects:
              Estimate Std. Error       df t value Pr(>|t|)
(Intercept)    0.11711    0.04882 15.84772   2.399   0.0291 *
SAMPLE200-100  0.09279    0.04934 57.00000   1.881   0.0651 .
SAMPLE500-200  0.11345    0.04934 57.00000   2.299   0.0252 *
USEDyes        0.33570    0.04028 57.00000   8.333 1.93e-11 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
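Since a positive effect of SAMPLE was predicted in advance, a one-tailed reading of these coefficient tests is defensible; it amounts to halving the two-tailed p-values when the estimate goes in the predicted direction. With the p-values from the summary above:

```r
p_twotailed <- c(`SAMPLE200-100` = 0.0651,
                 `SAMPLE500-200` = 0.0252,
                 USEDyes         = 1.93e-11)
(p_onetailed <- p_twotailed / 2) # all estimates are positive, as predicted
# now even SAMPLE200-100 falls below 0.05
```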

With a one-tailed test, which we are entitled to use given that we predicted a positive correlation of SAMPLE with the response, all coefficients are significant. Also, we generate nd again for all combinations of predictors:

nd <- expand.grid(
   SAMPLE=levels(d$SAMPLE), # all levels of SAMPLE
   USED=levels(d$USED))     # all levels of USED
nd <- cbind(nd, PREDS=predict( # note: now we're not using ranefs
   m_final, re.form=NA, newdata=nd)) # for the predictions!
nd$PREDS_backtr <- sin(nd$PREDS)
tapply(nd$PREDS, list(nd$USED, nd$SAMPLE), c) %>% round(4)
       100    200    500
no  0.0174 0.1102 0.2237
yes 0.3531 0.4459 0.5594
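These fixef-only predictions can be reconstructed by hand from the coefficients of m_final. Under successive-difference coding (contr.sdif), the contrast codes for the first level of a 3-level factor are -2/3 and -1/3; a sketch with the coefficients reported above:

```r
b <- c(intercept = 0.11710939, s200_100 = 0.09278913,
       s500_200 = 0.11344854, usedyes = 0.33570366)
# USED=no is the reference level; SAMPLE=100 is coded (-2/3, -1/3)
pred_no_100  <- b[["intercept"]] + (-2/3) * b[["s200_100"]] + (-1/3) * b[["s500_200"]]
pred_yes_100 <- pred_no_100 + b[["usedyes"]] # add the USED effect
round(c(no_100 = pred_no_100, yes_100 = pred_yes_100), 4) # matches the table
```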

1.4 Diagnostics

1.4.1 Exploring residuals generally

The overall distribution of the residuals is not that great (even though we need to bear in mind how small the data set really is):

hist(residuals(m_final), breaks="FD", xlim=c(-1, 1), main="")
Figure 4: The residuals of the current final model

1.4.2 Exploring residuals per kid

Now we see which kid sticks out:

plot(residuals(m_final), type="h", ylim=c(-0.8, 0.8),
     col=rainbow(length(levels(d$NAME)))[as.numeric(d$NAME)]); grid()
text(d$CASE,
     residuals(m_final),
     substr(as.character(d$SAMPLE), 1, 1),
     col=rainbow(length(levels(d$NAME)))[as.numeric(d$NAME)])
text(tapply(d$CASE, d$NAME, mean), 0.2, levels(d$NAME),
     col=rainbow(length(levels(d$NAME)))[seq(length(levels(d$NAME)))])
Figure 5: The residuals of the final model (per child)

Doing some more detailed exploration of whether Ruth’s data ‘do damage’ to the model might be useful …

1.4.3 Exploring residuals against fitted

Let’s also generate the usual diagnostic plot that plot(lm(...)) would generate: residuals vs. fitted. This certainly doesn’t look good: there are Ruth’s data points and – because of them? – a clear curved trend:

plot(m_final, type=c("p", "smooth"), id=0.05)
Figure 6: Residuals against fitted for the final model

One can sometimes also do the following, but in this case this is not useful:

plot(m_final, resid(.) ~ fitted(.) | NAME, abline=c(h=0), lty=1, type=c("p", "smooth"))

1.4.4 Exploring residuals against predictors

Finally, let’s see whether there’s anything noteworthy when we plot residuals against predictors; the residual distribution for USED is imbalanced.

stripchart(residuals(m_final) ~ d$SAMPLE, method="jitter"); grid(); abline(v=0)
stripchart(residuals(m_final) ~ d$USED  , method="jitter"); grid(); abline(v=0)
Figure 7: Residuals against predictors for the final model
Figure 8: Residuals against predictors for the final model

1.4.5 Looking for solutions 1: influence measures

It does seem as if omitting Ruth would change things a bit, but not all that much:

influence.ME::pchange(influence.ME::influence(m_final, group="NAME")) # percentage change
influence.ME::sigtest(influence.ME::influence(m_final, group="NAME")) # does sign change?

But let’s look at the diagnostics if we drop her, which is of course simplistic (because we only refit the final model, whereas omitting her might actually have led to a different final model). Definitely no progress with regard to the curvature:

summary(m_final.noRuth <- lmer(formula(m_final),
   data=d, subset=d$NAME!="Ruth"), correlation=FALSE)
Linear mixed model fit by REML ['lmerMod']
Formula: OVERLAP.asin ~ SAMPLE + USED + (1 | NAME)
   Data: d
 Subset: d$NAME != "Ruth"

REML criterion at convergence: -138.9

Scaled residuals:
     Min       1Q   Median       3Q      Max
-1.90513 -0.62064  0.02778  0.54927  2.66754

Random effects:
 Groups   Name        Variance Std.Dev.
 NAME     (Intercept) 0.002149 0.04635
 Residual             0.004073 0.06382
Number of obs: 66, groups:  NAME, 11

Fixed effects:
              Estimate Std. Error t value
(Intercept)    0.11526    0.01785   6.456
SAMPLE200-100  0.07015    0.01924   3.646
SAMPLE500-200  0.11822    0.01924   6.144
USEDyes        0.25542    0.01571  16.258
plot(m_final.noRuth, type=c("p", "smooth"), id=0.05)
Figure 9: Residuals against fitted for the final model (no Ruth)

Let’s explore in more detail:

par(mfrow=c(1,2))
plot(type="n",
   xlab="Fitted values", x=fitted(m_final),
   ylab="Residuals"    , y=residuals(m_final)); grid(); abline(h=0, lty=2)
text(fitted(m_final), y=residuals(m_final), abbreviate(d$NAME, 2),
     col=rainbow(3)[d$SAMPLE])
text(0.2, 0.6, levels(d$SAMPLE), col=rainbow(3), pos=c(2:4))
plot(type="n",
   xlab="Fitted values", x=fitted(m_final),
   ylab="Residuals"    , y=residuals(m_final)); grid(); abline(h=0, lty=2)
text(fitted(m_final), y=residuals(m_final), abbreviate(d$NAME, 2),
     col=rainbow(2)[d$USED])
text(0.2, 0.6, levels(d$USED), col=rainbow(2), pos=c(1,3))
par(mfrow=c(1,1))
Figure 10: Residuals against fitted for the final model (all names)

It seems as if there is a negative correlation between residuals and fitted values when USED is no, but a positive one when USED is yes.

1.4.6 Looking for solutions 2: curvature

What if we build curvature into the model by making SAMPLE a curved numeric predictor? We start with the minimal ranef structure only:

d$SAMPLE.n <- as.numeric(as.character(d$SAMPLE))
summary(m_curved <- lmer(
   OVERLAP.asin ~ 1 +            # intercept
      poly(SAMPLE.n, 2) + USED + # fixed effects
      (1|NAME),                  # random effects
   data=d), correlation=FALSE)
Linear mixed model fit by REML ['lmerMod']
Formula: OVERLAP.asin ~ 1 + poly(SAMPLE.n, 2) + USED + (1 | NAME)
   Data: d

REML criterion at convergence: -22.7

Scaled residuals:
    Min      1Q  Median      3Q     Max
-2.1482 -0.4389 -0.0580  0.2695  4.4334

Random effects:
 Groups   Name        Variance Std.Dev.
 NAME     (Intercept) 0.01886  0.1373
 Residual             0.02921  0.1709
Number of obs: 72, groups:  NAME, 12

Fixed effects:
                   Estimate Std. Error t value
(Intercept)         0.11711    0.04882   2.399
poly(SAMPLE.n, 2)1  0.69786    0.17091   4.083
poly(SAMPLE.n, 2)2 -0.15845    0.17091  -0.927
USEDyes             0.33570    0.04028   8.333
drop1(m_curved, test="Chisq") %>% data.frame
                  npar       AIC      LRT      Pr.Chi.
<none>              NA -23.17829       NA           NA
poly(SAMPLE.n, 2)    2 -11.08792 16.09036 3.206430e-04
USED                 1  22.62703 47.80531 4.707142e-12

These are actually the same p-values as in m_04.ml because, here, we use the same number of coefficients (2) to study SAMPLE: once with 2 level differences (200-100 and 500-200), once with the 2 polynomial columns. Thus, no real progress. We’ll stick with the current model for now so as not to overcomplicate things.
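This equivalence is easy to demonstrate on hypothetical toy data: with only 3 distinct x-values, a 3-level factor and a degree-2 polynomial span exactly the same model space, so the fitted values (and the 2-df test for the term) coincide:

```r
set.seed(1) # hypothetical toy data with 3 distinct x-values
x <- rep(c(100, 200, 500), each = 8)
y <- 0.1 + 0.001 * x + rnorm(length(x), sd = 0.1)
m_factor <- lm(y ~ factor(x))  # 2 level-difference coefficients
m_poly   <- lm(y ~ poly(x, 2)) # 2 orthogonal-polynomial coefficients
all.equal(fitted(m_factor), fitted(m_poly)) # TRUE: identical fits
```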

1.5 Numerical interpretation

Let’s generate all numerical results again, just to have it all in one place:

summary(lmerTest::lmer(formula(m_final), data=d), correlation=FALSE)
Linear mixed model fit by REML. t-tests use Satterthwaite's method [
lmerModLmerTest]
Formula: formula(m_final)
   Data: d

REML criterion at convergence: -17.4

Scaled residuals:
    Min      1Q  Median      3Q     Max
-2.1482 -0.4389 -0.0580  0.2695  4.4334

Random effects:
 Groups   Name        Variance Std.Dev.
 NAME     (Intercept) 0.01886  0.1373
 Residual             0.02921  0.1709
Number of obs: 72, groups:  NAME, 12

Fixed effects:
              Estimate Std. Error       df t value Pr(>|t|)
(Intercept)    0.11711    0.04882 15.84772   2.399   0.0291 *
SAMPLE200-100  0.09279    0.04934 57.00000   1.881   0.0651 .
SAMPLE500-200  0.11345    0.04934 57.00000   2.299   0.0252 *
USEDyes        0.33570    0.04028 57.00000   8.333 1.93e-11 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
drop1(m_final, test="Chisq") %>% data.frame
       npar       AIC      LRT      Pr.Chi.
<none>   NA -23.17829       NA           NA
SAMPLE    2 -11.08792 16.09036 3.206430e-04
USED      1  22.62703 47.80531 4.707142e-12

The overall significance test of (the fixed effects of) m_final can be conducted via model comparison again: we compare m_final to a model that has the same ranef structure but no fixefs other than the overall intercept. That way, an LRT with anova will determine the p-value of all fixefs together:

m_null_re <- lmer( # create a null model, now with ranefs:
   OVERLAP.asin ~ 1 + # no fixefs but an overall intercept plus
      (1|NAME),       # the ranef structure of the final model
      data=d)
anova(m_final,      # LRT comparison of the final model
      m_null_re,    # with this null model
      test="Chisq") %>% data.frame # with ML! The diff is ***
          npar       AIC       BIC    logLik X.2.log.L.    Chisq Df  Pr..Chisq.
m_null_re    3  26.41766 33.247661 -10.20883   20.41766       NA NA          NA
m_final      6 -23.17829 -9.518289  17.58914  -35.17829 55.59595  3 5.12349e-12

Our three ways to compute confidence intervals:

(m_final_ci.car <- Confint(m_final))
                Estimate        2.5 %    97.5 %
(Intercept)   0.11710939  0.021426880 0.2127919
SAMPLE200-100 0.09278913 -0.003911585 0.1894898
SAMPLE500-200 0.11344854  0.016747831 0.2101493
USEDyes       0.33570366  0.256747864 0.4146595
(m_final_ci.prof <- confint(m_final, method="profile"))
                     2.5 %    97.5 %
.sig01         0.076715866 0.2238922
.sigma         0.140702397 0.2014938
(Intercept)    0.019620444 0.2145983
SAMPLE200-100 -0.002991942 0.1885702
SAMPLE500-200  0.017667474 0.2092296
USEDyes        0.257498749 0.4139086
(m_final_ci.boot <- confint(m_final,
   method="boot", nsim=200,
   parallel="snow", ncpus=10))
                    2.5 %    97.5 %
.sig01         0.06510375 0.2058547
.sigma         0.14113839 0.2082616
(Intercept)    0.03718052 0.2068448
SAMPLE200-100 -0.01634246 0.1882157
SAMPLE500-200  0.03609337 0.2107238
USEDyes        0.24699319 0.4076761
save.image(file="_output/204_06_mfinal2&confints.RData")

Also, we want R2-values, which are all too often not provided (for reasons I don’t understand):

r.squaredGLMM(m_final)
           R2m       R2c
[1,] 0.4267268 0.6516578

Quite nice: a decent R2marginal, and R2conditional doesn’t provide that much more variance explanation on top of it.
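For the record, r.squaredGLMM follows Nakagawa & Schielzeth’s variance-partitioning logic: R2m is the fixef-prediction variance over the total (fixef + ranef + residual) variance, and R2c adds the ranef variance to the numerator. The arithmetic can be checked against the variance components of m_final (a consistency check on the reported numbers, not a new computation):

```r
var_name <- 0.01886; var_resid <- 0.02921 # from the summary of m_final
r2m <- 0.4267268; r2c <- 0.6516578        # as reported by r.squaredGLMM
total   <- var_resid / (1 - r2c) # since 1 - R2c = var_resid / total
var_fix <- r2m * total           # variance of the fixef predictions
c(fixed = var_fix, ranef = var_name, resid = var_resid)
(var_fix + var_name) / total # recovers R2c (up to rounding)
```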

Here’re the most straightforward ways to extract fixefs and ranefs (in a way similar to what coef did for lm/glm):

(fixeffs <- fixef(m_final)) # same as coef(lm(...))
  (Intercept) SAMPLE200-100 SAMPLE500-200       USEDyes
   0.11710939    0.09278913    0.11344854    0.33570366 
(randeffs <- ranef(m_final, condVar=TRUE, drop=TRUE)) # a ranef.mer object
$NAME
       Anne        Aran       Becky        Carl     Dominic        Gail
-0.05352984 -0.06318890 -0.03968140 -0.05170735  0.08763418 -0.05954090
       Joel        John         Liz      Nicole        Ruth      Warren
-0.04123977 -0.05519920 -0.03640479 -0.01619986  0.36714832 -0.03809049
attr(,"postVar")
 [1] 0.003869674 0.003869674 0.003869674 0.003869674 0.003869674 0.003869674
 [7] 0.003869674 0.003869674 0.003869674 0.003869674 0.003869674 0.003869674

with conditional variances for "NAME" 
(randeffs <- ranef(m_final, condVar=TRUE, drop=TRUE)$NAME) # just the NAME adjustments (a named numeric vector, because of drop=TRUE)
       Anne        Aran       Becky        Carl     Dominic        Gail
-0.05352984 -0.06318890 -0.03968140 -0.05170735  0.08763418 -0.05954090
       Joel        John         Liz      Nicole        Ruth      Warren
-0.04123977 -0.05519920 -0.03640479 -0.01619986  0.36714832 -0.03809049
attr(,"postVar")
 [1] 0.003869674 0.003869674 0.003869674 0.003869674 0.003869674 0.003869674
 [7] 0.003869674 0.003869674 0.003869674 0.003869674 0.003869674 0.003869674
VarCorr(m_final)
 Groups   Name        Std.Dev.
 NAME     (Intercept) 0.13734
 Residual             0.17091 
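These variance components also give us the intra-class correlation, i.e. the proportion of the non-fixef variance that is attributable to NAME:

```r
# ICC from the standard deviations reported by VarCorr(m_final):
sd_name  <- 0.13734   # sd of the by-NAME intercept adjustments
sd_resid <- 0.17091   # residual sd
sd_name^2 / (sd_name^2 + sd_resid^2)  # ~0.392
```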

Let’s make ‘predictions’ for all cases based on fixed and random effects:

d$PREDS_ae <- predict( # make d$PREDS_ae the result of predicting
   m_final)            # from m_final

Let’s make ‘predictions’ for all cases based on fixefs only:

d$PREDS_fe <- predict( # make d$PREDS_fe the result of predicting
   m_final,            # from m_final
   re.form=NA)         # use no ranefs
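As a sanity check, these two kinds of predictions differ, per speaker, by exactly that speaker's intercept adjustment. A sketch of that identity using lme4's built-in sleepstudy data (not our d):

```r
library(lme4)
# The difference between ranef-inclusive and fixef-only predictions is
# each subject's intercept adjustment (shown with sleepstudy, not our d):
m <- lmer(Reaction ~ Days + (1|Subject), data=sleepstudy)
p_ae <- predict(m)              # fixefs + ranefs
p_fe <- predict(m, re.form=NA)  # fixefs only
adj  <- ranef(m)$Subject[as.character(sleepstudy$Subject), "(Intercept)"]
all.equal(unname(p_ae - p_fe), adj)  # TRUE
```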

And we check out the effects results: first for SAMPLE, then for USED:

sa_d <- data.frame(      # sa_d the data frame of
   sa <- effect(         # sa, the effect of
      "SAMPLE", m_final))# SAMPLE in m_final
sa_d$fit.backtr <- sin(sa_d$fit)
sa_d
  SAMPLE       fit         se      lower     upper fit.backtr
1    100 0.1852856 0.05281054 0.07990384 0.2906674  0.1842273
2    200 0.2780748 0.05281054 0.17269296 0.3834565  0.2745049
3    500 0.3915233 0.05281054 0.28614151 0.4969051  0.3815969
us_d <- data.frame(     # us_d the data frame of
   us <- effect(        # us, the effect of
      "USED", m_final)) # USED in m_final
us_d$fit.backtr <- sin(us_d$fit)
us_d
  USED       fit        se      lower     upper fit.backtr
1   no 0.1171094 0.0488185 0.01969358 0.2145252  0.1168419
2  yes 0.4528131 0.0488185 0.35539725 0.5502289  0.4374968
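The fit.backtr columns just invert the arcsine transformation of the response (a sketch, assuming OVERLAP.asin was created as asin(OVERLAP), which sin() undoes):

```r
# Round trip of the arcsine transformation (assumption: OVERLAP.asin
# was created as asin(OVERLAP), so sin() is its inverse):
fit <- 0.1852856      # model-scale fit for SAMPLE=100 (from sa_d)
(backtr <- sin(fit))  # 0.1842273, the fit.backtr value above
asin(backtr)          # recovers the model-scale fit
```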

1.6 Visual interpretation

1.6.1 Fixed effects

We settle for the standard effects plot for the fixed effects:

plot(effect("SAMPLE", m_final),
   ylim=c(0, 1.6), grid=TRUE,
   multiline=TRUE, confint=list(style="auto"))
Figure 11: The effect of SAMPLE in the current final model

… then for USED:

plot(effect("USED", m_final),
   ylim=c(0, 1.6), grid=TRUE, x.var="USED",
   multiline=TRUE, confint=list(style="auto"))
Figure 12: The effect of USED in the current final model

1.6.2 Random effects

We also plot the intercept adjustments: nothing new there; we see what we already figured out above:

lattice::dotplot(ranef(m_final, condVar=TRUE))
$NAME
Figure 13: The adjustments to intercepts of the final model

2 Session info

sessionInfo()
R version 4.5.2 (2025-10-31)
Platform: x86_64-pc-linux-gnu
Running under: Pop!_OS 22.04 LTS

Matrix products: default
BLAS:   /usr/lib/x86_64-linux-gnu/openblas-pthread/libblas.so.3
LAPACK: /usr/lib/x86_64-linux-gnu/openblas-pthread/libopenblasp-r0.3.20.so;  LAPACK version 3.10.0

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C
 [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8
 [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8
 [7] LC_PAPER=en_US.UTF-8       LC_NAME=C
 [9] LC_ADDRESS=C               LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C

time zone: America/Los_Angeles
tzcode source: system (glibc)

attached base packages:
[1] stats     graphics  grDevices utils     datasets  compiler  methods
[8] base

other attached packages:
 [1] MuMIn_1.48.11   multcomp_1.4-29 TH.data_1.1-5   MASS_7.3-65
 [5] survival_3.8-6  mvtnorm_1.3-3   lme4_1.1-38     Matrix_1.7-4
 [9] emmeans_2.0.1   effects_4.2-4   car_3.1-5       carData_3.0-6
[13] dplyr_1.2.0     STGmisc_1.06    Rcpp_1.1.1      magrittr_2.0.4

loaded via a namespace (and not attached):
 [1] gtable_0.3.6        xfun_0.56           ggplot2_4.0.2
 [4] htmlwidgets_1.6.4   insight_1.4.6       lattice_0.22-9
 [7] numDeriv_2016.8-1.1 vctrs_0.7.1         tools_4.5.2
[10] Rdpack_2.6.6        generics_0.1.4      parallel_4.5.2
[13] stats4_4.5.2        sandwich_3.1-1      tibble_3.3.1
[16] pkgconfig_2.0.3     RColorBrewer_1.1-3  S7_0.2.1
[19] lifecycle_1.0.5     stringr_1.6.0       farver_2.1.2
[22] mitools_2.4         codetools_0.2-20    lmerTest_3.2-0
[25] survey_4.4-8        htmltools_0.5.9     yaml_2.3.12
[28] Formula_1.2-5       pillar_1.11.1       nloptr_2.2.1
[31] reformulas_0.4.4    boot_1.3-32         abind_1.4-8
[34] nlme_3.1-168        tidyselect_1.2.1    digest_0.6.39
[37] stringi_1.8.7       reshape2_1.4.5      splines_4.5.2
[40] fastmap_1.2.0       grid_4.5.2          colorspace_2.1-2
[43] cli_3.6.5           scales_1.4.0        estimability_1.5.1
[46] rmarkdown_2.30      otel_0.2.0          nnet_7.3-20
[49] zoo_1.8-15          coda_0.19-4.1       evaluate_1.0.5
[52] knitr_1.51          rbibutils_2.4.1     ez_4.4-0
[55] mgcv_1.9-4          rlang_1.1.7         xtable_1.8-4
[58] glue_1.8.0          DBI_1.2.3           rstudioapi_0.18.0
[61] minqa_1.2.8         jsonlite_2.0.0      plyr_1.8.9
[64] R6_2.6.1