Ling 202: session 03: linear regr. 2 (key)

Author
Affiliations

UC Santa Barbara

JLU Giessen

Published

16 Apr 2025 12:34:56

1 Introduction: A multifactorial model selection process

We are dealing with the same data set as last session; as a reminder, the data are in _input/rts.csv and you can find information about the variables/columns in _input/rts.r.

rm(list=ls(all.names=TRUE))
library(car); library(effects); library(emmeans); library(magrittr); library(multcomp)
summary(d <- read.delim(   # summarize d, the result of loading
   file="_input/rts.csv",  # this file
   stringsAsFactors=TRUE)) # change categorical variables into factors
      CASE            RT             LENGTH         LANGUAGE         GROUP
 Min.   :   1   Min.   : 271.0   Min.   :3.000   english:4023   english :3961
 1st Qu.:2150   1st Qu.: 505.0   1st Qu.:4.000   spanish:3977   heritage:4039
 Median :4310   Median : 595.0   Median :5.000
 Mean   :4303   Mean   : 661.5   Mean   :5.198
 3rd Qu.:6450   3rd Qu.: 732.0   3rd Qu.:6.000
 Max.   :8610   Max.   :4130.0   Max.   :9.000
                NA's   :248
      CORRECT        FREQPMW           ZIPFFREQ     SITE
 correct  :7749   Min.   :   1.00   Min.   :3.000   a:2403
 incorrect: 251   1st Qu.:  17.00   1st Qu.:4.230   b:2815
                  Median :  42.00   Median :4.623   c:2782
                  Mean   :  81.14   Mean   :4.591
                  3rd Qu.: 101.00   3rd Qu.:5.004
                  Max.   :1152.00   Max.   :6.061
                                                            
d$LANGUAGE <- factor(d$LANGUAGE,   # for didactic reasons only, I am
   levels=levels(d$LANGUAGE)[2:1]) # changing the order of the levels
d$RT_log <- log2(d$RT)
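
By the way, the summary above already shows that RT is the only column with missing values (248 NAs); if you want to confirm that programmatically, a minimal sketch:

colSums(is.na(d)) # count the NAs per column: only RT should return 248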

However, this session we will deal with this in a multifactorial way. Specifically, we are asking: does the (logged) reaction time to a word (in ms) vary as a function of

  • the Zipf frequency of that word (ZIPFFREQ);
  • the length of that word (LENGTH);
  • the language that word was presented in (LANGUAGE: spanish vs. english);
  • the speaker group that word was presented to (GROUP: english vs. heritage);
  • any pairwise interaction of these predictors (see the formula sketch right after this list)? (Specifically, we are expecting that the effect of LENGTH will be stronger when LANGUAGE is spanish.)
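
In R's model formula notation, "all pairwise interactions" of a set of predictors can be requested with the ^2 shorthand; here is a minimal sketch of what that shorthand expands to (just a check on a bare formula, not yet a fitted model):

attr(                                           # extract the attribute
   terms(~ (ZIPFFREQ+LANGUAGE+GROUP+LENGTH)^2), # of this expanded formula
   "term.labels") # that lists its terms: 4 main effects & 6 pairwise interactions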

2 Exploration & preparation

Since there are some missing data but only in the response variable, we immediately reduce d to all the complete cases a regression would consider:

summary(d <- droplevels(d[complete.cases(d),]))
      CASE            RT             LENGTH       LANGUAGE         GROUP
 Min.   :   1   Min.   : 271.0   Min.   :3.0   spanish:3775   english :3793
 1st Qu.:2123   1st Qu.: 505.0   1st Qu.:4.0   english:3977   heritage:3959
 Median :4262   Median : 595.0   Median :5.0
 Mean   :4268   Mean   : 661.5   Mean   :5.2
 3rd Qu.:6405   3rd Qu.: 732.0   3rd Qu.:6.0
 Max.   :8610   Max.   :4130.0   Max.   :9.0
      CORRECT        FREQPMW           ZIPFFREQ     SITE         RT_log
 correct  :7749   Min.   :   1.00   Min.   :3.000   a:2403   Min.   : 8.082
 incorrect:   3   1st Qu.:  18.00   1st Qu.:4.255   b:2815   1st Qu.: 8.980
                  Median :  43.00   Median :4.633   c:2534   Median : 9.217
                  Mean   :  82.88   Mean   :4.609            Mean   : 9.289
                  3rd Qu.: 104.00   3rd Qu.:5.017            3rd Qu.: 9.516
                  Max.   :1152.00   Max.   :6.061            Max.   :12.012  

Some exploration of the relevant variables; in the interest of time we only look at univariate and monofactorial stuff – a big ‘no no’ for any real analysis! How would we want to explore the response variable?

hist(d$RT_log); hist(d$RT_log, breaks="FD")

How would we want to explore the predictors and their relations to the response?

# the predictors on their own
hist(d$ZIPFFREQ); hist(d$ZIPFFREQ, breaks="FD")

hist(d$LENGTH); hist(d$LENGTH, breaks="FD")

table(d$LANGUAGE)

spanish english
   3775    3977 
table(d$GROUP)

 english heritage
    3793     3959 
# the predictor(s) w/ the response
plot(main="RT (in ms, logged) as a function of Zipf frequency",
   pch=16, col="#00000020",
   xlab="Zipf frequency"    , x=d$ZIPFFREQ,
   ylab="RT (in ms, logged)", y=d$RT_log); grid()
   abline(lm(d$RT_log ~ d$ZIPFFREQ), col="green", lwd=2)

plot(main="RT (in ms, logged) as a function of length in characters",
   pch=16, col="#00000020",
   xlab="Length"            , x=jitter(d$LENGTH),
   ylab="RT (in ms, logged)", y=d$RT_log); grid()
   abline(lm(d$RT_log ~ d$LENGTH), col="green", lwd=2)

qwe <- boxplot(main="RT (in ms, logged) as a function of stimulus language",
   d$RT_log ~ d$LANGUAGE, notch=TRUE,
   xlab="Stimulus language",
   ylab="RT (in ms, logged)"); grid()

asd <- boxplot(main="RT (in ms, logged) as a function of speaker group",
   d$RT_log ~ d$GROUP, notch=TRUE,
   xlab="Speaker group",
   ylab="RT (in ms, logged)"); grid()

table(d$LANGUAGE, d$GROUP)

          english heritage
  spanish    1805     1970
  english    1988     1989

Nothing super problematic, it seems …
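
One thing a real multifactorial exploration would also include is how the predictors are related to each other; a minimal sketch of two such checks (not part of the actual walk-through here):

cor(d$ZIPFFREQ, d$LENGTH)        # correlation of the two numeric predictors
tapply(d$ZIPFFREQ,               # mean Zipf frequency
   list(d$LANGUAGE, d$GROUP),    # for each LANGUAGE-GROUP combination
   mean)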

3 Modeling & numerical interpretation

Let’s fit and evaluate our initial regression model:

summary(m_01 <- lm(       # make/summarize the linear model m_01:
   RT_log ~ 1 +           # RT_log ~ an overall intercept (1)
   # & these predictors & their pairwise interactions
   (ZIPFFREQ+LANGUAGE+GROUP+LENGTH)^2,
   data=d,                # those vars are in d
   na.action=na.exclude)) # skip cases with NA/missing data (good habit)

Call:
lm(formula = RT_log ~ 1 + (ZIPFFREQ + LANGUAGE + GROUP + LENGTH)^2,
    data = d, na.action = na.exclude)

Residuals:
     Min       1Q   Median       3Q      Max
-1.25135 -0.27954 -0.06143  0.20982  2.55535

Coefficients:
                               Estimate Std. Error t value Pr(>|t|)
(Intercept)                    9.315489   0.222360  41.894  < 2e-16 ***
ZIPFFREQ                      -0.003958   0.048821  -0.081  0.93538
LANGUAGEenglish               -0.308147   0.103832  -2.968  0.00301 **
GROUPheritage                 -0.103733   0.100623  -1.031  0.30262
LENGTH                         0.124494   0.038051   3.272  0.00107 **
ZIPFFREQ:LANGUAGEenglish       0.010640   0.018610   0.572  0.56751
ZIPFFREQ:GROUPheritage        -0.001619   0.017674  -0.092  0.92702
ZIPFFREQ:LENGTH               -0.019935   0.008436  -2.363  0.01814 *
LANGUAGEenglish:GROUPheritage  0.206348   0.019821  10.410  < 2e-16 ***
LANGUAGEenglish:LENGTH        -0.018155   0.010490  -1.731  0.08355 .
GROUPheritage:LENGTH           0.001545   0.009436   0.164  0.86999
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 0.4234 on 7741 degrees of freedom
Multiple R-squared:  0.1227,    Adjusted R-squared:  0.1216
F-statistic: 108.3 on 10 and 7741 DF,  p-value: < 2.2e-16
drop1(m_01,  # drop each droppable predictor at a time from m_01 &
   test="F") # test its significance w/ an F-test
Single term deletions

Model:
RT_log ~ 1 + (ZIPFFREQ + LANGUAGE + GROUP + LENGTH)^2
                  Df Sum of Sq    RSS    AIC  F value  Pr(>F)
<none>                         1387.8 -13313
ZIPFFREQ:LANGUAGE  1    0.0586 1387.9 -13315   0.3269 0.56751
ZIPFFREQ:GROUP     1    0.0015 1387.8 -13315   0.0084 0.92702
ZIPFFREQ:LENGTH    1    1.0013 1388.8 -13310   5.5850 0.01814 *
LANGUAGE:GROUP     1   19.4296 1407.2 -13207 108.3753 < 2e-16 ***
LANGUAGE:LENGTH    1    0.5370 1388.3 -13312   2.9953 0.08355 .
GROUP:LENGTH       1    0.0048 1387.8 -13315   0.0268 0.86999
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Let’s fit the 2nd model with that interaction ZIPFFREQ:GROUP deleted:

summary(m_02 <- update( # make m_02 an update of
   m_01, .~.            # m_01, namely all of it (.~.), but then
   - ZIPFFREQ:GROUP))   # remove this interaction

Call:
lm(formula = RT_log ~ ZIPFFREQ + LANGUAGE + GROUP + LENGTH +
    ZIPFFREQ:LANGUAGE + ZIPFFREQ:LENGTH + LANGUAGE:GROUP + LANGUAGE:LENGTH +
    GROUP:LENGTH, data = d, na.action = na.exclude)

Residuals:
     Min       1Q   Median       3Q      Max
-1.25164 -0.27940 -0.06149  0.20977  2.55564

Coefficients:
                               Estimate Std. Error t value Pr(>|t|)
(Intercept)                    9.319706   0.217529  42.844  < 2e-16 ***
ZIPFFREQ                      -0.004823   0.047897  -0.101  0.91980
LANGUAGEenglish               -0.308391   0.103791  -2.971  0.00297 **
GROUPheritage                 -0.111562   0.053096  -2.101  0.03566 *
LENGTH                         0.124438   0.038044   3.271  0.00108 **
ZIPFFREQ:LANGUAGEenglish       0.010686   0.018602   0.574  0.56567
ZIPFFREQ:LENGTH               -0.019933   0.008435  -2.363  0.01814 *
LANGUAGEenglish:GROUPheritage  0.206289   0.019810  10.414  < 2e-16 ***
LANGUAGEenglish:LENGTH        -0.018146   0.010489  -1.730  0.08368 .
GROUPheritage:LENGTH           0.001621   0.009399   0.172  0.86311
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 0.4234 on 7742 degrees of freedom
Multiple R-squared:  0.1227,    Adjusted R-squared:  0.1217
F-statistic: 120.3 on 9 and 7742 DF,  p-value: < 2.2e-16
anova(m_01, m_02, # compare m_01 to m_02
   test="F")      # using an F-test
Analysis of Variance Table

Model 1: RT_log ~ 1 + (ZIPFFREQ + LANGUAGE + GROUP + LENGTH)^2
Model 2: RT_log ~ ZIPFFREQ + LANGUAGE + GROUP + LENGTH + ZIPFFREQ:LANGUAGE +
    ZIPFFREQ:LENGTH + LANGUAGE:GROUP + LANGUAGE:LENGTH + GROUP:LENGTH
  Res.Df    RSS Df  Sum of Sq      F Pr(>F)
1   7741 1387.8
2   7742 1387.8 -1 -0.0015043 0.0084  0.927
drop1(m_02,  # drop each droppable predictor at a time from m_02 &
   test="F") # test its significance w/ an F-test
Single term deletions

Model:
RT_log ~ ZIPFFREQ + LANGUAGE + GROUP + LENGTH + ZIPFFREQ:LANGUAGE +
    ZIPFFREQ:LENGTH + LANGUAGE:GROUP + LANGUAGE:LENGTH + GROUP:LENGTH
                  Df Sum of Sq    RSS    AIC  F value  Pr(>F)
<none>                         1387.8 -13315
ZIPFFREQ:LANGUAGE  1    0.0592 1387.9 -13317   0.3300 0.56567
ZIPFFREQ:LENGTH    1    1.0011 1388.8 -13312   5.5846 0.01814 *
LANGUAGE:GROUP     1   19.4391 1407.3 -13209 108.4418 < 2e-16 ***
LANGUAGE:LENGTH    1    0.5365 1388.3 -13314   2.9928 0.08368 .
GROUP:LENGTH       1    0.0053 1387.8 -13317   0.0297 0.86311
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Let’s fit the 3rd model with the ns interaction GROUP:LENGTH deleted:

summary(m_03 <- update( # make m_03 an update of
   m_02, .~.            # m_02, namely all of it (.~.), but then
   - GROUP:LENGTH))     # remove this interaction

Call:
lm(formula = RT_log ~ ZIPFFREQ + LANGUAGE + GROUP + LENGTH +
    ZIPFFREQ:LANGUAGE + ZIPFFREQ:LENGTH + LANGUAGE:GROUP + LANGUAGE:LENGTH,
    data = d, na.action = na.exclude)

Residuals:
     Min       1Q   Median       3Q      Max
-1.25124 -0.27896 -0.06132  0.20991  2.55518

Coefficients:
                               Estimate Std. Error t value Pr(>|t|)
(Intercept)                    9.314561   0.215459  43.231  < 2e-16 ***
ZIPFFREQ                      -0.004710   0.047890  -0.098 0.921650
LANGUAGEenglish               -0.307637   0.103692  -2.967 0.003018 **
GROUPheritage                 -0.102721   0.013799  -7.444 1.08e-13 ***
LENGTH                         0.125358   0.037666   3.328 0.000878 ***
ZIPFFREQ:LANGUAGEenglish       0.010645   0.018600   0.572 0.567111
ZIPFFREQ:LENGTH               -0.019949   0.008434  -2.365 0.018038 *
LANGUAGEenglish:GROUPheritage  0.205486   0.019254  10.673  < 2e-16 ***
LANGUAGEenglish:LENGTH        -0.018175   0.010487  -1.733 0.083126 .
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 0.4234 on 7743 degrees of freedom
Multiple R-squared:  0.1227,    Adjusted R-squared:  0.1218
F-statistic: 135.4 on 8 and 7743 DF,  p-value: < 2.2e-16
anova(m_02, m_03, # compare m_02 to m_03
   test="F")      # using an F-test
Analysis of Variance Table

Model 1: RT_log ~ ZIPFFREQ + LANGUAGE + GROUP + LENGTH + ZIPFFREQ:LANGUAGE +
    ZIPFFREQ:LENGTH + LANGUAGE:GROUP + LANGUAGE:LENGTH + GROUP:LENGTH
Model 2: RT_log ~ ZIPFFREQ + LANGUAGE + GROUP + LENGTH + ZIPFFREQ:LANGUAGE +
    ZIPFFREQ:LENGTH + LANGUAGE:GROUP + LANGUAGE:LENGTH
  Res.Df    RSS Df  Sum of Sq      F Pr(>F)
1   7742 1387.8
2   7743 1387.8 -1 -0.0053296 0.0297 0.8631
drop1(m_03,  # drop each droppable predictor at a time from m_03 &
   test="F") # test its significance w/ an F-test
Single term deletions

Model:
RT_log ~ ZIPFFREQ + LANGUAGE + GROUP + LENGTH + ZIPFFREQ:LANGUAGE +
    ZIPFFREQ:LENGTH + LANGUAGE:GROUP + LANGUAGE:LENGTH
                  Df Sum of Sq    RSS    AIC  F value  Pr(>F)
<none>                         1387.8 -13317
ZIPFFREQ:LANGUAGE  1    0.0587 1387.9 -13319   0.3276 0.56711
ZIPFFREQ:LENGTH    1    1.0028 1388.8 -13314   5.5948 0.01804 *
LANGUAGE:GROUP     1   20.4159 1408.2 -13206 113.9055 < 2e-16 ***
LANGUAGE:LENGTH    1    0.5383 1388.4 -13316   3.0035 0.08313 .
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Let’s fit the 4th model with the ns interaction ZIPFFREQ:LANGUAGE deleted:

summary(m_04 <- update(  # make m_04 an update of
   m_03, .~.             # m_03, namely all of it (.~.), but then
   - ZIPFFREQ:LANGUAGE)) # remove this interaction

Call:
lm(formula = RT_log ~ ZIPFFREQ + LANGUAGE + GROUP + LENGTH +
    ZIPFFREQ:LENGTH + LANGUAGE:GROUP + LANGUAGE:LENGTH, data = d,
    na.action = na.exclude)

Residuals:
     Min       1Q   Median       3Q      Max
-1.24964 -0.27871 -0.06224  0.21043  2.55420

Coefficients:
                               Estimate Std. Error t value Pr(>|t|)
(Intercept)                    9.259652   0.192913  47.999  < 2e-16 ***
ZIPFFREQ                       0.007370   0.042986   0.171 0.863875
LANGUAGEenglish               -0.257516   0.055524  -4.638 3.58e-06 ***
GROUPheritage                 -0.102590   0.013796  -7.436 1.15e-13 ***
LENGTH                         0.131719   0.035988   3.660 0.000254 ***
ZIPFFREQ:LENGTH               -0.021360   0.008065  -2.648 0.008106 **
LANGUAGEenglish:GROUPheritage  0.205373   0.019252  10.668  < 2e-16 ***
LANGUAGEenglish:LENGTH        -0.018385   0.010480  -1.754 0.079429 .
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 0.4233 on 7744 degrees of freedom
Multiple R-squared:  0.1227,    Adjusted R-squared:  0.1219
F-statistic: 154.7 on 7 and 7744 DF,  p-value: < 2.2e-16
anova(m_03, m_04, # compare m_03 to m_04
   test="F")      # using an F-test
Analysis of Variance Table

Model 1: RT_log ~ ZIPFFREQ + LANGUAGE + GROUP + LENGTH + ZIPFFREQ:LANGUAGE +
    ZIPFFREQ:LENGTH + LANGUAGE:GROUP + LANGUAGE:LENGTH
Model 2: RT_log ~ ZIPFFREQ + LANGUAGE + GROUP + LENGTH + ZIPFFREQ:LENGTH +
    LANGUAGE:GROUP + LANGUAGE:LENGTH
  Res.Df    RSS Df Sum of Sq      F Pr(>F)
1   7743 1387.8
2   7744 1387.9 -1 -0.058712 0.3276 0.5671
drop1(m_04,  # drop each droppable predictor at a time from m_04 &
   test="F") # test its significance w/ an F-test
Single term deletions

Model:
RT_log ~ ZIPFFREQ + LANGUAGE + GROUP + LENGTH + ZIPFFREQ:LENGTH +
    LANGUAGE:GROUP + LANGUAGE:LENGTH
                Df Sum of Sq    RSS    AIC  F value    Pr(>F)
<none>                       1387.9 -13319
ZIPFFREQ:LENGTH  1    1.2570 1389.1 -13314   7.0135  0.008106 **
LANGUAGE:GROUP   1   20.3956 1408.3 -13208 113.8018 < 2.2e-16 ***
LANGUAGE:LENGTH  1    0.5515 1388.4 -13318   3.0773  0.079429 .
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

We're done now – why? Remember that we said "we are expecting that the effect of LENGTH will be stronger when LANGUAGE is spanish", i.e. we have a directional hypothesis about LANGUAGE:LENGTH. That means that if the coefficient for LANGUAGEenglish:LENGTH is negative – which it is – we can halve its (two-tailed) p-value, and then that interaction is in fact significant! We call the final model m_final (for easy copying and pasting across analyses) and do some housekeeping:

m_final <- m_04; rm(m_04); invisible(gc())
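
As a quick numeric check of that directional-hypothesis reasoning, we can halve the two-tailed p-value of LANGUAGEenglish:LENGTH from the summary of m_final; a minimal sketch:

summary(m_final)$coefficients[ # from the coefficient table of m_final,
   "LANGUAGEenglish:LENGTH",   # take this term's
   "Pr(>|t|)"] / 2             # two-tailed p-value & halve it: approx. 0.0397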

Normally, we would do some regression diagnostics now – I have to skip those here to leave time for discussing how to interpret the model, but we will return to this later! For now, we compute the confidence intervals …

Confint(m_final)
                                  Estimate       2.5 %       97.5 %
(Intercept)                    9.259651692  8.88149065  9.637812729
ZIPFFREQ                       0.007369857 -0.07689353  0.091633246
LANGUAGEenglish               -0.257516141 -0.36635901 -0.148673274
GROUPheritage                 -0.102589569 -0.12963362 -0.075545520
LENGTH                         0.131719004  0.06117362  0.202264390
ZIPFFREQ:LENGTH               -0.021359653 -0.03717004 -0.005549264
LANGUAGEenglish:GROUPheritage  0.205372665  0.16763422  0.243111112
LANGUAGEenglish:LENGTH        -0.018384612 -0.03892847  0.002159247

… , add the predictions of the model to the original data frame:

d$PRED <-   # make d$PRED the
   predict( # predictions for all cases
   m_final) # based on m_final
head(d) # check the result
  CASE   RT LENGTH LANGUAGE    GROUP CORRECT FREQPMW ZIPFFREQ SITE    RT_log
1    1  748      5  spanish heritage correct      19 4.278754    b  9.546894
2    2  408      5  spanish heritage correct      78 4.892095    b  8.672425
3    3  483      6  spanish heritage correct     233 5.367356    b  8.915879
4    4 1266      4  spanish heritage correct     133 5.123852    b 10.306062
5    5  636      6  spanish heritage correct      14 4.146128    b  9.312883
6    6  395      6  spanish heritage correct      67 4.826075    b  8.625709
      PRED
1 9.390227
2 9.329244
3 9.299064
4 9.283925
5 9.446573
6 9.364444

…, and generate an often super-useful kind of data frame that I always call nd (for new data), which contains all combinations of

  • all levels of all categorical predictors and
  • useful values of all numeric predictors (where ‘useful’ is usually 0, 1 and the mean):
nd <- expand.grid(
   ZIPFFREQ=c(0, 1, mean(d$ZIPFFREQ)),
   LENGTH  =c(0, 1, mean(d$LENGTH)),
   GROUP   =levels(d$GROUP),
   LANGUAGE=levels(d$LANGUAGE))

To that, we add the predictions the model makes for all these combinations and give it some nice(r) names:

nd <- cbind(nd,
   predict(m_final, newdata=nd, interval="confidence"))
names(nd)[5:7] <- c("PRED", "PRED_LOW", "PRED_UPP")
head(nd)
  ZIPFFREQ LENGTH   GROUP LANGUAGE     PRED PRED_LOW PRED_UPP
1 0.000000      0 english  spanish 9.259652 8.881491 9.637813
2 1.000000      0 english  spanish 9.267022 8.971578 9.562465
3 4.609476      0 english  spanish 9.293623 9.228019 9.359227
4 0.000000      1 english  spanish 9.391371 9.081696 9.701045
5 1.000000      1 english  spanish 9.377381 9.135256 9.619506
6 4.609476      1 english  spanish 9.326885 9.272164 9.381606

A maybe (!) nicer way to represent this is the following, but whether it really is nicer is probably subjective:

with(nd,    # use the data in nd and
   tapply(  # apply
      PRED, # to the predicted reaction times
      # a grouping like this: the numerics in the rows & columns, the factors in the layers/slices
      list(round(ZIPFFREQ, 2), round(LENGTH, 2), "GROUP"=GROUP, "LANGUAGE"=LANGUAGE),
      # just list the values, but rounded to 2 decimals
      c)) %>% round(2)
, , GROUP = english, LANGUAGE = spanish


          0    1  5.2
  0    9.26 9.39 9.94
  1    9.27 9.38 9.84
  4.61 9.29 9.33 9.47

, , GROUP = heritage, LANGUAGE = spanish


          0    1  5.2
  0    9.16 9.29 9.84
  1    9.16 9.27 9.74
  4.61 9.19 9.22 9.36

, , GROUP = english, LANGUAGE = english


          0    1  5.2
  0    9.00 9.12 9.59
  1    9.01 9.10 9.49
  4.61 9.04 9.05 9.11

, , GROUP = heritage, LANGUAGE = english


          0    1  5.2
  0    9.10 9.22 9.69
  1    9.11 9.20 9.59
  4.61 9.14 9.15 9.22
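
Remember that all these predictions are on the log2 scale (RT_log was created as log2(RT)); to express any of them in milliseconds again, we just invert the transformation – a minimal sketch:

2^9.29         # one of the values from the table above: approx. 626 ms
head(2^d$PRED) # the first few case-by-case predictions, back-transformed to ms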

4 Visual interpretation

4.1 LANGUAGE:GROUP

Let’s visualize the nature of the effect of LANGUAGE:GROUP based on the predictions from effects:

lagr_d <- data.frame(   # make lagr_d a data frame of
   lagr <- effect(      # the effect
      "LANGUAGE:GROUP", # of LANGUAGE:GROUP
      m_final))         # in m_final
plot(lagr,              # plot the effect of the interaction
   ylim=range(d$RT_log, na.rm=TRUE), # w/ these y-axis limits
   grid=TRUE)           # & w/ a grid

plot(lagr,              # plot the effect of the interaction
   ylim=range(d$RT_log, na.rm=TRUE), # w/ these y-axis limits
   x.var="GROUP",       # w/ this predictor on the x-axis
   grid=TRUE)           # & w/ a grid

And here’s a version (of the first of the two effects plots) with everything: observed values, predicted values, and the confidence intervals of the predictions:

# split data & predictions up by levels of GROUP:
d_split <- split(d, d$GROUP)
lagr_d_split <- split(lagr_d, lagr_d$GROUP, drop=TRUE)
par(mfrow=c(1, 2))
stripchart(main="RT ~ LANGUAGE for GROUP='english'",  # stripchart w/ main heading
   pch=16, col="#00000020",                           # filled semi-transparent grey circles
   d_split$english$RT_log ~ d_split$english$LANGUAGE, # data: RT_log ~ LANG (for eng)
   xlab="RT (in ms, logged)", ylab="Language",        # axis labels
   vertical=TRUE, method="jitter"); grid()            # vertical plot & jittered observations, add grid
# add predicted means:
points(seq(lagr_d_split$english$fit), lagr_d_split$english$fit, pch="X", col=c("red", "blue"))
# add confidence intervals:
for (i in seq(lagr_d_split$english$fit)) {  # as many arrows as needed
   arrows(i, lagr_d_split$english$lower[i], # each from here
          i, lagr_d_split$english$upper[i], # to here
          code=3, angle=90, col=c("red", "blue")[i])      # both tips w/ 90 degrees
}
stripchart(main="RT ~ LANGUAGE for GROUP='heritage'",   # stripchart w/ main heading
   pch=16, col="#00000020",                             # filled semi-transparent grey circles
   d_split$heritage$RT_log ~ d_split$heritage$LANGUAGE, # data: RT_log ~ LANG (for her)
   xlab="RT (in ms, logged)", ylab="Language",          # axis labels
   vertical=TRUE, method="jitter"); grid()              # vertical plot & jittered observations, add grid
# add predicted means:
points(seq(lagr_d_split$heritage$fit), lagr_d_split$heritage$fit, pch="X", col=c("red", "blue"))
# add confidence intervals:
for (i in seq(lagr_d_split$heritage$fit)) {  # as many arrows as needed
   arrows(i, lagr_d_split$heritage$lower[i], # each from here
          i, lagr_d_split$heritage$upper[i], # to here
          code=3, angle=90, col=c("red", "blue")[i])      # both tips w/ 90 degrees
}
par(mfrow=c(1, 1))
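
Since we loaded emmeans at the beginning, a numeric complement to these plots would be the predicted means & pairwise contrasts for the four LANGUAGE-GROUP combinations; a minimal sketch (the numeric predictors are held at their means), not part of the original walk-through:

emmeans(m_final,              # from the final model, get the estimated marginal means
   pairwise ~ LANGUAGE*GROUP) # & all pairwise contrasts for the LANGUAGE-GROUP combinations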

4.2 LANGUAGE:LENGTH

Let’s visualize the nature of the effect of LANGUAGE:LENGTH based on the predictions from effects:

(lale_d <- data.frame(   # make lale_d a data frame of
   lale <- effect(       # the effect
      "LANGUAGE:LENGTH", # of LANGUAGE:LENGTH
      m_final,           # in m_final
      xlevels=list(      # for when
      LENGTH=3:9))))     # LENGTH is these values
   LANGUAGE LENGTH      fit          se    lower    upper
1   spanish      3 9.341016 0.015864551 9.309917 9.372115
2   english      3 9.133232 0.018124463 9.097703 9.168760
3   spanish      4 9.374278 0.010881934 9.352947 9.395610
4   english      4 9.148109 0.010668181 9.127197 9.169022
5   spanish      5 9.407541 0.007359139 9.393115 9.421967
6   english      5 9.162987 0.006742156 9.149770 9.176203
7   spanish      6 9.440803 0.007674979 9.425758 9.455848
8   english      6 9.177864 0.011110169 9.156085 9.199643
9   spanish      7 9.474065 0.011517882 9.451487 9.496643
10  english      7 9.192742 0.018647994 9.156187 9.229297
11  spanish      8 9.507327 0.016595958 9.474795 9.539860
12  english      8 9.207619 0.026802838 9.155079 9.260160
13  spanish      9 9.540589 0.022072562 9.497321 9.583858
14  english      9 9.222497 0.035147819 9.153598 9.291396
plot(lale,               # plot the effect of the interaction
   ylim=range(d$RT_log, na.rm=TRUE), # w/ these y-axis limits
   grid=TRUE)            # & w/ a grid

And here's a version with everything: observed values, predicted values (red for LANGUAGE: english, blue for LANGUAGE: spanish), and the confidence intervals of the predictions (I'm building this up very much step by step, more so than I might in actual research):

d_split <- split(d, d$LANGUAGE)
lale_d_split <- split(lale_d, lale_d$LANGUAGE, drop=TRUE)
plot(main="RT ~ LANGUAGE:LENGTH", type="n", # plot nothing
   xlab="Length"  , x=d$LENGTH,             # x-axis
   ylab="RT in ms", y=d$RT_log); grid()     # y-axis & grid
   # plot observed data points for LANGUAGE being eng:
   points(jitter(d_split$eng$LENGTH), d_split$eng$RT_log, pch=16, col="#FF000010")
   lines(lale_d_split$eng$LENGTH, lale_d_split$eng$fit, col="red")
   # plot observed data points for LANGUAGE being spa:
   points(jitter(d_split$spa$LENGTH), d_split$spa$RT_log, pch=16, col="#0000FF10")
   lines(lale_d_split$spa$LENGTH, lale_d_split$spa$fit, col="blue")
   # plot confidence band for LANGUAGE being eng:
   polygon(col="#FF000030", border="red", # semi-transparent grey, for once w/ border
      x=c(   lale_d_split$eng$LENGTH,   # x-coordinates left to right, then
         rev(lale_d_split$eng$LENGTH)), # the same values back (in reverse order)
      y=c(   lale_d_split$eng$lower,    # y-coordinates lower boundary
         rev(lale_d_split$eng$upper)))  # y-coordinates upper boundary
   # plot confidence band for LANGUAGE being spa:
   polygon(col="#0000FF30", border="blue", # semi-transparent grey, for once w/ border
      x=c(   lale_d_split$spa$LENGTH,   # x-coordinates left to right, then
         rev(lale_d_split$spa$LENGTH)), # the same values back (in reverse order)
      y=c(   lale_d_split$spa$lower,    # y-coordinates lower boundary
         rev(lale_d_split$spa$upper)))  # y-coordinates upper boundary
   # plot legend
   legend(title="Language", bty="n",
      x=6, xjust=0.5, y=11.5, yjust=0.5, ncol=2,
      legend=levels(d$LANGUAGE), fill=c("blue", "red")) # levels are (spanish, english): blue & red to match the points

4.3 ZIPFFREQ:LENGTH

Let’s visualize the nature of the effect of ZIPFFREQ:LENGTH based on the predictions from effects:

(zile_d <- data.frame(   # make zile_d a data frame of
   zile <- effect(       # the effect
      "ZIPFFREQ:LENGTH", # of ZIPFFREQ:LENGTH
      m_final,           # in m_final
      xlevels=list(      # for when
      ZIPFFREQ=seq(3, 6, 0.25), # ZIPFFREQ is these values
      LENGTH=seq(3, 9, 0.5))))) # LENGTH is these values
    ZIPFFREQ LENGTH      fit          se    lower    upper
1       3.00    3.0 9.325689 0.034081672 9.258879 9.392498
2       3.25    3.0 9.311511 0.029465917 9.253750 9.369273
3       3.50    3.0 9.297334 0.024992078 9.248343 9.346325
4       3.75    3.0 9.283157 0.020752145 9.242477 9.323837
5       4.00    3.0 9.268980 0.016922850 9.235806 9.302153
6       4.25    3.0 9.254802 0.013849115 9.227654 9.281950
7       4.50    3.0 9.240625 0.012120106 9.216866 9.264384
8       4.75    3.0 9.226448 0.012315833 9.202305 9.250590
9       5.00    3.0 9.212270 0.014357798 9.184125 9.240416
10      5.25    3.0 9.198093 0.017615243 9.163563 9.232624
11      5.50    3.0 9.183916 0.021543700 9.141684 9.226147
12      5.75    3.0 9.169739 0.025838906 9.119087 9.220390
13      6.00    3.0 9.155561 0.030345526 9.096076 9.215047
14      3.00    3.5 9.354793 0.028132740 9.299645 9.409941
15      3.25    3.5 9.337945 0.024317877 9.290276 9.385615
16      3.50    3.5 9.321098 0.020614716 9.280688 9.361509
17      3.75    3.5 9.304251 0.017096000 9.270738 9.337764
18      4.00    3.5 9.287404 0.013902490 9.260151 9.314656
19      4.25    3.5 9.270557 0.011313024 9.248380 9.292733
20      4.50    3.5 9.253709 0.009817796 9.234464 9.272955
21      4.75    3.5 9.236862 0.009924365 9.217408 9.256317
22      5.00    3.5 9.220015 0.011588625 9.197298 9.242732
23      5.25    3.5 9.203168 0.014275809 9.175183 9.231152
24      5.50    3.5 9.186320 0.017521431 9.151974 9.220667
25      5.75    3.5 9.169473 0.021068974 9.128172 9.210774
26      6.00    3.5 9.152626 0.024789152 9.104033 9.201219
27      3.00    4.0 9.383897 0.022669802 9.339458 9.428336
28      3.25    4.0 9.364380 0.019588537 9.325981 9.402778
29      3.50    4.0 9.344862 0.016591628 9.312338 9.377387
30      3.75    4.0 9.325345 0.013734406 9.298422 9.352268
31      4.00    4.0 9.305828 0.011125024 9.284020 9.327636
32      4.25    4.0 9.286311 0.008982143 9.268703 9.303918
33      4.50    4.0 9.266794 0.007705329 9.251689 9.281898
34      4.75    4.0 9.247277 0.007736004 9.232112 9.262441
35      5.00    4.0 9.227759 0.009060898 9.209998 9.245521
36      5.25    4.0 9.208242 0.011230960 9.186226 9.230258
37      5.50    4.0 9.188725 0.013854585 9.161566 9.215884
38      5.75    4.0 9.169208 0.016719601 9.136433 9.201983
39      6.00    4.0 9.149691 0.019721081 9.111032 9.188349
40      3.00    4.5 9.413001 0.018137448 9.377447 9.448555
41      3.25    4.5 9.390814 0.015661904 9.360112 9.421515
42      3.50    4.5 9.368627 0.013249556 9.342654 9.394599
43      3.75    4.5 9.346439 0.010942283 9.324990 9.367889
44      4.00    4.5 9.324252 0.008822907 9.306957 9.341548
45      4.25    4.5 9.302065 0.007062657 9.288220 9.315910
46      4.50    4.5 9.279878 0.005987143 9.268142 9.291614
47      4.75    4.5 9.257691 0.005978122 9.245972 9.269410
48      5.00    4.5 9.235504 0.007039696 9.221704 9.249303
49      5.25    4.5 9.213317 0.008792270 9.196081 9.230552
50      5.50    4.5 9.191129 0.010907704 9.169747 9.212512
51      5.75    4.5 9.168942 0.013212846 9.143042 9.194843
52      6.00    4.5 9.146755 0.015623954 9.116128 9.177382
53      3.00    5.0 9.442105 0.015381559 9.411953 9.472257
54      3.25    5.0 9.417248 0.013270737 9.391234 9.443262
55      3.50    5.0 9.392391 0.011214830 9.370407 9.414375
56      3.75    5.0 9.367534 0.009250523 9.349400 9.385667
57      4.00    5.0 9.342677 0.007450623 9.328071 9.357282
58      4.25    5.0 9.317820 0.005965838 9.306125 9.329514
59      4.50    5.0 9.292962 0.005080403 9.283003 9.302921
60      4.75    5.0 9.268105 0.005115623 9.258077 9.278133
61      5.00    5.0 9.243248 0.006055456 9.231378 9.255119
62      5.25    5.0 9.218391 0.007570160 9.203552 9.233231
63      5.50    5.0 9.193534 0.009385411 9.175136 9.211932
64      5.75    5.0 9.168677 0.011358010 9.146412 9.190942
65      6.00    5.0 9.143820 0.013418744 9.117515 9.170124
66      3.00    5.5 9.471209 0.015388175 9.441044 9.501374
67      3.25    5.5 9.443682 0.013272622 9.417664 9.469700
68      3.50    5.5 9.416155 0.011222616 9.394156 9.438154
69      3.75    5.5 9.388628 0.009281689 9.370433 9.406823
70      4.00    5.5 9.361101 0.007534616 9.346331 9.375871
71      4.25    5.5 9.333574 0.006148916 9.321520 9.345627
72      4.50    5.5 9.306047 0.005409799 9.295442 9.316651
73      4.75    5.5 9.278520 0.005580387 9.267581 9.289459
74      5.00    5.5 9.250993 0.006590414 9.238074 9.263912
75      5.25    5.5 9.223466 0.008132923 9.207523 9.239408
76      5.50    5.5 9.195939 0.009963601 9.176407 9.215470
77      5.75    5.5 9.168411 0.011950745 9.144985 9.191838
78      6.00    5.5 9.140884 0.014028018 9.113386 9.168383
79      3.00    6.0 9.500313 0.018154276 9.464726 9.535901
80      3.25    6.0 9.470116 0.015666695 9.439405 9.500827
81      3.50    6.0 9.439919 0.013269320 9.413908 9.465931
82      3.75    6.0 9.409722 0.011021175 9.388118 9.431327
83      4.00    6.0 9.379525 0.009034361 9.361815 9.397235
84      4.25    6.0 9.349328 0.007518974 9.334589 9.364067
85      4.50    6.0 9.319131 0.006797954 9.305805 9.332457
86      4.75    6.0 9.288934 0.007116973 9.274983 9.302885
87      5.00    6.0 9.258737 0.008357772 9.242354 9.275121
88      5.25    6.0 9.228540 0.010188981 9.208567 9.248513
89      5.50    6.0 9.198343 0.012350718 9.174132 9.222554
90      5.75    6.0 9.168146 0.014697859 9.139334 9.196958
91      6.00    6.0 9.137949 0.017154469 9.104322 9.171576
92      3.00    6.5 9.529417 0.022692240 9.484934 9.573900
93      3.25    6.5 9.496550 0.019594922 9.458139 9.534962
94      3.50    6.5 9.463683 0.016617932 9.431108 9.496259
95      3.75    6.5 9.430816 0.013839141 9.403688 9.457945
96      4.00    6.5 9.397949 0.011404362 9.375594 9.420305
97      4.25    6.5 9.365082 0.009579594 9.346304 9.383861
98      4.50    6.5 9.332216 0.008754955 9.315053 9.349378
99      4.75    6.5 9.299349 0.009203334 9.281308 9.317390
100     5.00    6.5 9.266482 0.010766845 9.245376 9.287588
101     5.25    6.5 9.233615 0.013050671 9.208032 9.259197
102     5.50    6.5 9.200748 0.015744415 9.169884 9.231611
103     5.75    6.5 9.167881 0.018671493 9.131279 9.204482
104     6.00    6.5 9.135014 0.021737851 9.092402 9.177626
105     3.00    7.0 9.558521 0.028158055 9.503324 9.613719
106     3.25    7.0 9.522985 0.024325078 9.475301 9.570668
107     3.50    7.0 9.487448 0.020644358 9.446979 9.527916
108     3.75    7.0 9.451911 0.017213841 9.418167 9.485654
109     4.00    7.0 9.416374 0.014215831 9.388507 9.444241
110     4.25    7.0 9.380837 0.011979571 9.357354 9.404320
111     4.50    7.0 9.345300 0.010980762 9.323775 9.366825
112     4.75    7.0 9.309763 0.011545166 9.287131 9.332395
113     5.00    7.0 9.274226 0.013477808 9.247806 9.300646
114     5.25    7.0 9.238689 0.016299034 9.206739 9.270640
115     5.50    7.0 9.203152 0.019629376 9.164673 9.241631
116     5.75    7.0 9.167615 0.023251087 9.122037 9.213194
117     6.00    7.0 9.132078 0.027047374 9.079058 9.185098
118     3.00    7.5 9.587626 0.034108541 9.520764 9.654488
119     3.25    7.5 9.549419 0.029473558 9.491643 9.607195
120     3.50    7.5 9.511212 0.025023516 9.462159 9.560265
121     3.75    7.5 9.473005 0.020877015 9.432080 9.513929
122     4.00    7.5 9.434798 0.017254298 9.400975 9.468621
123     4.25    7.5 9.396591 0.014551959 9.368065 9.425117
124     4.50    7.5 9.358384 0.013341508 9.332231 9.384537
125     4.75    7.5 9.320177 0.014014986 9.292704 9.347651
126     5.00    7.5 9.281970 0.016341098 9.249938 9.314003
127     5.25    7.5 9.243764 0.019744132 9.205060 9.282467
128     5.50    7.5 9.205557 0.023765894 9.158969 9.252144
129     5.75    7.5 9.167350 0.028142358 9.112183 9.222516
130     6.00    7.5 9.129143 0.032731553 9.064980 9.193306
131     3.00    8.0 9.616730 0.040329732 9.537672 9.695787
132     3.25    8.0 9.575853 0.034855483 9.507527 9.644179
133     3.50    8.0 9.534976 0.029599262 9.476953 9.592999
134     3.75    8.0 9.494099 0.024700652 9.445679 9.542519
135     4.00    8.0 9.453222 0.020418676 9.413196 9.493248
136     4.25    8.0 9.412345 0.017219661 9.378590 9.446101
137     4.50    8.0 9.371469 0.015776735 9.340542 9.402395
138     4.75    8.0 9.330592 0.016555611 9.298138 9.363045
139     5.00    8.0 9.289715 0.019288997 9.251903 9.327527
140     5.25    8.0 9.248838 0.023298859 9.203166 9.294510
141     5.50    8.0 9.207961 0.028042869 9.152990 9.262933
142     5.75    8.0 9.167084 0.033207866 9.101988 9.232181
143     6.00    8.0 9.126208 0.038625335 9.050491 9.201924
144     3.00    8.5 9.645834 0.046713598 9.554262 9.737405
145     3.25    8.5 9.602287 0.040377611 9.523136 9.681438
146     3.50    8.5 9.558740 0.034292991 9.491517 9.625964
147     3.75    8.5 9.515193 0.028620510 9.459089 9.571297
148     4.00    8.5 9.471647 0.023658496 9.425270 9.518024
149     4.25    8.5 9.428100 0.019944468 9.389003 9.467196
150     4.50    8.5 9.384553 0.018256662 9.348765 9.420341
151     4.75    8.5 9.341006 0.019138865 9.303489 9.378524
152     5.00    8.5 9.297459 0.022287960 9.253769 9.341150
153     5.25    8.5 9.253913 0.026919819 9.201142 9.306683
154     5.50    8.5 9.210366 0.032404733 9.146844 9.273888
155     5.75    8.5 9.166819 0.038378683 9.091586 9.242052
156     6.00    8.5 9.123272 0.044645788 9.035754 9.210790
157     3.00    9.0 9.674938 0.053201611 9.570648 9.779227
158     3.25    9.0 9.628721 0.045989467 9.538569 9.718873
159     3.50    9.0 9.582504 0.039062197 9.505932 9.659077
160     3.75    9.0 9.536288 0.032601901 9.472379 9.600196
161     4.00    9.0 9.490071 0.026946560 9.437248 9.542893
162     4.25    9.0 9.443854 0.022705830 9.399345 9.488364
163     4.50    9.0 9.397637 0.020765281 9.356932 9.438343
164     4.75    9.0 9.351421 0.021749564 9.308786 9.394056
165     5.00    9.0 9.305204 0.025319847 9.255570 9.354838
166     5.25    9.0 9.258987 0.030583497 9.199035 9.318939
167     5.50    9.0 9.212770 0.036821331 9.140591 9.284950
168     5.75    9.0 9.166554 0.043617371 9.081052 9.252055
169     6.00    9.0 9.120337 0.050747849 9.020857 9.219816
plot(zile,               # plot the effect of the interaction
   ylim=range(d$RT_log, na.rm=TRUE), # w/ these y-axis limits
   grid=TRUE)            # & w/ a grid

Not that great … There are two main ways we could try to plot this better. One would be with an actual 3-D plot. The simplest version of this might use rgl::plot3d like this, which would open a new window with a rotatable 3-D plot (but this wouldn't be rendered into this HTML properly, which is why I am not executing it here):

rgl::plot3d(zile_d$ZIPFFREQ, zile_d$LENGTH, zile_d$fit)

Another option would be to use plotly::plot_ly:

library(plotly)
fig <- plot_ly(zile_d,
   x=~ZIPFFREQ, y=~LENGTH, z=~fit, color=~fit, colors=c("#FFFFFF20", "#00000020"))
fig <- fig %>% add_markers()
fig <- fig %>% layout(scene = list(
   xaxis = list(title = 'Zipf frequency'),
   yaxis = list(title = 'Length'),
   zaxis = list(title = 'Predicted RT')))
fig

While this is nice, it’s also hard to publish in print. Let’s do this differently with a numeric heatmap kind of plot that I often use. For that, the numeric predictions are first cut into 9 ordinal bins with lower/higher bin numbers reflecting lower/higher numeric predictions …

zile_d$fitcat <- as.numeric( # create a numeric column fitcat in zile_d
   cut(zile_d$fit,           # by cutting the predicted values up
       breaks=seq(           # at the values ranging from
          min(zile_d$fit),   # the smallest predicted RT_log to
          max(zile_d$fit),   # the largest predicted RT_log
          length.out=10),    # into 9 groups
       include.lowest=TRUE,  # include the lowest, and
       labels=1:9))          # use the numbers 1:9 as labels

… and then we plot them into a coordinate system:

plot(main="RT ~ ZIPFFREQ:LENGTH", type="n", # plot nothing, just
   xlab="Zipf frequency", xlim=range(zile_d$ZIPFFREQ), x=0, # get the coordinate
   ylab="Length"        , ylim=range(zile_d$LENGTH)  , y=0) # system set up
text(                                  # plot text
   x=zile_d$ZIPFFREQ, y=zile_d$LENGTH, # at these coordinates, namely
   labels=zile_d$fitcat,               # the 'categorical/ordinal' predictions
   # make the physical size of the numbers dependent on the values
   cex=0.5+zile_d$fitcat/7,
   # make the darkness of the grey reflect the predicted RT_log
   col=grey(level=zile_d$fitcat/10)) # see ?grey

However, there’s one thing that’s possibly problematic about this – what is that? The problem is that we are binning with cut on the basis of zile_d$fit (check the above code), but the range of those predictions is much smaller than the range of the actually observed reaction times in d$RT_log:

range(zile_d$fit)
[1] 9.120337 9.674938
range(d$RT_log)
[1]  8.082149 12.011926

That in turn means the binning steps based on zile_d$fit will be based on much smaller differences between predicted values (only about 0.06 log2 units wide, check levels(cut(zile_d$fit, breaks=seq(min(zile_d$fit), max(zile_d$fit), length.out=10)))), meaning a difference between predictions of little more than 0.12 could already put two values into bins that are 3 apart – if we were to bin based on the actually observed value range, that would not happen, because there the binning steps are about 0.44 log2 units wide (check levels(cut(d$RT_log, breaks=seq(min(d$RT_log), max(d$RT_log), length.out=10)))). Thus, the above plot exaggerates the effects somewhat.
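
A quick way to see how different the two binning schemes are is to compute the width of a single bin under each of them; a minimal sketch:

diff(range(zile_d$fit)) / 9 # bin width when binning over the predictions only: approx. 0.06
diff(range(d$RT_log))   / 9 # bin width when binning over the observed RT_log range: approx. 0.44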

Accordingly, we determine the range of the values to be plotted in a potentially more representative way, by using

  • as a minimum, the minimum of the predictions and the actual values combined;
  • as a maximum, the maximum of the predictions and the actual values combined:
# define the minimum of the range to be used for binning
min_of_obs_n_pred <- min(c(
   d$RT_log,
   zile_d$fit),
   na.rm=TRUE)
# define the maximum of the range to be used for binning
max_of_obs_n_pred <- max(c(
   d$RT_log,
   zile_d$fit),
   na.rm=TRUE)

We can see that the initial plot indeed exaggerated the range: the upper plot showed prediction bins from 1 to 9, but, as the cross-tabulation below will show, the predictions actually fall into only a small part of the observed range.

zile_d$fitcat2 <- as.numeric( # create a numeric column fitcat2 in zile_d
   cut(zile_d$fit,           # by cutting the predicted values up
       breaks=seq(           # at the values ranging from
          min_of_obs_n_pred, # the smallest obs/pred RT_log to   <---------
          max_of_obs_n_pred, # the largest obs/pred RT_log       <---------
          length.out=10),    # into 9 groups
       include.lowest=TRUE,  # include the lowest, and
       labels=1:9)) # use the numbers 1:9 as labels
table(zile_d$fitcat, zile_d$fitcat2)

     3  4
  1 26  0
  2 30  0
  3 30  0
  4 28  0
  5  9 11
  6  0 15
  7  0 10
  8  0  6
  9  0  4

Thus, we plot this again (with some additional info in the form of the decile-based grid), which here doesn’t help much:

plot(main="RT ~ ZIPFFREQ:LENGTH", type="n", # plot nothing, just
   xlab="Zipf frequency", xlim=range(zile_d$ZIPFFREQ), x=0, # get the coordinate
   ylab="Length"        , ylim=range(zile_d$LENGTH)  , y=0) # system set up
text(                                  # plot text
   x=zile_d$ZIPFFREQ, y=zile_d$LENGTH, # at these coordinates, namely
   labels=zile_d$fitcat2,              # the 'categorical/ordinal' predictions
   # make the physical size of the numbers dependent on the values
   cex=0.5+zile_d$fitcat2/7,
   # make the darkness of the grey reflect the predicted RT_log
   col=grey(level=zile_d$fitcat2/15)) # see ?grey
abline(h=quantile(d$LENGTH  , probs=seq(0, 1, 0.1)), lty=3, col="#00000020")
abline(v=quantile(d$ZIPFFREQ, probs=seq(0, 1, 0.1)), lty=3, col="#00000020")
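
Before we move on: in Section 3, we postponed the regression diagnostics for m_final; if you want to peek ahead, a minimal base-R sketch is the default plot method for lm objects:

par(mfrow=c(2, 2)) # 2x2 panel of diagnostic plots
plot(m_final)      # residuals vs. fitted, normal Q-Q, scale-location, residuals vs. leverage
par(mfrow=c(1, 1)) # reset the plotting layout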

5 What if we restrict RT?

Given that we already noticed early on that the long tail of RT might cause problems, we used a logged version of this variable. But what if we had done something else – if we had not transformed RT, but had also not included all of its values in the analysis? Here, we run the model selection process again, but now on a subset of the data in d, which is delimited like this:

shortest.densest.region <- function (
      nv,             # a numeric vector
      target=0.9,     # the target proportion to retain
      plotty=FALSE) { # whether to also plot a histogram & ecdf

   zero2one <- function (nv) {
      if (length(unique(nv))==1) {
         output <- rep(0.5, length(nv))
      } else {
         output <- "/"(
            nv - min(nv, na.rm=TRUE),
            diff(range(nv, na.rm=TRUE)))
      }; return(output)
   }

   nv <- nv[!is.na(nv)]
   tabulated.nv <- table(nv)
   all.subsettings <- mapply(":", seq(tabulated.nv), length(tabulated.nv))
   all.coverages <- lapply(all.subsettings, \(af) cumsum(tabulated.nv[af])/length(nv))

   results <- data.frame(
      # the values on the left defining the beg:
      BEGnum=rep(as.numeric(names(tabulated.nv)), sapply(all.subsettings, length)),
      # the values on the right defining the end:
      ENDnum=as.numeric(names(tabulated.nv)[unlist(all.subsettings)]),
      # the positions of the (sorted) values on the left defining the beg:
      BEGpos=rep(seq(all.subsettings), sapply(all.subsettings, length)),
      # the positions of the (sorted) values on the right defining the end:
      ENDpos=unlist(all.subsettings))
   # the NUMERICAL distance between the beg & the end:
   results$RANGEnum  <- results$ENDnum-results$BEGnum
   # the POSITIONAL distance between the beg & the end:
   results$RANGEpos <- results$ENDpos-results$BEGpos+1
   # coverage: how much of the tokens as a proportion is between the beg & the end:
   results$COVERAGES    <- unlist(all.coverages)
   # coverage normalized by the proportion of the NUMERICAL range between beg & end out of whole range:
   results$COVbyRANGEnum     <- results$COVERAGES/results$RANGEnum
   # coverage normalized by the proportion of the POSITIONAL range between beg & end out of whole range:
   results$COVbyRANGEpos     <- results$COVERAGES/results$RANGEpos
   # reduce to those ranges meeting the target:
   results <- results[results$COVERAGES>=target,]
   # conflating those into a single Euclidean distance:
   results$EUCL <- sqrt(results$COVbyRANGEnum^2 + results$COVbyRANGEpos^2)
   results <- results[order(-results$EUCL, results$COVERAGES),]
   results <- results[,c(7,10,1,2,5,8,3,4,6,9)]

   output <- list(
      "Smallest difference div. by 2"=nv %>% unique %>% sort %>% diff %>% "/"(2) %>% min,
      "Sorted by Euclidean based on width & length"=results)
      output[["Recommended range (based in Eucl)"]]=c(
         "lower"=results$BEGnum[1] - output[[1]],
         "upper"=results$ENDnum[1] + output[[1]])

   if (plotty) {
      par(mfrow=c(1,2))
      hist(nv, breaks="FD", main="")
         abline(v=output[[3]], col="blue", lty=3)
      plot(ecdf(nv), verticals=TRUE)
         abline(v=output[[3]], col="blue", lty=3)
      par(mfrow=c(1,1))
   }
   return(output)
}
shortest.densest.region(d$RT, target=0.9)[3:4]
$`Recommended range (based in Eucl)`
lower upper
364.5 967.5

$<NA>
NULL
table(keep <- d$RT>364.5 & d$RT<967.5)

FALSE  TRUE
  775  6977 
hist(d$RT , breaks="FD", xlim=c(0, 4250))
abline(v=c(364.5, 967.5), col="green")
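
A quick check that the retained region indeed covers (at least) the 90% of the data we targeted:

mean(keep) # proportion of cases retained: approx. 0.9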

5.1 Exploration & preparation

Let’s see how the above exploration results (for response and predictors) change when we use the restricted unlogged response variable; note the use of with(..., { ...}):

# the predictor(s) w/ the response
with(d[keep,], {
plot(
   main="RT in ms as a function of frequency per million words",
   pch=16, col="#00000030",
   xlab="Zipf frequency", xlim=c(  3,    6), x=ZIPFFREQ,
   ylab="RT (in ms)"    , ylim=c(250, 1000), y=RT); grid()
abline(lm(RT ~ ZIPFFREQ), col="green", lwd=2)
plot(
   main="RT in ms as a function of length in characters",
   pch=16, col="#00000020",
   xlab="Length"    , xlim=c(  3,    9), x=jitter(LENGTH),
   ylab="RT (in ms)", ylim=c(250, 1000), y=RT); grid()
   abline(lm(RT ~ LENGTH), col="green", lwd=2)
qwe <- boxplot(
   main="RT in ms as a function of stimulus language",
   RT ~ LANGUAGE, notch=TRUE, outline=FALSE,
   xlab="Stimulus language",
   ylab="RT (in ms)"       , ylim=c(250, 1000)); grid()
qwe <- boxplot(
   main="RT in ms as a function of speaker group",
   RT ~ GROUP, notch=TRUE, outline=FALSE,
   xlab="Speaker group",
   ylab="RT (in ms)"   , ylim=c(250, 1000)); grid()
})

This does look much better!
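
As for the with(..., { ... }) construction noted above: it evaluates the code block with the columns of the (here subsetted) data frame directly accessible, so we don't have to prefix every column name; a minimal sketch of the equivalence:

with(d[keep,], mean(RT)) # is the same as ...
mean(d[keep,]$RT)        # ... this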

5.2 Modeling & numerical interpretation

Let’s see whether/how the modeling process changes if we only look at the data points we decided to keep:

summary(m_11 <- lm(       # make/summarize the linear model m_11:
   RT ~ 1 +               # RT ~ an overall intercept (1)
   # & these predictors & their pairwise interactions
   (ZIPFFREQ+LANGUAGE+GROUP+LENGTH)^2,
   data=d, subset=keep,   # those vars, but now in this subset
   na.action=na.exclude)) # skip cases with NA/missing data (good habit)

Call:
lm(formula = RT ~ 1 + (ZIPFFREQ + LANGUAGE + GROUP + LENGTH)^2,
    data = d, subset = keep, na.action = na.exclude)

Residuals:
    Min      1Q  Median      3Q     Max
-285.16  -93.81  -21.20   78.97  417.85

Coefficients:
                              Estimate Std. Error t value Pr(>|t|)
(Intercept)                   709.2442    72.3865   9.798  < 2e-16 ***
ZIPFFREQ                      -20.0810    15.8031  -1.271   0.2039
LANGUAGEenglish               -58.4942    32.7149  -1.788   0.0738 .
GROUPheritage                 -11.8688    32.0667  -0.370   0.7113
LENGTH                          7.1521    12.4666   0.574   0.5662
ZIPFFREQ:LANGUAGEenglish       -3.7245     5.8845  -0.633   0.5268
ZIPFFREQ:GROUPheritage         -5.7045     5.6376  -1.012   0.3116
ZIPFFREQ:LENGTH                -0.2982     2.7435  -0.109   0.9134
LANGUAGEenglish:GROUPheritage  47.4752     6.2346   7.615 2.99e-14 ***
LANGUAGEenglish:LENGTH         -2.6481     3.2737  -0.809   0.4186
GROUPheritage:LENGTH            3.7714     3.0322   1.244   0.2136
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 126.3 on 6966 degrees of freedom
Multiple R-squared:  0.0909,    Adjusted R-squared:  0.0896
F-statistic: 69.65 on 10 and 6966 DF,  p-value: < 2.2e-16
drop1(m_11,  # drop each droppable predictor at a time from m_11 &
   test="F") # test its significance w/ an F-test
Single term deletions

Model:
RT ~ 1 + (ZIPFFREQ + LANGUAGE + GROUP + LENGTH)^2
                  Df Sum of Sq       RSS   AIC F value   Pr(>F)
<none>                         111164722 67532
ZIPFFREQ:LANGUAGE  1      6393 111171115 67531  0.4006   0.5268
ZIPFFREQ:GROUP     1     16339 111181061 67532  1.0239   0.3116
ZIPFFREQ:LENGTH    1       189 111164911 67531  0.0118   0.9134
LANGUAGE:GROUP     1    925334 112090056 67588 57.9849 2.99e-14 ***
LANGUAGE:LENGTH    1     10442 111175164 67531  0.6544   0.4186
GROUP:LENGTH       1     24688 111189410 67532  1.5470   0.2136
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Let’s fit the 2nd model with the ns 2-way interaction ZIPFFREQ:LENGTH deleted:

summary(m_12 <- update( # make m_12 an update of
   m_11, .~.            # m_11, namely all of it (.~.), but then
   - ZIPFFREQ:LENGTH))  # remove this interaction

Call:
lm(formula = RT ~ ZIPFFREQ + LANGUAGE + GROUP + LENGTH + ZIPFFREQ:LANGUAGE +
    ZIPFFREQ:GROUP + LANGUAGE:GROUP + LANGUAGE:LENGTH + GROUP:LENGTH,
    data = d, subset = keep, na.action = na.exclude)

Residuals:
    Min      1Q  Median      3Q     Max
-285.37  -93.76  -21.19   78.96  417.81

Coefficients:
                              Estimate Std. Error t value Pr(>|t|)
(Intercept)                    716.526     27.425  26.127  < 2e-16 ***
ZIPFFREQ                       -21.712      4.944  -4.391 1.14e-05 ***
LANGUAGEenglish                -58.957     32.434  -1.818   0.0691 .
GROUPheritage                  -11.910     32.062  -0.371   0.7103
LENGTH                           5.824      2.462   2.365   0.0180 *
ZIPFFREQ:LANGUAGEenglish        -3.546      5.650  -0.628   0.5303
ZIPFFREQ:GROUPheritage          -5.700      5.637  -1.011   0.3120
LANGUAGEenglish:GROUPheritage   47.486      6.233   7.618 2.92e-14 ***
LANGUAGEenglish:LENGTH          -2.716      3.213  -0.846   0.3978
GROUPheritage:LENGTH             3.774      3.032   1.245   0.2133
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 126.3 on 6967 degrees of freedom
Multiple R-squared:  0.0909,    Adjusted R-squared:  0.08973
F-statistic:  77.4 on 9 and 6967 DF,  p-value: < 2.2e-16
anova(m_11, m_12, # compare m_11 to m_12
   test="F")      # using an F-test
Analysis of Variance Table

Model 1: RT ~ 1 + (ZIPFFREQ + LANGUAGE + GROUP + LENGTH)^2
Model 2: RT ~ ZIPFFREQ + LANGUAGE + GROUP + LENGTH + ZIPFFREQ:LANGUAGE +
    ZIPFFREQ:GROUP + LANGUAGE:GROUP + LANGUAGE:LENGTH + GROUP:LENGTH
  Res.Df       RSS Df Sum of Sq      F Pr(>F)
1   6966 111164722
2   6967 111164911 -1   -188.53 0.0118 0.9134
drop1(m_12,  # drop each droppable predictor at a time from m_12 &
   test="F") # test its significance w/ an F-test
Single term deletions

Model:
RT ~ ZIPFFREQ + LANGUAGE + GROUP + LENGTH + ZIPFFREQ:LANGUAGE +
    ZIPFFREQ:GROUP + LANGUAGE:GROUP + LANGUAGE:LENGTH + GROUP:LENGTH
                  Df Sum of Sq       RSS   AIC F value   Pr(>F)
<none>                         111164911 67531
ZIPFFREQ:LANGUAGE  1      6285 111171195 67529  0.3939   0.5303
ZIPFFREQ:GROUP     1     16315 111181225 67530  1.0225   0.3120
LANGUAGE:GROUP     1    925952 112090863 67586 58.0319 2.92e-14 ***
LANGUAGE:LENGTH    1     11407 111176318 67529  0.7149   0.3978
GROUP:LENGTH       1     24722 111189633 67530  1.5494   0.2133
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Let’s fit the 3rd model with the ns interaction ZIPFFREQ:LANGUAGE deleted:

summary(m_13 <- update(  # make m_13 an update of
   m_12, .~.             # m_12, namely all of it (.~.), but then
   - ZIPFFREQ:LANGUAGE)) # remove this interaction

Call:
lm(formula = RT ~ ZIPFFREQ + LANGUAGE + GROUP + LENGTH + ZIPFFREQ:GROUP +
    LANGUAGE:GROUP + LANGUAGE:LENGTH + GROUP:LENGTH, data = d,
    subset = keep, na.action = na.exclude)

Residuals:
    Min      1Q  Median      3Q     Max
-287.90  -93.71  -21.28   79.06  416.70

Coefficients:
                              Estimate Std. Error t value Pr(>|t|)
(Intercept)                    725.010     23.860  30.386  < 2e-16 ***
ZIPFFREQ                       -23.490      4.052  -5.797 7.05e-09 ***
LANGUAGEenglish                -76.288     17.013  -4.484 7.44e-06 ***
GROUPheritage                  -13.030     32.011  -0.407    0.684
LENGTH                           5.775      2.461   2.347    0.019 *
ZIPFFREQ:GROUPheritage          -5.508      5.628  -0.979    0.328
LANGUAGEenglish:GROUPheritage   47.532      6.233   7.626 2.74e-14 ***
LANGUAGEenglish:LENGTH          -2.536      3.200  -0.793    0.428
GROUPheritage:LENGTH             3.807      3.031   1.256    0.209
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 126.3 on 6968 degrees of freedom
Multiple R-squared:  0.09085,   Adjusted R-squared:  0.08981
F-statistic: 87.04 on 8 and 6968 DF,  p-value: < 2.2e-16
anova(m_12, m_13, # compare m_12 to m_13
   test="F")      # using an F-test
Analysis of Variance Table

Model 1: RT ~ ZIPFFREQ + LANGUAGE + GROUP + LENGTH + ZIPFFREQ:LANGUAGE +
    ZIPFFREQ:GROUP + LANGUAGE:GROUP + LANGUAGE:LENGTH + GROUP:LENGTH
Model 2: RT ~ ZIPFFREQ + LANGUAGE + GROUP + LENGTH + ZIPFFREQ:GROUP +
    LANGUAGE:GROUP + LANGUAGE:LENGTH + GROUP:LENGTH
  Res.Df       RSS Df Sum of Sq      F Pr(>F)
1   6967 111164911
2   6968 111171195 -1   -6284.8 0.3939 0.5303
drop1(m_13,  # drop each droppable predictor at a time from m_13 &
   test="F") # test its significance w/ an F-test
Single term deletions

Model:
RT ~ ZIPFFREQ + LANGUAGE + GROUP + LENGTH + ZIPFFREQ:GROUP +
    LANGUAGE:GROUP + LANGUAGE:LENGTH + GROUP:LENGTH
                Df Sum of Sq       RSS   AIC F value    Pr(>F)
<none>                       111171195 67529
ZIPFFREQ:GROUP   1     15278 111186473 67528  0.9576    0.3278
LANGUAGE:GROUP   1    927877 112099072 67585 58.1576 2.741e-14 ***
LANGUAGE:LENGTH  1     10025 111181220 67528  0.6283    0.4280
GROUP:LENGTH     1     25166 111196362 67528  1.5774    0.2092
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Let’s fit the 4th model with the ns interaction LANGUAGE:LENGTH deleted:

summary(m_14 <- update( # make m_14 an update of
   m_13, .~.            # m_13, namely all of it (.~.), but then
   - LANGUAGE:LENGTH))  # remove this interaction

Call:
lm(formula = RT ~ ZIPFFREQ + LANGUAGE + GROUP + LENGTH + ZIPFFREQ:GROUP +
    LANGUAGE:GROUP + GROUP:LENGTH, data = d, subset = keep, na.action = na.exclude)

Residuals:
    Min      1Q  Median      3Q     Max
-288.00  -93.78  -21.14   79.11  416.59

Coefficients:
                              Estimate Std. Error t value Pr(>|t|)
(Intercept)                    729.416     23.203  31.436  < 2e-16 ***
ZIPFFREQ                       -23.397      4.050  -5.776 7.96e-09 ***
LANGUAGEenglish                -89.298      4.476 -19.952  < 2e-16 ***
GROUPheritage                  -13.698     31.999  -0.428   0.6686
LENGTH                           4.883      2.189   2.231   0.0257 *
ZIPFFREQ:GROUPheritage          -5.458      5.628  -0.970   0.3322
LANGUAGEenglish:GROUPheritage   47.595      6.232   7.637 2.52e-14 ***
GROUPheritage:LENGTH             3.887      3.030   1.283   0.1996
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 126.3 on 6969 degrees of freedom
Multiple R-squared:  0.09077,   Adjusted R-squared:  0.08985
F-statistic: 99.39 on 7 and 6969 DF,  p-value: < 2.2e-16
anova(m_13, m_14, # compare m_13 to m_14
   test="F")      # using an F-test
Analysis of Variance Table

Model 1: RT ~ ZIPFFREQ + LANGUAGE + GROUP + LENGTH + ZIPFFREQ:GROUP +
    LANGUAGE:GROUP + LANGUAGE:LENGTH + GROUP:LENGTH
Model 2: RT ~ ZIPFFREQ + LANGUAGE + GROUP + LENGTH + ZIPFFREQ:GROUP +
    LANGUAGE:GROUP + GROUP:LENGTH
  Res.Df       RSS Df Sum of Sq      F Pr(>F)
1   6968 111171195
2   6969 111181220 -1    -10025 0.6283  0.428
drop1(m_14,  # drop each droppable predictor at a time from m_14 &
   test="F") # test its significance w/ an F-test
Single term deletions

Model:
RT ~ ZIPFFREQ + LANGUAGE + GROUP + LENGTH + ZIPFFREQ:GROUP +
    LANGUAGE:GROUP + GROUP:LENGTH
               Df Sum of Sq       RSS   AIC F value   Pr(>F)
<none>                      111181220 67528
ZIPFFREQ:GROUP  1     15002 111196222 67526  0.9403   0.3322
LANGUAGE:GROUP  1    930487 112111707 67584 58.3243 2.52e-14 ***
GROUP:LENGTH    1     26256 111207477 67527  1.6458   0.1996
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Let’s fit the 5th model with the ns interaction ZIPFFREQ:GROUP deleted:

summary(m_15 <- update( # make m_15 an update of
   m_14, .~.            # m_14, namely all of it (.~.), but then
   - ZIPFFREQ:GROUP))   # remove this interaction

Call:
lm(formula = RT ~ ZIPFFREQ + LANGUAGE + GROUP + LENGTH + LANGUAGE:GROUP +
    GROUP:LENGTH, data = d, subset = keep, na.action = na.exclude)

Residuals:
    Min      1Q  Median      3Q     Max
-283.78  -93.65  -21.27   79.27  418.37

Coefficients:
                              Estimate Std. Error t value Pr(>|t|)
(Intercept)                    743.156     18.374  40.445  < 2e-16 ***
ZIPFFREQ                       -26.223      2.812  -9.325  < 2e-16 ***
LANGUAGEenglish                -89.324      4.475 -19.959  < 2e-16 ***
GROUPheritage                  -40.002     16.975  -2.357   0.0185 *
LENGTH                           4.760      2.185   2.178   0.0294 *
LANGUAGEenglish:GROUPheritage   47.523      6.232   7.626 2.74e-14 ***
GROUPheritage:LENGTH             4.099      3.022   1.357   0.1749
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 126.3 on 6970 degrees of freedom
Multiple R-squared:  0.09065,   Adjusted R-squared:  0.08986
F-statistic: 115.8 on 6 and 6970 DF,  p-value: < 2.2e-16
anova(m_14, m_15, # compare m_14 to m_15
   test="F")      # using an F-test
Analysis of Variance Table

Model 1: RT ~ ZIPFFREQ + LANGUAGE + GROUP + LENGTH + ZIPFFREQ:GROUP +
    LANGUAGE:GROUP + GROUP:LENGTH
Model 2: RT ~ ZIPFFREQ + LANGUAGE + GROUP + LENGTH + LANGUAGE:GROUP +
    GROUP:LENGTH
  Res.Df       RSS Df Sum of Sq      F Pr(>F)
1   6969 111181220
2   6970 111196222 -1    -15002 0.9403 0.3322
drop1(m_15,  # drop each droppable predictor at a time from m_15 &
   test="F") # test its significance w/ an F-test
Single term deletions

Model:
RT ~ ZIPFFREQ + LANGUAGE + GROUP + LENGTH + LANGUAGE:GROUP +
    GROUP:LENGTH
               Df Sum of Sq       RSS   AIC F value    Pr(>F)
<none>                      111196222 67526
ZIPFFREQ        1   1387245 112583468 67611 86.9553 < 2.2e-16 ***
LANGUAGE:GROUP  1    927823 112124045 67582 58.1578 2.741e-14 ***
GROUP:LENGTH    1     29362 111225584 67526  1.8405    0.1749
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Let’s fit the 6th model with the ns interaction GROUP:LENGTH deleted:

summary(m_16 <- update( # make m_16 an update of
   m_15, .~.            # m_15, namely all of it (.~.), but then
   - GROUP:LENGTH))     # remove this interaction

Call:
lm(formula = RT ~ ZIPFFREQ + LANGUAGE + GROUP + LENGTH + LANGUAGE:GROUP,
    data = d, subset = keep, na.action = na.exclude)

Residuals:
    Min      1Q  Median      3Q     Max
-284.54  -93.37  -21.62   78.96  418.28

Coefficients:
                              Estimate Std. Error t value Pr(>|t|)
(Intercept)                    731.393     16.201  45.145  < 2e-16 ***
ZIPFFREQ                       -26.188      2.812  -9.312  < 2e-16 ***
LANGUAGEenglish                -88.347      4.417 -20.000  < 2e-16 ***
GROUPheritage                  -17.780      4.454  -3.992 6.61e-05 ***
LENGTH                           6.898      1.513   4.558 5.25e-06 ***
LANGUAGEenglish:GROUPheritage   45.640      6.075   7.512 6.54e-14 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 126.3 on 6971 degrees of freedom
Multiple R-squared:  0.09041,   Adjusted R-squared:  0.08975
F-statistic: 138.6 on 5 and 6971 DF,  p-value: < 2.2e-16
anova(m_15, m_16, # compare m_15 to m_16
   test="F")      # using an F-test
Analysis of Variance Table

Model 1: RT ~ ZIPFFREQ + LANGUAGE + GROUP + LENGTH + LANGUAGE:GROUP +
    GROUP:LENGTH
Model 2: RT ~ ZIPFFREQ + LANGUAGE + GROUP + LENGTH + LANGUAGE:GROUP
  Res.Df       RSS Df Sum of Sq      F Pr(>F)
1   6970 111196222
2   6971 111225584 -1    -29362 1.8405 0.1749
drop1(m_16,  # drop each droppable predictor at a time from m_16 &
   test="F") # test its significance w/ an F-test
Single term deletions

Model:
RT ~ ZIPFFREQ + LANGUAGE + GROUP + LENGTH + LANGUAGE:GROUP
               Df Sum of Sq       RSS   AIC F value    Pr(>F)
<none>                      111225584 67526
ZIPFFREQ        1   1383671 112609255 67611  86.721 < 2.2e-16 ***
LENGTH          1    331496 111557080 67545  20.776 5.250e-06 ***
LANGUAGE:GROUP  1    900428 112126012 67581  56.434 6.538e-14 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
m_final_2 <- m_16; rm(m_16); invisible(gc())
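
As an aside, the step-by-step elimination above could also be wrapped into a small loop around drop1 (or handed to step, which, however, selects by AIC rather than by F-tests); a minimal sketch, assuming the same 0.05 threshold and starting from m_12:

m_current <- m_12                    # start from the model the steps above started from
repeat {
   dr <- drop1(m_current, test="F")  # F-test every droppable term
   p_vals <- dr[["Pr(>F)"]]          # their p-values (the <none> row is NA)
   worst <- which.max(p_vals)        # index of the least significant term
   if (length(worst)==0 ||           # stop if nothing is droppable or
       p_vals[worst]<0.05) break     # if everything left is significant
   m_current <- update(m_current,    # otherwise remove that term & refit
      as.formula(paste(". ~ . -", rownames(dr)[worst])))
}
formula(m_current) # should end up at the formula of m_16/m_final_2

Doing it manually, as above, has the advantage that every intermediate model can be inspected and compared with anova, which is why the loop is only a convenience check.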

We’re done, so let’s compute the confidence intervals and generate nd_2:

Confint(m_final_2)
                                Estimate      2.5 %     97.5 %
(Intercept)                   731.393340 699.634508 763.152172
ZIPFFREQ                      -26.188476 -31.701275 -20.675677
LANGUAGEenglish               -88.346655 -97.005939 -79.687371
GROUPheritage                 -17.779932 -26.510381  -9.049483
LENGTH                          6.897995   3.931375   9.864614
LANGUAGEenglish:GROUPheritage  45.639666  33.730092  57.549240
nd_2 <- expand.grid(
   ZIPFFREQ=c(0, 1, mean(d$ZIPFFREQ)),
   LENGTH  =c(0, 1, mean(d$LENGTH)),
   GROUP   =levels(d$GROUP),
   LANGUAGE=levels(d$LANGUAGE))
nd_2 <- cbind(nd_2,
   predict(m_final_2, newdata=nd_2, interval="confidence"))
names(nd_2)[5:7] <- c("PRED", "PRED_LOW", "PRED_UPP")
head(nd_2)
  ZIPFFREQ LENGTH   GROUP LANGUAGE     PRED PRED_LOW PRED_UPP
1 0.000000      0 english  spanish 731.3933 699.6345 763.1522
2 1.000000      0 english  spanish 705.2049 677.9057 732.5040
3 4.609476      0 english  spanish 610.6782 593.3513 628.0050
4 0.000000      1 english  spanish 738.2913 708.1073 768.4754
5 1.000000      1 english  spanish 712.1029 686.6069 737.5988
6 4.609476      1 english  spanish 617.5762 602.9652 632.1872

Again, a maybe (!) nicer way to represent this might be the following:

with(nd_2,  # use the data in nd_2 and
   tapply(  # apply
      PRED, # to the predicted reaction times
      # a grouping like this: the numerics in the rows & columns, the factors in the layers/slices
      list(round(ZIPFFREQ, 2), round(LENGTH, 2), "GROUP"=GROUP, "LANGUAGE"=LANGUAGE),
      # just list the values, but rounded to 2 decimals
      c)) %>% round(2)
, , GROUP = english, LANGUAGE = spanish


            0      1    5.2
  0    731.39 738.29 767.26
  1    705.20 712.10 741.07
  4.61 610.68 617.58 646.55

, , GROUP = heritage, LANGUAGE = spanish


            0      1    5.2
  0    713.61 720.51 749.48
  1    687.42 694.32 723.30
  4.61 592.90 599.80 628.77

, , GROUP = english, LANGUAGE = english


            0      1    5.2
  0    643.05 649.94 678.92
  1    616.86 623.76 652.73
  4.61 522.33 529.23 558.20

, , GROUP = heritage, LANGUAGE = english


            0      1    5.2
  0    670.91 677.80 706.78
  1    644.72 651.62 680.59
  4.61 550.19 557.09 586.06
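
An alternative to assembling nd_2 and the cross-tabulation above by hand might be emmeans (loaded above), which returns predicted means and their confidence intervals directly from the model; a minimal sketch for the four LANGUAGE-GROUP combinations, with the two numeric predictors held at their means:

emmeans(m_final_2,                # from the final model, compute predicted means
   ~ LANGUAGE * GROUP,            # for every LANGUAGE-GROUP combination
   at=list(                       # while holding the numeric predictors constant:
      ZIPFFREQ=mean(d$ZIPFFREQ),  # at the mean Zipf frequency &
      LENGTH  =mean(d$LENGTH)))   # at the mean length

These means should correspond (up to rounding) to the cells for ZIPFFREQ=4.61 and LENGTH=5.2 in the arrays above.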

5.3 Visual interpretation

5.3.1 ZIPFFREQ

Let’s visualize the nature of the effect of ZIPFFREQ based on the predictions from effects:

zpf_d <- data.frame( # make zpf_d a data frame of
   zpf <- effect(    # the effect
      "ZIPFFREQ",    # of ZIPFFREQ
      m_final_2,     # in m_final_2
      xlevels=list(ZIPFFREQ=seq(3, 6.2, 0.2))))
plot(zpf,             # plot the effect of the main effect
   ylim=c(350, 1000), # w/ these y-axis limits
   grid=TRUE)         # & w/ a grid

And here is a version with everything: observed values, predicted values, and the confidence intervals of the predictions; again, I use with:

with(d[keep,], {
   plot(                               # generate a plot
      xlab="Zipf frequency",                    x=jitter(ZIPFFREQ), # ZIPFFREQ on x-axis
      ylab="RT in ms"      , ylim=c(350, 1000), y=jitter(RT),       # RT on y-axis
      pch=16, col="#00000030"); grid() # w/ filled circles & a grid
   lines(               # draw a line
      x=zpf_d$ZIPFFREQ, # using these x-coordinates
      y=zpf_d$fit,      # & these y-coordinates
      col="green")
   polygon(col="#00FF0020", border=NA, # use a semi-transparent red & no border
      x=c(   zpf_d$ZIPFFREQ,   # x-coordinates left to right, then
         rev(zpf_d$ZIPFFREQ)), # the same values back (in reverse order)
      y=c(   zpf_d$lower,      # y-coordinates lower boundary
         rev(zpf_d$upper)))    # y-coordinates upper boundary
   # and lines for both variables' means
   abline(h=mean(RT, na.rm=TRUE), col="purple"); abline(v=mean(ZIPFFREQ, na.rm=TRUE), col="purple")
})

5.3.2 LENGTH

Let’s visualize the nature of the effect of LENGTH based on the predictions from effects:

le_d <- data.frame( # make le_d a data frame of
   le <- effect(    # the effect
      "LENGTH",     # of LENGTH
      m_final_2,    # in m_final_2
      xlevels=list(LENGTH=3:9)))
plot(le,              # plot the effect of the main effect
   ylim=c(350, 1000), # w/ these y-axis limits
   grid=TRUE)         # & w/ a grid

And here is a version with everything: observed values, predicted values, and the confidence intervals of the predictions; again, I use with:

with(d[keep,], {
   plot(                               # generate a plot
      xlab="Length"  ,                    x=jitter(LENGTH), # LENGTH on x-axis
      ylab="RT in ms", ylim=c(350, 1000), y=jitter(RT),     # RT on y-axis
      pch=16, col="#00000030"); grid() # w/ filled circles & a grid
   lines(            # draw a line
      x=le_d$LENGTH, # using these x-coordinates
      y=le_d$fit,    # & these y-coordinates
      col="green")
   polygon(col="#0000FF20", border=NA, # use a semi-transparent red & no border
      x=c(   le_d$LENGTH,   # x-coordinates left to right, then
         rev(le_d$LENGTH)), # the same values back (in reverse order)
      y=c(   le_d$lower,    # y-coordinates lower boundary
         rev(le_d$upper)))  # y-coordinates upper boundary
   # and lines for both variables' means
   abline(h=mean(RT, na.rm=TRUE), col="purple"); abline(v=mean(LENGTH, na.rm=TRUE), col="purple")
})

5.3.3 LANGUAGE:GROUP

Let’s visualize the nature of the effect of LANGUAGE:GROUP based on the predictions from effects:

lagr_d <- data.frame(   # make lagr_d a data frame of
   lagr <- effect(      # the effect
      "LANGUAGE:GROUP", # of LANGUAGE:GROUP
      m_final_2))       # in m_final_2
plot(lagr,            # plot the effect of the interaction
   ylim=c(350, 1000), # w/ these y-axis limits
   grid=TRUE)         # & w/ a grid

plot(lagr,            # plot the effect of the interaction
   ylim=c(350, 1000), # w/ these y-axis limits
   x.var="GROUP",     # w/ this predictor on the x-axis
   grid=TRUE)         # & w/ a grid

And here is a version (of the second of the two effects plots) with everything: observed values, predicted values, and the confidence intervals of the predictions:

# split data & predictions up by levels of LANGUAGE:
d_split <- split(d, d$LANGUAGE)
lagr_d_split <- split(lagr_d, lagr_d$LANGUAGE, drop=TRUE)
par(mfrow=c(1, 2))
stripchart(main="RT ~ GROUP for LANGUAGE='span'", # stripchart w/ main heading
           pch=16, col="#00000018",               # filled semi-transparent grey circles
           d_split$spa$RT ~ d_split$spa$GROUP,    # data: RT ~ GROUP (for spa)
           xlab="Group", ylab="RT (in ms)",       # axis labels
           ylim=c(350, 1000), vertical=TRUE, # make the chart have this vertical y-axis
           method="jitter"); grid()          # jitter observations, add grid
# add predicted means:
points(seq(lagr_d_split$spa$fit), lagr_d_split$spa$fit, pch="X", col="blue")
# add confidence intervals:
for (i in seq(lagr_d_split$spa$fit)) {  # as many arrows as needed
   arrows(i, lagr_d_split$spa$lower[i], # each from here
          i, lagr_d_split$spa$upper[i], # to here
          code=3, angle=90, col="blue") # both tips w/ 90 degrees
}
stripchart(main="RT ~ GROUP for LANGUAGE='eng'", # stripchart w/ main heading
           pch=16, col="#00000018",              # filled semi-transparent grey circles
           d_split$eng$RT ~ d_split$eng$GROUP,   # data: RT ~ GROUP (for eng)
           xlab="Group", ylab="RT (in ms)",      # axis labels
           ylim=c(350, 1000), vertical=TRUE, # make the chart have this vertical y-axis
           method="jitter"); grid()          # jitter observations, add grid
# add predicted means:
points(seq(lagr_d_split$eng$fit), lagr_d_split$eng$fit, pch="X", col="red")
# add confidence intervals:
for (i in seq(lagr_d_split$eng$fit)) {  # as many arrows as needed
   arrows(i, lagr_d_split$eng$lower[i], # each from here
          i, lagr_d_split$eng$upper[i], # to here
          code=3, angle=90, col="red")  # both tips w/ 90 degrees
}
par(mfrow=c(1, 1))
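
To complement the (non-)overlap of confidence intervals in the interaction plots with actual significance tests, one might also compute all pairwise comparisons of the four LANGUAGE:GROUP means; a minimal sketch using emmeans (loaded above), whose pairs method applies a Tukey adjustment by default:

pairs(                    # all pairwise comparisons (Tukey-adjusted)
   emmeans(m_final_2,     # of the predicted means from the final model
      ~ LANGUAGE*GROUP))  # for every LANGUAGE-GROUP combination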

6 Write-up

To determine whether the reaction time to a word (in ms) varies as a function of

  • the Zipf frequency of that word (ZIPFFREQ);
  • the length of that word (LENGTH);
  • the language that word was presented in (LANGUAGE: english vs. spanish);
  • the speaker group that word was presented to (GROUP: english vs. heritage);
  • any pairwise interaction of these predictors, with the specific expectation that the effect of LENGTH will be stronger when LANGUAGE is spanish

a linear model selection process was undertaken. To deal with the very long right tail of the response variable, the reaction times were log2-transformed; then a backwards stepwise model selection of all predictors and controls (based on significance testing) was applied. The final model resulting from the elimination of ns predictors contained the interactions ZIPFFREQ:LENGTH, LANGUAGE:GROUP, and LANGUAGE:LENGTH (plus the main effects involved in them) and was highly significant (F=154.7, df1=7, df2=7744, p<0.0001), but it accounts for only little of the variability of the response (mult. R2=0.123, adj. R2=0.122). The summary table of the model is shown here.

                                                    Estimate   95%-CI             se      t        p (2-tailed)
Intercept (ZIPFFREQ=0, LENGTH=0,
   LANGUAGE=spanish, GROUP=english)                  9.26      [ 8.881,  9.638]   0.193   47.99    <0.001
ZIPFFREQ 0→1                                         0.007     [-0.077,  0.092]   0.043    0.171    0.864
LANGUAGE spanish→english                            -0.258     [-0.366, -0.149]   0.056   -4.638   <0.001
GROUP english→heritage                              -0.103     [-0.130, -0.076]   0.014   -7.436   <0.001
LENGTH 0→1                                           0.132     [ 0.061,  0.202]   0.036    3.66    <0.001
ZIPFFREQ 0→1 : LENGTH 0→1                           -0.021     [-0.037, -0.006]   0.008   -2.648    0.008
LANGUAGE spanish→english : GROUP english→heritage    0.205     [ 0.168,  0.243]   0.019   10.668   <0.001
LANGUAGE spanish→english : LENGTH 0→1               -0.018     [-0.039,  0.002]   0.01    -1.754    0.079

The results indicate that

  • as hypothesized, the slowing-down effect of LENGTH is stronger for Spanish than for English words [show plot];
  • the interaction LANGUAGE:GROUP has the following effects [show at least one plot]:
    • reaction times are always higher when LANGUAGE is spanish than when LANGUAGE is english, but that difference is larger when GROUP is english than when GROUP is heritage;
    • when LANGUAGE is english, heritage speakers need more time than english-group speakers, but when LANGUAGE is spanish, heritage speakers need less time;
    • all 4 means of LANGUAGE:GROUP are significantly different from each other (non-overlapping CIs);
  • the interaction LANGUAGE:LENGTH has the hypothesized effect that the effect of LENGTH is stronger – in fact, nearly 2.25 times as strong – when LANGUAGE is spanish than when LANGUAGE is english [show at least one plot];
  • the interaction ZIPFFREQ:LENGTH is not strong: both variables behave as expected – LENGTH slows responses down, ZIPFFREQ speeds them up – but
    • the effect of LENGTH is much stronger with rarer words than with frequent words;
    • the effect of ZIPFFREQ is much stronger with longer words than with shorter words.
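
Since the write-up model was fitted to log2-transformed reaction times, its coefficients are on the log2(ms) scale; for reporting, they can be back-transformed with 2^x into milliseconds or percentage changes. A minimal sketch using two values from the table above:

2^9.26               # intercept: predicted baseline RT (at the reference levels/values) of ca. 613 ms
2^-0.258             # LANGUAGE spanish->english: RTs get multiplied by ca. 0.84, i.e.
(1 - 2^-0.258) * 100 # ca. 16% faster (when the predictors interacting w/ LANGUAGE are at their reference values)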

A separate analysis was conducted on a version of the data that was reduced to 6977 cases (90% of the 7752 cases remaining after omitting 248 incomplete ones) by eliminating most of the long right tail (all reaction times >967.5 ms) and some of the left tail (all reaction times <364.5 ms).1 In that model selection process,

  • the interaction LANGUAGE:GROUP had the same effect as in the model above;
  • the predictors ZIPFFREQ and LENGTH did not participate in interactions but were just main effects with the effect directions one might expect:
    • higher values of ZIPFFREQ speed speakers up;
    • higher values of LENGTH slow speakers down.
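
The trimming described in footnote 1 amounts to finding the shortest interval of reaction times that still covers a given share of the data; a minimal sketch (the 90% coverage follows from the size reduction reported above, but the reported boundaries of 364.5 and 967.5 may have been computed with a slightly different procedure):

rts <- sort(d$RT)                       # sort the reaction times
n_in <- floor(0.9 * length(rts))        # how many values a 90% interval must cover
width <- rts[n_in:length(rts)] -        # widths of all candidate intervals, each covering
   rts[1:(length(rts)-n_in+1)]          # n_in consecutive sorted values
lo <- which.min(width)                  # where the narrowest such interval starts
c(lower=rts[lo], upper=rts[lo+n_in-1])  # its boundaries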

7 Homework

To prepare for next session, read SFLWR3, Sections 5.3.1-5.3.3.

8 Session info

sessionInfo()
R version 4.4.3 (2025-02-28)
Platform: x86_64-pc-linux-gnu
Running under: Pop!_OS 22.04 LTS

Matrix products: default
BLAS:   /usr/lib/x86_64-linux-gnu/openblas-pthread/libblas.so.3
LAPACK: /usr/lib/x86_64-linux-gnu/openblas-pthread/libopenblasp-r0.3.20.so;  LAPACK version 3.10.0

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C
 [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8
 [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8
 [7] LC_PAPER=en_US.UTF-8       LC_NAME=C
 [9] LC_ADDRESS=C               LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C

time zone: America/Los_Angeles
tzcode source: system (glibc)

attached base packages:
[1] stats     graphics  grDevices utils     datasets  compiler  methods
[8] base

other attached packages:
 [1] plotly_4.10.4   ggplot2_3.5.1   multcomp_1.4-28 TH.data_1.1-3
 [5] MASS_7.3-65     survival_3.8-3  mvtnorm_1.3-3   emmeans_1.11.0
 [9] effects_4.2-2   car_3.1-3       carData_3.0-5   STGmisc_1.0
[13] Rcpp_1.0.14     magrittr_2.0.3

loaded via a namespace (and not attached):
 [1] gtable_0.3.6       xfun_0.51          htmlwidgets_1.6.4  insight_1.1.0
 [5] lattice_0.22-6     crosstalk_1.2.1    vctrs_0.6.5        tools_4.4.3
 [9] Rdpack_2.6.3       generics_0.1.3     sandwich_3.1-1     tibble_3.2.1
[13] pkgconfig_2.0.3    Matrix_1.7-3       data.table_1.17.0  lifecycle_1.0.4
[17] farver_2.1.2       munsell_0.5.1      mitools_2.4        codetools_0.2-20
[21] survey_4.4-2       htmltools_0.5.8.1  lazyeval_0.2.2     yaml_2.3.10
[25] Formula_1.2-5      tidyr_1.3.1        pillar_1.10.1      nloptr_2.2.1
[29] reformulas_0.4.0   boot_1.3-31        abind_1.4-8        nlme_3.1-167
[33] tidyselect_1.2.1   digest_0.6.37      purrr_1.0.4        dplyr_1.1.4
[37] splines_4.4.3      fastmap_1.2.0      grid_4.4.3         colorspace_2.1-1
[41] cli_3.6.4          withr_3.0.2        scales_1.3.0       estimability_1.5.1
[45] rmarkdown_2.29     httr_1.4.7         nnet_7.3-20        lme4_1.1-36
[49] zoo_1.8-13         coda_0.19-4.1      evaluate_1.0.3     knitr_1.50
[53] rbibutils_2.3      viridisLite_0.4.2  rlang_1.1.5        xtable_1.8-4
[57] glue_1.8.0         DBI_1.2.3          rstudioapi_0.17.1  minqa_1.2.8
[61] jsonlite_1.9.1     R6_2.6.1          

Footnotes

  1. These boundary values were computed in such a way as to find the shortest interval of values that covers the largest number of values.↩︎