1 Introduction: A multifactorial model selection process
We are dealing with the same data set as last session; as a reminder, the data are in _input/rts.csv and you can find information about the variables/columns in _input/rts.r.
rm(list=ls(all.names=TRUE))
library(car); library(effects); library(emmeans); library(magrittr); library(multcomp)
summary(d <- read.delim(      # summarize d, the result of loading
   file="_input/rts.csv",     # this file
   stringsAsFactors=TRUE))    # change categorical variables into factors
CASE RT LENGTH LANGUAGE GROUP
Min. : 1 Min. : 271.0 Min. :3.000 english:4023 english :3961
1st Qu.:2150 1st Qu.: 505.0 1st Qu.:4.000 spanish:3977 heritage:4039
Median :4310 Median : 595.0 Median :5.000
Mean :4303 Mean : 661.5 Mean :5.198
3rd Qu.:6450 3rd Qu.: 732.0 3rd Qu.:6.000
Max. :8610 Max. :4130.0 Max. :9.000
NA's :248
CORRECT FREQPMW ZIPFFREQ SITE
correct :7749 Min. : 1.00 Min. :3.000 a:2403
incorrect: 251 1st Qu.: 17.00 1st Qu.:4.230 b:2815
Median : 42.00 Median :4.623 c:2782
Mean : 81.14 Mean :4.591
3rd Qu.: 101.00 3rd Qu.:5.004
Max. :1152.00 Max. :6.061
d$LANGUAGE <- factor(d$LANGUAGE,     # for didactic reasons only, I am
   levels=levels(d$LANGUAGE)[2:1])   # changing the order of the levels
d$RT_log <- log2(d$RT)
However, this session, we will deal with this in a multifactorial way. Specifically, we are asking, does the (logged) reaction time to a word (in ms) vary as a function of
the Zipf frequency of that word (ZIPFFREQ);
the length of that word (LENGTH);
the language that word was presented in (LANGUAGE: spanish vs. english);
the speaker group that word was presented to (GROUP: english vs. heritage);
any pairwise interaction of these predictors? (Specifically, we are expecting that the effect of LENGTH will be stronger when LANGUAGE is spanish.)
2 Exploration & preparation
Since there are some missing data but only in the response variable, we immediately reduce d to all the complete cases a regression would consider:
summary(d <- droplevels(d[complete.cases(d),]))
CASE RT LENGTH LANGUAGE GROUP
Min. : 1 Min. : 271.0 Min. :3.0 spanish:3775 english :3793
1st Qu.:2123 1st Qu.: 505.0 1st Qu.:4.0 english:3977 heritage:3959
Median :4262 Median : 595.0 Median :5.0
Mean :4268 Mean : 661.5 Mean :5.2
3rd Qu.:6405 3rd Qu.: 732.0 3rd Qu.:6.0
Max. :8610 Max. :4130.0 Max. :9.0
CORRECT FREQPMW ZIPFFREQ SITE RT_log
correct :7749 Min. : 1.00 Min. :3.000 a:2403 Min. : 8.082
incorrect: 3 1st Qu.: 18.00 1st Qu.:4.255 b:2815 1st Qu.: 8.980
Median : 43.00 Median :4.633 c:2534 Median : 9.217
Mean : 82.88 Mean :4.609 Mean : 9.289
3rd Qu.: 104.00 3rd Qu.:5.017 3rd Qu.: 9.516
Max. :1152.00 Max. :6.061 Max. :12.012
Some exploration of the relevant variables; in the interest of time we only look at univariate and monofactorial stuff – a big ‘no no’ for any real analysis! How would we want to explore the response variable?
hist(d$RT_log); hist(d$RT_log, breaks="FD")
How would we want to explore the predictors and their relations to the response?
# the predictors on their own
hist(d$ZIPFFREQ); hist(d$ZIPFFREQ, breaks="FD")
hist(d$LENGTH); hist(d$LENGTH, breaks="FD")
table(d$LANGUAGE)
spanish english
3775 3977
table(d$GROUP)
english heritage
3793 3959
# the predictor(s) w/ the response
plot(main="RT (in ms, logged) as a function of Zipf frequency",
   pch=16, col="#00000020",
   xlab="Zipf frequency"    , x=d$ZIPFFREQ,
   ylab="RT (in ms, logged)", y=d$RT_log); grid()
abline(lm(d$RT_log ~ d$ZIPFFREQ), col="green", lwd=2)
plot(main="RT (in ms, logged) as a function of length in characters",pch=16, col="#00000020",xlab="Length" , x=jitter(d$LENGTH),ylab="RT (in ms, logged)", y=d$RT_log); grid()abline(lm(d$RT_log ~ d$LENGTH), col="green", lwd=2)
qwe <- boxplot(main="RT (in ms, logged) as a function of stimulus language",
   d$RT_log ~ d$LANGUAGE, notch=TRUE,
   xlab="Stimulus language",
   ylab="RT (in ms, logged)"); grid()
asd <- boxplot(main="RT (in ms, logged) as a function of speaker group",
   d$RT_log ~ d$GROUP, notch=TRUE,
   xlab="Speaker group",
   ylab="RT (in ms, logged)"); grid()
table(d$LANGUAGE, d$GROUP)
english heritage
spanish 1805 1970
english 1988 1989
Nothing super problematic, it seems …
3 Modeling & numerical interpretation
Let’s fit and evaluate our initial regression model:
summary(m_01 <- lm(        # make/summarize the linear model m_01:
   RT_log ~ 1 +            # RT_log ~ an overall intercept (1)
   # & these predictors & their pairwise interactions
   (ZIPFFREQ+LANGUAGE+GROUP+LENGTH)^2,
   data=d,                 # those vars are in d
   na.action=na.exclude))  # skip cases with NA/missing data (good habit)
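Next, we check which interaction we can drop first:
drop1(m_01,   # drop each droppable predictor at a time from m_01 &
   test="F")  # test its significance w/ an F-test
Let’s fit the 2nd model with that interaction ZIPFFREQ:GROUP deleted:
summary(m_02 <- update(  # make m_02 an update of
   m_01, .~.             # m_01, namely all of it (.~.), but then
   - ZIPFFREQ:GROUP))    # remove this interaction
anova(m_01, m_02,  # compare m_01 to m_02
   test="F")       # using an F-test
drop1(m_02,   # drop each droppable predictor at a time from m_02 &
   test="F")  # test its significance w/ an F-test
Let’s fit the 3rd model with the ns interaction GROUP:LENGTH deleted:
summary(m_03 <- update(  # make m_03 an update of
   m_02, .~.             # m_02, namely all of it (.~.), but then
   - GROUP:LENGTH))      # remove this interaction
anova(m_02, m_03,  # compare m_02 to m_03
   test="F")       # using an F-test
drop1(m_03,   # drop each droppable predictor at a time from m_03 &
   test="F")  # test its significance w/ an F-test
Let’s fit the 4th model with the ns interaction ZIPFFREQ:LANGUAGE deleted:
summary(m_04 <- update(   # make m_04 an update of
   m_03, .~.              # m_03, namely all of it (.~.), but then
   - ZIPFFREQ:LANGUAGE))  # remove this interaction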
anova(m_03, m_04,  # compare m_03 to m_04
   test="F")       # using an F-test
Analysis of Variance Table
Model 1: RT_log ~ ZIPFFREQ + LANGUAGE + GROUP + LENGTH + ZIPFFREQ:LANGUAGE +
ZIPFFREQ:LENGTH + LANGUAGE:GROUP + LANGUAGE:LENGTH
Model 2: RT_log ~ ZIPFFREQ + LANGUAGE + GROUP + LENGTH + ZIPFFREQ:LENGTH +
LANGUAGE:GROUP + LANGUAGE:LENGTH
Res.Df RSS Df Sum of Sq F Pr(>F)
1 7743 1387.8
2 7744 1387.9 -1 -0.058712 0.3276 0.5671
drop1(m_04,   # drop each droppable predictor at a time from m_04 &
   test="F")  # test its significance w/ an F-test
Single term deletions
Model:
RT_log ~ ZIPFFREQ + LANGUAGE + GROUP + LENGTH + ZIPFFREQ:LENGTH +
LANGUAGE:GROUP + LANGUAGE:LENGTH
Df Sum of Sq RSS AIC F value Pr(>F)
<none> 1387.9 -13319
ZIPFFREQ:LENGTH 1 1.2570 1389.1 -13314 7.0135 0.008106 **
LANGUAGE:GROUP 1 20.3956 1408.3 -13208 113.8018 < 2.2e-16 ***
LANGUAGE:LENGTH 1 0.5515 1388.4 -13318 3.0773 0.079429 .
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
We’re done now – why? Remember that we said “we are expecting that the effect of LENGTH will be stronger when LANGUAGE is spanish”, i.e. our hypothesis about this interaction is directional. That means that if the coefficient for LANGUAGEenglish:LENGTH is negative – which it is – we can halve its (two-tailed) p-value, and then the interaction is in fact significant. We call the final model m_final (for easy copying and pasting across analyses) and do some housekeeping:
m_final <- m_04; rm(m_04); invisible(gc())
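As a quick sketch of that halving (assuming, as in the discussion above, that the coefficient is labeled LANGUAGEenglish:LENGTH in the summary of m_final; this check is not part of the original model selection):
# one-tailed p-value for the directional hypothesis about LANGUAGE:LENGTH
# (the coefficient name is assumed to match the one shown by summary(m_final))
p_two_sided <- summary(m_final)$coefficients["LANGUAGEenglish:LENGTH", "Pr(>|t|)"]
p_one_sided <- p_two_sided/2   # halve it because the hypothesis is directional
p_one_sided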
Normally, we would do some regression diagnostics now – I have to skip them here to discuss how to interpret the model, but we will return to this later! For now, we compute the confidence intervals of the coefficients …
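Confint(m_final)
… , add the predictions of the model to the original data frame:
d$PRED <-      # make d$PRED the
   predict(    # predictions for all cases
      m_final) # based on m_final
head(d)        # check the result
…, and generate an often super-useful kind of data frame that I always call nd (for new data) that contains all combinations of
all levels of all categorical predictors and
useful values of all numeric predictors (where ‘useful’ is usually 0, 1 and the mean):
nd <- expand.grid(
   ZIPFFREQ=c(0, 1, mean(d$ZIPFFREQ)),
   LENGTH  =c(0, 1, mean(d$LENGTH)),
   GROUP   =levels(d$GROUP),
   LANGUAGE=levels(d$LANGUAGE))
To that, we add the predictions the model makes for all these combinations and give it some nice(r) names:
nd <- cbind(nd,
   predict(m_final, newdata=nd, interval="confidence"))
names(nd)[5:7] <- c("PRED", "PRED_LOW", "PRED_UPP")
head(nd)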
ZIPFFREQ LENGTH GROUP LANGUAGE PRED PRED_LOW PRED_UPP
1 0.000000 0 english spanish 9.259652 8.881491 9.637813
2 1.000000 0 english spanish 9.267022 8.971578 9.562465
3 4.609476 0 english spanish 9.293623 9.228019 9.359227
4 0.000000 1 english spanish 9.391371 9.081696 9.701045
5 1.000000 1 english spanish 9.377381 9.135256 9.619506
6 4.609476 1 english spanish 9.326885 9.272164 9.381606
A maybe (!) nicer way to represent this might be this, but whether this is nicer or not is probably subjective:
with(nd,     # use the data in nd and
   tapply(   # apply
      PRED,  # to the predicted reaction times
      # a grouping like this: the numerics in the rows & columns, the factors in the layers/slices
      list(round(ZIPFFREQ, 2), round(LENGTH, 2), "GROUP"=GROUP, "LANGUAGE"=LANGUAGE),
      # just list the values, but rounded to 2 decimals
      c)) %>% round(2)
, , GROUP = english, LANGUAGE = spanish
0 1 5.2
0 9.26 9.39 9.94
1 9.27 9.38 9.84
4.61 9.29 9.33 9.47
, , GROUP = heritage, LANGUAGE = spanish
0 1 5.2
0 9.16 9.29 9.84
1 9.16 9.27 9.74
4.61 9.19 9.22 9.36
, , GROUP = english, LANGUAGE = english
0 1 5.2
0 9.00 9.12 9.59
1 9.01 9.10 9.49
4.61 9.04 9.05 9.11
, , GROUP = heritage, LANGUAGE = english
0 1 5.2
0 9.10 9.22 9.69
1 9.11 9.20 9.59
4.61 9.14 9.15 9.22
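Remember that these predictions are on the log2 scale (since RT_log=log2(RT)); to read them in ms, one can back-transform with 2^x. A minimal sketch (the *_ms columns are added here only for interpretation; they are not part of the analysis above):
nd$PRED_ms     <- 2^nd$PRED      # back-transform the predicted log2 RTs to ms
nd$PRED_LOW_ms <- 2^nd$PRED_LOW  # lower CI boundary in ms
nd$PRED_UPP_ms <- 2^nd$PRED_UPP  # upper CI boundary in ms
head(round(nd[, c("PRED_ms", "PRED_LOW_ms", "PRED_UPP_ms")], 1))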
4 Visual interpretation
4.1 LANGUAGE:GROUP
Let’s visualize the nature of the effect of LANGUAGE:GROUP based on the predictions from effects:
lagr_d <- data.frame(    # make lagr_d a data frame of
   lagr <- effect(       # the effect
      "LANGUAGE:GROUP",  # of LANGUAGE:GROUP
      m_final))          # in m_final
plot(lagr,                            # plot the effect of the interaction
   ylim=range(d$RT_log, na.rm=TRUE),  # w/ these y-axis limits
   grid=TRUE)                         # & w/ a grid
plot(lagr,                            # plot the effect of the interaction
   ylim=range(d$RT_log, na.rm=TRUE),  # w/ these y-axis limits
   x.var="GROUP",                     # w/ this predictor on the x-axis
   grid=TRUE)                         # & w/ a grid
And here’s a version (of the first of the two effects plots) with everything: observed values, predicted values, and the confidence intervals of the predictions:
# split data & predictions up by levels of GROUP:
d_split <- split(d, d$GROUP)
lagr_d_split <- split(lagr_d, lagr_d$GROUP, drop=TRUE)
par(mfrow=c(1, 2))
stripchart(main="RT ~ LANGUAGE for GROUP='english'",   # stripchart w/ main heading
   pch=16, col="#00000020",                            # filled semi-transparent grey circles
   d_split$english$RT_log ~ d_split$english$LANGUAGE,  # data: RT_log ~ LANG (for eng)
   xlab="RT (in ms, logged)", ylab="Language",         # axis labels
   vertical=TRUE, method="jitter"); grid()             # vertical plot & jittered observations, add grid
# add predicted means:
points(seq(lagr_d_split$english$fit), lagr_d_split$english$fit, pch="X", col=c("red", "blue"))
# add confidence intervals:
for (i in seq(lagr_d_split$english$fit)) {       # as many arrows as needed
   arrows(i, lagr_d_split$english$lower[i],      # each from here
          i, lagr_d_split$english$upper[i],      # to here
          code=3, angle=90, col=c("red", "blue")[i])  # both tips w/ 90 degrees
}
stripchart(main="RT ~ LANGUAGE for GROUP='heritage'",    # stripchart w/ main heading
   pch=16, col="#00000020",                              # filled semi-transparent grey circles
   d_split$heritage$RT_log ~ d_split$heritage$LANGUAGE,  # data: RT_log ~ LANG (for her)
   xlab="RT (in ms, logged)", ylab="Language",           # axis labels
   vertical=TRUE, method="jitter"); grid()               # vertical plot & jittered observations, add grid
# add predicted means:
points(seq(lagr_d_split$heritage$fit), lagr_d_split$heritage$fit, pch="X", col=c("red", "blue"))
# add confidence intervals:
for (i in seq(lagr_d_split$heritage$fit)) {      # as many arrows as needed
   arrows(i, lagr_d_split$heritage$lower[i],     # each from here
          i, lagr_d_split$heritage$upper[i],     # to here
          code=3, angle=90, col=c("red", "blue")[i])  # both tips w/ 90 degrees
}
par(mfrow=c(1, 1))
4.2 LANGUAGE:LENGTH
Let’s visualize the nature of the effect of LANGUAGE:LENGTH based on the predictions from effects:
(lale_d <- data.frame(    # make lale_d a data frame of
   lale <- effect(        # the effect
      "LANGUAGE:LENGTH",  # of LANGUAGE:LENGTH
      m_final,            # in m_final
      xlevels=list(       # for when
         LENGTH=3:9))))   # LENGTH is these values
plot(lale,                            # plot the effect of the interaction
   ylim=range(d$RT_log, na.rm=TRUE),  # w/ these y-axis limits
   grid=TRUE)                         # & w/ a grid
And here is a version with everything: observed values, predicted values (red for LANGUAGE: english, blue for LANGUAGE: spanish), and the confidence intervals of the predictions (I’m building this up very stepwise, more so than I might do in actual research):
d_split <- split(d, d$LANGUAGE)
lale_d_split <- split(lale_d, lale_d$LANGUAGE, drop=TRUE)
plot(main="RT ~ LANGUAGE:LENGTH", type="n",   # plot nothing
   xlab="Length"  , x=d$LENGTH,               # x-axis
   ylab="RT in ms", y=d$RT_log); grid()       # y-axis & grid
# plot observed data points for LANGUAGE being eng:
points(jitter(d_split$eng$LENGTH), d_split$eng$RT_log, pch=16, col="#FF000010")
lines(lale_d_split$eng$LENGTH, lale_d_split$eng$fit, col="red")
# plot observed data points for LANGUAGE being spa:
points(jitter(d_split$spa$LENGTH), d_split$spa$RT_log, pch=16, col="#0000FF10")
lines(lale_d_split$spa$LENGTH, lale_d_split$spa$fit, col="blue")
# plot confidence band for LANGUAGE being eng:
polygon(col="#FF000030", border="red",    # semi-transparent red, for once w/ border
   x=c(    lale_d_split$eng$LENGTH,       # x-coordinates left to right, then
       rev(lale_d_split$eng$LENGTH)),     # the same values back (in reverse order)
   y=c(    lale_d_split$eng$lower,        # y-coordinates lower boundary
       rev(lale_d_split$eng$upper)))      # y-coordinates upper boundary
# plot confidence band for LANGUAGE being spa:
polygon(col="#0000FF30", border="blue",   # semi-transparent blue, for once w/ border
   x=c(    lale_d_split$spa$LENGTH,       # x-coordinates left to right, then
       rev(lale_d_split$spa$LENGTH)),     # the same values back (in reverse order)
   y=c(    lale_d_split$spa$lower,        # y-coordinates lower boundary
       rev(lale_d_split$spa$upper)))      # y-coordinates upper boundary
# plot legend
legend(title="Language", bty="n",
   x=6, xjust=0.5, y=11.5, yjust=0.5, ncol=2,
   legend=levels(d$LANGUAGE), fill=c("red", "blue"))
4.3 ZIPFFREQ:LENGTH
Let’s visualize the nature of the effect of ZIPFFREQ:LENGTH based on the predictions from effects:
(zile_d <- data.frame(    # make zile_d a data frame of
   zile <- effect(        # the effect
      "ZIPFFREQ:LENGTH",  # of ZIPFFREQ:LENGTH
      m_final,            # in m_final
      xlevels=list(       # for when
         ZIPFFREQ=seq(3, 6, 0.25),   # ZIPFFREQ is these values
         LENGTH=seq(3, 9, 0.5)))))   # LENGTH is these values
plot(zile,                            # plot the effect of the interaction
   ylim=range(d$RT_log, na.rm=TRUE),  # w/ these y-axis limits
   grid=TRUE)                         # & w/ a grid
Not that great … There are two main ways we could try and plot this better. One would be with an actual 3-D plot. The simplest version of this might use rgl::plot3d like this, which would open a new window with a rotatable 3-D plot (but this wouldn’t be rendered into this HTML properly, which is why I am not executing it here):
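rgl::plot3d(zile_d$ZIPFFREQ, zile_d$LENGTH, zile_d$fit)
Another option would be to use plotly::plot_ly:
library(plotly)
fig <- plot_ly(zile_d,
   x=~ZIPFFREQ, y=~LENGTH, z=~fit,
   color=~fit, colors=c("#FFFFFF20", "#00000020"))
fig <- fig %>% add_markers()
fig <- fig %>% layout(scene=list(
   xaxis=list(title='Zipf frequency'),
   yaxis=list(title='Length'),
   zaxis=list(title='Predicted RT')))
fig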
While this is nice, it’s also hard to publish in print. Let’s do this differently with a numeric heatmap kind of plot that I often use. For that, the numeric predictions are first cut into 9 ordinal bins with lower/higher bin numbers reflecting lower/higher numeric predictions …
zile_d$fitcat <- as.numeric(   # create a numeric column fitcat in zile_d
   cut(zile_d$fit,             # by cutting the predicted values up
      breaks=seq(              # at the values ranging from
         min(zile_d$fit),      # the smallest obs/pred RT to
         max(zile_d$fit),      # the largest obs/pred RT
         length.out=10),       # into 9 groups
      include.lowest=TRUE,     # include the lowest, and
      labels=1:9))             # use the numbers 1:9 as labels
… and then we plot them into a coordinate system:
plot(main="RT ~ ZIPFFREQ:LENGTH", type="n", # plot nothing, justxlab="Zipf frequency", xlim=range(zile_d$ZIPFFREQ), x=0, # get the coordinateylab="Length" , ylim=range(zile_d$LENGTH) , y=0) # system set uptext( # plot textx=zile_d$ZIPFFREQ, y=zile_d$LENGTH, # at these coordinates, namelylabels=zile_d$fitcat, # the 'categorical/ordinal' predictions# make the physical size of the numbers dependent on the valuescex=0.5+zile_d$fitcat/7,# make the darkness of the grey reflect the predicted RT_logcol=grey(level=zile_d$fitcat/10)) # see ?grey
However, there’s one thing that’s possibly problematic about this – what is that? The problem is that we are binning with cut on the basis of zile_d$fit (check the above code), but the range of those predictions is much smaller than the range of the actually observed reaction times in d$RT_log:
range(zile_d$fit)
[1] 9.120337 9.674938
range(d$RT_log)
[1] 8.082149 12.011926
That in turn means the binning steps based on zile_d$fit will be based on much smaller differences between predicted values (of only 40, check levels(cut(zile_d$fit, breaks=seq(min(zile_d$fit), max(zile_d$fit), length.out=10)))), meaning a difference of predictions of only 81 ms could lead to a difference of 3 bins. If we were to bin based on the actual value range, that would not happen, because there the binning steps are 429 ms wide (check levels(cut(d$RT_log, breaks=seq(min(d$RT_log), max(d$RT_log), length.out=10)))). Thus, the above plot exaggerates the effects somewhat.
Accordingly, we determine the range of the values to be plotted in a potentially more representative way, by using
as a minimum, the minimum of the predictions and the actual values combined;
as a maximum, the maximum of the predictions and the actual values combined:
# define the minimum of the range to be used for binning
min_of_obs_n_pred <- min(c(
   d$RT_log, zile_d$fit),
   na.rm=TRUE)
# define the maximum of the range to be used for binning
max_of_obs_n_pred <- max(c(
   d$RT_log, zile_d$fit),
   na.rm=TRUE)
We can see that the initial plot indeed exaggerated the range: the plot above showed prediction bins from 1 to 9, but we now see that the predictions actually only fall into a really small range of values.
zile_d$fitcat2 <- as.numeric(  # create a numeric column fitcat2 in zile_d
   cut(zile_d$fit,             # by cutting the predicted values up
      breaks=seq(              # at the values ranging from
         min_of_obs_n_pred,    # the smallest obs/pred RT_log to <---------
         max_of_obs_n_pred,    # the largest obs/pred RT_log    <---------
         length.out=10),       # into 9 groups
      include.lowest=TRUE,     # include the lowest, and
      labels=1:9))             # use the numbers 1:9 as labels
table(zile_d$fitcat, zile_d$fitcat2)
Thus, we plot this again (with some additional info in the form of the decile-based grid), which here doesn’t help much:
plot(main="RT ~ ZIPFFREQ:LENGTH", type="n", # plot nothing, justxlab="Zipf frequency", xlim=range(zile_d$ZIPFFREQ), x=0, # get the coordinateylab="Length" , ylim=range(zile_d$LENGTH) , y=0) # system set uptext( # plot textx=zile_d$ZIPFFREQ, y=zile_d$LENGTH, # at these coordinates, namelylabels=zile_d$fitcat2, # the 'categorical/ordinal' predictions# make the physical size of the numbers dependent on the valuescex=0.5+zile_d$fitcat2/7,# make the darkness of the grey reflect the predicted RT_logcol=grey(level=zile_d$fitcat2/15)) # see ?greyabline(h=quantile(d$LENGTH , probs=seq(0, 1, 0.1)), lty=3, col="#00000020")abline(v=quantile(d$ZIPFFREQ, probs=seq(0, 1, 0.1)), lty=3, col="#00000020")
5 What if we restrict RT?
Given that we already noticed early on that the long tail of RT might cause problems, we used a logged version of this variable. But what if we had done something else: what if we had not transformed RT, but had also not included all of its values in the analysis? Here, we run the model selection process again, but now on a subset of the data in d, which is delimited like this:
shortest.densest.region <- function (
   nv,              # a numeric vector
   target=0.9,      # the target proportion to retain
   plotty=FALSE) {  # whether to return hist & ecdf
   zero2one <- function (nv) {
      if (length(unique(nv))==1) {
         output <- rep(0.5, length(nv))
      } else {
         output <- "/"(
            nv - min(nv, na.rm=TRUE),
            diff(range(nv, na.rm=TRUE)))
      }; return(output) }
   nv <- nv[!is.na(nv)]
   tabulated.nv <- table(nv)
   all.subsettings <- mapply(":", seq(tabulated.nv), length(tabulated.nv))
   all.coverages <- lapply(all.subsettings, \(af) cumsum(tabulated.nv[af])/length(nv))
   results <- data.frame(
      # the values on the left defining the beg:
      BEGnum=rep(as.numeric(names(tabulated.nv)), sapply(all.subsettings, length)),
      # the values on the right defining the end:
      ENDnum=as.numeric(names(tabulated.nv)[unlist(all.subsettings)]),
      # the positions of the (sorted) values on the left defining the beg:
      BEGpos=rep(seq(all.subsettings), sapply(all.subsettings, length)),
      # the positions of the (sorted) values on the right defining the end:
      ENDpos=unlist(all.subsettings))
   # the NUMERICAL distance between the beg & the end:
   results$RANGEnum <- results$ENDnum-results$BEGnum
   # the POSITIONAL distance between the beg & the end:
   results$RANGEpos <- results$ENDpos-results$BEGpos+1
   # coverage: how much of the tokens as a proportion is between the beg & the end:
   results$COVERAGES <- unlist(all.coverages)
   # coverage normalized by the proportion of the NUMERICAL range between beg & end out of whole range:
   results$COVbyRANGEnum <- results$COVERAGES/results$RANGEnum
   # coverage normalized by the proportion of the POSITIONAL range between beg & end out of whole range:
   results$COVbyRANGEpos <- results$COVERAGES/results$RANGEpos
   # reduce to those ranges meeting the target:
   results <- results[results$COVERAGES>=target,]
   # conflating those into a single Euclidean distance:
   results$EUCL <- sqrt(results$COVbyRANGEnum^2 + results$COVbyRANGEpos^2)
   results <- results[order(-results$EUCL, results$COVERAGES),]
   results <- results[,c(7,10,1,2,5,8,3,4,6,9)]
   output <- list(
      "Smallest difference div. by 2"=nv %>% unique %>% sort %>% diff %>% "/"(2) %>% min,
      "Sorted by Euclidean based on width & length"=results)
   output[["Recommended range (based in Eucl)"]]=c(
      "lower"=results$BEGnum[1] - output[[1]],
      "upper"=results$ENDnum[1] + output[[1]])
   if (plotty) {
      par(mfrow=c(1,2))
      hist(nv, breaks="FD", main="")
      abline(v=output[[3]], col="blue", lty=3)
      plot(ecdf(nv), verticals=TRUE)
      abline(v=output[[3]], col="blue", lty=3)
      par(mfrow=c(1,1))
   }
   return(output)
}
shortest.densest.region(d$RT, target=0.9)[3:4]
$`Recommended range (based in Eucl)`
lower upper
364.5 967.5
$<NA>
NULL
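Based on that recommended range, we define keep, a logical vector that marks the data points to retain, and mark the boundaries in a histogram of RT:
table(keep <- d$RT>364.5 & d$RT<967.5)
hist(d$RT, breaks="FD", xlim=c(0, 4250))
abline(v=c(364.5, 967.5), col="green")
5.1 Exploration & preparation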
Let’s see how the above exploration results (for response and predictors) change when we use the restricted unlogged response variable; note the use of with(..., { ...}):
# the predictor(s) w/ the response
with(d[keep,], {
plot(main="RT in ms as a function of frequency per million words",
   pch=16, col="#00000030",
   xlab="Zipf frequency", xlim=c(  3,    6), x=ZIPFFREQ,
   ylab="RT (in ms)"    , ylim=c(250, 1000), y=RT); grid()
abline(lm(RT ~ ZIPFFREQ), col="green", lwd=2)
plot(main="RT in ms as a function of length in characters",
   pch=16, col="#00000020",
   xlab="Length"    , xlim=c(  3,    9), x=jitter(LENGTH),
   ylab="RT (in ms)", ylim=c(250, 1000), y=RT); grid()
abline(lm(RT ~ LENGTH), col="green", lwd=2)
qwe <- boxplot(main="RT in ms as a function of stimulus language",
   RT ~ LANGUAGE, notch=TRUE, outline=FALSE,
   xlab="Stimulus language",
   ylab="RT (in ms)", ylim=c(250, 1000)); grid()
qwe <- boxplot(main="RT in ms as a function of speaker group",
   RT ~ GROUP, notch=TRUE, outline=FALSE,
   xlab="Speaker group",
   ylab="RT (in ms)", ylim=c(250, 1000)); grid()
})
This does look much better!
5.2 Modeling & numerical interpretation
Let’s see whether/how the modeling process changes if we only look at the data points we decided to keep:
summary(m_11 <- lm(        # make/summarize the linear model m_11:
   RT ~ 1 +                # RT ~ an overall intercept (1)
   # & these predictors & their pairwise interactions
   (ZIPFFREQ+LANGUAGE+GROUP+LENGTH)^2,
   data=d, subset=keep,    # those vars, but now in this subset
   na.action=na.exclude))  # skip cases with NA/missing data (good habit)
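Again, we check which interaction we can drop first:
drop1(m_11,   # drop each droppable predictor at a time from m_11 &
   test="F")  # test its significance w/ an F-test
Let’s fit the 2nd model with the ns 2-way interaction ZIPFFREQ:LENGTH deleted:
summary(m_12 <- update(  # make m_12 an update of
   m_11, .~.             # m_11, namely all of it (.~.), but then
   - ZIPFFREQ:LENGTH))   # remove this interaction
anova(m_11, m_12,  # compare m_11 to m_12
   test="F")       # using an F-test
drop1(m_12,   # drop each droppable predictor at a time from m_12 &
   test="F")  # test its significance w/ an F-test
Let’s fit the 3rd model with the ns interaction ZIPFFREQ:LANGUAGE deleted:
summary(m_13 <- update(   # make m_13 an update of
   m_12, .~.              # m_12, namely all of it (.~.), but then
   - ZIPFFREQ:LANGUAGE))  # remove this interaction
anova(m_12, m_13,  # compare m_12 to m_13
   test="F")       # using an F-test
drop1(m_13,   # drop each droppable predictor at a time from m_13 &
   test="F")  # test its significance w/ an F-test
Let’s fit the 4th model with the ns interaction LANGUAGE:LENGTH deleted:
summary(m_14 <- update(  # make m_14 an update of
   m_13, .~.             # m_13, namely all of it (.~.), but then
   - LANGUAGE:LENGTH))   # remove this interaction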
anova(m_13, m_14,  # compare m_13 to m_14
   test="F")       # using an F-test
Analysis of Variance Table
Model 1: RT ~ ZIPFFREQ + LANGUAGE + GROUP + LENGTH + ZIPFFREQ:GROUP +
LANGUAGE:GROUP + LANGUAGE:LENGTH + GROUP:LENGTH
Model 2: RT ~ ZIPFFREQ + LANGUAGE + GROUP + LENGTH + ZIPFFREQ:GROUP +
LANGUAGE:GROUP + GROUP:LENGTH
Res.Df RSS Df Sum of Sq F Pr(>F)
1 6968 111171195
2 6969 111181220 -1 -10025 0.6283 0.428
drop1(m_14,   # drop each droppable predictor at a time from m_14 &
   test="F")  # test its significance w/ an F-test
Single term deletions
Model:
RT ~ ZIPFFREQ + LANGUAGE + GROUP + LENGTH + ZIPFFREQ:GROUP +
LANGUAGE:GROUP + GROUP:LENGTH
Df Sum of Sq RSS AIC F value Pr(>F)
<none> 111181220 67528
ZIPFFREQ:GROUP 1 15002 111196222 67526 0.9403 0.3322
LANGUAGE:GROUP 1 930487 112111707 67584 58.3243 2.52e-14 ***
GROUP:LENGTH 1 26256 111207477 67527 1.6458 0.1996
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Let’s fit the 5th model with the ns interaction ZIPFFREQ:GROUP deleted:
summary(m_15 <- update(  # make m_15 an update of
   m_14, .~.             # m_14, namely all of it (.~.), but then
   - ZIPFFREQ:GROUP))    # remove this interaction
Call:
lm(formula = RT ~ ZIPFFREQ + LANGUAGE + GROUP + LENGTH + LANGUAGE:GROUP +
GROUP:LENGTH, data = d, subset = keep, na.action = na.exclude)
Residuals:
Min 1Q Median 3Q Max
-283.78 -93.65 -21.27 79.27 418.37
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 743.156 18.374 40.445 < 2e-16 ***
ZIPFFREQ -26.223 2.812 -9.325 < 2e-16 ***
LANGUAGEenglish -89.324 4.475 -19.959 < 2e-16 ***
GROUPheritage -40.002 16.975 -2.357 0.0185 *
LENGTH 4.760 2.185 2.178 0.0294 *
LANGUAGEenglish:GROUPheritage 47.523 6.232 7.626 2.74e-14 ***
GROUPheritage:LENGTH 4.099 3.022 1.357 0.1749
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Residual standard error: 126.3 on 6970 degrees of freedom
Multiple R-squared: 0.09065, Adjusted R-squared: 0.08986
F-statistic: 115.8 on 6 and 6970 DF, p-value: < 2.2e-16
anova(m_14, m_15,  # compare m_14 to m_15
   test="F")       # using an F-test
Analysis of Variance Table
Model 1: RT ~ ZIPFFREQ + LANGUAGE + GROUP + LENGTH + ZIPFFREQ:GROUP +
LANGUAGE:GROUP + GROUP:LENGTH
Model 2: RT ~ ZIPFFREQ + LANGUAGE + GROUP + LENGTH + LANGUAGE:GROUP +
GROUP:LENGTH
Res.Df RSS Df Sum of Sq F Pr(>F)
1 6969 111181220
2 6970 111196222 -1 -15002 0.9403 0.3322
drop1(m_15,   # drop each droppable predictor at a time from m_15 &
   test="F")  # test its significance w/ an F-test
Single term deletions
Model:
RT ~ ZIPFFREQ + LANGUAGE + GROUP + LENGTH + LANGUAGE:GROUP +
GROUP:LENGTH
Df Sum of Sq RSS AIC F value Pr(>F)
<none> 111196222 67526
ZIPFFREQ 1 1387245 112583468 67611 86.9553 < 2.2e-16 ***
LANGUAGE:GROUP 1 927823 112124045 67582 58.1578 2.741e-14 ***
GROUP:LENGTH 1 29362 111225584 67526 1.8405 0.1749
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Let’s fit the 6th model with the ns interaction GROUP:LENGTH deleted:
summary(m_16 <- update(  # make m_16 an update of
   m_15, .~.             # m_15, namely all of it (.~.), but then
   - GROUP:LENGTH))      # remove this interaction
Call:
lm(formula = RT ~ ZIPFFREQ + LANGUAGE + GROUP + LENGTH + LANGUAGE:GROUP,
data = d, subset = keep, na.action = na.exclude)
Residuals:
Min 1Q Median 3Q Max
-284.54 -93.37 -21.62 78.96 418.28
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 731.393 16.201 45.145 < 2e-16 ***
ZIPFFREQ -26.188 2.812 -9.312 < 2e-16 ***
LANGUAGEenglish -88.347 4.417 -20.000 < 2e-16 ***
GROUPheritage -17.780 4.454 -3.992 6.61e-05 ***
LENGTH 6.898 1.513 4.558 5.25e-06 ***
LANGUAGEenglish:GROUPheritage 45.640 6.075 7.512 6.54e-14 ***
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Residual standard error: 126.3 on 6971 degrees of freedom
Multiple R-squared: 0.09041, Adjusted R-squared: 0.08975
F-statistic: 138.6 on 5 and 6971 DF, p-value: < 2.2e-16
anova(m_15, m_16,  # compare m_15 to m_16
   test="F")       # using an F-test
Analysis of Variance Table
Model 1: RT ~ ZIPFFREQ + LANGUAGE + GROUP + LENGTH + LANGUAGE:GROUP +
GROUP:LENGTH
Model 2: RT ~ ZIPFFREQ + LANGUAGE + GROUP + LENGTH + LANGUAGE:GROUP
Res.Df RSS Df Sum of Sq F Pr(>F)
1 6970 111196222
2 6971 111225584 -1 -29362 1.8405 0.1749
drop1(m_16,   # drop each droppable predictor at a time from m_16 &
   test="F")  # test its significance w/ an F-test
Single term deletions
Model:
RT ~ ZIPFFREQ + LANGUAGE + GROUP + LENGTH + LANGUAGE:GROUP
Df Sum of Sq RSS AIC F value Pr(>F)
<none> 111225584 67526
ZIPFFREQ 1 1383671 112609255 67611 86.721 < 2.2e-16 ***
LENGTH 1 331496 111557080 67545 20.776 5.250e-06 ***
LANGUAGE:GROUP 1 900428 112126012 67581 56.434 6.538e-14 ***
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
m_final_2 <- m_16; rm(m_16); invisible(gc())
We’re done so let’s compute the confidence intervals and generate nd_2:
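Confint(m_final_2)
nd_2 <- expand.grid(
   ZIPFFREQ=c(0, 1, mean(d$ZIPFFREQ)),
   LENGTH  =c(0, 1, mean(d$LENGTH)),
   GROUP   =levels(d$GROUP),
   LANGUAGE=levels(d$LANGUAGE))
nd_2 <- cbind(nd_2,
   predict(m_final_2, newdata=nd_2, interval="confidence"))
names(nd_2)[5:7] <- c("PRED", "PRED_LOW", "PRED_UPP")
head(nd_2)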
ZIPFFREQ LENGTH GROUP LANGUAGE PRED PRED_LOW PRED_UPP
1 0.000000 0 english spanish 731.3933 699.6345 763.1522
2 1.000000 0 english spanish 705.2049 677.9057 732.5040
3 4.609476 0 english spanish 610.6782 593.3513 628.0050
4 0.000000 1 english spanish 738.2913 708.1073 768.4754
5 1.000000 1 english spanish 712.1029 686.6069 737.5988
6 4.609476 1 english spanish 617.5762 602.9652 632.1872
Again, the maybe (!) nicer way to represent this might be this:
with(nd_2,   # use the data in nd_2 and
   tapply(   # apply
      PRED,  # to the predicted reaction times
      # a grouping like this: the numerics in the rows & columns, the factors in the layers/slices
      list(round(ZIPFFREQ, 2), round(LENGTH, 2), "GROUP"=GROUP, "LANGUAGE"=LANGUAGE),
      # just list the values, but rounded to 2 decimals
      c)) %>% round(2)
, , GROUP = english, LANGUAGE = spanish
0 1 5.2
0 731.39 738.29 767.26
1 705.20 712.10 741.07
4.61 610.68 617.58 646.55
, , GROUP = heritage, LANGUAGE = spanish
0 1 5.2
0 713.61 720.51 749.48
1 687.42 694.32 723.30
4.61 592.90 599.80 628.77
, , GROUP = english, LANGUAGE = english
0 1 5.2
0 643.05 649.94 678.92
1 616.86 623.76 652.73
4.61 522.33 529.23 558.20
, , GROUP = heritage, LANGUAGE = english
0 1 5.2
0 670.91 677.80 706.78
1 644.72 651.62 680.59
4.61 550.19 557.09 586.06
5.3 Visual interpretation
5.3.1 ZIPFFREQ
Let’s visualize the nature of the effect of ZIPFFREQ based on the predictions from effects:
zpf_d <- data.frame(   # make zpf_d a data frame of
   zpf <- effect(      # the effect
      "ZIPFFREQ",      # of ZIPFFREQ
      m_final_2,       # in m_final_2
      xlevels=list(ZIPFFREQ=seq(3, 6.2, 0.2))))
plot(zpf,              # plot the effect of the main effect
   ylim=c(350, 1000),  # w/ these y-axis limits
   grid=TRUE)          # & w/ a grid
And here is a version with everything: observed values, predicted values, and the confidence intervals of the predictions; again, I use with:
with(d[keep,], {plot( # generate a plotxlab="Zipf frequency", x=jitter(ZIPFFREQ), # ZIPFFREQ on x-axisylab="RT in ms" , ylim=c(350, 1000), y=jitter(RT), # RT on y-axispch=16, col="#00000030"); grid() # w/ filled circles & a gridlines( # draw a linex=zpf_d$ZIPFFREQ, # using these x-coordinatesy=zpf_d$fit, # & these y-coordinatescol="green")polygon(col="#00FF0020", border=NA, # use a semi-transparent red & no borderx=c( zpf_d$ZIPFFREQ, # x-coordinates left to right, thenrev(zpf_d$ZIPFFREQ)), # the same values back (in reverse order)y=c( zpf_d$lower, # y-coordinates lower boundaryrev(zpf_d$upper))) # y-coordinates upper boundary# and lines for both variables' meansabline(h=mean(RT, na.rm=TRUE), col="purple"); abline(v=mean(ZIPFFREQ, na.rm=TRUE), col="purple")})
5.3.2 LENGTH
Let’s visualize the nature of the effect of LENGTH based on the predictions from effects:
le_d <- data.frame(    # make le_d a data frame of
   le <- effect(       # the effect
      "LENGTH",        # of LENGTH
      m_final_2,       # in m_final_2
      xlevels=list(LENGTH=3:9)))
plot(le,               # plot the effect of the main effect
   ylim=c(350, 1000),  # w/ these y-axis limits
   grid=TRUE)          # & w/ a grid
And here is a version with everything: observed values, predicted values, and the confidence intervals of the predictions; again, I use with:
with(d[keep,], {plot( # generate a plotxlab="Length" , x=jitter(LENGTH), # LENGTH on x-axisylab="RT in ms", ylim=c(350, 1000), y=jitter(RT), # RT on y-axispch=16, col="#00000030"); grid() # w/ filled circles & a gridlines( # draw a linex=le_d$LENGTH, # using these x-coordinatesy=le_d$fit, # & these y-coordinatescol="green")polygon(col="#0000FF20", border=NA, # use a semi-transparent red & no borderx=c( le_d$LENGTH, # x-coordinates left to right, thenrev(le_d$LENGTH)), # the same values back (in reverse order)y=c( le_d$lower, # y-coordinates lower boundaryrev(le_d$upper))) # y-coordinates upper boundary# and lines for both variables' meansabline(h=mean(RT, na.rm=TRUE), col="purple"); abline(v=mean(LENGTH, na.rm=TRUE), col="purple")})
5.3.3 LANGUAGE:GROUP
Let’s visualize the nature of the effect of LANGUAGE:GROUP based on the predictions from effects:
lagr_d <- data.frame(    # make lagr_d a data frame of
   lagr <- effect(       # the effect
      "LANGUAGE:GROUP",  # of LANGUAGE:GROUP
      m_final_2))        # in m_final_2
plot(lagr,              # plot the effect of the interaction
   ylim=c(350, 1000),   # w/ these y-axis limits
   grid=TRUE)           # & w/ a grid
plot(lagr,              # plot the effect of the interaction
   ylim=c(350, 1000),   # w/ these y-axis limits
   x.var="GROUP",       # w/ this predictor on the x-axis
   grid=TRUE)           # & w/ a grid
And here is a version (of the second of the two effects plots) with everything: observed values, predicted values, and the confidence intervals of the predictions:
# split data & predictions up by levels of LANGUAGE:
d_split <- split(d, d$LANGUAGE)
lagr_d_split <- split(lagr_d, lagr_d$LANGUAGE, drop=TRUE)
par(mfrow=c(1, 2))
stripchart(main="RT ~ GROUP for LANGUAGE='span'",  # stripchart w/ main heading
   pch=16, col="#00000018",                        # filled semi-transparent grey circles
   d_split$spa$RT ~ d_split$spa$GROUP,             # data: RT ~ GROUP (for spa)
   xlab="Group", ylab="RT (in ms)",                # axis labels
   ylim=c(350, 1000), vertical=TRUE,               # make the chart have this vertical y-axis
   method="jitter"); grid()                        # jitter observations, add grid
# add predicted means:
points(seq(lagr_d_split$spa$fit), lagr_d_split$spa$fit, pch="X", col="blue")
# add confidence intervals:
for (i in seq(lagr_d_split$spa$fit)) {    # as many arrows as needed
   arrows(i, lagr_d_split$spa$lower[i],   # each from here
          i, lagr_d_split$spa$upper[i],   # to here
          code=3, angle=90, col="blue")   # both tips w/ 90 degrees
}
stripchart(main="RT ~ GROUP for LANGUAGE='eng'",   # stripchart w/ main heading
   pch=16, col="#00000018",                        # filled semi-transparent grey circles
   d_split$eng$RT ~ d_split$eng$GROUP,             # data: RT ~ GROUP (for eng)
   xlab="Group", ylab="RT (in ms)",                # axis labels
   ylim=c(350, 1000), vertical=TRUE,               # make the chart have this vertical y-axis
   method="jitter"); grid()                        # jitter observations, add grid
# add predicted means:
points(seq(lagr_d_split$eng$fit), lagr_d_split$eng$fit, pch="X", col="red")
# add confidence intervals:
for (i in seq(lagr_d_split$eng$fit)) {    # as many arrows as needed
   arrows(i, lagr_d_split$eng$lower[i],   # each from here
          i, lagr_d_split$eng$upper[i],   # to here
          code=3, angle=90, col="red")    # both tips w/ 90 degrees
}
par(mfrow=c(1, 1))
6 Write-up
To determine whether the reaction time to a word (in ms) varies as a function of
the Zipf frequency of that word (ZIPFFREQ);
the length of that word (LENGTH);
the language that word was presented in (LANGUAGE: english vs. spanish);
the speaker group that word was presented to (GROUP: english vs. heritage);
any pairwise interaction of these predictors (with the specific expectation that the effect of LENGTH will be stronger when LANGUAGE is spanish),
a linear model selection process was undertaken. In order to deal with the very long right tail of the response variable, the reaction times were logged (to the base of 2); then a backwards stepwise model selection of all predictors and controls (based on significance testing) was applied. The final model resulting from the elimination of ns predictors contained the interactions ZIPFFREQ:LENGTH, LANGUAGE:GROUP, and LANGUAGE:LENGTH (plus the main effects participating in them) and was highly significant (F=154.7, df1=7, df2=7744, p<0.0001), but it explained only little of the variability of the response (mult. R2=0.123, adj. R2=0.122). The summary table of the model is shown here; according to it,
as hypothesized, the slowing-down effect of LENGTH is stronger for Spanish than for English words [show plot];
the interaction LANGUAGE:GROUP has the following effects [show at least one plot]:
reaction times are always higher when LANGUAGE is spanish than when LANGUAGE is english, but that difference is larger when GROUP is english than when GROUP is heritage;
when LANGUAGE is english, heritage speakers need more time than learners, but when LANGUAGE is spanish, heritage speakers need less time than learners;
all 4 means of LANGUAGE:GROUP are significantly different from each other (non-overlapping CIs);
the interaction LANGUAGE:LENGTH has the hypothesized effect that the effect of LENGTH is stronger – in fact, nearly 2.25 times as strong – when LANGUAGE is spanish than when LANGUAGE is english [show at least one plot];
the interaction of ZIPFFREQ:LENGTH is not strong: both variables behave as expected – LENGTH slows down, ZIPFFREQ speeds up – but
the effect of LENGTH is much stronger with rarer words than with frequent words;
the effect of ZIPFFREQ is much stronger with longer words than with shorter words.
A separate analysis was conducted on a version of the data that was reduced to 6977 data points (90% of its original size of 7752, after omitting 248 incomplete cases) by eliminating most of the long right tail (all reaction times >967.5 ms) and some of the left tail (all reaction times <364.5 ms).1 In that model selection process,
the interaction LANGUAGE:GROUP had the same effect as in the model above;
the predictors ZIPFFREQ and LENGTH did not participate in interactions but were just main effects with the effect directions one might expect:
higher values of ZIPFFREQ speed speakers up;
higher values of LENGTH slow speakers down.
7 Homework
To prepare for next session, read SFLWR3, Sections 5.3.1-5.3.3.
These boundary values were computed in such a way as to find the shortest interval of values that covers the largest number of values.
Source Code
---title: "Ling 202: session 03: linear regr. 2 (key)"author: - name: "[Stefan Th. Gries](https://www.stgries.info)" affiliation: - UC Santa Barbara - JLU Giessen orcid: 0000-0002-6497-3958date: "2025-04-16 12:34:56"date-format: "DD MMM YYYY HH-mm-ss"editor: sourceformat: html: page-layout: full code-fold: false code-link: true code-copy: true code-tools: true code-line-numbers: true code-overflow: scroll number-sections: true smooth-scroll: true toc: true toc-depth: 4 number-depth: 4 toc-location: left monofont: lucida console tbl-cap-location: top fig-cap-location: bottom fig-width: 5 fig-height: 5 fig-format: png fig-dpi: 300 fig-align: center embed-resources: trueexecute: cache: false echo: true eval: true warning: false---# Introduction: A multifactorial model selection processWe are dealing with the same data set as last session; as a reminder, the data are in [_input/rts.csv](_input/rts.csv) and you can find information about the variables/columns in [_input/rts.r](_input/rts.r).```{r}rm(list=ls(all.names=TRUE))library(car); library(effects); library(emmeans); library(magrittr); library(multcomp)summary(d <-read.delim( # summarize d, the result of loadingfile="_input/rts.csv", # this filestringsAsFactors=TRUE)) # change categorical variables into factorsd$LANGUAGE <-factor(d$LANGUAGE, # for didactic reasons only, I amlevels=levels(d$LANGUAGE)[2:1]) # changing the order of the levelsd$RT_log <-log2(d$RT)```However, this session, we will deal with this in a multifactorial way. Specifically, we are asking, does the (logged) reaction time to a word (in ms) vary as a function of* the Zipf frequency of that word (`ZIPFFREQ`);* the length of that word (`LENGTH`);* the language that word was presented in (`LANGUAGE`: *spanish* vs. *english*);* the speaker group that words was presented to (`GROUP`: *english* vs. *heritage*);* any pairwise interaction of these predictors? (Specifically, we are expecting that the effect of `LENGTH` will be stronger for when `LANGUAGE` is *spanish*.)# Exploration & preparationSince there are some missing data but only in the response variable, we immediately reduce `d` to all the complete cases a regression would consider:```{r}summary(d <-droplevels(d[complete.cases(d),]))```Some exploration of the relevant variables; in the interest of time we only look at univariate and monofactorial stuff -- a big 'no no' for any real analysis! 
How would we want to explore the response variable?```{r}hist(d$RT_log); hist(d$RT_log, breaks="FD")```How would we want to explore the predictors and their relations to the response?```{r}# the predictors on their ownhist(d$ZIPFFREQ); hist(d$ZIPFFREQ, breaks="FD")hist(d$LENGTH); hist(d$LENGTH, breaks="FD")table(d$LANGUAGE)table(d$GROUP)# the predictor(s) w/ the responseplot(main="RT (in ms, logged) as a function of Zipf frequency",pch=16, col="#00000020",xlab="Zipf frequency" , x=d$ZIPFFREQ,ylab="RT (in ms, logged)", y=d$RT_log); grid()abline(lm(d$RT_log ~ d$ZIPFFREQ), col="green", lwd=2)plot(main="RT (in ms, logged) as a function of length in characters",pch=16, col="#00000020",xlab="Length" , x=jitter(d$LENGTH),ylab="RT (in ms, logged)", y=d$RT_log); grid()abline(lm(d$RT_log ~ d$LENGTH), col="green", lwd=2)qwe <-boxplot(main="RT (in ms, logged) as a function of stimulus language", d$RT_log ~ d$LANGUAGE, notch=TRUE,xlab="Stimulus language",ylab="RT (in ms, logged)"); grid()asd <-boxplot(main="RT (in ms, logged) as a function of speaker group", d$RT_log ~ d$GROUP, notch=TRUE,xlab="Speaker group",ylab="RT (in ms, logged)"); grid()table(d$LANGUAGE, d$GROUP)```Nothing super problematic, it seems ...# Modeling & numerical interpretationLet's fit and evaluate our initial regression model:```{r}summary(m_01 <-lm( # make/summarize the linear model m_01: RT_log ~1+# RT_log ~ an overall intercept (1)# & these predictors & their pairwise interactions (ZIPFFREQ+LANGUAGE+GROUP+LENGTH)^2,data=d, # those vars are in dna.action=na.exclude)) # skip cases with NA/missing data (good habit)drop1(m_01, # drop each droppable predictor at a time from m_01 &test="F") # test its significance w/ an F-test```Let's fit the 2nd model with that interaction `ZIPFFREQ:GROUP` deleted:```{r}summary(m_02 <-update( # make m_02 an update of m_01, .~. # m_01, namely all of it (.~.), but then- ZIPFFREQ:GROUP)) # remove this interactionanova(m_01, m_02, # compare m_01 to m_02test="F") # using an F-testdrop1(m_02, # drop each droppable predictor at a time from m_02 &test="F") # test its significance w/ an F-test```Let's fit the 3rd model with the ns interaction `GROUP:LENGTH` deleted:```{r}summary(m_03 <-update( # make m_03 an update of m_02, .~. # m_02, namely all of it (.~.), but then- GROUP:LENGTH)) # remove this interactionanova(m_02, m_03, # compare m_02 to m_03test="F") # using an F-testdrop1(m_03, # drop each droppable predictor at a time from m_03 &test="F") # test its significance w/ an F-test```Let's fit the 4th model with the ns interaction `ZIPFFREQ:LANGUAGE` deleted:```{r}summary(m_04 <-update( # make m_04 an update of m_03, .~. # m_03, namely all of it (.~.), but then- ZIPFFREQ:LANGUAGE)) # remove this interactionanova(m_03, m_04, # compare m_03 to m_04test="F") # using an F-testdrop1(m_04, # drop each droppable predictor at a time from m_04 &test="F") # test its significance w/ an F-test```We're done now -- why? Because remember that we said "we are expecting that the effect of `LENGTH` will be stronger for when `LANGUAGE` is *spanish*", meaning if the coefficient for `LANGUAGEenglish:LENGTH` is negative -- which it is -- we can take half its *p*-value and then it is in fact significant! We call the final model `m_final` (for easy copying and pasting across analyses) and do some housekeeping:```{r}m_final <- m_04; rm(m_04); invisible(gc())```Normally, we would do some regression diagnostics **now** -- I have to skip those now to discuss how to interpret the model, but we will return to this later! 
For now, we compute the confidence intervals ...```{r}Confint(m_final)```... , add the predictions of the model to the original data frame:```{r}d$PRED <-# make d$PRED thepredict( # predictions for all cases m_final) # based on m_finalhead(d) # check the result```..., and generate an often super-useful kind of data frame that I always call `nd` (for *new data*) that contains all combinations of* all levels of all categorical predictors and* useful values of all numeric predictors (where 'useful' is usually 0, 1 and the mean):```{r}nd <-expand.grid(ZIPFFREQ=c(0, 1, mean(d$ZIPFFREQ)),LENGTH =c(0, 1, mean(d$LENGTH)),GROUP =levels(d$GROUP),LANGUAGE=levels(d$LANGUAGE))```To that, we add the predictions the model makes for all these combinations and give it some nice(r) names:```{r}nd <-cbind(nd,predict(m_final, newdata=nd, interval="confidence"))names(nd)[5:7] <-c("PRED", "PRED_LOW", "PRED_UPP")head(nd)```A maybe (!) nicer way to represent this might be this, but whether this is nicer or not is probably subjective:```{r}with(nd, # use the data in nd andtapply( # apply PRED, # to the predicted reaction times# a grouping like this: the numerics in the rows & columns, the factors in the layers/sliceslist(round(ZIPFFREQ, 2), round(LENGTH, 2), "GROUP"=GROUP, "LANGUAGE"=LANGUAGE),# just list the values, but rounded to 2 decimals c)) %>%round(2)```# Visual interpretation## `LANGUAGE:GROUP`Let's visualize the nature of the effect of `LANGUAGE:GROUP` based on the predictions from `effects`:```{r}lagr_d <-data.frame( # make lagr_d a data frame of lagr <-effect( # the effect"LANGUAGE:GROUP", # of LANGUAGE:GROUP m_final)) # in m_finalplot(lagr, # plot the effect of the interactionylim=range(d$RT_log, na.rm=TRUE), # w/ these y-axis limitsgrid=TRUE) # & w/ a gridplot(lagr, # plot the effect of the interactionylim=range(d$RT_log, na.rm=TRUE), # w/ these y-axis limitsx.var="GROUP", # w/ this predictor on the x-axisgrid=TRUE) # & w/ a grid```And here's a version (of the first of the two effects plots) with everything: observed values, predicted values, and the confidence intervals of the predictions:```{r}#| fig-width: 10#| fig-show: hold# split data & predictions up by levels of GROUP:d_split <-split(d, d$GROUP)lagr_d_split <-split(lagr_d, lagr_d$GROUP, drop=TRUE)par(mfrow=c(1, 2))stripchart(main="RT ~ LANGUAGE for GROUP='english'", # stripchart w/ main headingpch=16, col="#00000020", # filled semi-transparent grey circles d_split$english$RT_log ~ d_split$english$LANGUAGE, # data: RT_log ~ LANG (for eng)xlab="RT (in ms, logged)", ylab="Language", # axis labelsvertical=TRUE, method="jitter"); grid() # vertical plot & jittered observations, add grid# add predicted means:points(seq(lagr_d_split$english$fit), lagr_d_split$english$fit, pch="X", col=c("red", "blue"))# add confidence intervals:for (i inseq(lagr_d_split$english$fit)) { # as many arrows as neededarrows(i, lagr_d_split$english$lower[i], # each from here i, lagr_d_split$english$upper[i], # to herecode=3, angle=90, col=c("red", "blue")[i]) # both tips w/ 90 degrees}stripchart(main="RT ~ LANGUAGE for GROUP='heritage'", # stripchart w/ main headingpch=16, col="#00000020", # filled semi-transparent grey circles d_split$heritage$RT_log ~ d_split$heritage$LANGUAGE, # data: RT_log ~ LANG (for her)xlab="RT (in ms, logged)", ylab="Language", # axis labelsvertical=TRUE, method="jitter"); grid() # vertical plot & jittered observations, add grid# add predicted means:points(seq(lagr_d_split$heritage$fit), lagr_d_split$heritage$fit, pch="X", col=c("red", "blue"))# add 
confidence intervals:for (i inseq(lagr_d_split$heritage$fit)) { # as many arrows as neededarrows(i, lagr_d_split$heritage$lower[i], # each from here i, lagr_d_split$heritage$upper[i], # to herecode=3, angle=90, col=c("red", "blue")[i]) # both tips w/ 90 degrees}par(mfrow=c(1, 1))```## `LANGUAGE:LENGTH`Let's visualize the nature of the effect of `LANGUAGE:LENGTH` based on the predictions from `effects`:```{r}(lale_d <-data.frame( # make lale_d a data frame of lale <-effect( # the effect"LANGUAGE:LENGTH", # of LANGUAGE:LENGTH m_final, # in m_finalxlevels=list( # for whenLENGTH=3:9)))) # LENGTH is these valuesplot(lale, # plot the effect of the interactionylim=range(d$RT_log, na.rm=TRUE), # w/ these y-axis limitsgrid=TRUE) # & w/ a grid```And here a version with everything: observed values, predicted values (red for `LANGUAGE`: *english*, blue for `LANGUAGE`: *spanish*), and the confidence intervals of the predictions (I'm building this up very stepwise, more so than I might do in actual research):```{r}d_split <-split(d, d$LANGUAGE)lale_d_split <-split(lale_d, lale_d$LANGUAGE, drop=TRUE)plot(main="RT ~ LANGUAGE:LENGTH", type="n", # plot nothingxlab="Length" , x=d$LENGTH, # x-axisylab="RT in ms", y=d$RT_log); grid() # y-axis & grid# plot observed data points for LANGUAGE being eng:points(jitter(d_split$eng$LENGTH), d_split$eng$RT_log, pch=16, col="#FF000010")lines(lale_d_split$eng$LENGTH, lale_d_split$eng$fit, col="red")# plot observed data points for LANGUAGE being spa:points(jitter(d_split$spa$LENGTH), d_split$spa$RT_log, pch=16, col="#0000FF10")lines(lale_d_split$spa$LENGTH, lale_d_split$spa$fit, col="blue")# plot confidence band for LANGUAGE being eng:polygon(col="#FF000030", border="red", # semi-transparent grey, for once w/ borderx=c( lale_d_split$eng$LENGTH, # x-coordinates left to right, thenrev(lale_d_split$eng$LENGTH)), # the same values back (in reverse order)y=c( lale_d_split$eng$lower, # y-coordinates lower boundaryrev(lale_d_split$eng$upper))) # y-coordinates upper boundary# plot confidence band for LANGUAGE being spa:polygon(col="#0000FF30", border="blue", # semi-transparent grey, for once w/ borderx=c( lale_d_split$spa$LENGTH, # x-coordinates left to right, thenrev(lale_d_split$spa$LENGTH)), # the same values back (in reverse order)y=c( lale_d_split$spa$lower, # y-coordinates lower boundaryrev(lale_d_split$spa$upper))) # y-coordinates upper boundary# plot legendlegend(title="Language", bty="n",x=6, xjust=0.5, y=11.5, yjust=0.5, ncol=2,legend=levels(d$LANGUAGE), fill=c("red", "blue"))``````{r}#| echo: false#| eval: false# These are the slopes of LENGTH one might report,# which differ by coef(m_final)[8]emmeans::emtrends( # of trend lines (i.e. slopes of regression lines)object=m_final, # for the model m_finalvar="LENGTH", # for this numeric predictorspecs=~ LANGUAGE) # for each level of this categorical predictor```## `ZIPFFREQ:LENGTH`Let's visualize the nature of the effect of `ZIPFFREQ:LENGTH` based on the predictions from `effects`:```{r}(zile_d <-data.frame( # make zile_d a data frame of zile <-effect( # the effect"ZIPFFREQ:LENGTH", # of ZIPFFREQ:LENGTH m_final, # in m_finalxlevels=list( # for whenZIPFFREQ=seq(3, 6, 0.25), # ZIPFFREQ is these valuesLENGTH=seq(3, 9, 0.5))))) # LENGTH is these valuesplot(zile, # plot the effect of the interactionylim=range(d$RT_log, na.rm=TRUE), # w/ these y-axis limitsgrid=TRUE) # & w/ a grid```Not that great ... There are two main ways we could try and plot this better. One would be with an actual 3-D plot. 
The simplest version of this might use `rgl::plot3D` like this, which would open a new window with a rotatable 3-D plot (but this wouldn't be rendered into this HTML properly, which is why I am not executing it here):```{r}#| eval: falsergl::plot3d(zile_d$ZIPFFREQ, zile_d$LENGTH, zile_d$fit)```Another option would be to use `plotly::plot_ly`:```{r}library(plotly)fig <-plot_ly(zile_d,x=~ZIPFFREQ, y=~LENGTH, z=~fit, color=~fit, colors=c("#FFFFFF20", "#00000020"))fig <- fig %>%add_markers()fig <- fig %>%layout(scene =list(xaxis =list(title ='Zipf frequency'),yaxis =list(title ='Length'),zaxis =list(title ='Predicted RT')))fig```While this is nice, it's also hard to publish in print. Let's do this differently with a numeric heatmap kind of plot that I often use. For that, the numeric predictions are first cut into 9 ordinal bins with lower/higher bin numbers reflecting lower/higher numeric predictions ...```{r}zile_d$fitcat <-as.numeric( # create a numeric column fitcat in zile_dcut(zile_d$fit, # by cutting the predicted values upbreaks=seq( # at the values ranging frommin(zile_d$fit), # the smallest obs/pred RT tomax(zile_d$fit), # the largest obs/pred RTlength.out=10), # into 9 groupsinclude.lowest=TRUE, # include the lowest, andlabels=1:9)) # use the numbers 1:9 as labels```... and then we plot them into a coordinate system:```{r}plot(main="RT ~ ZIPFFREQ:LENGTH", type="n", # plot nothing, justxlab="Zipf frequency", xlim=range(zile_d$ZIPFFREQ), x=0, # get the coordinateylab="Length" , ylim=range(zile_d$LENGTH) , y=0) # system set uptext( # plot textx=zile_d$ZIPFFREQ, y=zile_d$LENGTH, # at these coordinates, namelylabels=zile_d$fitcat, # the 'categorical/ordinal' predictions# make the physical size of the numbers dependent on the valuescex=0.5+zile_d$fitcat/7,# make the darkness of the grey reflect the predicted RT_logcol=grey(level=zile_d$fitcat/10)) # see ?grey```However, there's one thing that's possibly problematic about this -- what is that? The problem is that we are binning with `cut` on the basis of `zile_d$fit` (check the above code), but the range of those predictions is much smaller than the range of the actually observed reaction times in `d$RT_log`:```{r}range(zile_d$fit)range(d$RT_log)```That in turn means the binning steps based on `zile_d$fit` will be based on much smaller differences between predicted values (of only 40, check `levels(cut(zile_d$fit, breaks=seq(min(zile_d$fit), max(zile_d$fit), length.out=10)))`), meaning a difference of predictions of only 81 ms could lead to a difference in 3 bins -- if we were to bin based on the actual value range, that would not happen, because there the binning steps are 429 ms wide (check `levels(cut(d$RT_log, breaks=seq(min(d$RT_log), max(d$RT_log), length.out=10)))`). 
Thus, the above plot exaggerates the effects somewhat.Accordingly, we determine the range of the values to be plotted in a potentially more representative way, by using* as a minimum, the minimum of the predictions *and * the actual values combined;* as a maximum, the maximum of the predictions *and * the actual values combined:```{r}# define the minimum of the range to be used for binningmin_of_obs_n_pred <-min(c( d$RT_log, zile_d$fit),na.rm=TRUE)# define the maximum of the range to be used for binningmax_of_obs_n_pred <-max(c( d$RT_log, zile_d$fit),na.rm=TRUE)```We can see that the initial plot indeed exaggerated the range: the upper plot showed prediction bins from 1 to 9, but we now see that the predicctions are actually only in a really small range of values.```{r}zile_d$fitcat2 <-as.numeric( # create a numeric column fitcat2 in zile_dcut(zile_d$fit, # by cutting the predicted values upbreaks=seq( # at the values ranging from min_of_obs_n_pred, # the smallest obs/pred RT_log to <--------- max_of_obs_n_pred, # the largest obs/pred RT_log <---------length.out=10), # into 9 groupsinclude.lowest=TRUE, # include the lowest, andlabels=1:9)) # use the numbers 1:9 as labelstable(zile_d$fitcat, zile_d$fitcat2)```Thus, we plot this again (with some additional info in the form of the decile-based grid), which here doesn't help much:```{r}plot(main="RT ~ ZIPFFREQ:LENGTH", type="n", # plot nothing, justxlab="Zipf frequency", xlim=range(zile_d$ZIPFFREQ), x=0, # get the coordinateylab="Length" , ylim=range(zile_d$LENGTH) , y=0) # system set uptext( # plot textx=zile_d$ZIPFFREQ, y=zile_d$LENGTH, # at these coordinates, namelylabels=zile_d$fitcat2, # the 'categorical/ordinal' predictions# make the physical size of the numbers dependent on the valuescex=0.5+zile_d$fitcat2/7,# make the darkness of the grey reflect the predicted RT_logcol=grey(level=zile_d$fitcat2/15)) # see ?greyabline(h=quantile(d$LENGTH , probs=seq(0, 1, 0.1)), lty=3, col="#00000020")abline(v=quantile(d$ZIPFFREQ, probs=seq(0, 1, 0.1)), lty=3, col="#00000020")``````{r}#| echo: false#| eval: false# LANGUAGE:GROUP# scenario 1a: what are the meansnd[nd$ZIPFFREQ==0& nd$LENGTH==0,]# scenario 1b: are they different from 0?c(1,0,0,0,0,0,0,0) %>%matrix(nrow=1) %>%glht(m_final, .) %>% summary # LANG:engl & GRP:englc(1,0,0,1,0,0,0,0) %>%matrix(nrow=1) %>%glht(m_final, .) %>% summary # LANG:engl & GRP:heric(1,0,1,0,0,0,0,0) %>%matrix(nrow=1) %>%glht(m_final, .) %>% summary # LANG:span & GRP:englc(1,0,1,1,0,0,1,0) %>%matrix(nrow=1) %>%glht(m_final, .) %>% summary # LANG:span & GRP:heri# scenario 1c: are they different from each other?pairs(emmeans::emmeans(object=m_final, ~ GROUP | LANGUAGE, at=list(ZIPFFREQ=0, LENGTH=0)), adjust="none")pairs(emmeans::emmeans(object=m_final, ~ LANGUAGE | GROUP, at=list(ZIPFFREQ=0, LENGTH=0)), adjust="none")pairs(emmeans::emmeans(object=m_final, ~ LANGUAGE:GROUP, at=list(ZIPFFREQ=0, LENGTH=0)), adjust="none")# without at: when ZIPFFREQ and LENGTH are at mean for the variable LENGTH interacting with them # or of course long glht comparisons# LANGUAGE:LENGTHnd[nd$ZIPFFREQ %in%0:1& nd$LENGTH %in%0:1,]# scenario 3a: what are the slopes of LENGTH for each level of LANGUAGE# scenario 3b: are they different from 0?c(0,0,0,0,1,0,0,0) %>%matrix(nrow=1) %>%glht(m_final, .) %>% summary # LANGUAGE: english & GROUP:englishc(0,0,0,0,1,0,0,1) %>%matrix(nrow=1) %>%glht(m_final, .) 
%>% summary # LANGUAGE: spanish & GROUP:english# scenario 3c: are they different from each other?c(0,0,0,0,0,0,0,1) %>%matrix(nrow=1) %>%glht(m_final, .) %>% summarypairs(emmeans::emtrends(object=m_final, ~ LANGUAGE, var="LENGTH"), adjust="none")(c(0,0,0,0,1,0,0,0) -c(0,0,0,0,1,0,0,1)) %>%matrix(nrow=1) %>%glht(m_final, .) %>% summary # LANGUAGE: english & GROUP:english# ZIPFFREQ:LENGTHnd[nd$GROUP=="english"& nd$LANGUAGE=="english"& nd$ZIPFFREQ %in%0:1& nd$LENGTH %in%0:1,]# scenario 3b: are all the following different from 0?# scenario 3a: what are the slopes of LENGTH when ZIPFFREQ=0?c(0,0,0,0,1,0,0,1) %>%matrix(nrow=1) %>%glht(m_final, .) %>% summary # LANGUAGE: english & GROUP:english# scenario 3a: what are the slopes of LENGTH when ZIPFFREQ=1?c(0,0,0,0,1,1,0,1) %>%matrix(nrow=1) %>%glht(m_final, .) %>% summary # LANGUAGE: english & GROUP:english# scenario 3a: what are the slopes of ZIPFFREQ when LENGTH=0?c(0,1,0,0,0,0,0,0) %>%matrix(nrow=1) %>%glht(m_final, .) %>% summary # LANGUAGE: english & GROUP:english# scenario 3a: what are the slopes of ZIPFFREQ when LENGTH=1?c(0,1,0,0,0,1,0,0) %>%matrix(nrow=1) %>%glht(m_final, .) %>% summary # LANGUAGE: english & GROUP:english# scenario 3b: are they different from each other?c(0,0,0,0,0,1,0,0) %>%matrix(nrow=1) %>%glht(m_final, .) %>% summary # LANGUAGE: english & GROUP:english```# What if we restrict `RT`?Given that we already noticed early on that the long tail of `RT` might cause problems, we used a logged version of this variable. But what if we had done something else, if we had not transformed `RT` but also not included all its values in the analysis? Here, we run the model selection process again, but now on a subset of the data in `d`, which is delimited like this:```{r}shortest.densest.region <-function ( nv, # a numeric vectortarget=0.9, # the target proportion to retainplotty=FALSE) { # whether to return hist & ecdf zero2one <-function (nv) {if (length(unique(nv))==1) { output <-rep(0.5, length(nv)) } else { output <-"/"( nv -min(nv, na.rm=TRUE),diff(range(nv, na.rm=TRUE))) }; return(output) } nv <- nv[!is.na(nv)] tabulated.nv <-table(nv) all.subsettings <-mapply(":", seq(tabulated.nv), length(tabulated.nv)) all.coverages <-lapply(all.subsettings, \(af) cumsum(tabulated.nv[af])/length(nv)) results <-data.frame(# the values on the left defining the beg:BEGnum=rep(as.numeric(names(tabulated.nv)), sapply(all.subsettings, length)),# the values on the right defining the end:ENDnum=as.numeric(names(tabulated.nv)[unlist(all.subsettings)]),# the positions of the (sorted) values on the left defining the beg:BEGpos=rep(seq(all.subsettings), sapply(all.subsettings, length)),# the positions of the (sorted) values on the right defining the end:ENDpos=unlist(all.subsettings))# the NUMERICAL distance between the beg & the end: results$RANGEnum <- results$ENDnum-results$BEGnum# the POSITIONAL distance between the beg & the end: results$RANGEpos <- results$ENDpos-results$BEGpos+1# coverage: how much of the tokens as a proportion is between the beg & the end: results$COVERAGES <-unlist(all.coverages)# coverage normalized by the proportion of the NUMERICAL range between beg & end out of whole range: results$COVbyRANGEnum <- results$COVERAGES/results$RANGEnum# coverage normalized by the proportion of the POSITIONAL range between beg & end out of whole range: results$COVbyRANGEpos <- results$COVERAGES/results$RANGEpos# reduce to those ranges meeting the target: results <- results[results$COVERAGES>=target,]# conflating those into a single Euclidean 
# What if we restrict `RT`?

Given that we already noticed early on that the long tail of `RT` might cause problems, we used a logged version of this variable. But what if we had done something else: what if we had not transformed `RT`, but had also not included all of its values in the analysis? Here, we run the model selection process again, but now on a subset of the data in `d`, which is delimited like this:

```{r}
shortest.densest.region <- function (
   nv,             # a numeric vector
   target=0.9,     # the target proportion to retain
   plotty=FALSE) { # whether to also plot hist & ecdf
   zero2one <- function (nv) {
      if (length(unique(nv))==1) {
         output <- rep(0.5, length(nv))
      } else {
         output <- "/"(
            nv - min(nv, na.rm=TRUE),
            diff(range(nv, na.rm=TRUE)))
      }; return(output)
   }
   nv <- nv[!is.na(nv)]
   tabulated.nv <- table(nv)
   all.subsettings <- mapply(":", seq(tabulated.nv), length(tabulated.nv))
   all.coverages <- lapply(all.subsettings, \(af) cumsum(tabulated.nv[af])/length(nv))
   results <- data.frame(
      # the values on the left defining the beg:
      BEGnum=rep(as.numeric(names(tabulated.nv)), sapply(all.subsettings, length)),
      # the values on the right defining the end:
      ENDnum=as.numeric(names(tabulated.nv)[unlist(all.subsettings)]),
      # the positions of the (sorted) values on the left defining the beg:
      BEGpos=rep(seq(all.subsettings), sapply(all.subsettings, length)),
      # the positions of the (sorted) values on the right defining the end:
      ENDpos=unlist(all.subsettings))
   # the NUMERICAL distance between the beg & the end:
   results$RANGEnum <- results$ENDnum-results$BEGnum
   # the POSITIONAL distance between the beg & the end:
   results$RANGEpos <- results$ENDpos-results$BEGpos+1
   # coverage: how much of the tokens as a proportion is between the beg & the end:
   results$COVERAGES <- unlist(all.coverages)
   # coverage normalized by the proportion of the NUMERICAL range between beg & end out of whole range:
   results$COVbyRANGEnum <- results$COVERAGES/results$RANGEnum
   # coverage normalized by the proportion of the POSITIONAL range between beg & end out of whole range:
   results$COVbyRANGEpos <- results$COVERAGES/results$RANGEpos
   # reduce to those ranges meeting the target:
   results <- results[results$COVERAGES>=target,]
   # conflate those into a single Euclidean distance:
   results$EUCL <- sqrt(results$COVbyRANGEnum^2 + results$COVbyRANGEpos^2)
   results <- results[order(-results$EUCL, results$COVERAGES),]
   results <- results[,c(7,10,1,2,5,8,3,4,6,9)]
   output <- list(
      "Smallest difference div. by 2"=nv %>% unique %>% sort %>% diff %>% "/"(2) %>% min,
      "Sorted by Euclidean based on width & length"=results)
   output[["Recommended range (based on Eucl)"]]=c(
      "lower"=results$BEGnum[1] - output[[1]],
      "upper"=results$ENDnum[1] + output[[1]])
   if (plotty) {
      par(mfrow=c(1,2))
      hist(nv, breaks="FD", main="")
      abline(v=output[[3]], col="blue", lty=3)
      plot(ecdf(nv), verticals=TRUE)
      abline(v=output[[3]], col="blue", lty=3)
      par(mfrow=c(1,1))
   }
   return(output)
}
```

```{r}
shortest.densest.region(d$RT, target=0.9)[3]
table(keep <- d$RT>364.5 & d$RT<967.5)
hist(d$RT, breaks="FD", xlim=c(0, 4250))
abline(v=c(364.5, 967.5), col="green")
```
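As a small sanity check (a sketch using the logical vector `keep` just created), one can confirm that the retained proportion is indeed close to the 90% target and compare the recommended cut-offs to a naive symmetric trim:

```{r}
mean(keep)                          # proportion of data points retained
sum(keep)                           # number of data points retained
quantile(d$RT, probs=c(0.05, 0.95)) # for comparison: a symmetric 5%/95% trim
```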
## Exploration & preparation

Let's see how the above exploration results (for response and predictors) change when we use the restricted, unlogged response variable; note the use of `with(..., { ... })`:

```{r}
# the predictor(s) w/ the response
with(d[keep,], {
plot(main="RT in ms as a function of Zipf frequency",
   pch=16, col="#00000030",
   xlab="Zipf frequency", xlim=c(  3,    6), x=ZIPFFREQ,
   ylab="RT (in ms)"    , ylim=c(250, 1000), y=RT); grid()
abline(lm(RT ~ ZIPFFREQ), col="green", lwd=2)
plot(main="RT in ms as a function of length in characters",
   pch=16, col="#00000020",
   xlab="Length"    , xlim=c(  3,    9), x=jitter(LENGTH),
   ylab="RT (in ms)", ylim=c(250, 1000), y=RT); grid()
abline(lm(RT ~ LENGTH), col="green", lwd=2)
qwe <- boxplot(main="RT in ms as a function of stimulus language",
   RT ~ LANGUAGE, notch=TRUE, outline=FALSE,
   xlab="Stimulus language",
   ylab="RT (in ms)", ylim=c(250, 1000)); grid()
qwe <- boxplot(main="RT in ms as a function of speaker group",
   RT ~ GROUP, notch=TRUE, outline=FALSE,
   xlab="Speaker group",
   ylab="RT (in ms)", ylim=c(250, 1000)); grid()
})
```

This does look much better!

## Modeling & numerical interpretation

Let's see whether/how the modeling process changes if we only look at the data points we decided to keep:

```{r}
summary(m_11 <- lm(      # make/summarize the linear model m_11:
   RT ~ 1 +              # RT ~ an overall intercept (1)
   # & these predictors & their pairwise interactions
   (ZIPFFREQ+LANGUAGE+GROUP+LENGTH)^2,
   data=d, subset=keep,   # those vars, but now in this subset
   na.action=na.exclude)) # skip cases with NA/missing data (good habit)
drop1(m_11,   # drop each droppable predictor at a time from m_11 &
   test="F")  # test its significance w/ an F-test
```

Let's fit the 2nd model with the ns 2-way interaction `ZIPFFREQ:LENGTH` deleted:

```{r}
summary(m_12 <- update( # make m_12 an update of
   m_11, .~.            # m_11, namely all of it (.~.), but then
   - ZIPFFREQ:LENGTH))  # remove this interaction
anova(m_11, m_12, # compare m_11 to m_12
   test="F")      # using an F-test
drop1(m_12,   # drop each droppable predictor at a time from m_12 &
   test="F")  # test its significance w/ an F-test
```

Let's fit the 3rd model with the ns interaction `ZIPFFREQ:LANGUAGE` deleted:

```{r}
summary(m_13 <- update(  # make m_13 an update of
   m_12, .~.             # m_12, namely all of it (.~.), but then
   - ZIPFFREQ:LANGUAGE)) # remove this interaction
anova(m_12, m_13, # compare m_12 to m_13
   test="F")      # using an F-test
drop1(m_13,   # drop each droppable predictor at a time from m_13 &
   test="F")  # test its significance w/ an F-test
```

Let's fit the 4th model with the ns interaction `LANGUAGE:LENGTH` deleted:

```{r}
summary(m_14 <- update( # make m_14 an update of
   m_13, .~.            # m_13, namely all of it (.~.), but then
   - LANGUAGE:LENGTH))  # remove this interaction
anova(m_13, m_14, # compare m_13 to m_14
   test="F")      # using an F-test
drop1(m_14,   # drop each droppable predictor at a time from m_14 &
   test="F")  # test its significance w/ an F-test
```

Let's fit the 5th model with the ns interaction `ZIPFFREQ:GROUP` deleted:

```{r}
summary(m_15 <- update( # make m_15 an update of
   m_14, .~.            # m_14, namely all of it (.~.), but then
   - ZIPFFREQ:GROUP))   # remove this interaction
anova(m_14, m_15, # compare m_14 to m_15
   test="F")      # using an F-test
drop1(m_15,   # drop each droppable predictor at a time from m_15 &
   test="F")  # test its significance w/ an F-test
```

Let's fit the 6th model with the ns interaction `GROUP:LENGTH` deleted:

```{r}
summary(m_16 <- update( # make m_16 an update of
   m_15, .~.            # m_15, namely all of it (.~.), but then
   - GROUP:LENGTH))     # remove this interaction
anova(m_15, m_16, # compare m_15 to m_16
   test="F")      # using an F-test
drop1(m_16,   # drop each droppable predictor at a time from m_16 &
   test="F")  # test its significance w/ an F-test
m_final_2 <- m_16; rm(m_16); invisible(gc())
```

We're done, so let's compute the confidence intervals and generate `nd_2`:

```{r}
Confint(m_final_2)
nd_2 <- expand.grid(
   ZIPFFREQ=c(0, 1, mean(d$ZIPFFREQ)),
   LENGTH  =c(0, 1, mean(d$LENGTH)),
   GROUP   =levels(d$GROUP),
   LANGUAGE=levels(d$LANGUAGE))
nd_2 <- cbind(nd_2,
   predict(m_final_2, newdata=nd_2, interval="confidence"))
names(nd_2)[5:7] <- c("PRED", "PRED_LOW", "PRED_UPP")
head(nd_2)
```

Again, the maybe (!) nicer way to represent this might be this:

```{r}
with(nd_2,  # use the data in nd_2 and
   tapply(  # apply
      PRED, # to the predicted reaction times
      # a grouping like this: the numerics in the rows & columns, the factors in the layers/slices
      list(round(ZIPFFREQ, 2), round(LENGTH, 2), "GROUP"=GROUP, "LANGUAGE"=LANGUAGE),
      # just list the values, but rounded to 2 decimals
      c)) %>% round(2)
```
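An alternative sketch for a similar overview: `emmeans` can return the four `LANGUAGE:GROUP` predictions at user-specified values of the numeric predictors (here their sample means, as in `nd_2`), together with their confidence intervals:

```{r}
# predicted LANGUAGE:GROUP means of m_final_2 at the sample means of ZIPFFREQ & LENGTH
emmeans::emmeans(object=m_final_2, ~ LANGUAGE:GROUP,
   at=list(ZIPFFREQ=mean(d$ZIPFFREQ), LENGTH=mean(d$LENGTH)))
```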
## Visual interpretation

### `ZIPFFREQ`

Let's visualize the nature of the effect of `ZIPFFREQ` based on the predictions from `effects`:

```{r}
zpf_d <- data.frame( # make zpf_d a data frame of
   zpf <- effect(    # the effect
   "ZIPFFREQ",       # of ZIPFFREQ
   m_final_2,        # in m_final_2
   xlevels=list(ZIPFFREQ=seq(3, 6.2, 0.2))))
plot(zpf,             # plot the effect of the main effect
   ylim=c(350, 1000), # w/ these y-axis limits
   grid=TRUE)         # & w/ a grid
```

And here a version with everything: observed values, predicted values, and the confidence intervals of the predictions; again, I use `with`:

```{r}
with(d[keep,], {
plot( # generate a plot
   xlab="Zipf frequency",              x=jitter(ZIPFFREQ), # ZIPFFREQ on x-axis
   ylab="RT in ms", ylim=c(350, 1000), y=jitter(RT),       # RT on y-axis
   pch=16, col="#00000030"); grid()  # w/ filled circles & a grid
lines(                # draw a line
   x=zpf_d$ZIPFFREQ,  # using these x-coordinates
   y=zpf_d$fit,       # & these y-coordinates
   col="green")
polygon(col="#00FF0020", border=NA, # use a semi-transparent green & no border
   x=c(    zpf_d$ZIPFFREQ,   # x-coordinates left to right, then
       rev(zpf_d$ZIPFFREQ)), # the same values back (in reverse order)
   y=c(    zpf_d$lower,      # y-coordinates lower boundary
       rev(zpf_d$upper)))    # y-coordinates upper boundary
# and lines for both variables' means
abline(h=mean(RT, na.rm=TRUE), col="purple"); abline(v=mean(ZIPFFREQ, na.rm=TRUE), col="purple")
})
```

### `LENGTH`

Let's visualize the nature of the effect of `LENGTH` based on the predictions from `effects`:

```{r}
le_d <- data.frame( # make le_d a data frame of
   le <- effect(    # the effect
   "LENGTH",        # of LENGTH
   m_final_2,       # in m_final_2
   xlevels=list(LENGTH=3:9)))
plot(le,              # plot the effect of the main effect
   ylim=c(350, 1000), # w/ these y-axis limits
   grid=TRUE)         # & w/ a grid
```

And here a version with everything: observed values, predicted values, and the confidence intervals of the predictions; again, I use `with`:

```{r}
with(d[keep,], {
plot( # generate a plot
   xlab="Length",                      x=jitter(LENGTH), # LENGTH on x-axis
   ylab="RT in ms", ylim=c(350, 1000), y=jitter(RT),     # RT on y-axis
   pch=16, col="#00000030"); grid()  # w/ filled circles & a grid
lines(             # draw a line
   x=le_d$LENGTH,  # using these x-coordinates
   y=le_d$fit,     # & these y-coordinates
   col="green")
polygon(col="#0000FF20", border=NA, # use a semi-transparent blue & no border
   x=c(    le_d$LENGTH,   # x-coordinates left to right, then
       rev(le_d$LENGTH)), # the same values back (in reverse order)
   y=c(    le_d$lower,    # y-coordinates lower boundary
       rev(le_d$upper)))  # y-coordinates upper boundary
# and lines for both variables' means
abline(h=mean(RT, na.rm=TRUE), col="purple"); abline(v=mean(LENGTH, na.rm=TRUE), col="purple")
})
```

### `LANGUAGE:GROUP`

Let's visualize the nature of the effect of `LANGUAGE:GROUP` based on the predictions from `effects`:

```{r}
lagr_d <- data.frame( # make lagr_d a data frame of
   lagr <- effect(    # the effect
   "LANGUAGE:GROUP",  # of LANGUAGE:GROUP
   m_final_2))        # in m_final_2
plot(lagr,            # plot the effect of the interaction
   ylim=c(350, 1000), # w/ these y-axis limits
   grid=TRUE)         # & w/ a grid
plot(lagr,            # plot the effect of the interaction
   ylim=c(350, 1000), # w/ these y-axis limits
   x.var="GROUP",     # w/ this predictor on the x-axis
   grid=TRUE)         # & w/ a grid
```

And here a version (of the second of the two effects plots) with everything: observed values, predicted values, and the confidence intervals of the predictions:

```{r}
#| fig-width: 10
#| fig-show: hold
# split data & predictions up by levels of LANGUAGE:
d_split <- split(d, d$LANGUAGE)
lagr_d_split <- split(lagr_d, lagr_d$LANGUAGE, drop=TRUE)
par(mfrow=c(1, 2))
stripchart(main="RT ~ GROUP for LANGUAGE='span'", # stripchart w/ main heading
   pch=16, col="#00000018",            # filled semi-transparent grey circles
   d_split$spa$RT ~ d_split$spa$GROUP, # data: RT ~ GROUP (for spa)
   xlab="Group", ylab="RT (in ms)",    # axis labels
   ylim=c(350, 1000), vertical=TRUE,   # make the chart have this vertical y-axis
   method="jitter"); grid()            # jitter observations, add grid
# add predicted means:
points(seq(lagr_d_split$spa$fit), lagr_d_split$spa$fit, pch="X", col="blue")
# add confidence intervals:
for (i in seq(lagr_d_split$spa$fit)) { # as many arrows as needed
   arrows(i, lagr_d_split$spa$lower[i], # each from here
          i, lagr_d_split$spa$upper[i], # to here
          code=3, angle=90, col="blue") # both tips w/ 90 degrees
}
stripchart(main="RT ~ GROUP for LANGUAGE='eng'", # stripchart w/ main heading
   pch=16, col="#00000018",            # filled semi-transparent grey circles
   d_split$eng$RT ~ d_split$eng$GROUP, # data: RT ~ GROUP (for eng)
   xlab="Group", ylab="RT (in ms)",    # axis labels
   ylim=c(350, 1000), vertical=TRUE,   # make the chart have this vertical y-axis
   method="jitter"); grid()            # jitter observations, add grid
# add predicted means:
points(seq(lagr_d_split$eng$fit), lagr_d_split$eng$fit, pch="X", col="red")
# add confidence intervals:
for (i in seq(lagr_d_split$eng$fit)) { # as many arrows as needed
   arrows(i, lagr_d_split$eng$lower[i], # each from here
          i, lagr_d_split$eng$upper[i], # to here
          code=3, angle=90, col="red")  # both tips w/ 90 degrees
}
par(mfrow=c(1, 1))
```
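To complement the plots with numbers, one could also compare the four predicted means pairwise; a minimal sketch using `emmeans` (again evaluated at the numeric predictors' sample means):

```{r}
# pairwise comparisons of the four LANGUAGE:GROUP means of m_final_2
pairs(emmeans::emmeans(object=m_final_2, ~ LANGUAGE:GROUP,
   at=list(ZIPFFREQ=mean(d$ZIPFFREQ), LENGTH=mean(d$LENGTH))))
```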
# Write-up

To determine whether the reaction time to a word (in ms) varies as a function of

* the Zipf frequency of that word (`ZIPFFREQ`);
* the length of that word (`LENGTH`);
* the language that word was presented in (`LANGUAGE`: *english* vs. *spanish*);
* the speaker group that word was presented to (`GROUP`: *english* vs. *heritage*);
* any pairwise interaction of these predictors, with the specific expectation that the effect of `LENGTH` will be stronger when `LANGUAGE` is *spanish*,

a linear model selection process was undertaken. In order to deal with the very long right tail of the response variable, the response was logged; a backwards stepwise model selection of all predictors and controls (based on significance testing) was then applied. The final model resulting from the elimination of ns predictors contained the effects of `ZIPFFREQ:LENGTH`, `LANGUAGE:GROUP`, and `LANGUAGE:LENGTH` and was highly significant (*F*=154.7, *df*~1~=7, *df*~2~=7744, *p*<0.0001), but it explained only a small proportion of the variability of the response (mult. *R*^2^=0.123, adj. *R*^2^=0.122). The summary table of the model is shown below.

| | Estimate | 95%-CI | *se* | *t* | *p*~2-tailed~ |
|:-----------------------------------------------------------------------|:-------|:-----------------|:------|:-------|:-------|
| Intercept (ZIPFFREQ=0, LENGTH=0, LANGUAGE=*spanish*, GROUP=*english*)   | 9.26   | [8.881, 9.638]   | 0.193 | 47.99  | <0.001 |
| ZIPFFREQ~0→1~                                                           | 0.007  | [-0.077, 0.092]  | 0.043 | 0.171  | 0.864  |
| LANGUAGE~*spanish*→*english*~                                           | -0.258 | [-0.366, -0.149] | 0.056 | -4.638 | <0.001 |
| GROUP~*english*→*heritage*~                                             | -0.103 | [-0.130, -0.076] | 0.014 | -7.436 | <0.001 |
| LENGTH~0→1~                                                             | 0.132  | [0.061, 0.202]   | 0.036 | 3.66   | <0.001 |
| ZIPFFREQ~0→1~ × LENGTH~0→1~                                             | -0.021 | [-0.037, -0.006] | 0.008 | -2.648 | 0.008  |
| LANGUAGE~*spanish*→*english*~ × GROUP~*english*→*heritage*~             | 0.205  | [0.168, 0.243]   | 0.019 | 10.668 | <0.001 |
| LANGUAGE~*spanish*→*english*~ × LENGTH~0→1~                             | -0.018 | [-0.039, 0.002]  | 0.01  | -1.754 | 0.079  |

The results indicate that

* as hypothesized, the slowing-down effect of `LENGTH` is stronger for Spanish than for English words [show plot];
* the interaction `LANGUAGE:GROUP` has the following effects [show at least one plot]:
    + reaction times are always higher when `LANGUAGE` is *spanish* than when `LANGUAGE` is *english*, but that difference is higher when `GROUP` is *english* than when `GROUP` is *heritage*;
    + when `LANGUAGE` is *english*, heritage speakers need more time than learners, but when `LANGUAGE` is *spanish*, heritage speakers need less time than learners;
    + all 4 means of `LANGUAGE:GROUP` are significantly different from each other (non-overlapping CIs);
* the interaction `LANGUAGE:LENGTH` has the hypothesized effect that the effect of `LENGTH` is stronger -- in fact, nearly 2.25 times as strong -- when `LANGUAGE` is *spanish* than when `LANGUAGE` is *english* (see the sketch after this list) [show at least one plot];
* the interaction of `ZIPFFREQ:LENGTH` is not strong: both variables behave as expected -- `LENGTH` slows down, `ZIPFFREQ` speeds up -- but
    + the effect of `LENGTH` is much stronger with rarer words than with frequent words;
    + the effect of `ZIPFFREQ` is much stronger with longer words than with shorter words.
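The following minimal sketch (not evaluated here) shows how the two `LENGTH` slopes behind the "nearly 2.25 times as strong" statement can be extracted from `m_final`; `emtrends` evaluates them with `ZIPFFREQ` held at its sample mean:

```{r}
#| eval: false
# LENGTH slopes per level of LANGUAGE (ZIPFFREQ held at its mean)
emmeans::emtrends(object=m_final, ~ LANGUAGE, var="LENGTH")
```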
A separate analysis was conducted on a version of the data that was reduced to 6977 data points (90% of its original size of 7752, after omitting 248 incomplete cases) by eliminating most of the long right tail (all reaction times >967.5 ms) and some of the left tail (all reaction times <364.5 ms).^[These boundary values were computed in such a way as to find the shortest interval of values that still covers at least 90% of the data.] In that model selection process,

* the interaction `LANGUAGE:GROUP` had the same effect as in the model above;
* the predictors `ZIPFFREQ` and `LENGTH` did not participate in interactions but were just main effects with the effect directions one might expect:
    + higher values of `ZIPFFREQ` speed speakers up;
    + higher values of `LENGTH` slow speakers down.

# Homework

To prepare for next session, read *SFLWR*^3^, Sections 5.3.1-5.3.3.

# Session info

```{r sessionInfo}
sessionInfo()
```