Ling 202: session 06: multinomial regression (key)

Author
Affiliations

UC Santa Barbara

JLU Giessen

Published

01 Apr 2025 12-34-56

1 Intro

Let’s load a data set for an example of a multinomial model; the data are in _input/dataltvoice.csv and you can find information about the variables/columns in _input/dataltvoice.r.

rm(list=ls(all.names=TRUE))
library(car); library(effects); library(nnet) # & now load small helper functions:
source("_helpers/R2s.r"); source("_helpers/summarize.multinom.r"); source("_helpers/cohens.kappa.r")
summary(d <- read.delim(   # summarize d, the result of loading
   file="_input/dataltvoice.csv", # this file
   stringsAsFactors=TRUE)) # change categorical variables into factors
      CASE        TRANSITIVITY    REC_LEN          PAT_LEN
 Min.   :   1.0   d_a:810      Min.   : 1.000   Min.   : 1.000
 1st Qu.: 474.8   d_p:349      1st Qu.: 1.000   1st Qu.: 2.000
 Median : 935.5   p_a:476      Median : 2.000   Median : 3.000
 Mean   : 939.4   p_p:141      Mean   : 3.128   Mean   : 4.443
 3rd Qu.:1412.2                3rd Qu.: 3.000   3rd Qu.: 6.000
 Max.   :1871.0                Max.   :44.000   Max.   :33.000
      REC_ANIM           PAT_SEM     REC_ACCESSIB PAT_ACCESSIB
 animate  :1106   abstract   :1265   given:993    given: 312
 inanimate: 670   concrete   : 335   new  :783    new  :1464
                  information: 176


                                                              

We want to know whether TRANSITIVITY is predictable from

  • REC_ANIM, REC_ACCESSIB, PAT_ACCESSIB;
  • the length difference of the recipient and the patient;
  • the interaction of the accessibility predictors;
  • the interaction of the REC_ANIM and the length difference.

2 Deviance & baseline(s)

Let’s already compute the baselines for what will be the response variable, TRANSITIVITY:

(baselines <- c(
   "baseline 1"=max(              # make baselines[1] the highest
      prop.table(                 # proportion in the
         table(d$TRANSITIVITY))), # frequency table of the response
   "baseline 2"=sum(                 # make baselines[2] the sum of the
      prop.table(                    # proportions in the
         table(d$TRANSITIVITY))^2))) # frequency table of the response squared
baseline 1 baseline 2
 0.4560811  0.3247625 

Let’s also compute the deviance of the null model:

deviance(m.00 <- multinom(TRANSITIVITY ~ 1, data=d, na.action=na.exclude))
# weights:  8 (3 variable)
initial  value 2462.058785
iter  10 value 2187.711475
iter  10 value 2187.711469
final  value 2187.711344
converged
[1] 4375.423

3 Exploration & preparation

Some exploration of the relevant variables and we prepare our length-difference predictor:

# the predictor(s)/response on its/their own
hist(d$REC_LEN)

hist(d$PAT_LEN)

hist(d$LEN_RECmPAT_LOG <- log2(d$REC_LEN)-log2(d$PAT_LEN), main="")

table(d$REC_ANIM)

  animate inanimate
     1106       670 
table(d$REC_ACCESSIB)

given   new
  993   783 
table(d$PAT_ACCESSIB)

given   new
  312  1464 
table(d$REC_ACCESSIB, d$PAT_ACCESSIB)

        given new
  given   205 788
  new     107 676
# the predictor(s) w/ the response
spineplot(d$TRANSITIVITY ~ d$LEN_RECmPAT_LOG)

table(d$REC_ANIM, d$TRANSITIVITY)

            d_a d_p p_a p_p
  animate   588 242 197  79
  inanimate 222 107 279  62
ftable(d$REC_ACCESSIB, d$PAT_ACCESSIB, d$TRANSITIVITY)
             d_a d_p p_a p_p

given given   98  48  39  20
      new    469 165 117  37
new   given   21  21  36  29
      new    222 115 284  55

4 Modeling & numerical interpretation

Let’s fit our initial regression model:

summary(m.01 <- multinom(        # make/summarize the multinomial model m.01:
   TRANSITIVITY ~ 1 +            # TRANSITIVITY ~ an overall intercept (1)
   REC_ACCESSIB + PAT_ACCESSIB + # the accessibility predictors & ...
   REC_ACCESSIB:PAT_ACCESSIB +   # their interaction
   REC_ANIM + LEN_RECmPAT_LOG +  # the rec animacy & length diff preds & ...
   REC_ANIM:LEN_RECmPAT_LOG,     # their interaction
   data=d,                # those vars are in d
   model=TRUE,            # & save the model frame (optional)
   na.action=na.exclude)) # skip cases with NA/missing data
# weights:  32 (21 variable)
initial  value 2462.058785
iter  10 value 1814.322650
iter  20 value 1767.454974
final  value 1762.521921
converged
Call:
multinom(formula = TRANSITIVITY ~ 1 + REC_ACCESSIB + PAT_ACCESSIB +
    REC_ACCESSIB:PAT_ACCESSIB + REC_ANIM + LEN_RECmPAT_LOG +
    REC_ANIM:LEN_RECmPAT_LOG, data = d, na.action = na.exclude,
    model = TRUE)

Coefficients:
    (Intercept) REC_ACCESSIBnew PAT_ACCESSIBnew REC_ANIMinanimate
d_p  -0.1210011       0.4906650      -0.2429275       -0.06653502
p_a  -0.3860966       0.8360726      -0.4249038        0.88620903
p_p  -0.8663412       1.3627552      -0.8513750        0.41282684
    LEN_RECmPAT_LOG REC_ACCESSIBnew:PAT_ACCESSIBnew
d_p       0.4666473                      -0.4347810
p_a       1.1615820                      -0.1403867
p_p       1.0006701                      -1.0594962
    REC_ANIMinanimate:LEN_RECmPAT_LOG
d_p                       -0.07724824
p_a                       -0.24365931
p_p                       -0.10326591

Std. Errors:
    (Intercept) REC_ACCESSIBnew PAT_ACCESSIBnew REC_ANIMinanimate
d_p   0.1989219       0.3658536       0.2039208         0.1787075
p_a   0.2291365       0.3830872       0.2485331         0.1642522
p_p   0.2751310       0.4128681       0.3179271         0.2245364
    LEN_RECmPAT_LOG REC_ACCESSIBnew:PAT_ACCESSIBnew
d_p      0.06357114                       0.3951689
p_a      0.08302891                       0.4162775
p_p      0.10013924                       0.4785921
    REC_ANIMinanimate:LEN_RECmPAT_LOG
d_p                         0.1063059
p_a                         0.1164769
p_p                         0.1458351

Residual Deviance: 3525.044
AIC: 3567.044 

Let’s also compute the z-scores for these coefficient values (however uninterpretable those are):

(m.01.zscores <- summary(m.01)$coefficients / summary(m.01)$standard.errors)
    (Intercept) REC_ACCESSIBnew PAT_ACCESSIBnew REC_ANIMinanimate
d_p  -0.6082847        1.341151       -1.191284        -0.3723124
p_a  -1.6850072        2.182460       -1.709647         5.3954176
p_p  -3.1488316        3.300703       -2.677894         1.8385745
    LEN_RECmPAT_LOG REC_ACCESSIBnew:PAT_ACCESSIBnew
d_p        7.340553                       -1.100241
p_a       13.990092                       -0.337243
p_p        9.992787                       -2.213777
    REC_ANIMinanimate:LEN_RECmPAT_LOG
d_p                        -0.7266601
p_a                        -2.0919113
p_p                        -0.7081005

And from that we compute the p-values:

(m.01.pvalues <- 2*pnorm(abs(m.01.zscores), lower.tail=FALSE))
    (Intercept) REC_ACCESSIBnew PAT_ACCESSIBnew REC_ANIMinanimate
d_p 0.542998691    0.1798713695     0.233542272      7.096603e-01
p_a 0.091987185    0.0290755883     0.087331199      6.836435e-08
p_p 0.001639247    0.0009644279     0.007408676      6.597780e-02
    LEN_RECmPAT_LOG REC_ACCESSIBnew:PAT_ACCESSIBnew
d_p    2.127136e-13                       0.2712271
p_a    1.791808e-44                       0.7359337
p_p    1.639074e-23                       0.0268441
    REC_ANIMinanimate:LEN_RECmPAT_LOG
d_p                        0.46743422
p_a                        0.03644645
p_p                        0.47888286

So there are definitely significant results. Since we can’t use drop1, let’s do anova two times, once for each interaction:

summary(m.02a <- update(         # make m.02a an update of
   m.01, .~.                     # m.01, namely all of it (.~.), but then
   - REC_ACCESSIB:PAT_ACCESSIB)) # remove this interaction
# weights:  28 (18 variable)
initial  value 2462.058785
iter  10 value 1801.290105
iter  20 value 1767.457695
final  value 1765.590290
converged
Call:
multinom(formula = TRANSITIVITY ~ REC_ACCESSIB + PAT_ACCESSIB +
    REC_ANIM + LEN_RECmPAT_LOG + REC_ANIM:LEN_RECmPAT_LOG, data = d,
    na.action = na.exclude, model = TRUE)

Coefficients:
    (Intercept) REC_ACCESSIBnew PAT_ACCESSIBnew REC_ANIMinanimate
d_p -0.03493706       0.1158438      -0.3509796       -0.07198044
p_a -0.40506728       0.7027753      -0.3967182        0.88305548
p_p -0.55675057       0.6115019      -1.3339509        0.40044051
    LEN_RECmPAT_LOG REC_ANIMinanimate:LEN_RECmPAT_LOG
d_p       0.4661081                       -0.08120209
p_a       1.1589835                       -0.24308442
p_p       1.0015156                       -0.11875462

Std. Errors:
    (Intercept) REC_ACCESSIBnew PAT_ACCESSIBnew REC_ANIMinanimate
d_p   0.1803328       0.1431314       0.1737402         0.1786000
p_a   0.1995348       0.1519139       0.1959591         0.1642183
p_p   0.2231066       0.2124143       0.2304589         0.2240036
    LEN_RECmPAT_LOG REC_ANIMinanimate:LEN_RECmPAT_LOG
d_p      0.06351107                         0.1061884
p_a      0.08292303                         0.1163086
p_p      0.10006980                         0.1455638

Residual Deviance: 3531.181
AIC: 3567.181 
anova(m.01, m.02a, # compare m.01 to m.02a
   test="Chisq")   # using a LRT
Likelihood ratio tests of Multinomial Models

Response: TRANSITIVITY
                                                                                                                Model
1                                 REC_ACCESSIB + PAT_ACCESSIB + REC_ANIM + LEN_RECmPAT_LOG + REC_ANIM:LEN_RECmPAT_LOG
2 1 + REC_ACCESSIB + PAT_ACCESSIB + REC_ACCESSIB:PAT_ACCESSIB + REC_ANIM + LEN_RECmPAT_LOG + REC_ANIM:LEN_RECmPAT_LOG
  Resid. df Resid. Dev   Test    Df LR stat.   Pr(Chi)
1      5310   3531.181
2      5307   3525.044 1 vs 2     3 6.136739 0.1051438
summary(m.02b <- update(        # make m.02b an update of
   m.01, .~.                    # m.01, namely all of it (.~.), but then
   - REC_ANIM:LEN_RECmPAT_LOG)) # remove this interaction
# weights:  28 (18 variable)
initial  value 2462.058785
iter  10 value 1790.360208
iter  20 value 1764.915297
final  value 1764.868000
converged
Call:
multinom(formula = TRANSITIVITY ~ REC_ACCESSIB + PAT_ACCESSIB +
    REC_ANIM + LEN_RECmPAT_LOG + REC_ACCESSIB:PAT_ACCESSIB, data = d,
    na.action = na.exclude, model = TRUE)

Coefficients:
    (Intercept) REC_ACCESSIBnew PAT_ACCESSIBnew REC_ANIMinanimate
d_p  -0.1614717       0.4998229      -0.2469363        0.04513013
p_a  -0.4024288       0.8670756      -0.4384312        0.95737277
p_p  -0.9283095       1.3712701      -0.8553391        0.56971191
    LEN_RECmPAT_LOG REC_ACCESSIBnew:PAT_ACCESSIBnew
d_p       0.4398474                      -0.4412384
p_a       1.0442234                      -0.1707976
p_p       0.9617214                      -1.0607971

Std. Errors:
    (Intercept) REC_ACCESSIBnew PAT_ACCESSIBnew REC_ANIMinanimate
d_p   0.1945742       0.3655852       0.2036241         0.1454223
p_a   0.2249132       0.3805211       0.2468435         0.1492921
p_p   0.2728389       0.4126677       0.3177536         0.2076899
    LEN_RECmPAT_LOG REC_ACCESSIBnew:PAT_ACCESSIBnew
d_p      0.05175417                       0.3949123
p_a      0.05886288                       0.4139378
p_p      0.07451965                       0.4781441

Residual Deviance: 3529.736
AIC: 3565.736 
anova(m.01, m.02b, # compare m.01 to m.02b
   test="Chisq")   # using a LRT
Likelihood ratio tests of Multinomial Models

Response: TRANSITIVITY
                                                                                                                Model
1                                REC_ACCESSIB + PAT_ACCESSIB + REC_ANIM + LEN_RECmPAT_LOG + REC_ACCESSIB:PAT_ACCESSIB
2 1 + REC_ACCESSIB + PAT_ACCESSIB + REC_ACCESSIB:PAT_ACCESSIB + REC_ANIM + LEN_RECmPAT_LOG + REC_ANIM:LEN_RECmPAT_LOG
  Resid. df Resid. Dev   Test    Df LR stat.   Pr(Chi)
1      5310   3529.736
2      5307   3525.044 1 vs 2     3 4.692158 0.1957774

Both interactions are ns so we first delete the one with the higher p-value, i.e. we delete REC_ANIM:LEN_RECmPAT_LOG and go with m.02b. But can we also delete the other interaction?

summary(m.03 <- update(          # make m.03 an update of
   m.02b, .~.                    # m.02b, namely all of it (.~.), but then
   - REC_ACCESSIB:PAT_ACCESSIB)) # remove this interaction
# weights:  24 (15 variable)
initial  value 2462.058785
iter  10 value 1794.993382
iter  20 value 1767.868111
final  value 1767.866967
converged
Call:
multinom(formula = TRANSITIVITY ~ REC_ACCESSIB + PAT_ACCESSIB +
    REC_ANIM + LEN_RECmPAT_LOG, data = d, na.action = na.exclude,
    model = TRUE)

Coefficients:
    (Intercept) REC_ACCESSIBnew PAT_ACCESSIBnew REC_ANIMinanimate
d_p -0.07582023       0.1192941      -0.3564488        0.04390158
p_a -0.41438351       0.7085056      -0.4224211        0.95956732
p_p -0.61758006       0.6157147      -1.3373812        0.55492462
    LEN_RECmPAT_LOG
d_p       0.4380379
p_a       1.0425965
p_p       0.9557104

Std. Errors:
    (Intercept) REC_ACCESSIBnew PAT_ACCESSIBnew REC_ANIMinanimate
d_p   0.1760039       0.1430298       0.1732770         0.1454120
p_a   0.1950505       0.1519703       0.1942310         0.1492648
p_p   0.2199506       0.2123822       0.2295769         0.2072032
    LEN_RECmPAT_LOG
d_p      0.05167523
p_a      0.05876898
p_p      0.07436727

Residual Deviance: 3535.734
AIC: 3565.734 
anova(m.02b, m.03, # compare  m.02b to m.03
   test="Chisq")   # using a LRT
Likelihood ratio tests of Multinomial Models

Response: TRANSITIVITY
                                                                                 Model
1                             REC_ACCESSIB + PAT_ACCESSIB + REC_ANIM + LEN_RECmPAT_LOG
2 REC_ACCESSIB + PAT_ACCESSIB + REC_ANIM + LEN_RECmPAT_LOG + REC_ACCESSIB:PAT_ACCESSIB
  Resid. df Resid. Dev   Test    Df LR stat.   Pr(Chi)
1      5313   3535.734
2      5310   3529.736 1 vs 2     3 5.997935 0.1117107

Yes, we can so we delete REC_ACCESSIB:PAT_ACCESSIB and go with m.03. But can we also delete main effects?

summary(m.04a <- update( # make m.04a an update of
   m.03, .~.             # m.03, namely all of it (.~.), but then
   - REC_ACCESSIB))      # remove this predictor
# weights:  20 (12 variable)
initial  value 2462.058785
iter  10 value 1805.888093
final  value 1780.292471
converged
Call:
multinom(formula = TRANSITIVITY ~ PAT_ACCESSIB + REC_ANIM + LEN_RECmPAT_LOG,
    data = d, na.action = na.exclude, model = TRUE)

Coefficients:
    (Intercept) PAT_ACCESSIBnew REC_ANIMinanimate LEN_RECmPAT_LOG
d_p -0.03385164      -0.3446204        0.04927733       0.4487349
p_a -0.13661249      -0.3158986        0.97063967       1.1111132
p_p -0.38134000      -1.2462871        0.56502687       1.0158562

Std. Errors:
    (Intercept) PAT_ACCESSIBnew REC_ANIMinanimate LEN_RECmPAT_LOG
d_p   0.1694037       0.1723732         0.1453316      0.05021125
p_a   0.1843444       0.1917693         0.1480772      0.05767235
p_p   0.2014785       0.2262448         0.2066145      0.07242679

Residual Deviance: 3560.585
AIC: 3584.585 
   anova(m.03, m.04a, test="Chisq") # compare models with a LRT
Likelihood ratio tests of Multinomial Models

Response: TRANSITIVITY
                                                     Model Resid. df Resid. Dev
1                PAT_ACCESSIB + REC_ANIM + LEN_RECmPAT_LOG      5316   3560.585
2 REC_ACCESSIB + PAT_ACCESSIB + REC_ANIM + LEN_RECmPAT_LOG      5313   3535.734
    Test    Df LR stat.      Pr(Chi)
1
2 1 vs 2     3 24.85101 1.658861e-05
summary(m.04b <- update( # make m.04b an update of
   m.03, .~.             # m.03, namely all of it (.~.), but then
   - PAT_ACCESSIB))      # remove this predictor
# weights:  20 (12 variable)
initial  value 2462.058785
iter  10 value 1798.965851
final  value 1784.182896
converged
Call:
multinom(formula = TRANSITIVITY ~ REC_ACCESSIB + REC_ANIM + LEN_RECmPAT_LOG,
    data = d, na.action = na.exclude, model = TRUE)

Coefficients:
    (Intercept) REC_ACCESSIBnew REC_ANIMinanimate LEN_RECmPAT_LOG
d_p  -0.3514652      0.09136321        0.02985044       0.4429333
p_a  -0.7382389      0.67129889        0.94339105       1.0495865
p_p  -1.4885262      0.46162804        0.40302393       0.9614303

Std. Errors:
    (Intercept) REC_ACCESSIBnew REC_ANIMinanimate LEN_RECmPAT_LOG
d_p   0.1155751       0.1420761         0.1449772      0.05162030
p_a   0.1308031       0.1505184         0.1481442      0.05877521
p_p   0.1714852       0.2072247         0.2029757      0.07349044

Residual Deviance: 3568.366
AIC: 3592.366 
   anova(m.03, m.04b, test="Chisq") # compare models with a LRT
Likelihood ratio tests of Multinomial Models

Response: TRANSITIVITY
                                                     Model Resid. df Resid. Dev
1                REC_ACCESSIB + REC_ANIM + LEN_RECmPAT_LOG      5316   3568.366
2 REC_ACCESSIB + PAT_ACCESSIB + REC_ANIM + LEN_RECmPAT_LOG      5313   3535.734
    Test    Df LR stat.      Pr(Chi)
1
2 1 vs 2     3 32.63186 3.851131e-07
summary(m.04c <- update( # make m.04c an update of
   m.03, .~.             # m.03, namely all of it (.~.), but then
   - REC_ANIM))          # remove this predictor
# weights:  20 (12 variable)
initial  value 2462.058785
iter  10 value 1817.346357
final  value 1791.948604
converged
Call:
multinom(formula = TRANSITIVITY ~ REC_ACCESSIB + PAT_ACCESSIB +
    LEN_RECmPAT_LOG, data = d, na.action = na.exclude, model = TRUE)

Coefficients:
    (Intercept) REC_ACCESSIBnew PAT_ACCESSIBnew LEN_RECmPAT_LOG
d_p -0.05752161       0.1223786      -0.3639212       0.4382956
p_a -0.11697191       0.7248194      -0.2874666       1.0718270
p_p -0.44904006       0.6259620      -1.2894656       0.9707461

Std. Errors:
    (Intercept) REC_ACCESSIBnew PAT_ACCESSIBnew LEN_RECmPAT_LOG
d_p   0.1722890       0.1429678       0.1728674      0.05139989
p_a   0.1884496       0.1499452       0.1922632      0.05805851
p_p   0.2116643       0.2121530       0.2267210      0.07405764

Residual Deviance: 3583.897
AIC: 3607.897 
   anova(m.03, m.04c, test="Chisq") # compare models with a LRT
Likelihood ratio tests of Multinomial Models

Response: TRANSITIVITY
                                                     Model Resid. df Resid. Dev
1            REC_ACCESSIB + PAT_ACCESSIB + LEN_RECmPAT_LOG      5316   3583.897
2 REC_ACCESSIB + PAT_ACCESSIB + REC_ANIM + LEN_RECmPAT_LOG      5313   3535.734
    Test    Df LR stat.      Pr(Chi)
1
2 1 vs 2     3 48.16327 1.965748e-10
summary(m.04d <- update( # make m.04d an update of
   m.03, .~.             # m.03, namely all of it (.~.), but then
   - LEN_RECmPAT_LOG))   # remove this predictor
# weights:  20 (12 variable)
initial  value 2462.058785
iter  10 value 2038.920162
final  value 2021.374467
converged
Call:
multinom(formula = TRANSITIVITY ~ REC_ACCESSIB + PAT_ACCESSIB +
    REC_ANIM, data = d, na.action = na.exclude, model = TRUE)

Coefficients:
    (Intercept) REC_ACCESSIBnew PAT_ACCESSIBnew REC_ANIMinanimate
d_p  -0.6785844       0.4221001      -0.4274774         0.1569561
p_a  -1.3836692       1.5125323      -0.4844716         1.2427424
p_p  -1.5400404       1.3393672      -1.4252069         0.7773680

Std. Errors:
    (Intercept) REC_ACCESSIBnew PAT_ACCESSIBnew REC_ANIMinanimate
d_p   0.1578792       0.1351589       0.1690794         0.1416616
p_a   0.1705027       0.1289408       0.1735308         0.1283868
p_p   0.2002157       0.1943860       0.2135040         0.1943153

Residual Deviance: 4042.749
AIC: 4066.749 
   anova(m.03, m.04d, test="Chisq") # compare models with a LRT
Likelihood ratio tests of Multinomial Models

Response: TRANSITIVITY
                                                     Model Resid. df Resid. Dev
1                   REC_ACCESSIB + PAT_ACCESSIB + REC_ANIM      5316   4042.749
2 REC_ACCESSIB + PAT_ACCESSIB + REC_ANIM + LEN_RECmPAT_LOG      5313   3535.734
    Test    Df LR stat. Pr(Chi)
1
2 1 vs 2     3  507.015       0

No, we can’t delete any more predictors, we’re done with model selection:

summary(m.final <- m.03); rm(m.04a, m.04b, m.04c, m.04d); invisible(gc())
Call:
multinom(formula = TRANSITIVITY ~ REC_ACCESSIB + PAT_ACCESSIB +
    REC_ANIM + LEN_RECmPAT_LOG, data = d, na.action = na.exclude,
    model = TRUE)

Coefficients:
    (Intercept) REC_ACCESSIBnew PAT_ACCESSIBnew REC_ANIMinanimate
d_p -0.07582023       0.1192941      -0.3564488        0.04390158
p_a -0.41438351       0.7085056      -0.4224211        0.95956732
p_p -0.61758006       0.6157147      -1.3373812        0.55492462
    LEN_RECmPAT_LOG
d_p       0.4380379
p_a       1.0425965
p_p       0.9557104

Std. Errors:
    (Intercept) REC_ACCESSIBnew PAT_ACCESSIBnew REC_ANIMinanimate
d_p   0.1760039       0.1430298       0.1732770         0.1454120
p_a   0.1950505       0.1519703       0.1942310         0.1492648
p_p   0.2199506       0.2123822       0.2295769         0.2072032
    LEN_RECmPAT_LOG
d_p      0.05167523
p_a      0.05876898
p_p      0.07436727

Residual Deviance: 3535.734
AIC: 3565.734 
(m.final.zscores <- summary(m.final)$coefficients / summary(m.final)$standard.errors)
    (Intercept) REC_ACCESSIBnew PAT_ACCESSIBnew REC_ANIMinanimate
d_p  -0.4307872        0.834051       -2.057104         0.3019117
p_a  -2.1244935        4.662131       -2.174839         6.4286223
p_p  -2.8078127        2.899088       -5.825417         2.6781657
    LEN_RECmPAT_LOG
d_p        8.476748
p_a       17.740594
p_p       12.851222
(m.final.pvalues <- 2*pnorm(abs(m.final.zscores), lower.tail=FALSE))
    (Intercept) REC_ACCESSIBnew PAT_ACCESSIBnew REC_ANIMinanimate
d_p 0.666623116    4.042523e-01    3.967623e-02      7.627194e-01
p_a 0.033628902    3.129516e-06    2.964221e-02      1.287656e-10
p_p 0.004987923    3.742502e-03    5.697018e-09      7.402659e-03
    LEN_RECmPAT_LOG
d_p    2.315757e-17
p_a    2.037701e-70
p_p    8.466167e-38

Let’s also compute the overall p-value for this model:

anova(m.00, m.final, # compare the null model to m.final
   test="Chisq")     # using a LRT
Likelihood ratio tests of Multinomial Models

Response: TRANSITIVITY
                                                     Model Resid. df Resid. Dev
1                                                        1      5325   4375.423
2 REC_ACCESSIB + PAT_ACCESSIB + REC_ANIM + LEN_RECmPAT_LOG      5313   3535.734
    Test    Df LR stat. Pr(Chi)
1
2 1 vs 2    12 839.6888       0
pchisq(              # compute the area under the chi-squared curve
   q=839.6888,       # for this chi-squared value
   df=12,            # at 12 df
   lower.tail=FALSE) # only using the right/upper tail/side
[1] 5.073936e-172

On to model quality: We make ‘predictions’ …

d <- cbind(d,       # add to d the columns resulting from
   predict(         # predicting
      m.final,      # from m.final
      type="prob")) # predicted probabilities
# give those columns better names
names(d)[10:13] <- paste0( # rename those columns to the result of pasting
   "PREDS.PP.",            # "PREDS.PP."
   names(d)[10:13])        # in front of the existing names
d <- cbind(d,         # add to d a column(d
   PREDS.CAT=predict( # called PREDS.CAT, the predictions
      m.final,        # from m.final, namely
      type="class"))  # categorical predictions
head(d) # check the result
  CASE TRANSITIVITY REC_LEN PAT_LEN  REC_ANIM     PAT_SEM REC_ACCESSIB
1    1          d_a       5       8   animate    abstract          new
2    2          d_a       2       3   animate    abstract          new
3    3          d_a       2       2 inanimate information        given
4    4          p_a       5       3 inanimate    abstract          new
5    5          d_a       2       3   animate    abstract          new
6    6          d_a       1       4   animate    concrete        given
  PAT_ACCESSIB LEN_RECmPAT_LOG PREDS.PP.d_a PREDS.PP.d_p PREDS.PP.p_a
1          new      -0.6780719    0.4729958    0.2570037    0.2051686
2          new      -0.5849625    0.4558345    0.2579896    0.2178814
3          new       0.0000000    0.3272929    0.2219580    0.3700426
4          new       0.7369656    0.1261111    0.1330745    0.6243988
5          new      -0.5849625    0.4558345    0.2579896    0.2178814
6        given      -2.0000000    0.6460504    0.2493809    0.0530535
  PREDS.PP.p_p PREDS.CAT
1   0.06483201       d_a
2   0.06829445       d_a
3   0.08070648       p_a
4   0.11641564       p_a
5   0.06829445       d_a
6   0.05151519       d_a

… and then we evaluate them:

(c.m <- table(             # confusion matrix: cross-tabulate
   "OBS"  =d$TRANSITIVITY, # observed constructions in the rows
   "PREDS"=d$PREDS.CAT))   # predicted constructions in the columns
     PREDS
OBS   d_a d_p p_a p_p
  d_a 736   0  74   0
  d_p 252   1  96   0
  p_a 132   1 343   0
  p_p  58   0  83   0
c(tapply(d$TRANSITIVITY==d$PREDS.CAT, d$TRANSITIVITY, mean), # accuracies for ...
  "overall"=mean(d$TRANSITIVITY==d$PREDS.CAT)) # ... each level & overall
       d_a        d_p        p_a        p_p    overall
0.90864198 0.00286533 0.72058824 0.00000000 0.60810811 
cohens.kappa(c.m)
Cohen's kappa            se         lower         upper  p (2-tailed)
 3.547219e-01  1.907360e-02  3.173383e-01  3.921055e-01  3.805102e-77 

Plus, we need the model’s R2-values and compute both McFadden’s R2 and the more widely-used version of Nagelkerke’s R2:

R2s(m.final)
# weights:  8 (3 variable)
initial  value 2462.058785
iter  10 value 2187.711475
iter  10 value 2187.711469
final  value 2187.711344
converged
  McFadden R-squared Nagelkerke R-squared
           0.1919103            0.4117980 

We can also add PREDS.PP.obs, like we did for binary logistic regression, but it’s a bit more complex here so I’ll just show how it’s done without more details:

d$PREDS.PP.obs <- predict(m.final, type="prob")[ # make this column the predictions
   matrix(c(                                # that are in the positions defined here
      seq(nrow(d)),                         # one prediction for each row
      as.numeric(d$TRANSITIVITY)), ncol=2)] # namely the column of the observed result

Why are we doing this? To see whether we can compute the deviance of the final model from its contributions to logloss – i.e., -log(d$PREDS.PP.obs)) – again. And we can, the logic is the same as before:

sum(-log(d$PREDS.PP.obs)) * 2
[1] 3535.734
deviance(m.final)
[1] 3535.734

Of course, if you paid attention to the book and worked through everything there, much of this you could have done with this:

summarize.multinom(m.final)
# weights:  8 (3 variable)
initial  value 2462.058785
iter  10 value 2187.711475
iter  10 value 2187.711469
final  value 2187.711344
converged
$Coefficients
                         Estimate  Std.Error       2.5%       97.5%          z
(Intercept):d_p       -0.07582023 0.17600392 -0.4207816  0.26914111 -0.4307872
(Intercept):p_a       -0.41438351 0.19505050 -0.7966755 -0.03209156 -2.1244935
(Intercept):p_p       -0.61758006 0.21995059 -1.0486753 -0.18648482 -2.8078127
REC_ACCESSIBnew:d_p    0.11929412 0.14302977 -0.1610391  0.39962731  0.8340510
REC_ACCESSIBnew:p_a    0.70850560 0.15197033  0.4106492  1.00636196  4.6621312
REC_ACCESSIBnew:p_p    0.61571469 0.21238222  0.1994532  1.03197619  2.8990877
PAT_ACCESSIBnew:d_p   -0.35644878 0.17327699 -0.6960654 -0.01683212 -2.0571039
PAT_ACCESSIBnew:p_a   -0.42242110 0.19423101 -0.8031069 -0.04173531 -2.1748386
PAT_ACCESSIBnew:p_p   -1.33738125 0.22957691 -1.7873437 -0.88741876 -5.8254170
REC_ANIMinanimate:d_p  0.04390158 0.14541201 -0.2411007  0.32890388  0.3019117
REC_ANIMinanimate:p_a  0.95956732 0.14926485  0.6670136  1.25212105  6.4286223
REC_ANIMinanimate:p_p  0.55492462 0.20720324  0.1488137  0.96103551  2.6781657
LEN_RECmPAT_LOG:d_p    0.43803793 0.05167523  0.3367563  0.53931953  8.4767481
LEN_RECmPAT_LOG:p_a    1.04259654 0.05876898  0.9274115  1.15778162 17.7405938
LEN_RECmPAT_LOG:p_p    0.95571035 0.07436727  0.8099532  1.10146753 12.8512225
                          p2tailed
(Intercept):d_p       6.666231e-01
(Intercept):p_a       3.362890e-02
(Intercept):p_p       4.987923e-03
REC_ACCESSIBnew:d_p   4.042523e-01
REC_ACCESSIBnew:p_a   3.129516e-06
REC_ACCESSIBnew:p_p   3.742502e-03
PAT_ACCESSIBnew:d_p   3.967623e-02
PAT_ACCESSIBnew:p_a   2.964221e-02
PAT_ACCESSIBnew:p_p   5.697018e-09
REC_ANIMinanimate:d_p 7.627194e-01
REC_ANIMinanimate:p_a 1.287656e-10
REC_ANIMinanimate:p_p 7.402659e-03
LEN_RECmPAT_LOG:d_p   2.315757e-17
LEN_RECmPAT_LOG:p_a   2.037701e-70
LEN_RECmPAT_LOG:p_p   8.466167e-38

$`Confusion matrix`
     PRED
OBS   d_a d_p p_a p_p
  d_a 736   0  74   0
  d_p 252   1  96   0
  p_a 132   1 343   0
  p_p  58   0  83   0

$`Classification metrics`
        Baseline     Acc. for d_a     Acc. for d_p     Acc. for p_a
    4.560811e-01     9.086420e-01     2.865330e-03     7.205882e-01
    Acc. for p_p Acc. for overall           pbinom          logloss
    0.000000e+00     6.081081e-01     5.990962e-38     9.954206e-01

$`Deviances & R2s`
Deviance of null model Deviance of this model            McFadden R2
          4375.4226877           3535.7339348              0.1919103
         Nagelkerke R2
             0.4117980

$`Model significance test`
         LR statistic                    Df               P-value
         8.396888e+02          1.200000e+01         5.074054e-172
Number of data points
         1.776000e+03 

5 Visual interpretation

Now let’s visualize the results of the predictors based on the predictions, one at a time. We begin with REC_ACCESSIB:

ra.d <- data.frame(   # make ra.d a data frame of
   ra <- effect(      # the effect
      "REC_ACCESSIB", # of REC_ACCESSIB
      m.final))       # in m.final
ra.d[,1:5] # show just the predicted probabilities
  REC_ACCESSIB  prob.d_a  prob.d_p  prob.p_a   prob.p_p
1        given 0.4995828 0.2663792 0.1736107 0.06042732
2          new 0.3951903 0.2374149 0.2789172 0.08847765
plot(ra,                        # plot the effect of REC_ACCESSIB
   type="probability",          # but plot predicted probabilities
   ylim=c(0, 1),                # w/ these y-axis limits
   grid=TRUE,                   # & w/ a grid
   lines=list(multiline=TRUE),  # make all lines be in one panel
   confint=list(style="auto"),  # make the lines come with confidence bands
   lattice=list(key.args=list(  # make the legend ...
      columns=4,                # ... have 4 columns &
      cex.title=0)))            # no title

Next, PAT_ACCESSIB:

pa.d <- data.frame(   # make pa.d a data frame of
   pa <- effect(      # the effect
      "PAT_ACCESSIB", # of PAT_ACCESSIB
      m.final))       # in m.final
pa.d[,1:5] # show just the predicted probabilities
  PAT_ACCESSIB  prob.d_a  prob.d_p  prob.p_a   prob.p_p
1        given 0.3442863 0.2595748 0.2316187 0.16452012
2          new 0.4774871 0.2520581 0.2105525 0.05990231
plot(pa,                        # plot the effect of PAT_ACCESSIB
   type="probability",          # but plot predicted probabilities
   ylim=c(0, 1),                # w/ these y-axis limits
   grid=TRUE,                   # & w/ a grid
   lines=list(multiline=TRUE),  # make all lines be in one panel
   confint=list(style="auto"),  # make the lines come with confidence bands
   lattice=list(key.args=list(  # make the legend ...
      columns=4,                # ... have 4 columns &
      cex.title=0)))            # no title

REC_ANIM is next:

rca.d <- data.frame( # make rca.d a data frame of
   rca <- effect(     # the effect
      "REC_ANIM",     # of REC_ANIM
      m.final))       # in m.final
rca.d[,1:5] # show just the predicted probabilities
   REC_ANIM  prob.d_a  prob.d_p  prob.p_a   prob.p_p
1   animate 0.4969793 0.2747134 0.1643431 0.06396429
2 inanimate 0.3752305 0.2167235 0.3239262 0.08411979
plot(rca,                       # plot the effect of REC_ANIM
   type="probability",          # but plot predicted probabilities
   ylim=c(0, 1),                # w/ these y-axis limits
   grid=TRUE,                   # & w/ a grid
   lines=list(multiline=TRUE),  # make all lines be in one panel
   confint=list(style="auto"),  # make the lines come with confidence bands
   lattice=list(key.args=list(  # make the legend ...
      columns=4,                # ... have 4 columns &
      cex.title=0)))            # no title

Finally, the numeric predictor LEN_RECmPAT_LOG:

ld.d <- data.frame(     # make ld.d a data frame of
   ld <- effect(         # the effect
      "LEN_RECmPAT_LOG", # of LEN_RECmPAT_LOG
      m.final))          # in m.final
ld.d[,1:5] # show just the predicted probabilities
  LEN_RECmPAT_LOG    prob.d_a   prob.d_p    prob.p_a    prob.p_p
1           -5.00 0.917200641 0.07599314 0.004572736 0.002233479
2           -3.00 0.796463073 0.15847130 0.031949591 0.013116036
3            0.02 0.335907630 0.25091231 0.314019905 0.099160160
4            3.00 0.033619304 0.09263998 0.702509057 0.171231663
5            5.00 0.004757516 0.03148219 0.799891945 0.163868352
plot(ld,                        # plot the effect of LEN_RECmPAT_LOG
   type="probability",          # but plot predicted probabilities
   ylim=c(0, 1),                # w/ these y-axis limits
   grid=TRUE,                   # & w/ a grid
   lines=list(multiline=TRUE),  # make all lines be in one panel
   confint=list(style="auto"),  # make the lines come with confidence bands
   lattice=list(key.args=list(  # make the legend ...
      columns=4,                # ... have 4 columns &
      cex.title=0)))            # no title

6 Write-up

To determine whether TRANSITIVITY varies as a function of

  • REC_ANIM, REC_ACCESSIB, PAT_ACCESSIB;
  • the length difference of the recipient and the patient;
  • the interaction of the accessibility predictors;
  • the interaction of the REC_ANIM and the length difference,

a multinomial model selection process was undertaken (using backwards stepwise model selection of all predictors and controls based on significance testing). The final model resulting from the elimination of ns predictors contained all four main effects and was highly significant (LR=839.6888, df=12, p<0.0001) with a decent amount of explanatory/predictive power (McFadden’s R2=0.192, Nagelkerke’s R2=0.412); the overall accuracy of the model is 60.8%, but the accuracies/recalls for each construction are very varied [insert those values]. The summary table for the model is shown below:

Estimate 95%-CI se z p2-tailed
Interceptd_a→d_p -0.08 [-0.42, 0.27] 0.18 -0.43 0.6666
Interceptd_a→p_a -0.41 [-0.80, -0.03] 0.20 -2.12 0.0336
Interceptd_a→p_p -0.62 [-1.05, -0.19] 0.22 -2.81 0.005
REC_ACCESSIBgiven→new+d_a→d_p 0.12 [-0.16, 0.40] 0.14 0.83 0.4043
REC_ACCESSIBgiven→new+d_a→p_a 0.71 [ 0.41, 1.01] 0.15 4.66 <0.0001
REC_ACCESSIBgiven→new+d_a→p_p 0.62 [ 0.20, 1.03] 0.21 2.90 0.0037
PAT_ACCESSIBgiven→new+d_a→d_p -0.36 [-0.70, -0.02] 0.17 -2.06 0.0397
PAT_ACCESSIBgiven→new+d_a→p_a -0.42 [-0.80, -0.04] 0.19 -2.17 0.0296
PAT_ACCESSIBgiven→new+d_a→p_p -1.34 [-1.79, -0.89] 0.23 -5.83 <0.0001
REC_ANIManim→inanim+d_a→d_p 0.04 [-0.24, 0.33] 0.15 0.30 0.7627
REC_ANIManim→inanim+d_a→p_a 0.96 [ 0.67, 1.25] 0.15 6.43 <0.0001
REC_ANIManim→inanim+d_a→p_p 0.55 [ 0.15, 0.96] 0.21 2.68 0.0074
LEN_RECmPAT_LOG0→1+d_a→d_p 0.44 [ 0.34, 0.54] 0.05 8.48 <0.0001
LEN_RECmPAT_LOG0→1+d_a→p_a 1.04 [ 0.93, 1.16] 0.06 17.74 <0.0001
LEN_RECmPAT_LOG0→1+d_a→p_p 0.96 [ 0.81, 1.10] 0.07 12.85 <0.0001

The way the main effects in the model are correlated with TRANSITIVITY are as follows [show plots]:

  • with regard to REC_ACCESSIB,
    • ditransitive actives become much more likely with given recipients;
    • prepositional dative actives become much more likely with new recipients;
    • the predicted probabilities of the passives do not differ much in response to REC_ACCESSIB;
  • with regard to PAT_ACCESSIB,
    • ditransitive actives become much more likely with new patients;
    • prepositional dative passives become much more likely with given patients;
    • the predicted probabilities of the other two constructions do not differ much in response to PAT_ACCESSIB;
  • with regard to REC_ANIM,
    • ditransitive actives become much more likely with animate recipients;
    • prepositional dative actives become much more likely with inanimate recipients;
    • ditransitive passives become somewhat more likely with animate recipients;
    • the predicted probabilities of the prepositional dative passives do not differ much in response to REC_ANIM;
  • with regard to the length difference predictor,
    • ditransitive actives become much more likely when the recipient is shorter then the patient;
    • prepositional dative actives become much more likely when the recipient is longer then the patient;
    • the predicted probabilities of the passives do not differ much in response to the length difference predictor.

7 Excursus 1: prototypes

Note that, just like last session, we can use the predicted probabilities to explore the prototypical uses, the best examples, of these constructions:

d.split <- split(d, d$TRANSITIVITY) # split up data by construction
# the prototype of the ditransitive active
head(unique(d.split$d_a[order(d.split$d_a$PREDS.PP.d_a, decreasing=TRUE),c(2, 5, 7:9, 10)]), 20)
     TRANSITIVITY  REC_ANIM REC_ACCESSIB PAT_ACCESSIB LEN_RECmPAT_LOG
963           d_a   animate        given          new       -5.044394
1665          d_a   animate        given          new       -4.857981
728           d_a   animate        given          new       -4.392317
217           d_a   animate        given          new       -4.321928
784           d_a   animate        given          new       -4.247928
54            d_a   animate        given          new       -4.169925
247           d_a   animate          new          new       -4.459432
623           d_a   animate        given          new       -4.087463
529           d_a   animate        given          new       -4.000000
663           d_a   animate        given          new       -3.906891
724           d_a   animate        given          new       -3.807355
1164          d_a inanimate        given          new       -4.000000
1278          d_a   animate        given          new       -3.700440
353           d_a   animate          new          new       -4.000000
399           d_a inanimate        given          new       -3.906891
231           d_a   animate        given          new       -3.584963
142           d_a   animate        given          new       -3.459432
899           d_a inanimate        given          new       -3.700440
199           d_a   animate        given          new       -3.321928
801           d_a   animate        given        given       -4.087463
     PREDS.PP.d_a
963     0.9305632
1665    0.9247412
728     0.9079811
217     0.9051423
784     0.9020637
54      0.8987107
247     0.8962209
623     0.8950415
529     0.8910051
663     0.8865381
724     0.8815612
1164    0.8768473
1278    0.8759735
353     0.8725102
399     0.8713634
231     0.8696441
142     0.8624003
899     0.8582752
199     0.8540085
801     0.8511656
# the prototype of the prepositional dative active
head(unique(d.split$p_a[order(d.split$p_a$PREDS.PP.p_a, decreasing=TRUE),c(2, 5, 7:9, 12)]), 20)
     TRANSITIVITY  REC_ANIM REC_ACCESSIB PAT_ACCESSIB LEN_RECmPAT_LOG
1627          p_a inanimate          new          new        4.247928
999           p_a inanimate          new          new        4.087463
1447          p_a inanimate          new          new        3.954196
239           p_a inanimate          new          new        3.906891
188           p_a inanimate          new          new        3.807355
1538          p_a inanimate          new          new        3.584963
380           p_a inanimate          new          new        3.459432
1437          p_a inanimate          new          new        3.321928
163           p_a inanimate          new          new        3.169925
205           p_a inanimate        given          new        4.087463
170           p_a inanimate          new          new        3.115477
727           p_a inanimate          new          new        3.087463
150           p_a inanimate          new          new        3.000000
121           p_a inanimate        given          new        3.857981
109           p_a inanimate          new          new        2.807355
1646          p_a inanimate        given          new        3.700440
143           p_a inanimate          new          new        2.700440
47            p_a inanimate          new          new        2.584963
186           p_a inanimate          new          new        2.459432
951           p_a inanimate        given          new        3.321928
     PREDS.PP.p_a
1627    0.8560652
999     0.8520718
1447    0.8485311
239     0.8472217
188     0.8443709
1538    0.8374944
380     0.8332723
1437    0.8283350
163     0.8224616
205     0.8223316
170     0.8202437
727     0.8190779
150     0.8153269
121     0.8129771
109     0.8064299
1646    0.8059040
143     0.8010844
47      0.7949559
186     0.7878461
951     0.7863926

Nice:

  • the prototypical ditransitive active involves
    • a short, given, and animate recipient;
    • a long new patient;
    • examples such as He told him a very long and convoluted story;
  • the prototypical prepositional dative active involves
    • a long, new, and inanimate recipient;
    • a short new patient;
    • examples such as She gave some serious thought to the situation that had presented itself to her.

The above solution used the predicted probabilities for the attested cases, the data in d. And in this particular case, this approach is probably sufficient because we don’t have that many predictors and most have them have only few levels so our data cover what’s possible relatively well. But remember from last session that we can also use all possible combinations of predictors’ levels, values, or value ranges. For that, but also for other reasons, we should really always create a data frame like nd here, which contains all possible combinations of things:

summary(nd <- expand.grid(
   REC_ACCESSIB=levels(d$REC_ACCESSIB),
   PAT_ACCESSIB=levels(d$PAT_ACCESSIB),
   REC_ANIM=levels(d$REC_ANIM),
   LEN_RECmPAT_LOG=seq(
      min(d$LEN_RECmPAT_LOG),
      max(d$LEN_RECmPAT_LOG),
      length.out=10)))
 REC_ACCESSIB PAT_ACCESSIB      REC_ANIM  LEN_RECmPAT_LOG
 given:40     given:40     animate  :40   Min.   :-5.04439
 new  :40     new  :40     inanimate:40   1st Qu.:-2.79287
                                          Median : 0.02153
                                          Mean   : 0.02153
                                          3rd Qu.: 2.83594
                                          Max.   : 5.08746  

And then we would add to it the predicted probabilities of all levels of the response variable:

nd <- cbind(nd, predict(m.final, newdata=nd, type="probs"))

That data frame can then be sorted by these 4 new columns of predicted probabilities to see what combinations come out on top for each construction. Here’s the result of this for the same two constructions as above:

head(nd[order(nd$d_a, decreasing=TRUE), c(1:4, 5)], 10)
   REC_ACCESSIB PAT_ACCESSIB  REC_ANIM LEN_RECmPAT_LOG       d_a
3         given          new   animate       -5.044394 0.9305632
7         given          new inanimate       -5.044394 0.9239690
4           new          new   animate       -5.044394 0.9200185
8           new          new inanimate       -5.044394 0.9095308
1         given        given   animate       -5.044394 0.9012998
5         given        given inanimate       -5.044394 0.8906063
11        given          new   animate       -3.918632 0.8871113
2           new        given   animate       -5.044394 0.8852398
15        given          new inanimate       -3.918632 0.8720688
6           new        given inanimate       -5.044394 0.8680667
head(nd[order(nd$p_a, decreasing=TRUE), c(1:4, 7)], 10)
   REC_ACCESSIB PAT_ACCESSIB  REC_ANIM LEN_RECmPAT_LOG       p_a
80          new          new inanimate        5.087463 0.8731474
79        given          new inanimate        5.087463 0.8530508
72          new          new inanimate        3.961701 0.8487362
71        given          new inanimate        3.961701 0.8173373
76          new          new   animate        5.087463 0.8093854
64          new          new inanimate        2.835939 0.8078080
75        given          new   animate        5.087463 0.7747445
68          new          new   animate        3.961701 0.7661525
63        given          new inanimate        2.835939 0.7550298
78          new        given inanimate        5.087463 0.7491943

Here, the results don’t change all that much, but there are scenarios where this can make a bigger difference.

8 Excursus 2: predictions that are most wrong

It can again be interesting to explore the case where the model does worst, i.e. where the contributions to logloss are highest:

lapply(                         # apply to
   d.split,                     # the list w/ the 4 data frames
   \(af)                        # an anonymous function (SFLWR 3.5.4)
      head(af[                  # that takes the head of each data frame
      order(                    # when the data frame is sorted
         -log(af$PREDS.PP.obs), # by the contributions to logloss
         decreasing=TRUE),]))   # in decreasing order
$d_a
     CASE TRANSITIVITY REC_LEN PAT_LEN  REC_ANIM  PAT_SEM REC_ACCESSIB
558   592          d_a       4       1 inanimate abstract          new
540   572          d_a       5       2 inanimate abstract          new
148   158          d_a      13       2   animate abstract          new
288   303          d_a       7       2 inanimate abstract          new
780   825          d_a       6       2 inanimate abstract          new
1063 1123          d_a      14       5   animate concrete          new
     PAT_ACCESSIB LEN_RECmPAT_LOG PREDS.PP.d_a PREDS.PP.d_p PREDS.PP.p_a
558           new        2.000000   0.04098797   0.07520977    0.7572852
540         given        1.321928   0.04348256   0.08467242    0.6044306
148           new        2.700440   0.04642982   0.11081498    0.6820561
288           new        1.807355   0.04909686   0.08279865    0.7420416
780           new        1.584963   0.06027015   0.09220711    0.7224019
1063        given        1.485427   0.07438122   0.14891127    0.4696682
     PREDS.PP.p_p PREDS.CAT PREDS.PP.obs
558     0.1265171       p_a   0.04098797
540     0.2674144       p_a   0.04348256
148     0.1606991       p_a   0.04642982
288     0.1260629       p_a   0.04909686
780     0.1251208       p_a   0.06027015
1063    0.3070393       p_a   0.07438122

$d_p
     CASE TRANSITIVITY REC_LEN PAT_LEN  REC_ANIM  PAT_SEM REC_ACCESSIB
1105 1167          d_p      17       1 inanimate abstract          new
76     82          d_p      16       1 inanimate abstract          new
1012 1064          d_p      10       2 inanimate abstract          new
1235 1311          d_p       5       1 inanimate abstract          new
606   642          d_p       6       1 inanimate abstract          new
94    100          d_p       5       1 inanimate abstract          new
     PAT_ACCESSIB LEN_RECmPAT_LOG PREDS.PP.d_a PREDS.PP.d_p PREDS.PP.p_a
1105          new        4.087463  0.005232179   0.02395599    0.8520718
76            new        4.000000  0.005716255   0.02518862    0.8497722
1012        given        2.321928  0.016823555   0.05076708    0.6633506
1235        given        2.321928  0.016823555   0.05076708    0.6633506
606           new        2.584963  0.023381506   0.05543348    0.7949559
94            new        2.321928  0.030160357   0.06372313    0.7794830
     PREDS.PP.p_p PREDS.CAT PREDS.PP.obs
1105    0.1187400       p_a   0.02395599
76      0.1193229       p_a   0.02518862
1012    0.2690588       p_a   0.05076708
1235    0.2690588       p_a   0.05076708
606     0.1262291       p_a   0.05543348
94      0.1266335       p_a   0.06372313

$p_a
     CASE TRANSITIVITY REC_LEN PAT_LEN  REC_ANIM     PAT_SEM REC_ACCESSIB
795   841          p_a       1      18 inanimate information          new
730   773          p_a       1      10 inanimate    concrete        given
1435 1517          p_a       2      17 inanimate    abstract        given
1413 1494          p_a       1       4   animate    concrete        given
32     34          p_a       1      10 inanimate    concrete          new
668   711          p_a       1       3   animate    abstract        given
     PAT_ACCESSIB LEN_RECmPAT_LOG PREDS.PP.d_a PREDS.PP.d_p PREDS.PP.p_a
795           new       -4.169925    0.8611907    0.1059172   0.02558573
730           new       -3.321928    0.8305757    0.1314493   0.02941342
1435          new       -3.087463    0.8107024    0.1421820   0.03665998
1413          new       -2.000000    0.7434795    0.2009380   0.04001856
32            new       -3.321928    0.7878249    0.1404810   0.05666255
668           new       -1.584963    0.6952908    0.2253799   0.05768800
     PREDS.PP.p_p PREDS.CAT PREDS.PP.obs
795   0.007306420       d_a   0.02558573
730   0.008561551       d_a   0.02941342
1435  0.010455667       d_a   0.03665998
1413  0.015563978       d_a   0.04001856
32    0.015031572       d_a   0.05666255
668   0.021641305       d_a   0.05768800

$p_p
     CASE TRANSITIVITY REC_LEN PAT_LEN REC_ANIM     PAT_SEM REC_ACCESSIB
93     99          p_p       1       7  animate    concrete        given
886   933          p_p       1       6  animate    concrete        given
1106 1168          p_p       2      12  animate    concrete        given
990  1042          p_p       2      10  animate    abstract        given
452   482          p_p       1       4  animate    concrete        given
889   936          p_p       3      10  animate information        given
     PAT_ACCESSIB LEN_RECmPAT_LOG PREDS.PP.d_a PREDS.PP.d_p PREDS.PP.p_a
93            new       -2.807355    0.8179065    0.1552057   0.01897277
886           new       -2.584963    0.7997475    0.1672878   0.02339257
1106          new       -2.584963    0.7997475    0.1672878   0.02339257
990           new       -2.321928    0.7760426    0.1821528   0.02986146
452           new       -2.000000    0.7434795    0.2009380   0.04001856
889           new       -1.736966    0.7137766    0.2164683   0.05054233
     PREDS.PP.p_p PREDS.CAT PREDS.PP.obs
93    0.007915070       d_a  0.007915070
886   0.009572164       d_a  0.009572164
1106  0.009572164       d_a  0.009572164
990   0.011943120       d_a  0.011943120
452   0.015563978       d_a  0.015563978
889   0.019212724       d_a  0.019212724

In a real study, one could now check the concrete examples and their contexts for what might explain the unexpected speaker choices.

9 Homework

To prepare for next session, read SFLWR3, Section 5.4.2 on ordinal regression.

10 Session info

sessionInfo()
R version 4.5.0 (2025-04-11)
Platform: x86_64-pc-linux-gnu
Running under: Pop!_OS 22.04 LTS

Matrix products: default
BLAS:   /usr/lib/x86_64-linux-gnu/openblas-pthread/libblas.so.3
LAPACK: /usr/lib/x86_64-linux-gnu/openblas-pthread/libopenblasp-r0.3.20.so;  LAPACK version 3.10.0

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C
 [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8
 [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8
 [7] LC_PAPER=en_US.UTF-8       LC_NAME=C
 [9] LC_ADDRESS=C               LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C

time zone: America/Los_Angeles
tzcode source: system (glibc)

attached base packages:
[1] stats     graphics  grDevices utils     datasets  compiler  methods
[8] base

other attached packages:
[1] nnet_7.3-20    effects_4.2-2  car_3.1-3      carData_3.0-5  STGmisc_1.02
[6] Rcpp_1.0.14    magrittr_2.0.3

loaded via a namespace (and not attached):
 [1] Matrix_1.7-3      jsonlite_2.0.0    survey_4.4-2      splines_4.5.0
 [5] boot_1.3-31       yaml_2.3.10       fastmap_1.2.0     lattice_0.22-7
 [9] Formula_1.2-5     knitr_1.50        rbibutils_2.3     htmlwidgets_1.6.4
[13] MASS_7.3-65       nloptr_2.2.1      insight_1.2.0     minqa_1.2.8
[17] DBI_1.2.3         rlang_1.1.6       xfun_0.52         cli_3.6.5
[21] Rdpack_2.6.4      digest_0.6.37     grid_4.5.0        rstudioapi_0.17.1
[25] lme4_1.1-37       nlme_3.1-168      reformulas_0.4.0  evaluate_1.0.3
[29] mitools_2.4       survival_3.8-3    abind_1.4-8       colorspace_2.1-1
[33] rmarkdown_2.29    tools_4.5.0       htmltools_0.5.8.1