1 Session 07: Ordinal regression

Let’s load a data set for an example of an ordinal regression model; importantly, we make sure the response variable is an ordered factor:

rm(list=ls(all.names=TRUE))
library(car); library(effects); library(MASS)
source("105_04-05_R2.r") # load small helper function
str(x <- read.delim(   # summarize x, the result of loading
   file="105_07_RATG.csv", # this file
   stringsAsFactors=TRUE)) # change categorical variables into factors
## 'data.frame':    4300 obs. of  8 variables:
##  $ CASE      : int  1 2 3 4 5 6 7 8 9 10 ...
##  $ JUDGMENT  : int  3 -3 3 3 -3 2 -2 2 -3 3 ...
##  $ AGREEMENT : Factor w/ 2 levels "no","yes": 1 1 1 1 1 1 1 1 2 2 ...
##  $ NEGATION  : Factor w/ 2 levels "affirmative",..: 1 1 1 1 1 1 1 1 1 1 ...
##  $ EMPHPRON  : Factor w/ 2 levels "no","yes": 2 2 2 2 2 2 2 2 2 2 ...
##  $ SUBJORDER : Factor w/ 2 levels "gramsem","semgram": 1 1 1 1 1 1 1 1 2 2 ...
##  $ NUMBERGRAM: Factor w/ 2 levels "plural","singular": 1 1 1 1 1 1 1 1 1 1 ...
##  $ PERSONGRAM: Factor w/ 3 levels "first","second",..: 3 3 3 3 3 3 3 3 3 3 ...
x$JUDGMENT <- factor(x$JUDGMENT, ordered=TRUE)
summary(x)
##       CASE      JUDGMENT  AGREEMENT         NEGATION    EMPHPRON  
##  Min.   :   1   -3: 620   no :2077   affirmative:2139   no :2133  
##  1st Qu.:1137   -2: 480   yes:2223   negative   :2161   yes:2167  
##  Median :2262   -1: 387                                           
##  Mean   :2266   0 : 329                                           
##  3rd Qu.:3394   1 : 355                                           
##  Max.   :4528   2 : 826                                           
##                 3 :1303                                           
##    SUBJORDER       NUMBERGRAM    PERSONGRAM  
##  gramsem:2173   plural  :1989   first :1381  
##  semgram:2127   singular:2311   second:1561  
##                                 third :1358  
##                                              
##                                              
##                                              
## 

Here’s what each of the columns contains:

  • the column CASE is a simple case/row number from 1 to n (the sample size) just so that each case has a unique ID;
  • the column JUDGMENT is the response variable: it’s an ordinal rating variable with 7 levels centered around 0 (neutral);
  • the column AGREEMENT is a binary predictor: it’s whether the experimental stimulus (a sentence) exhibited proper agreement between the subject and the verb or not: no (e.g., He do the dishes) vs. yes (e.g., He does the dishes);
  • the column NEGATION is a binary predictor: it’s whether the experimental stimulus was affirmative (e.g., He does the dishes) vs. negative (e.g., He doesn’t do the dishes);
  • the column EMPHPRON is a binary predictor: it’s whether the experimental stimulus contained an emphatic pronoun: no (e.g., He did the dishes) vs. yes (e.g., He, HE, did the dishes);
  • the column SUBJORDER is a binary predictor: it’s whether, in the experimental stimulus, the grammatical subject preceded the experiencer subject or not: gramsem (e.g., Birds please me) vs. semgram (e.g., I like birds) (these are not great examples; the study was actually done on Spanish);
  • the column NUMBERGRAM is a binary predictor: it’s whether the grammatical subject of the experimental stimulus was plural (e.g., Birds please me) or singular (e.g., This pleases me);
  • the column PERSONGRAM is a ternary predictor: it’s whether the grammatical subject of the experimental stimulus was first person (e.g., I like birds) vs. second person (e.g., You like birds) vs. third person (e.g., He likes birds).

We want to determine whether the rating of a Spanish construction involving the semantic role of an experiencer is predictable from

  • AGREEMENT, NEGATION, EMPHPRON, SUBJORDER, NUMBERGRAM, and PERSONGRAM;
  • any interaction of a predictor with AGREEMENT.
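
In R’s formula syntax, the shorthand AGREEMENT * (NEGATION + EMPHPRON + SUBJORDER + NUMBERGRAM + PERSONGRAM) used below expands into all six main effects plus the five two-way interactions of AGREEMENT with the other predictors; as a small sketch (purely symbolic, so no data are needed), the resulting model terms can be listed like this:

attr(                            # extract an attribute, namely
   terms(JUDGMENT ~ AGREEMENT *  # the terms of AGREEMENT crossed w/
      (NEGATION+EMPHPRON+SUBJORDER+NUMBERGRAM+PERSONGRAM)),
   "term.labels")                # the labels of the resulting model terms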

1.1 Deviance & baseline(s)

Let’s already compute a multinomial kind of baseline for what will be the response variable, JUDGMENT; note, however, that this baseline doesn’t take the variable’s ordinal nature into consideration:

(baseline <- max(          # make baseline the highest
   prop.table(             # proportion in the
      table(x$JUDGMENT)))) # frequency table of the response variable
## [1] 0.3030233
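
This baseline simply corresponds to always predicting the most frequent judgment level (here the rating 3, with 1303 of the 4300 cases); as a quick sketch, that level can be identified like this:

names(which.max(       # return the name of the level with
   table(x$JUDGMENT))) # the highest frequency in the response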

Let’s compute the deviance of the null model:

deviance(m.00 <- polr(JUDGMENT ~ 1, Hess=TRUE, data=x, na.action=na.exclude))
## [1] 15669.11
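
As a reminder, this deviance is just -2 times the model’s log-likelihood; a minimal sketch to confirm:

-2 * as.numeric( # deviance = -2 times
   logLik(m.00)) # the log-likelihood of the null model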

1.2 Exploration & preparation

Some exploration of the relevant variables:

# the predictor(s)/response on its/their own
# just check the summary for the main effects
table(x$NEGATION, x$AGREEMENT)
##              
##                 no  yes
##   affirmative 1049 1090
##   negative    1028 1133
table(x$EMPHPRON, x$AGREEMENT)
##      
##         no  yes
##   no  1038 1095
##   yes 1039 1128
table(x$SUBJORDER, x$AGREEMENT)
##          
##             no  yes
##   gramsem 1047 1126
##   semgram 1030 1097
table(x$NUMBERGRAM, x$AGREEMENT)
##           
##              no  yes
##   plural    926 1063
##   singular 1151 1160
table(x$PERSONGRAM, x$AGREEMENT)
##         
##           no yes
##   first  657 724
##   second 809 752
##   third  611 747
# the predictor(s) w/ the response
table(x$AGREEMENT, x$JUDGMENT)
##      
##        -3  -2  -1   0   1   2   3
##   no  463 304 225 158 190 344 393
##   yes 157 176 162 171 165 482 910
ftable(x$NEGATION, x$AGREEMENT, x$JUDGMENT)
##                   -3  -2  -1   0   1   2   3
##                                             
## affirmative no   231 146 114  86  96 176 200
##             yes   80  94  75  78  83 224 456
## negative    no   232 158 111  72  94 168 193
##             yes   77  82  87  93  82 258 454
ftable(x$EMPHPRON, x$AGREEMENT, x$JUDGMENT)
##           -3  -2  -1   0   1   2   3
##                                     
## no  no   240 149 113  78  99 171 188
##     yes   74  82  66  91  75 242 465
## yes no   223 155 112  80  91 173 205
##     yes   83  94  96  80  90 240 445
ftable(x$SUBJORDER, x$AGREEMENT, x$JUDGMENT)
##               -3  -2  -1   0   1   2   3
##                                         
## gramsem no   253 159 124  87  97 161 166
##         yes   98 107 104  98  99 254 366
## semgram no   210 145 101  71  93 183 227
##         yes   59  69  58  73  66 228 544
ftable(x$NUMBERGRAM, x$AGREEMENT, x$JUDGMENT)
##                -3  -2  -1   0   1   2   3
##                                          
## plural   no   162 113  89  70  97 184 211
##          yes   64  84  75  88  80 243 429
## singular no   301 191 136  88  93 160 182
##          yes   93  92  87  83  85 239 481
ftable(x$PERSONGRAM, x$AGREEMENT, x$JUDGMENT)
##              -3  -2  -1   0   1   2   3
##                                        
## first  no   173 112  85  50  51  81 105
##        yes   64  64  61  52  50 141 292
## second no   190 122  86  65  71 139 136
##        yes   41  51  47  62  62 173 316
## third  no   100  70  54  43  68 124 152
##        yes   52  61  54  57  53 168 302
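
The raw counts already suggest that stimuli with proper agreement attract considerably higher ratings; to see this more clearly, one might also look at proportions within each level of AGREEMENT, as in this small sketch:

round(                    # round to 3 decimals
   prop.table(            # the proportion table of
      table(x$AGREEMENT,  # AGREEMENT (rows) by
            x$JUDGMENT),  # JUDGMENT (columns)
      margin=1), 3)       # w/ proportions computed within rows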

1.3 Modeling & numerical interpretation

Let’s fit our initial regression model:

summary(m.01 <- polr(     # summarize m.01, which models
   JUDGMENT ~ 1 +         # JUDGMENT ~ an overall intercept (1)
   # AGREEMENT interacting w/ all other predictors
   AGREEMENT * (NEGATION+EMPHPRON+SUBJORDER+NUMBERGRAM+PERSONGRAM),
   Hess=TRUE,             # recommended for using summary
   data=x,                # those variables are in x
   na.action=na.exclude)) # skip cases with NA/missing data
## Call:
## polr(formula = JUDGMENT ~ 1 + AGREEMENT * (NEGATION + EMPHPRON + 
##     SUBJORDER + NUMBERGRAM + PERSONGRAM), data = x, na.action = na.exclude, 
##     Hess = TRUE)
## 
## Coefficients:
##                                    Value Std. Error t value
## AGREEMENTyes                     1.00762    0.23047  4.3720
## NEGATIONnegative                -0.07963    0.07825 -1.0177
## EMPHPRONyes                      0.01820    0.07869  0.2313
## SUBJORDERsemgram                 0.26155    0.07833  3.3390
## NUMBERGRAMsingular              -0.28824    0.12844 -2.2441
## PERSONGRAMsecond                 0.12173    0.10676  1.1402
## PERSONGRAMthird                  0.41046    0.16241  2.5273
## AGREEMENTyes:NEGATIONnegative    0.06444    0.10980  0.5869
## AGREEMENTyes:EMPHPRONyes        -0.14410    0.11020 -1.3077
## AGREEMENTyes:SUBJORDERsemgram    0.39130    0.11081  3.5311
## AGREEMENTyes:NUMBERGRAMsingular  0.24572    0.18523  1.3266
## AGREEMENTyes:PERSONGRAMsecond   -0.05795    0.15453 -0.3750
## AGREEMENTyes:PERSONGRAMthird    -0.36910    0.23059 -1.6007
## 
## Intercepts:
##       Value   Std. Error t value
## -3|-2 -1.2230  0.1653    -7.4002
## -2|-1 -0.4605  0.1635    -2.8162
## -1|0   0.0101  0.1632     0.0617
## 0|1    0.3675  0.1634     2.2493
## 1|2    0.7353  0.1637     4.4917
## 2|3    1.6316  0.1652     9.8743
## 
## Residual Deviance: 15099.88 
## AIC: 15137.88

Let’s also compute the t-values for these coefficient values (however uninterpretable those are):

(m.01.tscores <- summary(m.01)$coefficients[,"t value"])
##                    AGREEMENTyes                NEGATIONnegative 
##                      4.37198298                     -1.01767960 
##                     EMPHPRONyes                SUBJORDERsemgram 
##                      0.23126312                      3.33897026 
##              NUMBERGRAMsingular                PERSONGRAMsecond 
##                     -2.24412620                      1.14023537 
##                 PERSONGRAMthird   AGREEMENTyes:NEGATIONnegative 
##                      2.52729216                      0.58690684 
##        AGREEMENTyes:EMPHPRONyes   AGREEMENTyes:SUBJORDERsemgram 
##                     -1.30766349                      3.53108615 
## AGREEMENTyes:NUMBERGRAMsingular   AGREEMENTyes:PERSONGRAMsecond 
##                      1.32657549                     -0.37498824 
##    AGREEMENTyes:PERSONGRAMthird                           -3|-2 
##                     -1.60067926                     -7.40016289 
##                           -2|-1                            -1|0 
##                     -2.81623304                      0.06170461 
##                             0|1                             1|2 
##                      2.24927021                      4.49166614 
##                             2|3 
##                      9.87427499

And from that we compute the p-values:

(m.01.pvalues <- 2*pnorm(abs(m.01.tscores), lower.tail=FALSE))
##                    AGREEMENTyes                NEGATIONnegative 
##                    1.231231e-05                    3.088302e-01 
##                     EMPHPRONyes                SUBJORDERsemgram 
##                    8.171104e-01                    8.408957e-04 
##              NUMBERGRAMsingular                PERSONGRAMsecond 
##                    2.482428e-02                    2.541883e-01 
##                 PERSONGRAMthird   AGREEMENTyes:NEGATIONnegative 
##                    1.149458e-02                    5.572663e-01 
##        AGREEMENTyes:EMPHPRONyes   AGREEMENTyes:SUBJORDERsemgram 
##                    1.909875e-01                    4.138569e-04 
## AGREEMENTyes:NUMBERGRAMsingular   AGREEMENTyes:PERSONGRAMsecond 
##                    1.846491e-01                    7.076692e-01 
##    AGREEMENTyes:PERSONGRAMthird                           -3|-2 
##                    1.094480e-01                    1.360175e-13 
##                           -2|-1                            -1|0 
##                    4.859041e-03                    9.507981e-01 
##                             0|1                             1|2 
##                    2.449531e-02                    7.066814e-06 
##                             2|3 
##                    5.382075e-23
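
If one prefers to have everything in one place, the estimates, standard errors, t-values, and the p-values just computed can be bundled into a single table; a minimal sketch:

round(cbind(                   # combine into one rounded table
   summary(m.01)$coefficients, # the estimates, SEs, & t-values
   "p value"=m.01.pvalues), 4) # plus the p-values from above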

Also, let’s quickly check the model’s overall significance and see which predictors may get dropped:

anova(                # compare
   update(m.01, ~ 1), # the null model generated here ad hoc
   m.01,              # to the full initial model m.01
   test="Chisq")      # using a LRT
## Likelihood ratio tests of ordinal regression models
## 
## Response: JUDGMENT
##                                                                         Model
## 1                                                                           1
## 2 1 + AGREEMENT * (NEGATION + EMPHPRON + SUBJORDER + NUMBERGRAM + PERSONGRAM)
##   Resid. df Resid. Dev   Test    Df LR stat. Pr(Chi)
## 1      4294   15669.11                              
## 2      4281   15099.88 1 vs 2    13 569.2355       0
drop1(           # drop each droppable predictor at a time
   m.01,         # from the model m.01 &
   test="Chisq") # test its significance w/ a LRT
## Single term deletions
## 
## Model:
## JUDGMENT ~ 1 + AGREEMENT * (NEGATION + EMPHPRON + SUBJORDER + 
##     NUMBERGRAM + PERSONGRAM)
##                      Df   AIC     LRT Pr(>Chi)    
## <none>                  15138                     
## AGREEMENT:NEGATION    1 15136  0.3444  0.55727    
## AGREEMENT:EMPHPRON    1 15138  1.7102  0.19095    
## AGREEMENT:SUBJORDER   1 15148 12.4862  0.00041 ***
## AGREEMENT:NUMBERGRAM  1 15138  1.7590  0.18474    
## AGREEMENT:PERSONGRAM  2 15137  3.4406  0.17901    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
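
The candidate for deletion is the droppable term with the highest p-value, here AGREEMENT:NEGATION; if one wants to pick that term out programmatically rather than by eye, a small sketch:

m.01.drop1 <- drop1(m.01, test="Chisq") # re-run & store the drop1 results
rownames(m.01.drop1)[                   # return the row name of the term w/
   which.max(m.01.drop1[["Pr(>Chi)"]])] # the highest p-value (the NA of <none> is ignored)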

Let’s fit the 2nd model with the interaction AGREEMENT:NEGATION deleted:

summary(m.02 <- update(   # make m.02 an update of
   m.01, .~.              # m.01, namely all of it (.~.), but then
   - AGREEMENT:NEGATION)) # remove this interaction
## Call:
## polr(formula = JUDGMENT ~ AGREEMENT + NEGATION + EMPHPRON + SUBJORDER + 
##     NUMBERGRAM + PERSONGRAM + AGREEMENT:EMPHPRON + AGREEMENT:SUBJORDER + 
##     AGREEMENT:NUMBERGRAM + AGREEMENT:PERSONGRAM, data = x, na.action = na.exclude, 
##     Hess = TRUE)
## 
## Coefficients:
##                                    Value Std. Error t value
## AGREEMENTyes                     1.03982    0.22386  4.6450
## NEGATIONnegative                -0.04691    0.05489 -0.8545
## EMPHPRONyes                      0.01919    0.07868  0.2439
## SUBJORDERsemgram                 0.26067    0.07832  3.3283
## NUMBERGRAMsingular              -0.29084    0.12836 -2.2658
## PERSONGRAMsecond                 0.11651    0.10639  1.0951
## PERSONGRAMthird                  0.40454    0.16209  2.4958
## AGREEMENTyes:EMPHPRONyes        -0.14501    0.11019 -1.3160
## AGREEMENTyes:SUBJORDERsemgram    0.39248    0.11080  3.5423
## AGREEMENTyes:NUMBERGRAMsingular  0.24393    0.18521  1.3171
## AGREEMENTyes:PERSONGRAMsecond   -0.05529    0.15446 -0.3579
## AGREEMENTyes:PERSONGRAMthird    -0.36913    0.23059 -1.6008
## 
## Intercepts:
##       Value   Std. Error t value
## -3|-2 -1.2119  0.1641    -7.3827
## -2|-1 -0.4494  0.1624    -2.7674
## -1|0   0.0210  0.1622     0.1295
## 0|1    0.3783  0.1623     2.3310
## 1|2    0.7462  0.1627     4.5874
## 2|3    1.6425  0.1642    10.0033
## 
## Residual Deviance: 15100.22 
## AIC: 15136.22
anova(m.01, m.02, test="Chisq") # compare models with a LRT
## Likelihood ratio tests of ordinal regression models
## 
## Response: JUDGMENT
##                                                                                                                                                            Model
## 1 AGREEMENT + NEGATION + EMPHPRON + SUBJORDER + NUMBERGRAM + PERSONGRAM + AGREEMENT:EMPHPRON + AGREEMENT:SUBJORDER + AGREEMENT:NUMBERGRAM + AGREEMENT:PERSONGRAM
## 2                                                                                    1 + AGREEMENT * (NEGATION + EMPHPRON + SUBJORDER + NUMBERGRAM + PERSONGRAM)
##   Resid. df Resid. Dev   Test    Df  LR stat.   Pr(Chi)
## 1      4282   15100.22                                 
## 2      4281   15099.88 1 vs 2     1 0.3444451 0.5572746
drop1(           # drop each droppable predictor at a time
   m.02,         # from the model m.02 &
   test="Chisq") # test its significance w/ a LRT
## Single term deletions
## 
## Model:
## JUDGMENT ~ AGREEMENT + NEGATION + EMPHPRON + SUBJORDER + NUMBERGRAM + 
##     PERSONGRAM + AGREEMENT:EMPHPRON + AGREEMENT:SUBJORDER + AGREEMENT:NUMBERGRAM + 
##     AGREEMENT:PERSONGRAM
##                      Df   AIC     LRT  Pr(>Chi)    
## <none>                  15136                      
## NEGATION              1 15135  0.7302 0.3928118    
## AGREEMENT:EMPHPRON    1 15136  1.7323 0.1881216    
## AGREEMENT:SUBJORDER   1 15147 12.5658 0.0003929 ***
## AGREEMENT:NUMBERGRAM  1 15136  1.7338 0.1879251    
## AGREEMENT:PERSONGRAM  2 15136  3.4864 0.1749558    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Let’s fit the 3rd model with the main effect NEGATION deleted:

summary(m.03 <- update( # make m.03 an update of
   m.02, .~.            # m.02, namely all of it (.~.), but then
   - NEGATION))         # remove this predictor
## Call:
## polr(formula = JUDGMENT ~ AGREEMENT + EMPHPRON + SUBJORDER + 
##     NUMBERGRAM + PERSONGRAM + AGREEMENT:EMPHPRON + AGREEMENT:SUBJORDER + 
##     AGREEMENT:NUMBERGRAM + AGREEMENT:PERSONGRAM, data = x, na.action = na.exclude, 
##     Hess = TRUE)
## 
## Coefficients:
##                                    Value Std. Error t value
## AGREEMENTyes                     1.02402    0.22306  4.5908
## EMPHPRONyes                      0.02060    0.07866  0.2618
## SUBJORDERsemgram                 0.25944    0.07831  3.3130
## NUMBERGRAMsingular              -0.29455    0.12828 -2.2962
## PERSONGRAMsecond                 0.10906    0.10604  1.0285
## PERSONGRAMthird                  0.39612    0.16178  2.4485
## AGREEMENTyes:EMPHPRONyes        -0.14650    0.11018 -1.3297
## AGREEMENTyes:SUBJORDERsemgram    0.39322    0.11080  3.5490
## AGREEMENTyes:NUMBERGRAMsingular  0.25412    0.18481  1.3751
## AGREEMENTyes:PERSONGRAMsecond   -0.04408    0.15391 -0.2864
## AGREEMENTyes:PERSONGRAMthird    -0.35195    0.22969 -1.5323
## 
## Intercepts:
##       Value   Std. Error t value
## -3|-2 -1.1959  0.1631    -7.3332
## -2|-1 -0.4336  0.1613    -2.6873
## -1|0   0.0368  0.1611     0.2283
## 0|1    0.3940  0.1612     2.4437
## 1|2    0.7619  0.1616     4.7141
## 2|3    1.6581  0.1632    10.1612
## 
## Residual Deviance: 15100.95 
## AIC: 15134.95
anova(m.02, m.03, test="Chisq") # compare models with a LRT
## Likelihood ratio tests of ordinal regression models
## 
## Response: JUDGMENT
##                                                                                                                                                            Model
## 1            AGREEMENT + EMPHPRON + SUBJORDER + NUMBERGRAM + PERSONGRAM + AGREEMENT:EMPHPRON + AGREEMENT:SUBJORDER + AGREEMENT:NUMBERGRAM + AGREEMENT:PERSONGRAM
## 2 AGREEMENT + NEGATION + EMPHPRON + SUBJORDER + NUMBERGRAM + PERSONGRAM + AGREEMENT:EMPHPRON + AGREEMENT:SUBJORDER + AGREEMENT:NUMBERGRAM + AGREEMENT:PERSONGRAM
##   Resid. df Resid. Dev   Test    Df  LR stat.   Pr(Chi)
## 1      4283   15100.95                                 
## 2      4282   15100.22 1 vs 2     1 0.7302203 0.3928118
drop1(           # drop each droppable predictor at a time
   m.03,         # from the model m.03 &
   test="Chisq") # test its significance w/ a LRT
## Single term deletions
## 
## Model:
## JUDGMENT ~ AGREEMENT + EMPHPRON + SUBJORDER + NUMBERGRAM + PERSONGRAM + 
##     AGREEMENT:EMPHPRON + AGREEMENT:SUBJORDER + AGREEMENT:NUMBERGRAM + 
##     AGREEMENT:PERSONGRAM
##                      Df   AIC     LRT Pr(>Chi)    
## <none>                  15135                     
## AGREEMENT:EMPHPRON    1 15135  1.7682 0.183607    
## AGREEMENT:SUBJORDER   1 15146 12.6131 0.000383 ***
## AGREEMENT:NUMBERGRAM  1 15135  1.8905 0.169147    
## AGREEMENT:PERSONGRAM  2 15134  3.3222 0.189929    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Let’s fit the 4th model with the interaction AGREEMENT:PERSONGRAM deleted:

summary(m.04 <- update( # make m.04 an update of
   m.03, .~.            # m.03, namely all of it (.~.), but then
   - AGREEMENT:PERSONGRAM)) # remove this interaction
## Call:
## polr(formula = JUDGMENT ~ AGREEMENT + EMPHPRON + SUBJORDER + 
##     NUMBERGRAM + PERSONGRAM + AGREEMENT:EMPHPRON + AGREEMENT:SUBJORDER + 
##     AGREEMENT:NUMBERGRAM, data = x, na.action = na.exclude, Hess = TRUE)
## 
## Coefficients:
##                                    Value Std. Error t value
## AGREEMENTyes                     0.76959    0.11295  6.8139
## EMPHPRONyes                      0.01133    0.07850  0.1443
## SUBJORDERsemgram                 0.26537    0.07813  3.3965
## NUMBERGRAMsingular              -0.41025    0.10669 -3.8454
## PERSONGRAMsecond                 0.08777    0.07689  1.1415
## PERSONGRAMthird                  0.21723    0.11466  1.8946
## AGREEMENTyes:EMPHPRONyes        -0.13700    0.11004 -1.2449
## AGREEMENTyes:SUBJORDERsemgram    0.38909    0.11063  3.5171
## AGREEMENTyes:NUMBERGRAMsingular  0.48785    0.11089  4.3995
## 
## Intercepts:
##       Value   Std. Error t value
## -3|-2 -1.3213  0.1329    -9.9452
## -2|-1 -0.5594  0.1305    -4.2860
## -1|0  -0.0895  0.1301    -0.6880
## 0|1    0.2674  0.1301     2.0556
## 1|2    0.6350  0.1304     4.8681
## 2|3    1.5310  0.1323    11.5759
## 
## Residual Deviance: 15104.28 
## AIC: 15134.28
anova(m.03, m.04, test="Chisq") # compare models with a LRT
## Likelihood ratio tests of ordinal regression models
## 
## Response: JUDGMENT
##                                                                                                                                                 Model
## 1                        AGREEMENT + EMPHPRON + SUBJORDER + NUMBERGRAM + PERSONGRAM + AGREEMENT:EMPHPRON + AGREEMENT:SUBJORDER + AGREEMENT:NUMBERGRAM
## 2 AGREEMENT + EMPHPRON + SUBJORDER + NUMBERGRAM + PERSONGRAM + AGREEMENT:EMPHPRON + AGREEMENT:SUBJORDER + AGREEMENT:NUMBERGRAM + AGREEMENT:PERSONGRAM
##   Resid. df Resid. Dev   Test    Df LR stat.   Pr(Chi)
## 1      4285   15104.28                                
## 2      4283   15100.95 1 vs 2     2 3.322208 0.1899291
drop1(           # drop each droppable predictor at a time
   m.04,         # from the model m.04 &
   test="Chisq") # test its significance w/ a LRT
## Single term deletions
## 
## Model:
## JUDGMENT ~ AGREEMENT + EMPHPRON + SUBJORDER + NUMBERGRAM + PERSONGRAM + 
##     AGREEMENT:EMPHPRON + AGREEMENT:SUBJORDER + AGREEMENT:NUMBERGRAM
##                      Df   AIC     LRT  Pr(>Chi)    
## <none>                  15134                      
## PERSONGRAM            2 15134  3.6143 0.1641241    
## AGREEMENT:EMPHPRON    1 15134  1.5498 0.2131632    
## AGREEMENT:SUBJORDER   1 15145 12.3871 0.0004323 ***
## AGREEMENT:NUMBERGRAM  1 15152 19.3703 1.077e-05 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Let’s fit the 5th model with the interaction AGREEMENT:EMPHPRON deleted:

summary(m.05 <- update(   # make m.05 an update of
   m.04, .~.              # m.04, namely all of it (.~.), but then
   - AGREEMENT:EMPHPRON)) # remove this interaction
## Call:
## polr(formula = JUDGMENT ~ AGREEMENT + EMPHPRON + SUBJORDER + 
##     NUMBERGRAM + PERSONGRAM + AGREEMENT:SUBJORDER + AGREEMENT:NUMBERGRAM, 
##     data = x, na.action = na.exclude, Hess = TRUE)
## 
## Coefficients:
##                                    Value Std. Error t value
## AGREEMENTyes                     0.69713    0.09674   7.206
## EMPHPRONyes                     -0.05837    0.05502  -1.061
## SUBJORDERsemgram                 0.26392    0.07812   3.378
## NUMBERGRAMsingular              -0.42288    0.10623  -3.981
## PERSONGRAMsecond                 0.08768    0.07688   1.141
## PERSONGRAMthird                  0.21233    0.11460   1.853
## AGREEMENTyes:SUBJORDERsemgram    0.39157    0.11061   3.540
## AGREEMENTyes:NUMBERGRAMsingular  0.49044    0.11087   4.424
## 
## Intercepts:
##       Value    Std. Error t value 
## -3|-2  -1.3652   0.1282   -10.6526
## -2|-1  -0.6033   0.1257    -4.7990
## -1|0   -0.1334   0.1252    -1.0654
## 0|1     0.2235   0.1252     1.7846
## 1|2     0.5910   0.1256     4.7069
## 2|3     1.4865   0.1273    11.6748
## 
## Residual Deviance: 15105.83 
## AIC: 15133.83
anova(m.04, m.05, test="Chisq") # compare models with a LRT
## Likelihood ratio tests of ordinal regression models
## 
## Response: JUDGMENT
##                                                                                                                          Model
## 1                      AGREEMENT + EMPHPRON + SUBJORDER + NUMBERGRAM + PERSONGRAM + AGREEMENT:SUBJORDER + AGREEMENT:NUMBERGRAM
## 2 AGREEMENT + EMPHPRON + SUBJORDER + NUMBERGRAM + PERSONGRAM + AGREEMENT:EMPHPRON + AGREEMENT:SUBJORDER + AGREEMENT:NUMBERGRAM
##   Resid. df Resid. Dev   Test    Df LR stat.   Pr(Chi)
## 1      4286   15105.83                                
## 2      4285   15104.28 1 vs 2     1 1.549812 0.2131632
drop1(           # drop each droppable predictor at a time
   m.05,         # from the model m.05 &
   test="Chisq") # test its significance w/ a LRT
## Single term deletions
## 
## Model:
## JUDGMENT ~ AGREEMENT + EMPHPRON + SUBJORDER + NUMBERGRAM + PERSONGRAM + 
##     AGREEMENT:SUBJORDER + AGREEMENT:NUMBERGRAM
##                      Df   AIC     LRT  Pr(>Chi)    
## <none>                  15134                      
## EMPHPRON              1 15133  1.1264 0.2885398    
## PERSONGRAM            2 15133  3.4489 0.1782729    
## AGREEMENT:SUBJORDER   1 15144 12.5500 0.0003962 ***
## AGREEMENT:NUMBERGRAM  1 15151 19.5864 9.615e-06 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Let’s fit the 6th model with the main effect EMPHPRON deleted:

summary(m.06 <- update( # make m.06 an update of
   m.05, .~.            # m.05, namely all of it (.~.), but then
   - EMPHPRON))         # remove this predictor
## Call:
## polr(formula = JUDGMENT ~ AGREEMENT + SUBJORDER + NUMBERGRAM + 
##     PERSONGRAM + AGREEMENT:SUBJORDER + AGREEMENT:NUMBERGRAM, 
##     data = x, na.action = na.exclude, Hess = TRUE)
## 
## Coefficients:
##                                    Value Std. Error t value
## AGREEMENTyes                     0.70368    0.09656   7.288
## SUBJORDERsemgram                 0.26511    0.07811   3.394
## NUMBERGRAMsingular              -0.41126    0.10563  -3.893
## PERSONGRAMsecond                 0.08941    0.07685   1.163
## PERSONGRAMthird                  0.21791    0.11446   1.904
## AGREEMENTyes:SUBJORDERsemgram    0.39147    0.11060   3.539
## AGREEMENTyes:NUMBERGRAMsingular  0.47677    0.11012   4.330
## 
## Intercepts:
##       Value    Std. Error t value 
## -3|-2  -1.3264   0.1228   -10.8030
## -2|-1  -0.5646   0.1203    -4.6947
## -1|0   -0.0948   0.1198    -0.7917
## 0|1     0.2619   0.1198     2.1853
## 1|2     0.6293   0.1202     5.2342
## 2|3     1.5247   0.1222    12.4814
## 
## Residual Deviance: 15106.95 
## AIC: 15132.95
anova(m.05, m.06, test="Chisq") # compare models with a LRT
## Likelihood ratio tests of ordinal regression models
## 
## Response: JUDGMENT
##                                                                                                     Model
## 1            AGREEMENT + SUBJORDER + NUMBERGRAM + PERSONGRAM + AGREEMENT:SUBJORDER + AGREEMENT:NUMBERGRAM
## 2 AGREEMENT + EMPHPRON + SUBJORDER + NUMBERGRAM + PERSONGRAM + AGREEMENT:SUBJORDER + AGREEMENT:NUMBERGRAM
##   Resid. df Resid. Dev   Test    Df LR stat.   Pr(Chi)
## 1      4287   15106.95                                
## 2      4286   15105.83 1 vs 2     1 1.126422 0.2885398
drop1(           # drop each droppable predictor at a time
   m.06,         # from the model m.06 &
   test="Chisq") # test its significance w/ a LRT
## Single term deletions
## 
## Model:
## JUDGMENT ~ AGREEMENT + SUBJORDER + NUMBERGRAM + PERSONGRAM + 
##     AGREEMENT:SUBJORDER + AGREEMENT:NUMBERGRAM
##                      Df   AIC     LRT  Pr(>Chi)    
## <none>                  15133                      
## PERSONGRAM            2 15133  3.6433 0.1617624    
## AGREEMENT:SUBJORDER   1 15144 12.5430 0.0003977 ***
## AGREEMENT:NUMBERGRAM  1 15150 18.7609 1.482e-05 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Let’s fit the 7th model with the predictor PERSONGRAM deleted:

summary(m.07 <- update( # make m.07 an update of
   m.06, .~.            # m.06, namely all of it (.~.), but then
   - PERSONGRAM))       # remove this predictor
## Call:
## polr(formula = JUDGMENT ~ AGREEMENT + SUBJORDER + NUMBERGRAM + 
##     AGREEMENT:SUBJORDER + AGREEMENT:NUMBERGRAM, data = x, na.action = na.exclude, 
##     Hess = TRUE)
## 
## Coefficients:
##                                   Value Std. Error t value
## AGREEMENTyes                     0.7064    0.09623   7.341
## SUBJORDERsemgram                 0.2666    0.07795   3.421
## NUMBERGRAMsingular              -0.5460    0.07867  -6.941
## AGREEMENTyes:SUBJORDERsemgram    0.3958    0.10971   3.608
## AGREEMENTyes:NUMBERGRAMsingular  0.4667    0.11000   4.243
## 
## Intercepts:
##       Value    Std. Error t value 
## -3|-2  -1.4980   0.0786   -19.0583
## -2|-1  -0.7364   0.0743    -9.9046
## -1|0   -0.2670   0.0732    -3.6463
## 0|1     0.0893   0.0730     1.2232
## 1|2     0.4562   0.0733     6.2265
## 2|3     1.3510   0.0760    17.7800
## 
## Residual Deviance: 15110.60 
## AIC: 15132.60
anova(m.06, m.07, test="Chisq") # compare models with a LRT
## Likelihood ratio tests of ordinal regression models
## 
## Response: JUDGMENT
##                                                                                          Model
## 1              AGREEMENT + SUBJORDER + NUMBERGRAM + AGREEMENT:SUBJORDER + AGREEMENT:NUMBERGRAM
## 2 AGREEMENT + SUBJORDER + NUMBERGRAM + PERSONGRAM + AGREEMENT:SUBJORDER + AGREEMENT:NUMBERGRAM
##   Resid. df Resid. Dev   Test    Df LR stat.   Pr(Chi)
## 1      4289   15110.60                                
## 2      4287   15106.95 1 vs 2     2 3.643254 0.1617624
drop1(           # drop each droppable predictor at a time
   m.07,         # from the model m.07 &
   test="Chisq") # test its significance w/ a LRT
## Single term deletions
## 
## Model:
## JUDGMENT ~ AGREEMENT + SUBJORDER + NUMBERGRAM + AGREEMENT:SUBJORDER + 
##     AGREEMENT:NUMBERGRAM
##                      Df   AIC    LRT  Pr(>Chi)    
## <none>                  15133                     
## AGREEMENT:SUBJORDER   1 15144 13.034 0.0003059 ***
## AGREEMENT:NUMBERGRAM  1 15149 18.017  2.19e-05 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

At this point we can’t delete any more predictors, so we’re done with model selection:

summary(m.final <- m.07); rm(m.07); invisible(gc()) # store m.07 as the final model, remove the duplicate, & free memory
## Call:
## polr(formula = JUDGMENT ~ AGREEMENT + SUBJORDER + NUMBERGRAM + 
##     AGREEMENT:SUBJORDER + AGREEMENT:NUMBERGRAM, data = x, na.action = na.exclude, 
##     Hess = TRUE)
## 
## Coefficients:
##                                   Value Std. Error t value
## AGREEMENTyes                     0.7064    0.09623   7.341
## SUBJORDERsemgram                 0.2666    0.07795   3.421
## NUMBERGRAMsingular              -0.5460    0.07867  -6.941
## AGREEMENTyes:SUBJORDERsemgram    0.3958    0.10971   3.608
## AGREEMENTyes:NUMBERGRAMsingular  0.4667    0.11000   4.243
## 
## Intercepts:
##       Value    Std. Error t value 
## -3|-2  -1.4980   0.0786   -19.0583
## -2|-1  -0.7364   0.0743    -9.9046
## -1|0   -0.2670   0.0732    -3.6463
## 0|1     0.0893   0.0730     1.2232
## 1|2     0.4562   0.0733     6.2265
## 2|3     1.3510   0.0760    17.7800
## 
## Residual Deviance: 15110.60 
## AIC: 15132.60
(m.final.tscores <- summary(m.final)$coefficients[,"t value"])
##                    AGREEMENTyes                SUBJORDERsemgram 
##                        7.341353                        3.420638 
##              NUMBERGRAMsingular   AGREEMENTyes:SUBJORDERsemgram 
##                       -6.941128                        3.607670 
## AGREEMENTyes:NUMBERGRAMsingular                           -3|-2 
##                        4.242926                      -19.058294 
##                           -2|-1                            -1|0 
##                       -9.904564                       -3.646271 
##                             0|1                             1|2 
##                        1.223185                        6.226538 
##                             2|3 
##                       17.779967
(m.final.pvalues <- 2*pnorm(abs(m.final.tscores), lower.tail=FALSE))
##                    AGREEMENTyes                SUBJORDERsemgram 
##                    2.114448e-13                    6.247452e-04 
##              NUMBERGRAMsingular   AGREEMENTyes:SUBJORDERsemgram 
##                    3.889817e-12                    3.089594e-04 
## AGREEMENTyes:NUMBERGRAMsingular                           -3|-2 
##                    2.206240e-05                    5.607383e-81 
##                           -2|-1                            -1|0 
##                    3.977021e-23                    2.660737e-04 
##                             0|1                             1|2 
##                    2.212599e-01                    4.768550e-10 
##                             2|3 
##                    1.010396e-70
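
Given polr’s parameterization, positive coefficients shift the predicted judgments towards the higher end of the rating scale; exponentiating them turns them into (cumulative) odds ratios, which are often easier to report. A brief sketch (values greater than 1 favor higher ratings, values less than 1 favor lower ratings):

round(                   # round to 3 decimals
   exp(                  # the exponentiated (i.e. odds-ratio scale)
   coef(m.final)), 3)    # non-intercept coefficients of the final model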

Let’s also compute the overall p-value for this model:

m.00 <- update( # make m.00 an update of
   m.final,     # m.final, namely
   ~ 1)         # making this the right hand side of the formula (i.e. no predictors)
anova(           # compare
   m.00,         # the null model
   m.01,         # to the full initial model w/ all the predictors
   test="Chisq") # using a LRT
## Likelihood ratio tests of ordinal regression models
## 
## Response: JUDGMENT
##                                                                         Model
## 1                                                                           1
## 2 1 + AGREEMENT * (NEGATION + EMPHPRON + SUBJORDER + NUMBERGRAM + PERSONGRAM)
##   Resid. df Resid. Dev   Test    Df LR stat. Pr(Chi)
## 1      4294   15669.11                              
## 2      4281   15099.88 1 vs 2    13 569.2355       0
pchisq(              # compute the area under the chi-squared curve
   q=569.2355,       # for this chi-squared value
   df=13,            # at 13 df
   lower.tail=FALSE) # only using the right/upper tail/side
## [1] 2.752626e-113
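
The same p-value can also be obtained without typing the numbers in by hand, because both the LR statistic and its df can be computed from the fitted models; a minimal sketch:

pchisq(                              # compute the area under the chi-squared curve
   q=deviance(m.00)-deviance(m.01),  # for the difference of the two deviances
   df=m.01$edf-m.00$edf,             # at the difference in (effective) df
   lower.tail=FALSE)                 # only using the right/upper tail/side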

On to model quality; first, the model’s R2:

R2(m.final)
## [1] 0.1250769
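
The R2 function comes from the small helper script sourced at the top; its value here is consistent with Nagelkerke’s R-squared, which, as a sketch (and assuming that is indeed what the helper computes), can be reconstructed from the null and final deviances:

r2.cox.snell <- 1-exp(                  # Cox & Snell's R-squared
   (deviance(m.final)-deviance(m.00)) / # from the deviance difference
   nrow(x))                             # scaled by the sample size
r2.cox.snell /                          # Nagelkerke: rescale it by
   (1-exp(-deviance(m.00)/nrow(x)))     # its maximally attainable value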

Now, we make the ‘predictions’ …

x <- cbind(x,       # add to x the columns resulting from
   predict(         # predicting
      m.final,      # from m.final
      type="prob")) # predicted probabilities
# give those columns better names
names(x)[9:15] <- paste0( # rename those columns to the result of pasting
   "PREDS.",              # "PREDS."
   names(x)[9:15])        # in front of the existing names
x <- cbind(x,         # add to x a column
   PREDS.CAT=predict( # called PREDS.CAT, the predictions
      m.final,        # from m.final, namely
      type="class"))  # categorical predictions
head(x) # check the result
##   CASE JUDGMENT AGREEMENT    NEGATION EMPHPRON SUBJORDER NUMBERGRAM PERSONGRAM
## 1    1        3        no affirmative      yes   gramsem     plural      third
## 2    2       -3        no affirmative      yes   gramsem     plural      third
## 3    3        3        no affirmative      yes   gramsem     plural      third
## 4    4        3        no affirmative      yes   gramsem     plural      third
## 5    5       -3        no affirmative      yes   gramsem     plural      third
## 6    6        2        no affirmative      yes   gramsem     plural      third
##    PREDS.-3  PREDS.-2 PREDS.-1    PREDS.0    PREDS.1   PREDS.2   PREDS.3
## 1 0.1827313 0.1410666 0.109845 0.08866082 0.08981659 0.1821742 0.2057054
## 2 0.1827313 0.1410666 0.109845 0.08866082 0.08981659 0.1821742 0.2057054
## 3 0.1827313 0.1410666 0.109845 0.08866082 0.08981659 0.1821742 0.2057054
## 4 0.1827313 0.1410666 0.109845 0.08866082 0.08981659 0.1821742 0.2057054
## 5 0.1827313 0.1410666 0.109845 0.08866082 0.08981659 0.1821742 0.2057054
## 6 0.1827313 0.1410666 0.109845 0.08866082 0.08981659 0.1821742 0.2057054
##   PREDS.CAT
## 1         3
## 2         3
## 3         3
## 4         3
## 5         3
## 6         3

… and evaluate them (again, a confusion matrix is not as informative here as it was for the previous kinds of models):

(c.m <- table(           # confusion matrix: cross-tabulate
   "OBS"  =x$JUDGMENT,   # observed judgments in the rows
   "PREDS"=x$PREDS.CAT)) # predicted judgments in the columns
##     PREDS
## OBS    -3   -2   -1    0    1    2    3
##   -3  301    0    0    0    0    0  319
##   -2  191    0    0    0    0    0  289
##   -1  136    0    0    0    0    0  251
##   0    88    0    0    0    0    0  241
##   1    93    0    0    0    0    0  262
##   2   160    0    0    0    0    0  666
##   3   182    0    0    0    0    0 1121
c(tapply(as.character(x$JUDGMENT)==x$PREDS.CAT, x$JUDGMENT, mean), # accuracies for ...
  "overall"=mean(as.character(x$JUDGMENT)==x$PREDS.CAT)) # ... each level & overall
##        -3        -2        -1         0         1         2         3   overall 
## 0.4854839 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000 0.8603223 0.3306977
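
Because the response is ordinal, a prediction that is off by one rating step is arguably much better than one that is off by six, something a plain accuracy score cannot reflect; a small sketch of an ordinality-aware score, the mean absolute distance (in rating steps) between observed and predicted judgments:

mean(abs(                    # the mean absolute difference between
   as.numeric(x$JUDGMENT)-   # the observed judgments &
   as.numeric(x$PREDS.CAT))) # the predicted ones (both as level numbers 1-7)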

1.4 Visual interpretation

Now let’s visualize the results of the two interactions based on the predictions; we begin with AGREEMENT:SUBJORDER:

ph.as <- data.frame(         # make ph.as a data frame of
   agrsub <- effect(         # the effect
      "AGREEMENT:SUBJORDER", # of AGREEMENT:SUBJORDER
      m.final))              # in m.final
ph.as[,1:9] # show the predictor combinations & their predicted probabilities
##   AGREEMENT SUBJORDER   prob.X.3   prob.X.2   prob.X.1    prob.X0    prob.X1
## 1        no   gramsem 0.23067814 0.16037026 0.11556664 0.08791856 0.08457945
## 2       yes   gramsem 0.10323854 0.09455074 0.08497318 0.07743151 0.08810081
## 3        no   semgram 0.18677053 0.14292687 0.11054340 0.08874982 0.08947773
## 4       yes   semgram 0.05603033 0.05675284 0.05614353 0.05603617 0.07028809
##     prob.X2   prob.X3
## 1 0.1590298 0.1618572
## 2 0.2170518 0.3346534
## 3 0.1801735 0.2013581
## 4 0.2109386 0.4938105
plot(agrsub,                   # plot the effect of AGREEMENT:SUBJORDER
   type="probability",         # but plot predicted probabilities
   ylim=c(0, 1),               # w/ these y-axis limits
   grid=TRUE,                  # & w/ a grid
   multiline=TRUE,             # put all lines in one panel
   confint=list(style="auto"), # keep confidence intervals
   lattice=list(key.args=list( # make the
      columns=5,               # legend have 5 columns &
      cex.title=0)))           # no title

plot(agrsub,                   # plot the effect of AGREEMENT:SUBJORDER
   type="probability",         # but plot predicted probabilities
   ylim=c(0, 1),               # w/ these y-axis limits
   x.var="SUBJORDER",          # w/ this variable on the x-axis
   grid=TRUE,                  # & w/ a grid
   multiline=TRUE,             # put all lines in one panel
   confint=list(style="auto"), # keep confidence intervals
   lattice=list(key.args=list( # make the
      columns=5,               # legend have 5 columns &
      cex.title=0)))           # no title