library(foreign) #package to access data type
library(MASS) #package for model
dat <- read.dta("https://stats.idre.ucla.edu/stat/data/ologit.dta")10.24: Ordinal regression
Learning objectives
Identify when to use an ordinal regression model
Interpret an ordinal regression model
Generate an ordinal regression model in R
Classwise videos
If you have questions as you watch the videos, feel free to send me an email or slack message! I will address common questions at the beginning of class.
Dr. Väisänan does a great job of explaining ordinal regression, though the code is in Stata. The lab exercise in class will introduce the R code.
Textbook
ISLR does not cover ordinal regression.
Application exercise
Data
A study looks at factors that influence the decision of whether to apply to graduate school. College juniors are asked if they are unlikely, somewhat likely, or very likely to apply to graduate school.
Source: https://stats.oarc.ucla.edu/r/dae/ordinal-logistic-regression/
Use the code below to access the data:
Data dictionary:
apply: response to question “how likely are you to apply to graduate school?” with 3 options: very likely, somewhat likely, unlikelypared: parental education status (1=at least one parent has a graduate degree, 0 otherwise)public: undergraduate institution type (1=publis, 0=private)gpa: current undergraduate GPA (4.0 scale)
Take a glimpse of the data. What is the outcome?
Calculate some summary statistics for the data. How many students are in each category of
apply? What is the average GPA for each category ofapply?- Note: For a table showing the number of each parental education status and institution type per level of
apply, you can either usetableorcount. We have seen thecountfunction before, andtableworks the same way (it’s just not a tidyverse function). Specify the two variables within the function, where the first specified variable will show up in the row of the table and the second variable will show up in the columns (e.g.,table(dat$pared, dat$apply)
- Note: For a table showing the number of each parental education status and institution type per level of
Ordinal regression model (AKA proportional odds model or cumulative logit model)
We can fit the model using the polr function in the MASS package. Note that the syntax is the same as what we’ve seen before, but we don’t need to specify family like we do with glm, because the polr function is specialized for the ordinal model. The Hess=TRUE option just stores the necessary information in the ord_mod object for us to access the summary information; we will forego the technical details here.
ord_mod <- polr(apply ~ pared + public + gpa, data=dat, Hess=TRUE)
summary(ord_mod)Call:
polr(formula = apply ~ pared + public + gpa, data = dat, Hess = TRUE)
Coefficients:
Value Std. Error t value
pared 1.04769 0.2658 3.9418
public -0.05879 0.2979 -0.1974
gpa 0.61594 0.2606 2.3632
Intercepts:
Value Std. Error t value
unlikely|somewhat likely 2.2039 0.7795 2.8272
somewhat likely|very likely 4.2994 0.8043 5.3453
Residual Deviance: 717.0249
AIC: 727.0249
Recall from the lecture videos that with the ordinal model, we have the same predictor coefficients for each level of the outcome, but different intercepts. This is why we see two components to the summary output: the Coefficients table and the Intercepts table.
Notice that the output does not provide p-values by default. We can manually calculate them ourselves and then create a table to show the estimates, standard errors, t values, and p values:
pvals <- pnorm(-abs(summary(ord_mod)$coef[,"t value"]))*2
ctable <- cbind(summary(ord_mod)$coef,pvals)
ctable Value Std. Error t value pvals
pared 1.04769010 0.2657894 3.9418050 8.087072e-05
public -0.05878572 0.2978614 -0.1973593 8.435464e-01
gpa 0.61594057 0.2606340 2.3632399 1.811594e-02
unlikely|somewhat likely 2.20391473 0.7795455 2.8271792 4.696004e-03
somewhat likely|very likely 4.29936315 0.8043267 5.3452947 9.027008e-08
Like other models we have seen, the intercepts are not typically the focus of the interpretations. Let’s focus on interpreting the coefficient estimates. First, let’s exponentiate the estimates and confidence intervals to interpret on the odds ratio scale.
exp_coefs <- exp(cbind(OR=coef(ord_mod),confint(ord_mod)))Waiting for profiling to be done...
We can interpret the coefficient estimate of pared as follows:
For students whose parents did attend college, the odds of being more likely (i.e., very or somewhat likely versus unlikely) to apply to graduate school is 2.85 times that of students whose parents did not go to college, holding constant all other variables.
Write the interpretations for the other two predictor variables
Go Further: What if you wanted to interpret the gpa variable per 0.1 increase instead of per 1.0 unit increase?
Assessment
We can access predicted probabilities of being in each category:
head(ord_mod$fitted.values) unlikely somewhat likely very likely
1 0.5488310 0.3593310 0.09183798
2 0.3055632 0.4759496 0.21848725
3 0.2293835 0.4781951 0.29242138
4 0.6161224 0.3126888 0.07118879
5 0.6560149 0.2833901 0.06059505
6 0.6609240 0.2797117 0.05936430
head(predict(ord_mod))[1] unlikely somewhat likely somewhat likely unlikely
[5] unlikely unlikely
Levels: unlikely somewhat likely very likely
Generate the confusion matrix to assess the accuracy of the predictions.
Finally, we should assess the proportional odds assumption. To do this, we can compare the predicted probabilities using the multinomial model, which is a more precise model, to the predicted probabilities with the ordinal model. Note that this is a subjective process because it is difficult to know how different the predicted probabilities should be to conclude that the assumption is violated. It is also useful to think through whether or not it is reasonable to assume that the odds of being in one category vs another would increase proportionally.
To generate predictions, let’s create a new data frame with different combinations of the predictor values. It is typically easiest to use different combinations of the categorical variables and hold the continuous predictors constant.
new.data <- data.frame(pared=c(0,1,0,1),
gpa=mean(dat$gpa),
public=c(0,1,1,0))The new data contains a constant value of GPA and each unique combination of the values of parental education and institution type
new.data pared gpa public
1 0 2.998925 0
2 1 2.998925 1
3 0 2.998925 1
4 1 2.998925 0
Now we can compare predicted probabilities from the ordinal model and the multinomial model
library(nnet) #package for multinomial model
mult_mod <- multinom(apply~pared+gpa+public,
data=dat)# weights: 15 (8 variable)
initial value 439.444915
iter 10 value 357.012275
final value 356.996982
converged
predict(ord_mod, new.data, type="probs") unlikely somewhat likely very likely
1 0.5882547 0.3324677 0.07927756
2 0.3470234 0.4650134 0.18796324
3 0.6024157 0.3224929 0.07509137
4 0.3338251 0.4690740 0.19710087
predict(mult_mod, new.data, type="probs") unlikely somewhat likely very likely
1 0.5840849 0.3426999 0.0732152
2 0.3690459 0.3689135 0.2620406
3 0.6387143 0.2465215 0.1147642
4 0.3316785 0.5040241 0.1642974
Notice that the predicted probabilities for the unlikely category are very similar for both models. We do see some differences for the other two categories, but overall, I would conclude that we do not have strong evidence that the proportional odds assumption is violated.
You may also want to generate the confusion matrix for the multinomial model and see how it compares to the confusion matrix you generated for the ordinal model.