1
Topic 4 :
Ordered Logit Analysis
2
Often we deal with data where the responses are ordered, e.g.:
(i) Eyesight tests: bad; average; good
(ii) Voting: rank the candidates
(iii) Bond ratings: A+++, A++, A+, A, B+++
We could set this up as a multinomial logit model, but this would ignore an important piece of information in the data: the ordering of the values. Of course, ordinary least squares would have a problem in the opposite direction: the category codes would be used as if their numerical values (and the distances between them) meant something.
3
To see how we can deal with such data sensibly, let’s reconsider the (binary) logit set-up and motivation. One way to proceed is to assume there is a “latent” (unobservable) variable, y*, and we observe

$$
y = \begin{cases} 1 & \text{if } y^* > \mu \\ 0 & \text{if } y^* \le \mu \end{cases}
$$

where $\mu$ is a threshold value.
4
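Then write $y^* = x'\beta + \varepsilon$, so that

$$
\begin{aligned}
\Pr(y = 1 \mid x) &= \Pr(y^* > \mu) \\
&= \Pr(x'\beta + \varepsilon > \mu) \\
&= \Pr(\varepsilon > \mu - x'\beta) \\
&= 1 - F(\mu - x'\beta) \\
&= F(x'\beta - \mu)
\end{aligned}
$$

where $F$ is the cdf of $\varepsilon$ and the last equality holds if the underlying distribution is symmetric.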
5
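Setting $\mu = 0$, we have:

$$
\Pr(y = 1 \mid x) = F(x'\beta), \qquad \Pr(y = 0 \mid x) = 1 - F(x'\beta)
$$

and then choose the logistic cdf, $F(z) = e^{z}/(1 + e^{z})$, for $F$, which gives the logit model.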
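As a small numerical illustration of the binary set-up, the sketch below evaluates Pr(y = 1 | x) = F(x'β) with the logistic cdf and checks the symmetry property 1 − F(−z) = F(z) used in the derivation. The function names and the values of `x` and `beta` are hypothetical and chosen only for illustration.

```python
import math

def logistic_cdf(z):
    """Logistic cdf F(z) = 1 / (1 + exp(-z))."""
    return 1.0 / (1.0 + math.exp(-z))

def binary_logit_prob(x, beta):
    """Pr(y = 1 | x) = F(x'beta), with the threshold set to zero."""
    index = sum(xk * bk for xk, bk in zip(x, beta))
    return logistic_cdf(index)

# Hypothetical values, for illustration only.
x = [1.0, 0.5]       # first element is a constant term
beta = [-0.2, 0.8]   # so x'beta = -0.2 + 0.4 = 0.2
p1 = binary_logit_prob(x, beta)
p0 = 1.0 - p1

# Symmetry of the logistic distribution: 1 - F(-z) = F(z).
z = 0.2
symmetry_gap = abs((1.0 - logistic_cdf(-z)) - logistic_cdf(z))

print(round(p1, 4), round(p0, 4), symmetry_gap < 1e-12)
```

With an index of zero the two outcomes are equally likely, which is a quick sanity check on any implementation of F.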
6
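Now let’s use this approach with our ordered data. We have a latent variable, y*, where

$$
y^* = x'\beta + \varepsilon
$$

and let

$$
y = \begin{cases}
0 & \text{if } y^* \le \mu_0 \\
1 & \text{if } \mu_0 < y^* \le \mu_1 \\
2 & \text{if } \mu_1 < y^* \le \mu_2 \\
\vdots & \\
J & \text{if } y^* > \mu_{J-1}
\end{cases}
$$

The $\mu_j$’s are unknown parameters to be estimated with the $\beta$’s.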
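The mapping from the latent variable to the observed category can be sketched in a few lines. This assumes the rule that y = 0 when y* is at or below the first threshold, y = J when y* is above the last one, and intermediate categories are closed on the right; the threshold values and the function name are hypothetical.

```python
from bisect import bisect_left

def observed_category(y_star, thresholds):
    """Return y in {0, ..., J} for latent y*, given sorted thresholds.

    bisect_left counts the thresholds strictly below y*, so
    y = 0 when y* <= thresholds[0] and y = J when y* > thresholds[-1].
    """
    return bisect_left(thresholds, y_star)

thresholds = [-1.0, 0.5]                 # hypothetical thresholds, so J = 2
latent = [-2.0, -1.0, 0.0, 0.5, 3.0]     # hypothetical latent values
observed = [observed_category(v, thresholds) for v in latent]
print(observed)  # [0, 0, 1, 1, 2]
```

Values exactly equal to a threshold fall into the lower category, matching the weak inequalities in the definition.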
7
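$$
\begin{aligned}
\Pr(y = 0 \mid x) &= \Pr(y^* \le \mu_0) \\
&= \Pr(x'\beta + \varepsilon \le \mu_0) \\
&= \Pr(\varepsilon \le \mu_0 - x'\beta) \\
&= F(\mu_0 - x'\beta)
\end{aligned}
$$

$$
\begin{aligned}
\Pr(y = 1 \mid x) &= \Pr(\mu_0 < y^* \le \mu_1) \\
&= \Pr(\mu_0 < x'\beta + \varepsilon \le \mu_1) \\
&= \Pr(\mu_0 - x'\beta < \varepsilon \le \mu_1 - x'\beta) \\
&= F(\mu_1 - x'\beta) - F(\mu_0 - x'\beta)
\end{aligned}
$$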
8
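$$
\begin{aligned}
\Pr(y = J \mid x) &= \Pr(y^* > \mu_{J-1}) \\
&= \Pr(x'\beta + \varepsilon > \mu_{J-1}) \\
&= \Pr(\varepsilon > \mu_{J-1} - x'\beta) \\
&= 1 - F(\mu_{J-1} - x'\beta)
\end{aligned}
$$

Of course, for all these probabilities to be positive, we require that $\mu_0 < \mu_1 < \mu_2 < \dots < \mu_{J-1}$.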
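The category probabilities can be computed directly as differences of the logistic cdf evaluated at successive thresholds. The sketch below is illustrative only: the function names are hypothetical, and the two threshold values are the intercepts reported in the SAS output later in the deck, used here on the assumption that they play the role of the first two thresholds, with the index x'β set to zero.

```python
import math

def logistic_cdf(z):
    """Logistic cdf F(z) = 1 / (1 + exp(-z))."""
    return 1.0 / (1.0 + math.exp(-z))

def ordered_logit_probs(xb, thresholds):
    """Return [Pr(y=0), ..., Pr(y=J)] for index xb = x'beta."""
    if any(a >= b for a, b in zip(thresholds, thresholds[1:])):
        raise ValueError("thresholds must be strictly increasing")
    # Cumulative probabilities Pr(y <= j), with Pr(y <= J) = 1 appended.
    cum = [logistic_cdf(mu - xb) for mu in thresholds] + [1.0]
    probs = [cum[0]]
    probs += [cum[j] - cum[j - 1] for j in range(1, len(cum))]
    return probs

# Thresholds taken from the reported intercepts (an assumption); index set to 0.
probs = ordered_logit_probs(0.0, [-3.2691, -1.4913])
print([round(p, 4) for p in probs], round(sum(probs), 10))
```

The explicit check on the ordering of the thresholds mirrors the requirement stated above: if two thresholds were out of order, one of the differences would be negative.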
9
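Now to get the $\beta$’s and the $\mu$’s, note the following. For the ith observation, the log-likelihood is

$$
\begin{aligned}
\log L_i ={}& I(y_i = 0)\,\log F(\mu_0 - x_i'\beta) \\
&+ I(y_i = 1)\,\log\left[F(\mu_1 - x_i'\beta) - F(\mu_0 - x_i'\beta)\right] \\
&+ \dots \\
&+ I(y_i = J)\,\log\left[1 - F(\mu_{J-1} - x_i'\beta)\right]
\end{aligned}
$$

where I(E) = 1 if event E occurs
           = 0 otherwise
Then sum over all “n” observations to get the full log-likelihood function (assuming independence). As usual, there is a unique maximum.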
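A minimal sketch of the full log-likelihood follows, assuming a logistic F, categories coded 0, ..., J and independent observations. The data, parameter values and function names are made up for illustration; a real application would pass this function to a numerical optimiser rather than evaluate it at a single point.

```python
import math

def logistic_cdf(z):
    """Logistic cdf F(z) = 1 / (1 + exp(-z))."""
    return 1.0 / (1.0 + math.exp(-z))

def ordered_logit_loglik(beta, thresholds, X, y):
    """Sum over observations of log Pr(y_i | x_i), with y_i in {0, ..., J}."""
    J = len(thresholds)
    total = 0.0
    for xi, yi in zip(X, y):
        xb = sum(xk * bk for xk, bk in zip(xi, beta))
        # Pr(y_i = j) = F(mu_j - xb) - F(mu_{j-1} - xb), with the
        # conventions F(mu_{-1} - xb) = 0 and F(mu_J - xb) = 1.
        upper = 1.0 if yi == J else logistic_cdf(thresholds[yi] - xb)
        lower = 0.0 if yi == 0 else logistic_cdf(thresholds[yi - 1] - xb)
        total += math.log(upper - lower)
    return total

# Tiny made-up data set with one regressor and three categories.
X = [[0.0], [1.0], [1.0], [0.0]]
y = [0, 1, 2, 2]
ll = ordered_logit_loglik([0.5], [-1.0, 1.0], X, y)
print(ll)
```

Only the term for the observed category contributes for each observation, which is what the indicator functions I(·) do in the formula.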
10
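As for the signs of the coefficients, we need to look carefully at the marginal effects. Recall that:

$$
\begin{aligned}
P_0 &= F(\mu_0 - x'\beta) \\
P_j &= F(\mu_j - x'\beta) - F(\mu_{j-1} - x'\beta), \qquad j = 1, \dots, J-1 \\
P_J &= 1 - F(\mu_{J-1} - x'\beta)
\end{aligned}
$$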
11
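So as for the marginal effects, with $f$ denoting the density corresponding to $F$:

$$
\begin{aligned}
\frac{\partial P_0}{\partial x_k} &= -f(\mu_0 - x'\beta)\,\beta_k \\
\frac{\partial P_j}{\partial x_k} &= \left[f(\mu_{j-1} - x'\beta) - f(\mu_j - x'\beta)\right]\beta_k \qquad \text{for } j = 1, \dots, J-1 \\
\frac{\partial P_J}{\partial x_k} &= f(\mu_{J-1} - x'\beta)\,\beta_k
\end{aligned}
$$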
12
So, the sign of the $P_0$ marginal effect is opposite to the coefficient sign; the sign of the $P_J$ marginal effect is the same as the coefficient sign; the signs of the marginal effects for the intermediate categories are ambiguous. One has to be very careful when interpreting the coefficients in this model.
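The sign pattern can be checked numerically. The sketch below assumes a logistic density f(z) = F(z)(1 − F(z)) and uses hypothetical thresholds and a hypothetical positive coefficient; it evaluates the marginal effects at two different values of the index x'β to show that the middle category's effect changes sign while the two extreme categories keep theirs.

```python
import math

def logistic_cdf(z):
    """Logistic cdf F(z) = 1 / (1 + exp(-z))."""
    return 1.0 / (1.0 + math.exp(-z))

def logistic_pdf(z):
    """Logistic density f(z) = F(z) * (1 - F(z))."""
    F = logistic_cdf(z)
    return F * (1.0 - F)

def marginal_effects(xb, thresholds, beta_k):
    """Derivatives of P_0, ..., P_J with respect to x_k at index xb."""
    dens = [logistic_pdf(mu - xb) for mu in thresholds]
    J = len(thresholds)
    effects = [-dens[0] * beta_k]
    effects += [(dens[j - 1] - dens[j]) * beta_k for j in range(1, J)]
    effects.append(dens[J - 1] * beta_k)
    return effects

thresholds = [-1.0, 1.0]   # hypothetical thresholds
beta_k = 0.5               # hypothetical positive coefficient
high = marginal_effects(2.0, thresholds, beta_k)
low = marginal_effects(-2.0, thresholds, beta_k)
print([round(e, 4) for e in high])
print([round(e, 4) for e in low])
```

In both cases the effects sum to zero across categories, since the probabilities themselves must sum to one.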
13
Example: Consider the data set from the last example, in which the dependent variable was the response to the question, “If you found a wallet on the street, would you (1) keep the wallet and the money, (2) keep the money and return the wallet, or (3) return both the wallet and the money?” There is an obvious ordering in the responses: 1 is the most unethical response, 3 is the most ethical, and 2 is in the middle.
14
Intercept 1 = $\hat{\mu}_0$ = -3.2691
Intercept 2 = $\hat{\mu}_1$ = -1.4913
• Likelihood ratio:
H0: $\beta_{MALE} = \beta_{BUSINESS} = \beta_{PUNISH} = \beta_{EXPLAIN} = 0$
H1: otherwise

$$
LR = -2\left(\ln L_R - \ln L_{UR}\right) = 352.14 - 307.367 = 44.773
$$

• Score test for the proportional odds assumption:
H0: $\beta_{model\,1} = \beta_{model\,2}$
H1: otherwise
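Both tests can be reproduced from the figures in the SAS output. The sketch below takes the two -2 Log L values (352.140 and 307.367) and the score statistic for the proportional odds assumption (5.1514) from that output and treats each statistic as chi-square with 4 degrees of freedom. The helper `chi2_sf_even_df` is a hypothetical name; it uses a closed form for the upper-tail probability that is valid only for even degrees of freedom.

```python
import math

def chi2_sf_even_df(x, df):
    """Upper-tail chi-square probability; closed form for even df only."""
    if df <= 0 or df % 2 != 0:
        raise ValueError("this closed form needs a positive even df")
    half = x / 2.0
    term, total = 1.0, 1.0
    # P(X > x) = exp(-x/2) * sum_{i=0}^{df/2 - 1} (x/2)^i / i!
    for i in range(1, df // 2):
        term *= half / i
        total += term
    return math.exp(-half) * total

neg2_logL_restricted = 352.140     # intercepts only
neg2_logL_unrestricted = 307.367   # intercepts and covariates
lr = neg2_logL_restricted - neg2_logL_unrestricted
p_lr = chi2_sf_even_df(lr, 4)

score_stat = 5.1514                # proportional odds score test
p_score = chi2_sf_even_df(score_stat, 4)

print(round(lr, 3), p_lr < 0.0001, round(p_score, 4))
```

The first p-value is far below any conventional level, so the four slopes are jointly significant; the second is well above 0.05, so the proportional odds assumption is not rejected for these data.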
15
DATA WALLET;
INFILE 'D:\TEACHING\MS4225\WALLET.TXT';
INPUT WALLET MALE BUSINESS PUNISH EXPLAIN;
PROC LOGISTIC DATA=WALLET;
MODEL WALLET=MALE BUSINESS PUNISH EXPLAIN;
RUN;
16
The SAS System
The LOGISTIC Procedure

Model Information
Data Set                     WORK.WALLET
Response Variable            WALLET
Number of Response Levels    3
Number of Observations       195
Model                        cumulative logit
Optimization Technique       Fisher's scoring

Response Profile
Ordered Value    WALLET    Total Frequency
1                1         24
2                2         50
3                3         121

Probabilities modeled are cumulated over the lower Ordered Values.

Model Convergence Status
Convergence criterion (GCONV=1E-8) satisfied.

Score Test for the Proportional Odds Assumption
Chi-Square    DF    Pr > ChiSq
5.1514        4     0.2721
17
The LOGISTIC Procedure

Model Fit Statistics
Criterion    Intercept Only    Intercept and Covariates
AIC          356.140           319.367
SC           362.686           339.005
-2 Log L     352.140           307.367

Testing Global Null Hypothesis: BETA=0
Test                Chi-Square    DF    Pr > ChiSq
Likelihood Ratio    44.7727       4     <.0001
Score               40.8753       4     <.0001
Wald                38.5746       4     <.0001

Analysis of Maximum Likelihood Estimates
Parameter      DF    Estimate    Standard Error    Wald Chi-Square    Pr > ChiSq
Intercept 1    1     -3.2691     0.5612            33.9325            <.0001
Intercept 2    1     -1.4913     0.5085            8.6012             0.0034
MALE           1     1.0636      0.3255            10.6771            0.0011
BUSINESS       1     0.7370      0.3515            4.3973             0.0360
PUNISH         1     0.6874      0.2246            9.3644             0.0022
EXPLAIN        1     -1.0452     0.3392            9.4972             0.0021

Odds Ratio Estimates
Effect      Point Estimate    95% Wald Confidence Limits
MALE        2.897             1.531    5.483
BUSINESS    2.090             1.049    4.161
PUNISH      1.989             1.280    3.089
EXPLAIN     0.352             0.181    0.684

Association of Predicted Probabilities and Observed Responses
Percent Concordant    67.7     Somers' D    0.463
Percent Discordant    21.4     Gamma        0.519
Percent Tied          10.9     Tau-a        0.248
Pairs                 10154    c            0.731
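The odds ratios, their Wald confidence limits and the Wald chi-square statistics in this output follow directly from the estimates and standard errors. The sketch below recomputes them, assuming the usual formulas exp(b), exp(b ± z·se) and (b/se)²; the dictionary layout and function name are illustrative only.

```python
import math

# (estimate, standard error) pairs copied from the output above.
estimates = {
    "MALE": (1.0636, 0.3255),
    "BUSINESS": (0.7370, 0.3515),
    "PUNISH": (0.6874, 0.2246),
    "EXPLAIN": (-1.0452, 0.3392),
}
Z_975 = 1.959964  # 97.5th percentile of the standard normal

def odds_ratio_summary(estimate, std_error):
    """Return (odds ratio, lower 95% limit, upper 95% limit, Wald chi-square)."""
    wald = (estimate / std_error) ** 2
    lower = math.exp(estimate - Z_975 * std_error)
    upper = math.exp(estimate + Z_975 * std_error)
    return math.exp(estimate), lower, upper, wald

summary = {name: odds_ratio_summary(b, se) for name, (b, se) in estimates.items()}
for name, (ratio, lower, upper, wald) in summary.items():
    print(f"{name:9s} OR={ratio:.3f}  CI=({lower:.3f}, {upper:.3f})  Wald={wald:.3f}")
```

Because the probabilities modeled are cumulated over the lower ordered values, an odds ratio above 1 here indicates higher odds of a lower-numbered (less ethical) response.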
18
DATA WALLET;
INFILE 'D:\TEACHING\MS4225\WALLET.TXT';
INPUT WALLET MALE BUSINESS PUNISH EXPLAIN;
* Collapse responses 2 and 3 so the binary logit compares 1 with 2 or 3;
DATA A;
SET WALLET;
IF WALLET=3 THEN WALLET=2;
RUN;
PROC LOGISTIC DATA=A;
MODEL WALLET = MALE BUSINESS PUNISH EXPLAIN;
RUN;
19
Analysis of Maximum Likelihood Estimates
Parameter    DF    Estimate    Standard Error    Wald Chi-Square    Pr > ChiSq
Intercept    1     -3.7153     0.8017            21.4793            <.0001
MALE         1     0.8268      0.5290            2.4426             0.1181
BUSINESS     1     1.0129      0.5142            3.8810             0.0488
PUNISH       1     1.0108      0.3075            10.8032            0.0010
EXPLAIN      1     -1.2760     0.5112            6.2311             0.0126
20
DATA WALLET;
INFILE 'D:\TEACHING\MS4225\WALLET.TXT';
INPUT WALLET MALE BUSINESS PUNISH EXPLAIN;
* Collapse responses 1 and 2 so the binary logit compares 1 or 2 with 3;
DATA A;
SET WALLET;
IF WALLET=1 THEN WALLET=2;
RUN;
PROC LOGISTIC DATA=A;
MODEL WALLET = MALE BUSINESS PUNISH EXPLAIN;
RUN;
21
Analysis of Maximum Likelihood Estimates
Parameter    DF    Estimate    Standard Error    Wald Chi-Square    Pr > ChiSq
Intercept    1     -1.3189     0.5465            5.8243             0.0158
MALE         1     1.1845      0.3408            12.0824            0.0005
BUSINESS     1     0.6357      0.3812            2.7808             0.0954
PUNISH       1     0.5071      0.2474            4.2030             0.0404
EXPLAIN      1     -1.0200     0.3662            7.7606             0.0053
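The two binary logits give an informal view of the proportional odds assumption: if it holds, the slopes from the two collapsed models should be similar, and the ordered logit slope acts as a compromise between them. The sketch below simply tabulates the three sets of reported slope estimates and checks that each ordered estimate lies between the two binary ones. The variable names are illustrative, and this is a descriptive comparison, not a replacement for the score test.

```python
# Slope estimates copied from the three SAS outputs.
ordered = {"MALE": 1.0636, "BUSINESS": 0.7370, "PUNISH": 0.6874, "EXPLAIN": -1.0452}
binary_1_vs_23 = {"MALE": 0.8268, "BUSINESS": 1.0129, "PUNISH": 1.0108, "EXPLAIN": -1.2760}
binary_12_vs_3 = {"MALE": 1.1845, "BUSINESS": 0.6357, "PUNISH": 0.5071, "EXPLAIN": -1.0200}

def between(value, a, b):
    """True if value lies in the closed interval spanned by a and b."""
    return min(a, b) <= value <= max(a, b)

rows = []
for name in ordered:
    b1, b2, b = binary_1_vs_23[name], binary_12_vs_3[name], ordered[name]
    rows.append((name, b1, b2, b, between(b, b1, b2)))
    print(f"{name:9s} {b1:8.4f} {b2:8.4f} {b:8.4f}  gap={abs(b1 - b2):.4f}")

all_between = all(row[4] for row in rows)
print(all_between)  # True
```

The gaps between the two binary slopes are modest relative to their standard errors, which is in line with the score test not rejecting proportional odds.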