Categorical Data Analysis - STK654 (Final Exam Material)
Dr. Kusman Sadik, M.Si
Master's Program in Applied Statistics
Department of Statistics, IPB, Odd Semester 2019/2020
IPB University, Bogor, Indonesia. Inspiring Innovation with Integrity
Ordinal Logit Regression Model (Multicategory Ordinal Response Variable)
The main feature of the ordinal logistic models is that
they predict the log odds, odds, or probability of a
response occurring at or below any given outcome
category.
For example, ordering the educational attainment
categories from lowest to highest (less than high
school, high school, junior college, bachelor’s degree,
graduate degree) we can use this model to predict the
probability of being (for example) at the bachelor’s
level or below from age at first marriage.
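To make the three predicted quantities concrete, they can be written out in standard notation. This is only a sketch: the symbols J (number of ordered categories) and j (category index) are assumptions introduced here, not notation recovered from the slides.

```latex
\begin{align*}
\text{cumulative probability:}\quad & P(Y \le j), \qquad j = 1, \dots, J-1 \\
\text{cumulative odds:}\quad & \frac{P(Y \le j)}{P(Y > j)} = \frac{P(Y \le j)}{1 - P(Y \le j)} \\
\text{cumulative log odds (logit):}\quad & \ln\frac{P(Y \le j)}{1 - P(Y \le j)}
\end{align*}
```

With the five educational attainment categories, J = 5, so there are four cumulative logits; "bachelor's level or below" corresponds to j = 4.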
The slopes are assumed to be the same for all logits
and, under this assumption, the model is known as
the proportional odds model.
The underlying assumption of equivalent slopes
across all logits can, and should, be tested to verify
that this model is appropriate.
If this assumption appears to be violated, then one
could fit the nominal, or more complicated alternative
models.
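The equal-slopes assumption can be sketched as a formula. The intercept symbol \alpha_j and the number of categories J are assumptions introduced here, and the minus sign in front of \beta follows the convention of the R output used in this material.

```latex
\[
\ln\frac{P(Y \le j \mid x)}{P(Y > j \mid x)} = \alpha_j - \beta x,
\qquad j = 1, \dots, J-1
\]
\[
\ln\frac{P(Y \le j \mid x+1)}{P(Y > j \mid x+1)}
- \ln\frac{P(Y \le j \mid x)}{P(Y > j \mid x)} = -\beta
\quad \text{for every } j
\]
```

Only the intercepts depend on j. A one-unit increase in x shifts every cumulative logit by the same amount, which is why the cumulative odds are said to be proportional.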
We use data from the 2006 GSS to predict a
respondent’s educational attainment level (degree),
measured as either less than high school, high school,
junior college, bachelor’s degree, or graduate degree,
from the respondent’s age when first married
(agewed).
The outcome variable (educational attainment level) is
treated as ordinal, so the proportional odds model is
used.
# Ordinal logistic model for the GSS data (Azen, section 10.5)
# Response data: must be ordered
dataku
# Estimate the probabilities for each category
prediksi
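The R statements on these two slides did not survive beyond the names `dataku` and `prediksi`. As a hedged illustration only, the sketch below fits a proportional odds model with `polr()` from the MASS package, whose summary layout resembles the output shown later. The data are simulated rather than the GSS data, and the names `dataku_sim`, `model`, the sample size, and the cut points are all assumptions made for this example.

```r
library(MASS)  # provides polr() for proportional odds models

set.seed(1)
n <- 1000

# Simulated predictor: age at first marriage (NOT the GSS data)
agewed <- sample(16:50, n, replace = TRUE)

# Latent-variable simulation: Y <= j exactly when latent <= cut point j,
# so P(Y <= j) = plogis(cut_j - 0.05 * agewed)
latent <- 0.05 * agewed + rlogis(n)
degree.order <- cut(latent,
                    breaks = c(-Inf, -0.45, 1.92, 2.29, 3.52, Inf),
                    labels = 1:5, ordered_result = TRUE)

dataku_sim <- data.frame(degree.order, agewed)

# The response must be an ordered factor; Hess = TRUE stores the Hessian
# so that summary() can report standard errors without refitting
model <- polr(degree.order ~ agewed, data = dataku_sim,
              method = "logistic", Hess = TRUE)
summary(model)

# Estimated probability of each category for a person married at age 20
prediksi <- predict(model, newdata = data.frame(agewed = 20), type = "probs")
prediksi
```

Calling `predict(model, type = "probs")` without `newdata` would instead return one row of fitted probabilities per observation, which is the shape of the probability table shown in this material.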
     degree          degree.order agewed
1    HIGH SCHOOL                2     22
2    HIGH SCHOOL                2     23
3    HIGH SCHOOL                2     24
4    HIGH SCHOOL                2     22
5    LT HIGH SCHOOL             1     28
6    LT HIGH SCHOOL             1     21
7    HIGH SCHOOL                2     29
8    LT HIGH SCHOOL             1     19
9    LT HIGH SCHOOL             1     28
10   LT HIGH SCHOOL             1     29
...
1158 HIGH SCHOOL                2     21
1159 HIGH SCHOOL                2     22
1160 BACHELOR                   4     28
Note: “degree.order” is used rather than “degree”, because “degree” is not yet ordered.
                 degree.order
degree              1   2   3   4   5
  LT HIGH SCHOOL  195   0   0   0   0
  BACHELOR          0   0   0 185   0
  GRADUATE          0   0   0   0 104
  HIGH SCHOOL       0 590   0   0   0
  JUNIOR COLLEGE    0   0  86   0   0
Coefficients:
         Value Std. Error t value
agewed 0.05059    0.01031   4.908

Intercepts:
      Value Std. Error t value
1|2 -0.4549     0.2431 -1.8711
2|3  1.9226     0.2501  7.6886
3|4  2.2940     0.2530  9.0670
4|5  3.5242     0.2682 13.1389

Residual Deviance: 3096.156
AIC: 3106.156
degree.order agewed P.Y.1 P.Y.2 P.Y.3 P.Y.4 P.Y.5
1 2 22 0.172538473 0.51950698 0.07310156 0.1525423 0.08231071
2 2 23 0.165435593 0.51572572 0.07477403 0.1578513 0.08621336
3 2 24 0.158569083 0.51150679 0.07640620 0.1632351 0.09028283
4 2 22 0.172538473 0.51950698 0.07310156 0.1525423 0.08231071
5 1 28 0.133396465 0.49051546 0.08241161 0.1853397 0.10833676
6 1 21 0.179880581 0.52283962 0.07139470 0.1473156 0.07856954
7 2 29 0.127656350 0.48431357 0.08375223 0.1909568 0.11332101
8 1 19 0.195291690 0.52812158 0.06790088 0.1371355 0.07155037
9 1 28 0.133396465 0.49051546 0.08241161 0.1853397 0.10833676
10 1 29 0.127656350 0.48431357 0.08375223 0.1909568 0.11332101
11 4 30 0.122128426 0.47776346 0.08501689 0.1965871 0.11850410
12 2 21 0.179880581 0.52283962 0.07139470 0.1473156 0.07856954
13 4 24 0.158569083 0.51150679 0.07640620 0.1632351 0.09028283
14 2 18 0.203364066 0.53005548 0.06612507 0.1321935 0.06826190
15 4 52 0.043717274 0.28635273 0.08659973 0.2930019 0.29032839
16 4 26 0.145531789 0.50180589 0.07952560 0.1741929 0.09894384
17 1 29 0.127656350 0.48431357 0.08375223 0.1909568 0.11332101
18 2 19 0.195291690 0.52812158 0.06790088 0.1371355 0.07155037
19 1 25 0.151935686 0.50686240 0.07799206 0.1686853 0.09452453
20 4 18 0.203364066 0.53005548 0.06612507 0.1321935 0.06826190
21 1 16 0.220246789 0.53248085 0.06254204 0.1226288 0.06210152
22 2 20 0.187464338 0.52571395 0.06965925 0.1421779 0.07498452
.
.
.
1160 4 28 0.133396465 0.49051546 0.08241161 0.1853397 0.10833676
SAS Output: Compare with the R Output
Model Differences among R, SPSS, and SAS
R and SPSS
SAS
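The formulas that accompanied these labels were lost in extraction. As a hedged sketch based on the documented default parameterizations (`polr` in R's MASS package, PLUM in SPSS, and PROC LOGISTIC in SAS with its default response ordering), the difference is the sign attached to the slope. The symbols \alpha_j and \beta^{SAS} are assumptions introduced here.

```latex
\begin{align*}
\text{R and SPSS:}\quad & \ln\frac{P(Y \le j)}{P(Y > j)} = \alpha_j - \beta x \\
\text{SAS:}\quad & \ln\frac{P(Y \le j)}{P(Y > j)} = \alpha_j + \beta^{SAS} x,
\qquad \beta^{SAS} = -\beta
\end{align*}
```

Under this reading the intercepts agree across programs, while the slope estimates have the same magnitude and opposite sign. The fitted probabilities are identical.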
Parameter Interpretation and Testing
Illustration of Parameter Interpretation (R Output)
Coefficients:
         Value Std. Error t value
agewed 0.05059    0.01031   4.908

Intercepts:
      Value Std. Error t value
1|2 -0.4549     0.2431 -1.8711
2|3  1.9226     0.2501  7.6886
3|4  2.2940     0.2530  9.0670
4|5  3.5242     0.2682 13.1389
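Note that the negative of β enters the model: the reported coefficient 0.051 appears with a minus sign.
For example, for Y = 1:
\[
\ln\frac{P(Y \le 1)}{P(Y > 1)} = \ln\frac{P(Y \le 1)}{1 - P(Y \le 1)} = -0.455 - 0.051\,(agewed)
\]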
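As a rough illustration of interpreting and testing the slope, the sketch below converts the reported `agewed` estimate into odds ratios and a Wald-type p-value. The variable names are assumptions, and comparing the ratio of estimate to standard error against a standard normal distribution is a common large-sample approximation, not necessarily the test presented on the original slide.

```r
# Estimate and standard error copied from the R output above
beta <- 0.05059
se   <- 0.01031

# Because the model uses -beta * x, each additional year of agewed
# multiplies the odds of being at or below category j by exp(-beta)
or_at_or_below <- exp(-beta)

# Equivalently, the odds of being above category j are multiplied by exp(beta)
or_above <- exp(beta)

# Wald statistic (estimate / SE) with a two-sided normal p-value
z <- beta / se
p_value <- 2 * pnorm(-abs(z))

round(c(or_at_or_below = or_at_or_below, or_above = or_above, z = z), 4)
p_value
```

A positive β therefore means that marrying later is associated with higher odds of a higher educational attainment category, by the same factor at every cut point.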
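Determining the Cumulative Probabilities
Derivation for Y = 1:
\[
\ln\frac{P(Y \le 1)}{P(Y > 1)} = \ln\frac{P(Y \le 1)}{1 - P(Y \le 1)} = -0.455 - 0.051x
\]
\[
\Leftrightarrow \frac{P(Y \le 1)}{1 - P(Y \le 1)} = e^{-0.455 - 0.051x}
\]
\[
\Leftrightarrow P(Y \le 1) = \bigl(1 - P(Y \le 1)\bigr)\, e^{-0.455 - 0.051x}
\]
\[
\Leftrightarrow P(Y \le 1) = \frac{e^{-0.455 - 0.051x}}{1 + e^{-0.455 - 0.051x}}
\]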
Substituting x = 20 (agewed = 20):
\[
P(Y \le 1) = \frac{e^{-0.455 - 0.051x}}{1 + e^{-0.455 - 0.051x}}
= \frac{e^{-0.455 - 0.051(20)}}{1 + e^{-0.455 - 0.051(20)}}
= \frac{0.2288}{1.2288} = 0.1862
\]
\[
P(Y \le 2) = \frac{e^{1.923 - 0.051x}}{1 + e^{1.923 - 0.051x}}
= \frac{e^{1.923 - 0.051(20)}}{1 + e^{1.923 - 0.051(20)}}
= \frac{2.4670}{3.4670} = 0.7116
\]
P(Y ≤ 3) and P(Y ≤ 4) can be computed in the same way.
Determining the Probability of Each Category
\[
P(Y = 4) = P(Y \le 4) - P(Y \le 3)
\]
The probability of each category of Y can be computed in the same way: P(Y = 1), P(Y = 2), P(Y = 3), P(Y = 4), and P(Y = 5).
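The cumulative and per-category calculations can be sketched in a few lines of base R. The estimates are copied from the R output shown earlier; the variable names `zeta`, `beta`, `cum`, and `probs` are assumptions chosen for this example.

```r
# Intercepts and slope copied from the R output
zeta <- c("1|2" = -0.4549, "2|3" = 1.9226, "3|4" = 2.2940, "4|5" = 3.5242)
beta <- 0.05059
x    <- 20   # age at first marriage

# Cumulative probabilities P(Y <= j), j = 1..4:
# plogis(u) computes exp(u) / (1 + exp(u))
cum <- plogis(zeta - beta * x)

# Category probabilities: differences of successive cumulative
# probabilities, with P(Y <= 0) = 0 and P(Y <= 5) = 1
probs <- diff(c(0, cum, 1))
names(probs) <- paste0("P.Y.", 1:5)

round(cum, 4)
round(probs, 4)
```

Because this uses the unrounded estimates, the results should agree closely with the rows for agewed = 20 in the table of predicted probabilities, and differ slightly from the hand calculation based on coefficients rounded to three decimals.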
Conclusion
Based on the estimated probabilities for each category,
if a person is known to have been 20 years old when first married,
what is that person's predicted highest level of education?
1. Use R for the Mental Impairment data (Agresti, section 7.2.4, p. 279).
a. Compare the results with the SAS output in Agresti's book and
interpret each estimated model parameter.
b. Based on the results in (a), determine the estimated values of
P(Y = 1), P(Y = 3), and P(Y > 2).
c. Determine the best model.
d. Suppose an individual is known to have Life Events (x1 = 8)
and SES (x2 = 1). Based on the model in (c), determine the
predicted “Mental Impairment”.
2. Work Problem 10.7 (Azen, 2011).
3. Work Problem 10.8 (Azen, 2011).
References
1. Azen, R. and Walker, C.R. (2011). Categorical Data
Analysis for the Behavioral and Social Sciences.
Routledge, Taylor and Francis Group, New York.
2. Agresti, A. (2002). Categorical Data Analysis, 2nd ed.
Wiley, New York.
3. Other relevant references.
Available for download at
kusmansadik.wordpress.com
Thank You