October 6, 2009 Session 6 Slide 1 PSC 5940: Running Basic Multi-Level Models in R Session 6 Fall, 2009


October 6, 2009 Session 6 Slide 1

PSC 5940: Running Basic Multi-Level Models in R

Session 6

Fall, 2009


October 6, 2009 Session 6 Slide 2

Running Multilevel Models in R

• Using lmer: “linear mixed-effects in R”

• Identify a grouping variable: “state”

• levels(state) # will show the categories:

> levels(state)
 [1] "AK" "AL" "AR" "AZ" "CA" "CO" "CT" "DC" "DE" "FL" "GA"
[12] "HI" "IA" "ID" "IL" "IN" "KS" "KY" "LA" "MA" "MD" "ME"
[23] "MI" "MN" "MO" "MS" "MT" "NC" "ND" "NE" "NH" "NJ" "NM"
[34] "NV" "NY" "OH" "OK" "OR" "PA" "RI" "SC" "SD" "TN" "TX"
[45] "UT" "VA" "VT" "WA" "WI" "WV" "WY"

Texas is element #44; Oklahoma is element #37; etc.
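The row order of the state-level tables follows levels(state), so it can help to look a position up rather than count by hand. A minimal base R sketch, assuming the 51 codes shown above; the names state_codes, tx_index, ok_index and n_groups are introduced here for illustration only:

```r
# The 51 state codes, in the order printed by levels(state) on the slide.
state_codes <- c("AK", "AL", "AR", "AZ", "CA", "CO", "CT", "DC", "DE", "FL", "GA",
                 "HI", "IA", "ID", "IL", "IN", "KS", "KY", "LA", "MA", "MD", "ME",
                 "MI", "MN", "MO", "MS", "MT", "NC", "ND", "NE", "NH", "NJ", "NM",
                 "NV", "NY", "OH", "OK", "OR", "PA", "RI", "SC", "SD", "TN", "TX",
                 "UT", "VA", "VT", "WA", "WI", "WV", "WY")

# Stand-in for the survey's grouping factor (one observation per state here).
state <- factor(state_codes, levels = state_codes)

# Position of a given state among the factor levels.
tx_index <- match("TX", levels(state))
ok_index <- match("OK", levels(state))
print(c(TX = tx_index, OK = ok_index))

# Number of groups the multilevel model will see.
n_groups <- nlevels(state)
print(n_groups)
```

Looking the index up by name also guards against counting slips when a state's row is extracted later.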


October 6, 2009 Session 6 Slide 3

Running Multilevel Models in R

• Re-name some variables for analysis

• income<-e130e_co

• educ<-e2b_edu

• Run a simple linear model for comparison:

• OLS1<-lm(income ~ educ)

lm(formula = income ~ educ)

Residuals:
    Min      1Q  Median      3Q     Max
-9.2963 -2.5845 -0.5845  1.4600 16.5934

Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept)  2.05071    0.27953   7.336 3.58e-13 ***
educ         1.17794    0.07544  15.613  < 2e-16 ***
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 3.704 on 1506 degrees of freedom
  (190 observations deleted due to missingness)
Multiple R-squared: 0.1393, Adjusted R-squared: 0.1387
F-statistic: 243.8 on 1 and 1506 DF, p-value: < 2.2e-16


October 6, 2009 Session 6 Slide 4

Running Multilevel Models in R

• For a simple-minded intercept-varying model (with no slope coefficients):

• ML1<-lmer(income ~ 1 + (1 | state))

Formula: income ~ 1 + (1 | state)
  AIC  BIC logLik deviance REMLdev
 8480 8496  -4237     8472    8474
Random effects:
 Groups   Name        Variance Std.Dev.
 state    (Intercept)  0.19588 0.44258
 Residual             15.68736 3.96073
Number of obs: 1513, groups: state, 51

Fixed effects:
            Estimate Std. Error t value
(Intercept)   6.0937     0.1304   46.75
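One way to read the two variance components in this summary is as shares of the total variance. The sketch below computes the between-state share (the intraclass correlation) from the printed numbers. The variable names are illustrative, and the calculation is an interpretation aid rather than something the slide itself reports:

```r
# Variance components copied from the lmer summary above.
var_state    <- 0.19588   # between-state variance of the intercepts
var_residual <- 15.68736  # within-state (residual) variance

# Share of the total variance that lies between states (intraclass correlation).
icc <- var_state / (var_state + var_residual)
print(round(icc, 4))
```

With these values the between-state share comes out at a little over 1 percent, so nearly all of the variation in income is within states.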


October 6, 2009 Session 6 Slide 5

Running Multilevel Models in R

• To see the fixed effect:

• fixef(ML1)

• Returns the average intercept: 6.093686

• To see the random effects:

• ranef(ML1)

• Returns each state's deviation from the mean intercept:

$state
   (Intercept)
AK  0.03582853
AL -0.34874818
AR -0.35354326
AZ -0.09795315
CA  0.74016962
CO  0.22587276
(etc.)
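The two pieces combine by simple addition: a state's own intercept is the average intercept plus that state's deviation, which is what coef(ML1)$state reports. A small sketch using only the numbers printed above; the names mean_intercept, state_dev and state_intercept are illustrative:

```r
# Values copied from fixef(ML1) and the first rows of ranef(ML1)$state.
mean_intercept <- 6.093686
state_dev <- c(AK = 0.03582853, AL = -0.34874818, AR = -0.35354326,
               AZ = -0.09795315, CA = 0.74016962, CO = 0.22587276)

# State-specific intercept = average intercept + that state's deviation.
state_intercept <- mean_intercept + state_dev
print(round(state_intercept, 3))
```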


October 6, 2009 Session 6 Slide 6

Running Multilevel Models in R

• A somewhat more interesting ML model:

• ML2<-lmer(income ~ educ + (1 | state))

• Returns a model with a fixed slope and varying intercepts. Summary gets you this:

Formula: income ~ educ + (1 | state)
  AIC  BIC logLik deviance REMLdev
 8238 8259  -4115     8224    8230
Random effects:
 Groups   Name        Variance Std.Dev.
 state    (Intercept)  0.13219 0.36357
 Residual             13.59123 3.68663
Number of obs: 1508, groups: state, 51

Fixed effects:
            Estimate Std. Error t value
(Intercept)   2.0361     0.2867   7.102
educ          1.1751     0.0757  15.524
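A rough way to gauge what the varying intercepts buy is to compare the residual spread here with that of the plain linear model. The sketch below uses the two printed values and illustrative variable names. It assumes the two figures are comparable, although lm and lmer estimate the residual variance slightly differently:

```r
# Residual spread copied from the two summaries.
sigma_ols <- 3.704     # residual standard error from lm(income ~ educ)
sigma_ml2 <- 3.68663   # residual Std.Dev. from lmer(income ~ educ + (1 | state))

# Proportional reduction in residual variance from letting intercepts vary.
reduction <- 1 - sigma_ml2^2 / sigma_ols^2
print(round(100 * reduction, 2))  # in percent
```

For these data the reduction is just under 1 percent, consistent with the small state-level variance in the summary.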


October 6, 2009 Session 6 Slide 7

Running Multilevel Models in R

• To observe the model estimates:

• fixef(ML2):

(Intercept)        educ
   2.036075    1.175145

• ranef(ML2):

$state
     (Intercept)
AK  3.310271e-02
AL -3.366027e-01
AR -2.271760e-01
AZ -1.131920e-01
CA  4.937171e-01
CO  6.491345e-02
CT  2.490139e-01

• Calculation of the intercept for Texas, taken here from row 46 (levels(state) lists TX as element #44, so verify the row, or index by name with coef(ML2)$state["TX", 1]):

• coef(ML2)$state[46,1], returns:

[1] 2.169662


October 6, 2009 Session 6 Slide 8

Running Multilevel Models in R

• To calculate the approximate 95% confidence interval (estimate ± 2 standard errors) for Texas (row 46):

• coef(ML2)$state[46,1]+c(-2,2)*se.ranef(ML2)$state[46]

[1] 1.527386 2.811937

• The approximate 95% confidence interval for the model slope is:

• fixef(ML2)["educ"]+c(-2,2)*se.fixef(ML2)["educ"]

• which returns:

[1] 1.023752 1.326537

• se.ranef() and se.fixef() come from the arm package.
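The interval arithmetic itself is just estimate ± 2 standard errors. The sketch below shows it in base R with a hypothetical helper, approx_ci, which is not part of arm or lme4. It uses the slope and the rounded standard error from the ML2 summary, so the endpoints differ from the slide in the last digits:

```r
# Hypothetical helper (not from arm or lme4): estimate +/- mult * standard error.
approx_ci <- function(est, se, mult = 2) {
  est + c(-mult, mult) * se
}

# Slope and its standard error as printed in the ML2 summary (SE rounded to 0.0757).
slope_ci_2se <- approx_ci(1.175145, 0.0757)

# Same interval with the normal-theory multiplier (about 1.96) instead of 2.
slope_ci_norm <- approx_ci(1.175145, 0.0757, mult = qnorm(0.975))

print(round(slope_ci_2se, 4))
print(round(slope_ci_norm, 4))
```

Using 2 rather than 1.96 gives a slightly wider, more conservative interval.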


October 6, 2009 Session 6 Slide 9

Running Multilevel Models in R

• A still more interesting ML model:

• ML3<-lmer(income ~ educ + (1 + educ | state))

• Returns a model with both a varying slope and intercept for each state. Summary gets you this:

Formula: income ~ educ + (1 + educ | state)
  AIC  BIC logLik deviance REMLdev
 8233 8265  -4111     8216    8221
Random effects:
 Groups   Name        Variance Std.Dev. Corr
 state    (Intercept)  0.65751 0.81087
          educ         0.13761 0.37096  -1.000
 Residual             13.36960 3.65645
Number of obs: 1508, groups: state, 51

Fixed effects:
            Estimate Std. Error t value
(Intercept)   2.1212     0.3172   6.687
educ          1.1431     0.1017  11.235


October 6, 2009 Session 6 Slide 10

Running Multilevel Models in R

• To observe the model estimates:

• fixef(ML3):

(Intercept)        educ
   2.121166    1.143087

• ranef(ML3):

$state
    (Intercept)         educ
AK -0.062791841  0.028726346
AL  1.064733054 -0.487099757
AR  0.716358907 -0.327723694
AZ -0.174025953  0.079614321
CA -0.970880883  0.444163765
CO -0.594929356  0.272171455
CT -0.951004214  0.435070481

• Calculation of the intercept and slope for Texas (row 46):

• coef(ML3)$state[46,1], returns: [1] 1.662176

• coef(ML3)$state[46,2], returns: [1] 1.353068
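The Corr of -1.000 in the ML3 summary shows up directly in these deviations: each state's intercept deviation is the same negative multiple of its slope deviation. The sketch below checks this on the seven rows printed above. The data frame re and the other names are illustrative, and only the rows shown on the slide are used:

```r
# First rows of ranef(ML3)$state, copied from the slide.
re <- data.frame(
  intercept = c(-0.062791841, 1.064733054, 0.716358907, -0.174025953,
                -0.970880883, -0.594929356, -0.951004214),
  educ      = c(0.028726346, -0.487099757, -0.327723694, 0.079614321,
                0.444163765, 0.272171455, 0.435070481),
  row.names = c("AK", "AL", "AR", "AZ", "CA", "CO", "CT")
)

# With a correlation of -1, the intercept deviation is a fixed negative
# multiple of the slope deviation, so this ratio is (nearly) constant.
ratio <- re$intercept / re$educ
print(round(ratio, 4))

# Correlation among the rows shown, and the Std.Dev. ratio from the summary.
r_shown  <- cor(re$intercept, re$educ)
sd_ratio <- 0.81087 / 0.37096
print(round(c(r_shown = r_shown, sd_ratio = sd_ratio), 4))
```

A perfectly correlated intercept and slope usually means the data cannot separate the two sources of variation, so the varying-slope estimates should be read with caution.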


October 6, 2009 Session 6 Slide 11

Workshop 1:

• Build an ML model using ideology to predict GHG risk

• Use the state variable as the group level

• How much is the model residual reduced by allowing states to vary?

• Present it to me in 20 min.


October 6, 2009 Session 6 Slide 12

BREAK


October 6, 2009 Session 6 Slide 13

Workshop 2:

• Data presentations

• Sources, characteristics

• Preliminary group-level models?


October 6, 2009 Session 6 Slide 14

For Next Week

• Read Gelman & Hill Ch. 13

• Build plots:

• Figure out how to replicate Figure 12.4 (p. 257)

• Code is shown on p. 262.

• Present your initial group-level models