October 6, 2009 Session 6 Slide 1
PSC 5940: Running Basic Multi-Level Models in R
Session 6
Fall, 2009
Running Multilevel Models in R
• Using lmer: “linear mixed-effects in R”
• Identify a grouping variable: “state”
• levels(state) # will show the categories:
> levels(state)
 [1] "AK" "AL" "AR" "AZ" "CA" "CO" "CT" "DC" "DE" "FL" "GA"
[12] "HI" "IA" "ID" "IL" "IN" "KS" "KY" "LA" "MA" "MD" "ME"
[23] "MI" "MN" "MO" "MS" "MT" "NC" "ND" "NE" "NH" "NJ" "NM"
[34] "NV" "NY" "OH" "OK" "OR" "PA" "RI" "SC" "SD" "TN" "TX"
[45] "UT" "VA" "VT" "WA" "WI" "WV" "WY"
Texas is element #44; Oklahoma is element #37; etc.
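Counting positions by eye is error-prone, so the index can also be looked up by name. A minimal sketch, assuming a stand-in `state` factor built from R's built-in `state.abb` plus "DC" (the real `state` variable comes from the course data set, which is not shown here):

```r
# Stand-in for the course's `state` variable (assumption: 50 states + DC,
# which matches the 51 levels listed above).
state <- factor(c(state.abb, "DC"))

# factor() sorts its levels alphabetically, giving the order printed above.
tx_index <- which(levels(state) == "TX")
ok_index <- which(levels(state) == "OK")
tx_index   # position of Texas among the levels
ok_index   # position of Oklahoma among the levels
```

The same name-based lookup works on the rows of `ranef()` and `coef()` output, whose row names are the state codes.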
Running Multilevel Models in R
• Rename some variables for analysis
  • income<-e130e_co
  • educ<-e2b_edu
• Run a simple linear model for comparison:
  • OLS1<-lm(income ~ educ)
  • summary(OLS1) returns:
lm(formula = income ~ educ)

Residuals:
    Min      1Q  Median      3Q     Max
-9.2963 -2.5845 -0.5845  1.4600 16.5934

Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept)  2.05071    0.27953   7.336 3.58e-13 ***
educ         1.17794    0.07544  15.613  < 2e-16 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 3.704 on 1506 degrees of freedom
  (190 observations deleted due to missingness)
Multiple R-squared: 0.1393, Adjusted R-squared: 0.1387
F-statistic: 243.8 on 1 and 1506 DF, p-value: < 2.2e-16
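Read as an equation, the fitted model above is the following. This only restates the printed coefficients; the hat notation for the predicted value is introduced here and is not on the slide.

```latex
\widehat{\text{income}}_i = 2.05071 + 1.17794\,\text{educ}_i
```

Each one-unit step up the educ scale is associated with an income score about 1.18 units higher, and educ alone accounts for about 14% of the variance (R-squared = 0.1393). Every respondent shares the same intercept regardless of state, which is the restriction the multilevel models relax.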
Running Multilevel Models in R
• For a simple-minded intercept-varying model (with no slope coefficients):
  • ML1<-lmer(income ~ 1 + (1 | state))
Formula: income ~ 1 + (1 | state)
  AIC  BIC logLik deviance REMLdev
 8480 8496  -4237     8472    8474
Random effects:
 Groups   Name        Variance Std.Dev.
 state    (Intercept)  0.19588 0.44258
 Residual             15.68736 3.96073
Number of obs: 1513, groups: state, 51

Fixed effects:
            Estimate Std. Error t value
(Intercept)   6.0937     0.1304   46.75
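The two variance components in this output can be read against the model they come from. The sketch below writes the varying-intercept model in notation in the style of Gelman & Hill (the symbols are an assumption added here, not taken from the slide), then works out the share of total variance that lies between states, often called the intraclass correlation.

```latex
\text{income}_i = \alpha_{j[i]} + \epsilon_i, \qquad
\alpha_j \sim N(\mu_\alpha, \sigma_\alpha^2), \qquad
\epsilon_i \sim N(0, \sigma_y^2)

\text{ICC} = \frac{\sigma_\alpha^2}{\sigma_\alpha^2 + \sigma_y^2}
           = \frac{0.19588}{0.19588 + 15.68736} \approx 0.012
```

On these estimates only about 1% of the variation in income is between states; nearly all of it is among respondents within the same state.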
Running Multilevel Models in R
• To see the fixed effect:
  • fixef(ML1)
  • Returns the average intercept: 6.093686
• ranef(ML1)
  • Returns each state's estimated deviation from the mean intercept:
$state
   (Intercept)
AK  0.03582853
AL -0.34874818
AR -0.35354326
AZ -0.09795315
CA  0.74016962
CO  0.22587276
(etc.)
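The two pieces combine by simple addition: a state's own intercept is the mean intercept plus that state's deviation, which is what `coef(ML1)$state` reports. A small sketch using only the numbers printed above (the object names are illustrative, not from the course code):

```r
# Values copied from the fixef(ML1) and ranef(ML1) output on the slide.
mean_intercept <- 6.093686
state_dev <- c(AK = 0.03582853, AL = -0.34874818, AR = -0.35354326,
               AZ = -0.09795315, CA = 0.74016962, CO = 0.22587276)

# Each state's own intercept is the mean intercept plus its deviation.
state_intercept <- mean_intercept + state_dev
round(state_intercept, 3)
```

States with positive deviations (California, Colorado) sit above the overall mean; those with negative deviations (Alabama, Arkansas) sit below it.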
Running Multilevel Models in R
• A somewhat more interesting ML model:
  • ML2<-lmer(income ~ educ + (1 | state))
  • Returns a model with a fixed slope and varying intercepts. Summary gets you this:
Formula: income ~ educ + (1 | state)
  AIC  BIC logLik deviance REMLdev
 8238 8259  -4115     8224    8230
Random effects:
 Groups   Name        Variance Std.Dev.
 state    (Intercept)  0.13219 0.36357
 Residual             13.59123 3.68663
Number of obs: 1508, groups: state, 51

Fixed effects:
            Estimate Std. Error t value
(Intercept)   2.0361     0.2867   7.102
educ          1.1751     0.0757  15.524
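In equation form, this model adds one common slope to the varying intercepts. The notation below is an assumption in the style of Gelman & Hill, and the comparison uses the residual variance from the intercept-only model ML1 (15.68736) reported earlier. The two fits use slightly different samples (1513 versus 1508 observations), so the ratios are approximate.

```latex
\text{income}_i = \alpha_{j[i]} + \beta\,\text{educ}_i + \epsilon_i, \qquad
\alpha_j \sim N(\mu_\alpha, \sigma_\alpha^2), \qquad
\epsilon_i \sim N(0, \sigma_y^2)

\frac{\sigma_{y,\text{ML1}}^2 - \sigma_{y,\text{ML2}}^2}{\sigma_{y,\text{ML1}}^2}
  = \frac{15.68736 - 13.59123}{15.68736} \approx 0.134
\qquad
\frac{\sigma_{\alpha,\text{ML1}}^2 - \sigma_{\alpha,\text{ML2}}^2}{\sigma_{\alpha,\text{ML1}}^2}
  = \frac{0.19588 - 0.13219}{0.19588} \approx 0.325
```

Adding educ removes roughly 13% of the within-state residual variance and roughly a third of the between-state intercept variance, which suggests that part of the state-to-state difference in income reflects differences in education.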
Running Multilevel Models in R
• To observe the model estimates:
  • fixef(ML2):
(Intercept)        educ
   2.036075    1.175145
  • ranef(ML2):
$state
     (Intercept)
AK  3.310271e-02
AL -3.366027e-01
AR -2.271760e-01
AZ -1.131920e-01
CA  4.937171e-01
CO  6.491345e-02
CT  2.490139e-01
• Calculation of the intercept for Texas (indexed here as row 46):
  • coef(ML2)$state[46,1], returns:
[1] 2.169662
  • Note: in the levels(state) listing, TX is element #44 and element #46 is VA; indexing by name, coef(ML2)$state["TX",1], avoids miscounting.
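The relationship among the three extractor functions can be checked directly: `coef()` is `fixef()` plus `ranef()`, row by row. The sketch below fits the same kind of model on simulated data, since the course data are not available here. The data frame `sim` and its columns `y`, `x`, and `grp` are hypothetical stand-ins for income, educ, and state, and the `lme4` package is assumed to be installed.

```r
library(lme4)  # provides lmer(), fixef(), ranef(), coef()

# Simulated stand-in data: 20 groups of 30 observations each.
set.seed(5940)
n_groups <- 20
n_per    <- 30
grp <- factor(rep(sprintf("g%02d", 1:n_groups), each = n_per))
a_j <- rnorm(n_groups, mean = 0, sd = 1)          # true group intercept shifts
x   <- rnorm(n_groups * n_per)
y   <- 2 + a_j[as.integer(grp)] + 1.2 * x + rnorm(n_groups * n_per, sd = 2)
sim <- data.frame(y, x, grp)

# Varying intercepts, one common slope (same structure as ML2).
fit <- lmer(y ~ x + (1 | grp), data = sim)

# Group intercepts rebuilt by hand: mean intercept + group deviation.
manual <- fixef(fit)["(Intercept)"] + ranef(fit)$grp[, "(Intercept)"]
all.equal(unname(manual), coef(fit)$grp[, "(Intercept)"])

# Looking a group up by row name rather than by position.
coef(fit)$grp["g07", "(Intercept)"]
```

Indexing by row name, as in the last line, gives the same result as a numeric index without requiring a count through the factor levels.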
Running Multilevel Models in R
• To calculate the approximate 95% confidence interval (estimate ± 2 standard errors) for the Texas intercept, using row 46 as before:
  • coef(ML2)$state[46,1]+c(-2,2)*se.ranef(ML2)$state[46]
[1] 1.527386 2.811937
• The 95% confidence interval for the model slope is:
  • fixef(ML2)["educ"]+c(-2,2)*se.fixef(ML2)["educ"]
  • which returns:
[1] 1.023752 1.326537
• se.ranef and se.fixef come from the arm package.
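Because each interval is built as estimate + c(-2, 2) * se, its midpoint is the estimate and its width is four standard errors. The sketch below uses only the printed endpoints to back out both quantities as a consistency check against the earlier output (the object names are illustrative):

```r
# Interval endpoints copied from the slide.
ci_row46 <- c(1.527386, 2.811937)
ci_slope <- c(1.023752, 1.326537)

# An interval built as estimate + c(-2, 2) * se has its centre at the
# estimate and a total width of 4 * se.
est_row46 <- mean(ci_row46)       # compare with coef(ML2)$state[46, 1]
se_row46  <- diff(ci_row46) / 4
est_slope <- mean(ci_slope)       # compare with fixef(ML2)["educ"]
se_slope  <- diff(ci_slope) / 4

round(c(est_row46, se_row46, est_slope, se_slope), 4)
```

The recovered slope standard error agrees with the 0.0757 shown in the model summary, and both midpoints agree with the point estimates reported earlier.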
Running Multilevel Models in R
• A still more interesting ML model:
  • ML3<-lmer(income ~ educ + (1 + educ | state))
  • Returns a model with both a varying slope and intercept for each state. Summary gets you this:
Formula: income ~ educ + (1 + educ | state)
  AIC  BIC logLik deviance REMLdev
 8233 8265  -4111     8216    8221
Random effects:
 Groups   Name        Variance Std.Dev. Corr
 state    (Intercept)  0.65751 0.81087
          educ         0.13761 0.37096  -1.000
 Residual             13.36960 3.65645
Number of obs: 1508, groups: state, 51

Fixed effects:
            Estimate Std. Error t value
(Intercept)   2.1212     0.3172   6.687
educ          1.1431     0.1017  11.235
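The Corr column makes more sense next to the model it belongs to. The sketch below writes the varying-intercept, varying-slope model in notation in the style of Gelman & Hill (the symbols are an assumption, not from the slide) and plugs in the reported standard deviations and correlation.

```latex
\text{income}_i = \alpha_{j[i]} + \beta_{j[i]}\,\text{educ}_i + \epsilon_i, \qquad
\epsilon_i \sim N(0, \sigma_y^2)

\begin{pmatrix} \alpha_j \\ \beta_j \end{pmatrix}
\sim N\!\left(
  \begin{pmatrix} \mu_\alpha \\ \mu_\beta \end{pmatrix},
  \begin{pmatrix}
    \sigma_\alpha^2 & \rho\,\sigma_\alpha\sigma_\beta \\
    \rho\,\sigma_\alpha\sigma_\beta & \sigma_\beta^2
  \end{pmatrix}
\right)

\hat\sigma_\alpha = 0.81087, \quad \hat\sigma_\beta = 0.37096, \quad \hat\rho = -1.000
\;\Rightarrow\;
\hat\rho\,\hat\sigma_\alpha\hat\sigma_\beta \approx -0.301
```

An estimated correlation of exactly -1.000 means the state intercept and slope deviations fall on a single line: each intercept deviation is about -(0.81087 / 0.37096), or roughly -2.19, times the slope deviation. An estimate on the boundary like this is often a sign that the data do not carry enough information to estimate the intercept and slope variation separately, so the state-level estimates deserve some caution.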
Running Multilevel Models in R
• To observe the model estimates:
  • fixef(ML3):
(Intercept)        educ
   2.121166    1.143087
  • ranef(ML3):
$state
    (Intercept)         educ
AK -0.062791841  0.028726346
AL  1.064733054 -0.487099757
AR  0.716358907 -0.327723694
AZ -0.174025953  0.079614321
CA -0.970880883  0.444163765
CO -0.594929356  0.272171455
CT -0.951004214  0.435070481
• Calculation of the intercept and slope for Texas (indexed as row 46):
  • coef(ML3)$state[46,1], returns: [1] 1.662176
  • coef(ML3)$state[46,2], returns: [1] 1.353068
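With both coefficients varying, each state has its own regression line, obtained by adding its two deviations to the two fixed effects. A sketch for California using only the values printed above (the object names and the example educ value are illustrative assumptions):

```r
# Values copied from the fixef(ML3) and ranef(ML3) output on the slide.
fixed  <- c(intercept = 2.121166, educ = 1.143087)
ca_dev <- c(intercept = -0.970880883, educ = 0.444163765)

# California's own regression line: fixed effects plus its deviations.
ca_coef <- fixed + ca_dev
ca_coef

# Predicted income for a hypothetical respondent at educ = 3
# (the value 3 is an arbitrary illustration, not taken from the data).
unname(ca_coef["intercept"] + ca_coef["educ"] * 3)
```

California's line starts lower than the average line but rises more steeply, the pattern implied by the negative intercept-slope correlation in the model summary.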
Workshop 1:
• Build ML Model using
  • Ideology to Predict GHG Risk
• Use the state variable as the group level
• How much is the model's residual variance reduced by allowing the intercepts to vary by state?
• Present it to me in 20 min.
BREAK
Workshop 2:
• Data presentations
  • Sources, characteristics
• Preliminary group-level models?
For Next Week
• Read Gelman & Hill Ch. 13
• Build plots:
  • Figure out how to replicate Figure 12.4 (p. 257)
  • Code is shown on p. 262.
• Present your initial group-level models