Identifying At-Risk Students With Two-Phased Regression Models
Jing Wang-Dahlback, Director of Institutional Research
Jonathan Shiveley, Research Analyst
Office of Institutional Research
Sacramento State
About This Study
This study focuses on 1-year and 2-year retention of first-time freshmen because, on average, 28% of the student population drops out within the first two years.
• Create early and final regression models to predict 1-year and 2-year retention based on data availability.
• Use the regression models to calculate individual student risk scores.
• Use results to initiate action through retention outreach efforts.
Trends of 1-Year and 2-Year Retention
[Figure: 2010-2012 First-time Freshmen Retention by Cohort]

| Cohort | 1-Year Retention | 2-Year Retention |
|---|---|---|
| 2010 Cohort | 82.6% | 72.7% |
| 2011 Cohort | 81.4% | 71.3% |
| 2012 Cohort | 82.2% | 72.0% |
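1-Year and 2-Year Retention by College
[Figure: bar chart of 1-year and 2-year retention by college; the college labels did not survive extraction. Values are listed in the order extracted.]
1-Year Retention: 82.2%, 81.5%, 81.5%, 84.2%, 82.1%, 84.0%, 81.9%, 80.7%
2-Year Retention: 71.7%, 71.5%, 72.8%, 69.9%, 71.5%, 73.1%, 73.2%, 71.0%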
1-Year Retention: Profile
Table 1. 1-Year Retention

| Factor | Group | Persisted after 1 year: Count | Persisted: %/Mean | Withdrew after 1 year: Count | Withdrew: Rate/Mean | Gap | Total Count | Statistical Significance* |
|---|---|---|---|---|---|---|---|---|
| Demographic Characteristics | | | | | | | | |
| Gender | Male | 2,916 | 80.2% | 719 | 19.8% | -3.1% | 3,635 | Yes |
| | Female | 4,301 | 83.4% | 859 | 16.6% | | 5,160 | |
| Race/Ethnicity | URM | 2,537 | 81.4% | 578 | 18.6% | -0.9% | 3,115 | No |
| | Non-URM | 4,680 | 82.4% | 1,000 | 17.6% | | 5,680 | |
| First Generation College Student | Yes | 2,410 | 81.5% | 548 | 18.5% | -1.0% | 2,958 | No |
| | No | 4,440 | 82.5% | 942 | 17.5% | | 5,382 | |
| Low Income (Pell Grant Eligible) | Yes | 3,864 | 82.0% | 850 | 18.0% | -0.2% | 4,714 | No |
| | No | 3,353 | 82.2% | 728 | 17.8% | | 4,081 | |
| Commuter Status | Living on Campus | 2,223 | 81.9% | 492 | 18.1% | -0.3% | 2,715 | No |
| | Commuter | 4,994 | 82.1% | 1,086 | 17.9% | | 6,080 | |
| Distance to School | | 7,217 | 26.0 | 1,578 | 29.8 | -3.8 | 8,795 | Yes |
| College Readiness | Need Remediation | 4,204 | 79.6% | 1,079 | 20.4% | -6.2% | 5,283 | Yes |
| | No Remediation | 3,013 | 85.8% | 499 | 14.2% | | 3,512 | |
| Remediation Type | English (E) | 1,339 | 83.1% | 272 | 16.9% | | 1,611 | Yes, E > B & M |
| | Math (M) | 949 | 79.3% | 248 | 20.7% | | 1,197 | |
| | Both (B) | 1,916 | 77.4% | 559 | 22.6% | | 2,475 | Yes, N > B & M |
| | None (N) | 3,013 | 85.8% | 499 | 14.2% | | 3,512 | |
| Test Scores | HS GPA | 7,195 | 3.26 | 1,572 | 3.12 | 0.14 | 8,767 | Yes |
| | SAT Verbal | 6,663 | 471 | 1,442 | 460 | 11 | 8,105 | Yes |
| | SAT Math | 6,663 | 490 | 1,442 | 476 | 14 | 8,105 | Yes |
| | EPT | 4,647 | 142 | 1,110 | 140 | 1 | 5,757 | Yes |
| | ELM | 7,217 | 29 | 1,578 | 31 | -2 | 8,795 | Yes |
| | AP Units | 729 | 8.6 | 116 | 8.9 | -0.3 | 845 | No |
* T-test or Chi-Square Test, p<.001, higher value is highlighted in yellow; p<.01, higher value is highlighted in green; p<.05, higher value is highlighted in blue.
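As a rough illustration of the chi-square test named in the footnote, the sketch below recomputes the gender comparison from the counts in Table 1. The slides do not say whether a continuity correction was applied, so the uncorrected Pearson statistic used here is an assumption, and the function name `chi_square_2x2` is hypothetical.

```python
import math

def chi_square_2x2(a, b, c, d):
    """Pearson chi-square (no continuity correction) for the table [[a, b], [c, d]]."""
    n = a + b + c + d
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # For 1 degree of freedom, the chi-square upper tail equals erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value

# Rows: Male, Female. Columns: persisted, withdrew (counts from Table 1).
chi2, p_value = chi_square_2x2(2916, 719, 4301, 859)
print(f"chi2 = {chi2:.2f}, p = {p_value:.5f}")
```

Under these assumptions the p-value falls below .001, which agrees with the "Yes" reported for gender.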
2-Year Retention: Profile
Table 2. 2-Year Retention

| Factor | Group | Persisted after 2 years: Count | Persisted: %/Mean | Withdrew after 2 years: Count | Withdrew: Rate/Mean | Gap | Total Count | Statistical Significance |
|---|---|---|---|---|---|---|---|---|
| Demographic Characteristics | | | | | | | | |
| Gender | Male | 2,557 | 70.3% | 1,078 | 29.7% | -2.8% | 3,635 | Yes |
| | Female | 3,773 | 73.1% | 1,387 | 26.9% | | 5,160 | |
| Race/Ethnicity | URM | 2,200 | 70.6% | 915 | 29.4% | -2.1% | 3,115 | Yes |
| | Non-URM | 4,130 | 72.7% | 1,550 | 27.3% | | 5,680 | |
| First Generation College Student | Yes | 2,113 | 71.4% | 845 | 28.6% | -1.0% | 2,958 | No |
| | No | 3,897 | 72.4% | 1,485 | 27.6% | | 5,382 | |
| Low Income (Pell Grant Eligible) | Yes | 3,384 | 71.8% | 1,330 | 28.2% | -0.4% | 4,714 | No |
| | No | 2,946 | 72.2% | 1,135 | 27.8% | | 4,081 | |
| Commuting Status | Living on Campus | 1,951 | 71.9% | 764 | 28.1% | -0.2% | 2,715 | No |
| | Commuter | 4,379 | 72.0% | 1,701 | 28.0% | | 6,080 | |
| Distance to School | | 6,330 | 25.9 | 2,465 | 28.9 | -3.0 | 8,795 | Yes |
| College Readiness | Need Remediation | 3,632 | 68.7% | 1,651 | 31.3% | -8.1% | 5,283 | Yes |
| | No Remediation | 2,698 | 76.8% | 814 | 23.2% | | 3,512 | |
| Remediation Type | English (E) | 1,204 | 74.7% | 407 | 25.3% | | 1,611 | Yes, E > B & M |
| | Math (M) | 816 | 68.2% | 381 | 31.8% | | 1,197 | |
| | Both (B) | 1,612 | 65.1% | 863 | 34.9% | | 2,475 | Yes, N > B & M |
| | None (N) | 2,698 | 76.8% | 814 | 23.2% | | 3,512 | |
| Test Scores | HS GPA | 6,312 | 3.28 | 2,455 | 3.13 | 0.15 | 8,767 | Yes |
| | SAT Verbal | 5,844 | 473 | 2,261 | 461 | 12 | 8,105 | Yes |
| | SAT Math | 5,844 | 492 | 2,261 | 476 | 16 | 8,105 | Yes |
| | EPT | 4,031 | 142 | 1,726 | 141 | 1 | 5,757 | Yes |
| | ELM | 6,330 | 29 | 2,465 | 31 | -2 | 8,795 | Yes |
| | AP Units | 638 | 8.4 | 207 | 9.3 | -0.8 | 845 | No |
Highlights of 1-Year & 2-Year Retention Profiles
Demographic Characteristics
Of the six demographic factors examined, only two or three were significantly related to retention: gender and distance to school for 1-year retention, plus underrepresented minority status for 2-year retention.
College Readiness
All factors except AP units were significantly related to 1-year and 2-year retention. Remediation is a key factor: students who needed remediation were retained at rates 6 to 8 percentage points lower than students who did not (gaps of -6.2% and -8.1%).
1-Year Retention: Academic Performance
Table 3. Academic Performance (By the end of first year)

| Factor | Group | Persisted after 1 year: Count | Persisted: %/Mean | Withdrew after 1 year: Count | Withdrew: Rate/Mean | Gap | Total Count | Statistical Significance* |
|---|---|---|---|---|---|---|---|---|
| Term 1 GPA | | 7,217 | 2.97 | 1,578 | 1.96 | 1.01 | 8,795 | Yes |
| Term 2 GPA | | 7,157 | 2.91 | 1,203 | 1.79 | 1.12 | 8,360 | Yes |
| Pass Rate (Overall GPA>=2.0) | Pass | 6,751 | 91.0% | 666 | 9.0% | 57.2% | 7,417 | Yes |
| | Not Pass | 466 | 33.8% | 912 | 66.2% | | 1,378 | |
| Dean's List (Overall GPA>=3.0) | Yes | 3,461 | 92.7% | 274 | 7.3% | 18.5% | 3,735 | Yes |
| | No | 3,756 | 74.2% | 1,304 | 25.8% | | 5,060 | |
| STEM Major | Yes | 1,652 | 82.6% | 349 | 17.4% | 0.6% | 2,001 | No |
| | No | 5,565 | 81.9% | 1,229 | 18.1% | | 6,794 | |
| Major Status | Declared Major | 3,447 | 81.9% | 761 | 18.1% | | 4,208 | No |
| | Pre-Major | 2,757 | 82.6% | 581 | 17.4% | | 3,338 | |
| | Undecided | 1,013 | 81.1% | 236 | 18.9% | | 1,249 | |
| Changed Major | Changed | 659 | 61.0% | 422 | 39.0% | -24.1% | 1,081 | Yes |
| | No Change | 6,558 | 85.0% | 1,156 | 15.0% | | 7,714 | |
| Repeating Courses | Yes | 470 | 62.9% | 277 | 37.1% | -20.9% | 747 | Yes |
| | No | 6,747 | 83.8% | 1,301 | 16.2% | | 8,048 | |
| Unit Completion | Units Attempted | 7,157 | 27 | 1,203 | 26 | 1 | 8,360 | Yes |
| | Units per term | | 14 | | 13 | 1 | | |
| | Units Completed | 7,157 | 26 | 1,203 | 18 | 8 | 8,360 | Yes |
| | Units per term | | 13 | | 9 | 4 | | |
| | Overall Units | 7,157 | 26 | 1,203 | 17 | 9 | 8,360 | Yes |
| | Units per term | | 13 | | 8 | 5 | | |

* T-test, Chi-Square Test or ANOVA, p<.001, higher value is highlighted in yellow; p<.01, higher value is highlighted in green; p<.05, higher value is highlighted in blue.
Note: STEM majors and declared majors/pre-majors/undecided were based on status at the second semester. Major change refers to changes which occurred between the first and second semester.
2-Year Retention: Academic Performance
Table 4. Academic Performance (By the end of second year)

| Factor | Group | Persisted after 2 years: Count | Persisted: %/Mean | Withdrew after 2 years: Count | Withdrew: Rate/Mean | Gap | Total Count | Statistical Significance* |
|---|---|---|---|---|---|---|---|---|
| Term 3 GPA | | 6,243 | 2.93 | 974 | 2.31 | 0.62 | 7,217 | Yes |
| Term 4 GPA | | 6,224 | 2.92 | 714 | 2.32 | 0.59 | 6,938 | Yes |
| Pass Rate (Overall GPA>=2.0) | Pass | 6,133 | 83.7% | 1,193 | 16.3% | 70.3% | 7,326 | Yes |
| | Not Pass | 197 | 13.4% | 1,272 | 86.6% | | 1,469 | |
| Dean's List (Overall GPA>=3.0) | Yes | 2,802 | 86.0% | 457 | 14.0% | 22.3% | 3,259 | Yes |
| | No | 3,528 | 63.7% | 2,008 | 36.3% | | 5,536 | |
| STEM Major | Yes | 1,398 | 72.0% | 543 | 28.0% | 0.0% | 1,941 | No |
| | No | 4,932 | 72.0% | 1,922 | 28.0% | | 6,854 | |
| Major Status | Declared Major | 3,104 | 72.9% | 1,152 | 27.1% | | 4,256 | Yes, Major & Pre > Undecided |
| | Pre-Major | 2,381 | 72.2% | 919 | 27.8% | | 3,300 | |
| | Undecided | 845 | 68.2% | 394 | 31.8% | | 1,239 | |
| Changed Major | Changed | 0 | 0.0% | 0 | 0.0% | 0.0% | 0 | No |
| | No Change | 0 | 0.0% | 0 | 0.0% | | 0 | |
| Repeating Courses | Yes | 1,698 | 72.1% | 657 | 27.9% | 0.2% | 2,355 | No |
| | No | 4,632 | 71.9% | 1,808 | 28.1% | | 6,440 | |
| Unit Completion | Units Attempted | 6,224 | 54 | 714 | 51 | 3 | 6,938 | Yes |
| | Units per term | | 13 | | 13 | 1 | | |
| | Units Completed | 6,224 | 51 | 714 | 36 | 15 | 6,938 | Yes |
| | Units per term | | 13 | | 9 | 4 | | |
| | Overall Units | 6,224 | 52 | 714 | 41 | 11 | 6,938 | Yes |
| | Units per term | | 13 | | 10 | 3 | | |

* T-test, Chi-Square Test or ANOVA, p<.001, higher value is highlighted in yellow; p<.01, higher value is highlighted in green; p<.05, higher value is highlighted in blue.
Note: STEM majors and declared majors/pre-majors/undecided were based on status at the second semester. Major change refers to changes which occurred between the first and second semester.
1-Year Retention: Intervention
[Figure: Intervention: 1-Year Retention Rate of Participants and Non-Participants. Values are paired with labels in the order extracted.]

| Intervention | Participants | Non-participants |
|---|---|---|
| On Financial Aid (vs. Not on Aid) | 82.3% | 81.6% |
| EOP Freshmen Seminar | 86.1% | 81.3% |
| EOP Learning Community | 85.7% | 81.0% |
| Equity Programs | 86.7% | 80.7% |
| Freshmen Seminar | 85.2% | 80.5% |
| Learning Community | 86.1% | 80.5% |
2-Year Retention: Intervention
[Figure: Intervention: 2-Year Retention Rate of Participants and Non-Participants. Values are paired with labels in the order extracted.]

| Intervention | Participants | Non-participants |
|---|---|---|
| On Financial Aid (vs. Not on Aid) | 86.6% | 57.7% |
| EOP Freshmen Seminar | 73.3% | 71.5% |
| EOP Learning Community | 73.5% | 71.3% |
| Equity Program | 75.5% | 70.8% |
| Freshmen Seminar | 75.8% | 70.1% |
| Learning Community | 77.2% | 70.0% |
The Development of Regression Models
1. Identify variables (up to 36)
• Literature review
• Data availability
2. Select variables (18-19)
• Correlation
• Collinearity
• Missing values
3. Develop regression models
• Early models
• Final models
• Trim outliers
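The variable-selection step lists correlation and collinearity checks without giving details. A minimal sketch of one common screen is shown below: compute pairwise Pearson correlations between candidate predictors and flag highly correlated pairs. The sample values, the 0.8 cutoff, and the helper names are all assumptions for illustration, not taken from the study.

```python
import math
from itertools import combinations

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length numeric lists."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)

def flag_collinear(columns, cutoff=0.8):
    """Return pairs of column names whose absolute correlation exceeds the cutoff."""
    return [
        (a, b)
        for a, b in combinations(columns, 2)
        if abs(pearson(columns[a], columns[b])) > cutoff
    ]

# Hypothetical values for three candidate predictors (not real student data).
candidates = {
    "SAT_Verbal": [400, 450, 500, 550, 600],
    "SAT_Math": [410, 460, 505, 560, 615],
    "Distance": [50, 10, 40, 20, 30],
}
flagged = flag_collinear(candidates)
print(flagged)
```

Pairs flagged this way are candidates for dropping or combining before the regression models are fitted.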
Early Model: 1-Year Retention
Table 5. Regression Model: 1-Year Retention (Early Model)

| Predictor Variables | B | S.E. | Wald | df | Sig. | Exp(B) | Odds Ratio (recalculated) | Rank |
|---|---|---|---|---|---|---|---|---|
| High School GPA | .260 | .132 | 3.867 | 1 | .049 | 1.30 | 1.30 | 4 |
| Fulltime (first term) | -.478 | .189 | 6.386 | 1 | .012 | 0.62 | 1.61 | 2 |
| Overall GPA | 1.534 | .066 | 546.739 | 1 | .000 | 4.636 | 4.64 | 1 |
| Overall Units | .041 | .012 | 12.154 | 1 | .000 | 1.042 | 1.04 | |
| 1st year repeater | .439 | .139 | 9.924 | 1 | .002 | 1.552 | 1.55 | 3 |
| Constant | -3.044 | .490 | 38.568 | 1 | .000 | .048 | | |

Model Indicators

| Indicator | Value | Indicator | Value |
|---|---|---|---|
| Baseline P* | 82.1% | Chi-Square (df) | 1632.985 (17) |
| Model N | 5,577 | Pseudo R2 | .254 - .442 |
| -2log L | 3124.886 | % Correctly predicted | 83.3% |
* Refers to 1-year retention rate.
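The "Odds Ratio (recalculated)" column appears to be Exp(B) inverted whenever B is negative, so that every predictor can be ranked on the same scale. The sketch below reproduces two rows of Table 5 under that reading; the function name `odds_ratios` is hypothetical and the inversion rule is inferred from the table values, not stated in the slides.

```python
import math

def odds_ratios(b):
    """Return (Exp(B), recalculated odds ratio) for a logistic coefficient B."""
    exp_b = math.exp(b)
    # Invert ratios below 1 so that effect sizes are comparable regardless of sign.
    recalculated = exp_b if exp_b >= 1 else 1 / exp_b
    return exp_b, recalculated

for name, b in [("Fulltime (first term)", -0.478), ("Overall GPA", 1.534)]:
    exp_b, recalculated = odds_ratios(b)
    print(f"{name}: Exp(B) = {exp_b:.3f}, recalculated = {recalculated:.2f}")
```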
Final Model: 1-Year Retention
Table 6. Regression Model: 1-Year Retention (Final Model)

| Predictor Variables | B | S.E. | Wald | df | Sig. | Exp(B) | Odds Ratio (recalculated) | Rank |
|---|---|---|---|---|---|---|---|---|
| Underrepresented Minority | -.306 | .149 | 4.195 | 1 | .041 | .737 | 1.36 | 5 |
| Need remediation | -.426 | .170 | 6.238 | 1 | .013 | .653 | 1.53 | 4 |
| High School GPA | -.455 | .192 | 5.614 | 1 | .018 | .634 | 1.58 | 2 |
| Distance to the University | -.002 | .001 | 4.813 | 1 | .028 | .998 | 1.00 | |
| Learning Community | -.453 | .228 | 3.931 | 1 | .047 | .636 | 1.57 | 3 |
| Overall GPA | 3.504 | .156 | 503.535 | 1 | .000 | 33.252 | 33.25 | 1 |
| Overall Units | .036 | .013 | 8.414 | 1 | .004 | 1.037 | 1.04 | |
| Constant | -4.323 | .748 | 33.406 | 1 | .000 | .013 | | |

Model Indicators

| Indicator | Value | Indicator | Value |
|---|---|---|---|
| Baseline P* | 82.1% | Chi-Square (df) | 2239.184 (18) |
| Model N | 5,293 | Pseudo R2 | .345 - .675 |
| -2log L | 1551.824 | % Correctly predicted | 91.3% |

* Refers to 1-year retention rate.
Preliminary and Predicted 1-Year Retention Rate (2014 Cohort)
[Figure: two bar charts, 1-Year Retention (Early Model) and 1-Year Retention (Final Model), comparing preliminary and predicted rates]

| Model | Preliminary: Withdrew | Preliminary: Persisted | Predicted: Withdrew | Predicted: Persisted |
|---|---|---|---|---|
| Early Model | 18.5% | 81.5% | 10.6% | 89.4% |
| Final Model | 18.5% | 81.5% | 15.8% | 84.2% |
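Prediction Results (2014 Cohort)
Preliminary and Predicted 1-Year Retention Rate: Early Model

| Predicted | Preliminary: Withdraw | Preliminary: Persist | Total |
|---|---|---|---|
| Withdraw | 241 | 149 | 390 |
| Persist | 441 | 2,864 | 3,305 |
| Total | 682 | 3,013 | 3,695 |

| | Withdraw | Persist |
|---|---|---|
| Preliminary | 18.5% | 81.5% |
| Predicted | 10.6% | 89.4% |
| Differ | 7.9% | -7.9% |

Overall Correctly predicted: 84%

Preliminary and Predicted 1-Year Retention Rate: Final Model

| Predicted | Preliminary: Withdraw | Preliminary: Persist | Total |
|---|---|---|---|
| Withdraw | 402 | 180 | 582 |
| Persist | 280 | 2,833 | 3,113 |
| Total | 682 | 3,013 | 3,695 |

| | Withdraw | Persist |
|---|---|---|
| Preliminary | 18.5% | 81.5% |
| Predicted | 15.8% | 84.2% |
| Differ | 2.7% | -2.7% |
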
Overall Correctly predicted: 87.6%
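The summary percentages follow directly from the four cells of each prediction table. As a sketch, the hypothetical helper below recomputes them for the Early Model from its reported counts.

```python
def summarize(pred_w_actual_w, pred_w_actual_p, pred_p_actual_w, pred_p_actual_p):
    """Accuracy and withdrawal rates from a 2x2 predicted-vs-preliminary table."""
    total = pred_w_actual_w + pred_w_actual_p + pred_p_actual_w + pred_p_actual_p
    return {
        # Share of students whose predicted status matches the preliminary status.
        "accuracy": (pred_w_actual_w + pred_p_actual_p) / total,
        # Share of students the model predicts will withdraw.
        "predicted_withdraw": (pred_w_actual_w + pred_w_actual_p) / total,
        # Share of students who withdrew according to preliminary data.
        "preliminary_withdraw": (pred_w_actual_w + pred_p_actual_w) / total,
    }

# Early Model, 1-year retention, 2014 cohort.
early = summarize(241, 149, 441, 2864)
print({k: f"{v:.1%}" for k, v in early.items()})
```

The same call with the Final Model counts (402, 180, 280, 2833) gives the figures reported for that model.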
1-Year Retention: The Differences Between the Early and Final Models
1. The Early Model can be used for early intervention during the mid-spring semester. The Final Model can be used to contact at-risk students during the summer before the second academic year.
2. The Early Model doesn't contain any missing values.
3. The Final Model is more accurate than the Early Model: the gap between the predicted and actual retention rates was 2.7% vs. 7.9%, and 87.6% vs. 84% of cases were predicted correctly.
4. Eighteen (18) students did not receive risk scores from the Final Model because of missing variables (i.e., the commuters did not have a home address, so the distance to school was unavailable). However, they were counted as "persisted" based on their overall GPA.
Early Model: 2-Year Retention
Table 7. Regression Model: 2-Year Retention (Early Model)

| Predictor Variables | B | S.E. | Wald | df | Sig. | Exp(B) | Odds Ratio (recalculated) | Rank |
|---|---|---|---|---|---|---|---|---|
| Remediation | -.475 | .153 | 9.669 | 1 | .002 | 0.62 | 1.61 | 5 |
| High School GPA | -.403 | .178 | 5.156 | 1 | .023 | 0.67 | 1.50 | 6 |
| Equity Programs | .342 | .172 | 3.929 | 1 | .047 | 1.407 | 1.41 | 7 |
| Overall GPA | 3.338 | .172 | 377.123 | 1 | .000 | 28.164 | 28.16 | 1 |
| Overall units | .057 | .009 | 37.324 | 1 | .000 | 1.059 | 1.06 | |
| Repeaters (two years) | -.606 | .135 | 20.090 | 1 | .000 | .545 | 1.83 | 4 |
| Changed major (4th term) | -1.034 | .372 | 7.746 | 1 | .005 | .355 | 2.82 | 3 |
| Undeclared (4th term) | -1.059 | .388 | 7.456 | 1 | .006 | .347 | 2.88 | 2 |
| Constant | -5.222 | .744 | 49.248 | 1 | .000 | .005 | | |

Model Indicators

| Indicator | Value | Indicator | Value |
|---|---|---|---|
| Baseline P* | 72.0% | Chi-Square (df) | 1248.613 (18) |
| Model N | 4,568 | Pseudo R2 | .239 - .491 |
| -2log L | 1800.528 | % Correctly predicted | 91.0% |

* Refers to 2-year retention rate.
Final Model: 2-Year Retention
Table 8. Regression Model: 2-Year Retention (Final Model)

| Predictor Variables | B | S.E. | Wald | df | Sig. | Exp(B) | Odds Ratio (recalculated) | Rank |
|---|---|---|---|---|---|---|---|---|
| Gender | -.838 | .221 | 14.354 | 1 | .000 | .433 | 2.31 | 4 |
| Remediation | -1.027 | .249 | 17.018 | 1 | .000 | .358 | 2.79 | 3 |
| High School GPA | -.681 | .286 | 5.689 | 1 | .017 | .506 | 1.98 | 5 |
| Equity Programs | .570 | .283 | 4.065 | 1 | .044 | 1.768 | 1.77 | 6 |
| Overall GPA | 5.615 | .388 | 208.958 | 1 | .000 | 274.483 | 274.48 | 1 |
| Overall units | .104 | .015 | 48.240 | 1 | .000 | 1.110 | 1.11 | 7 |
| Repeaters (two years) | -1.048 | .225 | 21.599 | 1 | .000 | .351 | 2.85 | 2 |
| Constant | -10.006 | 1.288 | 60.353 | 1 | .000 | .000 | | |

Model Indicators

| Indicator | Value | Indicator | Value |
|---|---|---|---|
| Baseline P* | 72.0% | Chi-Square (df) | 1173.137 (18) |
| Model N | 4,265 | Pseudo R2 | .240 - .682 |
| -2log L | 679.960 | % Correctly predicted | 96.3% |

* Refers to 2-year retention rate.
Preliminary and Predicted 2-Year Retention Rate (2013 Cohort)
[Figure: two bar charts, 2-Year Retention (Early Model) and 2-Year Retention (Final Model), comparing preliminary and predicted rates]

| Model | Preliminary: Withdrew | Preliminary: Persisted | Predicted: Withdrew | Predicted: Persisted |
|---|---|---|---|---|
| Early Model | 26.9% | 73.1% | 23.8% | 76.2% |
| Final Model | 26.9% | 73.1% | 23.5% | 76.5% |
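Prediction Results (2013 Cohort)
Preliminary and Predicted 2-Year Retention Rate: Early Model

| Predicted | Preliminary: Withdraw | Preliminary: Persist | Total |
|---|---|---|---|
| Withdraw | 682 | 118 | 800 |
| Persist | 222 | 2,344 | 2,566 |
| Total | 904 | 2,462 | 3,366 |

| | Withdraw | Persist |
|---|---|---|
| Preliminary | 26.9% | 73.1% |
| Predicted | 23.8% | 76.2% |
| Differ | 3.1% | -3.1% |

Overall Correctly predicted: 89.9%

Preliminary and Predicted 2-Year Retention Rate: Final Model

| Predicted | Preliminary: Withdraw | Preliminary: Persist | Total |
|---|---|---|---|
| Withdraw | 694 | 96 | 790 |
| Persist | 210 | 2,366 | 2,576 |
| Total | 904 | 2,462 | 3,366 |

| | Withdraw | Persist |
|---|---|---|
| Preliminary | 26.9% | 73.1% |
| Predicted | 23.5% | 76.5% |
| Differ | 3.4% | -3.4% |

Overall Correctly predicted: 90.9%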
2-Year Retention: The Differences Between the Early and Final Models
1. The Early Model can be used for early intervention during the mid-spring semester of the second year. The Final Model can be used to contact at-risk students during the summer before the third academic year.
2. The Early Model and the Final Model are similarly accurate. The gap between the predicted and actual retention rates is 3.1% vs. 3.4%, and 89.9% vs. 90.9% of cases were predicted correctly, respectively.
3. Two-year retention models are more accurate than one-year retention models because actual withdrawals from previous semesters are included as part of the prediction.
Calculating the Risk Score for Each Student
One-year calculation:
Early Model: 1-Year Retention Risk Score = -3.044 + 0.260*HSGPA - 0.478*Fulltimefirstterm + 1.534*Term1_GPA + 0.041*Term1_UNO + 0.439*Repeat1
Final Model: 1-Year Retention Risk Score = -4.323 - 0.306*URM - 0.426*Remed_ind - 0.455*HSGPA - 0.002*Distance - 0.453*UNIVLCommunity + 3.504*Term2_GPA + 0.036*Term2_UNO
Two-year calculation:
Early Model: 2-Year Retention Risk Score = -5.222 - 0.475*Remed_ind - 0.403*HSGPA + 0.342*Equity all + 3.338*Term3_GPA + 0.057*Term3_UNO - 0.606*Repeat2 - 1.034*MajorChange3 - 1.059*Major_und4
Final Model: 2-Year Retention Risk Score = -10.006 - 0.838*Gender1 - 1.027*Remed_ind - 0.681*HSGPA + 0.570*Equity all + 5.615*Term4_GPA + 0.104*Term4_UNO - 1.048*Repeat2
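As a sketch of how one of these formulas could be applied, the code below evaluates the Final Model for 2-year retention for a single made-up student. Several things here are assumptions rather than facts from the slides: the score is treated as the log-odds of persisting (the usual reading of a logistic regression), a cutoff of 0 is used for the predicted status, the 0/1 coding of the indicator variables is guessed, and the key `Equity_all` is simply this example's spelling of the "Equity all" term.

```python
import math

# Coefficients copied from the Final Model for 2-year retention.
FINAL_2YR = {
    "const": -10.006,
    "Gender1": -0.838,
    "Remed_ind": -1.027,
    "HSGPA": -0.681,
    "Equity_all": 0.570,  # spelled "Equity all" in the formula above
    "Term4_GPA": 5.615,
    "Term4_UNO": 0.104,
    "Repeat2": -1.048,
}

def risk_score(student, coefs=FINAL_2YR):
    """Linear predictor: constant plus coefficient times value for each variable."""
    return coefs["const"] + sum(
        coefs[name] * student[name] for name in coefs if name != "const"
    )

def persist_probability(score):
    """Logistic transform of the score (assumes the score is a log-odds)."""
    return 1.0 / (1.0 + math.exp(-score))

# Hypothetical student; indicator coding (1 = yes) is assumed.
student = {
    "Gender1": 1, "Remed_ind": 1, "HSGPA": 3.0, "Equity_all": 0,
    "Term4_GPA": 1.7, "Term4_UNO": 29, "Repeat2": 1,
}
score = risk_score(student)
predicted_persist = 1 if score >= 0 else 0  # assumed cutoff at probability 0.5
print(round(score, 2), round(persist_probability(score), 3), predicted_persist)
```

A negative score corresponds to a persistence probability below 0.5, so this hypothetical student would be flagged as at risk under the assumed cutoff.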
Identify Students at Risk by Using the Final Models
2014 Cohort: 1-Year Retention (N = 3,695)
1. 582 students were identified as at risk of not returning in Fall 2015, including 223 actual withdrawals before Spring 2015.
2. After checking the current registration status (as of 6/29), those who had registered for Fall 2015 were included in the contact list.
2013 Cohort: 2-Year Retention (N = 3,366)
1. 790 students were identified as at risk of not returning in Fall 2015, including 141 actual withdrawals before Spring 2015.
2. After checking the current registration status (as of 6/29), those who had registered for Fall 2015 were included in the contact list.
Intervention During Summer 2015
First Group of Students
• Enrolled in Spring 2015 with a high risk score
• May or may not register for Fall 2015
• Need to encourage them to register for Fall 2015
Second Group of Students
• Withdrew or stopped out during Spring 2015
• Have not registered for Fall 2015
• Need to recruit them back in Fall 2015
Third Group of Students
• Withdrew or stopped out at least a year ago
• Must reapply to the University if they plan to come back
• Need to provide guidelines outlining the admission procedure
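Contact Lists for Intervention
2013 Cohort

| Term4 Dept. | Registered | Term4 ENR | Term3 ENR | Term2 ENR | Ret2_Score | Ret2_Pre | Term4_GPA | Term4_UNO |
|---|---|---|---|---|---|---|---|---|
| ART | 0 | 1 | 1 | 1 | -1.39 | 0 | 1.7 | 29 |
| COMS | 1 | 1 | 1 | 1 | -2.78 | 0 | 1.6 | 29 |
| COMS | 1 | 1 | 1 | 1 | -1.92 | 0 | 1.83 | 18 |
| COMS | 1 | 1 | 1 | 1 | -1.08 | 0 | 1.78 | 27 |
| COMS | 0 | 1 | 1 | 1 | -0.96 | 0 | 1.94 | 29 |
| DOD | 0 | 1 | 1 | 1 | -0.03 | 0 | 2.1 | 21 |
| HIST | 0 | 1 | 1 | 1 | -1.22 | 0 | 1.55 | 36 |
| PHIL | 1 | 1 | 1 | 1 | -2.35 | 0 | 1.52 | 27 |
| THEA | 1 | 1 | 1 | 1 | -0.56 | 0 | 1.48 | 30 |
| BUS | 0 | 1 | 1 | 1 | -3.95 | 0 | 1.44 | 16 |

2014 Cohort

| Term2 Dept. | Registered | Term2 ENR | Ret1 Score | Ret1 Pre | Term2 GPA | Term2 UNO |
|---|---|---|---|---|---|---|
| ART | 0 | 1 | -2.11 | 0 | 1.33 | 6 |
| ART | 1 | 1 | -1.37 | 0 | 1.46 | 12 |
| COMS | 0 | 1 | -6.71 | 0 | 0 | 0 |
| COMS | 0 | 1 | -6.53 | 0 | 0 | 0 |
| COMS | 0 | 1 | -5.47 | 0 | 0 | 0 |
| COMS | 0 | 1 | -4.72 | 0 | 0.38 | 6 |
| COMS | 0 | 1 | -3.52 | 0 | 0.6 | 9 |
| COMS | 0 | 1 | -1.51 | 0 | 1.18 | 15 |
| COMS | 0 | 1 | -1.5 | 0 | 1.22 | 12 |
| COMS | 0 | 1 | -1.48 | 0 | 1.14 | 14 |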
The Quality of Prediction Models for Retention
High percentage of cases correctly predicted overall:
i. Early Model: 84% correctly predicted for 1-year retention when used to predict retention for the 2014 cohort.
ii. Final Model: 88% correctly predicted for 1-year retention when used to predict retention for the 2014 cohort.
iii. Early Model: 90% correctly predicted for 2-year retention when used to predict retention for the 2013 cohort.
iv. Final Model: 91% correctly predicted for 2-year retention when used to predict retention for the 2013 cohort.
v. All the results will need to be re-checked by using the Fall 2015 census files (currently not available).
Discussion: Unsolved Issues
The following issues with the Regression Models need to be addressed or resolved:
1. Negative correlation between high school GPA and 1-year or 2-year retention in three of the four models.
2. Negative correlation between the Learning Community and 1-year retention in the Final Model.
3. Overall unit completion has a low odds ratio compared to other predictors even though it is still a powerful predictor of retention.
4. When the regression models are used to predict retention for different cohorts, accuracy decreases slightly by year. For example, the 1-year retention models were 1% to 2% more accurate for the 2013 cohort than for the 2014 cohort.
5. It is difficult to predict if or when students will return after they have stopped out for one or more semesters, due to a lack of information.
Questions?
Contact Information:
Jing Wang-Dahlback, Director of Research
Office of Institutional Research
California State University, Sacramento
Email: [email protected]
Sacramento State OIR Website:
http://www.csus.edu/oir/