Using Predictive Analytics to Track Students: Evidence from a Multi-College Experiment∗
Elisabeth Barnett,† Peter Bergman,‡ Elizabeth Kopko,† Vikash Reddy†
EARLY DRAFT VERSION: NOT FOR DISTRIBUTION
Tracking is widespread in U.S. education. In higher education alone, at least 71% of post-secondary institutions use a test to track students, and more than 80% of these institutions use a test as the sole criterion to determine placement. While recent research has shown that tracking can have positive effects on student learning, inaccurate placement has consequences: students face misaligned curricula and must pay tuition for remedial courses that do not bear credits toward graduation. We develop an algorithm to place students that combines multiple measures with predictive analytics. We then conduct an experiment across multiple colleges to evaluate its impact. Compared to colleges’ most commonly-used placement test, the algorithm is more predictive of future performance and substantially increases placements into college-level courses. This is particularly true for English courses and for female, Black and Hispanic students. The algorithm tends to predict pass rates more accurately in math than English.
∗ The research reported here was undertaken through the Center for the Analysis of Postsecondary Readiness and supported by the Institute of Education Sciences, U.S. Department of Education, through Grant R305C140007 to Teachers College, Columbia University. The opinions expressed are those of the authors and do not represent views of the Institute or the U.S. Department of Education. † Community College Research Center. ‡ Teachers College, Columbia University. Email address: [email protected].
1. Introduction
Tracking students by prior test scores is widespread in U.S. education. In higher
education alone, 60% of incoming college freshmen take a remedial course in math or
English (National Center for Public Policy and Higher Education & Southern Regional
Education Board, 2010), at least 71% of post-secondary institutions use a test to track
students, and more than 80% of these institutions use a test as the sole criterion to
determine placement (Fields & Parsad, 2012). These rates are higher in two-year colleges,
which enroll 40% of post-secondary students and graduate 39% of their students (Fields &
Parsad, 2012; Chen & Simone, 2016).1 Prior research has shown large potential benefits of
tracking (Card & Giuliano, 2016; Duflo, Dupas, & Kremer, 2011), but inaccurate
placement has consequences: students face misaligned curricula and must pay tuition for
remedial courses that do not bear credits toward graduation. The benefits of tracking are
predicated on the validity, reliability, and fairness of the tests used to place students
(Heubert & Hauser, 1999).2
In practice, however, the tests used for tracking may lack these characteristics. For
example, using data from the University of California system, Rothstein (2004) finds that
a significant part of the SAT’s predictive value derives from its correlations with student
demographics, and that GPA is an important additional predictor of future college
performance. In the community college setting, Scott-Clayton et al. (2014) find that
placement scores explain little of the variation in college performance, and this validity
1 This is the graduation rate within six years.
2 Validity examines how well the test measures what its users intend it to measure, such as a student’s mastery of a subject area. Reliability assesses the reproducibility of the test’s results; that is, if a student were to take the test on multiple occasions, then they would receive similar scores each time. Fairness, which is closely related to validity, is the ability of the test to provide valid results for all subgroups. For instance, a test may inaccurately measure the math skills of a certain subgroup by relying on culturally-specific language in a word problem. See Heubert & Hauser (1999).
varies by subgroup. This could impede the ability of colleges to align curricula to
students’ abilities. Scott-Clayton et al. show how using multiple student characteristics,
especially high school GPA, more accurately predicts college performance than test scores
alone. These findings contribute to a concern from colleges about the fairness of using a
test score as the sole criterion for placement (Saxon & Morante, 2014). Given that
placement tests aim to predict students’ readiness for college-level courses, using multiple
student characteristics, such as high school GPA, to predict course performance could
mitigate concerns about validity and fairness (Mullainathan & Spiess, 2017).
In this paper, we develop and test a placement algorithm that combines multiple
measures and predictive analytics. To do so, we recruited seven community colleges and
gathered historical data on their students to estimate models predicting students’
likelihood of passing college-level math and English courses. These predictions
incorporated measures such as placement-exam scores, high school GPA, diploma status,
and time since high school graduation. We created college-specific placement algorithms
for math and English that placed students into a remedial course if a student’s predicted
probability of passing a college-level course was below a cut point chosen by each college.
Students were randomly assigned to either colleges’ business-as-usual placement system or
the placement algorithm.
At the outset, it is unclear what impacts this algorithm will have on placement
outcomes. Measures such as high school GPA can reflect a wider array of cognitive and
non-cognitive skills than test scores alone (Kautz, et al., 2014; Kautz & Zanoni, 2014;
Borghans, Golsteyn, & Heckman, 2016). GPA also has a high degree of reliability (Bacon
& Bean, 2006). Improving the validity and the reliability of the placement instrument
could help colleges align course curricula to students’ abilities. If colleges hold placement
rates constant, this should improve students’ course outcomes. However, a higher-quality
placement instrument does not imply that placement rates will change either on net or for
a given individual. At particular cut points (e.g. the extremes) it is possible that the
placements assigned by the algorithm and the test score will be the same. Furthermore,
colleges can choose the cut points for placement into the college-level courses, which
affects the number of students placed into these courses and their expected pass rates
conditional on placement.
This paper presents our findings on how five of the seven participating colleges
implemented the placement algorithm, how it affected students’ placement outcomes, and
what impacts this had on first-semester college performance.3 We find that colleges
generally chose cut points to either hold placement rates constant or to hold pass rates
constant. Overall, placement rates changed significantly: relative to colleges’ business as
usual, 21% of math placements changed and 48% of English placements changed. Of all
students, the algorithm placed 14% into a higher-level math class and 7% into a lower-
level math course compared to what would have occurred under the placement test. These
numbers were 41% and 7% for English placements.
These changes led to increases in enrollment and pass rates in college-level courses.
Placement via the algorithm increased enrollment in college-level math and English
courses by 5 percentage points and 19 percentage points, respectively. Treatment group
students were also 3 percentage points more likely to pass college-level math and 13
percentage points more likely to pass college-level English.
We also find evidence that the placement algorithm significantly narrowed certain
demographic gaps in placement rates. Even after adjusting for multiple testing,
placement by the algorithm significantly increased the representation of female students
in college-level math courses by 8 percentage points, the representation of Black students
3 The remaining two colleges started student intake after Fall 2016, so their data will not be available until 2019.
in college-level English placement by 18 percentage points (relative to white students),
and the representation of Hispanic students in college-level English placement by 13
percentage points (relative to white students).
Lastly, the algorithm more accurately predicts pass rates in math courses than English
courses. Actual pass rates were 2 percentage points lower than predicted in math and 7
percentage points lower than predicted in English. This pattern matches our findings from
the development of the algorithm: our measures of model fit were better for predicting
passing math grades than passing English grades. With richer high school transcript
information and fewer implementation constraints, which restricted the set of models we
could use, it is possible this accuracy could be improved.
Our research contributes to a broader literature that focuses on the effects of tracking.
Oakes (1985) argued that the evidence on tracking is inconsistent, and, in practice,
higher-track classes tend to have higher-quality classroom experiences than lower-track
classes. More recently, Duflo, Dupas, & Kremer (2011) randomized students in Kenya to
schools that either tracked students by test scores or assigned students randomly to
classrooms. They found that test scores in schools with tracking improved relative to the
control group, both for students placed in the higher-scoring and the lower-scoring tracks.
Card and Giuliano (2016) studied a district policy in which students are placed into
classrooms based on their test scores. This program caused large increases in the test
scores of Black and Hispanic students.
A number of studies look at the effects of being placed into a higher track versus a
lower track (as opposed to the effects of implementing a tracking system). Bui, Craig, and
Imberman (2014) and Card and Giuliano (2014) find that gifted students’ placement into
advanced coursework does not change test scores. However, Cohodes (2015) and Chan
(2018) find increases in enrollment in advanced high-school coursework and college.4 In
higher education, the evidence that placement into college-level courses improves
academic outcomes for marginal students is more mixed, and several regression-
discontinuity analyses find no effects (Calcagno & Long 2008; Bettinger & Long 2009;
Boatman & Long 2010; Martorell and McFarlin 2011; Allen & Dadgar 2012; Hodara 2012;
Scott-Clayton & Rodriguez, 2015).
Recently, economists have argued that data-driven algorithms can improve human
decision making and reduce biases (Mullainathan & Spiess, 2017). Kleinberg et al. (2017)
show that a machine-learning algorithm has the potential to reduce bias in bail decisions
relative to judges’ decisions alone. At the same time, others are concerned that these
algorithms could embed biases into decision making and exacerbate inequalities (Eubanks,
2018). We contribute to this literature by experimentally comparing the impacts of a
data-driven algorithm to another quantitative measure, test scores; we find that
placement by the algorithm tends to reduce certain gaps in college placement outcomes
across demographic subgroups.
To the extent that colleges want to implement such a placement algorithm, any
potential benefits should be weighed against its costs. We estimate that the cost per
student in the initial year of the study—above and beyond the business as usual—is $110.
The first year of implementation involves a number of fixed costs, and in subsequent years
we estimate operating costs of an additional $40 per student, which could be
significantly lower depending on how colleges collect historical data from students (e.g.
moving away from hand-data entry) and what technology they use to implement the
placement system. Colleges asked whether further savings could arise by not paying to use
4 Several other studies look at the effects of placing into high-test score schools and the results are much more mixed (Jackson 2010; Pop-Eleches and Urquiola 2013; Abdulkadiroglu, Angrist, and Pathak 2014; Dobbie and Fryer 2014).
copyrighted exams. We examine the extent to which the algorithm would place students
differently if test scores were not used for prediction. We find that placement rates would
change substantially more for math courses than for English courses, where only 5%-8% of
placements would change.
The rest of this paper proceeds as follows. Section 2 provides further background
information about tracking in postsecondary institutions and study implementation.
Section 3 describes the experimental design, data and empirical strategies. Section 4
presents our findings. Section 5 provides a detailed cost analysis, and Section 6 concludes.
2. Background, Site Recruitment, Algorithm Implementation
Remedial education represents a significant aspect of the public higher education system,
both in terms of enrollment and cost. In the 2011-12 academic year, 41% of first- and
second-year students at four-year institutions had taken a remedial course, while at two-
year institutions, even more—68% of students—had taken a remedial course (Chen, 2016).
The cost of remedial education has been estimated to be as much as $2.9 billion annually
(Strong American Schools, 2010).
The primary purpose of remedial education is to provide differentiated instruction to
under-prepared students so they have the skills to succeed in college-level coursework
(Bettinger & Long, 2009). However, there is evidence that community-colleges’ tracking
systems frequently underplace students—track them into remedial courses when they
could have succeeded in college-level courses—and overplace students—track them into
college-level courses when they were unlikely to be successful (Belfield & Crosta, 2012;
Scott-Clayton, 2012).
Most institutions administer a multiple-choice test in mathematics, reading, and
writing to determine whether incoming students should be placed into remedial or college-
level courses. The ACCUPLACER, a computer-adaptive test offered by the College
Board, is the most widely used college placement system in the U.S. (College Board,
2015). Colleges choose a cut score for each test, and place students scoring above this
score into college-level courses and students below the cut score into remedial courses.5
Given the placement rules and immediate test results provided by the ACCUPLACER
platform, students often learn their placement immediately after completing their exam.
Site Selection and Descriptions
All of the participating colleges are part of the State University of New York (SUNY)
system. Table A.3 of the Appendix provides an overview of each college’s characteristics
using public data. The smallest of the colleges serves roughly 5,500 students while the
largest serves over 22,000 students. As is common in community college settings, a large
share of the students is part-time and many are adult learners, with between 21% and
30% of students over the age of 25. For most of the colleges, the majority of students
receive financial aid. The colleges have similar transfer-out rates of between 18% and 22%
and three-year graduation rates are between 15% and 29%. The colleges also tend to serve
local student populations. Lastly, all of the colleges have an open-door admissions policy.
This means that the colleges do not have admission requirements beyond having
graduated from high school or earned a GED.
Creating the Placement Algorithm
Colleges preferred that we develop college-specific algorithms. We created separate
algorithms for each college using data on each college’s previous cohorts of students.
Five colleges in the study had been using ACCUPLACER for several years. One
college had been using ACCUPLACER assessments for English but had transitioned from
5 Certain colleges may offer exemptions from testing; for instance, this can occur for students who speak English as a second language or who have high SAT scores.
a home-grown math assessment to the ACCUPLACER math assessments too recently to
generate historical data; at this college we tested an algorithm for English placement only.
One college in our sample had been using the COMPASS exam, which was discontinued
by ACT shortly after this study began. At this college, we tested an algorithm that does
not use any placement test scores against a placement system that incorporates only
ACCUPLACER test results, and only for math placement.
We worked with administrators at each college to obtain the data needed to estimate
each algorithm. In some instances, these measures were stored in college databases. In
other instances, colleges maintained records of high school transcripts as digital images; in
these cases, we had the data entered into databases by hand.
In order to estimate the relationships between predictors in the dataset and
performance in initial college-level courses, we restricted the historical data to students
who took placement tests and who enrolled in a college level course without first taking a
remedial course. Importantly, the latter group of students was selected into college-level
courses based on observable variables. This set of students constituted our estimation
sample for developing the algorithm.
We aimed to predict “success” in future coursework for each student. We met with
colleges to decide how to define success; they agreed to define it as a grade of C or
better in the initial college-level course associated with the placement decision. We then
regressed an indicator for success in a college-level course on various sets of predictors
using Probit and linear probability models (LPM). We used the results of the LPMs
because we could not implement non-linear models into colleges’ existing placement
software. Nonetheless, the non-linear models yielded placement decisions similar to those of the LPMs,
especially around the relevant cut points that colleges chose to determine placements.
For each college, we estimated regressions relating placement test scores or high school
GPA to “success” in initial college-level classes for a given subject. We added additional
information from high school transcripts where such information was available. This
information included the number of years that have passed since high school completion
and whether the diploma was a standard high school diploma or a GED (diploma status).
We also tested the inclusion of additional variables such as SAT scores, ACT scores, high
school rank, indicators for high school attended, and scores on the New York Regents
exams where they were available (often these were missing), as well as interaction terms
and higher-order terms for variables. When variables were missing, we imputed values and
added indicators for missing. Identical procedures were followed for both English and
Math.
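[1] $\mathbf{1}(\text{C or Better})_i = \alpha + (\text{HS GPA}_i)\beta_1 + \varepsilon_i$
[2] $\mathbf{1}(\text{C or Better})_i = \alpha + (\text{ACCUPLACER}_i)\beta_1 + \varepsilon_i$
[3] $\mathbf{1}(\text{C or Better})_i = \alpha + (\text{HS GPA}_i)\beta_1 + (\text{ACCUPLACER}_i)\beta_2 + \varepsilon_i$
[4] $\mathbf{1}(\text{C or Better})_i = \alpha + (\text{HS GPA}_i)\beta_1 + (\text{ACCUPLACER}_i)\beta_2 + X_i\beta_3 + \varepsilon_i$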
The focus of this analysis was the overall predictive power of the model. As such, we
calculated the Akaike Information Criterion (AIC) statistics for each model. The AIC is a
penalized-model-fit criterion that combines a model’s log-likelihood with the number of
parameters included in a model (Akaike, 1998; Burnham, Anderson, & Burnham, 2002;
Mazerolle, 2004).6 In practice, we did not have many variables to select from and higher-
order and interaction terms had little effect on prediction criteria (and additional
complexity was difficult to implement).
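As an illustration of the model comparison described above, the following sketch fits specifications [1] and [3] as linear probability models by ordinary least squares and compares them by AIC. It is a minimal sketch under stated assumptions, not the study's code: the data are invented, the helper names (`solve`, `ols`, `aic_gaussian`) are hypothetical, and the AIC is computed under a Gaussian likelihood, which is an assumption because the text does not state which likelihood was used.

```python
import math

def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x

def ols(X, y):
    """Return OLS coefficients and the residual sum of squares.
    Each row of X must include a leading 1 for the intercept."""
    k = len(X[0])
    xtx = [[sum(r[i] * r[j] for r in X) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(X, y)) for i in range(k)]
    beta = solve(xtx, xty)
    rss = sum((yi - sum(b * v for b, v in zip(beta, r))) ** 2 for r, yi in zip(X, y))
    return beta, rss

def aic_gaussian(rss, n, k):
    """AIC up to an additive constant, assuming Gaussian errors (an assumption)."""
    return n * math.log(rss / n) + 2 * k

# Invented records: (high school GPA, placement test score, earned C or better)
data = [(2.0, 40, 0), (2.4, 75, 0), (2.8, 55, 0), (3.0, 60, 1),
        (3.2, 45, 1), (3.5, 80, 1), (3.7, 50, 1), (2.2, 65, 0),
        (3.9, 70, 1), (2.6, 85, 1)]
y = [d[2] for d in data]
X1 = [[1.0, d[0]] for d in data]          # specification [1]: GPA only
X3 = [[1.0, d[0], d[1]] for d in data]    # specification [3]: GPA and test score

beta1, rss1 = ols(X1, y)
beta3, rss3 = ols(X3, y)
aic1 = aic_gaussian(rss1, len(y), len(beta1))
aic3 = aic_gaussian(rss3, len(y), len(beta3))
print(f"AIC [1] = {aic1:.2f}, AIC [3] = {aic3:.2f} (lower is preferred)")
```

Because the specifications are nested, the larger model can never have a higher residual sum of squares; the AIC penalty of 2 per parameter is what allows the smaller model to be preferred.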
6 The AIC is asymptotically equivalent to leave-one-out cross-validation for normally distributed error terms (Stone, 1977).
Figures A.1 and A.2, located in the Appendix, list the full set of variables used by
each college to calculate students’ math and English algorithm scores, respectively. Tables
A.1 and A.2 in the Appendix show typical examples of our regression results for math and
English. Across colleges, explanatory power was much higher for math course grades than
for English course grades. Placement scores typically explained less than 1% of the
variation in passing grades for English. Test scores were better predictors for passing
math grades, explaining roughly 10% of the variation. Adding high school grades typically
explained an additional 10% of the variation in both subjects. Thus, combining multiple
measures with predictive analytics is no panacea for predicting future grades, but it does
significantly improve validity of the placement instrument relative to test scores alone.
Setting cut probabilities: After we selected the final models, we used the coefficients
from the regression to simulate placement rates for each college. Consider the following
simplified example where a placement test score (R) and high school GPA (G) are used to
predict success in college-level math (Y), defined as earning a grade of C or better. The
regression coefficients combined with data on R and G can then predict the probability of
earning a C or better in college-level math for incoming students ($\hat{Y}$). A set of decision
rules must then be determined based on these predicted probabilities. A hypothetical
decision rule would be:
$$\text{Placement}_i = \begin{cases} \text{College Level} & \text{if } \hat{Y}_i \ge 0.6 \\ \text{Remedial} & \text{if } \hat{Y}_i < 0.6 \end{cases}$$
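The hypothetical decision rule can be sketched in a few lines. The coefficients below are invented for illustration and do not come from any college's model; the function names `predict` and `place` are likewise hypothetical.

```python
# Hypothetical LPM coefficients: intercept, placement test score R, high school GPA G
ALPHA, BETA_R, BETA_G = -0.50, 0.004, 0.25

def predict(r, g):
    """Predicted probability of a C or better from the linear model."""
    return ALPHA + BETA_R * r + BETA_G * g

def place(predicted_pass_probability, cut_point=0.6):
    """Return the placement implied by a predicted probability and a cut point."""
    if predicted_pass_probability >= cut_point:
        return "College Level"
    return "Remedial"

p = predict(r=60, g=3.2)   # -0.50 + 0.24 + 0.80 = 0.54
placement = place(p)       # 0.54 is below the 0.6 cut point
print(round(p, 2), placement)
```

One caveat of a linear probability model is that predictions can fall below 0 or above 1 for extreme inputs; the rule still works because it only compares the prediction with the cut point.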
For each college, we generated spreadsheets projecting the share of students that
would place into college-level coursework at any given cut-point as well as the share of
those students we would anticipate earning a C or better. These spreadsheets were given
to colleges so that faculty in the pertinent departments could set cut-points for students
entering their programs.
Figure 1 shows a hypothetical example of one such spreadsheet provided to colleges.
The top panel shows math placement statistics and the bottom panel shows statistics for
English. The highlighted row shows the status quo at the college and the percent of tested
students placed into college level is shown in the second column. For instance, for math,
the status quo placement rate is 30%. The third column shows the pass or success rate,
which is a grade “C” or better in the first college-level course in the relevant subject. In
this example, the status-quo pass rate for math is 50% conditional on placement into the
college-level math course.
Below the highlighted row, we show what would happen to placement and pass rates
at different cut points for placement. The first column shows these cut points (“Minimum
probability of success”). For instance, for math, the first cut point we show is 45%, which
implies that for a student to be placed into college-level math under the algorithm, the
student must have a predicted probability of receiving a “C” or better in the gate-keeper
math course of at least 45%. If this 45% cut point is used, columns two and three show
what would happen to the share of students placed into college-level math under the
algorithm (column two) and what would happen to the share who would pass this course
conditional on placement (column three). In this example, for math, if the 45% cut point
is used, the algorithm would place 40% of students into college-level math and we
anticipate 60% of those students would pass. The cut point differs from the expected pass
rate because the cut point is the lowest probability of passing for a given student; the cut
point implies that every student must have that probability of passing or higher. For
instance, if the cut point is 40%, then every placed student has a 40% chance or greater of passing
the college-level course; most placed students will have a chance above 40%.
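The relationship between the cut point and the expected pass rate can be written compactly. In the notation below, which is introduced here for illustration, $c$ is the cut point and $\hat{Y}_i$ is student $i$'s predicted probability of passing; the expression assumes the predictions are well calibrated.

```latex
\[
\text{Expected pass rate}(c)
  = \frac{1}{\left|\{ i : \hat{Y}_i \ge c \}\right|}
    \sum_{i \,:\, \hat{Y}_i \ge c} \hat{Y}_i
  \;\ge\; c .
\]
```

Every term in the average is at least $c$, so the average is at least $c$, with equality only if every placed student sits exactly at the cut point.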
Many faculty opted to create placement rules that either kept pass rates in college-level
courses similar to historical pass rates or kept the college-level placement rates
similar to historical placement rates. Under this rule, the algorithm tended to predict
increases in the number of students placed into college-level coursework. For instance, in
the example, the status quo pass rate for English is 60%. A cut point of 45% would induce
the same pass rate, 60%, but would place 75% of students into the college-level English
course.
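Holding the pass rate constant amounts to searching for the lowest cut point whose projected pass rate is at least the status quo. A minimal sketch of that search follows; the probabilities, the status-quo rate of 68%, and the function name are all invented for illustration, and the projection again assumes calibrated predictions.

```python
def lowest_cut_holding_pass_rate(probabilities, status_quo_pass_rate, cut_points):
    """Return (cut, share placed) for the lowest candidate cut point whose expected
    pass rate among placed students is at least the status-quo pass rate,
    or None if no candidate qualifies."""
    n = len(probabilities)
    for cut in sorted(cut_points):
        placed = [p for p in probabilities if p >= cut]
        if placed and sum(placed) / len(placed) >= status_quo_pass_rate:
            return cut, len(placed) / n
    return None

# Invented predicted probabilities for ten incoming students
predicted = [0.20, 0.35, 0.45, 0.50, 0.55, 0.60, 0.70, 0.80, 0.85, 0.90]
result = lowest_cut_holding_pass_rate(predicted, 0.68, [0.30, 0.40, 0.50, 0.60])
print(result)
```

Because the expected pass rate among placed students does not fall as the cut point rises, the lowest qualifying cut point is also the one that places the most students into the college-level course.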
Installation of new placement method in college systems: We developed two
procedures to place students to maintain the timing of placement decisions. At colleges
running our algorithm through the computerized ACCUPLACER-test platform, we
programmed custom rules into the ACCUPLACER platform for students selected to be
part of the treatment group.7
Other colleges ran their placement through a custom server built for the study.
Student information was sent to servers to generate the probability of success and the
corresponding placement, which was returned to the college.
3. Experimental Design, Data, Empirical Strategy
The sample frame consisted of entering freshmen enrolling at each college who were
required to take the placement exams.8 Random assignment was integrated by computer
into each college’s placement platform described above. Upon taking their placement
exams, students were randomly assigned to be placed using either the business-as-usual
method or the algorithm. Students were blinded to their assignment. If a student took
both the English and math placement exams, they were either assigned to the business-as-
usual placement for both subjects or the algorithm for both subjects. Some students only
7 This process placed constraints on the algorithm’s complexity; interaction terms and non-linear models, for instance, are difficult to implement within the ACCUPLACER system.
8 Some colleges asked to exempt English as a Second Language speakers and students with high SAT scores from the study. However, as these are non-selective colleges, few students take the SAT.
took a placement exam in one subject. After taking placement exams, students were
notified of their placements either by a counselor or through an online portal, depending
on the college.
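The key feature of the assignment, one draw per student that applies to every subject tested, can be sketched as follows. This is an assumption-laden illustration rather than the study's implementation: the student identifiers are invented, the function name `assign_arm` is hypothetical, and equal assignment probabilities are assumed because the text does not state them.

```python
import random

ARMS = ["algorithm", "business_as_usual"]

def assign_arm(student_ids, seed=0):
    """Assign each student once to an arm (equal probabilities assumed).
    A single draw per student means both subjects share the same arm."""
    rng = random.Random(seed)
    return {sid: rng.choice(ARMS) for sid in student_ids}

# Hypothetical students and the placement exams each one took
exams = {"s1": ["math", "english"], "s2": ["math"], "s3": ["english"]}
arms = assign_arm(exams)

# Expand to one record per student and subject; the arm never varies within a student.
records = [(sid, subject, arms[sid])
           for sid, subjects in exams.items()
           for subject in subjects]
```

Drawing once per student rather than once per exam is what guarantees that a student who takes both exams is placed by the same method in both subjects.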
Data
Data came from two sources: placement test records and administrative data from each
college. Student-level placement test records include indicators for each student’s
placement level in math and English, as well as the information that would be needed to
determine students’ placements regardless of treatment status. Placement test records
from each college contained high school grade point averages (when available) and scores
on individual placement tests. Additional variables included in placement test records
varied by each college’s placement algorithm. Examples of additional variables
incorporated for certain colleges include the number of years between high school
completion and college enrollment, type of diploma (high school diploma vs. GED), SAT
scores, and New York State Regents Exam scores.
In addition to placement test records, college administrative data included
demographic information, such as gender, race, age, financial aid status, and transcript
data that provided course levels, credits attempted and earned, and course grades.
Demographic data are only available for students who enrolled in our sample colleges.
Our initial sample consists of 4,729 first-year students across five colleges. Not all
students who take placement exams subsequently enroll in courses: 864, or about 18
percent, took a placement exam and did not enroll in at least one course during the fall
2016 term. This could be important if we would like to conduct subgroup analyses by
demographic group and there is a treatment effect on enrollment. As we show below,
however, we do not find evidence of the latter.
Table 1 shows sample baseline characteristics for students who enrolled in the fall of
2016 at one of the five study colleges. The first two columns show overall sample
characteristics and the additional columns break these down by college. On average, students in
the sample were 43 percent white. Approximately half of all students enrolling in at least
one course during the fall 2016 term received a federal Pell Grant for that term. There is
some variation in demographic characteristics. For instance, Colleges 1, 2, and 3 serve
more white students compared to Colleges 4 and 5, which enroll a higher share of
Hispanic students. Using Pell Grant receipt as a proxy for income, average family income
for study participants also varies across colleges. While Pell Grant recipients comprise
more than 60 percent of all study participants from Colleges 1, 2, and 3, they represent less
than half of the students enrolling in the study from Colleges 4 and 5. Comparing these
characteristics to Appendix Table A.3 shows that the sample characteristics roughly
match the overall characteristics of students each college serves.
Lastly, Figure 2 shows the gap in placement rates across demographic subgroups. The
first two bars show that white students are placed into college-level math and English
courses at rates 14 percentage points and 19 percentage points higher than Black
students. These gaps tend to be smaller between Hispanic and white students, and between
male and female students, but are also quite large between Pell recipients and non-Pell
recipients: 16 and 12 percentage points for math and English, respectively.
Outcomes
We study the effects of assignment to the placement algorithm on several primary
outcomes, by subject. First, we descriptively examine how placements changed as a result
of the algorithm: what share of students had their placement changed relative to the
control, and of these, what share had their placement changed from a remedial-course
assignment to a college-level assignment, and what share had their placement changed
from a college-level course assignment to a remedial assignment. Second, we show
treatment effects on enrollment and pass rates for math and English separately. If a
student does not enroll in any college course—remedial or otherwise—we code enrollment
into the college-level course, passing the college-level course, and credits accumulated as
zero. Thus, attrition from the colleges is incorporated into the outcome, and, as shown
below, we do not find any effects on initial enrollment. Lastly, we descriptively compare
the algorithm’s predicted pass rates in college level courses (conditional on enrollment) to
the actual pass rates.
Impact analyses
We use an intent-to-treat analysis to examine the impacts of using the placement
algorithm versus the single-placement test status quo. We estimate the following model:
[7] $Y_i = \alpha + \beta R_i + \lambda \phi_i + \eta X_i + \delta Z_i + \varepsilon_i$
where Yi are first-semester academic outcomes for student i, such as placement into a
college-level course and passing a college-level course; Ri indicates whether the individual
was randomly assigned to be placed using the algorithm; φi is an indicator for the
institution a student attends; Xi is a vector of baseline covariates including gender, race,
age, and financial aid status; Zi is students’ math and English algorithm calculations,
which are baseline measures of academic preparedness, and εi is a random error term. The
coefficient of interest is β, which is the effect of assignment to the placement algorithm on
outcomes at the end of the first semester discussed above. We show results with and
without baseline covariates.
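When the college indicators and baseline covariates are dropped, the OLS estimate of β in [7] reduces to the difference in mean outcomes between the two randomly assigned groups. The sketch below computes that simplified estimate on invented data; it is not the full specification, and the function name `itt_difference` and the unequal-variance standard error are illustrative choices.

```python
import math

def itt_difference(outcomes, assigned):
    """Difference in mean outcomes by random assignment, with an unequal-variance
    standard error. Equals the OLS coefficient on the assignment indicator
    when no other regressors are included."""
    treat = [y for y, r in zip(outcomes, assigned) if r == 1]
    control = [y for y, r in zip(outcomes, assigned) if r == 0]
    mean_t = sum(treat) / len(treat)
    mean_c = sum(control) / len(control)
    var_t = sum((y - mean_t) ** 2 for y in treat) / (len(treat) - 1)
    var_c = sum((y - mean_c) ** 2 for y in control) / (len(control) - 1)
    se = math.sqrt(var_t / len(treat) + var_c / len(control))
    return mean_t - mean_c, se

# Invented data: 1 = enrolled in a college-level course, 0 otherwise.
# Students who never enrolled at the college stay in the sample with outcome 0.
assigned = [1, 1, 1, 1, 1, 0, 0, 0, 0, 0]
enrolled = [1, 1, 1, 0, 1, 1, 0, 0, 1, 0]
effect, se = itt_difference(enrolled, assigned)
print(f"ITT estimate = {effect:.2f} (SE {se:.2f})")
```

Students are compared by assigned group regardless of which course they ultimately took, which is what makes the estimate intent-to-treat.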
As not everyone takes a placement exam in both subjects, we run these regressions by
subject (e.g., enrollment in college-level math for those who took a math
placement exam) and also report results for those who took any placement exam (e.g., enrollment
in any college-level course).
Subgroup Analyses
Colleges were interested in understanding what impact the placement algorithm has on the
composition of students placed into remedial and college-level courses, and particularly for
students typically under-represented in these courses. We conduct subgroup analyses as
follows to estimate whether there were significant differential effects.
[8] $Y_i = \alpha_1 + \beta_{1k} R_i + \beta_{2k} R_i \times \mathrm{Subgroup}_k + \lambda_1 \phi_i + \eta_1 X_i + \delta_1 Z_i + \varepsilon_{1i}$
Yi are placement in college-level math, placement in college-level English, and passing
these courses as well. The coefficient of interest is β2k, which assesses the extent to which
Subgroup k exhibits a different effect from the reference group. The subgroups of interest
are: Black and Hispanic students (separately) compared to white students; female
students compared to male students; and Pell recipients compared to non-Pell
recipients.
However, this process yields many tests, which increases the likelihood of type-I errors.
To reduce the number of tests, we estimate the interaction effects simultaneously by
estimating a system of Seemingly Unrelated Regressions and conduct an F test of whether
the interaction effects are jointly significantly different from zero. We also use the
step-down method from Holm (1979) to discern which specific hypotheses can be rejected.
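The step-down procedure from Holm (1979) can be sketched as follows. This is a generic illustration rather than the authors' code; the function name and the example p-values are hypothetical.

```python
def holm_reject(p_values, alpha=0.05):
    """Return a list of booleans, True where Holm's step-down procedure rejects."""
    m = len(p_values)
    # Visit hypotheses from the smallest p-value to the largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    reject = [False] * m
    for rank, idx in enumerate(order):
        # The (rank + 1)-th smallest p-value is compared with alpha / (m - rank).
        if p_values[idx] <= alpha / (m - rank):
            reject[idx] = True
        else:
            break  # stop at the first non-rejection; all larger p-values are retained
    return reject


# Hypothetical p-values for four interaction terms (illustrative only).
example_p = [0.001, 0.030, 0.012, 0.400]
decisions = holm_reject(example_p, alpha=0.05)
print(decisions)
```

The thresholds become less strict as hypotheses are rejected, which makes the procedure more powerful than a plain Bonferroni correction while still controlling the familywise error rate.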
Treatment-Control Baseline Balance
Randomization should ensure, in expectation, that students assigned to the treatment are
similar to those assigned to the control-group placement rules. Table 2 provides evidence
that participants’ demographic and academic characteristics are balanced across
treatment and control groups. Students’ ACCUPLACER exam scores also are similar
across both groups. Overall, the magnitudes of differences between treatment and control
groups are small and only two are significant at the 10 percent level, which is expected
with 20 variables tested. Though not shown, this balance also holds for the subgroups of
students who took only an English or only a math placement exam. Observation counts
change because colleges provided demographic variables only for those students who
enrolled and took the placement exams. This is potentially problematic for subgroup
analyses (though not for overall effects). However, treatment assignment does not affect
subsequent enrollment, which is not surprising given the close proximity in time between
enrollment and placement.9 Lastly, note that high school GPA, a strong predictor of
future performance, is missing for many incoming students.
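The remark that two significant differences are expected with 20 variables tested can be made explicit with a short calculation. It assumes each null hypothesis of no baseline difference is true (as randomization implies) and that each test has exact size 0.10; independence across tests is not needed for the expectation.

```latex
\mathbb{E}[\text{number of rejections}]
  = \sum_{j=1}^{20} \Pr(p_j \le 0.10)
  = 20 \times 0.10
  = 2
```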
4. Results
Descriptive Changes in Placements
We begin with a descriptive summary of placement changes to show the various ways the
algorithm changed students’ placements relative to the business as usual. As stated above,
it is not obvious how the algorithm will change net placement rates. For instance, relative
to the business as usual, placement by the algorithm could change an equal number of
placements into and out of college-level courses, which implies no net change in college-
level course placement. Table 3 summarizes changes in placement for program-group
students entering the study in fall 2016. Of the 2,455 students assigned to the program-
group, 92% took a placement exam in math and 76% took a placement exam in English.
Among those students who took a math placement exam, 21% experienced a math
placement different from what would have been expected under the status quo placement
9 Other papers studying the effects of placement into remedial classes have found that this does not affect subsequent enrollment (Martorell and McFarlin 2011; Scott-Clayton and Rodriguez 2015).
rules. Specifically, 14% of students who took a math placement exam were placed into a
higher-level math course than would have been expected under a single-test placement
system, and 7% were placed into a lower-level math course.
Additionally, 76% of program-group students took a placement exam for English. Of
those who took the English placement exam, approximately 48% experienced a change in
their English placement: 42% placed into a higher-level English course and 7% placed
into a lower-level course than they would have under the status quo placement strategy.
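The percentages in the two paragraphs above follow directly from the counts reported in Table 3. The short sketch below recomputes them; the dictionary keys and the helper function are hypothetical names, while the counts are those from Table 3.

```python
# Counts of program-group students, taken from Table 3.
math_counts = {"took_exam": 2265, "changed": 470, "higher": 310, "lower": 160}
english_counts = {"took_exam": 1864, "changed": 897, "higher": 774, "lower": 123}


def shares(counts):
    """Percent of test-takers whose placement changed, rose, or fell."""
    n = counts["took_exam"]
    return {key: round(100 * counts[key] / n, 2) for key in ("changed", "higher", "lower")}


math_shares = shares(math_counts)
english_shares = shares(english_counts)
print("Math:", math_shares)
print("English:", english_shares)
```

Note that the higher and lower shares are expressed as a fraction of all test-takers in the subject, so they sum to the share whose placement changed.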
Students generally followed the placement assignments that they received. Compliance
rates were high across math and English, ranging from approximately 88% to 99% across
participating colleges. Instances of non-compliance can be at least partially explained by
the fact that we report only initial placements (intent-to-treat effects based on students’
first exam scores), which may subsequently have changed if students re-took the placement exam.
Treatment Effects on Placement, Course Taking, and Credits
To test whether the placement algorithm affected student outcomes, we estimate four models
that build upon each other: (1) we first estimate a simple regression including only college
fixed effects (strata), (2) we then add controls for the set of demographic characteristics
including gender, race, and age at entry, (3) we then add proxies for income including Pell
and TAP Grant10 recipient status, and (4) finally, we add the calculated math and
English algorithm values for all students, which are estimated using baseline covariates.
The results are robust across these model specifications, so we show results from Model 4, but
all of our estimates across specifications can be found in Tables A.4 through A.10 in the
Appendix.
10 The Tuition Assistance Program (TAP) is a need-based grant available to New York State residents who enroll as a first-time freshman at an approved postsecondary institution in New York State.
Table 4 summarizes the results for each outcome, for each subject. The placement
algorithm resulted in increases in all three outcomes: placement into college-level courses,
enrollment in college-level courses, and college-level credits earned. Students assigned to
the placement algorithm are 5.0 percentage points more likely to be placed into a college-
level math course, 4.6 percentage points more likely to enroll in a college-level math
course, and 2.8 percentage points more likely to pass a college-level math course during
the first term. All results are statistically significant at the 1 percent level. One
explanation for the difference between placement and enrollment into a college-level math
course is that some students placing into college-level math did not have to complete a
college-level math course prior to enrolling in other college-level courses in the first term.
For example, if a student placed into college-level math, she does not have to take that
college-level course right away and could instead enroll in other field courses that term.
There are positive and substantially larger effects for English placement, enrollment,
and completion than for outcomes on math courses. Students who were placed by the
algorithm were 30.4 percentage points more likely to place into a college-level English
course, 19.3 percentage points more likely to enroll in a college-level English course, and
12.5 percentage points more likely to pass a college-level English course in the first term.
All results are significant at the 1 percent level. Again, the difference between placement
and enrollment into a college-level English course may occur for the same reason as above
for college-level math enrollment.
In addition to subject-specific impacts, we also test whether program status had any
impact on overall college-level course taking.11 Table 5 shows that algorithm assignment
increases the probability of enrolling in any college-level course by about 1 percentage
11 As expected given the timing of randomization, placement and enrollment, there is no impact of treatment assignment on enrollment (Table 2).
point and increases the probability of enrolling and passing any college-level course by 4.2
percentage points. These effects are all significant at least at the 5 percent level. The
smaller impact on college-level course enrollment is likely because students who place into
remedial-level courses in either English or math are, in most cases, still eligible to enroll in
other college-level courses that do not require a certain level of English or math
proficiency.
Colleges can select various cut points for assignment to college-level coursework, which
can make these effects difficult to interpret. Depending on what cut point colleges choose,
we may expect higher pass rates or lower pass rates. One way to gauge success is whether
the algorithm correctly predicted pass rates in college-level courses among those who were
placed and enrolled in these courses. Prediction accuracy was better for math than for
English: Actual pass rates were 2 percentage points lower than predicted in math and 7
percentage points lower than predicted in English.
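The trade-off that colleges face when choosing a cut point can be sketched as follows. This is an illustration under stated assumptions, not the study's procedure: the predicted probabilities are hypothetical, the function name is ours, and we assume the projected pass rate is the mean predicted probability among students at or above the cut point.

```python
def project_cut_point(predicted_probs, cut_point):
    """Return (share placed into college level, mean predicted pass rate of those placed)."""
    placed = [p for p in predicted_probs if p >= cut_point]
    share_placed = len(placed) / len(predicted_probs)
    predicted_pass_rate = sum(placed) / len(placed) if placed else float("nan")
    return share_placed, predicted_pass_rate


# Hypothetical predicted probabilities of earning a C or better (illustrative only).
probs = [0.20, 0.35, 0.50, 0.55, 0.60, 0.70, 0.80, 0.90]

for cut in (0.45, 0.55, 0.65):
    share, pass_rate = project_cut_point(probs, cut)
    print(f"cut point {cut:.2f}: placed {share:.0%}, predicted pass rate {pass_rate:.0%}")
```

Raising the cut point lowers the share of students placed into college-level courses and raises their projected pass rate. Comparing the projected rate with the realized pass rate among students who were placed and enrolled gives the kind of accuracy check described above.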
Lastly, Table 5 shows there is an increase in credit accumulation of 0.60 credits, which
is a 12 percent increase over the control-group mean. This effect is significant at the 1
percent level.
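The relative size of the credit effect follows from the Table 5 estimates, dividing the treatment coefficient by the control-group mean:

```latex
\frac{0.599}{5.170} \approx 0.116 \quad (\text{about } 12\%)
```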
Subgroup Effects
Table 6 shows the interaction effects for the subgroups listed above. The largest
differential effects occur for placement of Black and Hispanic students into college-level
English courses; the treatment effects for these groups are 18 and 13 percentage points
larger, respectively, than for white students. The differential effects are much closer to
zero for college-level math placement, with the exception of female students, for whom the
treatment effect is 8 percentage points larger than for male students. The interaction effects are
generally smaller and not statistically significant for passing college-level courses.
We test the significance of these interaction effects in several ways. First, an F test of
all of the interaction effects for placement into college-level math and English rejects the
null hypothesis that these effects are zero (p<0.01). Second, if we test each interaction
separately to identify which is significant while controlling the Familywise Error Rate
using Holm (1979), we find that female students, Black students and Hispanic students
experience significantly different increases in placement to college-level courses. The
evidence of significant differential effects for passing college-level math and English
courses is weaker, however. The same F test does not reject the null (p=0.60), and none of
the interaction effects is significantly different from zero after adjusting for multiple
hypothesis testing.
5. Cost Analysis
The initial investment to build the algorithm has three components. First, data must be
collected on students’ characteristics (including high school transcripts), placements based
on test results, and subsequent college outcomes. In some colleges, these data are already
available, but other colleges required more extensive data collection. Second, data must be
analyzed to determine a new placement algorithm. Third, resources must be allocated to
create and implement the new system within the college, which includes training
personnel. After the initial investment, implementation requires collecting data from
entering students and assigning personnel to place students into either remedial or
college-level courses.
The cost estimate of the new placement system is relative to the cost of business-as-
usual testing for placement. For both placement systems there are costs in administering
placement tests. Also, for both systems, future resources may be required as students
progress into college-level courses after completing remedial coursework. If more students
progress into college-level courses, colleges may have to shift resources toward college
courses and away from remedial courses in conjunction with any changes in revenue per
student. Currently, no information on differences in coursework costs across the two
options is available, so we cannot incorporate this resource.
We calculated costs for five colleges using the ingredients method (Levin et al.,
2017).12 We collected information on ingredients from direct interviews with personnel
who implemented the new testing protocols. We gathered information on input prices and
overhead costs from secondary sources. The resulting estimates reflect the expected cost
of implementing the new placement system at a college of similar size and organization as
the five sample colleges.
Across the five colleges, implementing the new system costs, on net, $110 per student,
which varies between $70 and $320, depending on the college. We estimate that the operating
cost per student per semester over the status quo is $40, which varies from
$10 to $170 (see footnote 13).
This variation is primarily driven by the number of students at each site. More
enrollments lead to lower costs because the costs of creating the algorithm are mostly
fixed. The costs per college also varied depending on how much information was
previously available to determine the new placement algorithm and how many students
had the requisite information for the placement system to operate. Data entry costs were
lower if the college had all high school information pre-loaded into their databases; in
contrast, data entry costs were higher if each student’s information had to be entered into
the computing system individually.
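The reason larger enrollments lower the per-student cost can be illustrated with a simple average-cost calculation. The dollar amounts and enrollment figures below are hypothetical and are not the study's cost inputs; the function name is also ours.

```python
def cost_per_student(fixed_cost, variable_cost_per_student, n_students):
    """Average cost per student when a fixed cost is spread over n_students."""
    return fixed_cost / n_students + variable_cost_per_student


# Hypothetical inputs: the same fixed development cost at two enrollment levels.
fixed = 60_000.0   # one-time cost of building the algorithm (assumed)
variable = 30.0    # per-student data entry and placement cost (assumed)

small_college = cost_per_student(fixed, variable, 400)
large_college = cost_per_student(fixed, variable, 2_000)
print(f"small college: ${small_college:.0f} per student")
print(f"large college: ${large_college:.0f} per student")
```

Under these assumptions the fixed component dominates at the smaller college, which is consistent with the pattern that sites with more students report lower per-student costs.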
12 We derived costs from the inputs used at each college multiplied by standardized prices per input expressed in 2016 dollars.
13 See Tables A.13 and A.14 in the Appendix. Details of the prices and analyses behind these calculations are available upon request.
Interviewees described strategies by which the number of students with readily
available information might be increased in the future. Interviewees also did not indicate
significant resource changes with respect to instruction. Potentially, the new placement
system may change assignments such that more students are now in college-level classes,
which could require more college-level faculty and perhaps more sections of college-level
courses. However, most colleges indicated that faculty could be reassigned from teaching
remedial classes to teaching college-level classes, though few changes in class size were
anticipated even given the changes in placement rates.
Colleges could save money by not purchasing the ACCUPLACER exams, and they
asked whether students could be placed via the algorithm as accurately without using
these test scores. We examined the extent to which the algorithm would place students
differently if test scores were not used for prediction. We find that placement rates would
change substantially for math courses, by 18% to 34%, depending on whether colleges
decided to hold predicted pass rates the same or placement rates the same. However, for
English courses, only 5% to 8% of placements would change. This finding again is in line
with the increased predictive value we find for math test scores over English test scores.
6. Discussion and Conclusion
Our findings indicate that combining predictive analytics with multiple measures
significantly impacts how community colleges track students into either college-level or
remedial courses. First, the algorithm allowed colleges to choose cut points that explicitly
targeted predicted placement rates and pass rates. Second, the algorithm led to changes in
the placement of students. Across the five colleges in the analysis, students placed using
the algorithm were more likely to be assigned to college-level courses than those placed
using the business as usual system. In addition, more treatment-group students took and
passed college-level courses in math and English than students in the control group. These
results are due to a combination of the cut points that colleges chose and the alternative
placement decisions of the algorithm. There were particularly large increases in college-
level placements for female students in math, and Black and Hispanic students in English.
The algorithm’s prediction accuracy was better for math courses than English courses.
However, the algorithm we developed could be improved. Most notably, our model was
constrained by implementation in several ways. To produce rapid placement decisions, we
had to embed our algorithm into existing systems, which restricted our modeling
choices; we could not, for instance, implement a non-linear model. Future models could
also use richer transcript data; the colleges we worked with could provide only high school
GPA, which was often missing for incoming students, and could not provide course-level high
school grades, which could also be predictive of future performance. More generally, as
colleges develop more consistent ways to record incoming student information, the ability
to predict future performance could improve.
One question is how our results would differ if all students within a college were
placed according to the algorithm. We interviewed college administrators and department
chairs at each college to document their impressions of the algorithm’s implementation.
Most perceived no changes in classroom composition and no need to change faculty
allocations. However, this could change if all students were placed via the algorithm,
especially in English courses where placement changes were more significant.
Our initial results have important implications because the high cost of remedial
education falls directly onto students placed into these courses and indirectly onto
taxpayers whose money helps subsidize public postsecondary institutions. As a result,
there is both a private and social benefit to ensuring that remedial education is correctly
targeted. Colleges recognize this, and some have begun to implement these placement
algorithms. Long Beach City College (LBCC) created a placement formula that uses
student high school achievement in addition to standardized assessment scores. The
formula weights each measure based on how predictive it is of student performance in
college courses (Long Beach City College, Office of Institutional Effectiveness, 2013). This
paper provides evidence that these placement systems not only affect student outcomes
through changes in the placement instrument, but also through colleges’ improved ability
to target pass rates and placement rates explicitly. Future research could test more
intricate predictive models than we could implement in the current study.
References
Akaike, H. (1998). Information theory and an extension of the maximum likelihood principle. In E. Parzen, K. Tanabe, & G. Kitagawa (Eds.), Selected Papers of Hirotugu Akaike. Springer Series in Statistics (Perspectives in Statistics). New York, NY: Springer.
Allen, D., & Dadgar, M. (2012). Does dual enrollment increase students’ success in college? Evidence from a quasi-experimental analysis of dual enrollment in New York City. New Directions for Higher Education, 2012, 11–19. doi:10.1002/he.20010
Bacon, D. R., & Bean, B. (2006). GPA in research studies: An invaluable but neglected opportunity. Journal of Marketing Education, 28, 35–42.
Belfield, C., & Crosta, P. M. (2012). Predicting success in college: The importance of placement tests and high school transcripts (CCRC Working Paper No. 42). New York, NY: Community College Research Center. Retrieved from: http://ccrc.tc.columbia.edu/publications/predicting-success-placement-tests-transcripts.html
Bettinger, E. P., Boatman, A., & Long, B. T. (2013). Student supports: Remedial education and other academic programs. The Future of Children, 23(1), 93–115. Retrieved from: http://muse.jhu.edu/article/508222/pdf
Bettinger, E. P., & Long, B. T. (2009). Addressing the needs of underprepared students in higher education: Does college remediation work? Journal of Human Resources, 44(3), 736–771.
Boatman, A., & Long, B. T. (2010). Does remediation work for all students? How the effects of postsecondary remedial and developmental courses vary by level of academic preparation (An NCPR Working Paper). New York, NY: National Center for Postsecondary Research.
Borghans, L., Golsteyn, B. H. H., Heckman, J. J., & Humphries, J. E. (2016). What grades and achievement tests measure. Proceedings of the National Academy of Sciences, 113(47), 13354–13359.
Bui, S. A., Craig, S. G., and Imberman, S. A. (2014). Is gifted education a bright idea? Assessing the impact of gifted and talented programs on students. American Economic Journal: Economic Policy, 6(3):30 – 62.
Burnham, K. P. & Anderson, D. R., (2002). Model selection and multimodel inference: A practical information-theoretic approach, second edition. New York, NY: Springer.
Calcagno, J. C., & Long, B. T. (2008). The impact of postsecondary remediation using a regression discontinuity approach: Addressing endogenous sorting and noncompliance (NBER Working Paper No. 14194). Cambridge, MA: National Bureau of Economic Research. http://www.nber.org/papers/w14194
Card, D. and Giuliano, L. (2014). Does gifted education work? For which students? Technical Report 20453, National Bureau of Economic Research.
Card, D. and Giuliano, L. (2016). Can tracking raise the test scores of high-ability minority students? The American Economic Review, 106(10):2783–2816.
Chan, E.W. (2018). Parent Engagement and Gifted Students. Teachers College, Columbia University. Working Paper.
Chen, X. (2016). Remedial Coursetaking at U.S. Public 2- and 4-year Institutions: Scope, Experiences, and Outcomes (NCES 2016-405). U.S. Department of Education. Washington, DC: National Center for Education Statistics. Retrieved from: https://nces.ed.gov/pubs2016/2016405.pdf
Cohodes, S. R. (2015). The long-run impacts of tracking high-achieving students: Evidence from Boston’s Advanced Work Class. Working Paper.
Dadgar, M., Collins, L., & Schaefer, K. (2015). Placed for success: How California community colleges can improve accuracy of placement in English and math courses, reduce remediation rates, and improve student success. Oakland, CA: Career Ladders Project.
Duflo, Esther, Pascaline Dupas, and Michael Kremer. (2011). Peer effects, teacher incentives, and the impact of tracking: Evidence from a randomized evaluation in Kenya. American Economic Review 101, no. 5:1739–74.
Eubanks, V. (2018). Automating Inequality: How High-Tech Tools Profile, Police, and Punish the Poor. New York, NY. St. Martin’s Press.
Fields, R. & Parsad, B. (2012). Tests and Cut Scores Used for Student Placement in Postsecondary Education: Fall 2011. Washington, DC: National Assessment Governing Board.
Gates, A. G., & Creamer, D. G. (1984). Two-year college attrition: Do student or institutional characteristics contribute most? Community/Junior College Quarterly of Research and Practice, 8(1-4), 39–51.
Hodara, M. (2012). Language Minority Students at Community College: How Do Developmental Education and English as a Second Language Affect Their Educational Outcomes? (Doctoral dissertation). Retrieved from ProQuest Dissertations and Theses. (Accession Order No. [3505981]).
Hodara, M., Jaggars, S. S., & Karp, M. M. (2012). Improving remedial education assessment and placement: Lessons from community colleges across the country (CCRC Working Paper No. 51). New York, NY: Community College Research Center. Retrieved from https://ccrc.tc.columbia.edu/publications/remedial-education-assessment-placement-scan.html
Holm, S. (1979). A simple sequentially rejective multiple test procedure. Scandinavian Journal of Statistics, 6(2), 65–70.
Heubert, J. P., & Hauser, R. M. (Eds.). (1999). High-stakes: Testing for Tracking, Promotion, and Graduation. Washington, DC: National Academy Press.
Kautz, T., Heckman, J. J., Diris, R., ter Weel, B., and Borghans, L. (2014), Fostering and Measuring Skills: Improving Cognitive and Non‐cognitive Skills to Promote Lifetime Success, OECD, Paris.
Kautz, T. D., & Zanoni, W. (2014). Measuring and fostering noncognitive skills in adolescence: Evidence from Chicago Public Schools and the OneGoal Program. Unpublished manuscript, Department of Economics, University of Chicago, Chicago, IL.
Levin, H.M., McEwan, P.J., Belfield, C.R., Bowden, A.B., & Shand. R. (2017). Economic evaluation of education: Cost-effectiveness analysis and benefit-cost analysis. Thousand Oaks, CA: SAGE Publications.
Long, B. T., & Boatman, A. (2013). The role of remedial and developmental courses in access and persistence. In A. Jones & L. Perna (Eds.), The state of college access and completion: Improving college success for students from underrepresented groups. New York, NY: Routledge Books.
Long Beach City College, Office of Institutional Effectiveness. (2013). Preliminary overview of the effects of the fall 2012 Promise Pathways on key educational milestones. Long Beach, CA: Office of Institutional Effectiveness.
Martorell, P., & McFarlin, I., Jr. (2011). Help or hindrance? The effects of college remediation on academic and labor market outcomes. The Review of Economics and Statistics, 93(2), 436–454.
Mazerolle, M. J. (2004). Appendix 1: Making sense out of Akaike’s Information Criterion (AIC): its use and interpretation in model selection and inference from ecological data. Retrieved from http://theses.ulaval.ca/archimede/fichiers/21842/apa.html
McGurk, S. (2015). ACCUPLACER Overview. College Board. Retrieved from: https://www.accs.cc/default/assets/File/DPE_ISS/developmental%20education/accuplacer/overview.pdf
Mullainathan, S., & Spiess, J. (2017). Machine learning: An applied econometric approach. Journal of Economic Perspectives, 31(2), 87–106.
National Center for Public Policy and Higher Education and the Southern Regional Education Board. (2010). Beyond the Rhetoric: Improving College Readiness Through Coherent State Policy. Retrieved from: http://www.highereducation.org/reports/college_readiness/CollegeReadiness.pdf
North Carolina Community College System. (2015). NCCCS policy using high school transcript GPA and/or standardized test scores for placement (Multiple measures for placement). Retrieved from https://www.southwesterncc.edu/sites/default/files/testing/Multiple%20Measures%20Revised%202015.pdf
Rodríguez, O., Bowden, A.B., Belfield, C.R., & Scott-Clayton, J. (2014) Remedial placement testing in community colleges: What resources are required, and what does it cost? (CCRC Working Paper No. 73). New York, NY: Community College Research Center. Retrieved from: https://ccrc.tc.columbia.edu/media/k2/attachments/remedial-placement-testing-resources.pdf
Rodríguez, O., Bowden, Belfield, C., & Scott-Clayton, J. (2015). Calculating the costs of remedial placement testing (CCRC Analytics). New York, NY: Community College Research Center.
Rothstein, J. M. (2004). College performance predictions and the SAT. Journal of Econometrics, 121(1-2), 297–317.
Rutschow, E. Z., & Schneider, E. (2011). Unlocking the gate: What we know about improving remedial education. New York, NY: MDRC. Retrieved from: http://www.mdrc.org/sites/default/files/full_595.pdf.
Saxon, D., & Morante, E. (2014). Effective student assessment and placement: Challenges and recommendations. Journal of Developmental Education, 37(3), 24-31.
Scott-Clayton, J. (2012). Do high-stakes placement exams predict college success? (CCRC Working Paper No. 41). New York, NY: Community College Research Center. Retrieved from http://ccrc.tc.columbia.edu/media/k2/attachments/high-stakes-predict-success.pdf
Scott-Clayton, J., & Rodriguez, O. (2015). Development, discouragement, or diversion? New evidence on the effects of college remediation policy. Education Finance and Policy, 10(1), 4–45.
Scott-Clayton, J., Crosta, P. M., & Belfield, C. R. (2014). Improving the targeting of treatment: Evidence from college remediation. Educational Evaluation and Policy Analysis, 36(3), 371–393.
Strong American Schools. (2008). Diploma to nowhere. Retrieved from: http://paworldclassmath.webs.com/8534051-Diploma-To-Nowhere-Strong-American-Schools-2008.pdf.
FIGURES
Figure 1. Hypothetical spreadsheet provided to colleges on placement projections
Example Community College

Math Success: C or above
Minimum Probability of Success   Percent Placed into College Level   Percent Passing College Level
Cohort 3, Status Quo             30%                                 50%
45%                              40%                                 60%
55%                              20%                                 70%
65%                              10%                                 75%

Eng. Success: C or above
Minimum Probability of Success   Percent Placed into College Level   Percent Passing College Level
Cohort 3, Status Quo             40%                                 60%
45%                              75%                                 60%
55%                              60%                                 65%
65%                              20%                                 70%
Figure 2. Gaps in Placement Rates Across Demographic Groups
Notes: Sample includes any student who took a placement exam in at least one subject and first enrolled at one of the five study colleges in the fall of 2016. Currently dual enrollment students and students who tested into ESL courses are excluded.
[Bar chart. Vertical axis: Gap in Placement Rate (0% to 25%). Horizontal axis groups: Black-White Gap, Hispanic-White Gap, Male-Female Gap, Non-Pell-Pell Gap. Series: Math, English.]
Table 1. Sample Demographics by College
                         Overall   College 1   College 2   College 3   College 4   College 5
Female                       48%         58%         48%         52%         50%         44%
Race
  White                      43%         83%         78%         74%         38%         31%
  Asian                       3%          0%          2%          0%          6%          3%
  Black                      18%          7%         17%         21%         22%         18%
  Hispanic                   22%          7%          0%          1%         30%         25%
  Pacific Islander            0%          0%          1%          0%          0%          0%
  Two or More Races          11%          1%          0%          0%          3%         19%
  Non-Resident Alien          0%          1%          0%          0%          0%          0%
  Race Unknown                2%          0%          0%          0%          0%          4%
  Race Missing               12%          0%         12%         76%          5%          0%
Age at Entry               19.94       20.10       22.06       21.03       20.43       18.99
Age at Entry Missing          0%          0%          0%          0%          0%          0%
Pell Grant Recipient         49%         64%         66%         61%         36%         47%
Total                      3,865         327         408         673       1,002       2,319
Notes: Sample includes any student who took a placement exam in at least one subject and first enrolled at one of the five study colleges in the fall of 2016. Currently dual enrollment students and students who tested into ESL courses are excluded.
Table 2. Baseline characteristics by treatment assignment
Variable                  Control Mean   Treatment Mean   Difference   P-value   Observations
Enrollment                      82.10%           81.30%       -0.01%      0.48          4,729
Female                          47.50%           47.90%        0.40%      0.76          3,865
Race
  White                         43.50%           41.70%       -1.80%      0.30          3,382
  Asian                          3.10%            3.70%        0.60%      0.34          3,382
  Black                         16.90%           19.20%        2.30%      0.09          3,382
  Hispanic                      21.90%           21.50%       -0.40%      0.77          3,382
  Two or More Races             11.50%           10.90%       -0.60%      0.57          3,382
  Race Missing                  28.30%           28.70%        0.40%      0.76          4,729
Age at Entry                      19.9               20         0.10      0.52          3,593
Pell Grant Recipient            50.10%           53.30%        3.20%      0.06          3,672
TAP Grant Recipient             39.10%           39.60%        0.50%      0.78          3,721
GED Recipient                    4.50%            4.20%       -0.30%      0.57          4,600
HS GPA (100 Scale)                  78               78        0.00%      0.97          1,862
HS GPA (missing)                59.80%           61.70%        1.90%      0.18          4,729
ACCUPLACER Exam Scores
  Arithmetic                      45.0             45.9         0.90      0.26          3,439
  Algebra                         53.1             53.7         0.60      0.77          4,407
  College-level math              35.5             35.4        -0.10      0.89            455
  Reading                         72.3             71.9        -0.40      0.47          3,696
  Sentence Skills                 76.3             76.1        -0.20      0.12          1,072
  Written Exam                     6.1              6.1         0.00      0.11          3,324
Total                            2,274            2,455                                 4,729
Notes: Sample includes any student who took a placement exam in at least one subject at one of the five study colleges in the fall of 2016. Currently dual enrollment students and students who tested into ESL courses are excluded.
Table 3. Changes in Placement for Program-Group Students
Columns: (1) Exempt from Placement; (2) Took Placement Exam; (3) Same Placement Under Business as Usual; (4) Placement Changed from Business as Usual; (5) Higher Placement than Business as Usual; (6) Lower Placement than Business as Usual.

                                      (1)       (2)       (3)       (4)       (5)       (6)
Math Placement
  N                                   190     2,265     1,795       470       310       160
  % of Total Program Sample         7.74%    92.26%    73.12%    19.14%    12.63%     6.52%
  % of Students Placed in Math          -   100.00%    79.25%    20.75%    13.69%     7.06%
English Placement
  N                                   591     1,864       967       897       774       123
  % of Total Program Sample        24.07%    75.93%    39.39%    36.54%    31.53%     5.01%
  % of Students Placed in English       -   100.00%    51.88%    48.12%    41.52%     6.60%
Notes: Sample includes any student who took a placement exam in at least one subject at one of the five study colleges in the fall of 2016. Currently dual enrollment students and students who tested into ESL courses are excluded.
Table 4. Effect on Math and English Coursework
                     (1)         (2)         (3)         (4)         (5)         (6)
                  Placed      Enrolled     Passed      Placed      Enrolled     Passed
                   Math         Math        Math      English      English     English
Treatment        0.050***    0.046***    0.028***    0.304***    0.193***    0.125***
                 (0.014)     (0.012)     (0.010)     (0.014)     (0.014)     (0.014)
Control Mean       0.437       0.237       0.134       0.524       0.408       0.272
Observations       4,371       4,371       4,371       3,533       4,371       4,371
Notes: Robust standard errors shown in parentheses. All models include fixed effects for college, controls for demographic indicators including race, gender and age, Pell and TAP Grant recipient status, and calculated math and English algorithm values.
Table 5. Effect on college-course outcomes
                            (1)              (2)              (3)
                   Enrolled Any       Passed Any    Total Credits
                 College Course   College Course           Earned
Treatment              0.009***         0.046***         0.599***
                        (0.003)          (0.012)          (0.129)
Control Mean              0.807            0.237            5.170
Observations              4,729            4,729            4,729
Notes: Robust standard errors shown in parentheses. All models include fixed effects for college, controls for demographic indicators including race, gender and age, Pell and TAP Grant recipient status, and calculated math and English algorithm values.
Table 6. Interaction coefficients on college-course outcomes
                                    Math Outcomes                     English Outcomes
                                  CL          CL          CL          CL          CL          CL
                           Placement  Enrollment  Completion   Placement  Enrollment  Completion
Treatment X Black             -0.012       0.000       0.003    0.180***    0.173***       0.055
                             (0.033)     (0.046)     (0.040)     (0.043)     (0.049)     (0.052)
Treatment X Hispanic          -0.007       0.012       0.014    0.134***     0.116**       0.022
                             (0.043)     (0.043)     (0.037)     (0.041)     (0.045)     (0.049)
Observations                   2,568       2,568       2,568       2,081       2,081       2,081
Treatment X Female          0.083***      0.051*      0.045*       0.025      -0.003       0.035
                             (0.031)     (0.030)     (0.026)     (0.030)     (0.033)     (0.035)
Observations                   3,563       3,563       3,563       2,832       2,832       2,832
Treatment X Pell               0.022      -0.009       0.008       0.021     0.066**       0.045
                             (0.032)     (0.031)     (0.026)     (0.030)     (0.033)     (0.036)
Observations                   3,396       3,396       3,396       2,748       2,748       2,748
College FE                       YES         YES         YES         YES         YES         YES
Demographic Indicators           YES         YES         YES         YES         YES         YES
Income indicators                YES         YES         YES         YES         YES         YES
Algorithm Values                 YES         YES         YES         YES         YES         YES
Notes: Robust standard errors shown in parentheses. Each coefficient is calculated using a model that includes fixed effects for colleges and controls for demographic indicators including race, gender and age, proxies for income including Pell and TAP Grant recipient status, and calculated math and English algorithm values.
APPENDIX
Figure A.1. Math algorithm components by college
H.S. GPA
Years Since H.S. Graduation
GED Status
Regents Math Score
SAT Math Score
Arithmetic Score
Algebra Score
College-level math
College 1 X X X X X X
College 2 X X X X X X X X
College 3 X X X X X
College 4 X X X X X
College 5 X X X X
Figure A.2. English algorithm components by college
H.S. GPA
Years Since H.S. Graduation
GED Status
Reading Score
Sentence Skills Score
Essay Score
College 1 X X X X X
College 2 X X X X X X
College 3 X X X X X
College 4 X X X X X
College 5 X X X X
Table A.1. Math predictive models
College 1 – Math                      LPM_1          LPM_2          LPM_3          LPM_4
hs_avg_1                        .0345778***              -    .0275944***    .0302728***
                                 [.0023706]                    [.0025555]     [.0024947]
missing_gpa                     2.821865***              -    2.269872***    2.583032***
                                 [.1947364]                    [.2088917]     [.2098888]
ACPL_algebra_scr_1                        -      .00637***    .0044024***    .0041625***
                                                [.0012656]     [.0012155]     [.0012265]
ACPL_arithmetic_scr_1m                    -         0.0556       0.037867       0.065269
                                                [.0395568]      [.041076]     [.0416692]
ACPL_algebra_scr_1m                       -    .6339479***     .3611804**      .3349449*
                                                [.1408736]      [.136684]     [.1397535]
ACPL_coll_lvl_math_scr_1m                 -       -0.08707     -.0881443~       -0.08396
                                                [.0554991]     [.0513349]     [.0513172]
years_out                                 -              -              -    .0201255***
                                                                              [.0041444]
hs_grad_yr_1m                             -              -              -       -0.05616
                                                                              [.0679753]
GED                                       -              -              -    -.1924288**
                                                                              [.0705082]
missing_hs_diploma                        -              -              -       0.120686
                                                                              [.1002104]
_cons                          -2.336633***       0.037725   -2.047722***   -2.303033***
                                 [.1920042]     [.1224331]     [.2166218]     [.2128129]
N                                      1166           1166           1166           1166
r2                                 0.124598       0.104755       0.176263       0.206776
aic                                1538.419       1568.555        1475.49       1439.478
¹ 100-point scale. ² Binary indicator. ³ Test score range 20-120. ⁴ Test score range 1-8.
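To illustrate how a linear probability model of this kind yields a fitted value for an individual student, the sketch below applies the GPA-only specification. Two assumptions are made here and are not confirmed by the table: that the first coefficient on hs_avg_1 (.0345778), the first coefficient on missing_gpa (2.821865), and the first constant (-2.336633) all belong to LPM_1, and that hs_avg_1 is coded 0 whenever missing_gpa equals 1. The function names are hypothetical.

```python
# Coefficients copied from Table A.1 (College 1, math). Assigning them to
# the LPM_1 column is an assumption based on the order of values in the table.
LPM_1 = {"hs_avg_1": 0.0345778, "missing_gpa": 2.821865, "_cons": -2.336633}


def predict_lpm1(hs_avg=None):
    """Fitted value for one student; hs_avg is on the 100-point scale.

    Assumption: when GPA is missing (hs_avg is None), hs_avg_1 is coded 0
    and the missing_gpa indicator is set to 1.
    """
    missing = hs_avg is None
    return (LPM_1["_cons"]
            + LPM_1["hs_avg_1"] * (0.0 if missing else hs_avg)
            + LPM_1["missing_gpa"] * (1.0 if missing else 0.0))


def clip_probability(fitted):
    """A linear probability model can leave [0, 1]; clip for interpretation."""
    return min(1.0, max(0.0, fitted))


for gpa in (70, 80, 90, 100, None):
    fitted = predict_lpm1(gpa)
    print(gpa, round(fitted, 3), round(clip_probability(fitted), 3))
```

Under these assumptions a student with a high school average of 80 has a fitted value of about 0.43, while a student with a missing GPA receives about 0.49, because the missing-GPA indicator offsets most of the negative constant. Fitted values above 1 (for example at a GPA of 100) are a known feature of linear probability models and are clipped here only for readability.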
Table A.2. English algorithm models
English                               Model 1        Model 2        Model 3        Model 4
HS GPA¹                           .0223682***              -    .0224298***    .0237109***
                                   [.0012347]                    [.0012391]     [.0013036]
Missing GPA²                      1.774132***              -    1.760689***    1.958613***
                                   [.1025804]                    [.1032799]     [.1139943]
Reading³                                    -      .0012387*      .0011471*      0.0005107
                                                  [.0005945]     [.0005718]     [.0005756]
Sentence Skills³                            -      0.0004641     -0.0003542     -0.0003762
                                                  [.0007228]     [.0006933]      [.000688]
Written Essay⁴                              -      0.0000932     -0.0016189     -0.0005364
                                                  [.0021619]     [.0022682]     [.0021195]
Missing Reading²                            -    .3153487***    .3324963***     .2095208**
                                                  [.0729108]     [.0741815]     [.0768058]
Missing Sentence Skills²                    -     -0.0265692     -.1465808*     -.1544521*
                                                   [.077416]     [.0740619]     [.0737526]
Missing Written Essay²                      -      0.0205766       0.008454      0.0165223
                                                  [.0266048]     [.0259509]     [.0254477]
Years since HS graduation                   -              -              -    .0092922***
                                                                                [.0014382]
Missing Year of Graduation²                 -              -              -      0.0414768
                                                                                [.0865134]
GED²                                        -              -              -     -.1901675*
                                                                                [.0825778]
Missing Diploma Type²                       -              -              -        0.03203
                                                                                [.0941508]
High School Rank Percentile                 -              -              -      0.0003502
                                                                                 [.000273]
Missing High School Rank²                   -              -              -     -0.0057774
                                                                                [.0413373]
Constant                         -1.146831***    .4775231***   -1.217838***   -1.300938***
                                    [.100816]     [.0600773]     [.1107201]     [.1176786]
N                                       3,786          3,786          3,786          3,786
R2                                  0.0721164      0.0061075      0.0783223      0.0947421
AIC                                  4893.233       5161.418       4879.827       4823.771
¹ 100-point scale. ² Binary indicator. ³ Test score range 20-120. ⁴ Test score range 1-8.
Table A.3. College Descriptives
Institution                             College 1    College 2    College 3     Onondaga    College 4  Schenectady    College 5
GENERAL COLLEGE INFORMATION
Student Population                          7,001        5,513        7,712       23,984       10,098        8,458       22,093
Full-time Faculty                              69           80          151          194          122           79          215
Part-time Faculty                             170          177            0          480          409            0            2
Student/Faculty Ratio                          20           18           16           23           23           23           16
% Receiving Financial Aid                     92%          91%          92%          92%          56%          92%          70%
DEMOGRAPHICS
Race/ethnicity:
  American Indian or Alaska Native             0%           1%           1%           1%           0%           1%           1%
  Asian                                        1%           2%           1%           3%           5%           7%           4%
  Black or African American                    5%           7%          11%          12%          18%          14%          21%
  Hispanic/Latino                              3%          11%           3%           5%          20%           6%          32%
  Native Hawaiian or Other                     0%           0%           0%           0%           0%           1%           0%
  White                                       85%          73%          80%          49%          39%          67%          33%
  Multi-Ethnic                                 2%           3%           2%           3%           2%           2%           2%
  Race/Ethnicity Unknown                       3%           3%           1%          27%          15%           2%           5%
  Non-Resident Alien                           1%           1%           0%           0%           1%           0%           1%
Gender:
  Female                                      60%          58%          59%          52%          54%          53%          53%
  Male                                        40%          42%          41%          48%          46%          47%          47%
Age:
  Under 18                                    30%          17%          19%          24%          10%          37%           1%
  18-24                                       44%          52%          60%          55%          63%          40%          69%
  25-65                                       26%          31%          21%          21%          26%          23%          30%
  Age Unknown                                  0%           0%           0%           0%           0%           0%           0%
RETENTION/GRADUATION RATES
Retention
  Full-Time Students                          56%          55%          63%          57%          68%          56%          64%
  Part-Time Students                          28%          30%          47%          34%          56%          50%          53%
Three-Year Graduation Rate                    24%          27%          28%          20%          29%          20%          15%
Transfer Out Rate                             18%          19%          18%          22%          19%          22%          18%
Source: U.S. Department of Education, National Center for Education Statistics, IPEDS, Fall 2015, Institutional Characteristics.
Table A.4. Effect on placement into college-level math
                                (1)        (2)        (3)        (4)
Treatment                  0.044***   0.047***   0.049***   0.050***
                            (0.014)    (0.014)    (0.014)    (0.014)
Control Mean                  0.437
College FE                      YES        YES        YES        YES
Demographic Indicators           NO        YES        YES        YES
Income indicators                NO         NO        YES        YES
Algorithm Values                 NO         NO         NO        YES
Observations                  4,371      4,371      4,371      4,371
Notes: Robust standard errors shown in parentheses. The model in column (1) includes only fixed effects for college and no additional controls. Column (2) includes fixed effects for colleges and controls for demographic indicators including race, gender and age. Column (3) includes college fixed effects, controls for demographic indicators and proxies for income including Pell and TAP Grant recipient status. Column (4) includes all the previous controls plus calculated math and English algorithm values. *** p<0.01, ** p<0.05, * p<0.1
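The significance stars in these tables can be approximately recovered from a coefficient and its standard error. The sketch below uses a two-sided test under a standard normal approximation, which is an assumption made here: the reported stars were presumably computed from unrounded estimates and may rest on a t distribution, so results near a threshold can differ. The function names are hypothetical.

```python
import math


def two_sided_p(coef, se):
    """Two-sided p-value for H0: coefficient = 0, normal approximation."""
    z = abs(coef / se)
    # For a standard normal Z, P(|Z| > z) = erfc(z / sqrt(2)).
    return math.erfc(z / math.sqrt(2))


def stars(p):
    """Map a p-value to the convention *** p<0.01, ** p<0.05, * p<0.1."""
    if p < 0.01:
        return "***"
    if p < 0.05:
        return "**"
    if p < 0.1:
        return "*"
    return ""


# Column (1) of Table A.4: coefficient 0.044, robust standard error 0.014.
p = two_sided_p(0.044, 0.014)
print(f"z = {0.044 / 0.014:.2f}, p = {p:.4f}, stars = {stars(p)!r}")
```

For column (1) of Table A.4 the ratio is about 3.1, which clears the 1% threshold and is consistent with the three stars shown.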
Table A.5. Effect on enrolling in college-level math in term 1
                                (1)        (2)        (3)        (4)
Treatment                  0.039***   0.044***   0.046***   0.046***
                            (0.012)    (0.012)    (0.012)    (0.012)
Control Mean                  0.237
College FE                      YES        YES        YES        YES
Demographic Indicators           NO        YES        YES        YES
Income indicators                NO         NO        YES        YES
Algorithm Values                 NO         NO         NO        YES
Observations                  4,371      4,371      4,371      4,371
Notes: Robust standard errors shown in parentheses. The model in column (1) includes only fixed effects for college and no additional controls. Column (2) includes fixed effects for colleges and controls for demographic indicators including race, gender and age. Column (3) includes college fixed effects, controls for demographic indicators and proxies for income including Pell and TAP Grant recipient status. Column (4) includes all the previous controls plus calculated math and English algorithm values. *** p<0.01, ** p<0.05, * p<0.1
Table A.6. Effect on enrolling and passing college-level math in term 1
                                (1)        (2)        (3)        (4)
Treatment                   0.025**   0.027***   0.028***   0.028***
                            (0.010)    (0.010)    (0.010)    (0.010)
Control Mean                  0.134
College FE                      YES        YES        YES        YES
Demographic Indicators           NO        YES        YES        YES
Income indicators                NO         NO        YES        YES
Algorithm Values                 NO         NO         NO        YES
Observations                  4,371      4,371      4,371      4,371
Notes: Robust standard errors shown in parentheses. The model in column (1) includes only fixed effects for college and no additional controls. Column (2) includes fixed effects for colleges and controls for demographic indicators including race, gender and age. Column (3) includes college fixed effects, controls for demographic indicators and proxies for income including Pell and TAP Grant recipient status. Column (4) includes all the previous controls plus calculated math and English algorithm values. *** p<0.01, ** p<0.05, * p<0.1
Table A.7. Effect on placement into college-level English
                                (1)        (2)        (3)        (4)
Treatment                  0.298***   0.301***   0.302***   0.304***
                            (0.015)    (0.015)    (0.014)    (0.014)
Control Mean                  0.524
College FE                      YES        YES        YES        YES
Demographic Indicators           NO        YES        YES        YES
Income indicators                NO         NO        YES        YES
Algorithm Values                 NO         NO         NO        YES
Observations                  3,533      3,533      3,533      3,533
Notes: Robust standard errors shown in parentheses. The model in column (1) includes only fixed effects for college and no additional controls. Column (2) includes fixed effects for colleges and controls for demographic indicators including race, gender and age. Column (3) includes college fixed effects, controls for demographic indicators and proxies for income including Pell and TAP Grant recipient status. Column (4) includes all the previous controls plus calculated math and English algorithm values. *** p<0.01, ** p<0.05, * p<0.1
Table A.8. Effect on enrolling in college-level English in term 1
                                (1)        (2)        (3)        (4)
Treatment                  0.183***   0.192***   0.193***   0.193***
                            (0.016)    (0.014)    (0.014)    (0.014)
Control Mean                  0.408
College FE                      YES        YES        YES        YES
Demographic Indicators           NO        YES        YES        YES
Income indicators                NO         NO        YES        YES
Algorithm Values                 NO         NO         NO        YES
Observations                  3,533      3,533      3,533      3,533
Notes: Robust standard errors shown in parentheses. The model in column (1) includes only fixed effects for college and no additional controls. Column (2) includes fixed effects for colleges and controls for demographic indicators including race, gender and age. Column (3) includes college fixed effects, controls for demographic indicators and proxies for income including Pell and TAP Grant recipient status. Column (4) includes all the previous controls plus calculated math and English algorithm values. *** p<0.01, ** p<0.05, * p<0.1
Table A.9. Effect on enrolling and passing college-level English in term 1
                                (1)        (2)        (3)        (4)
Treatment                  0.118***   0.124***   0.125***   0.125***
                            (0.016)    (0.014)    (0.014)    (0.014)
Control Mean                  0.272
College FE                      YES        YES        YES        YES
Demographic Indicators           NO        YES        YES        YES
Income indicators                NO         NO        YES        YES
Algorithm Values                 NO         NO         NO        YES
Observations                  3,533      3,533      3,533      3,533
Notes: Robust standard errors shown in parentheses. The model in column (1) includes only fixed effects for college and no additional controls. Column (2) includes fixed effects for colleges and controls for demographic indicators including race, gender and age. Column (3) includes college fixed effects, controls for demographic indicators and proxies for income including Pell and TAP Grant recipient status. Column (4) includes all the previous controls plus calculated math and English algorithm values. *** p<0.01, ** p<0.05, * p<0.1
Table A.10. Effect on enrolling in any college-level course in term 1
                                (1)        (2)        (3)        (4)
Treatment                     0.001   0.042***   0.009***   0.009***
                            (0.011)    (0.011)    (0.003)    (0.003)
Control Mean                  0.807      0.616
College FE                      YES        YES        YES        YES
Demographic Indicators           NO        YES        YES        YES
Income indicators                NO         NO        YES        YES
Algorithm Values                 NO         NO         NO        YES
Observations                  4,729      4,729      4,729      4,729
Notes: Robust standard errors shown in parentheses. The model in column (1) includes only fixed effects for college and no additional controls. Column (2) includes fixed effects for colleges and controls for demographic indicators including race, gender and age. Column (3) includes college fixed effects, controls for demographic indicators and proxies for income including Pell and TAP Grant recipient status. Column (4) includes all the previous controls plus calculated math and English algorithm values. *** p<0.01, ** p<0.05, * p<0.1
Table A.11. Effect on enrolling in and passing any college-level course in term 1
                                (1)        (2)        (3)        (4)
Treatment                   0.034**   0.041***   0.042***   0.042***
                            (0.014)    (0.011)    (0.011)    (0.011)
Control Mean                  0.616
College FE                      YES        YES        YES        YES
Demographic Indicators           NO        YES        YES        YES
Income indicators                NO         NO        YES        YES
Algorithm Values                 NO         NO         NO        YES
Observations                  4,729      4,729      4,729      4,729
Notes: Robust standard errors shown in parentheses. The model in column (1) includes only fixed effects for college and no additional controls. Column (2) includes fixed effects for colleges and controls for demographic indicators including race, gender and age. Column (3) includes college fixed effects, controls for demographic indicators and proxies for income including Pell and TAP Grant recipient status. Column (4) includes all the previous controls plus calculated math and English algorithm values. *** p<0.01, ** p<0.05, * p<0.1
Table A.12. Effect on cumulative college-level credits earned in term 1
                                (1)        (2)        (3)        (4)
Treatment                  0.503***   0.572***   0.593***   0.599***
                            (0.150)    (0.130)    (0.129)    (0.129)
Control Mean                  5.170
College FE                      YES        YES        YES        YES
Demographic Indicators           NO        YES        YES        YES
Income indicators                NO         NO        YES        YES
Algorithm Values                 NO         NO         NO        YES
Observations                  4,729      4,729      4,729      4,729
Notes: Robust standard errors shown in parentheses. The model in column (1) includes only fixed effects for college and no additional controls. Column (2) includes fixed effects for colleges and controls for demographic indicators including race, gender and age. Column (3) includes college fixed effects, controls for demographic indicators and proxies for income including Pell and TAP Grant recipient status. Column (4) includes all the previous controls plus calculated math and English algorithm values. *** p<0.01, ** p<0.05, * p<0.1
Table A.13. Costs for Implementation of Placement Algorithms
Ingredient                      Price per FTE  College 1  College 2  College 3  College 4  College 5      Total
Personnel (FTEs):
  IT a                                $56,230       0.30       0.18       0.05       0.10       0.18       0.80
  Program b                           $47,500       0.87       0.15       0.14       0.91       0.44       2.50
  Senior/Faculty c                    $62,500       0.25       0.25       0.25       0.07       0.16       0.98
  Administrative support d            $35,950       0.05       0.05       0.05       0.05       0.05       0.25
  Evaluator time e                    $56,230       0.10       0.10       0.10       0.10       0.10       0.50
Total Personnel Costs                            $81,000    $40,180    $32,470    $60,160    $48,600   $262,410
Fringe benefits f                                $26,730    $13,260    $10,720    $19,850    $16,040    $86,600
Overheads / facilities g                         $78,570    $38,970    $31,500    $58,360    $47,140   $254,540
Administer Placement Test h                      $82,590    $21,540    $31,260    $12,930    $10,770   $159,090
Total Cost                                      $268,890   $113,950   $105,950   $151,300   $122,550   $762,640
Total Cost over status quo                      $186,300    $92,410    $74,690   $138,370   $111,780   $603,550
Students per semester                              2,753        718      1,042        431        359      5,303
Average Cost over status quo                         $70       $130        $70       $320       $310       $110
Notes: 2016 dollars. Present values (discount rate=3%). Rounded to $10. Ingredients information on FTEs from interviews with key personnel at five colleges (no information was available from Onondaga CC).
a Salary data from https://www.cs.ny.gov/businesssuite/Compensation/Salary-Schedules/index.cfm?nu=PST&effdt=04/01/2015&archive=1&fullScreen.
b Annual salary (step 4, grade 13) from https://www.suny.edu/media/suny/content-assets/documents/hr/UUP_2011-2017_ProfessionalSalarySchedule.pdf.
c Midpoint MP-IV https://www.suny.edu/hr/compensation/salary/mc-salary-schedule/
d https://www.cs.ny.gov/businesssuite/Compensation/Salary-Schedules/index.cfm?nu=CSA&effdt=04/01/2015&archive=1&fullScreen.
e Estimated from timesheets (CCRC/MDRC evaluators).
f Uprated from ratio of fringe benefits to total salaries (IPEDS data, 2013, 846 public community colleges).
g Uprated from ratio of all other expenses to total salaries (IPEDS data, 2013, 846 public community colleges).
h Cost to administer placement test from Rodriguez et al. (2014).
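The bottom rows of Table A.13 can be reproduced from the rows above them. The sketch below assumes, based on the arithmetic in the table, that the status quo cost is the cost of administering the placement test, so that cost over status quo equals total cost minus that row, and that the per-student figure is rounded to the nearest $10. The dictionary and function names are illustrative.

```python
# Figures copied from Table A.13 (2016 dollars, already rounded to $10).
total_cost = {"College 1": 268890, "College 2": 113950, "College 3": 105950,
              "College 4": 151300, "College 5": 122550}
placement_test_cost = {"College 1": 82590, "College 2": 21540,
                       "College 3": 31260, "College 4": 12930,
                       "College 5": 10770}
students_per_semester = {"College 1": 2753, "College 2": 718,
                         "College 3": 1042, "College 4": 431,
                         "College 5": 359}


def round_to_10(x):
    """Round a dollar amount to the nearest $10, as in the table notes."""
    return int(round(x / 10.0)) * 10


def cost_over_status_quo(college):
    # Assumption: the status quo is administering the placement test only.
    return total_cost[college] - placement_test_cost[college]


def average_cost_over_status_quo(college):
    return round_to_10(cost_over_status_quo(college)
                       / students_per_semester[college])


for college in total_cost:
    print(college, cost_over_status_quo(college),
          average_cost_over_status_quo(college))

# Pooled figure across the five colleges.
pooled = round_to_10(sum(cost_over_status_quo(c) for c in total_cost)
                     / sum(students_per_semester.values()))
print("Total", pooled)
```

The per-student cost is driven largely by scale: the colleges with the fewest students per semester (College 4 and College 5) show the highest average cost over the status quo.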
Table A.14. Operating Costs for Placement Algorithms
Ingredient                                Price per FTE  College 1  College 2  College 3  College 4  College 5      Total
Personnel (FTEs):
  IT a                                          $56,230       0.03       0.02       0.01       0.01       0.02       0.08
  Program b                                     $47,500       0.87       0.15       0.14       0.91       0.44       2.50
  Senior/Faculty c                              $62,500       0.03       0.03       0.03       0.01       0.02       0.10
  Administration d                              $35,950       0.01       0.01       0.01       0.01       0.01       0.03
Total Personnel Costs                                      $44,520    $10,020     $8,640    $44,140    $23,150   $130,470
Fringe benefits e                                          $14,690     $3,310     $2,850    $14,570     $7,640    $43,060
Overheads / facilities f                                   $14,250     $3,210     $2,760    $14,130     $7,410    $41,770
Administer Placement Test g                                $82,590    $21,540    $31,260    $12,930    $10,770   $159,090
Total Operating Cost (TOC)*                               $156,050    $38,080    $45,510    $85,770    $48,970   $374,390
TOC over status quo                                        $73,460    $16,540    $14,250    $72,840    $38,200   $215,300
Students per semester                                        2,753        718      1,042        431        359      5,303
Average Operating Cost over status quo                         $30        $20        $10       $170       $110        $40
Notes: 2016 dollars. Present values (d=3%). Rounded to $10. Ingredients information on FTEs from interviews with key personnel at five colleges (no information was available from Onondaga CC).
a Salary data from online here.
b Annual salary (step 4, grade 13) from online here.
c Midpoint MP-IV.
e Estimated from timesheets (CCRC/MDRC evaluators).
f Uprated from ratio of fringe benefits to total salaries (IPEDS data, 2013, 846 public community colleges).
g Uprated from ratio of all other expenses to total salaries (IPEDS data, 2013, 846 public community colleges).
h Cost to administer placement test from Rodriguez et al. (2014).
* Operating Cost refers to running of new placement system after initial algorithm has been developed and tested.