Using Predictive Analytics to Track Students: Evidence from a Multi-College Experiment∗

Elisabeth Barnett,† Peter Bergman,‡ Elizabeth Kopko,† Vikash Reddy†

EARLY DRAFT VERSION: NOT FOR DISTRIBUTION

Tracking is widespread in U.S. education. In higher education alone, at least 71% of post-secondary institutions use a test to track students, and more than 80% of these institutions use a test as the sole criterion to determine placement. While recent research has shown that tracking can have positive effects on student learning, inaccurate placement has consequences: students face misaligned curricula and must pay tuition for remedial courses that do not bear credits toward graduation. We develop an algorithm to place students that combines multiple measures with predictive analytics. We then conduct an experiment across multiple colleges to evaluate its impact. Compared to colleges’ most commonly-used placement test, the algorithm is more predictive of future performance and substantially increases placements into college-level courses. This is particularly true for English courses and for female, Black and Hispanic students. The algorithm tends to predict pass rates more accurately in math than English.

∗ The research reported here was undertaken through the Center for the Analysis of Postsecondary Readiness and supported by the Institute of Education Sciences, U.S. Department of Education, through Grant R305C140007 to Teachers College, Columbia University. The opinions expressed are those of the authors and do not represent views of the Institute or the U.S. Department of Education. † Community College Research Center. ‡ Teachers College, Columbia University. Email address: [email protected].

1. Introduction

Tracking students by prior test scores is widespread in U.S. education. In higher

education alone, 60% of incoming college freshmen take a remedial course in math or

English (National Center for Public Policy and Higher Education & Southern Regional

Education Board, 2010), at least 71% of post-secondary institutions use a test to track

students, and more than 80% of these institutions use a test as the sole criterion to

determine placement (Fields & Parsad, 2012). These rates are higher in two-year colleges,

which enroll 40% of post-secondary students and graduate 39% of their students (Fields &

Parsad, 2012; Chen & Simone, 2016).1 Prior research has shown large potential benefits of

tracking (Card & Giuliano, 2016; Duflo, Dupas, & Kremer, 2011), but inaccurate

placement has consequences: students face misaligned curricula and must pay tuition for

remedial courses that do not bear credits toward graduation. The benefits of tracking are

predicated on the validity, reliability, and fairness of the tests used to place students

(Heubert & Hauser, 1999).2

In practice, however, the tests used for tracking may lack these characteristics. For

example, using data from the University of California system, Rothstein (2004) finds that

a significant part of the SAT’s predictive value derives from its correlations with student

demographics, and that GPA is an important additional predictor of future college

performance. In the community college setting, Scott-Clayton et al. (2014) find that

placement scores explain little of the variation in college performance, and this validity

1 This is the graduation rate within six years.

2 Validity examines how well the test measures what its users intend it to measure, such as a student’s mastery of a subject area. Reliability assesses the reproducibility of the test’s results; that is, if a student were to take the test on multiple occasions, then they would receive similar scores each time. Fairness, which is closely related to validity, is the ability of the test to provide valid results for all subgroups. For instance, a test may inaccurately measure the math skills of a certain subgroup by relying on culturally-specific language in a word problem. See Heubert & Hauser (1999).

varies by subgroup. This could impede the ability of colleges to align curricula to

students’ abilities. Scott-Clayton et al. show how using multiple student characteristics,

especially high school GPA, more accurately predicts college performance than test scores

alone. These findings contribute to a concern from colleges about the fairness of using a

test score as the sole criterion for placement (Saxon & Morante, 2014). Given that

placement tests aim to predict students’ readiness for college-level courses, using multiple

student characteristics, such as high school GPA, to predict course performance could

mitigate concerns about validity and fairness (Mullainathan & Spiess, 2017).

In this paper, we develop and test a placement algorithm that combines multiple

measures and predictive analytics. To do so, we recruited seven community colleges and

gathered historical data on their students to estimate models predicting students’

likelihood of passing college-level math and English courses. These predictions

incorporated measures such as placement-exam scores, high school GPA, diploma status,

and time since high school graduation. We created college-specific placement algorithms

for math and English that placed students into a remedial course if a student’s predicted

probability of passing a college-level course was below a cut point chosen by each college.

Students were randomly assigned to either colleges’ business-as-usual placement system or

the placement algorithm.

At the outset, it is unclear what impacts this algorithm will have on placement

outcomes. Measures such as high school GPA can reflect a wider array of cognitive and

non-cognitive skills than test scores alone (Kautz, et al., 2014; Kautz & Zanoni, 2014;

Borghans, Golsteyn, & Heckman, 2016). GPA also has a high degree of reliability (Bacon

& Bean, 2006). Improving the validity and the reliability of the placement instrument

could help colleges align course curricula to students’ abilities. If colleges hold placement

rates constant, this should improve students’ course outcomes. However, a higher-quality

placement instrument does not imply that placement rates will change either on net or for

a given individual. At particular cut points (e.g. the extremes) it is possible that the

placements assigned by the algorithm and the test score will be the same. Furthermore,

colleges can choose the cut points for placement into the college-level courses, which

affects the number of students placed into these courses and their expected pass rates

conditional on placement.

This paper presents our findings on how five of the seven participating colleges

implemented the placement algorithm, how it affected students’ placement outcomes, and

what impacts this had on first-semester college performance.3 We find that colleges

generally chose cut points to either hold placement rates constant or to hold pass rates

constant. Overall, placement rates changed significantly: relative to colleges’ business as

usual, 21% of math placements changed and 48% of English placements changed. Of all

students, the algorithm placed 14% into a higher-level math class and 7% into a lower-

level math course compared to what would have occurred under the placement test. These

numbers were 41% and 7% for English placements.

These changes led to increases in enrollment and pass rates in college-level courses.

Placement via the algorithm increased enrollment in college-level math and English

courses by 5 percentage points and 19 percentage points, respectively. Treatment group

students were also 3 percentage points more likely to pass college-level math and 13

percentage points more likely to pass college-level English.

We also find evidence that the placement algorithm significantly narrowed certain

demographic gaps in placement rates. Even after controlling for multiple testing,

placement by the algorithm significantly increased the representation of female students

in college-level math courses by 8 percentage points, the representation of Black students

3 The remaining two colleges started student intake after Fall 2016, so their data will not be available until 2019.

in college-level English placement by 18 percentage points (relative to white students),

and the representation of Hispanic students in college-level English placement by 13

percentage points (relative to white students).

Lastly, the algorithm more accurately predicts pass rates in math courses than English

courses. Actual pass rates were 2 percentage points lower than predicted in math and 7

percentage points lower than predicted in English. This pattern matches our findings from

the development of the algorithm: our measures of model fit were better for predicting

passing math grades than passing English grades. With richer high school transcript

information and fewer implementation constraints, which restricted the set of models we

could use, it is possible this accuracy could be improved.

Our research contributes to a broader literature that focuses on the effects of tracking.

Oakes (1985) argued that the evidence on tracking is inconsistent, and, in practice,

higher-track classes tend to have higher-quality classroom experiences than lower-track

classes. More recently, Duflo, Dupas, & Kremer (2011) randomized students in Kenya to

schools that either tracked students by test scores or assigned students randomly to

classrooms. They found that test scores in schools with tracking improved relative to the

control group, both for students placed in the higher-scoring and the lower-scoring tracks.

Card and Giuliano (2016) studied a district policy in which students are placed into

classrooms based on their test scores. This program caused large increases in the test

scores of Black and Hispanic students.

A number of studies look at the effects of being placed into a higher track versus a

lower track (as opposed to the effects of implementing a tracking system). Bui, Craig, and

Imberman (2014) and Card and Giuliano (2014) find that gifted students’ placement into

advanced coursework does not change test scores. However, Cohodes (2015) and Chan

(2018) find increases in enrollment in advanced high-school coursework and college.4 In

higher education, the evidence that placement into college-level courses improves

academic outcomes for marginal students is more mixed, and several regression-

discontinuity analyses find no effects (Calcagno & Long 2008; Bettinger & Long 2009;

Boatman & Long 2010; Martorell and McFarlin 2011; Allen & Dadgar 2012; Hodara 2012;

Scott-Clayton & Rodriguez, 2015).

Recently, economists have argued that data-driven algorithms can improve human

decision making and reduce biases (Mullainathan & Spiess, 2017). Kleinberg et al. (2017)

show that a machine-learning algorithm has the potential to reduce bias in bail decisions

relative to judges’ decisions alone. At the same time, others are concerned that these

algorithms could embed biases into decision making and exacerbate inequalities (Eubanks,

2018). We contribute to this literature by experimentally comparing the impacts of a

data-driven algorithm to another quantitative measure, test scores; we find that

placement by the algorithm tends to reduce certain gaps in college placement outcomes

across demographic subgroups.

To the extent that colleges want to implement such a placement algorithm, any

potential benefits should be weighed against its costs. We estimate that the cost per

student in the initial year of the study—above and beyond the business as usual—is $110.

The first year of implementation involves a number of fixed costs, and in subsequent years

we estimate additional operating costs of $40 per student, which could be

significantly lower depending on how colleges collect historical data from students (e.g.

moving away from hand-data entry) and what technology they use to implement the

placement system. Colleges asked whether further savings could arise by not paying to use

4 Several other studies look at the effects of placing into high-test score schools and the results are much more mixed (Jackson 2010; Pop-Eleches and Urquiola 2013; Abdulkadiroglu, Angrist, and Pathak 2014; Dobbie and Fryer 2014).

copyrighted exams. We examine the extent to which the algorithm would place students

differently if test scores were not used for prediction. We find that placement rates would

change substantially more for math courses than for English courses, where only 5%-8% of

placements would change.

The rest of this paper proceeds as follows. Section 2 provides further background

information about tracking in postsecondary institutions and study implementation.

Section 3 describes the experimental design, data and empirical strategies. Section 4

presents our findings. Section 5 provides a detailed cost analysis, and Section 6 concludes.

2. Background, Site Recruitment, Algorithm Implementation

Remedial education represents a significant aspect of the public higher education system,

both in terms of enrollment and cost. In the 2011-12 academic year, 41% of first- and

second-year students at four-year institutions had taken a remedial course, while at two-

year institutions, even more—68% of students—had taken a remedial course (Chen, 2016).

The cost of remedial education has been estimated to be as much as $2.9 billion annually

(Strong American Schools, 2010).

The primary purpose of remedial education is to provide differentiated instruction to

under-prepared students so they have the skills to succeed in college-level coursework

(Bettinger & Long, 2009). However, there is evidence that community-colleges’ tracking

systems frequently underplace students—track them into remedial courses when they

could have succeeded in college-level courses—and overplace students—track them into

college-level courses when they were unlikely to be successful (Belfield & Crosta, 2012;

Scott-Clayton, 2012).

Most institutions administer a multiple-choice test in mathematics, reading, and

writing to determine whether incoming students should be placed into remedial or college-

level courses. The ACCUPLACER, a computer-adaptive test offered by the College

Board, is the most widely used college placement system in the U.S. (College Board,

2015). Colleges choose a cut score for each test, and place students scoring above this

score into college-level courses and students below the cut score into remedial courses.5

Given the placement rules and immediate test results provided by the ACCUPLACER

platform, students often learn their placement immediately after completing their exam.

Site Selection and Descriptions

All of the participating colleges are part of the State University of New York (SUNY)

system. Table A.3 of the Appendix provides an overview of each college’s characteristics

using public data. The smallest of the colleges serves roughly 5,500 students while the

largest serves over 22,000 students. As is common in community college settings, a large

share of the students is part-time and many are adult learners, with between 21% and

30% of students over the age of 25. For most of the colleges, the majority of students

receive financial aid. The colleges have similar transfer-out rates of between 18% and 22%

and three-year graduation rates are between 15% and 29%. The colleges also tend to serve

local student populations. Lastly, all of the colleges have an open-door admissions policy.

This means that the colleges do not have admission requirements beyond having

graduated from high school or earned a GED.

Creating the Placement Algorithm

Colleges preferred that we develop college-specific algorithms. We created separate

algorithms for each college using data on each college’s previous cohorts of students.

Five colleges in the study had been using ACCUPLACER for several years. One

college had been using ACCUPLACER assessments for English but had transitioned from

5 Certain colleges may offer exemptions from testing; for instance, this can occur for students who speak English as a second language or who have high SAT scores.

a home-grown math assessment to the ACCUPLACER math assessments too recently to

generate historical data; at this college we tested an algorithm for English placement only.

One college in our sample had been using the COMPASS exam, which was discontinued

by ACT shortly after this study began. At this college, we tested an algorithm that does

not use any placement test scores against a placement system that incorporates only

ACCUPLACER test results, and only for math placement.

We worked with administrators at each college to obtain the data needed to estimate

each algorithm. In some instances, these measures were stored in college databases. In

other instances, colleges maintained records of high school transcripts as digital images; in

these cases, we had the data entered into databases by hand.

In order to estimate the relationships between predictors in the dataset and

performance in initial college-level courses, we restricted the historical data to students

who took placement tests and who enrolled in a college level course without first taking a

remedial course. Importantly, the latter group of students was selected into college-level

courses based on observable variables. This set of students constituted our estimation

sample for developing the algorithm.

We aimed to predict “success” in future coursework for each student. We met with

colleges to decide how to define success, and they agreed to define it as a grade of C or

better in the initial college-level course associated with the placement decision. We then

regressed an indicator for success in a college-level course on various sets of predictors

using Probit and linear probability models (LPM). We used the results of the LPMs

because we could not implement non-linear models into colleges’ existing placement

software. Nonetheless, the non-linear models yielded placement decisions similar to the LPMs,

especially around the relevant cut points that colleges chose to determine placements.

For each college, we estimated regressions relating placement test scores or high school

GPA to “success” in initial college-level classes for a given subject. We added additional

information from high school transcripts where such information was available. This

information included the number of years that have passed since high school completion

and whether the diploma was a standard high school diploma or a GED (diploma status).

We also tested the inclusion of additional variables such as SAT scores, ACT scores, high

school rank, indicators for high school attended, and scores on the New York Regents

exams where they were available (often these were missing), as well as interaction terms

and higher-order terms for variables. When variables were missing, we imputed values and

added indicators for missing. Identical procedures were followed for both English and

Math.

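[1] $\mathbf{1}(\text{C or Better})_i = \alpha + (\text{HS GPA}_i)\beta_1 + \varepsilon_i$

[2] $\mathbf{1}(\text{C or Better})_i = \alpha + (\text{ACCUPLACER}_i)\beta_1 + \varepsilon_i$

[3] $\mathbf{1}(\text{C or Better})_i = \alpha + (\text{HS GPA}_i)\beta_1 + (\text{ACCUPLACER}_i)\beta_2 + \varepsilon_i$

[4] $\mathbf{1}(\text{C or Better})_i = \alpha + (\text{HS GPA}_i)\beta_1 + (\text{ACCUPLACER}_i)\beta_2 + X_i\beta_3 + \varepsilon_i$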
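
As a rough illustration of how a linear probability model of this form can be estimated, the following Python sketch fits ordinary least squares on synthetic data. Everything in it is an assumption made for illustration: the data are simulated, the function names (`fit_lpm`, `impute_with_indicator`) are hypothetical, and mean imputation is only one possible way to impute missing values, since the text does not say which method was used.

```python
import random

def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):  # back substitution
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x

def fit_lpm(rows, outcomes):
    """OLS via the normal equations; returns [alpha, beta_1, beta_2, ...]."""
    design = [[1.0] + list(r) for r in rows]  # prepend the intercept column
    k = len(design[0])
    xtx = [[sum(d[i] * d[j] for d in design) for j in range(k)] for i in range(k)]
    xty = [sum(d[i] * y for d, y in zip(design, outcomes)) for i in range(k)]
    return solve_linear_system(xtx, xty)

def impute_with_indicator(values):
    """Replace None with the observed mean (an assumed imputation rule)
    and return the filled values plus a 0/1 missing-value indicator."""
    observed = [v for v in values if v is not None]
    mean = sum(observed) / len(observed)
    filled = [mean if v is None else v for v in values]
    flags = [1.0 if v is None else 0.0 for v in values]
    return filled, flags

# Synthetic students: HS GPA, placement score, and a simulated pass outcome.
rng = random.Random(0)
true_gpa = [rng.uniform(1.5, 4.0) for _ in range(500)]
score = [rng.uniform(20, 120) for _ in range(500)]
passed = [1.0 if rng.random() < 0.1 + 0.15 * g + 0.002 * s else 0.0
          for g, s in zip(true_gpa, score)]

# Pretend every tenth GPA is missing from the records, then impute it.
recorded_gpa = [None if i % 10 == 0 else g for i, g in enumerate(true_gpa)]
gpa_filled, gpa_missing = impute_with_indicator(recorded_gpa)

coefs = fit_lpm(list(zip(gpa_filled, gpa_missing, score)), passed)
print("intercept, HS GPA, GPA-missing flag, placement score:", coefs)
```

The normal equations are solved by hand here only to keep the sketch dependency-free; a statistics package would normally be used in practice.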

The focus of this analysis was the overall predictive power of the model. As such, we

calculated the Akaike Information Criterion (AIC) statistics for each model. The AIC is a

penalized-model-fit criterion that combines a model’s log-likelihood with the number of

parameters included in a model (Akaike, 1998; Burnham, Anderson, & Burnham, 2002;

Mazerolle, 2004).6 In practice, we did not have many variables to select from and higher-

order and interaction terms had little effect on prediction criteria (and additional

complexity was difficult to implement).
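
To make the model-comparison step concrete, the sketch below computes the AIC from a log-likelihood and a parameter count, using the standard definition AIC = 2k - 2 ln L. The Gaussian log-likelihood helper and the residual values are assumptions for illustration only; the text does not state which likelihood was used to compute the reported statistics.

```python
import math

def gaussian_log_likelihood(residuals):
    """Maximized Gaussian log-likelihood for a linear model, using the
    maximum-likelihood variance estimate RSS / n."""
    n = len(residuals)
    sigma2 = sum(e * e for e in residuals) / n
    return -0.5 * n * (math.log(2 * math.pi) + math.log(sigma2) + 1)

def aic(log_likelihood, n_params):
    """Akaike Information Criterion: lower values indicate a better
    trade-off between fit and the number of parameters."""
    return 2 * n_params - 2 * log_likelihood

# Hypothetical residuals from a smaller and a larger model on the same data.
resid_small = [0.50, -0.40, 0.60, -0.50, 0.40, -0.60]
resid_large = [0.45, -0.40, 0.55, -0.50, 0.40, -0.55]

aic_small = aic(gaussian_log_likelihood(resid_small), n_params=2)
aic_large = aic(gaussian_log_likelihood(resid_large), n_params=4)
print("AIC (small model):", aic_small)
print("AIC (large model):", aic_large)
```

The larger model always fits at least as well in sample, so the comparison turns on whether the improvement in log-likelihood outweighs the penalty of two per added parameter.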

6 The AIC is asymptotically equivalent to leave-one-out-cross-validation for normally distributed error terms (Stone, 1977).

Figures A.1 and A.2, located in the Appendix, list the full set of variables used by

each college to calculate students’ math and English algorithm scores, respectively. Tables

A.1 and A.2 in the Appendix show typical examples of our regression results for math and

English. Across colleges, explanatory power was much higher for math course grades than

for English course grades. Placement scores typically explained less than 1% of the

variation in passing grades for English. Test scores were better predictors for passing

math grades, explaining roughly 10% of the variation. Adding high school grades typically

explained an additional 10% of the variation in both subjects. Thus, combining multiple

measures with predictive analytics is no panacea for predicting future grades, but it does

significantly improve validity of the placement instrument relative to test scores alone.

Setting cut probabilities: After we selected the final models, we used the coefficients

from the regression to simulate placement rates for each college. Consider the following

simplified example where a placement test score (R) and high school GPA (G) are used to

predict success in college-level math (Y), defined as earning a grade of C or better. The

regression coefficients combined with data on R and G can then predict the probability of

earning a C or better in college-level math for incoming students ($\hat{Y}$). A set of decision

rules must then be determined based on these predicted probabilities. A hypothetical

decision rule would be:

$$\text{Placement}_i = \begin{cases} \text{College Level} & \text{if } \hat{Y}_i \geq 0.6 \\ \text{Remedial} & \text{if } \hat{Y}_i < 0.6 \end{cases}$$
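
The hypothetical decision rule above can be sketched in a few lines of Python. The coefficient values below are invented for illustration and are not estimates from any college's model; only the 0.6 cut point comes from the example in the text.

```python
def predicted_pass_probability(test_score, hs_gpa,
                               alpha=-0.30, beta_r=0.003, beta_g=0.20):
    """Linear prediction of P(C or better) from a placement test score (R)
    and high school GPA (G). The default coefficients are made up."""
    return alpha + beta_r * test_score + beta_g * hs_gpa

def place(prob, cut=0.6):
    """Apply the decision rule: college level at or above the cut point."""
    return "College Level" if prob >= cut else "Remedial"

# Two hypothetical incoming students.
for score, gpa in [(80, 3.5), (60, 2.5)]:
    p = predicted_pass_probability(score, gpa)
    print(score, gpa, round(p, 2), place(p))
```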

For each college, we generated spreadsheets projecting the share of students that

would place into college-level coursework at any given cut-point as well as the share of

those students we would anticipate earning a C or better. These spreadsheets were given

to colleges so that faculty in the pertinent departments could set cut-points for students

entering their programs.

Figure 1 shows a hypothetical example of one such spreadsheet provided to colleges.

The top panel shows math placement statistics and the bottom panel shows statistics for

English. The highlighted row shows the status quo at the college and the percent of tested

students placed into college level is shown in the second column. For instance, for math,

the status quo placement rate is 30%. The third column shows the pass or success rate,

which is a grade “C” or better in the first college-level course in the relevant subject. In

this example, the status-quo pass rate for math is 50% conditional on placement into the

college-level math course.

Below the highlighted row, we show what would happen to placement and pass rates

at different cut points for placement. The first column shows these cut points (“Minimum

probability of success”). For instance, for math, the first cut point we show is 45%, which

implies that for a student to be placed into college-level math under the algorithm, the

student must have a predicted probability of receiving a “C” or better in the gate-keeper

math course of at least 45%. If this 45% cut point is used, columns two and three show

what would happen to the share of students placed into college-level math under the

algorithm (column two) and what would happen to the share who would pass this course

conditional on placement (column three). In this example, for math, if the 45% cut point

is used, the algorithm would place 40% of students into college-level math and we

anticipate 60% of those students would pass. The cut point differs from the expected pass

rate because the cut point is the lowest probability of passing for a given student; the cut

point implies that every student must have that probability of passing or higher. For

instance, if the cut point is 40%, then every placed student has a 40% chance or greater of passing

the college-level course—most students will have above a 40% chance.
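
The relationship between a cut point, the share of students placed, and the expected pass rate can be illustrated with a small simulation. This is a sketch under stated assumptions, not the spreadsheet given to colleges: the function name `placement_table` and the predicted probabilities are hypothetical, and the expected pass rate is taken to be the mean predicted probability among placed students.

```python
def placement_table(predicted_probs, cut_points):
    """For each cut point, return (cut, share placed into college level,
    expected pass rate among those placed)."""
    n = len(predicted_probs)
    rows = []
    for cut in cut_points:
        placed = [p for p in predicted_probs if p >= cut]
        share = len(placed) / n
        # Mean predicted probability among placed students; None if nobody is placed.
        expected_pass = sum(placed) / len(placed) if placed else None
        rows.append((cut, share, expected_pass))
    return rows

# Hypothetical predicted probabilities for ten incoming students.
probs = [0.20, 0.30, 0.40, 0.45, 0.50, 0.55, 0.60, 0.70, 0.80, 0.90]
for cut, share, expected_pass in placement_table(probs, [0.40, 0.45, 0.60]):
    print(f"cut={cut:.2f}  placed={share:.0%}  expected pass={expected_pass:.0%}")
```

Because the cut point is a minimum and the expected pass rate is an average over everyone at or above it, the expected pass rate can never fall below the cut point.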

Many faculty opted to create placement rules that either kept pass rates in college-

level courses similar to historical pass rates or kept the college-level placement rates

similar to historical placement rates. Under this rule, the algorithm tended to predict

increases in the number of students placed into college-level coursework. For instance, in

the example, the status quo pass rate for English is 60%. A cut point of 45% would induce

the same pass rate, 60%, but would place 75% of students into the college-level English

course.

Installation of new placement method in college systems: We developed two

procedures to place students to maintain the timing of placement decisions. At colleges

running our algorithm through the computerized ACCUPLACER-test platform, we

programmed custom rules into the ACCUPLACER platform for students selected to be

part of the treatment group.7

Other colleges ran their placement through a custom server built for the study.

Student information was sent to servers to generate the probability of success and the

corresponding placement, which was returned to the college.

3. Experimental Design, Data, Empirical Strategy

The sample frame consisted of entering freshmen enrolling at each college who were

required to take the placement exams.8 Random assignment was integrated by computer

into each college’s placement platform described above. Upon taking their placement

exams, students were randomly assigned to be placed using either the business-as-usual

method or the algorithm. Students were blinded to their assignment. If a student took

both the English and math placement exams, they were either assigned to the business-as-

usual placement for both subjects or the algorithm for both subjects. Some students only

7 This process placed constraints on the algorithm’s complexity: interaction terms and non-linear models, for instance, are difficult to implement within the ACCUPLACER system.

8 Some colleges asked to exempt English as a Second Language speakers and students with high SAT scores from the study. However, as these are non-selective colleges, few students take the SAT.

took a placement exam in one subject. After taking placement exams, students were

notified of their placements either by a counselor or through an online portal, depending

on the college.

Data

Data came from two sources: placement test records and administrative data from each

college. Student-level placement test records include indicators for each student’s

placement level in math and English, as well as the information that would be needed to

determine students’ placements regardless of treatment status. Placement test records

from each college contained high school grade point averages (when available) and scores

on individual placement tests. Additional variables included in placement test records

varied by each college’s placement algorithm. Examples of additional variables

incorporated for certain colleges include the number of years between high school

completion and college enrollment, type of diploma (high school diploma vs. GED), SAT

scores, and New York State Regents Exam scores.

In addition to placement test records, college administrative data included

demographic information, such as gender, race, age, financial aid status, and transcript

data that provided course levels, credits attempted and earned, and course grades.

Demographic data are only available for students who enrolled in our sample colleges.

Our initial sample consists of 4,729 first-year students across five colleges. Not all

students who take placement exams subsequently enroll in courses: 864, or about 18

percent, took a placement exam and did not enroll in at least one course during the fall

2016 term. This could be important if we would like to conduct subgroup analyses by

demographic group and there is a treatment effect on enrollment. As we show below,

however, we do not find evidence of the latter.

Table 1 shows sample baseline characteristics for students who enrolled in the fall of

2016 at one of the five study colleges. The first two columns show overall sample

characteristics and the additional columns break these down by college. On average, students in

the sample were 43 percent white. Approximately half of all students enrolling in at least

one course during the fall 2016 term received a federal Pell Grant for that term. There is

some variation in demographic characteristics. For instance, Colleges 1, 2, and 3 serve

more white students compared to Colleges 4 and 5, which enroll a higher share of

Hispanic students. Using Pell Grant receipt as a proxy for income, average family income

for study participants also varies across colleges. While Pell Grant recipients comprise

more than 60 percent of all study participants from Colleges 1, 2, and 3, they represent less

than half of the students enrolling in the study from Colleges 4 and 5. Comparing these

characteristics to Appendix Table A.3 shows that the sample characteristics roughly

match the overall characteristics of students each college serves.

Lastly, Figure 2 shows the gap in placement rates across demographic subgroups. The

first two bars show that the white students are placed into college-level math and English

courses at rates 14 percentage points and 19 percentage points higher than Black

students. These gaps tend to be smaller between Hispanic and white students, and between

male and female students, but are also quite large between Pell recipients and non-Pell

recipients—16 and 12 percentage points for math and English, respectively.

Outcomes

We study the effects of assignment to the placement algorithm on several primary

outcomes, by subject. First, we descriptively examine how placements changed as a result

of the algorithm: what share of students had their placement changed relative to the

control, and of these, what share had their placement changed from a remedial-course

assignment to a college-level assignment, and what share had their placement changed

from a college-level course assignment to a remedial assignment. Second, we show

treatment effects on enrollment and pass rates for math and English separately. If a

student does not enroll in any college course—remedial or otherwise—we code enrollment

into the college-level course, passing the college-level course, and credits accumulated as a

zero. Thus, attrition from the colleges is incorporated into the outcome, and, as shown

below, we do not find any effects on initial enrollment. Lastly, we descriptively compare

the algorithm’s predicted pass rates in college level courses (conditional on enrollment) to

the actual pass rates.

Impact analyses

We use an intent-to-treat analysis to examine the impacts of using the placement

algorithm versus the single-placement test status quo. We estimate the following model:

[7] $Y_i = \alpha + \beta R_i + \lambda \phi_i + \eta X_i + \delta Z_i + \varepsilon_i$

where Yi are first-semester academic outcomes for student i, such as placement into a

college-level course and passing a college-level course; Ri indicates whether the individual

was randomly assigned to be placed using the algorithm; φi is an indicator for the

institution a student attends; Xi is a vector of baseline covariates including gender, race,

age, and financial aid status; Zi is students’ math and English algorithm calculations,

which are baseline measures of academic preparedness, and εi is a random error term. The

coefficient of interest is β, which is the effect of assignment to the placement algorithm on

outcomes at the end of the first semester discussed above. We show results with and

without baseline covariates.
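
For the specification without covariates, the intent-to-treat coefficient from a regression of the outcome on a constant and the assignment indicator equals the difference in mean outcomes between the two randomized groups. The Python sketch below computes that difference with an unequal-variance standard error. It deliberately omits the college indicators and baseline covariates in equation [7], and the data and the function name `itt_estimate` are hypothetical.

```python
import math
from statistics import mean, variance

def itt_estimate(outcomes, assigned):
    """Difference in mean outcomes by random assignment (1 = algorithm,
    0 = business as usual) and its unequal-variance standard error."""
    treated = [y for y, r in zip(outcomes, assigned) if r == 1]
    control = [y for y, r in zip(outcomes, assigned) if r == 0]
    beta = mean(treated) - mean(control)
    se = math.sqrt(variance(treated) / len(treated)
                   + variance(control) / len(control))
    return beta, se

# Hypothetical outcome: 1 if the student passed a college-level course.
# Students who never enroll are coded as zero, as described in the text.
outcomes = [1, 1, 0, 1, 0, 1, 0, 0]
assigned = [1, 1, 1, 1, 0, 0, 0, 0]
beta, se = itt_estimate(outcomes, assigned)
print("ITT estimate:", beta, "standard error:", round(se, 3))
```

Grouping by original assignment, rather than by the placement a student actually followed, is what makes this an intent-to-treat comparison.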

As not everyone takes a placement exam in both subjects, we run these regressions by

subject placement (e.g. for enrollment in college-level math for those who took a math

placement exam) and show results for those who took any placement exam (e.g. for enrollment

in any college-level course).

Subgroup Analyses

Colleges were interested to understand what impact the placement algorithm has on the

composition of students placed into remedial and college-level courses, and particularly for

students typically under-represented in these courses. We conduct subgroup analyses as

follows to estimate whether there were significant differential effects.

[8] $Y_i = \alpha_1 + \beta_{1k} R_i + \beta_{2k} (R_i \times \text{Subgroup}_k) + \lambda_1 \phi_i + \eta_1 X_i + \delta_1 Z_i + \varepsilon_{1i}$

Yi are placement in college-level math, placement in college-level English, and passing

these courses as well. The coefficient of interest is β2k, which assesses the extent to which

Subgroup k exhibits a different effect from the reference group. The subgroups of interest

are: Black and Hispanic students (separately) compared to white students; female

students compared to male students; and Pell recipients compared to non-Pell

recipients.
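To make the interpretation of the coefficients in equation [8] explicit, the following short derivation differences the expected outcome by assignment status within each group, holding the institution indicator and baseline covariates fixed. It is a sketch of the standard reading of an interaction model rather than an additional result.

```latex
\begin{align*}
\text{Reference group } (\text{Subgroup}_k = 0):\quad
  & E[Y_i \mid R_i = 1] - E[Y_i \mid R_i = 0] = \beta_{1k} \\
\text{Subgroup } k \ (\text{Subgroup}_k = 1):\quad
  & E[Y_i \mid R_i = 1] - E[Y_i \mid R_i = 0] = \beta_{1k} + \beta_{2k}
\end{align*}
```

The difference between the two lines is β2k, which is why it measures the differential effect for Subgroup k.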

However, this process yields many tests, which increases the likelihood of type-I errors.

To reduce the number of tests, we estimate the interaction effects simultaneously by

estimating a system of Seemingly Unrelated Regressions and conduct an F test of whether

the interaction effects are jointly significantly different from zero. We also use the step-down

method of Holm (1979) to discern which specific hypotheses can be rejected.
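The Holm (1979) step-down procedure mentioned above can be sketched as follows. The function name `holm_reject` and the example p-values are hypothetical; the p-values are not taken from the paper's tables.

```python
from typing import List


def holm_reject(p_values: List[float], alpha: float = 0.05) -> List[bool]:
    """Return, for each hypothesis, whether Holm's step-down procedure rejects it."""
    m = len(p_values)
    # Indices of hypotheses ordered from smallest to largest p-value.
    order = sorted(range(m), key=lambda i: p_values[i])
    reject = [False] * m
    for rank, idx in enumerate(order):
        # The k-th smallest p-value (k = rank + 1) is compared with alpha / (m - k + 1).
        threshold = alpha / (m - rank)
        if p_values[idx] <= threshold:
            reject[idx] = True
        else:
            # Step-down: once one hypothesis is retained, all larger p-values are retained.
            break
    return reject


if __name__ == "__main__":
    # Hypothetical p-values for four interaction terms.
    print(holm_reject([0.001, 0.04, 0.03, 0.20], alpha=0.05))
```

The procedure controls the familywise error rate at `alpha` while being less conservative than applying the Bonferroni threshold `alpha / m` to every test.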

Treatment-Control Baseline Balance

Randomization should ensure, in expectation, that students assigned to the treatment are

similar to those assigned to the control-group placement rules. Table 2 provides evidence

that participants’ demographic and academic characteristics are balanced across

treatment and control groups. Students’ ACCUPLACER exam scores also are similar

across both groups. Overall, the magnitudes of differences between treatment and control

groups are small and only two are significant at the 10 percent level, which is expected

with 20 variables tested. Though not shown, this balance also holds for the subgroup of

students who only took an English or a Math placement exam as well. Observation counts

change because colleges only provided demographic variables for those students who

enrolled and took the placement exams. This is potentially problematic for subgroup

analyses (not for overall effects); however, treatment assignment does not impact

subsequent enrollment, which is not surprising given the close proximity in time between

enrollment and placement.9 Lastly, note that high school GPA, a strong predictor of

future performance, is missing for many incoming students.
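The statement above that two significant differences are expected among 20 tested variables follows from a simple expectation, assuming every null hypothesis of no difference is true and each test is run at the 10 percent level:

```latex
E[\text{number of rejections}] = 20 \times 0.10 = 2
```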

4. Results

Descriptive Changes in Placements

We begin with a descriptive summary of placement changes to show the various ways the

algorithm changed students’ placements relative to business as usual. As stated above,

it is not obvious how the algorithm will change net placement rates. For instance, relative

to business as usual, placement by the algorithm could change an equal number of

placements into and out of college-level courses, which implies no net change in college-

level course placement. Table 3 summarizes changes in placement for program-group

students entering the study in fall 2016. Of the 2,455 students assigned to the program-

group, 92% took a placement exam in math and 76% took a placement exam in English.

Among those students who took a math placement exam, 21% experienced a math

placement different from what would have been expected under the status quo placement

9 Other papers studying the effects of placement into remedial classes have found that this does not affect subsequent enrollment (Martorell and McFarlin 2011; Scott-Clayton and Rodriguez 2015).

rules. Measured against the same group of math test-takers, 14% were placed into a higher-level math

course than would have been expected under a single-test placement system, and 7% were

placed into a lower-level math course.

Additionally, 76% of program-group students took a placement exam for English. Of

those who took the English placement exams, approximately 48% experienced a change in

their English placement level: 42% placed into a higher-level English course and 7% placed

into a lower-level course than they would have under the status quo placement strategy.
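The placement-change shares quoted above can be recomputed from the counts in Table 3. The sketch below uses those counts directly; the dictionary layout and the function name `placement_change_shares` are illustrative choices, not part of the study's code.

```python
# Counts for program-group students, copied from Table 3.
TABLE3 = {
    "math": {"took_exam": 2265, "higher": 310, "lower": 160},
    "english": {"took_exam": 1864, "higher": 774, "lower": 123},
}


def placement_change_shares(counts):
    """Percent of test-takers whose placement rose, fell, or changed at all."""
    n = counts["took_exam"]
    higher = 100.0 * counts["higher"] / n
    lower = 100.0 * counts["lower"] / n
    return {"higher": higher, "lower": lower, "changed": higher + lower}


if __name__ == "__main__":
    for subject, counts in TABLE3.items():
        shares = placement_change_shares(counts)
        print(subject, {k: round(v, 2) for k, v in shares.items()})
```

All shares use the number of students who took the placement exam in that subject as the denominator, which is why the higher and lower shares sum to the overall share with a changed placement.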

Students generally followed the placement assignments that they received. Compliance

rates were high across math and English, ranging from approximately 88% to 99% across

participating colleges. Instances of non-compliance can be at least partially explained by

the fact that we only report on initial placements (intent-to-treat effects based on their

first exam scores), which may subsequently have changed if students re-took the placement exam.

Treatment Effects on Placement, Course Taking, and Credits

To test whether the placement algorithm affected student outcomes, we estimate four models

that build upon each other: (1) we first estimate a simple regression including only college

fixed effects (strata), (2) we then add controls for the set of demographic characteristics

including gender, race, and age at entry, (3) we then add proxies for income including Pell

and TAP Grant10 recipient status, and (4) finally, we add the calculated math and

English algorithm values for all students, which are estimated using baseline covariates.

The results are robust across these model specifications, so we show results from Model 4, but

all of our estimates across specifications can be found in Tables A.4 through A.10 in the

Appendix.

10 The Tuition Assistance Program (TAP) is a need-based grant available to New York State residents who enroll as first-time freshmen at an approved postsecondary institution in New York State.

Table 4 summarizes the results for each outcome, for each subject. The placement

algorithm resulted in increases in all three outcomes: placement into college-level courses,

enrollment in college-level courses, and college-level credits earned. Students assigned to

the placement algorithm are 5.0 percentage points more likely to be placed into a college-

level math course, 4.6 percentage points more likely to enroll in a college-level math

course, and 2.8 percentage points more likely to pass a college-level math course during

the first term. All results are statistically significant at the 1 percent level. One

explanation for the difference between placement and enrollment into a college-level math

course is that some students placing into college-level math did not have to complete a

college-level math course prior to enrolling in other college-level courses in the first term.

For example, if a student places into college-level math, she does not have to take that

college-level course right away and could instead enroll in other field courses that term.

There are positive and substantially larger effects for English placement, enrollment,

and completion than for outcomes on math courses. Students who were placed by the

algorithm were 30.4 percentage points more likely to place into a college-level English

course, 19.3 percentage points more likely to enroll in a college-level English course, and

12.5 percentage points more likely to pass a college-level English course in the first term.

All results are significant at the 1 percent level. Again, the difference between placement

and enrollment into a college-level English course may occur for the same reason as above

for college-level math enrollment.

In addition to subject-specific impacts, we also test whether program status had any

impact on overall college-level course taking.11 Table 5 shows that algorithm assignment

increases the probability of enrolling in any college-level course by about 1 percentage

11 As expected given the timing of randomization, placement and enrollment, there is no impact of treatment assignment on enrollment (Table 2).

point and increases the probability of enrolling and passing any college-level course by 4.2

percentage points. These effects are all significant at least at the 5 percent level. The

smaller impact on college-level course enrollment is likely because students who place into

remedial-level courses in either English or math are, in most cases, still eligible to enroll in

other college-level courses that do not require a certain level of English or math

proficiency.

Colleges can select various cut points for assignment to college-level coursework, which

can make these effects difficult to interpret. Depending on what cut point colleges choose,

we may expect higher pass rates or lower pass rates. One way to gauge success is whether

the algorithm correctly predicted pass rates in college-level courses among those who were

placed and enrolled in these courses. Prediction accuracy was better for math than for

English: Actual pass rates were 2 percentage points lower than predicted in math and 7

percentage points lower than predicted in English.
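The relationship between a cut point, the share of students placed, and the predicted versus actual pass rate can be sketched as follows. This is a minimal illustration under stated assumptions: each student has a predicted probability of passing, students at or above the cut point are placed into the college-level course, and pass outcomes are observed for all placed students (in the study they are observed only for those who enroll). The function names and the data are hypothetical.

```python
from typing import List, Tuple


def placement_summary(predicted: List[float], cut_point: float) -> Tuple[float, float]:
    """Share placed into college level and mean predicted pass rate among those placed."""
    placed = [p for p in predicted if p >= cut_point]
    share_placed = len(placed) / len(predicted)
    # With nobody placed there is no pass rate to report.
    predicted_pass = sum(placed) / len(placed) if placed else float("nan")
    return share_placed, predicted_pass


def prediction_gap(predicted: List[float], passed: List[int], cut_point: float) -> float:
    """Actual minus predicted pass rate among students at or above the cut point."""
    pairs = [(p, y) for p, y in zip(predicted, passed) if p >= cut_point]
    actual = sum(y for _, y in pairs) / len(pairs)
    expected = sum(p for p, _ in pairs) / len(pairs)
    return actual - expected


if __name__ == "__main__":
    # Hypothetical predicted pass probabilities and observed pass indicators.
    predicted = [0.2, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 0.3]
    passed = [0, 0, 1, 0, 1, 1, 1, 0]
    print(placement_summary(predicted, cut_point=0.5))
    print(prediction_gap(predicted, passed, cut_point=0.5))
```

Raising the cut point lowers the share placed and raises the predicted pass rate among those placed, which is the trade-off colleges faced when choosing cut points; a negative gap corresponds to actual pass rates falling below predicted ones.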

Lastly, Table 5 shows there is an increase in credit accumulation of 0.60 credits, which

is a 12 percent increase over the control-group mean. This effect is significant at the 1

percent level.
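The 12 percent figure can be checked against the credit estimate and control mean reported in Table 5:

```latex
\frac{0.599}{5.170} \approx 0.116
```

That is, the estimated gain is roughly 12 percent of the control-group mean of 5.17 credits.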

Subgroup Effects

Table 6 shows the interaction effects for the subgroups listed above. The largest

differential effects occur for placement of Black and Hispanic students into college-level English courses;

these groups show increases in college-level English placement that are 18 and 13 percentage points larger, respectively, than for

white students. The differential effects are much closer to zero for college-level math

placement, with the exception of female students, who are 8 percentage points more likely

to be placed into college-level math than male students. The interaction effects are

generally smaller and not statistically significant for passing college-level courses.

We test the significance of these interaction effects in several ways. First, an F test of

all of the interaction effects for placement into college-level math and English rejects the

null hypothesis that these effects are zero (p<0.01). Second, if we test each interaction

separately to identify which is significant while controlling the Familywise Error Rate

using Holm (1979), we find that female students, Black students and Hispanic students

experience significantly different increases in placement to college-level courses. The

evidence of significant differential effects for passing college-level math and English

courses is weaker, however. The same F test does not reject the null (p=0.60) and none of

the interaction effects is significantly different from zero after adjusting for multiple

hypothesis testing.

5. Cost Analysis

The initial investment to build the algorithm has three components. First, data must be collected

on students’ characteristics (including high school transcripts), placements based on test

results, and subsequent college outcomes. In some colleges, these data are already

available, but other colleges required more extensive data collection. Second, data must be

analyzed to determine a new placement algorithm. Third, resources must be allocated to

create and implement the new system within the college, which includes training

personnel. After the initial investment, implementation requires collecting data from

entering students and employing personnel to assign students to either remedial or college-level

courses.

The cost estimate of the new placement system is relative to the cost of business-as-

usual testing for placement. For both placement systems there are costs in administering

placement tests. Also, for both systems, future resources may be required as students

progress into college-level courses after completing remedial coursework. If more students

progress into college-level courses, colleges may have to shift resources toward college

courses and away from remedial courses in conjunction with any changes in revenue per

student. Currently, no information on differences in coursework costs across the two

options is available, so we cannot incorporate this resource.

We calculated costs for five colleges using the ingredients method (Levin et al.,

2017).12 We collected information on ingredients from direct interviews with personnel

who implemented the new testing protocols. We gathered information on input prices and

overhead costs from secondary sources. The resulting estimates reflect the expected cost

of implementing the new placement system at a college of similar size and organization as

the five sample colleges.

Across the five colleges, implementing the new system costs, on net, $110 per student,

which varies between $70 and $320, depending on the college. We estimate that the operating

cost per semester over the status quo is $40 per student, which varies from

$10 to $170 across colleges.13

This variation is primarily driven by the number of students at each site. More

enrollments lead to lower costs because the costs of creating the algorithm are mostly

fixed. The costs per college also varied depending on how much information was

previously available to determine the new placement algorithm and how many students

had the requisite information for the placement system to operate. Data entry costs were

lower if the college had all high school information pre-loaded into their databases; in

contrast, data entry costs were higher if each student’s information had to be entered into

the computing system individually.
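The inverse relationship between enrollment and per-student cost described above can be sketched with a simple average-cost formula. The function name and all dollar amounts below are hypothetical and chosen only to show the shape of the relationship; they are not the study's cost inputs.

```python
def cost_per_student(fixed_cost: float, variable_cost_per_student: float, students: int) -> float:
    """Average cost per student when a fixed development cost is spread over a cohort."""
    if students <= 0:
        raise ValueError("students must be positive")
    return fixed_cost / students + variable_cost_per_student


if __name__ == "__main__":
    # Hypothetical inputs: a fixed cost of building the algorithm plus a
    # per-student cost such as data entry.
    for n in (500, 1000, 4000):
        print(n, round(cost_per_student(60000.0, 20.0, n), 2))
```

As enrollment grows, the fixed component shrinks toward zero per student and the average cost approaches the per-student variable cost, which is higher where student records must be entered individually.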

Interviewees described strategies by which the number of students with readily-

12 We derived costs from the inputs used at each college multiplied by standardized prices per input expressed in 2016 dollars.

13 See Tables A.13 and A.14 in the Appendix. Details of the prices and analyses behind these calculations are available upon request.

available information might be increased in the future. Interviewees also did not indicate

significant resource changes with respect to instruction. Potentially, the new placement

system may change assignments such that more students are now in college-level classes,

which could require more college-level faculty and perhaps more sections of college-level

courses. However, most colleges indicated that faculty could be reassigned from teaching

remedial classes to teaching college-level classes, though few changes in class size were

anticipated even given the changes in placement rates.

Colleges could save money by not purchasing the ACCUPLACER exams, and they

asked whether students could be placed via the algorithm as accurately without using

these test scores. We examined the extent to which the algorithm would place students

differently if test scores were not used for prediction. We find that placement rates would

change substantially for math courses (by 18% to 34%), depending on whether colleges

decided to hold predicted pass rates the same or placement rates the same. However, for

English courses, only 5% to 8% of placements would change. This finding again is in line

with the increased predictive value we find for math test scores over English test scores.

6. Discussion and Conclusion

Our findings indicate that combining predictive analytics with multiple measures

significantly impacts how community colleges track students into either college-level or

remedial courses. First, the algorithm allowed colleges to choose cut points that explicitly

targeted predicted placement rates and pass rates. Second, the algorithm led to changes in

the placement of students. Across the five colleges in the analysis, students placed using

the algorithm were more likely to be assigned to college-level courses than those placed

using the business-as-usual system. In addition, more treatment-group students took and

passed college-level courses in math and English than students in the control group. These

results are due to a combination of the cut points that colleges chose and the alternative

placement decisions of the algorithm. There were particularly large increases in college-

level placements for female students in math, and Black and Hispanic students in English.

The algorithm’s prediction accuracy was better for math courses than English courses.

However, the algorithm we developed could be improved. Most notably, our model was

constrained by implementation in several ways. To produce rapid placement decisions, we

had to embed our algorithm into existing systems, which restricted our modeling

choices; we could not, for instance, implement a non-linear model. Future models could

also use richer transcript data; the colleges we worked with could only provide high school

GPA, which was often missing for incoming students, and could not provide course-level high

school grades, which could also be predictive of future performance. More generally, as

colleges develop more consistent ways to record incoming student information, the ability

to predict future performance could improve.

One question is how our results would differ if all students within a college were

placed according to the algorithm. We interviewed college administrators and department

chairs at each college to document their impressions of the algorithm’s implementation.

Most perceived no changes in classroom composition and no need to change faculty

allocations. However, this could change if all students were placed via the algorithm,

especially in English courses, where placement changes were larger.

Our initial results have important implications because the high cost of remedial

education falls directly onto students placed into these courses and indirectly onto

taxpayers whose money helps subsidize public postsecondary institutions. As a result,

there is both a private and social benefit to ensuring that remedial education is correctly

targeted. Colleges recognize this, and some have begun to implement these placement

algorithms. Long Beach City College (LBCC) created a placement formula that uses

student high school achievement in addition to standardized assessment scores. The

formula weights each measure based on how predictive it is of student performance in

college courses (Long Beach City College, Office of Institutional Effectiveness, 2013). This

paper provides evidence that these placement systems not only affect student outcomes

through changes in the placement instrument, but also through colleges’ improved ability

to target pass rates and placement rates explicitly. Future research could test more

intricate predictive models than we could implement in the current study.

References

Akaike, H. (1998). Information theory and an extension of the maximum likelihood principle. In E. Parzen, K. Tanabe, & G. Kitagawa (Eds.), Selected Papers of Hirotugu Akaike. Springer Series in Statistics (Perspectives in Statistics). New York, NY: Springer.

Allen, D., & Dadgar, M. (2012). Does dual enrollment increase students’ success in college? Evidence from a quasi-experimental analysis of dual enrollment in New York City. New Directions for Higher Education, 2012, 11–19. doi:10.1002/he.20010

Bacon, D. R., & Bean, B. (2006). GPA in research studies: An invaluable but neglected opportunity. Journal of Marketing Education, 28, 35–42.

Belfield, C., & Crosta, P. M. (2012). Predicting success in college: The importance of placement tests and high school transcripts (CCRC Working Paper No. 42). New York, NY: Community College Research Center. Retrieved from: http://ccrc.tc.columbia.edu/publications/predicting-success-placement-tests-transcripts.html

Bettinger, E.P., Boatman, A., & Long, B.T. (2013). Student supports: Remedial education and other academic programs. The Future of Children, 23(1), 93–115. Retrieved from: http://muse.jhu.edu/article/508222/pdf

Bettinger, E. P., & Long, B. T. (2009). Addressing the needs of underprepared students in higher education: Does college remediation work? Journal of Human Resources, 44(3), 736–771.

Boatman, A., & Long, B. T. (2010). Does remediation work for all students? How the effects of postsecondary remedial and developmental courses vary by level of academic preparation (An NCPR Working Paper). New York, NY: National Center for Postsecondary Research.

Borghans, L., Golsteyn, B. H. H., Heckman, J. J., & Humphries, J. E. (2016). What grades and achievement tests measure. Proceedings of the National Academy of Sciences, 113(47), 13354-13359.

Bui, S. A., Craig, S. G., and Imberman, S. A. (2014). Is gifted education a bright idea? Assessing the impact of gifted and talented programs on students. American Economic Journal: Economic Policy, 6(3):30 – 62.

Burnham, K. P. & Anderson, D. R., (2002). Model selection and multimodel inference: A practical information-theoretic approach, second edition. New York, NY: Springer.

Calcagno, J. C., & Long, B. T. (2008). The impact of postsecondary remediation using a regression discontinuity approach: Addressing endogenous sorting and noncompliance (NBER Working Paper No. 14194). Cambridge, MA: National Bureau of Economic Research. http://www.nber.org/papers/w14194


Card, D. and Giuliano, L. (2014). Does gifted education work? For which students? Technical Report 20453, National Bureau of Economic Research.

Card, D. and Giuliano, L. (2016). Can tracking raise the test scores of high-ability minority students? The American Economic Review, 106(10):2783–2816.

Chan, E.W. (2018). Parent Engagement and Gifted Students. Teachers College, Columbia University. Working Paper.

Chen, X. (2016). Remedial Coursetaking at U.S. Public 2- and 4-year Institutions: Scope, Experiences, and Outcomes (NCES 2016-405). U.S. Department of Education. Washington, DC: National Center for Education Statistics. Retrieved from: https://nces.ed.gov/pubs2016/2016405.pdf

Cohodes, S. R. (2015): “The Long-Run Impacts of Tracking High-Achieving Students: Evidence from Boston’s Advanced Work Class,” Working Paper.

Dadgar, M., Collins, L., & Schaefer, K. (2015). Placed for success: How California community colleges can improve accuracy of placement in English and math courses, reduce remediation rates, and improve student success. Oakland, CA: Career Ladders Project.

Duflo, Esther, Pascaline Dupas, and Michael Kremer. (2011). Peer effects, teacher incentives, and the impact of tracking: Evidence from a randomized evaluation in Kenya. American Economic Review 101, no. 5:1739–74.

Eubanks, V. (2018). Automating Inequality: How High-Tech Tools Profile, Police, and Punish the Poor. New York, NY. St. Martin’s Press.

Fields, R. & Parsad, B. (2012). Tests and Cut Scores Used for Student Placement in Postsecondary Education: Fall 2011. Washington, DC: National Assessment Governing Board.

Gates, A. G., & Creamer, D. G. (1984). Two-year college attrition: Do student or institutional characteristics contribute most?. Community Junior College Research Quarterly of Research and Practice, 8(1-4), 39–51.

Hodara, M. (2012). Language Minority Students at Community College: How Do Developmental Education and English as a Second Language Affect Their Educational Outcomes? (Doctoral dissertation). Retrieved from ProQuest Dissertations and Theses. (Accession Order No. [3505981]).

Hodara, M., Jaggars, S. S., & Karp, M. M. (2012). Improving remedial education assessment and placement: Lessons from community colleges across the country (CCRC Working Paper No. 51). New York, NY: Community College Research Center. Retrieved from https://ccrc.tc.columbia.edu/publications/developmental-education-assessment-placement-scan.html

Holm, S. (1979). A Simple Sequentially Rejective multiple test procedure. Scandinavian Journal of Statistics, 6(2), 65-70.

http://files.eric.ed.gov/fulltext/ED537433.pdf


Heubert, J. P., & Hauser, R. M. (Eds.). (1999). High-stakes: Testing for Tracking, Promotion, and Graduation. Washington, DC: National Academy Press.

Kautz, T., Heckman, J. J., Diris, R., ter Weel, B., and Borghans, L. (2014), Fostering and Measuring Skills: Improving Cognitive and Non‐cognitive Skills to Promote Lifetime Success, OECD, Paris.

Kautz, T. D., & Zanoni, W. (2014). Measuring and fostering noncognitive skills in adolescence: Evidence from Chicago Public Schools and the OneGoal Program. Unpublished manuscript, Department of Economics, University of Chicago, Chicago, IL.

Levin, H.M., McEwan, P.J., Belfield, C.R., Bowden, A.B., & Shand. R. (2017). Economic evaluation of education: Cost-effectiveness analysis and benefit-cost analysis. Thousand Oaks, CA: SAGE Publications.

Long, B.T., & Boatman, A. (2013). The role of remedial and developmental courses in access and persistence. In A. Jones & L. Perna (Eds.), The state of college access and completion: Improving college success for students from underrepresented groups. New York, NY: Routledge Books.

Long Beach City College, Office of Institutional Effectiveness. (2013). Preliminary overview of the effects of the fall 2012 Promise Pathways on key educational milestones. Long Beach, CA: Office of Institutional Effectiveness.

Martorell, P., & McFarlin, I., Jr. (2011). Help or hindrance? The effects of college remediation on academic and labor market outcomes. The Review of Economics and Statistics, 93(2), 436–454.

Mazerolle, M. J. (2004). Appendix 1: Making sense out of Akaike’s Information Criterion (AIC): its use and interpretation in model selection and inference from ecological data. Retrieved from http://theses.ulaval.ca/archimede/fichiers/21842/apa.html

McGurk, S. (2015). ACCUPLACER Overview. College Board. Retrieved from: https://www.accs.cc/default/assets/File/DPE_ISS/developmental%20education/accuplacer/overview.pdf

Mullainathan, S., & Spiess, J. (2017). Machine learning: An applied econometric approach. Journal of Economic Perspectives, 31(2), 87–106.

National Center for Public Policy and Higher Education and the Southern Regional Education Board. (2010). Beyond the Rhetoric: Improving College Readiness Through Coherent State Policy. Retrieved from: http://www.highereducation.org/reports/college_readiness/CollegeReadiness.pdf

North Carolina Community College System. (2015). NCCCS policy using high school transcript GPA and/or standardized test scores for placement (Multiple measures for placement). Retrieved from https://www.southwesterncc.edu/sites/default/files/testing/Multiple%20Measures%20Revised%202015.pdf

Rodríguez, O., Bowden, A.B., Belfield, C.R., & Scott-Clayton, J. (2014) Remedial placement testing in community colleges: What resources are required, and what does it cost? (CCRC Working Paper No. 73). New York, NY: Community College Research Center. Retrieved from: https://ccrc.tc.columbia.edu/media/k2/attachments/remedial-placement-testing-resources.pdf

Rodríguez, O., Bowden, A.B., Belfield, C.R., & Scott-Clayton, J. (2015). Calculating the costs of remedial placement testing (CCRC Analytics). New York, NY: Community College Research Center.

Rothstein, J. M. (2004). College performance predictions and the SAT. Journal of Econometrics, 121(1-2), 297-317.

Rutschow, E. Z., & Schneider, E. (2011). Unlocking the gate: What we know about improving remedial education. New York, NY: MDRC. Retrieved from: http://www.mdrc.org/sites/default/files/full_595.pdf.

Saxon, D., & Morante, E. (2014). Effective student assessment and placement: Challenges and recommendations. Journal of Developmental Education, 37(3), 24-31.

Scott-Clayton, J. (2012). Do high-stakes placement exams predict college success? (CCRC Working Paper No. 41). New York, NY: Community College Research Center. Retrieved from http://ccrc.tc.columbia.edu/media/k2/attachments/high-stakes-predict-success.pdf

Scott-Clayton, J., & Rodriguez, O. (2015). Development, discouragement, or diversion? New evidence on the effects of college remediation policy. Education Finance and Policy, 10(1), 4–45.

Scott-Clayton, J., Crosta, P. M., & Belfield, C. R. (2014). Improving the targeting of treatment: Evidence from college remediation. Educational Evaluation and Policy Analysis, 36(3), 371–393.

Strong American Schools. (2008). Diploma to nowhere. Retrieved from: http://paworldclassmath.webs.com/8534051-Diploma-To-Nowhere-Strong-American-Schools-2008.pdf.


FIGURES

Figure 1. Hypothetical spreadsheet provided to colleges on placement projections

Example Community College

Math Success: C or above

Minimum Probability of Success   Percent Placed into College Level   Percent Passing College Level
Cohort 3, Status Quo             30%                                 50%
45%                              40%                                 60%
55%                              20%                                 70%
65%                              10%                                 75%

Eng. Success: C or above

Minimum Probability of Success   Percent Placed into College Level   Percent Passing College Level
Cohort 3, Status Quo             40%                                 60%
45%                              75%                                 60%
55%                              60%                                 65%
65%                              20%                                 70%

Figure 2. Gaps in Placement Rates Across Demographic Groups

[Figure not reproduced. Vertical axis: Gap in Placement Rate (0% to 25%). Groups: Black-White Gap, Hispanic-White Gap, Male-Female Gap, Non-Pell-Pell Gap. Series: Math, English.]

Notes: Sample includes any student who took a placement exam in at least one subject and first enrolled at one of the five study colleges in the fall of 2016. Currently dual enrollment students and students who tested into ESL courses are excluded.

Table 1. Sample Demographics by College

                        Overall   College 1   College 2   College 3   College 4   College 5
Female                  48%       58%         48%         52%         50%         44%
Race
  White                 43%       83%         78%         74%         38%         31%
  Asian                 3%        0%          2%          0%          6%          3%
  Black                 18%       7%          17%         21%         22%         18%
  Hispanic              22%       7%          0%          1%          30%         25%
  Pacific Islander      0%        0%          1%          0%          0%          0%
  Two or More Races     11%       1%          0%          0%          3%          19%
  Non-Resident Alien    0%        1%          0%          0%          0%          0%
  Race Unknown          2%        0%          0%          0%          0%          4%
  Race Missing          12%       0%          12%         76%         5%          0%
Age at Entry            19.94     20.10       22.06       21.03       20.43       18.99
Age at Entry Missing    0%        0%          0%          0%          0%          0%
Pell Grant Recipient    49%       64%         66%         61%         36%         47%
Total                   3,865     327         408         673         1,002       2,319

Notes: Sample includes any student who took a placement exam in at least one subject and first enrolled at one of the five study colleges in the fall of 2016. Currently dual enrollment students and students who tested into ESL courses are excluded.

Table 2. Baseline characteristics by treatment assignment

| Variable | Control Mean | Treatment Mean | Difference | P-value | Observations |
|---|---|---|---|---|---|
| Enrollment | 82.10% | 81.30% | -0.01% | 0.48 | 4,729 |
| Female | 47.50% | 47.90% | 0.40% | 0.76 | 3,865 |
| Race | | | | | |
| White | 43.50% | 41.70% | -1.80% | 0.30 | 3,382 |
| Asian | 3.10% | 3.70% | 0.60% | 0.34 | 3,382 |
| Black | 16.90% | 19.20% | 2.30% | 0.09 | 3,382 |
| Hispanic | 21.90% | 21.50% | -0.40% | 0.77 | 3,382 |
| Two or More Races | 11.50% | 10.90% | -0.60% | 0.57 | 3,382 |
| Race Missing | 28.30% | 28.70% | 0.40% | 0.76 | 4,729 |
| Age at Entry | 19.9 | 20 | 0.10 | 0.52 | 3,593 |
| Pell Grant Recipient | 50.10% | 53.30% | 3.20% | 0.06 | 3,672 |
| TAP Grant Recipient | 39.10% | 39.60% | 0.50% | 0.78 | 3,721 |
| GED Recipient | 4.50% | 4.20% | -0.30% | 0.57 | 4,600 |
| HS GPA (100 Scale) | 78 | 78 | 0.00% | 0.97 | 1,862 |
| HS GPA (missing) | 59.80% | 61.70% | 1.90% | 0.18 | 4,729 |
| ACCUPLACER Exam Scores | | | | | |
| Arithmetic | 45.0 | 45.9 | 0.90 | 0.26 | 3,439 |
| Algebra | 53.1 | 53.7 | 0.60 | 0.77 | 4,407 |
| College-level math | 35.5 | 35.4 | -0.10 | 0.89 | 455 |
| Reading | 72.3 | 71.9 | -0.40 | 0.47 | 3,696 |
| Sentence Skills | 76.3 | 76.1 | -0.20 | 0.12 | 1,072 |
| Written Exam | 6.1 | 6.1 | 0.00 | 0.11 | 3,324 |
| Total | 2,274 | 2,455 | | | 4,729 |

Notes: Sample includes any student who took a placement exam in at least one subject at one of the five study colleges in the fall of 2016. Currently dual enrollment students and students who tested into ESL courses are excluded.
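The balance checks in Table 2 compare control and treatment means one variable at a time. As a rough illustration only, the sketch below runs an unadjusted two-proportion z-test. The counts are hypothetical, and the p-values in the table may come from a different procedure (for example, a regression with college fixed effects), so this is not expected to reproduce them.

```python
import math


def two_proportion_ztest(successes_a, n_a, successes_b, n_b):
    """Return (difference, z, two-sided p-value) under H0: equal proportions."""
    p_a = successes_a / n_a
    p_b = successes_b / n_b
    # Pooled proportion under the null hypothesis of no difference.
    pooled = (successes_a + successes_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / se
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return p_b - p_a, z, p_value


# Hypothetical counts (not taken from the paper): 900 of 1,800 control
# students and 990 of 1,850 treatment students share a binary characteristic.
diff, z, p = two_proportion_ztest(900, 1800, 990, 1850)
print(f"difference = {diff:.3f}, z = {z:.2f}, p = {p:.3f}")
```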


Table 3. Changes in Placement for Program-Group Students

| | (1) Exempt from Placement | (2) Took Placement Exam | (3) Same Placement Under Business as Usual | (4) Placement Changed from Business as Usual | (5) Higher Placement than Business as Usual | (6) Lower Placement than Business as Usual |
|---|---|---|---|---|---|---|
| Math Placement | | | | | | |
| N | 190 | 2,265 | 1,795 | 470 | 310 | 160 |
| % of Total Program Sample | 7.74% | 92.26% | 73.12% | 19.14% | 12.63% | 6.52% |
| % of Students Placed in Math | - | 100.00% | 79.25% | 20.75% | 13.69% | 7.06% |
| English Placement | | | | | | |
| N | 591 | 1,864 | 967 | 897 | 774 | 123 |
| % of Total Program Sample | 24.07% | 75.93% | 39.39% | 36.54% | 31.53% | 5.01% |
| % of Students Placed in English | - | 100% | 51.88% | 48.12% | 41.52% | 6.60% |

Notes: Sample includes any student who took a placement exam in at least one subject at one of the five study colleges in the fall of 2016. Currently dual enrollment students and students who tested into ESL courses are excluded.
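The percentage rows in Table 3 follow directly from the N rows. The short sketch below recomputes them, taking the program sample to be the exempt students plus those who took the placement exam. The dictionary keys and function name are illustrative labels, not terms from the paper.

```python
# Counts copied from the N rows of Table 3 (program-group students).
TABLE3_COUNTS = {
    "Math": {"exempt": 190, "took_exam": 2265, "same": 1795,
             "changed": 470, "higher": 310, "lower": 160},
    "English": {"exempt": 591, "took_exam": 1864, "same": 967,
                "changed": 897, "higher": 774, "lower": 123},
}


def placement_shares(counts):
    """Return (program sample size, % of total sample, % of placed students)."""
    total = counts["exempt"] + counts["took_exam"]
    placed = counts["took_exam"]
    share_of_total = {k: round(100 * v / total, 2) for k, v in counts.items()}
    # Exempt students were never placed, so they drop out of this denominator.
    share_of_placed = {k: round(100 * v / placed, 2)
                       for k, v in counts.items() if k != "exempt"}
    return total, share_of_total, share_of_placed


for subject, counts in TABLE3_COUNTS.items():
    total, of_total, of_placed = placement_shares(counts)
    print(subject, total, of_total, of_placed)
```

Both subjects give the same program sample of 2,455 students, which matches the treatment-group total in Table 2.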


Table 4. Effect on Math and English Coursework

| | (1) Placed Math | (2) Enrolled Math | (3) Passed Math | (4) Placed English | (5) Enrolled English | (6) Passed English |
|---|---|---|---|---|---|---|
| Treatment | 0.050*** | 0.046*** | 0.028*** | 0.304*** | 0.193*** | 0.125*** |
| | (0.014) | (0.012) | (0.010) | (0.014) | (0.014) | (0.014) |
| Control Mean | 0.437 | 0.237 | 0.134 | 0.524 | 0.408 | 0.272 |
| Observations | 4,371 | 4,371 | 4,371 | 3,533 | 4,371 | 4,371 |

Notes: Robust standard errors shown in parentheses. All models include fixed effects for college, controls for demographic indicators including race, gender and age, Pell and TAP Grant recipient status, and calculated math and English algorithm values.
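To put the coefficients in Table 4 on a common footing, it can help to express each one relative to its control mean and to attach an approximate confidence interval. The sketch below does this with the reported values. The 1.96 multiplier assumes a normal approximation, and the function and field names are illustrative rather than taken from the paper.

```python
# (coefficient, robust standard error, control mean) as reported in Table 4.
TABLE4 = {
    "Placed Math": (0.050, 0.014, 0.437),
    "Enrolled Math": (0.046, 0.012, 0.237),
    "Passed Math": (0.028, 0.010, 0.134),
    "Placed English": (0.304, 0.014, 0.524),
    "Enrolled English": (0.193, 0.014, 0.408),
    "Passed English": (0.125, 0.014, 0.272),
}


def summarize(coef, se, control_mean, z=1.96):
    """Relative change, t-statistic, and approximate 95% interval."""
    return {
        "relative_change_pct": 100 * coef / control_mean,
        "t_stat": coef / se,
        "ci_low": coef - z * se,
        "ci_high": coef + z * se,
    }


for outcome, (coef, se, mean) in TABLE4.items():
    s = summarize(coef, se, mean)
    print(f"{outcome}: {s['relative_change_pct']:.1f}% of control mean, "
          f"95% CI [{s['ci_low']:.3f}, {s['ci_high']:.3f}]")
```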

Table 5. Effect on college-course outcomes

| | (1) Enrolled Any College Course | (2) Passed Any College Course | (3) Total Credits Earned |
|---|---|---|---|
| Treatment | 0.009*** | 0.046*** | 0.599*** |
| | (0.003) | (0.012) | (0.129) |
| Control Mean | 0.807 | 0.237 | 5.170 |
| Observations | 4,729 | 4,729 | 4,729 |

Notes: Robust standard errors shown in parentheses. All models include fixed effects for college, controls for demographic indicators including race, gender and age, Pell and TAP Grant recipient status, and calculated math and English algorithm values.


Table 6. Interaction coefficients on college-course outcomes

| | Math: CL Placement | Math: CL Enrollment | Math: CL Completion | English: CL Placement | English: CL Enrollment | English: CL Completion |
|---|---|---|---|---|---|---|
| Treatment X Black | -0.012 | 0.000 | 0.003 | 0.180*** | 0.173*** | 0.055 |
| | (0.033) | (0.046) | (0.040) | (0.043) | (0.049) | (0.052) |
| Treatment X Hispanic | -0.007 | 0.012 | 0.014 | 0.134*** | 0.116** | 0.022 |
| | (0.043) | (0.043) | (0.037) | (0.041) | (0.045) | (0.049) |
| Observations | 2,568 | 2,568 | 2,568 | 2,081 | 2,081 | 2,081 |
| Treatment X Female | 0.083*** | 0.051* | 0.045* | 0.025 | -0.003 | 0.035 |
| | (0.031) | (0.030) | (0.026) | (0.030) | (0.033) | (0.035) |
| Observations | 3,563 | 3,563 | 3,563 | 2,832 | 2,832 | 2,832 |
| Treatment X Pell | 0.022 | -0.009 | 0.008 | 0.021 | 0.066** | 0.045 |
| | (0.032) | (0.031) | (0.026) | (0.030) | (0.033) | (0.036) |
| Observations | 3,396 | 3,396 | 3,396 | 2,748 | 2,748 | 2,748 |
| College FE | YES | YES | YES | YES | YES | YES |
| Demographic Indicators | YES | YES | YES | YES | YES | YES |
| Income indicators | YES | YES | YES | YES | YES | YES |
| Algorithm Values | YES | YES | YES | YES | YES | YES |

Notes: Robust standard errors shown in parentheses. Each coefficient is calculated using a model that includes fixed effects for colleges and controls for demographic indicators including race, gender and age, proxies for income including Pell and TAP Grant recipient status, and calculated math and English algorithm values.
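An interaction coefficient of the kind reported in Table 6 measures how much the treatment effect differs between a subgroup and everyone else. In the simplest case (binary treatment, binary subgroup, no further controls) it equals a difference of differences in cell means, which the sketch below computes. The records are hypothetical, and because the models in Table 6 also include college fixed effects and covariates, this simplified calculation would not reproduce the reported values.

```python
from statistics import mean


def interaction_from_cell_means(records):
    """records: iterable of (treated, in_group, outcome), each coded 0 or 1.

    Returns the treatment effect within the subgroup minus the treatment
    effect among everyone else.
    """
    cells = {}
    for treated, in_group, outcome in records:
        cells.setdefault((treated, in_group), []).append(outcome)
    m = {key: mean(values) for key, values in cells.items()}
    effect_in_group = m[(1, 1)] - m[(0, 1)]
    effect_elsewhere = m[(1, 0)] - m[(0, 0)]
    return effect_in_group - effect_elsewhere


# Hypothetical records: (treated, in_group, placed into college-level course).
records = (
    [(0, 0, y) for y in (0, 1)]
    + [(1, 0, y) for y in (1, 1, 0, 0)]
    + [(0, 1, y) for y in (0, 0, 0, 1)]
    + [(1, 1, y) for y in (1, 1, 0, 1)]
)
print(interaction_from_cell_means(records))
```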


APPENDIX

Figure A.1. Math algorithm components by college

Components (columns): H.S. GPA; Years Since H.S. Graduation; GED Status; Regents Math Score; SAT Math Score; Arithmetic Score; Algebra Score; College-level math

College 1: X X X X X X
College 2: X X X X X X X X
College 3: X X X X X
College 4: X X X X X
College 5: X X X X

Figure A.2. English algorithm components by college

Components (columns): H.S. GPA; Years Since H.S. Graduation; GED Status; Reading Score; Sentence Skills Score; Essay Score

College 1: X X X X X
College 2: X X X X X X
College 3: X X X X X
College 4: X X X X X
College 5: X X X X


Table A.1. Math predictive models

College 1 – Math (standard errors in brackets)

| | LPM_1 | LPM_2 | LPM_3 | LPM_4 |
|---|---|---|---|---|
| hs_avg_1 | .0345778*** [.0023706] | | .0275944*** [.0025555] | .0302728*** [.0024947] |
| missing_gpa | 2.821865*** [.1947364] | | 2.269872*** [.2088917] | 2.583032*** [.2098888] |
| ACPL_algebra_scr_1 | | .00637*** [.0012656] | .0044024*** [.0012155] | .0041625*** [.0012265] |
| ACPL_arithmetic_scr_1m | | 0.0556 [.0395568] | 0.037867 [.041076] | 0.065269 [.0416692] |
| ACPL_algebra_scr_1m | | .6339479*** [.1408736] | .3611804** [.136684] | .3349449* [.1397535] |
| ACPL_coll_lvl_math_scr_1m | | -0.08707 [.0554991] | -.0881443~ [.0513349] | -0.08396 [.0513172] |
| years_out | | | | .0201255*** [.0041444] |
| hs_grad_yr_1m | | | | -0.05616 [.0679753] |
| GED | | | | -.1924288** [.0705082] |
| missing_hs_diploma | | | | 0.120686 [.1002104] |
| _cons | -2.336633*** [.1920042] | 0.037725 [.1224331] | -2.047722*** [.2166218] | -2.303033*** [.2128129] |
| N | 1166 | 1166 | 1166 | 1166 |
| r2 | 0.124598 | 0.104755 | 0.176263 | 0.206776 |
| aic | 1538.419 | 1568.555 | 1475.49 | 1439.478 |

Notes: 1 = 100-point scale; 2 = binary indicator; 3 = test score range 20-120; 4 = test score range 1-8.
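A linear probability model turns a student's measures into a predicted probability of success, which can then be compared with a cutoff to decide placement. The sketch below illustrates this with the first coefficient reported for hs_avg_1, missing_gpa and _cons in Table A.1, assuming these belong to a GPA-only specification (LPM_1). Several details are assumptions made for illustration: that GPA is coded as zero when missing, that predictions are clamped to the unit interval, and the cutoff value itself, which is hypothetical.

```python
# First coefficient reported for each term in Table A.1 (College 1, math).
# Assumption: these belong to a GPA-only specification (LPM_1).
INTERCEPT = -2.336633
COEF_HS_AVG = 0.0345778       # per point of high school average (100-point scale)
COEF_MISSING_GPA = 2.821865   # indicator equal to 1 when GPA is missing


def predicted_probability(hs_avg=None):
    """Predicted probability from the linear probability model."""
    if hs_avg is None:
        # Assumption: GPA enters as 0 when missing, so only the indicator counts.
        raw = INTERCEPT + COEF_MISSING_GPA
    else:
        raw = INTERCEPT + COEF_HS_AVG * hs_avg
    # A linear probability model can predict outside [0, 1]; clamp for readability.
    return min(1.0, max(0.0, raw))


def place_in_college_level(hs_avg, cutoff):
    """Place into the college-level course when the prediction meets the cutoff."""
    return predicted_probability(hs_avg) >= cutoff


HYPOTHETICAL_CUTOFF = 0.5  # illustrative only; not a value from the paper
for gpa in (60, 75, 85, None):
    print(gpa, round(predicted_probability(gpa), 3),
          place_in_college_level(gpa, HYPOTHETICAL_CUTOFF))
```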


Table A.2. English algorithm models

English (standard errors in brackets)

| | Model 1 | Model 2 | Model 3 | Model 4 |
|---|---|---|---|---|
| HS GPA¹ | .0223682*** [.0012347] | | .0224298*** [.0012391] | .0237109*** [.0013036] |
| Missing GPA² | 1.774132*** [.1025804] | | 1.760689*** [.1032799] | 1.958613*** [.1139943] |
| Reading³ | | .0012387* [.0005945] | .0011471* [.0005718] | 0.0005107 [.0005756] |
| Sentence Skills³ | | 0.0004641 [.0007228] | -0.0003542 [.0006933] | -0.0003762 [.000688] |
| Written Essay⁴ | | 0.0000932 [.0021619] | -0.0016189 [.0022682] | -0.0005364 [.0021195] |
| Missing Reading² | | .3153487*** [.0729108] | .3324963*** [.0741815] | .2095208** [.0768058] |
| Missing Sentence Skills² | | -0.0265692 [.077416] | -.1465808* [.0740619] | -.1544521* [.0737526] |
| Missing Written Essay² | | 0.0205766 [.0266048] | 0.008454 [.0259509] | 0.0165223 [.0254477] |
| Years since HS graduation | | | | .0092922*** [.0014382] |
| Missing Year of Graduation² | | | | 0.0414768 [.0865134] |
| GED² | | | | -.1901675* [.0825778] |
| Missing Diploma Type² | | | | 0.03203 [.0941508] |
| High School Rank Percentile | | | | 0.0003502 [.000273] |
| Missing High School Rank² | | | | -0.0057774 [.0413373] |
| Constant | -1.146831*** [.100816] | .4775231*** [.0600773] | -1.217838*** [.1107201] | -1.300938*** [.1176786] |
| N | 3,786 | 3,786 | 3,786 | 3,786 |
| R2 | 0.0721164 | 0.0061075 | 0.0783223 | 0.0947421 |
| AIC | 4893.233 | 5161.418 | 4879.827 | 4823.771 |

Notes: ¹ 100-point scale; ² binary indicator; ³ test score range 20-120; ⁴ test score range 1-8.
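The fit statistics at the bottom of Table A.2 allow a quick comparison of the four specifications. As a minimal sketch, the code below ranks them by AIC (lower is better) and reports each model's AIC gap from the best one. Whether the authors selected models on this basis is not stated here, so the ranking is illustrative only.

```python
# R-squared and AIC as reported in Table A.2, in column order.
ENGLISH_MODELS = {
    "Model 1": {"r2": 0.0721164, "aic": 4893.233},
    "Model 2": {"r2": 0.0061075, "aic": 5161.418},
    "Model 3": {"r2": 0.0783223, "aic": 4879.827},
    "Model 4": {"r2": 0.0947421, "aic": 4823.771},
}


def rank_by_aic(models):
    """Return (best model name, {name: AIC minus best AIC})."""
    best = min(models, key=lambda name: models[name]["aic"])
    deltas = {name: stats["aic"] - models[best]["aic"]
              for name, stats in models.items()}
    return best, deltas


best, deltas = rank_by_aic(ENGLISH_MODELS)
print("Lowest AIC:", best)
for name, delta in sorted(deltas.items(), key=lambda item: item[1]):
    print(f"{name}: delta AIC = {delta:.3f}, R2 = {ENGLISH_MODELS[name]['r2']:.4f}")
```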


Table A.3. College Descriptives

| Institution | College 1 | College 2 | College 3 | Onondaga | College 4 | Schenectady | College 5 |
|---|---|---|---|---|---|---|---|
| GENERAL COLLEGE INFORMATION | | | | | | | |
| Student Population | 7,001 | 5,513 | 7,712 | 23,984 | 10,098 | 8,458 | 22,093 |
| Full-time Faculty | 69 | 80 | 151 | 194 | 122 | 79 | 215 |
| Part-time Faculty | 170 | 177 | 0 | 480 | 409 | 0 | 2 |
| Student/Faculty Ratio | 20 | 18 | 16 | 23 | 23 | 23 | 16 |
| % Receiving Financial Aid | 92% | 91% | 92% | 92% | 56% | 92% | 70% |
| DEMOGRAPHICS | | | | | | | |
| Race/ethnicity: | | | | | | | |
| American Indian or Alaska Native | 0% | 1% | 1% | 1% | 0% | 1% | 1% |
| Asian | 1% | 2% | 1% | 3% | 5% | 7% | 4% |
| Black or African American | 5% | 7% | 11% | 12% | 18% | 14% | 21% |
| Hispanic/Latino | 3% | 11% | 3% | 5% | 20% | 6% | 32% |
| Native Hawaiian or Other | 0% | 0% | 0% | 0% | 0% | 1% | 0% |
| White | 85% | 73% | 80% | 49% | 39% | 67% | 33% |
| Multi-Ethnic | 2% | 3% | 2% | 3% | 2% | 2% | 2% |
| Race/Ethnicity Unknown | 3% | 3% | 1% | 27% | 15% | 2% | 5% |
| Non-Resident Alien | 1% | 1% | 0% | 0% | 1% | 0% | 1% |
| Gender: | | | | | | | |
| Female | 60% | 58% | 59% | 52% | 54% | 53% | 53% |
| Male | 40% | 42% | 41% | 48% | 46% | 47% | 47% |
| Age: | | | | | | | |
| Under 18 | 30% | 17% | 19% | 24% | 10% | 37% | 1% |
| 18-24 | 44% | 52% | 60% | 55% | 63% | 40% | 69% |
| 25-65 | 26% | 31% | 21% | 21% | 26% | 23% | 30% |
| Age Unknown | 0% | 0% | 0% | 0% | 0% | 0% | 0% |
| RETENTION/GRADUATION RATES | | | | | | | |
| Retention: Full-Time Students | 56% | 55% | 63% | 57% | 68% | 56% | 64% |
| Retention: Part-Time Students | 28% | 30% | 47% | 34% | 56% | 50% | 53% |
| Three-Year Graduation Rate | 24% | 27% | 28% | 20% | 29% | 20% | 15% |
| Transfer Out Rate | 18% | 19% | 18% | 22% | 19% | 22% | 18% |

Source: U.S. Department of Education, National Center for Education Statistics, IPEDS, Fall 2015, Institutional Characteristics.


Table A.4. Effect on placement into college-level math

| | (1) | (2) | (3) | (4) |
|---|---|---|---|---|
| Treatment | 0.044*** | 0.047*** | 0.049*** | 0.050*** |
| | (0.014) | (0.014) | (0.014) | (0.014) |
| Control Mean | 0.437 | | | |
| College FE | YES | YES | YES | YES |
| Demographic Indicators | NO | YES | YES | YES |
| Income indicators | NO | NO | YES | YES |
| Algorithm Values | NO | NO | NO | YES |
| Observations | 4,371 | 4,371 | 4,371 | 4,371 |

Notes: Robust standard errors shown in parentheses. The model in column (1) includes only fixed effects for college and no additional controls. Column (2) includes fixed effects for colleges and controls for demographic indicators including race, gender and age. Column (3) includes college fixed effects, controls for demographic indicators, and proxies for income including Pell and TAP Grant recipient status. Column (4) includes all the previous controls plus calculated math and English algorithm values. *** p<0.01, ** p<0.05, * p<0.1

Table A.5. Effect on enrolling in college-level math in term 1

| | (1) | (2) | (3) | (4) |
|---|---|---|---|---|
| Treatment | 0.039*** | 0.044*** | 0.046*** | 0.046*** |
| | (0.012) | (0.012) | (0.012) | (0.012) |
| Control Mean | 0.237 | | | |

Table A.6. Effect on enrolling and passing college-level math in term 1

| | (1) | (2) | (3) | (4) |
|---|---|---|---|---|
| Treatment | 0.025** | 0.027*** | 0.028*** | 0.028*** |
| | (0.010) | (0.010) | (0.010) | (0.010) |
| Control Mean | 0.134 | | | |

Table A.7. Effect on placement into college-level English

| | (1) | (2) | (3) | (4) |
|---|---|---|---|---|
| Treatment | 0.298*** | 0.301*** | 0.302*** | 0.304*** |
| | (0.015) | (0.015) | (0.014) | (0.014) |
| Control Mean | 0.524 | | | |

Table A.8. Effect on enrolling in college-level English in term 1

| | (1) | (2) | (3) | (4) |
|---|---|---|---|---|
| Treatment | 0.183*** | 0.192*** | 0.193*** | 0.193*** |
| | (0.016) | (0.014) | (0.014) | (0.014) |
| Control Mean | 0.408 | | | |

Table A.9. Effect on enrolling and passing college-level English in term 1

| | (1) | (2) | (3) | (4) |
|---|---|---|---|---|
| Treatment | 0.118*** | 0.124*** | 0.125*** | 0.125*** |
| | (0.016) | (0.014) | (0.014) | (0.014) |
| Control Mean | 0.272 | | | |

Table A.10. Effect on enrolling in any college-level course in term 1

| | (1) | (2) | (3) | (4) |
|---|---|---|---|---|
| Treatment | 0.001 | 0.042*** | 0.009*** | 0.009*** |
| | (0.011) | (0.011) | (0.003) | (0.003) |
| Control Mean | 0.807 0.616 | | | |

Table A.11. Effect on enrolling in and passing any college-level course in term 1

| | (1) | (2) | (3) | (4) |
|---|---|---|---|---|
| Treatment | 0.034** | 0.041*** | 0.042*** | 0.042*** |
| | (0.014) | (0.011) | (0.011) | (0.011) |
| Control Mean | 0.616 | | | |

Table A.12. Effect on cumulative college-level credits earned in term 1

| | (1) | (2) | (3) | (4) |
|---|---|---|---|---|
| Treatment | 0.503*** | 0.572*** | 0.593*** | 0.599*** |
| | (0.150) | (0.130) | (0.129) | (0.129) |
| Control Mean | 5.170 | | | |



Table A.13. Costs for Implementation of Placement Algorithms

| Ingredient | Price per FTE | College 1 | College 2 | College 3 | College 4 | College 5 | Total |
|---|---|---|---|---|---|---|---|
| Personnel (FTEs): | | | | | | | |
| IT (a) | $56,230 | 0.30 | 0.18 | 0.05 | 0.10 | 0.18 | 0.80 |
| Program (b) | $47,500 | 0.87 | 0.15 | 0.14 | 0.91 | 0.44 | 2.50 |
| Senior/Faculty (c) | $62,500 | 0.25 | 0.25 | 0.25 | 0.07 | 0.16 | 0.98 |
| Administrative support (d) | $35,950 | 0.05 | 0.05 | 0.05 | 0.05 | 0.05 | 0.25 |
| Evaluator time (e) | $56,230 | 0.10 | 0.10 | 0.10 | 0.10 | 0.10 | 0.50 |
| Total Personnel Costs | | $81,000 | $40,180 | $32,470 | $60,160 | $48,600 | $262,410 |
| Fringe benefits (f) | | $26,730 | $13,260 | $10,720 | $19,850 | $16,040 | $86,600 |
| Overheads / facilities (g) | | $78,570 | $38,970 | $31,500 | $58,360 | $47,140 | $254,540 |
| Administer Placement Test (h) | | $82,590 | $21,540 | $31,260 | $12,930 | $10,770 | $159,090 |
| Total Cost | | $268,890 | $113,950 | $105,950 | $151,300 | $122,550 | $762,640 |
| Total Cost over status quo | | $186,300 | $92,410 | $74,690 | $138,370 | $111,780 | $603,550 |
| Students per semester | | 2,753 | 718 | 1,042 | 431 | 359 | 5,303 |
| Average Cost over status quo | | $70 | $130 | $70 | $320 | $310 | $110 |

Notes: 2016 dollars. Present values (discount rate = 3%). Rounded to $10. Ingredients information on FTEs from interviews with key personnel at five colleges (no information was available from Onondaga CC).
(a) Salary data from https://www.cs.ny.gov/businesssuite/Compensation/Salary-Schedules/index.cfm?nu=PST&effdt=04/01/2015&archive=1&fullScreen
(b) Annual salary (step 4, grade 13) from https://www.suny.edu/media/suny/content-assets/documents/hr/UUP_2011-2017_ProfessionalSalarySchedule.pdf
(c) Midpoint MP-IV, https://www.suny.edu/hr/compensation/salary/mc-salary-schedule/
(d) https://www.cs.ny.gov/businesssuite/Compensation/Salary-Schedules/index.cfm?nu=CSA&effdt=04/01/2015&archive=1&fullScreen
(e) Estimated from timesheets (CCRC/MDRC evaluators).
(f) Uprated from ratio of fringe benefits to total salaries (IPEDS data, 2013; 846 public community colleges).
(g) Uprated from ratio of all other expenses to total salaries (IPEDS data, 2013; 846 public community colleges).
(h) Cost to administer placement test from Rodriguez et al. (2014).
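The last rows of Table A.13 can be recovered from the rows above them: cost over the status quo is total cost minus the cost of administering the placement test, and the average divides that by students per semester, rounded to the nearest $10. The sketch below reproduces the arithmetic from the reported figures. The variable and function names are illustrative.

```python
COLLEGES = ["College 1", "College 2", "College 3", "College 4", "College 5"]
# Dollar figures and student counts copied from Table A.13.
TOTAL_COST = [268890, 113950, 105950, 151300, 122550]
PLACEMENT_TEST_COST = [82590, 21540, 31260, 12930, 10770]
STUDENTS_PER_SEMESTER = [2753, 718, 1042, 431, 359]


def round_to_10(amount):
    """Round a dollar amount to the nearest $10, as in the table notes."""
    return int(round(amount / 10.0)) * 10


def cost_over_status_quo():
    """Total cost net of the placement test, which colleges already administer."""
    return [total - test for total, test in zip(TOTAL_COST, PLACEMENT_TEST_COST)]


def average_cost_over_status_quo():
    """Per-student cost over the status quo for each college and overall."""
    extra = cost_over_status_quo()
    per_college = [round_to_10(cost / n)
                   for cost, n in zip(extra, STUDENTS_PER_SEMESTER)]
    overall = round_to_10(sum(extra) / sum(STUDENTS_PER_SEMESTER))
    return per_college, overall


per_college, overall = average_cost_over_status_quo()
for name, value in zip(COLLEGES, per_college):
    print(f"{name}: ${value} per student")
print(f"All colleges: ${overall} per student")
```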


Table A.14. Operating Costs for Placement Algorithms

| Ingredient | Price per FTE | College 1 | College 2 | College 3 | College 4 | College 5 | Total |
|---|---|---|---|---|---|---|---|
| Personnel (FTEs): | | | | | | | |
| IT (a) | $56,230 | 0.03 | 0.02 | 0.01 | 0.01 | 0.02 | 0.08 |
| Program (b) | $47,500 | 0.87 | 0.15 | 0.14 | 0.91 | 0.44 | 2.50 |
| Senior/Faculty (c) | $62,500 | 0.03 | 0.03 | 0.03 | 0.01 | 0.02 | 0.10 |
| Administration (d) | $35,950 | 0.01 | 0.01 | 0.01 | 0.01 | 0.01 | 0.03 |
| Total Personnel Costs | | $44,520 | $10,020 | $8,640 | $44,140 | $23,150 | $130,470 |
| Fringe benefits (e) | | $14,690 | $3,310 | $2,850 | $14,570 | $7,640 | $43,060 |
| Overheads / facilities (f) | | $14,250 | $3,210 | $2,760 | $14,130 | $7,410 | $41,770 |
| Administer Placement Test (g) | | $82,590 | $21,540 | $31,260 | $12,930 | $10,770 | $159,090 |
| Total Operating Cost (TOC)* | | $156,050 | $38,080 | $45,510 | $85,770 | $48,970 | $374,390 |
| TOC over status quo | | $73,460 | $16,540 | $14,250 | $72,840 | $38,200 | $215,300 |
| Students per semester | | 2,753 | 718 | 1,042 | 431 | 359 | 5,303 |
| Average Operating Cost over status quo | | $30 | $20 | $10 | $170 | $110 | $40 |

Notes: 2016 dollars. Present values (discount rate = 3%). Rounded to $10. Ingredients information on FTEs from interviews with key personnel at five colleges (no information was available from Onondaga CC).
a Salary data from https://www.cs.ny.gov/businesssuite/Compensation/Salary-Schedules/index.cfm?nu=PST&effdt=04/01/2015&archive=1&fullScreen
b Annual salary (step 4, grade 13) from https://www.suny.edu/media/suny/content-assets/documents/hr/UUP_2011-2017_ProfessionalSalarySchedule.pdf
c Midpoint MP-IV, https://www.suny.edu/hr/compensation/salary/mc-salary-schedule/
d https://www.cs.ny.gov/businesssuite/Compensation/Salary-Schedules/index.cfm?nu=CSA&effdt=04/01/2015&archive=1&fullScreen
e Estimated from timesheets (CCRC/MDRC evaluators).
f Uprated from ratio of fringe benefits to total salaries (IPEDS data, 2013; 846 public community colleges).
g Uprated from ratio of all other expenses to total salaries (IPEDS data, 2013; 846 public community colleges).
h Cost to administer placement test from Rodriguez et al. (2014).
* Operating Cost refers to running of new placement system after initial algorithm has been developed and tested.
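The operating-cost totals in Table A.14 can be checked for internal consistency: each college's total operating cost should equal personnel plus fringe benefits plus overheads plus the placement test, and the amount over the status quo should equal that total net of the placement test. The sketch below runs this check on the reported figures. The field names and helper function are illustrative, not part of the paper.

```python
# Dollar figures copied from Table A.14, one entry per column.
OPERATING = {
    "College 1": {"personnel": 44520, "fringe": 14690, "overheads": 14250,
                  "placement_test": 82590, "total": 156050, "over_status_quo": 73460},
    "College 2": {"personnel": 10020, "fringe": 3310, "overheads": 3210,
                  "placement_test": 21540, "total": 38080, "over_status_quo": 16540},
    "College 3": {"personnel": 8640, "fringe": 2850, "overheads": 2760,
                  "placement_test": 31260, "total": 45510, "over_status_quo": 14250},
    "College 4": {"personnel": 44140, "fringe": 14570, "overheads": 14130,
                  "placement_test": 12930, "total": 85770, "over_status_quo": 72840},
    "College 5": {"personnel": 23150, "fringe": 7640, "overheads": 7410,
                  "placement_test": 10770, "total": 48970, "over_status_quo": 38200},
    "Total": {"personnel": 130470, "fringe": 43060, "overheads": 41770,
              "placement_test": 159090, "total": 374390, "over_status_quo": 215300},
}


def inconsistent_columns(table):
    """Return the names of columns whose reported totals do not add up."""
    problems = []
    for name, row in table.items():
        components = (row["personnel"] + row["fringe"]
                      + row["overheads"] + row["placement_test"])
        net_of_test = row["total"] - row["placement_test"]
        if components != row["total"] or net_of_test != row["over_status_quo"]:
            problems.append(name)
    return problems


print(inconsistent_columns(OPERATING) or "All reported totals add up.")
```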
