Simon Webb & Julien Dicaire
ECON 774 | MCMASTER UNIVERSITY

Neighborhood Effects on Early Childhood Test Scores
AN EXAMINATION OF THE ACHIEVEMENT GAP


Introduction

Expenditure on education takes up a large portion of the budget for the Ontario Provincial

Government. It is therefore imperative that this expenditure be allocated in the most efficient

way possible. As we will show, neighborhood and family factors are deemed important to

student outcomes in most of the literature. The Ontario Ministry of Education implemented the

Ontario Early Years Policy Framework which recommended the introduction of new Early Years

Centers and the four-year phase-in of Full-day Kindergarten in Ontario. Since this is a large and expensive policy, it is of utmost interest to the Government of Ontario that these strategies have the most significant possible effect on student outcomes, so that the funds are allocated effectively.

These two policies are mainly aimed at exposing the student to better social and academic

surroundings than they might have been previously experiencing. That is, the strategies are

expected to lessen the impacts which sub-optimal environments can have on early childhood

development. We use a set of neighborhood and family variables which potentially have an

effect on early childhood development to create an index which represents the socio-economic

status of a neighborhood as well as the general environment of learning.

Focusing on Ontario, we examine whether or not the socioeconomic index has an impact

on early childhood performance. We create two separate datasets of grade 3 and 6 students, as

well as a dataset of students linked between grade 3 and grade 6. This third dataset is used to

conduct a value-added methodology to determine whether or not socioeconomic status is

affecting test scores between grades 3 and 6. This methodology allows us to see whether the gap in student achievement is increasing, decreasing, or unchanged between grades 3 and 6. The other two datasets, those of our unlinked grade 3 and grade 6 students, allow us to see if there are observable gaps at those ages.

If a gap exists, then recommendations can be made for targeting government funds

towards early childhood education in disadvantaged communities. However, in our case, the

results seem to indicate a negligible gap, and our recommendation is instead that further research

is needed. We then explain the possible sources of bias that might be causing our results to be

both economically and statistically insignificant.

Background

Rising income inequality has become an extremely important issue across much of the

world. The present day media shows an increasing amount of political unrest surrounding the

issue of relatively high-income individuals becoming ever wealthier while the rest of the income

distribution seems to have plateaued1. Delving deeper into this issue, many academics have

sought to determine how this income gap develops in the first place. It is a common belief that

1 Saez, Emmanuel, and Michael R. Veall. "The evolution of high incomes in Northern America: lessons

from Canadian evidence." American Economic Review (2005). pp. 840.


individuals who are raised in privileged families fare better economically in their adult lives than

children raised in underprivileged families, but a still relatively unanswered question is the

mechanism with which this gap develops. To what degree is the income gap merely a product of

differences in inherited wealth and to what degree is it woven into the skill sets of privileged

children? The current study aims to shed further light on this second point, and in particular to

uncover whether or not performance gaps are observable in early childhood test scores.

When do observable achievement gaps develop?

Other studies before ours have also tested for observable performance gaps among

different groups of children at young ages. Fryer and Levitt test for statistically significant differences in the performance of children as early as 6 months old. Their study aims to see if there is a difference in cognitive ability among children of different backgrounds. It finds no difference in ability between children of different backgrounds, except for minor differences by race. This suggests that if we observe differences in test scores later in childhood, they must be due to some factor that operates between birth and the test at which we observe the gap.2

Sammons, Hall, Sylva, Melhuish, Siraj-Blatchford & Taggart use an index of deprivation to examine whether a child’s state of deprivation affects self-regulation and test scores. Using a value-added method, which is identical to our methodology for the linked students, the authors find both a direct and an indirect effect of deprivation. The direct effect is the immediate reduction in student effectiveness on tests, and the indirect effect is the lasting effect of deprivation on student self-regulation and overall learning from previous years.3

In a third study, Fryer and Levitt find that white students perform better in kindergarten

math than black students and that this gap in achievement becomes larger over time. In reading,

they find that the gap is present as well but smaller and that this gap also widens over time. Fryer

and Levitt could not come to the conclusion that school quality was the reason for the gap’s

existence. These are simply mean comparisons, however; when a regression is conducted, the gap between black and white students still grows by roughly one tenth of a standard deviation per year in both math and reading test scores. Fryer and Levitt could not identify whether the source of the gap was the school, the student’s environment or parental factors.4

The two Fryer and Levitt studies show conflicting results. The first failed to find observable gaps

between children at 6 months of age and the second found some significant gaps (relating to

race) in kindergarten level test scores. In order to test the presence of an achievement gap in our

2 Fryer Jr, Roland G., and Steven D. Levitt. "Testing for racial differences in the mental ability of young children." The American Economic Review 103.2 (2013). pp 985.

3 Sammons, Pam, James Hall, Kathy Sylva, Edward Melhuish, Iram Siraj-Blatchford & Brenda Taggart, “Protecting the development of 5-11-year-olds from the impacts of early disadvantage: the role of primary school academic effectiveness,” School Effectiveness and School Improvement 24.2 (2013): 256.

4 Fryer, Roland G. Jr. & Steven D. Levitt, “The Black-White Test Score Gap Through Third Grade”, NBER Working Paper No. 11049, January 2005, JEL No. I2


current study, we need to look deeper into the possible underlying drivers of such a gap. We now

look to the field of Early Childhood Development to uncover some of these possible drivers.

Early childhood development

“The early years are a period of intense learning and development, when

tremendous changes occur in the brain over a short period of time. In the first

year of life, the architecture of the brain takes shape at an astounding rate –

approximately 700 new neural connections are being built per second.

Scientists now know that this process is not genetically predetermined, but is in

fact dramatically influenced by children’s early experiences with people and

their surroundings.” - Center on the Developing Child, Harvard University5

This quote from the Center on the Developing Child at Harvard University implies that at birth, all individuals have equal opportunity in terms of brain development. The quote also mentions that brain development in young children is influenced by the people and environment they are surrounded by. If children from privileged families and neighborhoods were to have observably higher test score performance, this could be evidence that this ‘privileged state’ is influencing the brain development of these children, and that the income inequality seen in the world develops in part due to gaps in the earliest phases of brain development between privileged and underprivileged children. We now turn to the issue of how to define privileged and underprivileged children in order to compare their performance.

What factors cause achievement gaps in early childhood?

Some studies have reviewed what they believe are the factors attributed to the differences

in test scores among individuals. These range from environmental factors such as crime, to

family factors such as the educational attainment of the mother and father.6 These factors, while

they may not directly affect the learning of the child, are usually the closest measures which can

be obtained to represent the parental commitment to the child’s learning. It is assumed these

factors also capture the societal forces that may be stemming from the neighborhood as it is

difficult to observe how nurturing a neighborhood is to a primary student.

Parental factors affect learning through two possible channels: Educational Effectiveness and School Effectiveness7. The home learning environment is also important: the amount of time a father and mother put into their child’s education seems to pass on some aspect of their own education. Also important are the education levels of

5 Retrieved from http://developingchild.harvard.edu/index.php/resources/briefs/inbrief_series/inbrief_the_science_of_ecd/

6 Shonkoff, Jack P., and Deborah A. Phillips, eds. From neurons to neighborhoods: The science of early childhood development. National Academies Press, 2000.

7 Sammons, Pam, James Hall, Kathy Sylva, Edward Melhuish, Iram Siraj-Blatchford & Brenda Taggart, “Protecting the development of 5-11-year-olds from the impacts of early disadvantage: the role of primary school academic effectiveness,” School Effectiveness and School Improvement 24.2 (2013). pp 252.


the mother and father, to varying degrees. The education level of the mother is far more important to the child’s success in school than the father’s, possibly because mothers tend to spend more time with the child.8 Neighborhood effects work in much the same manner. Young children look for mentors outside the home, people to look up to, and the kind of person that mentor is can be quite influential to the learning of the child.

Several studies suggest that the socioeconomic status of a student matters greatly to that student’s success in school. Sammons, Hall, Sylva, Melhuish, Siraj-Blatchford & Taggart8, utilizing an index of multiple deprivation that Melhuish previously created9 using census measures, determine the effects of placing the student into a more

academically stimulating, less deprived environment. The authors utilize a value-added approach

to come to the conclusion that deprivation has a significant negative effect on the student which

becomes internalized over time. This means that, should the student be placed in an environment

of significant deprivation for a lengthy period of time, the impact of the deprivation will

accumulate, impacting further self-regulation and educational attainment.

In Figure 1 below, Edward Melhuish shows factors he found to be affecting student performance at age 11 in the United Kingdom.9 In the figure, we see that family factors are indeed important to student development and that their effect, listed in standard deviations on the y-axis, is larger when combined with the influence schools have on test scores.

Figure 1: Bar graph from Edward Melhuish study demonstrating the relative importance of different factors in affecting children's

academic achievement.

Returning to our original question, we are trying to examine the performance gaps that

exist between privileged and underprivileged children. Also, given that we only have access to

neighborhood level Census data, our definition of privileged and underprivileged must be at the

neighborhood level. As Melhuish notes in the graph, there is a range of factors affecting student performance in grade 6. Many of these factors are also highly correlated. Therefore, in order to

create a definition of privileged children at the neighborhood level, we decided to create an index

which would represent the socioeconomic status of the neighborhoods in which the children

8 Melhuish, Edward, Keynote Presentation,“Excellence and Equity in Early Childhood Education and

Care,” EU conference, Budapest. 2011.

9 Melhuish, Edward C. "Preschool matters." Science 333.6040 (2011). pp 299-300.


attend school. This creates a simple way of comparing children over a large number of variables

which are all correlated.

Dr. Krishnan from the University of Alberta wrote a paper titled “Constructing an Area-Based Socioeconomic Status Index: A Principal Components Analysis Approach”.10 In this paper he outlines a method to create a socioeconomic index which compares neighborhoods

using a set of variables (constructed using Alberta Census data) which are expected to have an

effect on early childhood development. Since this index both matched our data and was highly

applicable to the research question, we decided to create our own socioeconomic status index for

Ontario in order to compare students. The next section outlines the steps we took to create this

index.

Socioeconomic Index

Krishnan (2010) outlined a method to combine a large group of socioeconomic variables into one socioeconomic index (henceforth SEI) using factor analysis. Factor analysis is a method which identifies the variation that is common among a set of variables and summarizes it in a small number of underlying factors. The idea is that the factor which is common among all of the variables chosen in Krishnan (2010) is the relative ‘socioeconomic status’ of neighborhoods.

There were a few key differences between Dr. Krishnan’s data and ours. Dr. Krishnan

constructed his index based on Alberta Census data at the Dissemination Area (DA) level.

Our data uses Ontario Census data at the Forward Sortation Area (FSA) level. Although FSAs

are larger and less precise areas, this may also help our analysis by capturing a larger number of

children who live in the vicinity of their respective schools.

Krishnan outlines six steps used in his approach:11

● Select appropriate variables

● Check for outliers, linearity and normality

● Ensure the applicability of factor analysis

● Conduct factor analysis

● Interpret results

● Classify into groups

The rest of this section outlines briefly how our study mimicked these steps to create our

socioeconomic index.

Select Appropriate Variables

Our analysis used 23 of the 26 variables used in Krishnan (2010). The three variables

which we decided to omit were the number of low income families, the number of owned

occupations and the ‘income disparities’ in the neighborhood. These were omitted because there

10 Krishnan, Vijaya. "Constructing an area-based socioeconomic index: A principal components analysis approach." Edmonton, Alberta: Early Child Development Mapping Project (2010).

11 Krishnan, Vijaya. "Constructing an area-based socioeconomic index: A principal components analysis approach." Edmonton, Alberta: Early Child Development Mapping Project (2010), pp.10-11.


were some large outliers in the data due to the way in which the variables were constructed. In order to compensate for the loss of these variables, the Gini coefficient of each FSA was added as a 24th variable. We then ended up with the 24 variables listed below in Table 1.

Check for outliers, linearity and normality

Factor analysis can be very sensitive to outliers, skewness and kurtosis in the distributions of the data, as well as to the magnitude of the variables. For instance, income, which varies from 0 to slightly over four hundred thousand, will receive a higher weight than the variable which captures the average number of children that families have in an FSA.

To combat these sensitivities, the variables are transformed, where appropriate, according to the methods outlined in Dr. Krishnan’s paper. The variables along with their respective descriptions are listed below in Table 1. Outliers may still exist after transformation, so any observation more than six standard deviations from the mean is dropped. This is done for three FSAs. Because no students occupy these FSAs, dropping them only affects the SEI quintiles.
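As a minimal sketch of the scaling problem described above, the snippet below standardizes each variable to z-scores and then drops any area with a value beyond six standard deviations. It assumes plain standardization; the transformations actually prescribed in Krishnan (2010) are not reproduced here, and the function names, area labels and numbers are hypothetical.

```python
from statistics import mean, pstdev


def standardize(values):
    """Return z-scores so the variable has mean 0 and standard deviation 1."""
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        raise ValueError("variable is constant; cannot standardize")
    return [(v - mu) / sigma for v in values]


def drop_outliers(rows, threshold=6.0):
    """Keep only areas whose standardized variables all lie within the threshold.

    `rows` maps an area label to its list of already-standardized variables.
    """
    return {
        area: z
        for area, z in rows.items()
        if all(abs(x) <= threshold for x in z)
    }


# Hypothetical values: income is on a far larger scale than family size,
# so without standardization it would dominate the analysis.
income = [30_000, 55_000, 80_000, 400_000]
children = [1.1, 1.8, 2.0, 1.5]
z_income = standardize(income)
z_children = standardize(children)
```

After standardization both variables are on the same scale, so neither receives extra weight merely because of its units.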

Table 1: A description of variables utilized in the creation of the SEI is provided below.

Ensure the applicability of factor analysis

We use a number of statistical tests to ensure the applicability of factor analysis to our data (also outlined in Krishnan (2010)). The first of these is the Kaiser-Meyer-Olkin (KMO) measure of sampling adequacy, which compares the correlations between the variables to their partial correlations; a value close to 1 indicates that the variables share enough common variation for factor analysis to be appropriate. The result of this test on our variables was a KMO rating of 0.856, which is sufficiently high to conclude that our data are well suited to factor analysis. Another test, the Bartlett Test of Sphericity, detects whether there is enough correlation between the variables to warrant the use of factor analysis. Our data received a p-value that rounds to 0.000 on the Bartlett Test of Sphericity, meaning we can reject the null hypothesis that the variables are uncorrelated.
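For readers unfamiliar with the Bartlett Test of Sphericity, the following sketch computes its chi-square statistic from a correlation matrix using the standard formula, -(n - 1 - (2p + 5)/6) ln|R| with p(p - 1)/2 degrees of freedom. It is an illustration only: the function names are hypothetical, the p-value lookup is omitted, and it is not the software routine used in the study.

```python
import math


def determinant(matrix):
    """Determinant by Gaussian elimination with partial pivoting."""
    a = [row[:] for row in matrix]
    n = len(a)
    det = 1.0
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        if abs(a[pivot][col]) < 1e-12:
            return 0.0
        if pivot != col:
            a[col], a[pivot] = a[pivot], a[col]
            det = -det
        det *= a[col][col]
        for r in range(col + 1, n):
            factor = a[r][col] / a[col][col]
            for c in range(col, n):
                a[r][c] -= factor * a[col][c]
    return det


def bartlett_sphericity(corr, n_obs):
    """Return (chi-square statistic, degrees of freedom).

    `corr` is the p x p correlation matrix and `n_obs` the number of
    observations. Under the null hypothesis the variables are uncorrelated,
    so the correlation matrix is the identity and the statistic is zero.
    """
    p = len(corr)
    det = determinant(corr)
    if det <= 0:
        raise ValueError("correlation matrix must be positive definite")
    statistic = -(n_obs - 1 - (2 * p + 5) / 6) * math.log(det)
    dof = p * (p - 1) // 2
    return statistic, dof
```

A large statistic relative to the chi-square distribution with the returned degrees of freedom leads to rejecting the null of no correlation, which is the outcome reported above.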

Conduct factor analysis

Table 2 below shows the rotated factor loadings which resulted from our factor analysis. Our analysis makes use of four factors, which are linear combinations of the standardized variables. The values in the table represent the relative weightings (from 0 to 1 in absolute value) of each variable on its respective factor. For example, the first factor relates lightly to the dwelling and the residents of the neighborhood. The second factor relates mainly to the labour characteristics of the household. The third factor relates mainly to inequality, as well as the family structure. Finally, the fourth factor is a measure of inequality, female labour force participation and dependence, the ratio of children under four and adults above 65 to the eligible labour force participants. For the interpretation of the factors, we omit any variable whose correlation with the factor is less than 0.5 in absolute value. Uniqueness is the proportion of a variable’s variance that is not explained by the factors. Ideally, uniqueness should be relatively low, since a low value means the factors capture most of the variable’s variation. The factors are then combined into a single index as a weighted sum, where each factor’s weight is the proportion of variance it explains divided by the total variance explained by all of the factors. The result is a single variable (the ‘socioeconomic index’) which takes a value between 0 and 1. Once the index is created, it is inverted (in order to make 1 ‘high’ and 0 ‘low’) and multiplied by 100 in order to make the scale easier to use.
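The weighting, inversion and rescaling just described can be sketched as follows. The text does not state how the combined score is mapped onto the 0 to 1 range, so min-max rescaling is assumed here; the function name and the example numbers are hypothetical.

```python
def composite_index(factor_scores, variance_explained):
    """Combine factor scores into a single 0-100 index.

    factor_scores: one row per area, each row holding one score per factor.
    variance_explained: proportion of variance explained by each factor.
    """
    total = sum(variance_explained)
    # Each factor's weight is its share of the total explained variance.
    weights = [v / total for v in variance_explained]
    raw = [sum(w * s for w, s in zip(weights, row)) for row in factor_scores]

    lo, hi = min(raw), max(raw)
    if hi == lo:
        raise ValueError("all areas have the same score; cannot rescale")
    # Assumed step: min-max rescaling onto the 0-1 range.
    scaled = [(r - lo) / (hi - lo) for r in raw]
    # Invert so that higher values mean higher standing, then scale to 0-100.
    return [100 * (1 - s) for s in scaled]


# Hypothetical example with three areas and two factors.
example = composite_index(
    [[1.0, 0.0], [0.0, 0.0], [-1.0, 0.0]],
    [0.3, 0.1],
)
```

In this example the area with the largest raw score ends up at 0 after inversion and the area with the smallest raw score ends up at 100.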

Table 2: The Rotated Factor Loadings visually demonstrate the correlations between the variables and the factor variables.


Interpret Results

Figure 2 below shows the distribution of Ontario FSAs in the socioeconomic index. The FSAs appear to be roughly normally distributed in the SEI, with a rightward skew. Upon investigation, the FSAs with the highest and lowest SEI ratings were in areas which we would expect. An investigation by Teri Pecoskie into schools in Hamilton, Ontario, revealed that students fared worse in schools where English was not the first language, where the share of low-income families was higher than average, and where students required special assistance. These, and others, are factors that we have proxies for, so we would expect our SEI to accurately identify neighborhoods that are nurturing to student development.12

Figure 2: The 2006 ungrouped distribution of the Socioeconomic Index. For all Forward Sortation Areas that have a positive population at the time of the census, an SEI score is given based on the 24 variables from the Census.

Classify into groups

After constructing the SEI, the FSAs were placed into five quintiles within the SEI. The number of groups follows Krishnan (2010). Although the selection of the number of groups was somewhat arbitrary, we did sensitivity analysis with respect to the number of groups and found that, while making the graphs and regression tables more complicated, the addition of groups did very little to change the results. Table 3 below shows some summary statistics for three of the five groups. The lowest standing SEI is the first quintile, where we would expect the statistics to describe an environment less nurturing to student development. We see that the percentage without a diploma, the variable Education, is the highest of all standings. The divorce rate, unemployment rate and percentage of immigrants are all higher than in their higher standing SEI counterparts. For the SEI of highest standing, the Gini is lowest, indicating low income inequality; nearly a third of the population is without a high school diploma; the unemployment rate is much lower than the national average at the time; and the proportion of recent immigrants is less than two percent.
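The grouping step can be illustrated with a short sketch that ranks areas by SEI and splits them into equal-sized groups. The number of groups is a parameter so that the sensitivity analysis mentioned above can be repeated with other values. The function name and the area labels are hypothetical, and ties are broken by input order, which is an assumption.

```python
def assign_groups(sei_by_area, n_groups=5):
    """Map each area to a group from 1 (lowest SEI) to n_groups (highest SEI)."""
    ranked = sorted(sei_by_area, key=sei_by_area.get)
    n = len(ranked)
    # Integer arithmetic splits the ranking into n_groups near-equal parts.
    return {area: (rank * n_groups) // n + 1 for rank, area in enumerate(ranked)}


# Hypothetical SEI values for ten areas.
sei = {f"area{i}": float(score) for i, score in enumerate(
    [12, 48, 95, 33, 71, 5, 60, 88, 27, 54]
)}
quintiles = assign_groups(sei)
terciles = assign_groups(sei, n_groups=3)
```

With ten areas and five groups, each quintile contains exactly two areas.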

12 Pecoskie, Teri, “Keeping Score: Day 1: Unequal Education”, The Hamilton Spectator, April 12, 2014. Accessed June 12th, 2014.


Table 3: Selected summary statistics for the lowest, middle and highest SEI quintiles. Other than the Gini, variable means can be

read as percentages.

Validation of the Socioeconomic Index

In order to validate the socioeconomic index we created with our data, we used three

different methods. First we hand-checked some of the FSAs with extremely low and extremely high

SEI values to see if the areas made sense using our prior knowledge of Ontario neighborhoods as

well as identifying which measures were causing the low SEI score. Among some of the lowest

were some notoriously deprived areas of Toronto and Hamilton, and some of the highest were

rich suburban areas of Toronto and Ottawa. Secondly, we constructed the SEI for both 2001 and

2006 and compared the relative locations of FSAs along the spectrum in each year. Figure 3 below shows a scatter plot of the standardized 2001 and 2006 SEI values of each FSA compared to a 45 degree line13. Because factor analysis is sensitive to its inputs, it is important to confirm that the SEI is not drastically different between the two census years; stability would suggest that the index has proper underpinnings. It would be a concern if the relative positions of the FSAs changed erratically between these two years (since this is not what we observe in the real world). Outside of a few exceptions, however, the FSAs appear to fall close to the 45 degree line, meaning they are in approximately the same relative position in both years compared to other FSAs. The extreme outliers in this graph are where neighborhood development occurred between the census years, changing the demographics and characteristics of the FSA.

Figure 3: 2006 SEI on the vertical axis, and the 2001 SEI on the horizontal compared to a 45 degree line.

13 2006 SEI rankings were larger on average than 2001 SEI rankings. This was likely due to economic development in Ontario between these two years. Therefore, in order to compare the SEI values of FSAs between years, the distributions are standardized for both SEIs.
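The comparison against the 45 degree line can be sketched in a few lines: standardize each year's SEI separately, then measure how far each area sits from the line where the two standardized values are equal. The function names, area labels and values are hypothetical, and this is only one plausible way to flag the outliers discussed above.

```python
from statistics import mean, pstdev


def zscores(values):
    """Standardize a list of values to mean 0 and standard deviation 1."""
    mu, sigma = mean(values), pstdev(values)
    return [(v - mu) / sigma for v in values]


def largest_movers(sei_2001, sei_2006, top=3):
    """Return the areas furthest from the 45 degree line, largest gap first.

    Each result is (area, standardized 2006 value minus standardized 2001 value).
    """
    common = sorted(set(sei_2001) & set(sei_2006))
    z01 = zscores([sei_2001[a] for a in common])
    z06 = zscores([sei_2006[a] for a in common])
    gaps = {a: later - earlier for a, earlier, later in zip(common, z01, z06)}
    return sorted(gaps.items(), key=lambda kv: abs(kv[1]), reverse=True)[:top]


# Hypothetical values: every area rises by 5 points, so relative positions
# are unchanged and every area lies on the 45 degree line.
stable = largest_movers(
    {"A": 10, "B": 20, "C": 30, "D": 40},
    {"A": 15, "B": 25, "C": 35, "D": 45},
)
# Here area D falls from the top of the ranking to the bottom.
shifted = largest_movers(
    {"A": 10, "B": 20, "C": 30, "D": 40},
    {"A": 10, "B": 20, "C": 30, "D": 0},
)
```

Standardizing first removes the general upward shift between the two census years, so only changes in relative position register as gaps.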


Our third method of validation was to link our census data to the schools within each

FSA and examine the distribution of schools in the socioeconomic index. This is shown in Figure

4 below. Interestingly, there are a number of schools at the lower end of the SEI distribution

while there are no schools at the very upper end of the SEI distribution. There is a significant

amount of variation in the placement of schools, with a slight rightward skew to higher SEI

neighborhoods. However, the distribution of schools in SEI does not appear to be overly skewed

and does not display concerning abnormal peaks. This reduces the risk of bias once we move to

examining school and student differences in test scores between socioeconomic neighborhoods.

Figure 4: Distribution of schools among the SEI where one record per school is included.

Now that the SEI has been constructed, we can utilize it to examine the gap in test scores

among the different values of the SEI. We relate the test scores to the SEI in two ways. The first method involves graphs against the ungrouped SEI scores, while the

second method involves the quintiles that the SEI is grouped into.

Is There a Visible Test Score Gap?

To begin our examination, we provide some summary statistics on the students and schools in each dataset. Our grade 3 dataset spans the years 1998-2010 and covers 1.7 million students in 3711 schools. Enrollment in grade 3 ranges from as few as 8 students to as many as 260, with a mean of 52 and a standard deviation of 25. In grade 6, there are 955,000 students over the years 2004-2010 in 3461 schools. Enrollment in grade 6 is somewhat larger, ranging from a minimum of 6 students to 450, with a mean of 67 and a standard deviation of 56. When we link the students in grade 3 to grade 6, we obtain a dataset containing 466,000 observations with enrollment varying from 12 students to 240 students. Mean enrollment is 52 students with a standard deviation of 25 students. The share of students missing the math EQAO test at the school level ranges from zero to 6.25 percent with a mean of 0.02 percent. Appendix tables 4 and 5 display the number of students over time and across the SEI.


Let us first examine the test scores of students averaged at the school level against the

school’s respective ungrouped SEI score. To accurately identify how students are achieving

overall at the school level based on that school’s SEI score, we plot below the average test score

of the school against the SEI.

Figure 5: Below are six graphs, one for each test at grade 3, and one for each test in grade 6. We average the tests of the

students at the school level to have an overall effect of the SEI on the school.

Figure 5 above shows that, for the most part, there is an upward trend in test scores as the school’s SEI increases, except for the Grade 3 and 6 math test scores, which appear to have almost no trend. There is quite a bit of dispersion and there are several outliers in the graphs, but overall, students in schools that are in better neighborhoods according to the SEI perform better on the EQAO test on average. We can also see that there are more schools that have a value of zero in the SEI than there are schools that have a value nearer to 100. These schools also tend to be below the trendline of test scores against the SEI, which we would expect. There is much dispersion around the trendline in these graphs, but we must be conscious of the fact that they are quite simple, comparing only test scores to the SEI. There are many other factors attributable to test scores that the SEI does not capture, for example school factors and individual student factors.

Figure 6 illustrates reading, writing and math test scores for the highest and lowest SEI groups. We would expect these groups to represent the neighborhoods that are potentially most and least nurturing to student development. It is easy to see that test score values of 1 and 2 are less frequent in the SEI of highest standing, the 5th quintile, while test score values of 3 and 4 are more frequent in the SEI of highest standing by a few percentage points.


Figure 6: Relative frequencies of test scores across all tests for neighborhoods of highest and lowest standing among the SEI.

The Model

At first glance, looking at the average differences between the groups in the previous

section, it appears there is a relationship between the socioeconomic index of a neighborhood

and the test scores of children attending the schools in that neighborhood. In order to delve

deeper into this relationship it is necessary to control for factors which are driving differences in

both test scores and socioeconomic status, but are not the focus of our analysis. Our model

controls for the following factors at the individual and neighborhood level:

● Individual Level Factors:

○ Exceptional or gifted students

○ Gender

○ French Immersion

● Neighborhood-level factors:

○ Missing Values

○ School language

○ Rural areas

○ School size

○ Year effects

Using these controls, we use OLS regression to test for gaps in test scores between our 5

SEI groups. The regression uses the following functional form:

where j is the SEI group, i is the individual and t is the year. For detailed information on the regression results please see Tables A1 and A2 in the Appendix. The current section will only highlight the simplified findings of the model.
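One specification consistent with the description above is sketched below. It is an assumed form, not necessarily the exact equation estimated: the symbols for the outcome, the control vectors X (individual-level factors) and N (neighborhood-level factors), the year effects and the error term are all introduced here for illustration, and the lowest SEI group is taken as the omitted reference category.

```latex
\begin{equation}
  Score_{ijt} = \alpha
    + \sum_{k=2}^{5} \beta_k \, \mathbf{1}[\, j = k \,]
    + X_{it}'\gamma
    + N_{it}'\delta
    + \lambda_t
    + \varepsilon_{ijt}
\end{equation}
```

Under this form, each coefficient \beta_k would be read as the predicted difference in test score between SEI group k and the lowest SEI group, holding the controls fixed.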


Findings

Figure 7 below shows the simplified regression results from the model listed in the previous section applied to grade 3 and grade 6 test scores separately. It is clear from this figure that reading is the test score which is most strongly related to the socioeconomic neighborhood of a student. In grade 3, reading shows statistically significant gains across all groups, which increase as we compare higher SEI groups. Grade 3 writing and math show much smaller gains from the socioeconomic neighborhood of the student, and the lower SEI groups had a statistically insignificant effect on these test scores. In grade 6, the gain from higher socioeconomic neighborhoods appears to be larger in all categories than it was in grade 3. Once again, reading shows statistically significant gains across all SEI groups except between the 1st and 2nd. Writing shows much smaller gains from belonging to higher SEI groups. Math shows larger gains than writing, but these are mostly statistically insignificant other than the gap between the 1st and 5th SEI group (most likely due to higher variation in test scores).

Although several of the gains shown in our regressions are statistically significant, they also appear to be reasonably small. For example, the largest gap (between the 1st and 5th SEI groups in grade 6 reading scores) suggests that switching from the lowest to the highest socioeconomic group would raise a student’s test score by 0.073 points on the EQAO test. Since the test scores are between 1 and 4, this would only be predicted to raise a student’s score from a 3 to a 3.07.

Figure 7: Predicted gains in grade 3 (on left) and grade 6 (on right) EQAO test scores from moving from the lowest

socioeconomic group of neighborhoods to the group labelled in the legend. Bars in solid colours are statistically significant at the 5%

level, while striped bars are not. All control variables mentioned in the previous section were included in the

regression. Detailed regression tables are listed in the Appendix.

The regression output included in the Appendix also contains a non-linear term for the SEI.

Since there is some weakness in the linear estimates, we examined

whether the SEI affects test scores in a non-linear way. We therefore decided to

include a squared SEI term in each of our specifications as a confirmatory tool. A positive,

statistically significant squared term indicates that the SEI is affecting test scores in a convex

manner, while a negative squared term indicates that the SEI is affecting test scores in a concave

manner. A convex relationship would indicate that test scores are only slightly affected at lower levels of


the SEI, while at higher levels of the SEI test scores are more strongly affected. For negative

squared terms, the effect is concave and the opposite of what has just been described. There is some

evidence of non-linearity, since in some cases the non-linear term is significant while the linear term is not

significant. However, the squared SEI term is not larger in magnitude than its linear counterpart and

is inconsistent in significance.
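
The role of the squared term can be made explicit with a short derivation. The notation below is an assumption introduced for illustration (the coefficient names are not taken from the regression tables), and the dots stand for the control variables:

```latex
\[
\text{Score}_{i} = \beta_0 + \beta_1\,\text{SEI}_{i} + \beta_2\,\text{SEI}_{i}^{2} + \dots + \varepsilon_{i},
\qquad
\frac{\partial\,\text{Score}_{i}}{\partial\,\text{SEI}_{i}} = \beta_1 + 2\beta_2\,\text{SEI}_{i}
\]
```

Under this sketch, a positive \(\beta_2\) means the marginal effect of the SEI rises as the SEI rises (convex), while a negative \(\beta_2\) means it falls (concave).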

Does the Gap Get Larger Between Grade 3 and 6?

The Model

The previous models suggested that there were some gaps in test scores between

socioeconomic groups, and many of these gaps appeared larger in grade 6 than they were in

grade 3. The next question we examine is whether or not the socioeconomic status of a

child’s neighborhood has an effect on their learning between these two crucial ages. We do this

by using a value-added model. This is a similar model to the grade 6 gap regression in the

previous section, but we link students’ test scores between grades 3 and 6 and include the grade 3

test scores in the regression on grade 6 test scores. We therefore use the following equation:
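
A value-added specification consistent with this description could be written as follows. The symbols are assumptions chosen for illustration, following the convention that j is the SEI group, i is the individual and t is the year. The lag of three years between the two tests and the vector of controls \(X_{i,t}\) are likewise assumed rather than restated from the regression tables:

```latex
\[
\text{Score}^{G6}_{i,t} = \alpha + \lambda\,\text{Score}^{G3}_{i,t-3}
  + \sum_{j=2}^{5} \beta_j\,\text{SEI}_{j,i}
  + X_{i,t}'\gamma + \varepsilon_{i,t}
\]
```

Here \(\text{SEI}_{j,i}\) is an indicator equal to one if student i belongs to SEI group j, with the lowest group as the omitted category, so each \(\beta_j\) is read as the gain relative to the lowest group after conditioning on the grade 3 score.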

The results from this regression therefore represent the effects of moving between SEI

groups while controlling for the effects which have already occurred by grade 3. In other

words, this is the value added to test scores between grades 3 and 6 from being in a higher

SEI group. Once again this report will outline only the basic findings of the model, but a detailed

regression table (Table A3) is included in the Appendix for interested readers.

Findings

Figure 8 below shows that the value added to reading scores in between grades 3 and 6

due to belonging to a higher socioeconomic group is statistically insignificant except for the gap

between the lowest and highest group. The writing and math values suggest there may be a

negative value added due to being in a higher SEI group between grades 3 and 6. Upon

examining the size of these values, it is clear that their economic significance is very small. The

largest gap in this figure is in writing, between the lowest and second-highest SEI groups, with a

value added of -0.019 (approximately one fiftieth of a point on the test score scale).


Figure 8: Predicted gains in grade 6 EQAO test scores from moving from the lowest Socioeconomic

grouping neighborhood to the groups labelled in the legend (controlling for students’ grade 3 test scores). Bars in

solid colours are statistically significant at the 5% level, while striped bars are not. All control variables

mentioned in the previous section were included in the regression. Detailed regression tables are listed in the Appendix.

Potential Policy Implications

Our findings do not provide very much support for the idea that socioeconomic aspects of

neighborhoods play a large role in the early childhood development of skills in reading, writing

and mathematics. Although we did find statistically significant gaps in both grade 3 and grade 6,

the size of these gaps was extremely small. The results should not be interpreted as evidence

that no gap exists at the grade 3 or 6 levels. It may be the case that individual or family-level

factors are more prominent in driving early childhood development, and that the

variation in these factors within each FSA is overpowering the variation between FSAs.
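
The idea of within-FSA variation overpowering between-FSA variation can be illustrated with a simple variance decomposition. The sketch below is not part of the analysis in this report: the function name, the FSA labels and the scores are all made up for illustration, and it uses population variances so that the two parts add up exactly to the total.

```python
from statistics import mean


def variance_decomposition(scores_by_fsa):
    """Split the total variance of scores into between-FSA and within-FSA parts.

    scores_by_fsa maps a (hypothetical) FSA label to a list of student scores.
    Returns (between, within, share_between). Population variances are used,
    so between + within equals the total variance of all scores.
    """
    all_scores = [s for scores in scores_by_fsa.values() for s in scores]
    n = len(all_scores)
    grand_mean = mean(all_scores)

    # Between part: how far each FSA mean sits from the overall mean,
    # weighted by the number of students in that FSA.
    between = sum(
        len(scores) * (mean(scores) - grand_mean) ** 2
        for scores in scores_by_fsa.values()
    ) / n

    # Within part: how far each student sits from their own FSA mean.
    within = sum(
        (s - mean(scores)) ** 2
        for scores in scores_by_fsa.values()
        for s in scores
    ) / n

    return between, within, between / (between + within)


# Made-up scores on the 1 to 4 scale for three hypothetical FSAs.
example = {
    "FSA_A": [2, 3, 3, 4],
    "FSA_B": [2, 3, 4, 4],
    "FSA_C": [1, 3, 3, 4],
}
between, within, share = variance_decomposition(example)
print(f"between={between:.3f} within={within:.3f} share_between={share:.3f}")
```

When the share of variance between areas is small, as in this made-up example, area-level characteristics such as an FSA-level index have little variation in scores left to explain, whatever their true importance at the family level.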

This being said, these results can be interpreted as evidence that FSA-level

socioeconomic factors do not have a large impact on early childhood development and

performance. In terms of policy, this suggests that community-wide targeting of children in low

socioeconomic areas would not have high returns. Instead, policies should provide equal

opportunities to individuals and families who may not have all of the proper resources to support

the development of their children through early childhood. Therefore the full-day kindergarten

and early-years centres policies recently implemented in Ontario may still be effective in

increasing both the equity and efficiency of the education system as a whole.

Acknowledging Weaknesses

Unfortunately, due to data restrictions, we are unable to accurately identify the area in

which the student resides. We have to assume in our case that the student lives in the same FSA

as the school they attend, since we only have the FSA of the school. While this may be true for a

majority of the students, there may be an element of non-randomness in the type of student for whom

it does not hold. For example, a particular school may be in a different FSA from the neighborhoods of a

majority of its students, or students from a high SEI neighborhood may attend a

school in a neighborhood which has a lower SEI value. Depending on the scenario, the estimates


are not capturing the full effect of the neighborhood, which might be the reason our estimated

effects are so small.

Aside from the potential movement between FSAs, the size of the ‘neighborhoods’ we

examine may also be too large. There may be substantial variability within each FSA which is being

overlooked due to the size of our neighborhood definitions. The study could therefore benefit

from the use of neighborhood data at a finer geographic level. Ideally, we would have socioeconomic

information about children’s families and could link this information to their test scores. This

would allow us to group families at whichever level was most relevant. Unfortunately, this was

not possible using the level of census data to which we had access.

Family level data aside, the use of smaller analysis areas would also be informative. The

analysis of the variation between sections of each FSA could help to explain the lack of observed

differences in test scores. At the presentation of this paper it was also suggested that if we had

lower-level data, we could use a method called ‘multi-level analysis’. This is another form of

regression which allows for several nested levels to be present in the data. Although this could be

useful in terms of being more precise about the effects of different levels of neighborhoods on

test scores, it is highly complicated and outside the scope of this policy report.
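
For readers unfamiliar with the method, the simplest multi-level specification is a two-level random-intercept model. The sketch below is a textbook form rather than a model estimated in this report, and its notation is an assumption: k indexes a (hypothetical) small neighborhood, i indexes a student within it, and \(u_k\) is a neighborhood-specific intercept.

```latex
\[
\text{Score}_{ik} = \beta_0 + \beta_1\,\text{SEI}_{k} + u_k + e_{ik},
\qquad
u_k \sim N(0, \sigma_u^2),
\quad
e_{ik} \sim N(0, \sigma_e^2)
\]
\[
\text{ICC} = \frac{\sigma_u^2}{\sigma_u^2 + \sigma_e^2}
\]
```

The intraclass correlation (ICC) gives the share of unexplained score variation that lies between neighborhoods rather than between students within the same neighborhood, which is the quantity the weaknesses discussed here leave unmeasured.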

Our results seem to indicate that the SEI has little effect on test scores. However, there

are some potential reasons to believe that there is downward bias in our estimates. The first

possible source of bias is that we use the Forward Sortation Area (FSA) as a proxy for the

neighborhood. These FSAs vary from quite small to quite large. For example, there are

several FSAs that are less than a square kilometer while there are even more that are over ten

thousand square kilometers.14

Another factor that might be affecting the results is the fact that students sometimes move

between FSAs between grades 3 and 6. This applies only to the estimates for grade 6 and the linked

grades 3 and 6. We chose not to change the SEI between grades 3 and 6 because

we thought the student would have been more exposed to the environment captured by the first SEI

than to the one captured by the SEI after the move. However, for some students, the family factors might be

more appropriately captured by the SEI of the area to which the student moves. For example, a young

educated family who moves to a larger house in a better neighborhood would be better captured

by the new SEI.

It is also potentially the case that we have omitted variable bias in our study due to a lack

of data on relevant variables. Some examples of this could be the existence of early years

centers in specific neighborhoods, or whether or not the child attended full-day kindergarten.

These policies are both likely to have an impact on student outcomes that we cannot observe

over the course of our sample.

14 Census of Canada 2006 Geographic Reference Files, University of Toronto CHASS. Created March 16th, 2007. Accessed June 13th. http://datalib.chass.utoronto.ca/cc06/georef06.htm#fsar


Conclusion

Our method of utilizing an index of socioeconomic factors to reveal an achievement gap

yielded mixed results. While we find evidence that there is a gap at grade 3, and that this gap

continues to exist in grade 6, we find that the gap contributes very little to the achievement of the

student. We also find little convincing evidence that the gap is growing, in absolute value,

between grade 3 and grade 6. This leads us to the conclusion

that the gap between low and high socioeconomic neighborhoods is small and that its existence

is likely inconsequential. Returning to our original question, this does not provide much evidence

that gaps between privileged and underprivileged neighborhoods are a driving force behind

economic inequality through their effects on early childhood development. This being said, there

were many assumptions and data limitations which may have watered down the results. It would

be useful for further studies to examine socioeconomic factors using smaller comparison groups.

Ideally, a study would have access to socioeconomic information about children’s families and

would be able to link these children with their test scores.


Bibliography

Saez, Emmanuel, and Michael R. Veall. "The evolution of high incomes in Northern America: lessons

from Canadian evidence." American Economic Review (2005): 831-849.

Fryer Jr, Roland G., and Steven D. Levitt. "Testing for racial differences in the mental ability of young

children." The American Economic Review 103.2 (2013): 981-1005.

Sammons, Pam, James Hall, Kathy Sylva, Edward Melhuish, Iram Siraj-Blatchford & Brenda Taggart,

“Protecting the development of 5-11-year-olds from the impacts of early disadvantage: the role of primary

school academic effectiveness,” School Effectiveness and School Improvement 24.2 (2013): 251-268.

Fryer, Roland G. Jr., and Steven D. Levitt. “The Black-White Test Score Gap Through Third Grade.” NBER

Working Paper No. 11049, January 2005. JEL No. I2.

Center on the Developing Child (2007). The Science of Early Childhood Development (InBrief). Retrieved from

http://developingchild.harvard.edu/index.php/resources/briefs/inbrief_series/inbrief_the_science_of_ecd/

Shonkoff, Jack P., and Deborah A. Phillips, eds. From neurons to neighborhoods: The science of early

childhood development. National Academies Press, 2000.

Melhuish, Edward. Keynote presentation, “Excellence and Equity in Early Childhood Education and Care.”

EU conference, Budapest, 2011.

Melhuish, Edward C. "Preschool matters." Science 333.6040 (2011): 299-300.

Krishnan, Vijaya. "Constructing an area-based socioeconomic index: A principal components analysis

approach." Edmonton, Alberta: Early Child Development Mapping Project (2010).

Pecoskie, Teri, “Keeping Score: Day 1: Unequal Education”, The Hamilton Spectator, April 12, 2014.

Accessed June 12th, 2014. Online.

Census of Canada 2006 Geographic Reference Files, University of Toronto CHASS. Created March 16th,

2007. Accessed June 13th. http://datalib.chass.utoronto.ca/cc06/georef06.htm#fsar


Appendix

Table A1: Regression output for grade 3. The linear SEI specification is on the left-hand side and the non-linear SEI specification is on the right-hand side

of this table. Regression coefficients are listed with p-values in parentheses. Errors are clustered at the school level.


Table A2: Regression output for grade 6. The linear SEI specification is on the left-hand side and the non-linear SEI specification is on the right-hand side

of this table. Regression coefficients are listed with p-values in parentheses. Errors are clustered at the school level.


Table A3: Regression output for linked grade 3-6 students. The linear SEI specification is on the left-hand side and the non-linear SEI

specification is on the right-hand side of this table. Regression coefficients are listed with p-values in parentheses. Errors are

clustered at the school level.


Table A4: Grade 3 tabulation of year and grouped SEI displaying the number of students across years in different

categories of the SEI.

Table A5: Grade 6 tabulation of year and grouped SEI displaying the number of students across years in different

categories of the SEI.