Simon Webb & Julien Dicaire ECON 774 | MCMASTER UNIVERSITY
Neighborhood Effects on Early Childhood Test Scores AN EXAMINATION OF THE ACHIEVEMENT GAP
Introduction
Expenditure on education takes up a large portion of the budget for the Ontario Provincial
Government. It is therefore imperative that this expenditure be allocated in the most efficient
way possible. As we will show, neighborhood and family factors are deemed important to
student outcomes in most of the literature. The Ontario Ministry of Education implemented the
Ontario Early Years Policy Framework which recommended the introduction of new Early Years
Centers and the four year phase-in of Full-day Kindergarten in Ontario. Since this is a large and
expensive policy, it is of utmost interest to the Government of Ontario to ensure that these
strategies have the most significant effect on student outcomes to ensure the funds are being
allocated effectively.
These two policies are mainly aimed at exposing the student to better social and academic
surroundings than they might have been previously experiencing. That is, the strategies are
expected to lessen the impacts which sub-optimal environments can have on early childhood
development. We use a set of neighborhood and family variables which potentially have an
effect on early childhood development to create an index which represents the socio-economic
status of a neighborhood as well as the general environment of learning.
Focusing on Ontario, we examine whether or not the socioeconomic index has an impact
on early childhood performance. We create two separate datasets of grade 3 and grade 6
students, as well as a dataset of students linked between grade 3 and grade 6. This third
dataset is used in a value-added model to determine whether or not socioeconomic status
affects the change in test scores between grade 3 and grade 6. This methodology allows us to
see whether the gap in student achievement widens, narrows, or stays the same between those
grades. The other two datasets, namely the unlinked grade 3 and grade 6 datasets, allow us to
see if there are observable gaps at those ages.
If a gap exists, then recommendations can be made for targeting government funds
towards early childhood education in disadvantaged communities. However, in our case, the
results seem to indicate a negligible gap, and our recommendation is instead that further research
is needed. We then explain the possible sources of bias that might be causing our results to be
both economically and statistically insignificant.
Background
Rising income inequality has become an extremely important issue across much of the
world. The present day media shows an increasing amount of political unrest surrounding the
issue of relatively high-income individuals becoming ever wealthier while the rest of the income
distribution seems to have plateaued1. Delving deeper into this issue, many academics have
sought to determine how this income gap develops in the first place. It is a common belief that
1 Saez, Emmanuel, and Michael R. Veall. "The evolution of high incomes in Northern America: lessons
from Canadian evidence." American Economic Review (2005). pp. 840.
individuals who are raised in privileged families fare better economically in their adult lives than
children raised in underprivileged families, but a still relatively unanswered question is the
mechanism with which this gap develops. To what degree is the income gap merely a product of
differences in inherited wealth and to what degree is it woven into the skill sets of privileged
children? The current study aims to shed further light on this second point, and in particular to
uncover whether or not performance gaps are observable in early childhood test scores.
When do observable achievement gaps develop?
Other studies before ours have also tested for observable performance gaps among
different groups of children at young ages. Fryer and Levitt test in their article for
statistically significant differences in the performance of children as early as 6 months old.
Their study aims to see if there is a difference in cognitive ability among children of
different backgrounds. They find no difference in ability between children of different
backgrounds, except for minor differences by race. This suggests that if we observe differences
in test scores later in childhood, they must be due to some factor that operates between the
child's birth and the test at which we observe the gap.2
Sammons, Hall, Sylva, Melhuish, Siraj-Blatchford & Taggart use an index to examine
whether a child's being in a state of deprivation has an effect on test scores. Through this,
they attempt to determine whether deprivation affects the self-regulation and the test scores
of the student. Using a value-added method, which is identical to our methodology for the
linked students, the authors find a direct and an indirect effect of deprivation. The direct
effect is the immediate reduction in student effectiveness on tests, and the indirect effect is
the lasting effect of deprivation on student self-regulation and overall learning from previous
years.3
In a third study, Fryer and Levitt find that white students perform better in kindergarten
math than black students and that this gap in achievement becomes larger over time. In reading,
they find that the gap is present as well but smaller and that this gap also widens over time. Fryer
and Levitt could not come to the conclusion that school quality was the reason for the gap’s
existence. These are simply mean comparisons, however, and when a regression is conducted,
the gap between black and white students remains at roughly one tenth of a standard deviation
per year in both math and reading test scores. Fryer and Levitt could not identify the source
of the gap as the school, the student’s environment or parental factors.4
These two studies show conflicting results. The first failed to find observable gaps
between children at 6 months of age and the second found some significant gaps (relating to
race) in kindergarten level test scores. In order to test the presence of an achievement gap in our
2 Fryer Jr, Roland G., and Steven D. Levitt. "Testing for racial differences in the mental ability of young children." The American Economic Review 103.2 (2013). pp 985.
3 Sammons, Pam, James Hall, Kathy Sylva, Edward Melhuish, Iram Siraj-Blatchford & Brenda Taggart, “Protecting the development of 5-11-year-olds from the impacts of early disadvantage: the role of primary school academic effectiveness,” School Effectiveness and School Improvement 24.2 (2013): 256.
4 Fryer, Roland G. Jr. & Steven D. Levitt, “The Black-White Test Score Gap Through Third Grade”, NBER Working Paper No. 11049, January 2005, JEL No. I2
current study, we need to look deeper into the possible underlying drivers of such a gap. We now
look to the field of Early Childhood Development to uncover some of these possible drivers.
Early childhood development
“The early years are a period of intense learning and development, when
tremendous changes occur in the brain over a short period of time. In the first
year of life, the architecture of the brain takes shape at an astounding rate –
approximately 700 new neural connections are being built per second.
Scientists now know that this process is not genetically predetermined, but is in
fact dramatically influenced by children’s early experiences with people and
their surroundings.” - Center for the Developing Child, Harvard University5
This quote from the Center for the Developing Child at Harvard University implies that at
birth, all individuals have equal opportunity in terms of brain development. The quote also
mentions that this brain development in young children is influenced by the people and
environment they are surrounded by. If children from privileged families and neighborhoods
were to have observably higher test score performance, this could be evidence that this
‘privileged state’ is influencing the brain development of these children. It could also be
evidence that the income inequality seen in the world develops in part due to gaps in the
earliest phases of brain development between privileged and underprivileged children. We now
turn to the issue of how to define privileged and underprivileged children in order to compare
their performance.
What factors cause achievement gaps in early childhood?
Some studies have reviewed what they believe are the factors that contribute to the
differences in test scores among individuals. These range from environmental factors such as
crime, to family factors such as the educational attainment of the mother and father.6 These factors, while
they may not directly affect the learning of the child, are usually the closest measures which can
be obtained to represent the parental commitment to the child’s learning. It is assumed these
factors also capture the societal forces that may be stemming from the neighborhood as it is
difficult to observe how nurturing a neighborhood is to a primary student.
Parental factors affect learning through two possible methods: Educational Effectiveness
and School Effectiveness7. The home learning environment is also important. The amount of
time a father and mother put into their child’s education seems to pass on some aspect of their
education as the learning environment is important, but also important are the education levels of
5 Retrieved from http://developingchild.harvard.edu/index.php/resources/briefs/inbrief_series/inbrief_the_science_of_ecd/
6 Shonkoff, Jack P., and Deborah A. Phillips, eds. From neurons to neighborhoods: The science of early childhood development. National Academies Press, 2000.
7 Sammons, Pam, James Hall, Kathy Sylva, Edward Melhuish, Iram Siraj-Blatchford & Brenda Taggart, “Protecting the development of 5-11-year-olds from the impacts of early disadvantage: the role of primary school academic effectiveness,” School Effectiveness and School Improvement 24.2 (2013). pp 252.
the mother and father, to varying degrees. The education level of the mother is far more
important to the child’s success in school than the father’s, possibly because mothers tend to
spend more time with the child.8 Neighborhood effects work in much the same manner. Young
children look for mentors outside the home, people to look up to, and the kind of person that
mentor is can be quite influential to the learning of the child.
Several studies suggest that the socioeconomic status of a student matters greatly to that
student’s success in school. In an article by Sammons, Hall, Sylva, Melhuish, Siraj-Blatchford
& Taggart,8 the authors use an index of multiple deprivation that Melhuish previously created9
from census measures to determine the effects of placing the student into a more
academically stimulating, less deprived environment. The authors utilize a value-added approach
to come to the conclusion that deprivation has a significant negative effect on the student which
becomes internalized over time. This means that, should the student be placed in an environment
of significant deprivation for a lengthy period of time, the impact of the deprivation will
accumulate, impacting further self-regulation and educational attainment.
In Figure 1 below, Edward Melhuish shows factors he found to be affecting student
performance at age 11 in the United Kingdom.9 In the figure, we see that family factors are
indeed important to student development and that their effects, listed in standard deviations
on the y-axis, are larger when combined with the influences schools have on test scores.
Figure 1: Bar graph from Edward Melhuish study demonstrating the relative importance of different factors in affecting children's
academic achievement.
Returning to our original question, we are trying to examine the performance gaps that
exist between privileged and underprivileged children. Also, given that we only have access to
neighborhood level Census data, our definition of privileged and underprivileged must be at the
neighborhood level. As Melhuish notes in the graph, there are a range of factors affecting student
performance in grade 6. Many of these factors are also highly correlated. Therefore in order to
create a definition of privileged children at the neighborhood level, we decided to create an index
which would represent the socioeconomic status of the neighborhoods in which the children
8 Melhuish, Edward, Keynote Presentation,“Excellence and Equity in Early Childhood Education and
Care,” EU conference, Budapest. 2011.
9 Melhuish, Edward C. "Preschool matters." Science 333.6040 (2011). pp 299-300.
attend school. This creates a simple way of comparing children over a large number of variables
which are all correlated.
Dr. Krishnan from the University of Alberta wrote a paper titled “Constructing an
Area-Based Socioeconomic Status Index: A Principal Components Analysis Approach”.10 In this
paper he outlines a method to create a socioeconomic index which compares neighborhoods
using a set of variables (constructed using Alberta Census data) which are expected to have an
effect on early childhood development. Since this index both matched our data and was highly
applicable to the research question, we decided to create our own socioeconomic status index for
Ontario in order to compare students. The next section outlines the steps we took to create this
index.
Socioeconomic Index
Krishnan (2010) outlined a method to combine a large group of socioeconomic variables
into one socioeconomic index (henceforth SEI) using factor analysis. Factor analysis is a
method which identifies a small number of underlying factors that capture as much as possible
of the variation that is common among the variables. The idea is that the factor which is
common among all of the variables chosen in Krishnan (2010) is the relative ‘socioeconomic
status’ of neighborhoods.
There were a few key differences between Dr. Krishnan’s data and ours. Dr. Krishnan
constructed his index using Alberta Census data at the Dissemination Area (DA) level.
Our data uses Ontario Census data at the Forward Sortation Area (FSA) level. Although FSAs
are larger and less precise areas, this may also help our analysis by capturing a larger number of
children who live in the vicinity of their respective schools.
Krishnan outlines six steps used in his approach:11
● Select appropriate variables
● Check for outliers, linearity and normality
● Ensure the applicability of factor analysis
● Conduct factor analysis
● Interpret results
● Classify into groups
The rest of this section outlines briefly how our study mimicked these steps to create our
socioeconomic index.
Select Appropriate Variables
Our analysis used 23 of the 26 variables used in Krishnan (2010). The three variables
which we decided to omit were the number of low income families, the number of owned
occupations and the ‘income disparities’ in the neighborhood. These were omitted because there
10 Krishnan, Vijaya. "Constructing an area-based socioeconomic index: A principal components analysis approach." Edmonton, Alberta: Early Child Development Mapping Project (2010).
11 Krishnan, Vijaya. "Constructing an area-based socioeconomic index: A principal components analysis approach." Edmonton, Alberta: Early Child Development Mapping Project (2010), pp.10-11.
were some large outliers in the data due to the way in which the variables were constructed. In
order to compensate for the loss of these variables, the GINI coefficients in each respective FSA
were added in as a 24th variable. We then ended up with the 24 variables listed below in Table 1.
Check for outliers, linearity and normality
Factor analysis can be very sensitive to outliers, skewness and kurtosis in the
distributions of the data, as well as to the magnitude of the variables. For instance, income,
which varies from 0 to slightly over four hundred thousand, will receive a higher weight than
the variable which captures the average number of children families have in an FSA.
To combat these sensitivities, the variables are transformed, where appropriate, according
to the methods outlined in Dr. Krishnan’s paper. The variables along with their respective
descriptions are listed below in Table 1. Outliers may still exist after transformation, so any
observation more than six standard deviations from the mean is dropped. This is done for three
FSAs, which only has an effect on the SEI quintiles, as no students occupy these FSAs.
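The standardization and outlier rule described above can be sketched as follows. This is a minimal illustration and not the authors' code: the function names are hypothetical, each row is assumed to hold the (already transformed) variables for one FSA, and the variable-specific transformations from Krishnan (2010) are not reproduced.

```python
from statistics import mean, pstdev

def standardize(values):
    """Return z-scores so variables on very different scales become comparable."""
    mu = mean(values)
    sigma = pstdev(values)
    return [(v - mu) / sigma for v in values]

def drop_outliers(rows, threshold=6.0):
    """Keep only rows (one per FSA) whose z-scores all lie within +/- threshold."""
    columns = list(zip(*rows))                      # one tuple per variable
    z_columns = [standardize(col) for col in columns]
    z_rows = list(zip(*z_columns))                  # back to one tuple per FSA
    return [row for row, z in zip(rows, z_rows)
            if all(abs(x) <= threshold for x in z)]
```

Note that with a population standard deviation, a single observation can only exceed six standard deviations when there are at least 38 observations, so the rule is meaningful only for reasonably large samples such as the set of Ontario FSAs.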
Table 1: A description of variables utilized in the creation of the SEI is provided below.
Ensure the applicability of factor analysis
We use a number of statistical tests to ensure the applicability of factor analysis to our
data (also outlined in Krishnan (2010)). The first of these is the Kaiser-Meyer-Olkin (KMO)
test, which we use to detect multicollinearity between the variables (which would be a problem
since it would inflate the standard errors of the factor loadings). The result of this test on
our variables was a KMO rating of 0.856, which was sufficiently high to conclude that there is
relatively little multicollinearity between the variables. Another test, the Bartlett Test of
Sphericity, detects whether there is enough correlation between the variables to warrant the
use of factor analysis. Our data received a p-value of 0.000 on the Bartlett Test of Sphericity,
meaning we can reject the null hypothesis that the variables are uncorrelated (that is, that
their correlation matrix is an identity matrix).
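Both statistics can be computed directly from the correlation matrix of the variables. The sketch below uses the standard textbook definitions (the KMO measure compares squared correlations with squared partial correlations; Bartlett's statistic is based on the log-determinant of the correlation matrix). The function names are hypothetical and this is not the authors' code; `data` is assumed to be an array with one row per FSA and one column per variable.

```python
import numpy as np

def bartlett_sphericity(data):
    """Chi-square statistic and degrees of freedom for Bartlett's test of sphericity."""
    n, p = data.shape
    corr = np.corrcoef(data, rowvar=False)
    _, logdet = np.linalg.slogdet(corr)             # log of det(corr)
    statistic = -(n - 1 - (2 * p + 5) / 6.0) * logdet
    dof = p * (p - 1) // 2
    return statistic, dof

def kmo(data):
    """Overall Kaiser-Meyer-Olkin measure (between 0 and 1)."""
    corr = np.corrcoef(data, rowvar=False)
    inv = np.linalg.inv(corr)
    d = np.sqrt(np.diag(inv))
    partial = -inv / np.outer(d, d)                 # partial correlations
    off = ~np.eye(corr.shape[0], dtype=bool)        # off-diagonal mask
    r2 = np.sum(corr[off] ** 2)
    a2 = np.sum(partial[off] ** 2)
    return r2 / (r2 + a2)
```

The Bartlett statistic is compared against a chi-square distribution with the returned degrees of freedom to obtain a p-value; a large statistic corresponds to rejecting the hypothesis of an identity correlation matrix.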
Conduct factor analysis
Table 2 below shows the rotated factor loadings which resulted from our factor analysis.
Our data makes use of four factors which are linear combinations of the standardized variables.
The values in the table represent the relative weightings (from 0 to 1) on each variable in their
respective ‘factors’. For example, the first factor relates largely to the dwelling and the
residents of the neighborhood. The second factor relates mainly to the labour characteristics
of the household. The third factor relates mainly to inequality, as well as the family
structure. Finally, the fourth factor is a measure of inequality, female labour force
participation and dependence, a ratio of children under four and adults above 65 to the
eligible labour force participants. We omit any variable that has a correlation of less than
0.5, in absolute value, for the interpretation of the factor variables. Uniqueness is the
proportion of a variable's variance that is not explained by the factors. Ideally, uniqueness
should be relatively low, as this is related to the predictive power of the factors. The
factors are then combined into a single index as a weighted sum, where the weight on each
factor is the proportion of variance explained by that factor divided by the total variation
explained by all of the factors. The result is a single variable (the ‘socioeconomic index’)
which is a value between 0 and 1. Once the index is created, it is inverted (in order to
make 1 ‘high’ and 0 ‘low’) and multiplied by 100 in order to make the scale easier to use.
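The weighting, inversion and rescaling steps can be sketched as below. This is an illustration under stated assumptions rather than the authors' procedure: the function name is hypothetical, and a min-max rescaling is assumed as the way the weighted sum is brought into the 0 to 1 range, since the text does not say how that range is obtained.

```python
def combine_factors(factor_scores, variance_explained):
    """Collapse per-FSA factor scores into a single 0-100 index.

    factor_scores: list of rows, one per FSA, one score per factor.
    variance_explained: proportion of variance explained by each factor.
    """
    total = sum(variance_explained)
    weights = [v / total for v in variance_explained]   # weights sum to 1
    raw = [sum(w * s for w, s in zip(weights, row)) for row in factor_scores]
    lo, hi = min(raw), max(raw)
    scaled = [(r - lo) / (hi - lo) for r in raw]         # assumed min-max rescaling
    return [100.0 * (1.0 - s) for s in scaled]           # invert, then scale to 0-100
```

For example, with two factors explaining 60 and 20 percent of the variance, the weights are 0.75 and 0.25.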
Table 2: The Rotated Factor Loadings visually demonstrate the correlations between the variables and the factor variables.
Interpret Results
Figure 2 below shows the distribution of Ontario FSAs in the Socioeconomic index. The
FSAs appear to be approximately normally distributed in the SEI, with a rightward skew. Upon
thorough investigation, the FSAs with the highest and lowest SEI ratings were in areas which we
would expect. An investigation by Teri Pecoskie into schools in Hamilton, Ontario, revealed that
students fared worse in schools where English was not the first language, where the share of
low income families was higher than average, and where students required special assistance.
These, and others, are factors that we have proxies for, so we would expect our SEI to
accurately identify neighborhoods that are nurturing to student development.12
Figure 2: The 2006 ungrouped distribution of the SocioEconomic Index. For all Forward Sortation Areas that have a
positive population at the time of the census, a SEI score is given based on the 24 variables from the Census.
Classify into groups
After constructing the SEI, the FSAs were placed into five quintiles within the SEI. The
number of groups was chosen based on the number chosen by Krishnan (2010). Although the
selection of the number of groups was somewhat arbitrary, we did sensitivity analysis with
respect to the number of groups and found that, while making the graphs and regression tables
more complicated, the addition of groups did very little in terms of changing the results.
Table 3 below shows some summary statistics for three of the five groups. The lowest standing
SEI indicates the first quintile, where we would expect the statistics to reflect conditions
less nurturing to student development. We see that the percentage without a diploma, the
variable Education, is the highest of all standings. The divorce rate, unemployment rate and
percentage of immigrants are all higher than in the higher standing SEI groups. For the SEI
of highest standing, the Gini is lowest, indicating low income inequality, nearly a third of
the population is without a high school diploma, the unemployment rate is much lower than the
national average at the time, and the proportion of recent immigrants is below two percent.
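The grouping step at the start of this section can be sketched as a rank-based split into equal-sized groups. The function name is hypothetical and the handling of ties and remainders is an assumption; the paper does not describe these details.

```python
def assign_quintiles(sei_scores, groups=5):
    """Label each FSA 1 (lowest SEI) to `groups` (highest SEI) by rank."""
    n = len(sei_scores)
    order = sorted(range(n), key=lambda i: sei_scores[i])  # indices, low to high
    labels = [0] * n
    for rank, idx in enumerate(order):
        labels[idx] = rank * groups // n + 1
    return labels
```

Changing `groups` reproduces the kind of sensitivity check described above, where the number of groups is varied to see whether the results change.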
12 Pecoskie, Teri, “Keeping Score: Day 1: Unequal Education”, The Hamilton Spectator, April 12, 2014. Accessed June 12th, 2014.
Table 3: Selected summary statistics for the lowest, middle and highest SEI quintiles. Other than the Gini, variable means can be
read as percentages.
Validation of the Socioeconomic Index
In order to validate the socioeconomic index we created with our data, we used three
different methods. First, we hand-checked some of the FSAs with extremely low and extremely
high SEI values to see if the areas made sense using our prior knowledge of Ontario
neighborhoods, and identified which measures were causing the low SEI scores. Among the lowest
were some notoriously deprived areas of Toronto and Hamilton, and some of the highest were
rich suburban areas of Toronto and Ottawa. Second, we constructed the SEI for both 2001 and
2006 and compared the relative locations of FSAs along the spectrum in each year. Figure 3
below shows a scatter plot of the standardized 2001 and 2006 SEI values of each FSA compared
to a 45 degree line.13 Due to the sensitive nature of factor analysis, it is important to
determine that the SEI is not drastically different between the two census years; stability
would indicate that the index has proper underpinnings. It would be a concern if the relative
positions of the FSAs changed erratically between these two years (since this is not what we
observe in the real world). Outside of a few exceptions, however, the FSAs appear to fall
close to the 45 degree line, meaning they are in approximately the same relative position in
both years compared to other FSAs. The extreme outliers in this graph are where neighborhood
development occurred between the census years, changing the demographics and characteristics
of the FSA.
Figure 3: 2006 SEI on the vertical axis, and the 2001 SEI on the horizontal axis, compared to a 45 degree line.
13 2006 SEI rankings were larger on average than 2001 SEI rankings. This was likely due to economic development in Ontario between these two years. Therefore, in order to compare the SEI values of FSAs between years, the distributions are standardized for both SEIs.
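The standardized comparison can be sketched as follows: each year's SEI is converted to z-scores, and an FSA's signed distance from the 45 degree line is the difference between its two z-scores. The function names are hypothetical and the inputs are assumed to be lists aligned by FSA.

```python
from statistics import mean, pstdev

def zscores(values):
    """Standardize a list of SEI values to mean 0 and standard deviation 1."""
    mu, sigma = mean(values), pstdev(values)
    return [(v - mu) / sigma for v in values]

def deviation_from_45(sei_2001, sei_2006):
    """Vertical distance of each FSA from the 45 degree line after standardizing."""
    z01, z06 = zscores(sei_2001), zscores(sei_2006)
    return [b - a for a, b in zip(z01, z06)]
```

A uniform upward shift between census years, such as the one attributed above to economic development, leaves every deviation at zero, so only changes in relative position show up.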
Our third method of validation was to link our census data to the schools within each
FSA and examine the distribution of schools in the socioeconomic index. This is shown in Figure
4 below. Interestingly, there are a number of schools at the lower end of the SEI distribution
while there are no schools at the very upper end of the SEI distribution. There is a significant
amount of variation in the placement of schools, with a slight rightward skew to higher SEI
neighborhoods. However, the distribution of schools in SEI does not appear to be overly skewed
and does not display concerning abnormal peaks. This reduces the risk of bias once we move to
examining school and student differences in test scores between socioeconomic neighborhoods.
Figure 4: Distribution of schools among the SEI where one record per school is included.
Now that the SEI has been constructed, we can utilize it to examine the gap in test scores
among the different values of the SEI. We relate test scores to the SEI in two ways. The first
method involves graphs against the ungrouped SEI scores, while the second method involves the
quintiles that the SEI is grouped into.
Is There a Visible Test Score Gap?
To begin our examination, some summary statistics on the students and schools are
provided for each relevant dataset. Our grade 3 dataset spans the years 1998-2010 and covers
1.7 million students in 3711 schools. Enrollment in grade 3 ranges from as few as 8
to as many as 260, with a mean of 52 and a standard deviation of 25. In grade 6, there are
955,000 students over the years 2004-2010 in 3461 schools. Enrollment in grade 6 is slightly
larger, ranging from a minimum of 6 students to 450, with a mean of 67 and a standard
deviation of 56. When we link the students in grade 3 to grade 6, we obtain a dataset
containing 466,000 observations with enrollment varying from 12 students to 240 students.
Mean enrollment is 52 students with a standard deviation of 25 students. The share of students
missing the math EQAO test at the school level ranges from zero to 6.25 percent with a mean of
0.02 percent. Appendix tables 4 and 5 display the number of students over time and across SEIs.
Let us first examine the test scores of students averaged at the school level against the
school’s respective ungrouped SEI score. To accurately identify how students are achieving
overall at the school level based on that school’s SEI score, we plot below the average test score
of the school against the SEI.
Figure 5: Below are six graphs, one for each test at grade 3, and one for each test in grade 6. We average the test scores of the
students at the school level to show the overall relationship between the SEI and school-level scores.
Figure 5 above shows that, for the most part, there is an upward trend in test scores as the
school’s SEI increases, except for the Grade 3 and Grade 6 math test scores, which appear to
have almost no trend. There is quite a bit of dispersion and there are several outliers in the
graphs, but overall, students in schools that are in better neighborhoods according to the SEI
perform better on the EQAO test on average. We can also see that there are more schools that
have a value of zero in the SEI than there are schools that have a value nearer to 100. These
schools also tend to be below the trendline of test scores against the SEI, which we would
expect. Though there is much dispersion around the trendline in these graphs, we must be
conscious of the fact that they are quite simple, comparing only test scores to the SEI. There
are many other factors affecting test scores that the SEI does not capture, for example school
factors and individual student factors, which are not reflected in the graphs.
Figure 6 illustrates test scores for the highest and lowest grouped SEIs for all tests.
Reading, writing and math test scores are shown for the highest and lowest scoring SEIs. We
would expect these to represent the most and least potentially nurturing neighborhoods for
student development. It is easy to see that test score values of 1 and 2 are less frequent in
the SEI of highest standing, the 5th quintile, while test score values of 3 and 4 are more
frequent in the SEI of highest standing by a few percentage points.
Figure 6: Relative frequencies of test scores across all tests for neighborhoods of highest and lowest standing among the SEI.
The Model
At first glance, looking at the average differences between the groups in the previous
section, it appears there is a relationship between the socioeconomic index of a neighborhood
and the test scores of children attending the schools in that neighborhood. In order to delve
deeper into this relationship it is necessary to control for factors which are driving differences in
both test scores and socioeconomic status, but are not the focus of our analysis. Our model
controls for the following factors at the individual and neighborhood level:
● Individual Level Factors:
○ Exceptional or gifted students
○ Gender
○ French Immersion
● Neighborhood-level factors:
○ Missing Values
○ School language
○ Rural areas
○ School size
○ Year effects
Using these controls, we use OLS regression to test for gaps in test scores between our 5
SEI groups. The regression uses the following functional form:
Where j is the SEI group, i is the individual and t is the year. For detailed information on
the regression results please see Tables A1 and A2 in the Appendix. The current section will
only highlight the simplified findings of the model.
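As a sketch of what a specification matching this description might look like (the symbols below are assumptions for illustration, not the authors' notation beyond the indices j, i and t): $SEI_j$ is an indicator for belonging to SEI group j with the lowest group omitted as the reference, $X_{it}$ collects the individual-level controls, $Z_{it}$ the neighborhood-level controls, and $\lambda_t$ the year effects.

```latex
\[
Score_{ijt} = \beta_0 + \sum_{j=2}^{5} \beta_j \, SEI_{j}
            + X_{it}'\gamma + Z_{it}'\delta + \lambda_t + \varepsilon_{ijt}
\]
```

Under this form, each $\beta_j$ would be read as the predicted difference in test score between SEI group j and the lowest SEI group, holding the controls fixed.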
Findings
Figure 7 below shows the simplified regression results from the model listed in the
previous section, applied to grade 3 and grade 6 test scores separately. It is clear from these
charts that reading is the test score most strongly affected by the socioeconomic
neighborhood of a student. In grade 3, reading shows statistically significant gains across all
groups, which increase as we compare higher SEI groups. Grade 3 writing and math
show much smaller gains from the socioeconomic neighborhood of the student. The lower SEI
groups had a statistically insignificant effect on these test scores. In grade 6, the value of
higher socioeconomic neighborhoods appears to be larger in all categories than it was in
grade 3. Once again, reading shows statistically significant gains across all SEI groups except
between the 1st and 2nd. Writing shows much smaller gains from belonging to higher SEI groups.
Math shows larger gains than writing, but these are mostly statistically insignificant other
than the gap between the 1st and 5th SEI groups (most likely due to higher variation in test
scores).
Although some of the gains shown in our regressions are statistically significant, they also
appear to be reasonably small. For example, the largest gap (between the 1st and 5th SEI groups
in grade 6 reading scores) suggests that switching from the lowest to the highest socioeconomic
group would raise a student's test score by 0.073 points on the EQAO test. Since the test
scores are between 1 and 4, this would only be predicted to raise a student’s score from a 3
to a 3.07.
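To put the size of this gap in perspective, the arithmetic can be written out. Expressing the gap as a share of the 1 to 4 score range is an illustrative addition, not a figure reported in the regressions.

```latex
\[
3 + 0.073 = 3.073 \approx 3.07,
\qquad
\frac{0.073}{4 - 1} \approx 0.024
\]
```

That is, the largest estimated gap corresponds to roughly 2.4 percent of the full range of the scale.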
Figure 7: Predicted gains in grade 3 (on left) and grade 6 (on right) EQAO test scores from moving from the lowest
socioeconomic group of neighborhoods to the group labelled in the legend. Bars in solid colours are statistically significant at the 5%
significance level, while bars in stripes are not. All control variables mentioned in the previous section were included in the
regression. Detailed regression tables are listed in the Appendix.
The full regression output is included in the Appendix, which also reports a non-linear term for
the SEI. Since there is some weakness in the linear estimates, we examined whether the SEI affects
test scores non-linearly. We therefore included a squared SEI term in each of our specifications as
a confirmatory tool. A positive, statistically significant squared term indicates that the SEI
affects test scores in a convex manner, while a negative squared term indicates that it affects them
in a concave manner. A convex relationship would mean that test scores are only slightly affected at
lower levels of the SEI and more strongly affected at higher levels. For negative squared terms, the
relationship is concave and the pattern is the opposite of what has just been described. Some
non-linearity does appear to be present, since in some cases the non-linear term is significant
while the linear term is not. However, the squared SEI term is no larger in magnitude than its
linear counterpart and is inconsistent in significance.
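The reading of the squared term can be made concrete with a small sketch. With a linear and a squared SEI term, the slope of the fitted test score with respect to the SEI is the linear coefficient plus twice the squared coefficient times the SEI. The coefficient values and SEI levels below are hypothetical, chosen only to illustrate the convex and concave cases; they are not taken from our regression tables.

```python
def marginal_effect(b_linear, b_squared, sei):
    """Slope of the fitted score with respect to the SEI at a given SEI value."""
    return b_linear + 2.0 * b_squared * sei


def curvature(b_squared):
    """Classify the fitted relationship by the sign of the squared-term coefficient."""
    if b_squared > 0:
        return "convex"
    if b_squared < 0:
        return "concave"
    return "linear"


# Hypothetical coefficients and SEI levels, for illustration only.
LOW_SEI, HIGH_SEI = 1.0, 4.0

# Positive squared term: the slope is steeper at the high SEI level.
convex_low = marginal_effect(0.01, 0.005, LOW_SEI)
convex_high = marginal_effect(0.01, 0.005, HIGH_SEI)

# Negative squared term: the slope is flatter at the high SEI level.
concave_low = marginal_effect(0.05, -0.005, LOW_SEI)
concave_high = marginal_effect(0.05, -0.005, HIGH_SEI)

print(curvature(0.005), convex_low, convex_high)
print(curvature(-0.005), concave_low, concave_high)
```

In the convex case the slope rises with the SEI, and in the concave case it falls, which is the distinction the squared term is meant to detect.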
Does the Gap Get Larger Between Grade 3 and 6?
The Model
The previous models suggested that there were some gaps in test scores between
socioeconomic groups, and many of these gaps appeared larger in grade 6 than they were in
grade 3. The next question we examine is whether or not the socioeconomic status of a
child’s neighborhood has an effect on their learning between these two crucial ages. We do this
by using a value-added model. This model is similar to the grade 6 gap regression in the
previous section, but we link students’ test scores between grades 3 and 6 and include the grade 3
test scores in the regression on grade 6 test scores. We therefore use the following equation:
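A value-added specification matching this description can be sketched as follows. The notation is an assumption made for illustration, with the lowest of the five SEI groups taken as the reference category:

```latex
\[
Y^{G6}_{is} = \alpha + \rho\, Y^{G3}_{is}
  + \sum_{g=2}^{5} \beta_g\, \mathrm{SEI}_{g,s}
  + X_{is}'\gamma + \varepsilon_{is}
\]
```

Here $Y^{G6}_{is}$ and $Y^{G3}_{is}$ denote the grade 6 and grade 3 test scores of student $i$ in school $s$, $\mathrm{SEI}_{g,s}$ is an indicator for the school's neighborhood belonging to SEI group $g$, $X_{is}$ collects the control variables, and each $\beta_g$ is the value added relative to the lowest SEI group.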
The results from this regression therefore represent the effects of moving between SEI
groups while controlling for the effects which have already occurred by grade 3. In other
words, this is the value added to test scores between grades 3 and 6 due to being in a higher
SEI group. Once again, this report outlines only the basic findings of the model, but a detailed
regression table (Table A3) is included in the Appendix for interested readers.
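The logic of the value-added approach can be illustrated with a minimal simulation, which is a sketch under stated assumptions rather than our estimation code. It uses synthetic data, a single hypothetical high-SEI indicator in place of the five SEI groups, no other controls, and made-up parameter values (`TRUE_GAIN`, `PERSISTENCE`). The helper `ols` solves the normal equations directly so that the example needs only the standard library.

```python
import random


def ols(rows, y):
    """Ordinary least squares via the normal equations (X'X) b = X'y."""
    k = len(rows[0])
    # Augmented matrix [X'X | X'y].
    a = [[sum(r[i] * r[j] for r in rows) for j in range(k)]
         + [sum(r[i] * yi for r, yi in zip(rows, y))] for i in range(k)]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        for r in range(k):
            if r != col:
                factor = a[r][col] / a[col][col]
                a[r] = [x - factor * p for x, p in zip(a[r], a[col])]
    return [a[i][k] / a[i][i] for i in range(k)]


random.seed(0)
TRUE_GAIN = 0.05     # hypothetical value added by the high-SEI group after grade 3
PERSISTENCE = 0.6    # hypothetical weight of the grade 3 score in the grade 6 score

rows, grade6 = [], []
for _ in range(20000):
    high_sei = random.randint(0, 1)
    # The high-SEI group already has a head start of 0.1 points in grade 3.
    grade3 = 2.7 + 0.1 * high_sei + random.gauss(0, 0.5)
    score6 = 1.0 + PERSISTENCE * grade3 + TRUE_GAIN * high_sei + random.gauss(0, 0.3)
    rows.append([1.0, grade3, float(high_sei)])
    grade6.append(score6)

# Value-added regression: grade 6 score on a constant, grade 3 score and SEI group.
_, b_grade3, value_added = ols(rows, grade6)

# Gap regression in levels: grade 6 score on a constant and SEI group only.
_, level_gap = ols([[r[0], r[2]] for r in rows], grade6)

print(round(value_added, 3), round(level_gap, 3))
```

In this simulated setting the gap in levels exceeds the value added, because the level regression also picks up the part of the grade 3 head start that persists into grade 6, while the value-added coefficient isolates what accrues between the two grades.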
Findings
Figure 8 below shows that the value added to reading scores between grades 3 and 6
due to belonging to a higher socioeconomic group is statistically insignificant except for the gap
between the lowest and highest groups. The writing and math values suggest there may be a
negative value added due to being in a higher SEI group between grades 3 and 6. Upon
examining the size of these values, it is clear that their economic significance is very small. The
largest gap in this graph is in writing between the lowest and second-highest SEI groups, with an
added value of -0.019 (approximately one fiftieth of a point change in test scores).
Figure 8: Predicted gains in grade 6 EQAO test scores from moving from the lowest socioeconomic
group of neighborhoods to the groups labelled in the legend (controlling for students’ grade 3 test scores).
Solid bars are statistically significant at the 5% significance level, while striped bars are not. All control variables
mentioned in the previous section were included in the regression. Detailed regression tables are listed in the Appendix.
Potential Policy Implications
Our findings do not provide very much support for the idea that socioeconomic aspects of
neighborhoods play a large role in the early childhood development of skills in reading, writing,
and mathematics. Although we did find statistically significant gaps in both grade 3 and grade 6,
the size of these gaps was extremely small. The results should not be interpreted as evidence
that no gap exists at the grade 3 or 6 levels. It may be the case that individual or family-level
factors are more prominent in driving early childhood development, and that the variation in
these factors within each FSA is overpowering the variation between FSAs.
That said, these results can be interpreted as evidence that FSA-level
socioeconomic factors do not have a large impact on early childhood development and
performance. In terms of policy, this suggests that community-wide targeting of children in low
socioeconomic areas would not have high returns. Instead, policies should provide equal
opportunities to individuals and families who may not have all of the resources needed to support
the development of their children through early childhood. Therefore, the full-day kindergarten
and early-years centres policies recently implemented in Ontario may still be effective in
increasing both the equity and efficiency of the education system as a whole.
Acknowledging Weaknesses
Unfortunately, due to data restrictions, we are unable to accurately identify the area in
which each student resides. Since we only have the FSA of the school, we have to assume that the
student lives in the same FSA as the school they attend. While this may be true for a
majority of students, the students for whom it fails may not be a random group. For example, a
particular school may be located in a different FSA from the neighborhoods of a
majority of its students, or students from a high SEI neighborhood may attend a
school in a neighborhood with a lower SEI. Depending on the scenario, the estimates
are not capturing the full effect of the neighborhood, which might be the reason our estimated
effects are so small.
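The direction of this problem can be illustrated with a short simulation. It is a sketch under a simplifying assumption that is ours and not a feature of the data: the SEI of the school's FSA is treated as the SEI of the student's home neighborhood plus independent noise (classical measurement error). All parameter values are hypothetical.

```python
import random


def slope(x, y):
    """Simple-regression slope: cov(x, y) / var(x)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    return sxy / sxx


random.seed(1)
TRUE_EFFECT = 0.10  # hypothetical effect of the home-neighborhood SEI on scores

home_sei = [random.gauss(0, 1) for _ in range(20000)]
# Assumption: the school's FSA measures the home SEI with independent noise.
school_sei = [s + random.gauss(0, 1) for s in home_sei]
scores = [3.0 + TRUE_EFFECT * s + random.gauss(0, 0.5) for s in home_sei]

slope_home = slope(home_sei, scores)      # uses the correctly measured SEI
slope_school = slope(school_sei, scores)  # uses the noisy school-based proxy

print(round(slope_home, 3), round(slope_school, 3))
```

Under these assumptions the slope estimated from the school-based proxy is pulled toward zero relative to the slope estimated from the home SEI. The non-random mismatches described above need not follow this classical pattern, so the sketch only indicates why a mismatch between school and home FSA can understate the neighborhood effect.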
Aside from the potential movement between FSAs, the size of the ‘neighborhoods’ we
examine may also be too large. There may be high variability within each FSA which is being
overlooked due to the size of our neighborhood definitions. The study could therefore benefit
from the use of neighborhood data at a finer geographic level. Ideally, we would have socioeconomic
information about children’s families and could link this information to their test scores. This
would allow us to group families at whichever level was most relevant. Unfortunately, this was
not possible using the level of census data to which we had access.
Family-level data aside, the use of smaller analysis areas would also be informative. The
analysis of the variation between sections of each FSA could help to explain the lack of observed
differences in test scores. At the presentation of this paper it was also suggested that if we had
lower-level data, we could use a method called ‘multi-level analysis’. This is another form of
regression which allows for several nested levels to be present in the data. Although this could be
useful in terms of being more precise about the effects of different levels of neighborhoods on
test scores, it is highly complicated and beyond the scope of this policy report.
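A much simpler descriptive step than a full multi-level model is to split the variance of test scores into a between-area share and a within-area share. The sketch below does this for hypothetical scores in three made-up areas (`FSA_A`, `FSA_B`, `FSA_C`); it is not a multi-level regression and does not use our data.

```python
from statistics import mean


def variance_shares(groups):
    """Return the (between-group, within-group) shares of total score variance."""
    scores = [s for g in groups.values() for s in g]
    grand_mean = mean(scores)
    between = 0.0
    within = 0.0
    for g in groups.values():
        group_mean = mean(g)
        # Spread of the group mean around the overall mean, weighted by group size.
        between += len(g) * (group_mean - grand_mean) ** 2
        # Spread of individual scores around their own group mean.
        within += sum((s - group_mean) ** 2 for s in g)
    total = between + within
    return between / total, within / total


# Hypothetical scores on the 1-4 scale for three made-up areas.
scores_by_fsa = {
    "FSA_A": [2, 3, 4, 3],
    "FSA_B": [3, 3, 4, 2],
    "FSA_C": [2, 4, 3, 4],
}
between_share, within_share = variance_shares(scores_by_fsa)
print(round(between_share, 3), round(within_share, 3))
```

In this made-up example almost all of the variation lies within areas rather than between them, which is the situation in which area-level indicators would be expected to explain little of the variation in individual scores.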
Our results seem to indicate that the SEI has little effect on test scores. However, there
are some reasons to believe that our estimates are biased downward. The first
possible source of bias is that we use the Forward Sortation Area (FSA) as a proxy for the
neighborhood. These FSAs vary from quite small to quite large. For example, there are
several FSAs that are less than a square kilometer, while there are even more that are over ten
thousand square kilometers.14
Another factor that might be affecting the results is the fact that students sometimes move
between FSAs between grades 3 and 6. This applies only to the estimates for grade 6 and for the
linked grades 3 and 6. We chose not to change the SEI between grades 3 and 6 because
we expected the student to have been more exposed to the environment captured by the first SEI
than to that of the SEI after the move. However, for some students, the family factors might be
more appropriately captured by the SEI of the neighborhood to which the student moves. For example,
a young educated family who moves to a larger house in a better neighborhood would be better captured
by the new SEI.
It is also possible that our study suffers from omitted variable bias because we lack
data on some relevant variables. Examples include the existence of early years
centers in specific neighborhoods, or whether or not the child attended full-day kindergarten.
These policies are both likely to have an impact on student outcomes that we cannot observe
over the course of our sample.
14 Census of Canada 2006 Geographic Reference Files, University of Toronto CHASS. Created March 16th, 2007.
Accessed June 13th. http://datalib.chass.utoronto.ca/cc06/georef06.htm#fsar
Conclusion
Our method of utilizing an index of socioeconomic factors to reveal an achievement gap
yielded mixed results. While we find evidence that there is a gap at grade 3, and that this gap
continues to exist in grade 6, we find that the gap contributes very little to the achievement of the
student. We also find little convincing evidence that the gap is growing, in absolute value,
between grades 3 and 6. This leads us to the conclusion
that the gap between low and high socioeconomic neighborhoods is small and that its existence
is likely inconsequential. Returning to our original question, this does not provide much evidence
that gaps between privileged and underprivileged neighborhoods are a driving force behind
economic inequality through their effects on early childhood development. That said, there
were many assumptions and data limitations which may have watered down the results. It would
be useful for further studies to examine socioeconomic factors using smaller comparison groups.
Ideally, a study would have access to socioeconomic information about children’s families and
would be able to link these children with their test scores.
Bibliography
Saez, Emmanuel, and Michael R. Veall. "The evolution of high incomes in Northern America: lessons
from Canadian evidence." American Economic Review (2005): 831-849.
Fryer Jr, Roland G., and Steven D. Levitt. "Testing for racial differences in the mental ability of young
children." The American Economic Review 103.2 (2013): 981-1005.
Sammons, Pam, James Hall, Kathy Sylva, Edward Melhuish, Iram Siraj-Blatchford & Brenda Taggart,
“Protecting the development of 5-11-year-olds from the impacts of early disadvantage: the role of primary
school academic effectiveness,” School Effectiveness and School Improvement 24.2 (2013): 251-268.
Fryer, Roland G. Jr. & Steven D. Levitt, “The Black-White Test Score Gap Through Third Grade”, NBER
Working Paper No. 11049, January 2005, JEL No. I2.
Center on the Developing Child (2007). The Science of Early Childhood Development (InBrief). Retrieved
from www.developingchild.harvard.edu. Retrieved from
http://developingchild.harvard.edu/index.php/resources/briefs/inbrief_series/inbrief_the_science_of_ecd/
Shonkoff, Jack P., and Deborah A. Phillips, eds. From neurons to neighborhoods: The science of early
childhood development. National Academies Press, 2000.
Melhuish, Edward, Keynote Presentation, “Excellence and Equity in Early Childhood Education and Care,”
EU conference, Budapest. 2011.
Melhuish, Edward C. "Preschool matters." Science 333.6040 (2011): 299-300.
Krishnan, Vijaya. "Constructing an area-based socioeconomic index: A principal components analysis
approach." Edmonton, Alberta: Early Child Development Mapping Project (2010).
Pecoskie, Teri, “Keeping Score: Day 1: Unequal Education”, The Hamilton Spectator, April 12, 2014.
Accessed June 12th, 2014. Online.
Census of Canada 2006 Geographic Reference Files, University of Toronto CHASS. Created March 16th,
2007. Accessed June 13th. http://datalib.chass.utoronto.ca/cc06/georef06.htm#fsar
Appendix
Table A1: Regression output for grade 3. Linear SEI term on the left-hand side, non-linear SEI term on the right half
of this table. Regression coefficients are listed with p-values in parentheses. Errors are clustered at the school level.
Table A2: Regression output for grade 6. Linear SEI term on the left-hand side, non-linear SEI term on the right half
of this table. Regression coefficients are listed with p-values in parentheses. Errors are clustered at the school level.
Table A3: Regression output for linked grade 3-6 students. Linear SEI term on the left-hand side, non-linear SEI
term on the right half of this table. Regression coefficients are listed with p-values in parentheses. Errors are
clustered at the school level.
Table A4: Grade 3 tabulation of year and grouped SEI displaying the number of students across years in different
categories of the SEI.
Table A5: Grade 6 tabulation of year and grouped SEI displaying the number of students across years in different
categories of the SEI.