Master Thesis
Anna Hasselqvist Haglund
Uppsala University
Spring Semester of 2018
Supervisor: Helena Holmlund
Public secondary school mergers as a desegregation method in Swedish
municipalities - Investigating their impact on student’s academic performance and choice of school
Abstract
In recent years several municipalities in Sweden have merged their public secondary schools. This
has been considered a type of initiative that intends to reduce youth segregation and discrepancies in
school quality. This thesis examines in what ways the merging of all public secondary schools in a
municipality affects the students’ academic performance and their choice to enroll in the public-
school sector. To do so I use municipality-level aggregate data from the Swedish National Agency
for Education on 9th grade students’ academic outcomes and the share of 7th graders enrolled in the
public schools. I employ a difference-in-difference approach to estimate the reduced form effect of
the school mergers. The control group used in the baseline estimation includes all municipalities that
had a constant number of public secondary schools during the time period of my study. I move on to
use propensity score matching in order to create a more comparable control group. I then estimate a
difference-in-difference regression with match-fixed effects. The results show that the mergers have
a negative effect on the municipality-level average GPA. In addition, the municipalities where the
mergers have been implemented experience a reduction in the share of students that pass all 9th grade
subjects as well as an increase in the share of students who do not have sufficient grades to continue
to upper secondary school. The school mergers caused the share of 7th graders enrolled in the public-
school sector to decrease by approximately 10 percentage points. These results indicate that the public
secondary school merger is not a panacea for improving student outcomes.
Table of Contents
1. Introduction
2. Background
2.1 The Swedish education system
2.2 Previous literature
3. Theory
4. Data
4.1 Data description and coding issues
4.2 Outcome variables
4.3 Treatment and control groups
4.4 Descriptive statistics
5. Method
5.1 Empirical framework
5.2 Underlying assumption
5.3 Matching
6. Results
6.1 Difference-in-difference results
6.2 Matched difference-in-difference results
6.3 Heterogeneous treatment effects
7. Sensitivity analysis
7.1 Event studies
7.1.1 Parallel trends
7.1.2 Dynamic treatment effects
7.2 Placebo tests on control variables
7.3 ‘Worst case’ and ‘best case’ variables
8. Discussion and concluding remarks
References
Appendices
A. Table of municipalities excluded from control groups
B. Table of matched control groups
C. Figure for event studies baseline panels
D. Table of ‘worst case’ and ‘best case’ comparison
1. Introduction
The Swedish education system has long been built upon a notion of egalitarian ideals. The aim is
to offer high-quality education without selective sorting of gifted students or imposing tuition fees. At
a glance, the Swedish education system should provide a good foundation for scholarly success. Yet the
results from the 2012 PISA survey attracted considerable attention, as they revealed that Swedish fifteen-year-olds
performed significantly worse than the OECD mean (Swedish National Agency for Education 2013).
On a somewhat more positive note, the results from the 2015 PISA survey did show improvement in the
Swedish students’ overall academic performance. However, the survey points out that there are signs of
growing achievement gaps between high and low-performing students and among students with
different socioeconomic status (OECD 2016).
One of the possible reasons for the widening of the achievement gap is that school segregation
has become increasingly present in Sweden (SOU 2017:35). School segregation implies that students
with background characteristics that increase their probability of scholarly success attend one school,
while students with relatively more disadvantaged background characteristics attend another. This may
have negative consequences on student performance, especially for students with a low socioeconomic
status. Furthermore, school segregation may endanger the equity goal that Sweden has set for its
education system, as it might create heterogeneity in quality among schools. One reason may be that
students with low socioeconomic status are missing out on positive peer effects from classmates with
high socioeconomic status.
In 2014 the local government in Nyköping municipality took action in an effort to increase student
achievement and promote youth integration. The municipality had four public secondary schools. For a
long time, the average grade results from one school outperformed the others. This school consisted of
students with parents of Swedish descent living in predominantly high-income neighborhoods. In order
to desegregate the student population and make use of the potential peer effects that come from having
classes with children of varying levels of ability, the municipality merged its public secondary
schools into one large school. In the merged school they sort students into mixed classes based on
demographic variables.
Even though this method has been called ‘the Nyköping model’, merging all public secondary
schools into one is not an initiative unique to Nyköping. Between 2012 and 2016, twelve additional
municipalities in Sweden merged their public secondary schools according to the Swedish National
Agency for Education’s SIRIS-database.
The Nyköping model has received considerable media attention (Svt 2015, Dagens nyheter 2016,
Dagens samhälle 2017). However, it was not until June 2017 that the first student cohort that had the
possibility to attend 7th to 9th grade in the merged school received their final grades. Even though the
effect of the Nyköping model on student outcomes has not yet been extensively studied, there are other
municipalities in Sweden that are considering merging their public schools (Arbetarbladet 2013, Svt
2016).
The aim of this thesis is to examine the effect of the public secondary school mergers on two types
of outcomes. Firstly, I aim to examine whether merging all public secondary schools into one increases
the municipality’s average student academic performance. Secondly, I aim to examine whether a
secondary school merger affects the share of students that attend the public secondary school instead of
choosing to enroll in an independent school.
If the public secondary school mergers have a positive effect on students’ academic
performance there are incentives for policy makers to encourage the implementation of this type of
initiative. If the mergers have a positive effect on student outcomes, they may be an efficient desegregation
tool. They may also make the public school alternative more competitive. In addition, having just one public
secondary school may be cost-efficient for the municipality.
On the other hand, the mergers may not be efficient in increasing student achievement. The
municipalities may then end up with schools that accommodate a large number of students. This could
lead to a low teacher density and an impersonal school environment that might not benefit the students.
Bullying and destructive behaviors may not be noticed as quickly as they would be in a smaller school, and the
within-school segregation between students may not necessarily disappear.
The data that I use is gathered from the Swedish National Agency for Education’s SIRIS database,
which offers aggregate data on municipality-level student outcomes. The time period of this study
stretches from the scholastic year of 2012/2013 to 2016/2017. The time period starts when the first
ninth-grade student cohort received grades according to the 2011 grade reform. The time period ends
with the most recent available data on 9th grade outcomes.
My identification strategy consists of two methods. My baseline approach is to employ a
difference-in-difference estimation in order to reduce bias from selection on unobservable confounding
factors. I then move on to use propensity score matching to create a control group that potentially
provides a better counterfactual to the treatment group. I use these matches to conduct a difference-in-
difference estimation with match-fixed effects. I find that the results of my baseline approach prove to
be unstable when adding controls to the specification, so I move on to use the matched difference-in-
difference estimation as the main analysis.
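The logic of the baseline difference-in-difference comparison can be illustrated with a minimal two-group, two-period sketch. The function name `diff_in_diff` and all numbers below are hypothetical and made up for illustration; they are not data or results from this thesis, and the sketch leaves out the control variables, fixed effects and match-fixed effects used in the actual estimations.

```python
from statistics import fmean

def diff_in_diff(rows):
    """Two-by-two difference-in-difference estimate.

    rows: list of (treated, post, outcome) tuples, one per municipality-year.
    """
    def cell_mean(treated, post):
        # Average outcome in one of the four treated/post cells.
        return fmean([y for t, p, y in rows if t == treated and p == post])

    change_treated = cell_mean(True, True) - cell_mean(True, False)
    change_control = cell_mean(False, True) - cell_mean(False, False)
    # The control group's change stands in for the change the treated
    # group would have experienced without the merger.
    return change_treated - change_control

# Made-up municipality-level average GPAs, for illustration only.
rows = [
    (True, False, 210.0), (True, False, 214.0),
    (True, True, 208.0), (True, True, 210.0),
    (False, False, 215.0), (False, False, 217.0),
    (False, True, 216.0), (False, True, 218.0),
]
estimate = diff_in_diff(rows)  # (209 - 212) - (217 - 216) = -4.0
```

The estimate is only informative if the treated and control municipalities would have followed parallel trends in the absence of the mergers, which is the assumption the event studies are meant to probe.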
I also conduct a heterogeneous treatment effect analysis to explore the potential discrepancy in
treatment effects among students with different backgrounds. The analysis includes comparisons
between female and male students, students of parents with different education levels and students with
Swedish and migrant backgrounds.
Finally, I conduct multiple event studies to examine the validity of the parallel trends assumption.
I also use these studies to examine whether the outcomes are affected by the exposure time of treatment.
My outcome variables are measures on the average student performance of the whole
municipality. Accordingly, the effect that I estimate is the reduced form effect. This reduces the potential
bias from selection in or out of the public-school sector. The reduced form effect holds policy-relevance
since it measures the effect of the school mergers on the average outcomes of the total student population
in the municipality.
My main results show that the public secondary school mergers have a negative effect on the
municipality-level average GPA. The school mergers also have a negative effect on the share of students
in the municipality that pass all subjects and the share that are eligible for a vocational school education.
Thus, the school mergers seem to reduce the share of students in the top of the skill distribution, while
the share of students in the bottom increases. I also use student outcomes from the national standardized
tests as a measure of student academic performance. My analysis shows that the effects of the school
mergers on these outcomes are statistically insignificant and close to zero.
My main results also show that the public secondary school mergers caused the share of 7th graders
enrolled in the public-school sector to decrease by approximately 10 percentage points in the treated
municipalities. This implies that the initiative caused a smaller share of parents to enroll their children
in the merged public schools.
Overall, the public secondary school mergers do not seem to be effective in increasing the
municipality-level student outcomes on average, and they do not seem to improve outcomes for students
in the bottom of the skill distribution. The results indicate that the public secondary school merger is not
a panacea for improving student outcomes.
The remainder of this thesis is organized as follows. Section 2 explains the institutional background
of the Swedish education system and presents previous literature. Section 3 discusses the theory of peer
effects and its importance in the context of this study. Section 4 discusses the data and summary
statistics. Section 5 presents the empirical strategy and underlying assumptions. Section 6 presents the
main results. Section 7 shows the results of a number of robustness checks. Section 8 includes both a
discussion and concluding remarks.
2. Background
2.1 The Swedish education system
Education in Sweden is compulsory for the first nine years of school, starting at age seven.
Education is offered tuition-free for all levels, which is a Swedish tradition that dates back to the middle
of the 17th century. For a long time, the overall governance of schools and school funding were decided at a
centralized level by the Swedish government, and admission to compulsory school was limited to
students being automatically assigned to the nearest public school in their municipality. Following a
general trend of shifting the responsibility of public service to the municipal level in the 1980s, school
governance was decentralized in 1991. Nevertheless, there are still centrally determined curricula and
goals in each subject that the municipalities are required to meet. This rests on the notion of equal, high-
quality education for all students, independent of which school they attend.
In the fall of 1992, a large voucher reform was introduced which allowed for independent schools
to be established in the municipalities if approved by the Swedish National Agency for Education. Prior
to 1992, students were automatically assigned to the nearest public school in their municipality. The
only alternative was to attend one of the few private schools active in Sweden. However, these schools
mostly targeted specific demographics of the Swedish population, and less than one percent of
the students attended them.
When a student is enrolled in an independent school, the student’s home municipality has to
provide the school with a voucher, equivalent to most of the average per-student expenditure in the
public-school system. The reform allowed students to have the freedom of school choice. This ‘free
choice’ principle allows the student to attend a school outside of his or her home municipality, and the
voucher follows the student to the school. The independent schools are allowed to deviate from
the national standardized curriculum; however, they are not allowed to select students for admission
based on ability, socioeconomic background or ethnicity.
The current Swedish grading system is based on a scale of A to F for each subject. This system
came into effect in 2011, but was then only implemented in the 8th grade of compulsory school and in
the 1st grade of upper secondary school, while implementation in the 9th grade was done in 2012. Grades
A to E are passing grades, A being the highest one. The grade F is given when a student has not met the
standard knowledge requirements in the subject. If the student has been absent for too many lessons in
a subject, the school may judge that there is not enough information to assess the student. The school
may then mark the student with a dash (-) instead of a grade in that subject. The dash is not a grade and is thus
distinct from an F. The sum of the 16 highest final grades when the
student finishes the ninth year of elementary school is recalculated to a grade point average (GPA).
Receiving an A in a subject yields 20 points. Receiving an F yields 0 points. The final grade GPA ranges
from 10 points to 340 points. A student’s GPA is the basis for admission to upper secondary school.
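The summation of the highest final grades can be sketched as follows. This is a minimal illustration, not the official calculation: only A = 20 and F = 0 are stated above, while the intermediate point values for B to E and the treatment of a dash as zero points are assumptions, and the names `GRADE_POINTS` and `final_gpa` are hypothetical.

```python
# Points per grade. Only A = 20 and F = 0 are given in the text; the
# intermediate values and the zero for a dash are assumptions.
GRADE_POINTS = {"A": 20.0, "B": 17.5, "C": 15.0, "D": 12.5,
                "E": 10.0, "F": 0.0, "-": 0.0}

def final_gpa(grades, n_best=16):
    """Sum the points of the `n_best` highest final grades."""
    points = sorted((GRADE_POINTS[g] for g in grades), reverse=True)
    return sum(points[:n_best])

# A student with 17 final grades: the lowest one (the F) is not counted.
grades = ["A"] * 5 + ["C"] * 6 + ["E"] * 5 + ["F"]
gpa = final_gpa(grades)  # 5*20 + 6*15 + 5*10 = 240.0
```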
2.2 Previous literature
In the context of this study, it is interesting to look at literature that examines the effect of school
segregation on student outcomes. Szulkin and Jonsson (2007) examine the effect of ethnic density in
Swedish elementary schools on students’ educational outcomes in ninth grade. They link data on the
ninth-grade student cohorts in 1998 and 1999 with census data on their background characteristics to
establish which students are first and second-generation immigrants. They adopt a multilevel analysis
and find that a high density of first-generation immigrants in a school has a negative effect on a student’s
grade, especially for other first-generation immigrant students. However, a high density of second-
generation immigrants in a school does not affect the student’s grade negatively. They conclude that
implementing desegregation policies would lead to a reduction in educational achievement gaps
between students.
Nordin (2013) examines whether school segregation affects an individual’s human capital
outcomes. He has access to register data from Statistics Sweden on all individuals aged 20-27 living in
Sweden in 1999. The data also includes information on parental education level and income. Using a
peer effects model with school-fixed effects he estimates the effect of the concentration rate of first and second-
generation immigrants in the individual’s school cohort on the outcome variables cognitive ability,
educational attainment and compulsory school grades. In doing so he is able to isolate the within-school
immigration concentration rate which he calls the ‘immigration segregation effect’. He does not find
any significant effects when he includes all types of immigrants in his calculations of the immigrant
concentration rate. However, when estimating peer influences of first-generation immigrants and
second-generation immigrants with two foreign born parents, he finds that immigrant peer externalities
have both positive and negative effects on students with immigrant backgrounds. He finds that the peer
effects are sometimes related to lack of Swedish language skills which has a negative effect on students
who also have weak Swedish language skills, while the creation of immigrant peer-groups in schools
has a positive influence on the outcomes for males.
Grönqvist, Niknami and Robling (2015) make use of a placement policy called the ‘Whole of
Sweden strategy’ to investigate how young males’ criminal behavior is affected by immigrant residential
segregation. The strategy was implemented in Sweden from 1985 to 1994 and meant that the
government was in charge of assigning newly arrived refugees and their families to their initial
residential locations. The authors argue that this should provide exogenous variation to the initial
residential distribution of refugee children. They use population-wide administrative data from 1985 to
2008. The data consists of individual characteristics such as income, educational background and
demographic variables, as well as data on all convictions in criminal trials. This allows them to estimate
the effect of childhood exposure to segregation from the placement policy on long-run criminal
participation. They estimate an OLS that includes a vector of controls for neighborhood characteristics
that vary over time, and municipality-by-year fixed effects. Their dependent variable indicates whether
an individual has been convicted for at least one crime up to the age of 26. Thus, their estimating
equation is a linear probability model. Their results show that being assigned to a neighborhood with a
large share of immigrants increases the probability that immigrant males commit drug related offenses
and become incarcerated later in life. They show that these estimates account for a large share of the
native-immigrant crime gap, and that the results are driven mainly by individuals with low educated
parents. The authors conclude that their results provide some evidence that exposure to segregation
during childhood is one of the reasons why young immigrants are overrepresented among criminal
offenders.
Brandén, Birkelund and Szulkin (2016) study how ethnic school segregation in compulsory
schools affects 9th grade students’ educational outcomes. They examine whether attending a segregated
school affects the students’ grade point average and the students’ eligibility for secondary school. To do
this they use Swedish population register data on all students who finished ninth grade between 1998
and 2012. They perform several OLS regressions where they first introduce school fixed effects, then
family fixed effects, and finally a two-way school and family fixed effects design in order to isolate the
‘social-interaction effect’. Thus they try to net out other factors that may affect the student academic
performance such as overall school quality and the potential selection bias that comes from choice of
school. Their estimations show mixed results. Overall, the effect of ethnic segregation on the educational
outcomes is close to zero. Their standard OLS regression shows that grade scores are on average lower
in schools where all students are immigrants than in schools that consist only of native children.
However, the difference between the schools disappears when the school and family fixed effects are
added. In other words, when the ‘social-interaction effect’ is isolated, the effect disappears. In addition,
they find that attending a school with a large proportion of immigrant schoolmates has a small negative
impact on students that are on the margin of not being eligible for upper-secondary schools. Their
conclusion is that ethnic school segregation, if it matters, is mainly affecting students with weak
academic performance. They relate these findings to previous research that shows that skill formation
is crucial in the early years of the child, and that family background has a major influence on the child’s
ability of academic success. From this they conclude that policy interventions trying to change the
composition of immigrants in schools should not affect either immigrant or native children negatively.
Merging all public secondary schools into one in order to decrease segregation and increase student
performance is a type of initiative that is comparable to the desegregation-busing programs that have
been implemented in America since the 1970s. In states with segregated school districts, busing
programs send minority students with low socioeconomic status by school buses to attend schools in
neighborhoods that are predominantly white. There is a large literature that tries to estimate the effect of
these types of desegregation programs on student performance and other civic outcomes.
Angrist and Lang (2004) examine the effects of the Metco busing program in Boston. The
program sent black students from Boston schools to attend schools in suburbs outside the city. They use
micro data from the Boston school districts from the year of 1994 to 2000. They estimate the difference
in achievement between Metco-students attending schools in the suburb of Brookline and the resident
students. They find that Metco students had significantly lower test scores than the students that reside
in Brookline. They then move on to estimate the effect of Metco students on the achievement of non-
Metco students using both an OLS regression framework and an instrumental variables approach where
class size is used as an instrument. They find no statistically significant evidence that Metco students
have any effect on the educational outcomes of their non-Metco classmates. They conclude that peer
effects that arise from the influx of black inner-city students into suburban schools because of the Metco
program are modest.
Johnson (2011) investigates the effect of court-ordered school desegregation programs on
different types of adult socioeconomic and health outcomes. The fact that the court orders were issued
at different points in time creates a quasi-experiment where there is exogenous variation in the
implementation of the programs. Using this variation he estimates a difference-in-difference, a two-stage
least squares and a sibling-difference estimation using a panel of children born between 1945 and 1968.
The data includes information on their income until the year of 2013 as well as background
characteristics and school quality measures. His findings show that school desegregation programs
significantly increased educational attainment among black students. Consequently, the accompanying
increases in school quality for black students had a significant positive effect on their labor market and
health outcomes.
Billings et al (2012) study the impacts of the shutdown of a desegregation busing-program in the
Charlotte-Mecklenburg school district in North Carolina in 2002 on academic achievement, educational
attainment and young adult crime. They match university attendance records and incarceration data to
yearly student records from the school district in question. Their identification strategy is to compare
students who live in the same neighborhoods but whose pre-policy addresses placed them on different
sides of a newly drawn school boundary. This caused them to be reassigned to schools with different
racial compositions. They estimate a fixed effects regression and find that both white and minority
students that were reassigned to schools with a higher share of minority students scored lower on their
high school exams. In addition, the policy change widened the performance gap between black and
white students. However, for primary school students the negative impact was smaller. Overall the
effects of the minority-student influx to schools on student achievement diminished over time. An
increase of the share of minority peers increased the probability that minority males take part in criminal
activity, which stayed persistent over the years. The authors conclude that the diminishing impact on
student achievement was due to compensatory allocation of resources to the schools, while the impacts
on crime seemed not to be affected by increasing school resources.
3. Theory
When changing the school organization within a municipality in the way that Nyköping has done,
one changes the composition of the school cohorts, mixing students together in order to reduce
segregation among the youth population and increase student performance. The term peer effects in the
economics of education literature is defined as the externalities that come from peers in the classroom
or in the general school environment affecting the outcome of another student. A problem is that peer
groups are often created endogenously. This may be optimal for the students that choose the ‘right’ type
of peers since it optimizes their outcome. However, their actions may not optimize the social outcome.
Consequently, there are opportunities for school officials to create incentives for students to interact
with the socially optimal distribution of peers within the school (Hoxby 2000).
Manski (1993) divides peer effects into three different types. The first one is when the peers’ own
abilities affect another student’s outcome. The second one is when the peers’ background characteristics
affect another student’s outcome. This may affect the student’s outcome directly in class and indirectly
through the level of difficulty that the teachers adopt in the classrooms or the peers’ parents’ impact on
school decisions. The third type is the correlated peer effects. These arise when students spend a lot of
time together in the same institutional setting. In this setting, a school for instance, they are exposed to
common shocks. For example, they may be taught by the same teachers or influenced by the same type
of social norms, which will cause them to adopt similar traits that in turn will affect their outcomes.
A basic model of the effects of peers on a student’s outcome is a linear-in-means estimation,
assuming that only the first two types of peer effects affect the outcome of the student:
$$Y_i = a + \beta_1 \bar{Y}_{-i} + \gamma_1 X_i + \gamma_2 \bar{X}_{-i} + \varepsilon_i \qquad (1)$$
$Y_i$ is student i’s outcome, for instance the student’s performance in school. $\bar{Y}_{-i}$ is the peers’ average
outcome, which may be interpreted as the peers’ average performance in school, affecting student i’s
outcome if $\beta_1 \neq 0$. $X_i$ is a vector of student i’s own background characteristics. $\bar{X}_{-i}$ is a vector of the
peers’ average background characteristics, affecting student i’s outcome if $\gamma_2 \neq 0$. $\varepsilon_i$ is the error term.
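The peer averages in equation (1) are leave-one-out means: each student’s peer average is computed over all classmates except the student herself. A minimal sketch of that computation is given below. The function name `leave_one_out_means` and the classroom values are hypothetical and serve only as an illustration.

```python
def leave_one_out_means(values):
    """For each student i, return the mean of `values` over all classmates except i."""
    n = len(values)
    if n < 2:
        raise ValueError("need at least two students to define peers")
    total = sum(values)
    # Removing student i's own value from the class total leaves the peers' sum.
    return [(total - v) / (n - 1) for v in values]

# Hypothetical classroom: outcomes Y and one background characteristic X.
y = [200.0, 220.0, 240.0, 260.0]
x = [0.0, 1.0, 0.0, 1.0]

y_bar_minus_i = leave_one_out_means(y)  # peers' average outcome per student
x_bar_minus_i = leave_one_out_means(x)  # peers' average background per student
```

Because each student’s own value is excluded, two students in the same class generally face different peer averages, which is what gives the regressors variation within a class.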
Several problems arise when economists try to empirically estimate peer effects. One problem is
the reflection problem (Manski 1993), which arises from the fact that student i is herself a peer of her classmates.
Hence the student’s outcome $Y_i$ will affect her peers’ outcome $\bar{Y}_{-i}$, which introduces simultaneous
causality to the linear model.
A second problem is the self-selection into peer groups based on unobserved factors. For example,
even though sorting into a group may be done in part based on observables such as background
characteristics, the peers may self-select into sub-groups in ways that are unobservable (Sacerdote 2001;
Carell et al 2011). This means that the third peer effect, the correlated effect, is not observable in the
data but still affects $Y_i$, which will lead to biased estimates.
A third problem is the difficulty of establishing whether it is the peers’ abilities or background
characteristics that have a causal effect on the student’s outcomes. The peers’ abilities are in themselves
affected by their backgrounds. Consequently, even if one may assume that the effects of peers’
background characteristics have a causal effect on the student outcomes, it may not be possible to
separately identify $\beta_1$ and $\gamma_2$.
Because of these problems, most economists employ an estimation of the reduced form effect of
peer effects (Sacerdote 2001; Angrist and Lang 2004; Carell et al 2008; Carell et al 2011).
In this study I will not be able to establish the effects of peers on the students’ academic outcomes
within the merged schools. This is because I work with aggregate data and cannot make inferences about
the actual students in the classrooms. However, the underlying mechanisms of peer effects may explain
the motivational reasons for why municipalities merge their public secondary schools.
Firstly, the most prominent goal of the public secondary school merger in Nyköping is to reduce
segregation among its youth population. In essence, what Nyköping is hoping for is an ‘integration
effect’ where a student population with diverse backgrounds will create positive peer effects that in turn
will increase students’ academic performance. However, under the assumption that peers affect
outcomes as in the linear model specified in equation (1), sorting students into mixed classes should not
have any effect on average student performance. This is because the positive effects that the group of
low-ability students experience will be offset by the negative effects that the group of high-ability
students will encounter.
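This offsetting can be made explicit by averaging equation (1) over all students in the municipality. The following is a sketch under added assumptions: every class has at least two students, $\beta_1 \neq 1$, $X_i$ is treated as a scalar, and the error terms average to zero. The class index $c$, the class size $n_c$ and the total number of students $N$ are notation introduced here for illustration.

```latex
\begin{align*}
\sum_{i \in c} \bar{Y}_{-i}
  &= \sum_{i \in c} \frac{\sum_{j \in c} Y_j - Y_i}{n_c - 1}
   = \frac{n_c \sum_{j \in c} Y_j - \sum_{i \in c} Y_i}{n_c - 1}
   = \sum_{j \in c} Y_j \\
\Rightarrow \quad
\frac{1}{N} \sum_{i} \bar{Y}_{-i} &= \bar{Y},
  \qquad
  \frac{1}{N} \sum_{i} \bar{X}_{-i} = \bar{X} \\
\bar{Y} &= a + \beta_1 \bar{Y} + (\gamma_1 + \gamma_2) \bar{X}
  \quad \Longrightarrow \quad
  \bar{Y} = \frac{a + (\gamma_1 + \gamma_2) \bar{X}}{1 - \beta_1}
\end{align*}
```

The municipality-wide mean $\bar{Y}$ depends only on the municipality-wide mean $\bar{X}$, not on how students are allocated to classes, so reshuffling students redistributes outcomes without changing their average.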
We can relax the linearity assumption and assume that weak students experience positive effects
from high-ability peers, while high-ability peers are not hurt by having low-ability school mates. What
we could expect in this case is that the average student performance should increase when students with
diverse backgrounds and abilities attend the same schools. Any positive effect of the public secondary
school mergers on average student academic performance hinges on the assumption that peers affect each
other in this ‘non-linear’ manner. This is why school officials believe in sorting the classes on
demographic variables as a way to optimize the distribution of peers in each classroom. On the other
hand, the students may create sub-groups within each class, and still interact with the students that they
would have attended the same school with if the merger had not been implemented.
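The contrast between the linear and the ‘non-linear’ case can be illustrated with a small numerical sketch. Everything in it is an assumption made for illustration: the ability values, the spillover size of 0.5, and the functional forms `linear_outcome` and `asymmetric_outcome`, which let outcomes depend on peers’ average ability only and therefore ignore the feedback through peers’ outcomes in equation (1).

```python
def peer_mean(values, i):
    """Leave-one-out mean: average of `values` excluding position i."""
    return (sum(values) - values[i]) / (len(values) - 1)

def linear_outcome(own, peers, spillover=0.5):
    # Linear-in-means: every student moves with the peers' average ability.
    return own + spillover * peers

def asymmetric_outcome(own, peers, spillover=0.5):
    # Assumed non-linearity: students gain from stronger peers,
    # but are not hurt by weaker ones.
    return own + spillover * max(peers - own, 0.0)

def municipality_mean(classes, outcome):
    """Average outcome over all students, given a list of classes of abilities."""
    outcomes = [
        outcome(own, peer_mean(cls, i))
        for cls in classes
        for i, own in enumerate(cls)
    ]
    return sum(outcomes) / len(outcomes)

# Two hypothetical allocations of the same eight students.
segregated = [[10.0] * 4, [20.0] * 4]
mixed = [[10.0, 10.0, 20.0, 20.0], [10.0, 10.0, 20.0, 20.0]]

for name, rule in [("linear", linear_outcome), ("asymmetric", asymmetric_outcome)]:
    print(name, municipality_mean(segregated, rule), municipality_mean(mixed, rule))
```

Under the linear rule the two allocations give the same municipality mean (up to floating-point rounding), whereas under the asymmetric rule the mixed allocation gives a higher mean, because the low-ability students gain and the high-ability students lose nothing.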
The officials may believe that the students with high socioeconomic status will perform just as
well in the merged school as they would have if they attended a school close to their home. This should
have a positive influence on students with low socioeconomic status. This means that the presence of
students with high socioeconomic status will cause 𝛽1 > 0 for their classmates. However, students with
high socioeconomic status may have done well in their old school because they were in a homogeneous
environment and only associated with other students of high socioeconomic status. In essence, the
potentially positive peer effects from this group may not be as positive as the officials hope for. In this
case, the ‘integration effect’ may be smaller than predicted.
Secondly, one may assume that good-quality teachers choose employment at schools where
students’ average performance is high. Students in the merged school who otherwise would have gone
to a school where the average student performance was low may now have teachers of greater quality
because their peers will attract such teachers to seek employment at the new school. However, it might also be
the case that teachers who otherwise would have taught students with high socioeconomic status are
biased against students with low socioeconomic status. These students may then not get the resources
that they need to increase their academic performance.
Finally, when the school mergers are implemented, parents who are about to make the choice of
school for their child will essentially choose the one public school available or an independent school
alternative. If the peer effects that come from attending the merged schools are linear, then children
with low ability will benefit from attending the merged school. However, children with a high
probability of scholarly success will be on the downside of the initiative. Students with low
socioeconomic status or immigrant background might benefit from learning in a heterogeneous school
environment, while students with high socioeconomic status or native background might benefit from
attending a school with peers that are similar to themselves. In this case the parents of these children
might choose to enroll their children in an independent school as a reaction to the school mergers. The
result may be that the school mergers actually cause school segregation to increase in the municipality
rather than to eliminate it.
4. Data
4.1 Data description and coding issues
This study aims to examine in which ways the public secondary school mergers affect student
outcomes. The data is gathered from the Swedish National Agency for Education’s SIRIS database,
which offers municipality or school level data on the Swedish education system. The database is built
on individual-level data that Statistics Sweden gathers by order of the agency.
The time period of this study stretches from the scholastic year of 2012/2013 to 2016/2017. The
9th grade students’ academic outcomes are recorded at the end of each scholastic year. The most recent
available data is from the scholastic year of 2016/2017. As mentioned in section 2.1, the grade reform
of 2011 did not affect the 9th graders until the scholastic year of 2012/2013. The reform entailed that an
entirely new grade system was implemented. Because of this, the aggregated data on test and grade
results provided by SIRIS is considerably different before and after the reform. In order for the grades
and test results to be comparable between the student cohorts, my panel does not stretch further back
than to the year of 2012/2013.
The SIRIS database also offers data on municipality-level student characteristics for the entire
compulsory school population (grades 1-9). I use these variables as controls for municipality
demographics. The controls that I include are the share of female students, the share of students with
migrant background and the share of students with parents that have a university education. Students
with migrant background are defined as students with both parents born outside of Sweden. Students
with parents who have a university education are defined as students who have at least one parent who
attended university.
The data includes municipality-level outcomes and control variables for all municipalities in
Sweden. Some values in the SIRIS raw data are coded as missing instead of their real values for
various reasons. I will briefly go through how I have handled these coding issues.
Firstly, the raw data includes missing values for some municipalities in some years. These values
are coded as (.) in SIRIS. It is probable that this issue arises because it is formally an official
for each school who is responsible for providing Statistics Sweden with the data material. For the public
schools this is done by the municipality officials. For the independent schools it is either the headmaster
or the company that runs the school that is responsible. Neither Statistics Sweden nor the Swedish National
Agency for Education holds the main responsibility, and it is therefore plausible that some outcomes
are missing due to human error. When the outcomes are missing for a specific municipality for two
years or fewer, I use interpolation. However, when a municipality has missing values for the majority of
the years it has been omitted.
Secondly, there are some values that are coded as ~100 in the ‘share outcomes’ data, for example
the share of students that pass the national standardized tests. When four students or fewer
fail the test, the share of students that pass is coded as ~100. SIRIS does not clarify
what ~100 actually means, but it could be interpreted as ‘approximately 100 percent’ of the students
having passed. Note that this is not the true value of the outcome. We can assume that the true value moves
towards 100 percent when the student population is large. However, if the student population is small
the true value will not be approximately 100.¹ I have no way of knowing the true value of these
outcomes. I have solved this coding issue by creating two different measures for these outcome
variables. For the first measure I assume that four students failed to pass (the ‘worst
case’ scenario) and calculate the implied pass share by dividing four by the student population
and subtracting the result from 1. In this ‘worst case’ variable the ~100 values are recoded to the value from my
calculations. I also create a variable where all the ~100 values are recoded as 100, a ‘best case’ scenario.
I will use the ‘worst case’ outcome variables in my analysis. This is because it is likely that at least some
students fail to pass in each municipality, and using the ‘worst case’ scenario accounts for this.
This issue will be further discussed in section 7.4.
¹ A municipality with a student population of a thousand students where four students fail to pass the national standardized test
will have a 99.6 percent share that did. However, a municipality with a student population of twenty students where four
students failed to pass will have an 80 percent share that passed. Recoding the ~100 value to 100 for this municipality is not
close to the true value of the share that passed.
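The two recodings of the ~100 values can be sketched as follows. This is only an illustration of the calculation described above; the function name `recode_pass_share` and its parameters are hypothetical and not part of SIRIS or of the thesis' own code.

```python
def recode_pass_share(n_students, case="worst", max_hidden_failures=4):
    """Recode a '~100' pass share (in percent) for a municipality.

    'worst' assumes that exactly `max_hidden_failures` students failed,
    'best' assumes that nobody failed.
    """
    if n_students <= 0:
        raise ValueError("n_students must be positive")
    if case == "best":
        return 100.0
    if case == "worst":
        return 100.0 * (1 - max_hidden_failures / n_students)
    raise ValueError("case must be 'worst' or 'best'")


# The two cases from the footnote: 1,000 students and 20 students.
large_municipality = recode_pass_share(1000)
small_municipality = recode_pass_share(20)
```

With four assumed failures, the large municipality is recoded to roughly 99.6 percent and the small one to roughly 80 percent, which shows how much more the choice between the two recodings matters for small student populations.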
Finally, when the outcome is based on ten students or fewer, the outcome is coded as (..) instead of
its real value. This may be considered fair on the basis of anonymity in the Swedish Personal Data Act,
but it poses somewhat of an issue for statistical inference since it produces missing values that are never
random.² I treat these values as missing and either use interpolation to get an estimated value or omit
the municipality.³
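The handling of missing values (both the (.) and the (..) codes) can be sketched as below. The text does not state which interpolation method is used, so linear interpolation between the nearest observed years, and carrying the nearest observed value to the edges of the panel, are assumptions made for this illustration. The function name and the threshold parameter are hypothetical.

```python
def fill_or_omit(series, max_missing=2):
    """Fill gaps in a municipality's yearly outcome series, or omit it.

    `series` is a list of yearly values with None for missing years.
    Returns the filled list, or None if the municipality is omitted
    because more than `max_missing` years are missing.
    """
    missing = [i for i, value in enumerate(series) if value is None]
    known = [(i, value) for i, value in enumerate(series) if value is not None]
    if len(missing) > max_missing or not known:
        return None

    filled = list(series)
    for i in missing:
        left = [(j, v) for j, v in known if j < i]
        right = [(j, v) for j, v in known if j > i]
        if left and right:
            # Assumption: linear interpolation between the nearest observed years.
            (j0, v0), (j1, v1) = left[-1], right[0]
            filled[i] = v0 + (v1 - v0) * (i - j0) / (j1 - j0)
        else:
            # Assumption: at the edges, carry the nearest observed value.
            filled[i] = (left[-1] if left else right[0])[1]
    return filled


# Hypothetical five-year GPA series for two municipalities.
kept = fill_or_omit([210.0, None, 220.0, 225.0, None])
omitted = fill_or_omit([None, None, None, 200.0, 210.0])
```

In this sketch a municipality with three or more missing years out of five is dropped, which corresponds to the ‘majority of the years’ rule in the text.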
I have created seven panels for the different types of outcomes used in this study. In each panel
I assess which municipalities need to be omitted. Because of this most of the samples differ in size.
Accordingly, the results of each panel need to be analyzed separately. The panel datasets I have
constructed for the main analysis are:
Main analysis panels:
Panel 1 - Final grade outcomes
Panel 2 – Share of students in a municipality’s public schools
Panel 3 - National standardized test outcomes
Panels 1-3 are used for estimating the overall effects of the school mergers. However, the results
may conceal adverse effects for certain student groups. There may be substantial variation in how
students respond to treatment because of their background characteristics. Initiatives to decrease school
segregation aim to target groups of students that are disadvantaged. It is therefore interesting to explore potential
discrepancies in treatment effects among different student groups. I have obtained data on student
outcomes divided on certain characteristics and created the following panels:
Heterogeneous treatment effects analysis panels:
Panel 4 – final grade outcomes divided on gender
Panel 5 – final grade outcomes divided on parental education level
Panel 6 – final grade outcomes divided on Swedish and migrant background
There are some notes to make on the heterogeneous treatment effect panels. In panel 5 the final
grade outcomes are reported for two student populations divided on parental education level. The first
group is ‘students of parents with an upper-secondary school education’, which includes students for
whom the highest level of education that either parent has is a secondary school or upper-secondary
school education. The second group is ‘students of parents with a university education’,
which includes students with at least one parent who attended university. Panel 5 differs
considerably from the other panels as it only includes the time period 2014/2015-2016/2017.⁴
² I am sometimes able to calculate the real value of (..) for the number of students in a certain student population when I
have the real value of the total population and the real value of the number of students in the other student population.
I cannot calculate the true value of any of the outcome variables since they need to be calculated using individual data that I do
not have access to.
³ The data for some municipalities may not have any (..) values for outcome measures based on the total ninth-grade student
population, but the municipality may need to be omitted from the panel when the outcomes are divided on student background characteristics.
⁴ The group “students of parents with an upper-secondary education” did not include students of parents who just had a
secondary school education before the scholastic year of 2014/2015. Instead, these students were included in their own group
in the SIRIS data. In the raw data it was not possible to recalculate the outcome measures for the three groups in 2012/2013-
2013/2014 in order to make the measures comparable with the two groups reported after 2014.
In panel 6 the final grade outcomes are reported for two student populations that are divided on
Swedish and migrant background. The first group is ‘students with Swedish background’, which
includes students born in Sweden with at least one parent born in Sweden. The second group is ‘students
with migrant background’, which includes students born outside of Sweden who have migrated to
Sweden with their families.⁵
4.2 Outcome variables
This study focuses on the effect of the secondary public-school mergers on two different types of
outcomes. Firstly, I want to investigate the effect on the municipality-level 9th grade student academic
performance. To do so I look at the final grade outcomes and national standardized test results. The final
grades that the students receive at the end of ninth grade constitute the basis for admission to upper-secondary
school. The purpose of the national standardized tests is mainly to provide teachers across
the country with the same basis for grade assessment. The tests are marked and graded according to
national guidelines.
There are some differences between the two measures for the students’ academic performance.
The final grades are based not only on the student’s result on the national standardized test, but also the
student’s overall performance in 7th to 9th grade. Hence, the result that the student gets on the national
standardized test does not need to be equivalent to the final grade that the student receives. In this sense
treatment may have different effects on these two outcomes. I therefore analyze both. In detail, the
outcome variables that measure student academic performance are:
(i) Average final grade GPA
- The municipality-level average GPA. This outcome may be seen as a measure of the
average performance of the total student population. Only students who passed 9th grade
are included in the data. Hence in my panels the highest GPA value is 340 while the
lowest is 10.
(ii) The share of students that passed all subjects in ninth grade
- The municipality-level share of students that did not receive any (F) or (-) grading. This
outcome may be seen as a measure that captures the effect of treatment on students in
the top of the student skill distribution.
(iii) The share of students eligible for a vocational education
- To be eligible for a university preparatory program in upper secondary school a student
has to pass twelve out of seventeen subjects. In total the student needs to pass eight
subjects to become eligible for admission to a vocational school program. This outcome
may be seen as a measure that captures the effect of treatment on students in the bottom
of the student skill distribution.
⁵ SIRIS has data on a third student population distinction: students who are born in Sweden with both parents born outside
of Sweden. Including this group caused the data to contain missing values for a large portion of the municipalities due to the
issue with the (..) coding. This meant that almost all treatment municipalities would have needed to be excluded. Therefore, I
do not include this student group in my study.
(iv) Average national standardized test GPA (in English, math, Swedish)
- The municipality-level average test GPA. The grading on the test is recalculated into
points. Receiving an A in a subject yields 20 points. Receiving an F yields 0 points. All
students who participated are included in the data. Thus, the average test GPA varies
between 0 and 20.
(v) The share of students who passed the national standardized test (in English, math, Swedish)
- A measure for the municipality-level share of students who received grades A-E on the
tests out of all students who participated (received A-F).
Secondly, I want to examine the effect that the public secondary school mergers have on the students’
choice of enrolling in a public school instead of choosing an independent school alternative. The
outcome variable for the effect of the school mergers on the share of students in the public-school sector is:
(vi) The share of 7th grade students enrolled in public school in a municipality
- The value is calculated by taking the number of 7th graders enrolled in the public schools
and dividing it by the total number of 7th graders in each municipality each year.
The reason why I examine the behavior of 7th graders is because it is common in Sweden to divide
grades 1-3, 4-6 and 7-9 into different schools (Swedish Agency for Education, 2017). In this case, the
parents have to choose a new school for the child between the 6th and 7th grade. The parents then have
the opportunity to choose an independent school option instead of a public one, and vice versa. The
merged schools that are included in this study comprise grades 7 to 9. Using the share of 9th graders may
be considered fruitless due to the fact that students and parents may be hesitant towards switching
schools right before the last year of secondary school.
4.3 Treatment and control groups
According to the SIRIS database there are twelve municipalities that have implemented the
merger during the time period of my study. These municipalities are presented in table 1. Column (2)
shows the scholastic year that the merged secondary public school opened, and column (3) shows the
name of the new school. The number of students in the scholastic year of 2016/2017 is shown in column
(4). In 2016 the average number of students in public secondary schools in Sweden was 217
(Swedish Agency for Education, 2017). All of the merged schools have a larger body of students than the
national mean. The two largest schools are located in Katrineholm and Nyköping, where the student
population consists of over 1,000 students.
For the six panels that I use in this study the treatment group differs in size. This is due to the
varying nature of the outcome variables. The municipalities included in each treatment group are
presented in table 2.
When I estimate the effect of the secondary school mergers on student academic achievement the
requirement for a municipality to be included in the treatment group is that the merged school has been
open for at least three years. As noted earlier, the merged school in Nyköping opened in 2014/2015.
This implies that the ninth-grade student cohort in the scholastic year of 2016/2017 was the first cohort
that had the possibility to attend the merged school throughout 7th to 9th grade. It would be optimal to be
completely certain that the student cohort did attend the merged school during these years in order to
eliminate the effect that attending another school may have on the students’ ninth-grade outcomes. Since
I only have access to municipality-level data I cannot be certain of this. It is, for instance, possible that
a small number of students switched schools sometime between 7th and 9th grade.⁶
Table 1. Treatment municipalities
              (1)       (2)               (3)                   (4)        (5)
Municipality  Official  Scholastic year   Name of               Number of  Independent
              mun. key  of school merger  secondary school      students   schools in
                                                                (2016)     municipality
Flen          0483      2015/2016         Stenhammarskolan      427        No
Götene        1471      2015/2016         Liljestensskolan      464        No
Hedemora      2083      2013/2014         Vasaskolan            419        Yes
Katrineholm   0483      2016/2017         Järvenskolan          1,055      Yes
Lilla Edet    1462      2015/2016         Fuxernaskolan         136        Yes
Munkedal      1430      2013/2014         Kungsmarksskolan      343        No
Nyköping      0480      2014/2015         Nyköpings högstadium  1,261      Yes
Skara         1495      2015/2016         Viktoriaskolan        557        Yes
Surahammar    1907      2013/2014         Hammarskolan          320        No
Säffle        1785      2015/2016         Tegnérskolan          524        No
Årjäng        1765      2015/2016         Nordmarkens skola     349        No
Älmhult       0765      2014/2015         Linnéskolan           433        Yes
Örkelljunga   1257      2013/2014         Kungsskolan           271        Yes
The data is gathered from the Swedish National Agency for Education’s SIRIS database. The official municipality
keys in column (1) are identification numbers provided by the Swedish Tax Agency. The number of students for
each school as shown in column (4) is the number of students that are enrolled in each school on the 15th of October
2016.
As shown in table 2, the treatment group that I use when I examine the effect of the school mergers
on the students’ academic outcomes only includes municipalities that implemented the merger in 2014
or earlier. This is to make sure that the students have had the possibility to attend the merged school
from 7th to 9th grade. These municipalities are: Hedemora, Nyköping, Surahammar, Älmhult,
Örkelljunga and Munkedal.
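The selection rule for the academic-outcome treatment group can be sketched as a simple filter on the merger years in table 1. Each scholastic year is represented by its autumn year (2014 for 2014/2015), which is a coding choice made for this illustration. The function name and parameters are hypothetical.

```python
# Autumn year of the scholastic year in which the merged school opened (table 1).
MERGER_YEAR = {
    "Flen": 2015, "Götene": 2015, "Hedemora": 2013, "Katrineholm": 2016,
    "Lilla Edet": 2015, "Munkedal": 2013, "Nyköping": 2014, "Skara": 2015,
    "Surahammar": 2013, "Säffle": 2015, "Årjäng": 2015, "Älmhult": 2014,
    "Örkelljunga": 2013,
}


def academic_treatment_group(last_year=2016, years_required=3):
    """Municipalities whose merged school has been open for at least
    `years_required` scholastic years by the last year of the panel, so that
    the last 9th grade cohort could attend it from 7th to 9th grade."""
    cutoff = last_year - years_required + 1
    return sorted(name for name, year in MERGER_YEAR.items() if year <= cutoff)


treatment_group = academic_treatment_group()
```

With the panel ending in 2016/2017 and three required years, the cutoff is 2014/2015, which reproduces the six municipalities listed above.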
In column (5) we see that the treatment group differs when the final grade outcomes
are divided on parental education level, as it includes: Flen, Götene, Skara, Säffle, Årjäng and Lilla
Edet. These municipalities are the ones where the merged schools opened in the fall of 2015.⁷
In column (6) we see that there are only three municipalities included in the treatment group when
the final grade outcomes are divided on students with Swedish and migrant backgrounds. This is due to
the fact that the outcome data divided on Swedish and migrant background includes many missing
values, and as a result some of the treatment municipalities needed to be excluded from the data. The
municipalities included in the treatment group in this panel are: Hedemora, Nyköping and Älmhult.
When I estimate the effect of the school mergers on the share of students in the public-school
sector, I impose another requirement on the municipalities that are included in the treatment group. The
requirement is that there actually exist independent schools in the municipality. In column (5) of table
1 we see that this is not the case in all municipalities. The students in these municipalities do not have the option of
attending an independent school. I do not include these municipalities in the treatment group.⁸ As shown
in table 2, column (2), the municipalities included in this treatment group are: Hedemora, Nyköping,
Skara, Älmhult, Örkelljunga and Lilla Edet.
⁶ This would be if they switched from an independent school or moved into the municipality. In contrast to public schools,
independent schools do not have to admit all applicants each year, and there is usually a queuing system. The independent
schools may presumably be reluctant to admit students when their classes are already filled according to the queue.
⁷ As explained in the previous section this panel only includes the time period of 2014/2015-2016/2017. In this case the
constraint that schools have to have been open for at least three years needs to be loosened. Thus, these schools have been open
for two years, which means that the 2016/2017 9th grade student cohort only had the opportunity to attend the school in 8th
and 9th grade.
The control group consists of all municipalities in Sweden that had a constant number of public
secondary schools during the time-period of this study. The municipalities that experienced an increase
in the number of public schools during the time-period of this study are excluded. There are also a
number of municipalities that had a decreasing number of public schools during this time. These
municipalities are also excluded. In total 48 municipalities are excluded from all panels.⁹
Table 2. Municipalities in treatment groups
Municipalities included in treatment group for different panels
(1) Panel 1: Grade outcomes
(2) Panel 2: Share of 7th grade students in public schools
(3) Panel 3: National standardized test
(4) Panel 4: Grade outcomes div. on gender
(5) Panel 5: Grade outcomes div. on parental education level
(6) Panel 6: Grade outcomes div. on background
Municipality  (1)  (2)  (3)  (4)  (5)  (6)
Flen          No   No   No   No   Yes  No
Götene        No   No   No   No   Yes  No
Hedemora      Yes  Yes  Yes  Yes  No   Yes
Katrineholm   No   No   No   No   No   No
Lilla Edet    No   Yes  No   No   No   No
Munkedal      Yes  No   Yes  Yes  No   No
Nyköping      Yes  Yes  Yes  Yes  No   Yes
Skara         No   Yes  No   No   Yes  No
Surahammar    Yes  No   Yes  Yes  No   No
Säffle        No   No   No   No   Yes  No
Årjäng        No   No   No   No   Yes  No
Älmhult       Yes  Yes  Yes  Yes  No   Yes
Örkelljunga   Yes  Yes  Yes  Yes  No   No
This table shows which municipalities are included in the treatment group. Columns (1)-(3) present the treatment groups used
in the main analysis. Columns (4)-(6) show the treatment groups used in the heterogeneous treatment effects analysis.
⁸ None of the treated municipalities had an independent school that opened during the time period of this study.
⁹ The names and official municipality keys of these municipalities are shown in table A1 in appendix A.
4.4 Descriptive statistics
Table 3 shows summary statistics for each panel that is used in the main analysis. ‘Panel 1 - final
grade outcomes’ includes 235 municipalities over the five-year period of this study. The treatment group
consists of six municipalities, while the control group consists of 229 municipalities. Overall the mean
outcomes are slightly larger for the control group than for the treatment group. The largest difference is
in the mean number of ninth-graders in the municipality. Note that this variable includes the total ninth
grade student population in a municipality; thus it includes students in both the public and independent
school sectors. The difference is probably due to the fact that the control group includes some
metropolitan municipalities with large student populations while the treatment group includes almost
only small and rural municipalities. ‘Panel 2 – share of students in a municipality’s schools’ includes 93
municipalities, of which six constitute the treatment group and the remaining 87 municipalities the
control group. The share of 7th graders enrolled in the public-school sector in each municipality
is the outcome variable. The mean share in the total student population is 78.5 percent, and the means in the
treatment and control groups do not deviate much from this. The average size of the municipality-level
student population differs considerably between the control and treatment groups. The municipality-level
mean number of students in 7th grade in the control group is 2279 students, while in the treatment group
the mean number of 7th graders is 361 students. The difference is also large when comparing the mean
numbers of 7th graders in the public-school sector and the independent school sector. ‘Panel 3 – standardized
national test outcomes’ for the total student population consists of 231 municipalities, of which six
municipalities received treatment. The table shows that there are no considerable differences in mean
outcomes between the treatment and control groups.
Table 3. Summary statistics for final grade outcomes and share of students in a municipality
Columns (1)-(4): all municipalities. Columns (5)-(7): control municipalities. Columns (8)-(10): treated municipalities.
(1) (2) (3) (4) (5) (6) (7) (8) (9) (10)
N n mean sd n mean sd n mean sd
Panel 1: grade outcomes
Average GPA 1,175 235 219.8 15.35 229 219.9 15.39 6 211.1 8.715
Pass all subjects (%) 1,175 235 76.58 7.110 229 76.64 7.105 6 72.07 6.057
Eligible voc. school (%) 1,175 235 85.82 5.634 229 85.86 5.633 6 82.68 4.901
No. 9th grade students 1,175 235 1583 2653 229 1599 2666 6 335.0 240.7
Female students (%) 1,175 235 48.51 1.095 229 48.50 1.091 6 48.97 1.309
Immigrant background (%) 1,175 235 20.39 10.06 229 20.41 10.11 6 18.99 4.468
Students of parents with university education (%) 1,175 235 53.35 10.76 229 53.46 10.76 6 45.06 6.902
Panel 2: share of students in mun. pub. schools
7th grade students Pub. (%) 464 93 78.48 11.62 87 78.53 11.65 6 75.75 9.342
No. 7th grade students tot. 464 93 2247 3131 87 2279 3148 6 360.9 230.7
No. 7th grade students Pub. 464 93 1516 1917 87 1538 1926 6 256.6 145.1
No. 7th grade students Ind. 464 93 721.6 1208 87 732.1 1216 6 104.3 87.39
Female students (%) 464 93 48.62 0.840 87 48.62 0.831 6 49.16 1.190
Immigrant background (%) 464 93 21.89 10.47 87 21.91 10.54 6 20.73 4.295
Students of parents with university education (%) 464 93 57.28 9.305 87 57.48 9.218 6 45.68 6.929
Panel 3: national test
English test:
Average grade point 1,155 231 14.44 0.839 225 14.44 0.844 6 14.42 0.529
Share pass test 1,155 231 95.15 2.987 225 95.15 2.991 6 95.34 2.852
Math test:
Average grade point 1,155 231 10.92 1.382 225 10.92 1.390 6 10.81 0.930
Share pass test 1,155 231 84.65 8.302 225 84.67 8.329 6 83.74 6.847
Swedish test:
Average grade point 1,155 231 12.99 0.967 225 12.99 0.970 6 12.82 0.798
Share pass test 1,155 231 94.01 3.796 225 94.01 3.809 6 94.31 3.134
Control variables:
Female students (%) 1,155 231 48.31 1.658 225 48.30 1.660 6 48.86 1.457
Immigrant background (%) 1,155 231 16.57 7.849 225 16.53 7.890 6 18.18 5.142
Students of parents with university education (%) 1,155 231 45.93 10.18 225 46.02 10.23 6 41.32 5.931
No. 9th grade students
The data is gathered from the Swedish National Agency for Education’s SIRIS database. All summary statistics are based on
the scholastic years of 2012/2013-2016/2017. N shows the number of observations in the total dataset, while n shows the
number of municipalities.
5. Method
5.1 Empirical framework
This study aims to examine the effect of the secondary school mergers on the municipality-level
student outcomes. Treatment occurs when the municipality goes from having several public secondary
schools to having one school. When estimating this type of treatment effect, there may be unobservable
factors correlated with treatment and affecting the outcome that vary between municipalities or over
time. In order to isolate the effect of the secondary school mergers and reduce bias from selection on
unobservable factors, I employ a difference-in-differences estimation as my baseline approach. Thus, I
compare the change in outcomes in municipalities that have been subject to treatment with the change in
outcomes in untreated municipalities over time. My baseline estimation is the following:
$$Y_{mt} = \alpha + \beta T_{mt} + \gamma X_{mt} + \theta_m + \lambda_t + \varepsilon_{mt} \qquad (2)$$
𝑌𝑚𝑡 denotes the average student outcome in the municipality. 𝑇𝑚𝑡 is a dummy variable taking on
the value one if a municipality that is exposed to treatment is in a period where it receives treatment.
Hence, 𝑇𝑚𝑡 equals zero if a treated municipality is in a pre-treatment period or if the municipality is in
the control group. Accordingly, 𝛽 is the difference-in-difference estimator which is the parameter of
interest. 𝑋𝑚𝑡 is a vector that includes controls for the municipality demographics. 𝜃𝑚 denotes the
municipality-fixed effects. This eliminates bias arising from there being unobservable factors that vary
across municipalities but are constant over time. 𝜆𝑡 denotes the time-fixed effects. This eliminates bias
arising from unobservable factors that are constant across municipalities but vary over time. All
municipalities are weighted by their 9th grade student population in order to correct for differences in
population size.
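The mechanics of equation (2) can be illustrated with a minimal sketch on synthetic data. To keep it self-contained, the sketch leaves out the controls, the population weights and the clustered standard errors, and it assumes a balanced panel, in which case removing municipality and year means from both the outcome and the treatment dummy and regressing one on the other gives the two-way fixed effects estimate of the treatment coefficient. All numbers and function names are hypothetical.

```python
def two_way_demean(x):
    """Subtract unit means and period means and add back the grand mean.

    `x[m][t]` is a balanced panel (same number of periods for every unit)."""
    n_units, n_periods = len(x), len(x[0])
    unit_means = [sum(row) / n_periods for row in x]
    period_means = [sum(x[m][t] for m in range(n_units)) / n_units
                    for t in range(n_periods)]
    grand_mean = sum(unit_means) / n_units
    return [[x[m][t] - unit_means[m] - period_means[t] + grand_mean
             for t in range(n_periods)] for m in range(n_units)]


def did_estimate(y, treat):
    """Two-way fixed effects estimate of the treatment coefficient
    (no covariates, no weights, balanced panel)."""
    y_dm, d_dm = two_way_demean(y), two_way_demean(treat)
    numerator = sum(a * b for ry, rd in zip(y_dm, d_dm) for a, b in zip(ry, rd))
    denominator = sum(b * b for rd in d_dm for b in rd)
    return numerator / denominator


# Synthetic example: four municipalities, five years, true effect of -5 GPA points.
unit_effects = [200.0, 215.0, 230.0, 220.0]
year_effects = [0.0, 1.0, 3.0, 2.0, 4.0]
true_beta = -5.0
treat = [[0, 0, 1, 1, 1],   # merger in year 3
         [0, 0, 0, 0, 0],   # control
         [0, 0, 0, 0, 0],   # control
         [0, 1, 1, 1, 1]]   # merger in year 2
y = [[unit_effects[m] + year_effects[t] + true_beta * treat[m][t]
      for t in range(5)] for m in range(4)]

beta_hat = did_estimate(y, treat)
```

Since the synthetic outcome contains no noise and the effect is the same for both treated municipalities, the estimate recovers the assumed effect of -5 up to rounding error. With real data one would use a regression routine that also handles the controls, the weights and the clustering.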
Considering the fact that I study the effect of the public secondary school mergers on student
outcomes for the entire municipality, what I estimate is essentially a reduced-form effect. Thus, the
difference-in-difference estimator captures the overall impact of the initiative on the municipality’s
student performance or choice of school. By using outcome data for the entire municipality I reduce the
risk of ‘selection into treatment’ within the municipality. The selection problem could arise if the
students that attend public schools do so because they have certain background characteristics that
make them different from students that attend independent schools. By estimating the effect of the
initiative on the municipality-level student outcomes I am able to see how the opening of the merged
school is affecting the overall performance of the municipality’s student population, which could capture
any potential spillover effects from the initiative on students who attend independent schools in the
municipality.
In this study the standard errors are clustered on municipality-level to allow for the residuals
within municipalities to be correlated. In other words, I allow for the error term to be correlated within
a certain cluster of entities, yet uncorrelated across clusters. Donald and Lang (2007) point out that the
estimator of the standard error is only consistent when the numbers of observations in the
treatment and control groups are large. The fact that the time-period of this study is only five years long
may cause the standard errors to be inconsistent.
According to Conley and Taber (2011), having a large sample might not always cause the standard
errors to be consistent if the number of treated observations is small. They argue that in this case,
assuming that the standard error estimates are asymptotically normally distributed may not be
appropriate. They propose a framework for estimating the standard errors that makes them consistent
when the treatment group is small. Their strategy is to create an empirical distribution of the residuals
from the control group. This creates an asymptotically valid acceptance region for the null hypothesis that
the difference-in-difference approach estimates the true causal effect. If the estimates lie in the tails of
the empirical distribution, the null hypothesis can be rejected. The key idea is that although the control
group does not provide any information about the true value of the difference-in-difference estimator, it
can contain information about the distribution of the noise in the error term. If the number of
observations in the control group is large one could be able to make a consistent estimation of the
distribution of the point estimators up to the true value of the causal effect.
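The idea of using the control group to learn about the noise in the estimate can be illustrated with a simplified placebo-style check. This is not the full procedure from the cited paper. It is a rough sketch under strong assumptions: one treated municipality, no covariates, and common time effects that are removed by subtracting the control-group mean in each period. All function names and numbers are hypothetical.

```python
def post_minus_pre(series, first_post):
    """Difference between the post-period mean and the pre-period mean."""
    pre, post = series[:first_post], series[first_post:]
    return sum(post) / len(post) - sum(pre) / len(pre)


def placebo_check(treated, controls, first_post):
    """Compare the treated unit's estimate with the distribution of the same
    statistic computed for each control unit.

    Returns (beta_hat, share of control statistics at least as extreme)."""
    n_periods = len(treated)
    # Remove common time effects using the control-group mean in each period.
    period_means = [sum(c[t] for c in controls) / len(controls)
                    for t in range(n_periods)]

    def detrend(series):
        return [value - mean_t for value, mean_t in zip(series, period_means)]

    beta_hat = post_minus_pre(detrend(treated), first_post)
    placebos = [post_minus_pre(detrend(c), first_post) for c in controls]
    extreme = sum(abs(p) >= abs(beta_hat) for p in placebos)
    return beta_hat, extreme / len(placebos)


# Hypothetical outcomes for one treated and four control municipalities, four years.
controls = [[10, 10, 11, 11], [20, 21, 20, 21], [30, 29, 30, 31], [5, 5, 5, 5]]
treated = [15, 15, 10, 10]
beta_hat, share_extreme = placebo_check(treated, controls, first_post=2)
```

If the share of control statistics that are at least as extreme as the treated unit's estimate is small, the estimate lies in the tails of the empirical noise distribution. With only four control units the resolution of this share is coarse, which is why a large control group matters for this type of inference.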
I will use the municipality-level clustered standard errors throughout this study under the
assumption that they are consistent. This issue is nevertheless worth noting, since the number of
municipalities in the treatment group in the panels is at most six. The method that Conley and
Taber introduce would be a useful technique in a possible extension of the framework I employ
for my analysis. However, it is outside the scope and time frame of this thesis in its current form.
5.2 Underlying assumption
The key identifying assumption is that, aside from the secondary school merger, there are no
unobservable confounding factors that vary both over time and across municipalities and systematically
affect student outcomes in the treated municipalities. This essentially means that one would be able
to observe the same trends in potential student outcomes for both treatment and control
municipalities over time if treatment had not occurred. It is important to note that these trends
are not observable, because they refer to outcomes in an alternative world where the
secondary school mergers did not occur in the treated municipalities. Looking at what the
difference-in-difference estimator actually measures may bring clarity to this issue. The
difference-in-difference estimator, given by $\beta$ in this study, compares the change in student outcomes before and
after the mergers were implemented in the treated municipalities to the change in student outcomes over
the same time periods in municipalities where no school mergers occurred. Thus, the estimator is given
by:
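$$\beta = \left[E(Y_t^1 \mid m_T) - E(Y_{t-1}^0 \mid m_T)\right] - \left[E(Y_t^0 \mid m_C) - E(Y_{t-1}^0 \mid m_C)\right] \qquad (3)$$
In equation (3), $E(Y_t^1 \mid m_T)$ is the expected outcome of treatment municipality $m_T$ in period $t$, which is
when treatment occurs, and $m_C$ denotes a control municipality. The superscript indicates whether the outcome
is the one with treatment (1) or without treatment (0). This makes it clear that the difference-in-difference
estimator compares changes in expected outcomes between treatment and control group.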
The parallel trends condition says that if treatment had never occurred, the changes in expected
outcomes would have been equal between treatment and control group. Hence, this is the alternative world
where $Y_t^1$ is never realized. This scenario can be written as:
$$E(Y_t^0 \mid m_T) - E(Y_{t-1}^0 \mid m_T) = E(Y_t^0 \mid m_C) - E(Y_{t-1}^0 \mid m_C) \qquad (4)$$
Equation (4) essentially tells us that if treatment had never occurred, one would have been able to
observe the same trends in potential student outcomes for the municipalities in the treatment and control
group over time.
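The logic of equations (3) and (4) can be made concrete with a small numerical sketch. The GPA values below are hypothetical and chosen only to make the arithmetic easy to follow; the function name `did_estimate` is likewise an assumption for illustration.

```python
# Two-period, two-group difference-in-difference on made-up average GPA values.

def mean(values):
    return sum(values) / len(values)

def did_estimate(treat_pre, treat_post, control_pre, control_post):
    """Change in the treated group minus change in the control group, as in equation (3)."""
    treated_change = mean(treat_post) - mean(treat_pre)
    control_change = mean(control_post) - mean(control_pre)
    return treated_change - control_change

# Hypothetical municipality-level average GPA (not taken from the thesis data).
treat_pre = [220.0, 218.0]            # treated municipalities in t-1
treat_post = [214.0, 212.0]           # treated municipalities in t
control_pre = [221.0, 219.0, 217.0]   # control municipalities in t-1
control_post = [222.0, 220.0, 218.0]  # control municipalities in t

beta = did_estimate(treat_pre, treat_post, control_pre, control_post)

# Under parallel trends (equation (4)), the treated group would have followed
# the control group's change had the merger not occurred.
counterfactual_treated_post = mean(treat_pre) + (mean(control_post) - mean(control_pre))

print("DiD estimate:", beta)
print("Counterfactual treated mean in t:", counterfactual_treated_post)
```

In this example the treated group falls by 6 points while the control group rises by 1 point, so the estimate is -7 points. The counterfactual of 220 points is never observed, which is why the assumption in equation (4) cannot be verified directly.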
As noted earlier, we are not able to observe the alternative universe where the merger was never
implemented in the municipalities in question. However, it is possible to test the assumption by
conducting placebo tests on pre-treatment periods. The procedure of the parallel-trends test is to
estimate the effect of treatment on the sample in the period before treatment actually happened. This
requires that one has access to data on multiple years before the merger in order to look at
pre-treatment trends. I do not have access to enough pre-treatment data to observe any trends, but there
are some municipalities that have two pre-treatment periods.
In order to observe the parallel trends one needs to look at the pre-treatment trends over time, and
therefore it is not enough to have only one pre-treatment period. Since I do not have enough
pre-treatment periods, I cannot perform a placebo test that is representative of the entire treatment group. I
am however able to explore whether the parallel trends assumption holds for the municipalities in the
treatment group that have more than one pre-treatment period by conducting a number of event studies.
In these event studies I plot the difference-in-difference point estimates where the binary treatment
variable has been interacted with the years before and after the public secondary school mergers were
implemented. The results of these tests will be further discussed in section 7.1.
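How the interaction terms for such an event study can be constructed is sketched below. This is an illustration under stated assumptions, not the exact code used for the analysis: the function name, the window of two years before and after the merger, the use of integer years, and the choice of t-1 as the omitted reference period are all hypothetical.

```python
# Build event-time (relative-year) indicators for one municipality-year observation.
# Assumptions: integer years, a window of [-2, +2], and t-1 as omitted reference period.

def event_time_dummies(year, merger_year, window=(-2, 2), reference=-1):
    """Return 0/1 indicators for each year relative to the merger.

    merger_year is None for municipalities that never merged their schools,
    in which case all indicators are zero.
    """
    dummies = {}
    for k in range(window[0], window[1] + 1):
        if k == reference:
            continue  # omitted category: estimates are measured relative to this year
        name = f"rel_year_{k:+d}"
        dummies[name] = int(merger_year is not None and year - merger_year == k)
    return dummies

# A municipality with a (hypothetical) merger in 2015, observed two years earlier.
two_years_before = event_time_dummies(2013, 2015)
# The same municipality observed in the reference year t-1: all indicators are zero.
reference_year = event_time_dummies(2014, 2015)
# A never-treated municipality: all indicators are zero in every year.
never_treated = event_time_dummies(2013, None)

print(two_years_before)
```

Regressing the outcome on these indicators together with municipality and time fixed effects gives one point estimate per relative year. An estimate close to zero for `rel_year_-2` is what one would hope to see if the pre-treatment trends were parallel.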
There are two reasons why an insignificant point estimate in year t-2 may still not be a valid
indicator that the parallel trends assumption holds. Firstly, the effect in year t-2 is calculated only
on the municipalities that have two pre-treatment periods, and therefore the point estimate does not reflect
the whole treatment group. Secondly, I am only able to observe the point estimate for one
pre-treatment period, and it may therefore not say much about the parallel trends over time.
5.3 Matching
The difference-in-difference approach compares the trend in average outcome between treatment
and control group. Essentially, I estimate the trend in average outcome for the control group and use
it as a proxy for the counterfactual outcome. How to find a suitable control group depends on the
assignment mechanism. If the school mergers are implemented randomly in the municipalities in
question, then on average the comparison of trends in potential outcomes between treatment and control
group should provide unbiased estimates. However, if there is self-selection into treatment, for example
because deteriorating school performance or demographic trends caused the municipalities to implement
the school merger, then non-treated and treated municipalities may have very different potential
outcomes, which would bias the estimates.
In order to create a control group that potentially provides a more suitable counterfactual, I want
to explore the alternative of restricting the control group. To do so I use propensity score matching to
create new panels that contain the treatment group and a control group that is more similar in terms of
municipality demographics. In the matching process each treated municipality is matched to the
untreated municipality with the closest propensity score. The propensity score is the probability that a
municipality receives treatment given the observable confounding factors.
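The matching step can be sketched as follows. The sketch assumes that propensity scores have already been estimated; the municipality labels, the scores and the function name `match_nearest` are hypothetical. Matching is done with replacement, so the same untreated municipality may serve as a match for more than one treated municipality, and the number of neighbors `m` is a parameter of the sketch.

```python
# Nearest-neighbor matching on an (already estimated) propensity score.
# Labels and scores are invented for illustration.

def match_nearest(treated_scores, control_scores, m=1):
    """Return, for each treated unit, the m controls with the closest propensity score.

    Matching is with replacement: a control can be matched to several treated units.
    Ties are broken alphabetically by control label to keep the result deterministic.
    """
    matches = {}
    for treated, t_score in treated_scores.items():
        ranked = sorted(
            control_scores,
            key=lambda c: (abs(control_scores[c] - t_score), c),
        )
        matches[treated] = ranked[:m]
    return matches

treated_scores = {"T1": 0.30, "T2": 0.35}
control_scores = {"C1": 0.10, "C2": 0.28, "C3": 0.33, "C4": 0.60, "C5": 0.36}

matches = match_nearest(treated_scores, control_scores, m=2)
print(matches)
```

In this example the control `C3` is selected for both treated units, which illustrates how matching with replacement reuses good matches instead of forcing a more distant neighbor into the control group.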
The matching is done on a set of pre-treatment covariates10. Ideally one wants to perform the
matching on pre-treatment trends, but due to the small number of pre-treatment periods available
to me I use the level of the covariates. Most of the panels used in this study have staggered treatment.
Therefore, some of the treated municipalities have two pre-treatment periods while others have just one.
In order to have the same number of nearest neighbors for all treatment municipalities within each panel,
the matching had to be done on covariates from the same pre-treatment period for the entire treatment
group. Because of this, the matching is done on covariates from the school year 2014/2015 for the
panel where the final grade outcomes are divided by parental education level, and 2012/2013 for all the
other panels.
As mentioned in the previous section, the number of treated municipalities is small. I therefore use
M-nearest neighbor matching with replacement, so that each treated municipality gets several control group
matches and I do not lose too many observations in the control group.11 I match each municipality in
the treatment group with the five untreated municipalities that have the closest propensity score. Because
‘Panel 6 – final grade outcomes divided on Swedish and migrant background’ has fewer than five
municipalities in the treatment group, for reasons discussed in section 4.3, the treated municipalities in
this panel are matched with their three nearest neighbors.12
10 The covariates that are used in the matching process for the student performance outcome panels: final grade or national
standardized test outcomes, the mean number of ninth-graders in the municipality, the number of the total compulsory school
student population (grades 1-9), the share of 9th grade students enrolled in the public-school sector, and the control variables
(see section 4.1). The pre-treatment covariates that are used in the matching process for the share of students in a municipality’s
public schools: the share of 7th graders in the public-school sector, the number of 7th grade students in the public and
independent school sector, the mean number of students enrolled in 7th grade, the number of the total compulsory school student
population (grades 1-9), the share of 9th graders enrolled in the public-school sector, and the control variables (see section 4.1).
11 Selecting the number of neighbors to match on involves a bias-variance trade-off (Stuart 2010). Matching each
observation with a larger number of neighbors may decrease variance due to a larger sample size. On the other hand, including
The propensity score matching is done in several steps. First, I reshape
the panel data from long to wide format so that the covariates for each year can be treated as cross-sectional data. I then estimate
the propensity score using a logit model given the pre-treatment covariates. Based on the propensity
score I match each treated municipality with the untreated municipalities that have the closest scores. I
then create new panels that only include the treatment municipalities and their matched control group.
Using these panels, I estimate the following regression:
$$Y_{mt} = \alpha + \beta T_{mt} + \gamma X_{mt} + \eta_m + \lambda_t + \varepsilon_{mt} \qquad (5)$$
Equation (5) is in many ways similar to the baseline regression equation (2), but instead of
including municipality-fixed effects, I include match-fixed effects, 𝜂𝑚. The match-fixed effects are a
set of dummies that each takes on the value 1 for a treatment municipality and its matched control
municipalities. What happens is that each treatment municipality is paired with its matches in order to
isolate the within-matches variation. Thus, the difference-in-difference estimator 𝛽 compares the
average changes in expected outcomes between each treatment municipality and its matched control
group. The estimation of equation (5) is done using the same municipality-level clustered standard errors
as in the baseline difference-in-difference approach, to allow for autocorrelation within each
municipality over time.
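The construction of the match-fixed effects can be illustrated with a short sketch. The municipality labels and the function name are hypothetical. How a control municipality that is matched to more than one treated municipality should be coded is not spelled out above; the sketch simply assumes that such a municipality belongs to every match group it was selected for.

```python
# Build match-fixed-effect dummies from a matching result.
# Assumption: a control matched to several treated units gets a 1 in each of their dummies.

def match_dummies(matches):
    """matches maps each treated municipality to its list of matched controls.

    Returns a dict: municipality -> {dummy name: 0/1}, with one dummy per treated
    municipality that equals 1 for the treated unit and its matched controls.
    """
    groups = sorted(matches)
    members = set(matches)
    for controls in matches.values():
        members.update(controls)
    table = {}
    for municipality in sorted(members):
        table[municipality] = {
            f"match_{g}": int(municipality == g or municipality in matches[g])
            for g in groups
        }
    return table

# Hypothetical matching result: control C3 is matched to both treated municipalities.
example_matches = {"T1": ["C2", "C3"], "T2": ["C5", "C3"]}
dummies = match_dummies(example_matches)

for municipality, row in dummies.items():
    print(municipality, row)
```

In a regression such as equation (5), one of these dummies is omitted as the baseline, and the remaining ones absorb level differences between match groups so that the treatment effect is identified from variation within each group.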
When I use this matched difference-in-difference approach it is necessary to adjust the parallel
trends assumption to account for the propensity score matching. In any matching process, one needs to
assume that conditional on the covariates, the potential average outcomes are independent of treatment.
However, with the matched difference-in-difference approach there may still be systematic differences
between treatment and control group due to time-constant unobservable factors. Following Smith and
Todd (2005), for the difference-in-difference estimator in equation (5) to be unbiased it is necessary that
the following holds:
$$E(Y_t^0 - Y_{t-1}^0 \mid m_T, P) = E(Y_t^0 - Y_{t-1}^0 \mid m_C, P) \qquad (6)$$
Equation (6) is basically a rearrangement of the parallel trends assumption in equation (4).
However, it now states that conditional on the propensity score, 𝑃, there would have been no systematic
differences in trends in student outcomes between a treated municipality and its matched control group
if treatment had not occurred. This means that conditional on the propensity score treatment becomes
as if randomly assigned.
As a final note, it is important to keep in mind that the matching process
creates the matched control group based on an estimate of the propensity score. This estimate is only valid if
nothing other than the pre-treatment covariates determines the assignment of treatment. It does not mean
that the municipalities in the matched control group are the most appropriate group to use when estimating
the true counterfactual. Matching is merely a way of creating a control group that may be more
appropriate than the control group that includes all available municipalities.
more neighbors may increase bias, since each additional neighbor will by definition be less similar
to the treated municipality. However, matching with replacement may reduce this bias since control municipalities that are
similar to many treated municipalities can be used more than once.
12 The restriction is set by the data analysis software I use (Stata) and entails that the number of neighbors, m, must be an integer greater
than or equal to 1 but no larger than the number of observations in the smallest treatment or control group.
6. Results
6.1 Difference-in-difference results
Tables 6 to 9 show the main results of my baseline difference-in-difference estimation. From these
tables we see that adding controls to the estimation causes the point estimates to change considerably in
magnitude. Some of the effects go from being negative to positive, and the majority of the point
estimates that are statistically significant turn insignificant when the controls are added. The conclusion
from this is that the effects are not robust to adding controls.
The controls are municipality-level student characteristics for the entire compulsory school
student population. I include them because they may be correlated with treatment and are determinants
of student outcomes. The fact that the point estimates change when the controls are added implies that
there are trends in the municipality demographics that covary with treatment. If this is true, there is a risk
that the parallel trends assumption does not hold for my baseline approach. I therefore conclude that the
point estimates from the difference-in-difference estimation are too unstable to draw any causal
inferences from. In the next section I proceed to conduct the estimation with the matched control group
sample, which might provide a better counterfactual.
Table 6. The effect of the public secondary school merges on final grade outcomes
| | (1) Average GPA | (2) Average GPA | (3) Share pass all subjects | (4) Share pass all subjects | (5) Share eligible vocational school | (6) Share eligible vocational school |
|---|---|---|---|---|---|---|
| School Merge | 6.34448*** | -1.27710 | -3.79141*** | -1.74579 | -3.63043*** | -0.59773 |
| | (1.12327) | (1.62727) | (0.88028) | (1.07557) | (0.80625) | (0.73332) |
| Female students (%) | | 0.88418** | | 0.29587 | | 0.13711 |
| | | (0.40296) | | (0.22285) | | (0.17157) |
| Students with migrant background (%) | | 0.31101*** | | -0.61143*** | | -0.64918*** |
| | | (0.09887) | | (0.05322) | | (0.04576) |
| Students of parents with university education (%) | | 3.35363*** | | 0.24584* | | -0.19628*** |
| | | (0.44943) | | (0.13179) | | (0.06161) |
| Constant | 219.71255*** | -8.33393 | 76.63152*** | 61.60395*** | 85.86972*** | 102.88760*** |
| | (0.01448) | (35.76285) | (0.01135) | (13.99783) | (0.01039) | (8.99046) |
| Observations | 1,175 | 1,175 | 1,175 | 1,175 | 1,175 | 1,175 |
| R-squared | 0.00251 | 0.34649 | 0.00419 | 0.16531 | 0.00482 | 0.26157 |
| Number of municipalities | 235 | 235 | 235 | 235 | 235 | 235 |
| Municipality Fixed Effects | yes | yes | yes | yes | yes | yes |
| Time Fixed Effects | yes | yes | yes | yes | yes | yes |
| Adjusted R-squared | 0.00166 | 0.344 | 0.00334 | 0.162 | 0.00397 | 0.259 |
Robust standard errors in parentheses *** p<0.01, ** p<0.05, * p<0.1.
The table presents the results of the difference-in-difference estimation of the effect of secondary school mergers on the
municipality-level final grade results for the total 9th grade student population. Standard errors are clustered on municipalities.
The dataset includes 235 municipalities, of which six municipalities received treatment. All municipalities are weighted on
ninth-grade student population.
Table 7. The effect of the public secondary school merges on nat. stand. test average GPA
| | (1) English test | (2) English test | (3) Math test | (4) Math test | (5) Swedish test | (6) Swedish test |
|---|---|---|---|---|---|---|
| School Merge | 0.113 | 0.040 | -0.665*** | -0.104 | -0.309 | -0.462* |
| | (0.257) | (0.250) | (0.205) | (0.239) | (0.260) | (0.265) |
| Female students (%) | | 0.013 | | -0.043 | | 0.058** |
| | | (0.019) | | (0.040) | | (0.027) |
| Students with migrant background (%) | | -0.018*** | | -0.099*** | | 0.010 |
| | | (0.005) | | (0.009) | | (0.007) |
| Students of parents with university education (%) | | 0.076*** | | -0.085*** | | 0.061*** |
| | | (0.013) | | (0.020) | | (0.020) |
| Constant | 14.945*** | 10.638*** | 11.386*** | 20.023*** | 13.464*** | 7.221*** |
| | (0.003) | (1.277) | (0.003) | (2.475) | (0.003) | (1.726) |
| Observations | 1,155 | 1,155 | 1,155 | 1,155 | 1,155 | 1,155 |
| R-squared | 0.000 | 0.091 | 0.003 | 0.127 | 0.002 | 0.046 |
| Number of municipalities | 231 | 231 | 231 | 231 | 231 | 231 |
| Municipality Fixed Effects | yes | yes | yes | yes | yes | yes |
| Time Fixed Effects | yes | yes | yes | yes | yes | yes |
| Adjusted R-squared | -0.000419 | 0.0874 | 0.00188 | 0.124 | 0.00121 | 0.0427 |
Robust standard errors in parentheses *** p<0.01, ** p<0.05, * p<0.1
The table presents the results of the difference-in-difference estimation of the effect of secondary school mergers on the
municipality-level average GPA on the national standardized tests in English, math and Swedish. Standard errors are clustered
on municipalities. The dataset includes 231 municipalities, of which six municipalities received treatment. All municipalities
are weighted on ninth-grade student population.
Table 8. Effect of public secondary school merges on the share of students that pass the tests
| | (1) English test | (2) English test | (3) Math test | (4) Math test | (5) Swedish test | (6) Swedish test |
|---|---|---|---|---|---|---|
| School Merge | -0.226 | 0.197 | -4.738*** | -1.611 | -1.039 | -0.006 |
| | (0.607) | (0.562) | (0.962) | (1.194) | (1.341) | (1.353) |
| Female students (%) | | -0.003 | | -0.324 | | 0.338*** |
| | | (0.079) | | (0.263) | | (0.116) |
| Students with migrant background (%) | | -0.135*** | | -0.588*** | | -0.164*** |
| | | (0.020) | | (0.059) | | (0.029) |
| Students of parents with university education (%) | | 0.066* | | -0.400*** | | -0.170*** |
| | | (0.039) | | (0.100) | | (0.040) |
| Constant | 96.318*** | 95.709*** | 86.185*** | 135.234*** | 95.454*** | 91.482*** |
| | (0.008) | (4.745) | (0.012) | (15.115) | (0.017) | (6.349) |
| Observations | 1,155 | 1,155 | 1,155 | 1,155 | 1,155 | 1,155 |
| R-squared | 0.000 | 0.059 | 0.003 | 0.095 | 0.001 | 0.093 |
| Number of municipalities | 231 | 231 | 231 | 231 | 231 | 231 |
| Municipality Fixed Effects | yes | yes | yes | yes | yes | yes |
| Time Fixed Effects | yes | yes | yes | yes | yes | yes |
| Adjusted R-squared | -0.000748 | 0.0559 | 0.00238 | 0.0922 | 0.000303 | 0.0901 |
Robust standard errors in parentheses *** p<0.01, ** p<0.05, * p<0.1
The table presents the results of the difference-in-difference estimation of the effect of secondary school mergers on the
municipality-level share of students that passed the national standardized tests in English, math and Swedish. Standard errors
are clustered on municipalities. The dataset includes 231 municipalities, of which six municipalities received treatment. All
municipalities are weighted on ninth-grade student population.
Table 9. Effect on the share of 7th grade students enrolled in public school sector
| | (1) Share 7th graders | (2) Share 7th graders |
|---|---|---|
| School Merge | -2.322* | -1.705 |
| | (1.184) | (1.537) |
| Female students (%) | | -0.598 |
| | | (0.501) |
| Students with migrant background (%) | | -0.052 |
| | | (0.155) |
| Students of parents with university education (%) | | -0.425 |
| | | (0.293) |
| Constant | 78.519*** | 133.076*** |
| | (0.020) | (33.713) |
| Observations | 464 | 464 |
| R-squared | 0.003 | 0.041 |
| Number of municipalities | 93 | 93 |
| Municipality Fixed Effects | yes | yes |
| Time Fixed Effects | yes | yes |
| Adjusted R-squared | 0.00119 | 0.0323 |
Robust standard errors in parentheses *** p<0.01, ** p<0.05, * p<0.1
The table presents the results of the difference-in-difference estimation of the effect
of secondary school mergers on the share of 7th graders enrolled in a municipality’s
public school sector. Standard errors are clustered on municipalities. The dataset
includes 93 municipalities, of which six municipalities received treatment. All
municipalities are weighted on 7th grade student population.
6.2 Matched difference-in-difference results
In this section I present the results of the difference-in-difference estimation with match-fixed
effects. As shown in equation (5) in section 5.3, each treatment municipality has been paired with its
matches in order to isolate the within-matches variation. In tables 10 to 13 the point estimates are in general
robust to adding controls. This is to be expected since the propensity score matching has been done in
part on these covariates. In addition, many of the results are now statistically significant. I will
henceforth treat the matched difference-in-difference approach as my main specification and treat tables
10 to 13 as my main results. My inference will be based on the specifications where the controls are
added.
Table 10 shows the matched difference-in-difference estimation of the effect of the public
secondary school mergers on the final grade outcomes. Column (2) shows the effect of the public
secondary school mergers on average GPA, which is negative and statistically significant at the
one-percent level. This result shows that the school mergers reduced the average GPA in the municipality
by approximately 6.85 grade points. The final grade GPA ranges from 10 points to 340 points. From
the summary statistics in table 3 we see that the mean average GPA in all Swedish municipalities is 219.8
points. Using this mean GPA as a baseline for comparison, the municipalities where the public
secondary school mergers were implemented would have a mean average GPA of approximately 213 points. Hence,
the reduction of the average GPA compared to the national mean is not substantial. Nevertheless, the
effect still goes in the opposite direction of what should be the aim of the mergers.
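The back-of-the-envelope comparison above can be reproduced as follows. The inputs are the national mean from table 3, the point estimate from column (2) of table 10 and the GPA range stated above; expressing the effect relative to the mean and to the range is an additional illustration and not a result reported elsewhere in this thesis.

```python
# Reproduce the comparison of the estimated effect with the national mean GPA.

national_mean_gpa = 219.8      # mean average GPA, summary statistics (table 3)
merger_effect = -6.85423       # point estimate, table 10 column (2)
gpa_min, gpa_max = 10, 340     # range of the final grade GPA as stated in the text

implied_mean_gpa = national_mean_gpa + merger_effect
relative_to_mean = merger_effect / national_mean_gpa * 100
relative_to_range = merger_effect / (gpa_max - gpa_min) * 100

print(f"Implied mean GPA after merger: {implied_mean_gpa:.1f}")
print(f"Effect relative to national mean: {relative_to_mean:.1f} percent")
print(f"Effect relative to the GPA range: {relative_to_range:.1f} percent")
```

The implied mean rounds to the 213 points mentioned above, and the effect corresponds to roughly three percent of the national mean.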
Column (4) shows the effect of the public secondary school mergers on the share of students that
pass all subjects. This outcome can be seen as a measure that captures the effect of treatment on students
in the top of the skill distribution. Column (4) shows that the public secondary school mergers caused
the share of students that pass all subjects in the treated municipalities to decrease by approximately 3
percentage points. This result is statistically significant at the five-percent level.
Column (6) shows the effect of the public secondary school mergers on the share of students that
are eligible for vocational school education. To be eligible for a university preparatory program in upper
secondary school a student has to pass twelve out of seventeen subjects. To be eligible for a
vocational school program, however, the student only needs to pass eight subjects. This outcome may therefore be
seen as a measure that captures the effect of treatment on students in the bottom of the skill distribution.
The public secondary school mergers caused the share of students that are eligible for a vocational
school education in the treated municipalities to decrease by approximately 3.12 percentage points.
Overall, the results show that the public secondary school mergers have a negative effect on the
average student academic performance. The results also show that the school mergers have a negative
effect on the share of students that pass all subjects and on the share of students that are eligible for vocational
school education. This tells us that the school mergers reduce the share of students
that perform well in school while they increase the share of students that have an insufficient number of
passing grades to continue to any type of upper-secondary education.
Table 10. Match: effect of the public secondary school merges on final grade outcomes
| | (1) Average GPA | (2) Average GPA | (3) Share pass all subjects | (4) Share pass all subjects | (5) Share eligible vocational school | (6) Share eligible vocational school |
|---|---|---|---|---|---|---|
| School Merge | -4.80930** | -6.85423*** | -2.17036 | -2.96959** | -2.11787** | -3.11854*** |
| | (1.96141) | (1.58724) | (1.53822) | (1.08111) | (0.99324) | (1.01127) |
| Female students (%) | | -0.48051 | | -0.31350 | | -0.33040 |
| | | (0.69849) | | (0.45246) | | (0.35298) |
| Students with migrant background (%) | | -0.38653*** | | -0.41395*** | | -0.31014*** |
| | | (0.13254) | | (0.10957) | | (0.05764) |
| Students of parents with university education (%) | | 0.41958*** | | 0.02567 | | 0.14621 |
| | | (0.14120) | | (0.09267) | | (0.08966) |
| Constant | 204.77588*** | 217.64812*** | 74.40609*** | 93.38755*** | 87.96647*** | 102.32726*** |
| | (1.35868) | (34.38351) | (1.35380) | (22.81323) | (0.85013) | (16.94302) |
| Observations | 155 | 155 | 155 | 155 | 155 | 155 |
| R-squared | 0.29892 | 0.40969 | 0.14659 | 0.29923 | 0.20555 | 0.36462 |
| Number of municipalities | 31 | 31 | 31 | 31 | 31 | 31 |
| Match Fixed Effects | yes | yes | yes | yes | yes | yes |
| Time Fixed Effects | yes | yes | yes | yes | yes | yes |
| Adjusted R-squared | 0.394 | 0.472 | 0.130 | 0.220 | 0.186 | 0.281 |
Robust standard errors in parentheses *** p<0.01, ** p<0.05, * p<0.1.
The table presents the results of the difference-in-difference estimation with match-fixed effects. It shows the effect of the
secondary school mergers on the municipality-level final grade results for the total 9th grade student population. Standard errors
are clustered on municipalities. The control group has been obtained through propensity score matching on pre-treatment
covariates. The match-fixed effects are a set of dummies that each takes on the value 1 for a treatment municipality and its
matched control municipalities. The match-dummy for Surahammar is the omitted baseline. The dataset includes 31
municipalities, of which six municipalities received treatment. All municipalities are weighted on ninth-grade student
population.
Tables 11 and 12 show the matched difference-in-difference estimation of the effect of treatment
on the standardized national test outcomes in English, math and Swedish. Table 11 shows the effect of
the public secondary school mergers on average test GPA. All effects turn statistically insignificant and
close to zero when the controls are added. The largest effect of the initiative on average test GPA is
found in column (4), where we observe that the mergers cause the math test average GPA to increase by
0.333 grade points. The national standardized test GPA ranges from 0 to 20 points. The summary statistics in
table 3 show a national mean average GPA of 14.44 grade points. Compared to this, the mean
math GPA in the treated municipalities would be 14.77 points. This is not a considerable difference.
Table 12 shows the effect of the public secondary school mergers on the share of students that
pass the national standardized tests. The effect is again largest for the outcome on the math test. The
point estimate, shown in column (4), tells us that the public secondary school mergers increase the share
of students in the municipality that pass the national standardized test in math by approximately 1.1
percentage points.
Overall, the public secondary school mergers have small and positive effects on the national
standardized test outcomes. It is interesting that the effects of the school mergers on the national
standardized test outcomes go in the opposite direction of the effects on the final grade outcomes.
However, all point estimates in tables 11 and 12 are statistically insignificant when the controls are
added to the estimation. The point estimates are so small in magnitude that it is unlikely that the effects
have any economic significance.
Table 11. Match: effect of public secondary school merges on nat. standard test average GPA
| | (1) English test | (2) English test | (3) Math test | (4) Math test | (5) Swedish test | (6) Swedish test |
|---|---|---|---|---|---|---|
| School Merge | 0.157 | 0.029 | 0.530** | 0.333 | 0.156 | 0.004 |
| | (0.244) | (0.172) | (0.241) | (0.275) | (0.210) | (0.164) |
| Female students (%) | | 0.012 | | 0.021 | | 0.042 |
| | | (0.031) | | (0.094) | | (0.049) |
| Students with migrant background (%) | | -0.003 | | -0.041** | | -0.008 |
| | | (0.009) | | (0.017) | | (0.012) |
| Students of parents with university education (%) | | 0.055*** | | 0.042*** | | 0.053*** |
| | | (0.006) | | (0.012) | | (0.006) |
| Constant | 14.322*** | 11.599*** | 10.962*** | 8.727* | 12.723*** | 8.669*** |
| | (0.369) | (1.593) | (0.313) | (4.630) | (0.251) | (2.433) |
| Observations | 175 | 175 | 175 | 175 | 175 | 175 |
| R-squared | 0.245 | 0.616 | 0.402 | 0.557 | 0.099 | 0.429 |
| Number of municipalities | 35 | 35 | 35 | 35 | 35 | 35 |
| Match Fixed Effects | yes | yes | yes | yes | yes | yes |
| Time Fixed Effects | yes | yes | yes | yes | yes | yes |
| Adjusted R-squared | 0.198 | 0.585 | 0.366 | 0.521 | 0.0443 | 0.383 |
Robust standard errors in parentheses *** p<0.01, ** p<0.05, * p<0.1
The table presents the results of the difference-in-difference estimation with match-fixed effects. It shows the effect of the
secondary school mergers on the municipality-level average GPA on the national standardized tests in English, math and
Swedish. Standard errors are clustered on municipalities. The control group has been obtained through propensity score
matching on pre-treatment covariates. The match-fixed effects are a set of dummies that each takes on the value 1 for a treatment
municipality and its matched control municipalities. The match-dummy for Hedemora is the omitted baseline. The dataset
includes 35 municipalities, of which six municipalities received treatment. All municipalities are weighted on ninth-grade
student population.
Table 12. Match: effect of public sec. school merges on the share of students that pass the tests
| | (1) English test | (2) English test | (3) Math test | (4) Math test | (5) Swedish test | (6) Swedish test |
|---|---|---|---|---|---|---|
| School Merge | 0.382 | 0.214 | 2.156 | 1.094 | 1.062* | 0.589 |
| | (0.689) | (0.593) | (1.309) | (1.528) | (0.622) | (0.604) |
| Female students (%) | | -0.043 | | 0.142 | | 0.219 |
| | | (0.130) | | (0.578) | | (0.214) |
| Students with migrant background (%) | | 0.012 | | -0.305*** | | 0.009 |
| | | (0.034) | | (0.100) | | (0.054) |
| Students of parents with university education (%) | | 0.107*** | | 0.118* | | 0.182*** |
| | | (0.023) | | (0.068) | | (0.024) |
| Constant | 95.259*** | 93.052*** | 85.310*** | 77.028** | 94.621*** | 76.749*** |
| | (0.910) | (6.615) | (1.972) | (28.464) | (0.764) | (10.011) |
| Observations | 175 | 175 | 175 | 175 | 175 | 175 |
| R-squared | 0.152 | 0.308 | 0.410 | 0.514 | 0.138 | 0.353 |
| Number of municipalities | 35 | 35 | 35 | 35 | 35 | 35 |
| Match Fixed Effects | yes | yes | yes | yes | yes | yes |
| Time Fixed Effects | yes | yes | yes | yes | yes | yes |
| Adjusted R-squared | 0.101 | 0.252 | 0.374 | 0.474 | 0.0850 | 0.301 |
Robust standard errors in parentheses *** p<0.01, ** p<0.05, * p<0.1
The table presents the results of the difference-in-difference estimation with match-fixed effects. It shows the effect of the
secondary school mergers on the share that pass the national standardized tests in English, math and Swedish. Standard errors
are clustered on municipalities. The control group has been obtained through propensity score matching on pre-treatment
covariates. The match-fixed effects are a set of dummies that each takes on the value 1 for a treatment municipality and its
matched control municipalities. The match-dummy for Hedemora is the omitted baseline. The dataset includes 35
municipalities, of which six municipalities received treatment. All municipalities are weighted on ninth-grade student
population.
In addition to examining the effect of the public secondary school mergers on student academic
achievement outcomes, I want to investigate the impact on the share of students that are enrolled in the
public-school sector. Table 13 shows the effect of the mergers on the municipality-level share of 7th
grade students enrolled in public schools. The point estimate in column (2) shows that the public
secondary school mergers caused the share of 7th graders enrolled in the public-school sector in the
treated municipalities to decrease by approximately 10 percentage points. This result is statistically
significant at the five-percent level.
This tells us that the merging of the public secondary schools caused a smaller share of parents to
enroll their children in the public-school sector in 7th grade. The conclusion from this result is that the
municipalities seem not to have been successful in making the public-school alternative more attractive
by implementing the school mergers.
Table 13. Match: effect on the share of 7th grade students enrolled in public school sector
| | (1) Share 7th graders | (2) Share 7th graders |
|---|---|---|
| School Merge | -10.371*** | -9.981** |
| | (3.683) | (3.841) |
| Female students (%) | | -0.649 |
| | | (0.801) |
| Students with migrant background (%) | | -0.019 |
| | | (0.254) |
| Students of parents with university education (%) | | -0.265 |
| | | (0.267) |
| Constant | 84.661*** | 128.375*** |
| | (4.006) | (33.751) |
| Observations | 130 | 130 |
| R-squared | 0.278 | 0.309 |
| Number of municipalities | 26 | 26 |
| Match Fixed Effects | yes | yes |
| Time Fixed Effects | yes | yes |
| Adjusted R-squared | 0.217 | 0.231 |
Robust standard errors in parentheses *** p<0.01, ** p<0.05, * p<0.1
The table presents the results of the difference-in-difference estimation with match-fixed
effects. It shows the effect of secondary school mergers on the share of 7th graders enrolled in a
municipality’s public school sector. Standard errors are clustered on municipalities. The control
group has been obtained through propensity score matching on pre-treatment covariates.
The dataset includes 26 municipalities, of which six municipalities received treatment.
All municipalities are weighted on ninth-grade student population.
6.3 Heterogeneous treatment effects
In this section I will present the results of my heterogeneous treatment effect analysis. The aim of
this analysis is to examine whether the public secondary school mergers affect the students’ academic
outcomes in various ways due to their background characteristics. The estimation is done using the
matched difference-in-difference estimation specified in section 5.3. Just as in the main analysis the
point estimates in this section are robust to adding controls. This is reasonable since the propensity score
matching was done in part on the control variables. The tables in this section include only the results
where the controls have been added.
Tables 14 to 16 present the point estimates of the effect of the school mergers on final grade
outcomes for the various student populations. The analysis is done with separate panels that differ in
size. Because of this, the results cannot be compared across panels. I have also included results from
estimating the effect of treatment on the outcomes for the entire student population. When comparing
these results it becomes obvious that the magnitude of this effect differs substantially across the tables.
Thus, the effect of treatment on the outcomes for the entire student population does not seem to be robust
to changes in the composition of the treatment and control groups. These estimates also differ from the
main analysis results, presented in table 10. This suggests that my main results are sensitive to differences
in sample size and time period specification. I can therefore not say for certain that my analysis yields
the true magnitude of the heterogeneous treatment effects. I will however attempt to draw some
inferences about the direction of the effects. It is important to keep in mind that my heterogeneous
treatment effect analysis does not seem to be representative of my main analysis sample.
Table 14 presents the heterogeneous treatment effects on final grade outcome measures for the
female and male student populations. My results show that the public secondary school mergers have
an overall positive effect on the outcomes for female students. The school mergers have a negative effect
on male average GPA and the share of male students that pass all subjects in the treated municipalities.
However, all the effects are close to zero and statistically insignificant.
Table 15 presents the effect of the public secondary school mergers on the final grade outcomes
for students with different parental education levels. The first student population is ‘students of parents
with an upper-secondary school education’, which includes students for which the highest level of
education that both parents have is either a secondary school or an upper-secondary school education.
The second student population is ‘students of parents with a university education’, which includes
students who have at least one parent that attended university. Note that the panel used for this analysis
only includes the time period 2014/2015-2016/2017. Overall, the public secondary school mergers
increase final grade outcomes for students whose parents have at most an upper-secondary school
education. The overall effects of the school mergers on the final grade outcomes for
students with parents that have attended university are negative.
Table 16 presents the effects of the public secondary school mergers on final grade outcomes for
students with different native backgrounds. The first student population is ‘students with Swedish
background’, which includes students born in Sweden with at least one parent born in Sweden. The
second student population is ‘students with migrant background’, which includes students born abroad
who have migrated to Sweden. The results from table 16 show that the effects of the public secondary
school mergers on average GPA and the share of students that pass all subjects are negative and
statistically significant for both student populations in the treated municipalities. However, students with
migrant background seem to be especially disadvantaged. The share of students eligible for vocational
school is reduced for both student groups. However, the effect is larger in magnitude and statistically
significant for students with migrant background.
As mentioned in section 4.3, the panel used for estimating the results in table 16 only includes three
treated municipalities. Because of this the treatment municipalities are matched with the three untreated
municipalities that have the closest propensity score. Hence, the sample is small and it is good to keep
in mind that the inferences I draw from table 16 may lack external validity. In essence, this means that
the results may not be applicable to student populations on a national scale.
Table 14. Heterogeneous treatment effects: female and male students
                     Average GPA                               Share pass all subjects                   Share eligible vocational school ed.
                     (1)           (2)           (3)           (4)           (5)           (6)           (7)           (8)           (9)
                     Total student Female        Male          Total student Female        Male          Total student Female        Male
                     population    students      students      population    students      students      population    students      students
School Merge         -0.47913      1.13816       -1.46768      -0.61935      0.51509       -1.63239      0.35698       0.51569       0.61265
                     (2.29997)     (3.20869)     (2.10874)     (1.36129)     (1.24397)     (1.63231)     (1.16501)     (1.06004)     (1.41028)
Constant             163.75372***  220.10257***  153.48204***  71.95804**    93.27816***   62.89278      90.20224***   106.77226***  82.01550**
                     (43.62464)    (55.52969)    (44.23858)    (32.99998)    (29.03651)    (39.58713)    (25.63700)    (23.62900)    (30.94624)
Observations         155           155           155           155           155           155           155           155           155
R-squared            0.43951       0.40607       0.42207       0.31936       0.21620       0.35765       0.42459       0.29993       0.44973
Number of mun.       31            31            31            31            31            31            31            31            31
Controls             yes           yes           yes           yes           yes           yes           yes           yes           yes
Match Fixed Effects  yes           yes           yes           yes           yes           yes           yes           yes           yes
Time Fixed Effects   yes           yes           yes           yes           yes           yes           yes           yes           yes
Adjusted R-squared   0.388         0.351         0.369         0.257         0.144         0.298         0.372         0.235         0.399
Robust standard errors in parentheses *** p<0.01, ** p<0.05, * p<0.1
The table presents the heterogeneous treatment effect results of the matched difference-in-difference estimation for the female
and male student populations. Standard errors are clustered on municipalities. The dataset includes 31 municipalities, of which
six municipalities received treatment. The estimation is done with match-fixed effects. The match-dummy for Surahammar is
the omitted baseline. All municipalities are weighted on ninth-grade student population.
Table 15. Heterogeneous treatment effects: parental education level
                     Average GPA                                     Share pass all subjects                         Share eligible vocational school ed.
                     (1)             (2)             (3)             (4)             (5)             (6)             (7)             (8)             (9)
                     Total student   Students of     Students of     Total student   Students of     Students of     Total student   Students of     Students of
                     population      parents with    parents with    population      parents with    parents with    population      parents with    parents with
                                     upper-sec ed.   university ed.                  upper-sec ed.   university ed.                  upper-sec ed.   university ed.
School Merge         0.45498         2.97003         -4.62459*       2.72206         4.87829***      -1.30829        3.04038*        4.52500**       -0.50355
                     (3.92063)       (3.93087)       (2.35355)       (2.15676)       (1.68483)       (3.65570)       (1.72287)       (1.87814)       (0.95529)
Constant             206.94134***    248.28899***    235.29010***    90.41178***     124.26268***    88.26894***     81.93546***     102.35472***    79.05545***
                     (36.59284)      (53.98529)      (32.28426)      (23.31593)      (31.13542)      (25.62171)      (19.00266)      (25.00144)      (15.98070)
Observations         85              85              85              85              85              85              85              85              85
R-squared            0.38320         0.14810         0.36951         0.45213         0.27302         0.28449         0.52692         0.28517         0.31458
Number of mun.       15              15              15              15              15              15              15              15              15
Controls             yes             yes             yes             yes             yes             yes             yes             yes             yes
Match Fixed Effects  yes             yes             yes             yes             yes             yes             yes             yes             yes
Time Fixed Effects   yes             yes             yes             yes             yes             yes             yes             yes             yes
Adjusted R-squared   0.299           0.0314          0.283           0.377           0.173           0.186           0.462           0.187           0.221
Robust standard errors in parentheses *** p<0.01, ** p<0.05, * p<0.1
The table presents the results of the matched difference-in-difference estimation of the heterogeneous treatment effects for
students with different parental education levels. Standard errors are clustered on municipalities. The dataset includes 15
municipalities, of which five municipalities received treatment. The estimation is done with match-fixed effects. The
match-dummy for Surahammar is the omitted baseline. This analysis deviates from the overall time period of this study as it only
includes the years 2014/2015-2016/2017. All municipalities are weighted on ninth-grade student population.
Table 16. Heterogeneous treatment effects: Swedish and migrant background
                     Average GPA                               Share pass all subjects                   Share eligible vocational school ed.
                     (1)           (2)           (3)           (4)           (5)           (6)           (7)           (8)           (9)
                     Total student Students      Students      Total student Students      Students      Total student Students      Students
                     population    Swedish       migrant       population    Swedish       migrant       population    Swedish       migrant
                                   background    background                  background    background                  background    background
School Merge         -9.0177***    -8.4655***    -22.2502**    -4.5848***    -3.72486**    -10.3706***   -2.5571**     -0.95203      -8.52463**
                     (2.65597)     (2.46459)     (8.14918)     (1.30229)     (1.34728)     (2.46222)     (0.95184)     (0.80538)     (3.70287)
Constant             48.84611      -46.31296     -228.44578    -87.57863**   -105.26161**  -12.49452     -37.24689     -53.45861*    -179.41618
                     (76.24301)    (63.50702)    (241.13988)   (35.16392)    (36.81842)    (132.41945)   (27.32641)    (23.95376)    (108.74968)
Observations         50            50            50            50            50            50            50            50            50
R-squared            0.87036       0.85785       0.71374       0.73230       0.58500       0.63015       0.75380       0.64839       0.55691
Number of mun.       8             8             8             8             8             8             8             8             8
Controls             yes           yes           yes           yes           yes           yes           yes           yes           yes
Match Fixed Effects  yes           yes           yes           yes           yes           yes           yes           yes           yes
Time Fixed Effects   yes           yes           yes           yes           yes           yes           yes           yes           yes
Adjusted R-squared   0.833         0.821         0.640         0.664         0.479         0.535         0.691         0.558         0.443
Robust standard errors in parentheses *** p<0.01, ** p<0.05, * p<0.1
The table shows the matched difference-in-difference estimation of the heterogeneous treatment effects for students with Swedish
and migrant background. ‘Students with Swedish background’ includes students born in Sweden with at least one parent also
born in Sweden. ‘Students with migrant background’ includes students born outside of Sweden with both parents also born
outside of Sweden. Standard errors are clustered on municipalities. The dataset includes 8 municipalities, of which three
municipalities received treatment. All municipalities are weighted on ninth-grade student population.
7. Sensitivity analysis
7.1 Event studies
It is interesting to investigate the impact that the public secondary school mergers have over time,
since this captures both the pre-treatment trends and the yearly effects of treatment in the
aftermath of the public secondary school mergers. I do this by conducting multiple event studies. For
this analysis I use the data on the matched samples. I plot the difference-in-difference point estimates
where the binary treatment variable has been interacted with the years before and after the public
secondary school mergers were implemented. The year before the school merger was implemented, t-1,
is the omitted base category and is given by the dashed zero-line. In my analysis I compare the
coefficients to the zero-line. For example, a point estimate above the zero-line shows a positive effect
of treatment on outcome compared to the baseline year, and the opposite holds for a point below. The
event studies for each outcome measure are shown in figures 1 and 2.
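The interaction terms described above can be sketched as a set of event-time indicators, one per year relative to the merger, with t-1 left out as the base category. The function name, the integer year coding and the window from t-2 to t+6 are illustrative assumptions and not the code used for the estimations.

```python
def event_time_dummies(year, merger_year, treated, window=(-2, 6), omitted=-1):
    """Return {label: 0/1} event-time indicators for one municipality-year.

    ASSUMPTIONS (hypothetical coding): years are integers, relative time is
    year - merger_year, and control municipalities have treated=False.
    """
    dummies = {}
    for k in range(window[0], window[1] + 1):
        if k == omitted:
            continue  # t-1 is the omitted base category, so it gets no indicator
        label = f"t{k:+d}"  # e.g. "t-2", "t+0", "t+3"
        # The indicator is the treatment dummy interacted with relative year k.
        dummies[label] = int(treated and (year - merger_year) == k)
    return dummies


# A treated municipality observed two years after a hypothetical 2014 merger.
row = event_time_dummies(year=2016, merger_year=2014, treated=True)
print([label for label, value in row.items() if value == 1])
```

Because t-1 has no indicator, every coefficient on these dummies is read relative to the year before the merger, which is why the zero-line in the figures is the point of comparison.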
Figure 1 – Event studies for final grade outcomes and the share of 7th graders in public schools
The graphs show the point estimates for the effect of treatment for each year. Year t-1 is the omitted category. The red line
represents that treatment was implemented one semester before the first outcomes are observed in the data. The t-2 point
estimates are based on outcome data from the municipalities that have two pre-treatment periods; Nyköping and Älmhult.
Consequently, these two municipalities are not part of the t+6 point estimates.
(a) Average GPA (b) Share of students that pass all subjects
(c) Share of students eligible for a vocational school education
(d) Share of 7th graders in public-school sector
Figure 2. Parallel trends plots – national standardized test outcomes
The graphs show the point estimates for the effect of treatment for each year. Year t-1 is the omitted category. The red
line represents that treatment was implemented one semester before the first outcomes are observed in the data. The t-2 point
estimates are based on outcome data from the municipalities that have two pre-treatment periods; Nyköping and Älmhult.
Consequently, these two municipalities are not part of the t+6 point estimates.
(a) English test – Average GPA (b) Math test – Average GPA
(c) Swedish test – Average GPA (d) English test – share of students that pass
(e) Math test – share of students that pass (f) Swedish test – share of students that pass
7.1.1 Parallel trends
The confidence I can place in the results of my matched difference-in-difference estimation rests on
my key identifying assumption. It says that, conditional on the propensity score, there would have been
no systematic differences in trends in student outcomes between a treated municipality and its matched
control group if treatment had not occurred. This is essentially an extension of the parallel trends
assumption defined in section 5.2. If the estimated pre-treatment effects of the school mergers are
statistically insignificant, this indicates that the assumption holds. I do not have enough pre-treatment data
to observe any trends, but I am able to observe one pre-treatment year, which is denoted t-2 in my
event study figures. If the coefficient for year t-2 is statistically insignificant, this provides some support
for the parallel trends assumption. Note that the effect in year t-2 is calculated based on the
municipalities that have two pre-treatment periods, and therefore the coefficient does not reflect the
whole treatment group.
The event studies in figure 1 show that for the majority of the outcomes the effect of treatment is
statistically insignificant in year t-2. It is only for the average GPA outcome that the coefficient is
significantly different from zero. Hence, this suggests that there are parallel
pre-treatment trends for the share of 7th graders in the public-school sector and for all final grade
outcomes except the average GPA.
In figure 2 we see that the majority of the pre-treatment effects for the national standardized test
outcomes are statistically significant. The coefficients in year t-2 that are not significantly different from
zero are for the average GPA and the share that pass the national standardized test in English. Thus, the
parallel trends assumption seems to hold for the effects of the public secondary school mergers on the
national standardized tests in English, but not for the tests in math or Swedish. This is a sign that
the small and insignificant effects of treatment on the national standardized test outcomes presented in
my main analysis may be biased.
My baseline difference-in-difference estimation also rests on the assumption of parallel trends.
The fact that the results using this approach were unstable and insignificant indicates that the parallel
trends assumption may be violated. I have therefore conducted the same event studies using the same
data that I used in my baseline estimation. These are shown in figure C1, in appendix C. From these
figures we see that many of the pre-treatment effects are statistically significant, which further indicates
that I cannot trust that the parallel trends assumption holds for my baseline difference-in-difference
estimations.
7.1.2 Dynamic treatment effects
The student cohorts may be affected differently by treatment depending on how long they are exposed
to it. Thus, the magnitude of the effect of the public secondary school mergers on student outcomes
may increase with time. For instance, the first student cohort exposed to treatment only attended 9th
grade in the merged school. It is likely that the magnitude of the effect of the school mergers is smaller
for this group than for the subsequent student cohorts. The effect of the mergers on the share of 7th
graders enrolled in the public-school sector may also be affected by exposure time to treatment. This is
because the parents may form different opinions about the merged public schools when they become
more established, which in turn may affect their choice of school for their children.
Panels (a), (b) and (c) in figure 1 show that the effects of the school mergers on the final grade
outcomes are initially positive but insignificant and close to zero. However, the positive effect turns
negative for the second student cohort, and the negative effect continues to increase in magnitude. For
the share of students that pass all subjects and the share of students that are eligible for a vocational
school education the point estimates seem to stabilize at a constant negative effect in year t+2, while the
negative effect of treatment on average GPA seems to continue to increase in magnitude.
As mentioned earlier, the effect of the school mergers on the share of 7th graders enrolled in the
public-school sector may also be affected by the exposure time. Panel (d) shows the effect of the school
mergers on the share of 7th grade students in the public-school sector. The point estimates are
continuously negative and increase in magnitude until year t+2. This provides some evidence that a larger share
of the parents in the municipalities chose to enroll their children in independent schools instead of the
merged public school in the year that followed the implementation of the merger.
Figure 2 shows the event studies for the national standardized test outcomes. For most of the
outcomes the coefficients are initially positive, but decline over the years and turn negative in year t+2
or become zero in year t+3. The only effect of treatment that seems to be altogether positive is on the
share of students that pass the Swedish test. The fact that many of the coefficients are insignificant and
close to zero is in line with the main result findings.
7.2 Placebo tests on control variables
The controls that I include in the estimations are municipality-level student characteristics for the
entire compulsory school student population. The reason why I include them is because they are
potential determinants of the student outcomes and correlated with treatment. For example, there might
be positive trends in average GPA when the share of students with parents that have a university
education increases in the municipality because these students may possess certain abilities that cause
them to perform well in school. On the other hand, a municipality with a large share of students with
parents who have migrated to Sweden might also have a larger share of students enrolled in the public
school sector. This may happen if there is information asymmetry between parents, such that
those who have never attended a school in Sweden themselves do not have full information about
the options that come with the ‘free choice’ principle.
In order to isolate the causal effect of the public secondary school mergers on student outcomes,
these variables need to be included and held constant in the regressions. The aim is to have conditional
mean independence; conditional on the control variables, the public secondary school mergers become
as if randomly assigned to the municipalities in question.
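A compact way to write this condition is given below. The notation is introduced here only for illustration and is not taken from the estimation sections: $D_{mt}$ is the merger indicator for municipality $m$ in year $t$, $X_{mt}$ the vector of control variables, and $\varepsilon_{mt}$ the error term of the outcome equation.

```latex
\[
E\left[\varepsilon_{mt} \mid D_{mt}, X_{mt}\right] = E\left[\varepsilon_{mt} \mid X_{mt}\right]
\]
```

In words, once the controls are held fixed, knowing whether a municipality merged its schools carries no further information about the unobserved determinants of the outcome.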
In this section I want to examine whether the municipality-level student characteristics change
when the public secondary school mergers are implemented. If this were the case, it would imply that
treatment covaries with the municipality demographics. This might happen if residents with certain
characteristics move in or out of the municipality as a reaction to the merger. It might also happen if the
implementation of the public secondary school mergers happens because of the existence of certain
trends in municipality demographics. This means that because of the school mergers, the treated and
untreated municipalities experience different trends in student characteristics, which is an obvious threat
to the parallel trends assumption.
In order to test whether the municipality-level student characteristics covary with the school
mergers I perform a placebo test where I estimate the effect of treatment using the control variables as
the outcomes. The tests are done using the matched panels. Essentially, this is a test to see whether my
matching has been successful in creating a control group that works as a suitable counterfactual to the
treatment group. The results are shown in table 17. From this table we see that the public secondary
school mergers do not have a significant effect on the control variables. This is true for all panels. A
statistically significant effect of treatment on the control variables would have indicated that the effects
in the main analysis stems from changes in the municipality demographics that covary with treatment.
Instead, the results from my placebo tests indicate that the treatment and control groups
experience the same trends in municipality-level student characteristics. This result supports the view that
I have been able to isolate the causal effect of the school mergers on student outcomes by performing
estimations with my matched samples.
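The claim that none of the placebo effects is significant can be verified directly from the coefficients and standard errors reported in table 17. The sketch below computes the nine t-ratios. As a labeled simplification it uses the normal critical value 1.645 for the ten-percent level; the corresponding t critical values with clustered standard errors are larger, so any ratio below 1.645 is insignificant under either reference distribution.

```python
# (coefficient, standard error) for "School Merge" in table 17, columns (1)-(9).
placebo = [
    (0.424, 0.384), (-2.036, 2.316), (3.484, 2.394),   # final grade outcomes panel
    (0.553, 0.496), (-2.425, 2.004), (2.064, 3.163),   # national standardized test panel
    (0.124, 0.328), (-0.449, 1.005), (-0.028, 0.740),  # share in public schools panel
]

# ASSUMPTION: normal approximation for the two-sided 10% critical value.
THRESHOLD = 1.645

t_ratios = [coef / se for coef, se in placebo]
largest = max(abs(t) for t in t_ratios)

print(f"largest |t| = {largest:.2f}")
print("any placebo effect significant at 10%:", largest > THRESHOLD)
```

The largest ratio belongs to column (3), the share of students with university-educated parents in the final grade outcomes panel, and it still falls short of the threshold.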
Table 17. Control placebo test
                     Final grade outcomes panel                National standardized test panel          Share students in public schools panel
                     (1)           (2)           (3)           (4)           (5)           (6)           (7)           (8)           (9)
                     Female        Migrant       Parents w.    Female        Migrant       Parents w.    Female        Migrant       Parents w.
                     students (%)  BG (%)        uni. ed. (%)  students (%)  BG (%)        uni. ed. (%)  students (%)  BG (%)        uni. ed. (%)
School Merge         0.424         -2.036        3.484         0.553         -2.425        2.064         0.124         -0.449        -0.028
                     (0.384)       (2.316)       (2.394)       (0.496)       (2.004)       (3.163)       (0.328)       (1.005)       (0.740)
Constant             48.139***     11.574***     35.114***     48.781***     10.709***     39.081***     49.008***     18.288***     43.702***
                     (0.322)       (1.673)       (1.529)       (0.526)       (1.434)       (2.365)       (0.171)       (0.497)       (0.261)
Observations         155           155           155           175           175           175           130           130           130
R-squared            0.215         0.300         0.256         0.134         0.277         0.151         0.128         0.745         0.359
Number of mun.       31            31            31            35            35            35            26            26            26
Match Fixed Effects  yes           yes           yes           yes           yes           yes           yes           yes           yes
Time Fixed Effects   yes           yes           yes           yes           yes           yes           yes           yes           yes
Adjusted R-squared   0.224         0.185         0.539         0.0812        0.233         0.0989        0.0927        0.735         0.334
Robust standard errors in parentheses *** p<0.01, ** p<0.05, * p<0.1
This table shows a placebo test of the effect of treatment on the control variables used in the main analysis. This is done using
the difference-in-difference estimation with match-fixed effects. Standard errors are clustered on municipalities. The panels
consist of six treated municipalities and the matched control groups. All municipalities are weighted on ninth-grade student
population.
7.3 ‘Worst case’ and ‘best case’ variables
In section 4.1 I discussed the issue of the ~100 values. When four or fewer students
failed to become eligible for vocational school education or failed to pass the national standardized test,
SIRIS denotes these variables with the value ~100.
In order to solve this issue I created two different types of measures for these outcome variables.
The first one is the ‘worst case scenario’ variable where I assume that four students fail to pass and then
calculate the true value of this case by dividing four by the student population and subtracting it from 1.
In the ‘worst case’ variable all the ~100 values are recoded with the value of my calculations. The second
one is the ‘best case scenario’ variable where I assume that all students pass, so I recode all ~100 values
to 100. The measure that I have used in all my estimations is the ‘worst case scenario’ variable since it
is likely that at least some students should fail to pass in each municipality. Yet it is interesting to conduct
estimations using both measures to see whether the effects differ.
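The two recoding rules can be written out as a small function. This is a sketch of the arithmetic described above and not the code used to build the dataset; the function name and the choice to express the result in percent are assumptions made for the illustration.

```python
def recode_masked_share(n_students, scenario):
    """Replace a masked '~100' share with a number, expressed in percent.

    ASSUMPTION: n_students is the size of the relevant student population
    in the municipality, and shares are measured on a 0-100 scale.
    """
    if scenario == "best":
        # Best case: every student passes.
        return 100.0
    if scenario == "worst":
        # Worst case: exactly four students fail, the most that '~100' allows.
        return 100.0 * (1 - 4 / n_students)
    raise ValueError("scenario must be 'best' or 'worst'")


# Hypothetical municipality with 80 ninth graders and a masked value.
print(recode_masked_share(80, "worst"))
print(recode_masked_share(80, "best"))
```

The gap between the two scenarios is four divided by the student population, so it matters most for small municipalities and shrinks as the population grows.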
I present the results of my baseline difference-in-difference estimation with the two outcome
variables in table D1 in appendix D (see footnote 13). The point estimates do not differ in sign but there are some small
differences in magnitude. However, neither of the two outcome measures consistently
produces a larger effect. Thus, the changes in magnitude seem to be arbitrary. Furthermore, the ‘best
case scenario’ variable tends to produce larger standard errors. From this I conclude that using the ‘worst
case scenario’ variable should not bias the estimates relative to using the ‘best case scenario’
variable.
13 Table D1 only includes the point estimates where no controls are added. The point estimates where controls are added did
not change my conclusion of the test. The same is true when running the matched difference-in-difference estimation on the
‘worst case’ and ‘best case’ variables.
8. Discussion and concluding remarks
The first aim of this study was to examine whether the public secondary school mergers were
efficient in increasing average student academic performance. For the municipalities that implement the
school mergers it is important that low-ability students get the help they need to pass secondary school
even though they attend a school with a larger number of students. Yet it is also important that the
students who would have performed well regardless of the initiative are not negatively affected by the
school merger. The results of my matched difference-in-difference estimation showed that the public
secondary school mergers had a negative effect on the municipality-level average GPA. The school
mergers also seem to reduce the share of students in the municipality that passed all subjects and the
share that were eligible for a vocational school education. This tells us that the municipalities that
implemented the school merger experienced a reduction in the share of students that resided in the top
of the skill distribution, while the share of students in the bottom increased. The take-home message
here is that the share of students that had the possibility to choose among a wide range of the upper-
secondary school programs shrank whilst the share of students that did not have enough passing grades
to continue on to upper-secondary school increased.
The second aim of this study was to examine whether a secondary school merger affects the share
of students that attend the public secondary school instead of choosing to enroll in an independent
school. I investigated this by estimating the effect of treatment on the share of 7th graders enrolled in the
municipality’s public-school sector. My results show that the public secondary school mergers caused
the share of 7th graders enrolled in the public-school sector to decrease by approximately 10 percentage
points. This implies that the initiative caused parents in the treated municipalities to seek another
alternative for their children rather than to enroll them in the merged schools. Along these lines, merging
the public secondary schools to one school does not seem to have made the public-school alternative
more competitive.
The confidence that I can place in these results depends on whether the parallel trends assumption is
satisfied. The results of my event studies indicate that the assumption does not hold for all outcomes.
However, when I move on to conduct a placebo test on the control variables for the municipality-level
student characteristics, the result shows that treatment has no statistically significant effect. This indicates
that treatment does not covary with municipality demographics in my matched panels, which seemed to
be an issue with my baseline difference-in-difference estimation. It also provides some evidence that the
treatment and control groups follow the same trends in pre-determined student characteristics. Hence
my matching strategy has created control groups that provide a more suitable counterfactual than using
all the municipalities with a constant number of secondary schools as controls.
Even though the positive effects of the school mergers on the students’ academic performance
seem to be absent, the initiative might still be an efficient desegregation tool. That is, if the school
mergers have a positive effect on academic outcomes for students who otherwise might experience
drawbacks from school segregation. I explored this by examining whether the public secondary
school mergers have heterogeneous treatment effects. My results show that public
secondary school mergers affect students’ final grade outcomes differently depending on their
background characteristics. However, my results also show that the effect of treatment on the outcomes
for the entire student population seems to be sensitive to changes in the composition of the treatment
and control group. I can therefore not say for certain that my estimations yield the true magnitude of any
heterogeneous treatment effects.
According to previous research, the student population that is negatively affected by school
segregation is students with a migrant background (Szulkin, Jonsson 2007; Nordin 2013; Grönqvist et
al 2015). The most prominent goal with the public secondary school merger in Nyköping municipality
was to reduce segregation among its youth population. This implies that the municipality was hoping
for an ‘integration effect’ which rests on the belief that the within-school peer effects are non-linear. In
other words, they believe that weak students experience positive effects from high-ability peers, while
high-ability students are not hurt by associating with low-ability classmates. The result of my
heterogeneous treatment effect analysis instead provides some evidence that the negative effect of the
school mergers on academic performance seems to be especially large for students with migrant
backgrounds. Due to the small sample size of the panel I use when I conduct this analysis it is good to
keep in mind that the magnitude of the true effect of the school mergers for students with migrant
backgrounds is uncertain and the results may lack external validity. However, there are some possible
explanations for why students with migrant backgrounds may not be benefiting from this type of
initiative. Firstly, creating one large public school in the municipality may not be the most efficient way
of helping students who might have a lower probability of scholarly success due to insufficient Swedish
language skills or other background factors. Potentially, this student population may benefit from
attending smaller schools where they will receive school resources that have a direct impact on their
academic performance. Secondly, under the assumption that students with migrant backgrounds would
benefit from attending a school with peers that have diverse backgrounds, the absence of the positive
impact of treatment on their academic performance might indicate that there is within-school segregation
that arises if the students form homogeneous sub-peer groups.
According to my results, the only group that seemed to benefit from the mergers is students of
parents who have at most an upper-secondary school education. The fact that students of parents with a
university education experience negative impacts from the merger may imply that these students benefit
from attending schools with a more homogeneous environment. These students would presumably have
attended a public school where students on average performed better than the municipality-mean if the
school mergers had not been implemented. Thus, in the absence of the initiative it is probable that these
students would have benefited from interacting with high-ability peers on a daily basis. The main results
indicate that the public school mergers had a negative effect on the share of students enrolled in public
schools. It might be the parents of students in the top of the skill distribution that choose to enroll their
children in independent schools instead. If the share of high-ability students in the public-school sector
is reduced, the positive peer effects that were anticipated from this group in the merged school might
be insufficient to increase average student performance. I am not able to look at the heterogeneous
treatment effects of the school merger initiative on the share of students in the public-school sector. One
interesting contribution for future research is to explore whether the initiative caused students with
certain background characteristics to seek admission elsewhere. If parents with high socioeconomic
status choose an independent school for their children as a reaction to the school mergers, the initiative
may actually increase segregation between the school sectors rather than eliminate it.
Investigating the effects of the public secondary school mergers on student outcomes has both
social and policy relevance. Had the school mergers succeeded in improving students’ academic
outcomes, they could have been an efficient and cost-reducing way for officials to handle the
potential imbalance in student performance across the public schools in a municipality. Furthermore, had
the school mergers been effective in promoting integration, there could have been positive outcomes
for students who are particularly hurt by school segregation.
The conclusion of this study is that the public secondary school mergers were not sufficient to
improve either average municipality-level student outcomes or the outcomes of the students
at the bottom of the skill distribution. This does not necessarily mean that students would not benefit
from efforts to eliminate school segregation, but rather that this type of effort might not be enough to do so.
There is also the question of whether peer effects actually have an impact on students’ academic
performance. Sacerdote (2011) argues that while the large body of peer effects research has established
some evidence that peers affect an individual’s social outcomes, it is important to further examine
whether peer effects have any significant impact on educational outcomes. Implementing a merger of
all public secondary schools in a municipality seems to rest heavily on the idea that desegregation will
happen because of social interaction amongst students with diverse backgrounds. In order for students
to truly benefit from the peer effects that potentially emerge, there might also need to be resources spent
on eliminating within-school segregation. Thus, school officials might also need to create incentives for
students to interact with classmates with whom they do not share similar traits. In this study I am not able
to draw inferences about the ways in which peers affect students within the merged public secondary schools, but
it is an interesting topic to explore in the future.
References
Angrist, Joshua. Lang, Kevin. 2004. Does school integration generate peer effects? Evidence
from Boston’s Metco program. The American economic review. Vol. 94, no. 5. P. 1613-1634
Arbetarbladet. 2013. Nu är det klart: det blir en storskola. [Gathered 2018-02-05]
http://www.arbetarbladet.se/gastrikland/gavle/nu-ar-det-klart-det-blir-en-storskola
Billings et al. 2012. School segregation, educational attainment and crime: evidence from the
end of busing in Charlotte-Mecklenburg. National bureau of economic research. Working paper
18487.
Brandén, Maria. Birkelund, Gunn Elisabeth. Szulkin, Ryszard. 2016. Does school segregation
lead to poor educational outcomes? Evidence from fifteen cohorts of Swedish ninth graders.
The institution of analytical sociology, Linköping university. Working paper 2016:4.
Carrell, Scott. Fullerton, Richard. West, James. 2008. Does your cohort matter? Measuring peer
effects in college achievement. National bureau of economic research. Working paper 14032.
Carrell, Scott. Sacerdote, Bruce. West, James. 2011. From natural variation to optimal policy?
National bureau of economic research. Working paper 16856.
Conley, Timothy G. Taber, Christopher R. 2011. Inference with “difference in differences” with
a small number of policy changes. The review of Economics and Statistics. Vol 93. No 1. P
113-125.
Dagens Nyheter. 2016. Nyköping byggde en skola och bröt segregationen. [Gathered 2018-02-05]
https://www.dn.se/nyheter/sverige/nykoping-byggde-ny-skola-och-brot-segregationen/?utm_source=facebook&utm_medium=page&utm_campaign=dn
Dagens samhälle. 2010. Barriärer bryts i en skola för alla. No. 10 p. 18-19. [Gathered 2018-02-05]
https://nykoping.se/Global/Dokument/Skolor/Nykopings_hogstadium/170321%20artikel%20Dagenssamh%C3%A4lle.pdf
Donald, Stephen G. Lang, Kevin. 2007. Inference with difference-in-differences and other panel
data. The Review of Economics and Statistics, Vol 89. No 2. p. 221-233
Grönqvist, Hans. Niknami, Susan. Robling, Per-Olof. 2015. Childhood exposure to segregation
and long-run criminal involvement. Swedish institute for social research, Stockholm university.
Working paper 2015:1.
Hoxby, Caroline. 2000. Peer effects in the classroom: learning from gender and race variation.
National bureau of economic research. Working paper 7867.
Johnson, Rucker. 2011. Long-run impacts of school desegregation and school quality on adult
attainments. National bureau of economic research. Working paper 16664.
Manski, Charles. 1993. Identification of endogenous social effects: the reflection problem. The
review of economic studies. Vol. 60 p. 531-542.
Nordin, Martin. 2013. Immigrant school segregation in Sweden. Population research and policy
review. no 32. p. 415-453.
OECD. 2016. Country note: Sweden - Results from PISA 2015. Programme for international
student assessment (PISA). [Gathered 2018-05-25]
https://www.oecd.org/pisa/PISA-2015-Sweden.pdf
Sacerdote, Bruce. 2001. Peer effects with random assignment. The quarterly journal of
economics. Vol 116. P 681-704.
Sacerdote, Bruce. 2011. Peer effects in education: How might they work, how big are they and
how much do we know thus far? Handbook of the Economics of education, volume 3. Chapter
4. Elsevier B.V. 2011. DOI: 10.1016/S0169-7218(11)030048.
Smith, Jeffrey A, Todd, Petra E. 2005. Does matching overcome Lalonde’s critique of
nonexperimental estimators?. Journal of econometrics 125 (2005) p. 305-353.
Stuart, Elizabeth A. 2010. Matching methods for causal inference: A review and a look
forward. Statistical Science vol. 25 no.1 (February 2010) p.1-21.
Svt Nyheter. 2015. En skola istället för fyra ska bryta segregationen. [Gathered 2018-02-05]
https://www.svt.se/nyheter/inrikes/en-skola-istallet-for-fyra-ska-bryta-segregationen
Svt Nyheter. 2016. Ny storskola planeras i Arvika. [Gathered 2018-02-05]
https://www.svt.se/nyheter/lokalt/varmland/ny-storskola-planeras-i-arvika
Swedish Ministry of Education. 2017. Samling för skolan – nationell strategi för kunskap och
likvärdighet. SOU 2017:35. Stockholm.
Swedish National Agency for Education. 2013. PISA 2012 – 15-åringars kunskaper i
matematik, läsförståelse och naturvetenskap, resultaten i koncentrat. Summary of report 298
2012. ISBN: 978-91-7559-069-1. Stockholm.
Swedish National Agency for Education. 2017. Elever och skolenheter i grundskolan läsåret
2016/17. Promemoria. Dnr 2016:1320.
Szulkin, Ryszard. Jonsson, Jan. 2007. Ethnic segregation and educational outcomes in Swedish
comprehensive schools. The Stockholm university Linnaeus center for integration studies.
Working paper 2007:2.
Appendix A.
Table A1. Municipalities excluded from all control groups

Municipalities with a decreasing number   Municipalities with an increasing number
of public secondary schools               of public secondary schools
Name:             Official mun. key:      Name:             Official mun. key:
Boden             2582                    Botkyrka          127
Enköping          381                     Eskilstuna        484
Gävle             2180                    Göteborg          1480
Huddinge          126                     Järfälla          123
Härnösand         2280                    Lerum             1441
Kiruna            2584                    Lidingö           186
Kristianstad      1290                    Lomma             1262
Landskrona        1282                    Mark              1463
Leksand           2029                    Nacka             182
Linköping         580                     Sandviken         2181
Ludvika           2085                    Storuman          2421
Lund              1281                    Strängnäs         486
Mölndal           1481                    Södertälje        181
Norrköping        581                     Trelleborg        1287
Norrtälje         188                     Umeå              2480
Pajala            2521                    Uppsala           380
Sigtuna           191                     Vänersborg        1487
Sjöbo             1265                    Älvdalen          2039
Skövde            1496                    Ängelholm         1292
Sundbyberg        183                     Öckerö            1407
Timrå             2262
Täby              160
Upplands-väsby    114
Varberg           1383
Värmdö            120
Västervik         883
Västerås          1980
Östersund         2380

The table shows the municipalities that have been excluded from all the panels. The official municipality key is the
identification number of the municipality. The keys are decided by the Swedish tax office, Skatteverket.
Appendix B
Table B1. Matched control groups
Panel 1 - final grade outcomes
Nyköping                Älmhult                 Örkelljunga
Gällivare      2523     Kinda          0513     Klippan        1276
Karlsborg      1446     Malung-Sälen   2023     Storfors       1760
Essunga        1445     Laxå           1860     Östra Göinge   1256
Torsby         1737     Berg           2326     Håbo           0305
Haninge        0136     Sävsjö         0684     Högsby         0821

Munkedal                Surahammar              Hedemora
Vindeln        2404     Kinda          0513     Jokkmokk       2510
Nora           1884     Malung-Sälen   2023     Haparanda      2583
Nordmaling     2401     Laxå           1860     Åstorp         1277
Gnesta         0461     Sävsjö         0684     Bjuv           1260
Överkalix      2513     Berg           2326     Simrishamn     1291

Panel 2 – share of students in a municipality’s public schools
Nyköping                Älmhult                 Örkelljunga
Upplands-bro   0139     Haninge        0136     Hylte          1315
Ronneby        1081     Höör           1267     Tingsryd       0763
Kalix          2514     Sävsjö         0684     Gislaved       0662
Orust          1421     Ånge           2260     Östra Göinge   1256

Lilla edet              Skara                   Hedemora
Nordanstig     2132     Höör           1267     Vilhelmina     2462
Övertorneå     2518     Haninge        0136     Nordanstig     2132
Simrishamn     1291     Trollhättan    1488     Jokkmokk       2510
Fagersta       1982     Överkalix      2513     Övertorneå     2518

Panel 3 – national standardized test outcomes
Nyköping                Älmhult                 Örkelljunga
Ragunda        2303     Östra Göinge   1256     Storfors       1760
Sotenäs        1427     Fagersta       1982     Ovanåker       2121
Åstorp         1277     Simrishamn     1291     Bollnäs        2183
Vansbro        2021     Heby           0331     Bengtsfors     1460
Haninge        0136     Haparanda      2583     Hylte          1315

Munkedal                Surahammar              Hedemora
Osby           1273     Vellinge       1233     Håbo           0305
Hallstahammar  1961     Bjuv           1260     Ljusdal        2161
Åre            2321     Gislaved       0662     Klippan        1276
Arvika         1784     Jokkmokk       2510     Älvkarleby     0319
Köping         1983     Boxholm        0560     Heby           0331

Panel 4 – final grade outcomes divided on gender
Nyköping                Älmhult                 Örkelljunga
Båstad         1278     Mullsjö        642      Gnosjö         617
Karlshamn      1082     Gullspång      1447     Kramfors       2282
Halmstad       1380     Norsjö         2417     Tidaholm       1498
Svedala        1263     Mariestad      1493     Oxelösund      481
Sundsvall      2281     Gislaved       662      Bjuv           1260

Munkedal                Surahammar              Hedemora
Munkfors       1762     Högsby         821      Mullsjö        642
Borlänge       2081     Åmål           1492     Gnesta         461
Sävsjö         684      Oxelösund      481      Emmaboda       862
Markaryd       767      Fagersta       1982     Gullspång      1447
Vara           1470     Kramfors       2282     Norsjö         2417

Panel 5 – final grade outcomes divided on parental education level
Flen                    Götene                  Skara
Gnosjö         617      Klippan        1276     Övertorneå     2518
Malå           2418     Östra Göinge   1256     Simrishamn     1291
Fagersta       1982     Strömstad      1486     Töreboda       1473
Mellerud       1461     Ovanåker       2121     Alvesta        764
Burlöv         1231     Tranemo        1252     Tingsryd       763

Årjäng                  Säffle
Markaryd       767      Valdemarsvik   563
Falkenberg     1382     Hörby          1266
Sollefteå      2283     Hallstahammar  1961
Borlänge       2081     Ånge           2260
Hörby          1266     Orsa           2034

Panel 6 – final grade outcomes divided on Swedish and migrant background
Nyköping                Älmhult                 Hedemora
Höör           1267     Solna          184      Klippan        1276
Helsingborg    1283     Vellinge       1233     Höör           1267
Vaggeryd       665      Osby           1273     Helsingborg    1283
Appendix C
Figure C1 – Event studies: baseline difference-in-difference panels
The graphs show the point estimates for the effect of treatment for each year. Year t-1 is the omitted category. The red line
marks the implementation of treatment, which took place one semester before the first outcomes are observed in the data. The t-2 point
estimates are based on outcome data from the municipalities that have two pre-treatment periods: Nyköping and Älmhult.
These two municipalities are excluded from the t+6 point estimates.
(a) average GPA
(b) share pass all subjects
(c) share eligible vocational school education
(d) share 7th graders in public schools
(a) English test – average GPA
(b) English test – share pass
(c) Math test – average GPA
(d) Math test – share pass
(e) Swedish test – average GPA
(f) Swedish test – share pass
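
The event-time indicators behind graphs of this kind can be illustrated with a minimal sketch. It assumes that t denotes the first year in which post-treatment outcomes are observed, that the event window runs from t-2 to t+6 as in the figure note, and that t-1 is omitted as the reference category. The function name, the argument names and the example years are hypothetical and are not taken from the estimation code used for the thesis.

```python
def event_time_dummies(year, first_outcome_year, k_min=-2, k_max=6, omitted=-1):
    """Return event-time indicators for one municipality-year observation.

    year: calendar year of the observation
    first_outcome_year: first year with post-treatment outcomes (event time t),
        or None for a municipality that is never treated (assumed encoding)
    k_min, k_max: bounds of the event window (t-2 ... t+6 in Figure C1)
    omitted: reference event time that receives no indicator (t-1 in Figure C1)
    """
    event_times = [k for k in range(k_min, k_max + 1) if k != omitted]
    if first_outcome_year is None:
        # Control municipalities have all indicators equal to zero.
        return {f"t{k:+d}": 0 for k in event_times}
    k_obs = year - first_outcome_year
    # Exactly one indicator is 1 unless the observation falls in the omitted
    # year or outside the event window, in which case all indicators are 0.
    return {f"t{k:+d}": int(k == k_obs) for k in event_times}


# Hypothetical example: first outcomes observed in 2010, observation from 2012.
example = event_time_dummies(2012, 2010)
```

In this sketch `example["t+2"]` equals 1 and every other indicator equals 0. Regressing an outcome on such indicators together with municipality and time fixed effects yields one coefficient per event time, each measured relative to the omitted year t-1.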
Appendix D.
Table D1. Difference-in-difference estimation: ‘worst case’ and ‘best case’ variables comparison
Robust standard errors in parentheses *** p<0.01, ** p<0.05, * p<0.1
The outcome variables denoted by ‘WC’ are the variables for which ~100 values are recoded under the assumption that four
students out of the municipality-level student population failed to pass the test. The outcome variables denoted by ‘BC’ are the
variables for which ~100 values are recoded as 100 under the assumption that no students failed. The estimation uses
the baseline difference-in-difference specification from section 5.1. Standard errors are clustered on municipalities. All
municipalities are weighted on ninth-grade student population.
                            Final grade outcomes panel      National standardized test outcomes
                            (1)             (2)             (3)           (4)           (5)           (6)           (7)           (8)
                            Share eligible  Share eligible  Share pass    Share pass    Share pass    Share pass    Share pass    Share pass
                            voc. school     voc. school     English test  English test  math test     math test     Swedish test  Swedish test
                            WC              BC              WC            BC            WC            BC            WC            BC
School Merger               -3.63043***     -3.63043***     -0.226        -0.635        -4.738***     -4.660***     -1.039        -0.724
                            (0.80625)       (0.80625)       (0.607)       (0.820)       (0.962)       (1.018)       (1.341)       (1.537)
Constant                    85.86972***     85.95010***     96.318***     96.957***     86.185***     86.311***     95.454***     96.032***
                            (0.01039)       (0.01039)       (0.008)       (0.011)       (0.012)       (0.013)       (0.017)       (0.020)
Observations                1,175           1,175           1,155         1,155         1,155         1,155         1,155         1,155
R-squared                   0.00482         0.00440         0.000         0.001         0.003         0.003         0.001         0.000
Number of kommunkod         235             235             231           231           231           231           231           231
Controls                    no              no              no            no            no            no            no            no
Municipality Fixed Effects  yes             yes             yes           yes           yes           yes           yes           yes
Time Fixed Effects          yes             yes             yes           yes           yes           yes           yes           yes
Adjusted R-squared          0.00397         0.00355         -0.000748     -0.000315     0.00238       0.00207       0.000303      -0.000471
Root MSE                    3.300           3.452           1.309         1.711         5.253         5.436         1.921         2.301
f-stat                      20.28           20.28           0.139         0.599         24.24         20.96         0.600         0.222
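
The recoding behind the ‘WC’ and ‘BC’ variables described in the table notes can be illustrated with a minimal sketch. It assumes that a censored value is stored as the string "~100" and that the number of students in the municipality is known and positive. The function name, the data encoding and the example numbers are hypothetical and are not taken from the thesis.

```python
def recode_share(value, n_students, case):
    """Recode a censored '~100' share into a numeric percentage.

    value: reported share, either a number or the string '~100' (assumed encoding)
    n_students: number of students in the municipality (assumed positive)
    case: 'WC' (worst case: four students failed) or 'BC' (best case: none failed)
    """
    if value != "~100":
        # Uncensored values are kept unchanged.
        return float(value)
    if case == "BC":
        return 100.0
    if case == "WC":
        # Four failing students, capped at the population size.
        failed = min(4, n_students)
        return 100.0 * (n_students - failed) / n_students
    raise ValueError("case must be 'WC' or 'BC'")


# Hypothetical municipality with 200 ninth-grade students.
worst = recode_share("~100", 200, "WC")
best = recode_share("~100", 200, "BC")
```

With 200 students the worst-case share is 98 and the best-case share is 100. The gap between the two recodings shrinks as the student population grows, so the choice between them matters most for small municipalities.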