Effects of the implementation of state-wide exit exams on students’ self-regulated learning


Studies in Educational Evaluation 37 (2011) 196–205

Katharina Maag Merki *

Institute of Education, University of Zurich, Freiestrasse 36, 8032 Zurich, Switzerland

Article info

Article history: Received 20 February 2011; received in revised form 15 July 2011; accepted 1 December 2011; available online 11 February 2012

Keywords: State-wide exit examination; High school; Low-stakes testing system; Program evaluation

Abstract

Whereas several studies investigated the effects of implementation of state-wide exit exams on student achievement, there is still little known about the impacts of the exams on students’ self-regulated learning. This paper examines the question as to whether the implementation of state-wide high school exit exams is associated with a change in the self-regulated learning of students in mathematics or English. We conducted a standardized questionnaire survey of students in two German states for a period of 3 years. In mathematics no significant effects of the immediate introduction of state-wide exit exams were identified. In English the results show significant positive and negative effects. The results are discussed and implications for further research are given.

© 2012 Elsevier Ltd. All rights reserved.

Introduction

In almost all of the sixteen states (Länder) of Germany, state-wide exit examinations (Abitur) at the end of academic-track high schools (Gymnasium) are one of the instruments that were recently implemented to increase student achievement levels, reduce the differences between schools, and attain the goal of a higher level of standardization in the grading system. State-wide exit exams in Germany are administered by the state government. They are end-of-course exams and focus on curriculum content. The main difference between class-based and state-wide examination systems is related to the question of who develops the tests. Whereas it used to be that teachers of each school subject and course designed the exams for their classes (class-based exams), now there are standardized, externally developed exams for each subject for all schools and courses in the entire state (state-wide exams). The grading of these state-wide exit exams is still done in a class-based manner, that is, by the individual teacher of the class. However, teachers now must use unified grading guidelines that were developed to standardize grading. The final result of the exam includes not only the results of the single tests at the end of high school but also the students’ grades during the last 2 years of high school. Furthermore, the students can choose their examination subjects to a certain extent. Comparing the examination system of Germany to the examination systems in the United States or other OECD countries, another important difference is that the system in Germany shows a rather low level of standardization without any high-stakes assessments for teachers and schools (Klein, Kühn, Van Ackeren, & Block, 2009). For the students, however, the exams which are taken in common school situations are mandatory for graduation and relevant to gaining access to university.

* Tel.: +41 44 634 27 80; fax: +41 44 634 28 88. E-mail address: [email protected].

doi:10.1016/j.stueduc.2011.12.001

Several studies investigated the effects of implementation of state-wide exit exams on student achievement in school subjects (for an overview, see Holme, Richards, Jimerson, & Cohen, 2010). However, there is still little known about the extent to which the implementation of state-wide exit exams impacts students’ self-regulated learning (Zimmermann & Schunk, 2001). This competency not only influences in-depth understanding of learning contents and the acquisition of school-subject competencies (see, for example, OECD, 2010) but also itself represents a significant competency that in many syllabuses in German-speaking countries is a stated educational goal (Maag Merki & Grob, 2002). Therefore, reaching a higher level of achievement as a goal of the implementation of exit examinations also involves significant enhancement of the self-regulated learning of the students.

Furthermore, the findings of Miller, Heafner, and Massey (2009) make it clear that students have difficulty in developing appropriate strategies when dealing with the increased pressures caused by external examinations. Consequently, the implementation of state-wide exit exams in Germany must also be assessed in view of the extent to which it successfully fosters learning with in-depth understanding and the relevant motivation and minimizes the use of dysfunctional strategies. Therefore, this study investigates the impact of the implementation of state-wide exit exams on students’ self-regulated learning.

Theory models and current state of research

Although there is no consistent definition of self-regulated learning in the literature up to now (Schunk, 2008), there are core features that are constitutive of the construct. According to Artelt, Baumert, Julius-McElvany, and Peschar (2003, p. 10), self-regulated learning is “generally understood to involve students in selecting appropriate learning goals which guide the learning process, using appropriate knowledge and skills to direct learning, consciously selecting appropriate learning strategies appropriate to the task at hand, and being motivated to learn.” Self-regulated learning is thus characterized by motivational, cognitive, and metacognitive dimensions of learning (Baumert, Fend, O’Neil, & Peschar, 1998; Zimmermann & Schunk, 2001).

Theoretical approaches to explain why the implementation of state exit exams should be systematically associated with effects on self-regulated learning can be found in the theory tradition of economics of education (Bishop, 1997; Wössmann, 2003). According to these theoretical models, a central goal of implementing state exit exams is to reinforce students’ motivation to put effort into their learning. This is based on the assumption that the value of the test results achieved is higher for state-wide exit exams than for class-based exams, the scores of which cannot be used for comparison purposes by institutions of higher education.

Ryan, Ryan, Arbuthnot, and Samuels (2007) and Ryan and Sapp (2005) criticized the postulated influence model for not being able to explain why the implementation of state exit exams should develop a learning motivation that can foster the acquisition of subject competencies. With reference to various learning theory approaches, Ryan and Sapp (2005) showed that standardized high-stakes examination systems tend to foster more extrinsic motivations, performance orientation rather than mastery orientation, and problematic emotional reactions such as test anxiety and the over- or underchallenging of students.

Empirical studies on the effects of standardized tests on the individual dimensions of self-regulated learning, in particular on motivation or emotions, are found in the context of the realization of a high-stakes monitoring system. But there are gaps in the international research when it comes to analyzing the impact of state exit exams on cognitive or metacognitive learning strategies. The following sections give an overview of empirical results on the relationship between the implementation of exit exams and the core concepts of self-regulated learning in an international perspective.

Studies showed that the implementation of standardized high-stakes tests has a significantly negative impact on students’ motivational and emotional experience. The studies revealed, for instance, increased stress, anxiety, or fatigue among the students (e.g. Meyer, McClure, Walkey, Weir, & McKenzie, 2009; Pedulla et al., 2003; Ryan & Sapp, 2005). Ryan et al. (2007) found these effects even among medium and high-achieving students. In Richman, Brown, and Clark (1987) high-risk students who failed the test had somewhat lower self-esteem and significantly more apprehension after the test than beforehand. Catterall (1989) discovered that, in addition, students’ doubts concerning their ability to pass future exit exams clearly increased.

However, Bishop (1999) analyzed international comparison studies and found little support for the claim that state-wide exit exams are systematically associated with problematic learning behaviors or motivational burdens in students. For instance, students in countries or provinces that have central exit exams are “less likely to report that memorization is the way to learn the subject and more likely to report that they did experiments in science class” (Bishop, 1999, p. 394). They were not less likely to be interested in the subject than students in nations or provinces with no central exit exams.

Based on analyses in the context of the Third International Mathematics and Science Study (TIMSS) at the upper secondary level in Germany, Baumert and Watermann (2000) concluded that the implementation of state exit exams was not associated with greater test anxiety in specific school subjects. In addition, students’ use of elaboration strategies was higher in mathematics courses that were tested via state exit exams than in mathematics courses that were tested via class-based exams. The effect on the use of understanding-oriented learning strategies could be replicated for physics. However, in physics the organization of exams had no effect on test anxiety.

Contrary to these findings, the results based on TIMSS data from the end of the lower secondary level in Germany showed that students in state exit exam states are “consistently less likely to like or enjoy mathematics, or to find it as an easy subject, but they are more likely to find it boring” (Jürges & Schneider, 2010, p. 514). The findings in the context of the PISA study are comparable. Students in Germany with state exit exams were more likely to report anger, anxiety, and achievement pressure than students in Germany without state exit exams. Furthermore, their self-concept in mathematical competence was relatively weak. Jürges et al. saw these negative effects as associated with the increased pressure on students exerted by teachers.

Initial findings from one of our own studies (Oerke & Maag Merki, 2009) also pointed to the importance of teachers in the context of state exit exams. The study indicated that prior to the high school exit exam, the students in advanced courses with state-wide exams attribute their possible success on the exit exam more to their teachers and their teachers’ explanations (“teacher explained this well”) than students in advanced courses with class-based examinations do. However, based on these findings it cannot be assumed that introduction of the state exit exam gave students a feeling of a loss of control.

In summary, there are studies available that examined the effects of state-wide exit exams on dimensions of self-regulated learning in students, whereby students’ motivational and emotional experience was investigated more often than students’ cognitive and metacognitive regulation. Regarding the analysis of the influence of state exit exams on students’ motivational and emotional experience, two research gaps in particular can be identified. Students’ motivational and emotional experience has been captured with little differentiation up to now. Previous studies used mostly single measurement items. Important dimensions such as self-efficacy, subject-specific self-concept, or persistence were in part examined only in the study by Jürges, Schneider, Senkbeil, and Carstensen (2009). In that study, as in the other German studies, the effects of state-wide academic-track exit exams were assessed via a comparison of German states and school subjects in the frame of cross-sectional studies. Thus, the change from a class-based to state-wide examination system within one German state cannot be depicted, and it remains unclear whether the effects identified are the results of the introduction and implementation of state-wide exit exams or can be explained by other variables, such as context differences (for example, differing expectations or values concerning education in the different German states) or subject-specific differences.

Research questions and hypotheses

This paper examines the following question: How does the implementation of state-wide academic-track high school exit exams in the advanced courses in mathematics and English impact students’ self-regulated learning?

Based on the theoretical assumptions and empirical findings mentioned above, it can be assumed that the implementation of state-wide exit exams impacts individual dimensions of students’ self-regulated learning. Whether these effects are negative or positive cannot be determined clearly. Moreover, subject-specific effects should be expected (Baumert & Watermann, 2000).

Referring to findings by Baumert and Watermann (2000) and Bishop (1999), we do not expect to find a negative impact on students’ self-regulated learning with regard to using cognitive (e.g., elaboration strategies) and metacognitive learning strategies (e.g., monitoring and planning strategies) for the exam preparation. However, because the goal of passing the state-wide exit exam is a kind of extrinsic motivation (Ryan & Sapp, 2005) and because the possibility of gaining confidence throughout high school with regard to the type and content of the exams is smaller than in the class-based exit examination system, it cannot be ruled out that the newly implemented exams will result in increased use of surface-learning strategies (e.g., memorization strategies).

This can also be supposed for the influence of these exams on students’ motivational (e.g., persistence, interest in school subject) and emotional experience (e.g., uncertainty about passing the exam). The majority of the previous studies tended to indicate that uncertainty and, connected with that, negative effects on motivational and emotional aspects of learning are to be expected (Jürges et al., 2009; Pedulla et al., 2003; Ryan et al., 2007). Speaking against that, Baumert and Watermann (2000) found no negative effect on students’ test anxiety for mathematics and no effects at all for physics.

Research design and methods

The research questions were investigated from 2007 to 2009 in two German states, Bremen and Hesse, in the context of a study funded by the German Research Foundation (DFG). Whereas in Hesse the state-wide exams in the advanced courses were introduced in 2007, Bremen introduced state-wide exit exams in some advanced courses only in 2008. Therefore, the main focus of our analyses was on the change of the examination system in Bremen.

To address the research questions we conducted descriptive analyses and multilevel analyses using HLM 6.08 (Raudenbush, Bryk, Cheong, & Congdon, 2004). We focused on advanced courses in mathematics and English, since Bremen changed the examination system in these subjects in 2008. In addition, mathematics and English are the two subjects that students most frequently choose as their exam subjects.

In this study we conducted two different analyses to assess the effect of the implementation of state-wide exit exams: First, within Bremen, the developments in 2007–2008 with the change of the examination system can be compared with the developments in 2008–2009, when there were no further changes. Here, the second time period from 2008 to 2009 serves as a control analysis. It is assumed that the change of the system (2007–2008) will be reflected in a greater difference than system stability (2008–2009). The advantage of these analyses is that both state and school subject can be kept stable. However, no multilevel analyses can be conducted due to the small number of academic-track high schools in Bremen. This means that the nested structure of the data (students are nested within schools) cannot be taken into consideration.

Secondly, we calculated difference-in-differences, following the analyses by Jürges and Schneider (2010) and Jürges, Schneider, and Büchel (2005) and taking into account the multilevel structure of the data. Here the possible effect of state-wide exit exams was examined via comparison of the two states and the 3 years. Due to substantial overlaps in the two systems, the state-specific implementation mode allows quasi-experimental comparisons over the years, considering Hesse as a control group.

The assumption is that the between-state differences are smaller if in both states the exams for the courses are state-wide exams, which is the case for advanced courses in mathematics and English in 2008 and 2009, but that the between-state differences are larger if in one state (Hesse) the exams are state-wide and in the other state (Bremen) the exams are class-based, which is the case for the advanced courses in 2007. As a consequence, differing effects of year should appear in the two states, which can be identified empirically via the cross-level interaction effects between the variables “year” and “state.”
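The logic of this comparison can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name and all scale means below are hypothetical and are not values from the study, and the sketch ignores the control variables and the nesting of students in schools.

```python
def difference_in_differences(treated_before, treated_after,
                              control_before, control_after):
    """Change in the treated group minus change in the control group."""
    return (treated_after - treated_before) - (control_after - control_before)


# Hypothetical scale means on the 1-4 response scale (NOT study results).
bremen_2007, bremen_2008 = 3.0, 2.5   # Bremen: class-based -> state-wide
hesse_2007, hesse_2008 = 3.0, 2.75    # Hesse: state-wide in both years (control)

did = difference_in_differences(bremen_2007, bremen_2008,
                                hesse_2007, hesse_2008)
print(did)
```

A value near zero would suggest that Bremen changed no more than Hesse over the same period. A clearly non-zero value is the pattern that the interaction between "year" and "state" is meant to pick up.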

For the interpretation of these effects, results of school improvement research (e.g., Earl, Nancy & Sutherland, 2006) and previous results analyzing the impact of the implementation of state-wide exit exams (e.g. Amrein & Berliner, 2002; Taylor, 2009), which show that school reforms can have both short-term and long-term effects, have to be considered. Depending on their structure, content, or context, reforms might have only short-term effects due to low sustainability. In contrast, some reforms show first and foremost longer-term effects because the complexity of the reform is high, teachers have to learn how to deal with the new demands, and they have to gain experience with the new system.

Considering our question, we expect the implementation of state-wide exit examinations to be a complex reform influenced by different actors and their actions (e.g., district, schools, teachers, students, parents). Therefore, on the one hand, the implementation of the new system might lead to an immediate change of teaching or learning (e.g., as a result of perceived external pressure, see above). On the other hand, long-term effects can be assumed, too: we expect the schools and teachers to align their teaching to their multi-year experiences with the new examination system. However, these processes will take time; they will not affect teaching and learning immediately but step by step. Considering Hesse as a control group is based on the assumption that the first 3 years of the implementation that we are able to analyze in Hesse are more stable than the immediate change in Bremen.

Indicators

To examine the research questions, various dimensions of students’ self-regulated learning (Baumert et al., 2000) were measured using items from standardized questionnaires. All indicators were collected before the exit exam. The response scale for all items ranged from 1 = not true at all to 4 = very true. Reliability was merely satisfactory in only one case, the indicator “elaboration strategies” for the advanced course in mathematics. For all the other dimensions the reliabilities were good.

Motivational self-regulation

- Interest in school subject (3 items): example item: “When I work on this subject, I sometimes forget everything else around me” (source: PISA-Konsortium Deutschland, 2000). Cronbach’s alpha for mathematics: .78 (range r_it: 0.53–0.69); Cronbach’s alpha for English: .76 (range r_it: 0.48–0.68).

- Scholastic self-efficacy (4 items): example item: “I can answer even the difficult questions in class, if I try hard” (source: Jerusalem & Satow, 1999). Cronbach’s alpha for mathematics: .85 (range r_it: 0.61–0.73); Cronbach’s alpha for English: .81 (range r_it: 0.57–0.69).

- Persistence (3 items): example item: “Even when I run into difficulty when preparing for the test, I remain determined and keep at it” (source: Grob & Maag Merki, 2001). Cronbach’s alpha for mathematics: .74 (range r_it: 0.53–0.61); Cronbach’s alpha for English: .73 (range r_it: 0.53–0.59).

- Uncertainty about passing the academic-track exit exam (5 items): example item: When you think about the exit exams coming up, how true are the following statements for you?: “I am worried that something on the exam could go badly” (scale was designed for this study specifically). Cronbach’s alpha for mathematics: .85 (range r_it: 0.55–0.74); Cronbach’s alpha for English: .82 (range r_it: 0.52–0.68).

Cognitive regulation

- Elaboration strategies (8 items): example item: How do you prepare for the exit exams outside of class?: “When I prepare for the exams, I recall how we solved the problems in class that will be on the exams” (scale was designed for this study specifically). Cronbach’s alpha for mathematics: .66 (range r_it: 0.29–0.46); Cronbach’s alpha for English: .73 (range r_it: 0.34–0.50).

- Memorization strategies (3 items): example item: How do you prepare for the exit exams outside of class?: “When I prepare for the exams, I try to memorize as much as possible” (source: Grob & Maag Merki, 2001). Cronbach’s alpha for mathematics: .73 (range r_it: 0.48–0.67); Cronbach’s alpha for English: .80 (range r_it: 0.58–0.74).

Metacognitive regulation

- Planning strategies (3 items): example item: How do you prepare for the exit exams outside of class?: “When I prepare for the exams, I take the time to plan and think about when I will study what material” (source: Grob & Maag Merki, 2001). Cronbach’s alpha for mathematics: .76 (range r_it: 0.54–0.68); Cronbach’s alpha for English: .79 (range r_it: 0.60–0.67).

- Monitoring strategies (5 items): example item: How do you prepare for the exit exams outside of class?: “When I prepare for the exams, I check from time to time whether I am heading in the right direction” (source: Grob & Maag Merki, 2001). Cronbach’s alpha for mathematics: .69 (range r_it: 0.38–0.51); Cronbach’s alpha for English: .75 (range r_it: 0.44–0.57).
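For readers less familiar with the reliability coefficient reported for each scale, the following minimal Python sketch computes Cronbach's alpha from raw item responses using the standard formula. The function name and the small response matrix are hypothetical and serve only as an illustration; they are not data from the study.

```python
from statistics import variance


def cronbach_alpha(responses):
    """Cronbach's alpha for a list of respondents, each a list of k item scores.

    alpha = k / (k - 1) * (1 - sum of item variances / variance of sum scores)
    """
    k = len(responses[0])
    items = list(zip(*responses))                      # one tuple per item
    item_variance_sum = sum(variance(item) for item in items)
    total_variance = variance([sum(row) for row in responses])
    return k / (k - 1) * (1 - item_variance_sum / total_variance)


# Hypothetical answers of five students to a 3-item scale (1-4 response scale).
responses = [
    [1, 2, 1],
    [2, 2, 3],
    [3, 3, 3],
    [4, 3, 4],
    [4, 4, 4],
]
alpha = cronbach_alpha(responses)
print(round(alpha, 2))
```

Alpha rises when the items covary strongly relative to their individual variances, which is why scales with lower item-total correlations, such as the elaboration strategies scale in mathematics, tend to show lower values.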

Based on the scale values, missing values were imputed using multiple imputation in SPSS 18 (Graham, 2009; Lüdtke, Robitzsch, Trautwein, & Köller, 2007).¹ Here ten data sets were produced in which plausible estimated values, varying slightly among the data sets, were included for the missing values. The descriptive statistics reported in the following combine the values of the individual data sets using the formula developed by Rubin (1987).
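The text does not spell out the combination formula. A minimal sketch of the pooling rules commonly attributed to Rubin (1987) is given below, on the assumption that this is the formula meant. The function name and the numbers are hypothetical and are not results from the study.

```python
from statistics import mean, variance


def pool_rubin(estimates, within_variances):
    """Pool one parameter across m imputed data sets.

    Returns the pooled point estimate and its total variance:
    total = W + (1 + 1/m) * B, with W the mean within-imputation variance
    and B the between-imputation variance of the m estimates.
    """
    m = len(estimates)
    pooled_estimate = mean(estimates)
    within = mean(within_variances)
    between = variance(estimates)       # sample variance, denominator m - 1
    total_variance = within + (1 + 1 / m) * between
    return pooled_estimate, total_variance


# Hypothetical scale means and squared standard errors from ten imputed data sets.
estimates = [2.9, 3.0, 3.1, 3.0, 2.9, 3.1, 3.0, 3.0, 2.9, 3.1]
within_variances = [0.004] * 10
pooled, total_var = pool_rubin(estimates, within_variances)
print(round(pooled, 2), round(total_var, 4))
```

The between-imputation term is what distinguishes pooled results from a single imputation: it carries the extra uncertainty caused by the missing values into the standard errors.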

Analysis strategies

In a first step, the descriptive and inferential statistical analyses were carried out within the two states. This means that based on the Bremen data, differences between the years were calculated. Multivariate regression analyses with corresponding dummy variables were performed. To test for possible cohort effects in a year-to-year comparison, a 15-item figure analogies subtest from a German cognitive ability test (KFT) (Heller & Perleth, 2000) was used additionally. Further, in line with empirical results (e.g. Artelt et al., 2003; Zimmerman & Martinez-Pons, 1990), students’ sex was controlled for as a possible influencing variable. Due to a strong privacy policy, particularly in Bremen, the assessment of family background information was completely voluntary. As a consequence, the number of missing values for these indicators, especially in 2007, is too large, and the reduction of the sample due to the non-random distribution of the missing values would have been too big if we had wanted to control for family background. We therefore did not include family background indicators in the evaluation. The same analyses were also conducted for the Hesse data. However, in Hesse there is no system change to examine but instead the first 3 years of the implementation of state-wide exit exams.

¹ The reason to impute only scale values is related to the fact that with the given number of cases of n > 6000, the software could not handle the large number of individual items despite maximal hardware requirements. The multilevel structure of the data set was taken into account in that dummy variables for the units at the higher level were entered in the multiple imputation (Graham, 2009).

In a second step multilevel analyses were conducted. Multilevel analysis methods allow estimation of year effects depending on specific school membership. In this way it can be taken into account that the use of learning strategies or the motivations of students in the schools are not independent of one another but can vary by specific school.

The evaluations were based on a two-level model, where the respective dimension of self-regulated learning was used as the dependent variable. At Level 1 the independent variables were the two dummy variables “Year 07” (1 = 2007) and “Year 09” (1 = 2009) and at Level 2 the variable “state” (0 = Bremen, 1 = Hesse). The changes from 2007 to 2008 (Year 07) and from 2008 to 2009 (Year 09) could be determined via the two dummy variables. Both the fixed and the random effects were included in the regression equation. The independent variables were entered into the analyses uncentered. Sex and KFT scores, again entered as control variables, were considered only as main effects at Level 1. The regression equation used was:

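$$
\begin{aligned}
\text{Dimension of self-regulated learning} ={}& G_{00} + G_{01} \times \text{State} + G_{10} \times \text{Year07} + G_{20} \times \text{Year09} \\
&+ G_{30} \times \text{Sex} + G_{40} \times \text{KFT} + G_{11} \times \text{State} \times \text{Year07} \\
&+ G_{21} \times \text{State} \times \text{Year09} + U_{0} + U_{1} \times \text{Year07} \\
&+ U_{2} \times \text{Year09} + R
\end{aligned}
$$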
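Reading the fixed part of this equation cell by cell may help to see what the interaction terms capture. As a simplification made here for illustration only, the control variables Sex and KFT and the random terms are set aside. With the coding described above (Year07 = 1 for 2007, Year09 = 1 for 2009, 2008 as the reference year, State = 0 for Bremen and 1 for Hesse), the model-implied means are:

$$
\begin{aligned}
\text{Bremen 2007:}\quad & G_{00} + G_{10} \\
\text{Bremen 2008:}\quad & G_{00} \\
\text{Bremen 2009:}\quad & G_{00} + G_{20} \\
\text{Hesse 2007:}\quad & G_{00} + G_{01} + G_{10} + G_{11} \\
\text{Hesse 2008:}\quad & G_{00} + G_{01} \\
\text{Hesse 2009:}\quad & G_{00} + G_{01} + G_{20} + G_{21}
\end{aligned}
$$

Subtracting the Bremen difference between 2007 and 2008 from the corresponding Hesse difference gives

$$
\bigl[(G_{00} + G_{01} + G_{10} + G_{11}) - (G_{00} + G_{01})\bigr] - \bigl[(G_{00} + G_{10}) - G_{00}\bigr] = G_{11},
$$

so under this simplified reading $G_{11}$ is the difference-in-differences for 2007 versus 2008, and by the same steps $G_{21}$ is the one for 2009 versus 2008.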

Sample

The indicators were collected by written questionnaires in 37 academic-track high schools (Gymnasium) with upper secondary level in the 3 years 2007, 2008, and 2009. The surveys were conducted at the beginning of February, before the written state exit exams were held.

In Bremen all 19 general education academic-track schools with upper secondary level were analyzed. In Hesse 18 schools were included, selected according to specific criteria (region, urban/rural, size of school, type of Gymnasium) to obtain the most representative sample possible within the state.

Within the schools, students in four graduating classes were surveyed. The students were asked to fill out the questionnaire with reference to the three school subjects that they would be tested on in the written state exit exams, regardless of the class they were in when asked to fill out the questionnaire. This focus was important so that the students’ responses could also be used in further analyses. However, it must be taken into consideration that in larger schools, not all students who, for instance, reported that they had chosen mathematics as their test subject were in the same advanced course. For this reason, in connection with multilevel analyses we used the schools and not the courses as the Level 2 variable.

In Hesse the response rates for the student questionnaires were relatively high and stable (68–74%). In Bremen, participation was somewhat lower in the first year, at 52%, owing to organizational problems in four schools, but it was comparable to Hesse in 2008 and 2009 (68% and 71%). The percentage of imputed data was low. In the scales of interest here, it ranged in mathematics from 6.7% to 10.8% and in English from 6.7% to 9.8%.

Of the total of 6331 students who participated across all 3 years, 1961 students chose mathematics and 2573 students chose English as their advanced course for the exit exam. Of these, only those students for whom cognitive ability test (KFT) scores were also available were included in the following analyses. The missing values for this indicator were not imputed, as they did not fulfill the requirements for multiple imputation. This reduced the sample in the advanced course in mathematics by 13.5% to N = 1697 and in the advanced course in English by 13.9% to N = 2215.
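The reported reductions can be checked directly from the sample sizes given above. The following is a small sketch; the function name `percent_dropped` is a hypothetical helper introduced only for this illustration.

```python
def percent_dropped(n_before: int, n_after: int) -> float:
    """Share of cases lost between two sample sizes, in percent."""
    return 100.0 * (n_before - n_after) / n_before


# Sample sizes as reported in the text (before and after requiring KFT scores).
math_loss = percent_dropped(1961, 1697)
english_loss = percent_dropped(2573, 2215)

print(f"Mathematics: {math_loss:.1f}% of cases dropped")
print(f"English: {english_loss:.1f}% of cases dropped")
```

Rounded to one decimal place, both values agree with the 13.5% and 13.9% stated in the text.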

Results

Analyses 1: year-specific effects within the two German States

Analysis of the results for mathematics (see Table 1) revealed changes only in some dimensions, and the effects were rather weak.

The only significant effect of year, owing to the change in the examination system in Bremen, was found in the dimension “persistence.” Controlling for general cognitive ability and for sex, in the first year of implementation of state-wide exit exams in 2008 the students reported lower persistence with regard to preparing for their exit exams than in the year 2007, when the examination system was still class-based (p < .05, Cohen’s d = −.19). However, this drop is no longer evident in the second year of implementation of the exams, since there is a significant increase from 2008 to 2009 in the amount of persistence (p < .05, Cohen’s d = .17). In the 3-year comparison, therefore, no significant differences were found in students’ reported persistence when preparing for their exams.
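The effect sizes for persistence in Bremen can be approximated from the descriptive values in Table 1. The sketch below assumes the conventional pooled-standard-deviation form of Cohen’s d; the text does not state which formula was used, and the table values are rounded, so small deviations from the published figures are possible. The function name `cohens_d` is chosen here for illustration.

```python
import math


def cohens_d(m1, sd1, n1, m2, sd2, n2):
    """Standardized mean difference (group 2 minus group 1) with pooled SD."""
    pooled_var = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / (n1 + n2 - 2)
    return (m2 - m1) / math.sqrt(pooled_var)


# Bremen, persistence, advanced course mathematics (M, SD, N from Table 1).
d_2007_2008 = cohens_d(3.03, 0.71, 220, 2.90, 0.66, 289)
d_2008_2009 = cohens_d(2.90, 0.66, 289, 3.01, 0.67, 312)

print(round(d_2007_2008, 2), round(d_2008_2009, 2))
```

Under these assumptions the two values round to −.19 and .17, matching the effect sizes reported for the drop in 2008 and the recovery in 2009.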

The 3-year comparison uncovered a positive effect on interest in the school subject mathematics. In 2009 students reported significantly higher interest in mathematics than in 2007 (p < .05, Cohen’s d = .19). A look at the means in the 3 years reveals that interest in mathematics rose continuously across the 3 years, although the comparisons of the means for 2007/2008 and 2008/2009 were not significant. Contrary to the expectation that the change would be reflected in a greater difference than the system stability, the effect sizes of the two time periods are comparable.

A trend for a 3-year effect (p < .10, Cohen’s d = .15) was also found for the indicator “elaboration strategies.” In 2009 students tended to use elaboration strategies when preparing for the exams more frequently than in 2007. Here again, there is a continuous increase over the 3 years, although it is less strong than the increase in school subject interest. For the other cognitive and metacognitive learning strategies, no significant year effects were found.

In sum, in Bremen it can be said for the advanced course in mathematics that the change in the examination system appears to show less directly in associated changes in students’ self-regulated learning. In most cases, there are no effects; but where effects are found, in the 3-year comparison they are positive in trend.

In Hesse it is not the change in examination system that can be observed but rather the change across the first 3 years of implementation of state-wide exit exams. Here again, some positive effects were found, the strongest for students’ use of elaboration strategies when preparing for the exams (p < .01, Cohen’s d = .29). Slightly positive changes were also found for increased scholastic self-efficacy (p < .05, Cohen’s d = .12) and in trend for reduction of uncertainty about passing the exit exam (p = .054, Cohen’s d = −.13).

In contrast to the school subject mathematics, clearer effects of the change in the examination system in Bremen can be observed in English (see Table 2), although the effect sizes are small here, too. Positive changes from 2007 to 2008 were found for students’ interest in the school subject and for the use of elaboration strategies when preparing for the exit exam. In the first year of the state exit exams, the students reported greater interest in English than in 2007, when exams were class-based (p < .01, Cohen’s d = .26). They also more frequently reported using elaboration strategies when preparing for the exam (p < .01, Cohen’s d = .23). No further changes were found in the two dimensions from 2008 to 2009, so that in the 3-year comparison these positive effects remain stable.

The findings are similar for uncertainty regarding passing the exit exam, although negative rather than positive: With the implementation of the state exit exams, in the subject English uncertainty about passing the exit exam increased significantly (p < .01, Cohen’s d = .21) and again remained stable from 2008 to 2009.

In trend, there is the same effect also for use of memorization strategies when preparing for the exit exam (p < .10, Cohen’s d = .18). In 2008 students reported somewhat more frequently than in 2007 that they memorized learning contents or mnemonic sentences when preparing for the exit exam. However, in the 3-year comparison no differences were found. Also, there were no year differences in the use of metacognitive exam preparation strategies.

However, in Hesse the clearest year effects for cognitive and metacognitive learning strategies were identified; they are an indication of positive changes in the 3-year period (elaboration strategies: p < .01, Cohen’s d = .19; planning strategies: p < .001, Cohen’s d = .32; monitoring strategies: p < .01, Cohen’s d = .22). Only the tendency for increased use of memorization strategies from 2007 to 2008 deviates from this pattern and suggests, as in Bremen, somewhat more frequent surface-level exam preparation in 2008 than in 2007 (p < .10, Cohen’s d = .16). However, in a 3-year comparison no significant changes were observed.

Analyses 2: results of multilevel analyses

Taking the multilevel structure into account, the analyses of the data on students with an advanced course in mathematics show that the variance between schools in the dimensions of self-regulated learning, measured via the intraclass correlation coefficient (ICC), is low (see Table 1).
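The ICC expresses the share of the total variance that lies between schools. As a minimal sketch, assuming the standard two-component definition (between-school variance divided by the sum of between-school and within-school variance), it can be computed as follows. The variance components used here are purely hypothetical and were chosen only to land near the smallest values shown in Table 1; they are not taken from the study.

```python
def intraclass_correlation(between_var: float, within_var: float) -> float:
    """Share of total variance attributable to differences between schools."""
    return between_var / (between_var + within_var)


# Hypothetical variance components, for illustration only.
icc = intraclass_correlation(between_var=0.005, within_var=0.350)
print(round(icc, 3))
```

With values of this size, only about one to two percent of the variance would be located at the school level, which is the sense in which the between-school variance is described as low.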

No significant effects of year or cross-level effects that can be connected with the introduction of state exit exams were found in any of the dimensions investigated. A tendency for a year effect was identified only in the dimension “planning strategies” (b = 0.120, SE = 0.065, p < .10). When controlling for sex and cognitive ability, students taking the state-wide high school exit exam in 2009 tended to report more frequently than students taking the state-wide high school exit exam in 2008 that they used planning strategies when preparing for the exit exam. Even though it can be assumed, based on the descriptive analyses (see Table 1), that this can be attributed first and foremost to the mean difference in Hesse, no significant cross-level effect was found when taking into consideration the multilevel structure. However, girls use planning strategies more frequently than boys (b = −0.240, SE = 0.044, p < .001), as do students with lower cognitive ability test scores (b = −0.017, SE = 0.006, p < .01).

In comparison to mathematics, students with an advanced course in English showed somewhat more frequent significant effects (see Table 3). In connection with the introduction of state-wide exit exams a cross-level effect by trend was found in the dimension “uncertainty about passing the exit exam.” In Bremen, uncertainty about passing the exam increased from 2007 to 2008 with the introduction of state-wide exit exams, whereas in Hesse, where the examination system remained the same, no such change can be identified. However, with a probability of error of 5.3%, the cross-level effect only tended toward significance.
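The reported probability of error can be roughly reproduced from the coefficient and standard error of the Year 07 by state interaction for uncertainty in Table 3. The sketch below is an approximation under stated assumptions: it uses a t test with 35 degrees of freedom (37 schools minus two; the degrees of freedom are an assumption and are not reported in the text), works with the rounded table values, and integrates the t density numerically with Simpson's rule instead of calling a statistics library. The function names are hypothetical.

```python
import math


def t_pdf(x: float, df: int) -> float:
    """Density of Student's t distribution with df degrees of freedom."""
    coef = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coef * (1 + x * x / df) ** (-(df + 1) / 2)


def two_sided_p(t: float, df: int, steps: int = 2000) -> float:
    """Two-sided p-value, integrating the density on [0, |t|] with Simpson's rule."""
    upper = abs(t)
    if upper == 0:
        return 1.0
    h = upper / steps  # steps must be even for Simpson's rule
    total = t_pdf(0.0, df) + t_pdf(upper, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    central = total * h / 3  # P(0 <= T <= |t|)
    return 1.0 - 2.0 * central


# Year 07 x state, uncertainty about passing the exit exam (Table 3).
b, se = -0.200, 0.100
p_value = two_sided_p(b / se, df=35)  # df = 35 is an assumption
print(round(p_value, 3))
```

Under these assumptions the p-value falls between .05 and .06, which is in line with the 5.3% given in the text and illustrates why the effect is described as a trend only.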

Significant but small changes from 2007 to 2008 were found in the use of elaboration and monitoring strategies and, in trend, in the use of planning strategies in a positive direction and in the use of memorization strategies in a negative direction.


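Table 1. Descriptive analyses, advanced courses mathematics.

| Dimension (ICC) | State | 2007 M | 2007 SD | 2007 SE | 2008 M | 2008 SD | 2008 SE | 2009 M | 2009 SD | 2009 SE |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Elaboration strategies (ICC: 0.014) | HE | 2.77 | 0.52 | 0.03 | 2.87 | 0.43 | 0.03 | 2.91 | 0.45 | 0.03 |
| Elaboration strategies (ICC: 0.014) | BR | 2.86 | 0.52 | 0.04 | 2.91 | 0.46 | 0.03 | 2.94 | 0.52 | 0.03 |
| Memorization strategies (ICC: 0.046) | HE | 2.21 | 0.79 | 0.05 | 2.23 | 0.83 | 0.05 | 2.31 | 0.80 | 0.05 |
| Memorization strategies (ICC: 0.046) | BR | 2.42 | 0.83 | 0.06 | 2.54 | 0.80 | 0.05 | 2.49 | 0.79 | 0.05 |
| Planning strategies (ICC: 0.019) | HE | 2.53 | 0.86 | 0.05 | 2.49 | 0.81 | 0.05 | 2.64 | 0.80 | 0.05 |
| Planning strategies (ICC: 0.019) | BR | 2.65 | 0.85 | 0.06 | 2.66 | 0.77 | 0.05 | 2.70 | 0.86 | 0.05 |
| Monitoring strategies (ICC: 0.007) | HE | 2.61 | 0.61 | 0.04 | 2.63 | 0.57 | 0.03 | 2.67 | 0.59 | 0.04 |
| Monitoring strategies (ICC: 0.007) | BR | 2.63 | 0.61 | 0.05 | 2.64 | 0.59 | 0.04 | 2.72 | 0.62 | 0.04 |
| Interest in school subject (ICC: 0.034) | HE | 3.04 | 0.75 | 0.05 | 3.03 | 0.76 | 0.05 | 3.14 | 0.71 | 0.04 |
| Interest in school subject (ICC: 0.034) | BR | 2.80 | 0.88 | 0.06 | 2.87 | 0.82 | 0.05 | 2.96 | 0.82 | 0.05 |
| Scholastic self-efficacy (ICC: 0.030) | HE | 2.83 | 0.73 | 0.04 | 2.84 | 0.70 | 0.04 | 2.92 | 0.73 | 0.04 |
| Scholastic self-efficacy (ICC: 0.030) | BR | 2.77 | 0.75 | 0.05 | 2.72 | 0.77 | 0.05 | 2.79 | 0.75 | 0.04 |
| Persistence (ICC: 0.018) | HE | 3.09 | 0.64 | 0.04 | 3.06 | 0.65 | 0.04 | 3.10 | 0.64 | 0.04 |
| Persistence (ICC: 0.018) | BR | 3.03 | 0.71 | 0.05 | 2.90 | 0.66 | 0.04 | 3.01 | 0.67 | 0.04 |
| Uncertainty about passing the exit exam (ICC: 0.045) | HE | 2.64 | 0.74 | 0.04 | 2.58 | 0.74 | 0.05 | 2.55 | 0.70 | 0.04 |
| Uncertainty about passing the exit exam (ICC: 0.045) | BR | 2.53 | 0.73 | 0.05 | 2.63 | 0.74 | 0.04 | 2.59 | 0.74 | 0.04 |

N (identical for all dimensions): 2007: HE 285, BR 220; 2008: HE 282, BR 289; 2009: HE 309, BR 312.

Regression analyses

| Dimension | State | J07-08 (a) | J08-09 (a) | J07-09 (a) |
| --- | --- | --- | --- | --- |
| Elaboration strategies | HE | .106 (.041), p < .01 | .035 (.040), n.s. | .141 (.041), p < .01 |
| Elaboration strategies | BR | .053 (.047), n.s. | .030 (.041), n.s. | .083 (.047), p < .10 |
| Memorization strategies | HE | .024 (.070), n.s. | .057 (.067), n.s. | .081 (.067), n.s. |
| Memorization strategies | BR | .113 (.076), n.s. | −.028 (.068), n.s. | .084 (.072), n.s. |
| Planning strategies | HE | −.033 (.072), n.s. | .126 (.070), p < .10 | .093 (.070), n.s. |
| Planning strategies | BR | .003 (.076), n.s. | .056 (.071), n.s. | .059 (.078), n.s. |
| Monitoring strategies | HE | .027 (.050), n.s. | .020 (.050), n.s. | .048 (.051), n.s. |
| Monitoring strategies | BR | .010 (.057), n.s. | .081 (.050), n.s. | .091 (.057), n.s. |
| Interest in school subject | HE | .000 (.063), n.s. | .121 (.062), p = .050 | .122 (.063), p = .055 |
| Interest in school subject | BR | .065 (.079), n.s. | .098 (.069), n.s. | .162 (.076), p < .05 |
| Scholastic self-efficacy | HE | .018 (.056), n.s. | .121 (.060), p < .05 | .140 (.060), p < .05 |
| Scholastic self-efficacy | BR | −.043 (.071), n.s. | .062 (.062), n.s. | .019 (.069), n.s. |
| Persistence | HE | −.027 (.056), n.s. | .055 (.054), n.s. | .028 (.054), n.s. |
| Persistence | BR | −.129 (.062), p < .05 | .114 (.056), p < .05 | −.014 (.060), n.s. |
| Uncertainty about passing the exit exam | HE | −.046 (.059), n.s. | −.089 (.057), n.s. | −.135 (.058), p = .054 |
| Uncertainty about passing the exit exam | BR | .093 (.068), n.s. | −.037 (.060), n.s. | .056 (.070), n.s. |

M = combined mean; SD = combined standard deviation (naive pooling); SE = standard error of the mean; BR = Bremen, HE = Hesse; n.s. = non significant.
(a) J07-08, change from 2007 to 2008 (dummy variable); J08-09, change from 2008 to 2009 (dummy variable); J07-09, change from 2007 to 2009 (dummy variable); unstandardized regression coefficient (standard error), under control of ‘cognitive ability test (KFT)’ and sex.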


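Table 2. Descriptive analyses, advanced courses English.

| Dimension (ICC) | State | 2007 M | 2007 SD | 2007 SE | 2008 M | 2008 SD | 2008 SE | 2009 M | 2009 SD | 2009 SE |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Elaboration strategies (ICC: 0.034) | HE | 2.62 | 0.61 | 0.03 | 2.74 | 0.50 | 0.03 | 2.73 | 0.53 | 0.03 |
| Elaboration strategies (ICC: 0.034) | BR | 2.74 | 0.63 | 0.04 | 2.87 | 0.50 | 0.03 | 2.85 | 0.57 | 0.03 |
| Memorization strategies (ICC: 0.018) | HE | 2.17 | 0.83 | 0.05 | 2.30 | 0.84 | 0.05 | 2.27 | 0.88 | 0.05 |
| Memorization strategies (ICC: 0.018) | BR | 2.33 | 0.84 | 0.05 | 2.48 | 0.83 | 0.04 | 2.40 | 0.86 | 0.05 |
| Planning strategies (ICC: 0.048) | HE | 2.27 | 0.85 | 0.05 | 2.40 | 0.82 | 0.04 | 2.54 | 0.86 | 0.05 |
| Planning strategies (ICC: 0.048) | BR | 2.70 | 0.83 | 0.05 | 2.69 | 0.81 | 0.04 | 2.66 | 0.87 | 0.05 |
| Monitoring strategies (ICC: 0.032) | HE | 2.39 | 0.64 | 0.03 | 2.54 | 0.61 | 0.03 | 2.53 | 0.62 | 0.03 |
| Monitoring strategies (ICC: 0.032) | BR | 2.56 | 0.61 | 0.03 | 2.64 | 0.63 | 0.03 | 2.64 | 0.64 | 0.03 |
| Interest in school subject (ICC: 0.047) | HE | 2.84 | 0.71 | 0.04 | 2.94 | 0.73 | 0.04 | 2.90 | 0.70 | 0.04 |
| Interest in school subject (ICC: 0.047) | BR | 2.67 | 0.78 | 0.04 | 2.86 | 0.71 | 0.04 | 2.81 | 0.70 | 0.04 |
| Scholastic self-efficacy (ICC: 0.031) | HE | 3.11 | 0.60 | 0.03 | 3.13 | 0.58 | 0.03 | 3.18 | 0.59 | 0.03 |
| Scholastic self-efficacy (ICC: 0.031) | BR | 3.07 | 0.70 | 0.04 | 3.07 | 0.62 | 0.03 | 3.07 | 0.65 | 0.03 |
| Persistence (ICC: 0.022) | HE | 2.96 | 0.60 | 0.03 | 3.02 | 0.67 | 0.04 | 2.93 | 0.66 | 0.04 |
| Persistence (ICC: 0.022) | BR | 3.01 | 0.67 | 0.04 | 2.98 | 0.61 | 0.03 | 2.97 | 0.62 | 0.03 |
| Uncertainty about passing the exit exam (ICC: 0.032) | HE | 2.56 | 0.66 | 0.04 | 2.52 | 0.68 | 0.04 | 2.51 | 0.69 | 0.04 |
| Uncertainty about passing the exit exam (ICC: 0.032) | BR | 2.43 | 0.73 | 0.04 | 2.58 | 0.67 | 0.03 | 2.57 | 0.68 | 0.04 |

N (identical for all dimensions): 2007: HE 367, BR 330; 2008: HE 368, BR 401; 2009: HE 378, BR 371.

Regression analyses

| Dimension | State | J07-08 (a) | J08-09 (a) | J07-09 (a) |
| --- | --- | --- | --- | --- |
| Elaboration strategies | HE | .111 (.042), p < .01 | −.003 (.041), n.s. | .108 (.041), p < .01 |
| Elaboration strategies | BR | .118 (.042), p < .01 | .001 (.040), n.s. | .119 (.042), p < .01 |
| Memorization strategies | HE | .117 (.066), p < .10 | −.029 (.066), n.s. | .089 (.062), n.s. |
| Memorization strategies | BR | .121 (.064), p < .10 | −.048 (.062), n.s. | .073 (.073), n.s. |
| Planning strategies | HE | .110 (.062), p < .10 | .144 (.064), p < .05 | .254 (.063), p < .001 |
| Planning strategies | BR | −.039 (.061), n.s. | .001 (.062), n.s. | −.038 (.063), n.s. |
| Monitoring strategies | HE | .137 (.048), p < .01 | −.018 (.047), n.s. | .120 (.046), p < .01 |
| Monitoring strategies | BR | .059 (.046), n.s. | .001 (.045), n.s. | .073 (.047), n.s. |
| Interest in school subject | HE | .106 (.055), p = .052 | −.042 (.054), n.s. | .064 (.053), n.s. |
| Interest in school subject | BR | .187 (.056), p < .01 | −.049 (.054), n.s. | .138 (.057), p < .05 |
| Scholastic self-efficacy | HE | .032 (.045), n.s. | .054 (.045), n.s. | .086 (.046), p < .10 |
| Scholastic self-efficacy | BR | .002 (.050), n.s. | .002 (.048), n.s. | .001 (.052), n.s. |
| Persistence | HE | .058 (.051), n.s. | −.085 (.050), p < .10 | −.027 (.050), n.s. |
| Persistence | BR | −.041 (.048), n.s. | −.001 (.046), n.s. | −.042 (.049), n.s. |
| Uncertainty about passing the exit exam | HE | −.052 (.051), n.s. | −.011 (.050), n.s. | −.063 (.049), n.s. |
| Uncertainty about passing the exit exam | BR | .140 (.053), p < .01 | .007 (.050), n.s. | .147 (.054), p < .01 |

M = combined mean; SD = combined standard deviation (naive pooling); SE = standard error of the mean; BR = Bremen, HE = Hesse; n.s. = non significant.
(a) J07-08, change from 2007 to 2008 (dummy variable); J08-09, change from 2008 to 2009 (dummy variable); J07-09, change from 2007 to 2009 (dummy variable); unstandardized regression coefficient (standard error), under control of ‘cognitive ability test (KFT)’ and sex.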


Table 3. Multilevel analyses (HLM), advanced courses English.

| | Planning strategies (ICC: 0.048) | Monitoring strategies (ICC: 0.032) | Uncertainty about passing the exit exam (ICC: 0.032) | Elaboration strategies (ICC: 0.034) | Memorization strategies (ICC: 0.018) |
| --- | --- | --- | --- | --- | --- |
| Intercept (1 to 4) | 2.888*** (0.074) | 2.870*** (0.065) | 2.853*** (0.076) | 2.925*** (0.053) | 2.772*** (0.072) |
| Level 1 | | | | | |
| Dummy: year 2007 (0 = 2008, 1 = 2007) | -0.115+ (0.065) | -0.138* (0.051) | 0.054 (0.071) n.s. | -0.108* (0.045) | -0.113+ (0.065) |
| Dummy: year 2009 (0 = 2008, 1 = 2009) | 0.134* (0.059) | -0.022 (0.052) n.s. | -0.002 (0.046) n.s. | -0.001 (0.048) n.s. | -0.030 (0.073) n.s. |
| Sex (0 = girls, 1 = boys) | -0.340*** (0.035) | -0.162*** (0.028) | -0.245*** (0.038) | -0.171*** (0.030) | -0.391*** (0.039) |
| KFT (range 0-15) | -0.022*** (0.004) | -0.016*** (0.003) | -0.015*** (0.003) | -0.007** (0.003) | -0.021*** (0.004) |
| Level 2 | | | | | |
| State (0 = Hesse, 1 = Bremen) | 0.260** (0.079) | 0.075 (0.056) n.s. | 0.067 (0.070) n.s. | 0.123* (0.049) | 0.181* (0.069) |
| Year 07 × state | 0.146 (0.089) n.s. | 0.074 (0.072) n.s. | -0.200+ (0.100) | -0.007 (0.064) n.s. | -0.013 (0.089) n.s. |
| Year 09 × state | -0.138 (0.082) n.s. | 0.031 (0.063) n.s. | -0.001 (0.085) n.s. | -0.000 (0.062) n.s. | -0.030 (0.089) n.s. |
| Random effects | | | | | |
| u0 (level 2) | 0.028** | 0.011* | 0.025*** | 0.007* | 0.009 n.s. |
| u1 (slope year 2007) | 0.009 n.s. | 0.012 n.s. | 0.045** | 0.008 n.s. | 0.007 n.s. |
| u2 (slope year 2009) | 0.002 n.s. | 0.002 n.s. | 0.027* | 0.007 n.s. | 0.007 n.s. |
| R | 0.652 | 0.366 | 0.425 | 0.293 | 0.662 |

Unstandardized coefficients (standard error); n.s. = non significant; number of level 1 units: 2215; number of level 2 units: 37.
Levels of significance: + p < .10; * p < .05; ** p < .01; *** p < .001.

K. Maag Merki / Studies in Educational Evaluation 37 (2011) 196–205 203

From 2008 to 2009 there was a significant increase in the use of planning strategies. However, as the cross-level effect is not significant, it cannot be supposed that the corresponding changes in Bremen (with the change of the examination system) differ significantly from the changes in Hesse (with no change in the examination system).
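How the year dummy and the cross-level interaction combine can be illustrated with the fixed effects for planning strategies in Table 3. The sketch below uses the coding given in that table (reference year 2008; state 0 = Hesse, 1 = Bremen) and looks at point estimates only; the constant and function names are introduced here for illustration.

```python
# Fixed effects for planning strategies, advanced course English (Table 3).
YEAR09 = 0.134            # change from 2008 to 2009 when state = 0 (Hesse)
YEAR09_X_STATE = -0.138   # additional change when state = 1 (Bremen), n.s.


def change_2008_to_2009(state: int) -> float:
    """Model-implied change in planning strategies from 2008 to 2009."""
    return YEAR09 + YEAR09_X_STATE * state


hesse_change = change_2008_to_2009(0)
bremen_change = change_2008_to_2009(1)
print(round(hesse_change, 3), round(bremen_change, 3))
```

The point estimates suggest an increase in Hesse and practically no change in Bremen, but because the interaction term is not significant, this difference between the states cannot be interpreted as reliable.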

In all of the cognitive and metacognitive learning strategies investigated, effects were found owing to sex, whereby girls used these strategies more frequently than boys. The negative effects of general cognitive ability on the use of the learning strategies when preparing for the exit exam indicate in addition that the strategies are used especially by students with lower cognitive ability test scores.

Discussion

The analyses examined the question of whether the implementation of state-wide academic-track high school exit exams is associated with a change in the self-regulated learning of students with an advanced course in mathematics or English as their chosen test subject. The data analyzed were on students in two German states, Hesse and Bremen, for a period of 3 years (2007–2009), collected shortly before the written exit exams were held. The state exit exams were introduced in these two school subjects in Hesse in 2007 but in Bremen only in 2008. This means that in Bremen, for the advanced courses in mathematics and English, the change from a class-based to a state-wide examination system can be examined in comparison to the stable state-wide examination system in Hesse.

Looking first at the analyses within Bremen, where the change in the examination system can be directly observed, as expected the results do not indicate any general effects but rather subject-specific effects (Baumert & Watermann, 2000); the analyses reveal clearer changes for English than for mathematics.

For mathematics, controlling for sex and cognitive ability test scores, no significant effects of the immediate introduction of state-wide exit exams were identified. At most there is a negative effect on students’ persistence. However, it is no longer evident in the second year of the implementation of the exams. In all other dimensions investigated there were no effects, or it was only in the 3-year comparison that there were significant positive changes. This was the case for students’ interest in the school subject mathematics and use of elaboration strategies.

The second step of the analysis, where both German states and the multilevel structure of the data were integrated, also yielded the finding that the direct introduction of state-wide exit exams in mathematics resulted in no specific short-term effect on students’ self-regulated learning.

This interpretation, that the implementation in mathematics can be judged as neutral or tending toward positive, is additionally reinforced by the findings in Hesse. Although in Hesse it was not the change but rather the first 3 years of the implementation of the exams that could be examined, the findings showing positive developments in students’ interest in the school subject, in scholastic self-efficacy, and in the use of elaboration strategies again tend to confirm the hypothesis (Baumert & Watermann, 2000) that there are productive changes in dimensions that are positively associated with school subject learning.

In English the results for Bremen show a significant positive effect of the introduction of state-wide exit exams on the use of elaboration strategies when preparing for the exams and on interest in the school subject. Whereas in the advanced course in mathematics no negative effects could be identified, in English the students in the state-wide exit examination system reported greater uncertainty about passing the exit exam than in the class-based examination system. These three effects were revealed explicitly in the change of the system from 2007 to 2008, with no additional changes evident from 2008 to 2009.

When the Hesse data and the multilevel structure of the data are taken into consideration, in particular the negative effect on uncertainty about passing the exit exam is confirmed with a probability of error of 5.3%. This supports the findings by Jürges et al. (Jürges & Schneider, 2010; Jürges et al., 2009) on the emotional burden of state-wide exit exams and contradicts the findings of Baumert and Watermann (2000). With regard to the effects of state-wide exit exams on cognitive or metacognitive regulation strategies, this study found no significant differences between the two German states. But we did find a tendency for year effects, which, for one, are positive and in line with the expectations (in that they increase the use of elaboration, monitoring, and planning strategies; see Baumert and Watermann (2000)). But for another, they also indicate to a small extent and only in trend an increase in the use of surface-level learning strategies. In coming years, it will be important to examine systematically whether this will lead in the long term to negative learning behaviors or to a further strengthening of self-regulated learning abilities in association with exam preparation. The analyses in Hesse, where it is not the change in the examination system but instead the first 3 years of the implementation that can be investigated, indicate that not only the immediate change but also the years thereafter can lead to changes, depending on what the experience with the state-wide exit exams was in the initial years.

In closing, it can be noted that no general effect of state-wide exit exams can be assumed and that, instead, the effects vary by school subject. This brings up the question as to how these differences between the school subjects can be explained. From a methodological point of view, the different sample sizes of students who took mathematics or English as their examination subject could be a way to understand these differences. Our analyses show that at least the in-state analyses in Bremen (Analyses 1) do not support this argument, because the effect sizes (Cohen’s d) are higher in the English than in the mathematics courses (interest in school subject: .256 vs. .083; elaboration strategies: .231 vs. .103; uncertainty: .215 vs. .136; memorization strategies: .180 vs. .148).

However, considering the effects in Analyses 2, the standardized regression coefficients for the English courses are only somewhat higher than for the mathematics courses (planning strategies: .062 vs. .019; monitoring strategies: .101 vs. .024; memorization strategies: .062 vs. .010), and the coefficient regarding the elaboration strategies is almost equal (.101 vs. .090). Therefore, the subject-specific effects reported for Analyses 2 might be partially related to the different sample sizes.

Additionally, content-specific or pedagogically specific explanations have to be considered. Baumert and Watermann (2000) found that standardizing effects can be assumed mainly in school subjects that are required for all students or are chosen frequently by students. The more selective a school subject is (for example, physics), the more likely it is that the effect of the form of organization of the exams will disappear. In the case of English and mathematics, both are frequently chosen, and they are in part required subjects. Therefore, further explanatory factors must be considered. It could be that mathematics had reached a relatively high degree of standardization in the conducting of high school exit exams already prior to the introduction of state-wide exit exams, which was not so much the case for English. In addition, the quality of instruction appears to have improved more in the advanced courses in English than in the advanced courses in mathematics (Maag Merki, 2011). But further studies are needed to test these theses, and they should also consider self-selection processes in the choice of examination subjects. It may be that subjects require different levels of competency in self-regulated learning in order to perform well. If students are able to anticipate the required level of self-regulated learning in the different subjects, self-selection processes may explain the different results better than content-specific differences or differences in terms of varying educational practices. In this case the differences are not correlated with the change of the examination system and with subject-specific fostering of the students’ competencies.

Alternatively, in line with previous studies (e.g., Artelt et al., 2003; Zimmerman & Martinez-Pons, 1990) the impact of other background variables on the development of self-regulated learning, particularly family background of the student, has to be explored to understand the subject-specific results.

The small effects, or often the absence of effects, tell us that the theoretical and normative cause and effect assumptions should be qualified. The introduction of state-wide exit exams alone does not always and automatically lead to more productive learning environments that find expression in a strengthening of self-regulated learning. Instead, it must be assumed that the cause and effect relationship is much more complex, and that in addition to direct factors, indirect effects in the multilevel system in particular need to be studied. Consequently, it will be worthwhile to conduct long-term analyses and structural equation modeling. Further, the different effects depending on students’ sex and cognitive ability test score indicate that group-specific studies on the effects of state-wide exit exams are needed. Up to now there have been only very few studies of that kind.

Finally, some limitations of the study have to be taken into account. First and foremost, we should be aware that, even though it was possible to analyze the immediate change of the exit examination system in Bremen, it would have been preferable to have a larger sample that also included long-term data from before the change occurred. Unfortunately, due to political conditions, this was not possible in this study. Furthermore, we analyzed these effects only in mathematics and English in two German states. Therefore, additional analyses have to be conducted in other subjects and states. Consequently, the presented results should not be generalized.

References

Amrein, A. L., & Berliner, D. C. (2002). High-stakes testing, uncertainty, and student learning. Education Policy Analysis Archives, 10(18). Retrieved from http://epaa.asu.edu/epaa/v2010n2018/.

Artelt, C., Baumert, J., Julius-McElvany, N., & Peschar, J. (2003). Learners for life: Student approaches to learning. Results from PISA 2000. Paris, France: OECD.

Baumert, J., Fend, H., O’Neil, H., & Peschar, J. L. (1998). Prepared for life-long learning: Frame of reference for the measurement of self-regulated learning as a cross curricular competence (CCC) in the PISA project. Paris, France: OECD.

Baumert, J., & Watermann, R. (2000). Institutionelle und regionale Variabilität und die Sicherung gemeinsamer Standards in der gymnasialen Oberstufe [Institutional and regional variability and assuring common standards in the Gymnasium upper level]. In J. Baumert, W. Bos, & R. Lehmann (Eds.), TIMSS/III. Dritte Internationale Mathematik- und Naturwissenschaftsstudie – Mathematische und naturwissenschaftliche Bildung am Ende der Schullaufbahn. Band 2. Mathematische und physikalische Kompetenzen am Ende der gymnasialen Oberstufe (pp. 317–372). Opladen, Germany: Leske + Budrich.

Baumert, J., Klieme, E., Neubrand, M., Prenzel, M., Schiefele, U., Schneider, W., et al. (2000). Fähigkeit zum selbstregulierten Lernen als fächerübergreifende Kompetenz [Self regulated learning as a cross-curriculum competence]. Berlin: PISA Projekt Consortium.

Bishop, J. H. (1997). The effect of national standards and curriculum-based exams on achievement. American Economic Review, 87, 260–264.

Bishop, J. H. (1999). Are national exit examinations important for educational efficiency? Swedish Economic Policy Review, 6, 349–398.

Catterall, J. S. (1989). Standards and school dropouts: A national study of tests required for high school graduation. American Journal of Education, 98(1), 1–34.

Earl, L., Nancy, T., & Sutherland, S. (2006). Changing secondary schools is hard. Lessons from 10 years of school improvement in the Manitoba School Improvement Program. In A. Harris & J. H. Chrispeels (Eds.), Improving schools and educational systems (pp. 109–128). New York, NY: Routledge.

Graham, J. W. (2009). Missing data analysis: Making it work in the real world. Annual Review of Psychology, 60, 549–576.

Grob, U., & Maag Merki, K. (2001). Überfachliche Kompetenzen. Theoretische Grundlegung und empirische Erprobung eines Indikatorensystems [Cross-curricular competencies. Theoretical foundation and empirical analysis of an indicator system]. Bern: Peter Lang.

Heller, K., & Perleth, C. (2000). Kognitiver Fähigkeitstest KFT 4-12+ R (für 4. bis 12. Klassen, Revision) [Cognitive ability test KFT 4-12+ R (for grades 4 to 12, revised)]. Göttingen, Germany: Beltz-Test GmbH.

Holme, J. J., Richards, M. P., Jimerson, J. B., & Cohen, R. W. (2010). Assessing the effects of high school exit examinations. Review of Educational Research, 80, 476–526.

Jerusalem, M., & Satow, L. (1999). Schulbezogene Selbstwirksamkeitserwartung [Scholastic self-efficacy belief]. In R. Schwarzer & M. Jerusalem (Eds.), Skalen zur Erfassung von Lehrer- und Schülermerkmalen. Dokumentation der psychometrischen Verfahren im Rahmen der wissenschaftlichen Begleitung des Modellversuchs ‘Selbstwirksame Schulen’ (pp. 15–16). Berlin, Germany: Freie Universität Berlin.

Jürges, H., & Schneider, K. (2010). Central exit examinations increase performance, but take the fun out of mathematics. Journal of Population Economics, 23.

Jürges, H., Schneider, K., & Büchel, F. (2005). The effect of central exit examinations on student achievement: Quasi-experimental evidence from TIMSS Germany. Journal of European Economic Association, 3, 1134–1155.

Jürges, H., Schneider, K., Senkbeil, M., & Carstensen, C. H. (2009). Assessment drives learning: The effect of central exams on curriculum knowledge and mathematical literacy. Munich, Germany: Ifo Institute for Economic Research.

Klein, E. D., Kühn, S. M., Van Ackeren, I., & Block, R. (2009). Wie zentral sind zentrale Prüfungen? – Abschlussprüfungen am Ende der Sekundarstufe II im nationalen und internationalen Vergleich [How central are central exit exams? Final exams at the end of the upper secondary level II in national and international comparison]. Zeitschrift für Pädagogik, 55, 596–621.

Ludtke, O., Robitzsch, A., Trautwein, U., & Koller, O. (2007). Umgang mit fehlendenWerten in der psychologischen Forschung. Probleme und Losungen [Handlingmissing values in psychological research: Problems and solutions]. PsychologischeRundschau, 58, 103–117.

Maag Merki, K. & Grob, U. (2002). Guiding principles of cantonal and intercantonalcompulsory school curricula in the context of evaluation research. In Rosenmund,M., Fries, A.-V. & Heller, W. (Hrsg.), Comparing Curriculum-Making Processes (pp.153–163). Bern: Peter Lang.

Maag Merki, K. (2011). The Introduction of State-wide Exit Examinations: EmpiricalEffects on Math and English Teaching in German Academically Oriented SecondarySchools. In M. A. Pereyra, H.-G. Kotthoff, & R. Cowen (Eds.), PISA under Examination:Changing Knowledge, Changing Tests, and Changing Schools (pp. 125–142). Rotter-dam: Sense Publishers.

Meyer, L. H., McClure, J., Walkey, F., Weir, K. F., & McKenzie, L. (2009). Secondary student motivation orientations and standards-based achievement outcomes. British Journal of Educational Psychology, 79, 273–293.

Miller, S., Heafner, T., & Massey, D. (2009). High-school teachers’ attempts to promote self-regulated learning: I may learn from you, yet how do I do it? The Urban Review, 41, 121–140.

OECD. (2010). PISA 2009 results: Learning to learn. Student engagement, strategies and practices (Volume III). Paris, France: OECD.

Oerke, B., & Maag Merki, K. (2009). Einfluss der Implementation zentraler Abiturprüfungen auf die leistungsbezogenen Attributionen von Schülerinnen und Schülern vor dem Abitur [The impact of the implementation of state-wide exit exams on student attributions before the exit exam]. In W. Böttcher, J. N. Dicke, & H. Ziegler (Eds.), Evidenzbasierte Bildung. Wirkungsevaluation in Bildungspolitik und pädagogischer Praxis (pp. 171–179). Münster: Waxmann.

Pedulla, J., Abrams, L. M., Madaus, G., Russell, M., Ramos, M., & Miao, J. (2003). Perceived effects of state-mandated testing programs on teaching and learning: Findings from a national survey of teachers. Chestnut Hill, MA: Boston College.

Dokumentation der Erhebungsinstrumente (Vol. 72). Berlin, Germany: Max-Planck-Institut für Bildungsforschung.

Raudenbush, S. W., Bryk, A. S., Cheong, Y. F., & Congdon, R. (2004). HLM 6: Hierarchical linear and nonlinear modeling. Lincolnwood, IL: Scientific Software International.

Richman, C. L., Brown, K., & Clark, M. (1987). Personality changes as a function of minimum competency test success or failure. Contemporary Educational Psychology, 12, 7–16.

Rubin, D. B. (1987). Multiple imputation for nonresponse in surveys. New York, NY: Wiley.

Ryan, K. E., Ryan, A. M., Arbuthnot, K., & Samuels, M. (2007). Students’ motivation for standardized math exams. Educational Researcher, 36(1), 5–13.

Ryan, R. M., & Sapp, A. (2005). Considering the impact of test-based reforms: A self-determination theory perspective on high stakes testing and student motivation and performance. Unterrichtswissenschaft, 33, 143–159.

Schunk, D. H. (2008). Metacognition, self-regulation, and self-regulated learning: Research recommendations. Educational Psychology Review, 20, 463–467.

Taylor, N. (2009). Standard-based accountability in South Africa. School Effectiveness and School Improvement, 20(3), 341–356.

Wößmann, L. (2003). Central exams as the currency of school systems: International evidence on the complementarity of school autonomy and central exams. DICE Report – Journal for Institutional Comparisons, 1, 46–56.

Zimmerman, B. J., & Martinez-Pons, M. (1990). Student differences in self-regulated learning: Relating grade, sex and giftedness to self-efficacy and strategy use. Journal of Educational Psychology, 82(1), 51–59.

Zimmerman, B. J., & Schunk, D. H. (Eds.). (2001). Self-regulated learning and academic achievement: Theoretical perspectives. Mahwah, NJ: Erlbaum.

Dr. Katharina Maag Merki is a full professor of education at the University of Zurich, focusing on theoretical and empirical studies of educational processes in schools. She is the president of the Swiss Society for Research in Education and a member of the Education Sciences Review Board of the German Research Foundation (DFG). Her main research interests are empirical educational research, school effectiveness and school improvement research, educational governance, and self-regulated learning.