Academic Self-Concept, Interest, Grades, and Standardized Test Scores:

Reciprocal Effects Models of Causal Ordering

Herbert W. Marsh
SELF Research Centre, University of Western Sydney

Ulrich Trautwein and Oliver Lüdtke
Max Planck Institute for Human Development, Berlin, Germany

Olaf Köller
University of Erlangen-Nuremberg, Germany

Jürgen Baumert
Max Planck Institute for Human Development, Berlin, Germany

Reciprocal effects models of longitudinal data show that academic self-concept is both a cause and an effect of achievement. In this study this model was extended to juxtapose self-concept with academic interest. Based on longitudinal data from 2 nationally representative samples of German 7th-grade students (Study 1: N = 5,649, M age = 13.4; Study 2: N = 2,264, M age = 13.7 years), prior self-concept significantly affected subsequent math interest, school grades, and standardized test scores, whereas prior math interest had only a small effect on subsequent math self-concept. Despite stereotypic gender differences in means, linkages relating these constructs were invariant over gender. These results demonstrate the positive effects of academic self-concept on a variety of academic outcomes and integrate self-concept with the developmental motivation literature.

Academic self-concept, interest, and achievement are interrelated, and stereotypic gender differences are found in specific domains such as English and mathematics. In the present investigation we went beyond merely observing correlations at a single point in time to attempt to disentangle the causal mechanisms relating these constructs across multiple waves of data collection. In a growing body of research covering a range of developmental periods, researchers have used reciprocal effects models to explore the causal ordering of academic achievement and academic self-concept. The overarching rationale of this work is that people who perceive themselves to be more effective, more confident, and more able will accomplish more than people who have less positive self-beliefs (e.g., Marsh & Craven, in press).

Unlike prior research that has focused on academic self-concept as a causal factor, we also examined effects of academic interest, thus aligning our interests more closely to mainstream motivation research in developmental psychology.

Specifically, we focused on the role of gender in self-concept, interest, and achievement in mathematics. There are substantial gender differences in mean levels of these constructs (Beaton et al., 1996; Köller, Baumert, & Schnabel, 2001; Marsh & Yeung, 1998; Watt, 2004). However, a more complicated question is the extent to which the relations among these constructs vary with gender over time. Thus, for example, are high levels of prior math self-concept and math interest more likely to lead to higher levels of subsequent attainment for girls or for boys? In our research we integrated these issues from different research traditions into a common methodological framework of structural equation modeling (SEM) that has broad applicability in developmental research.

© 2005 by the Society for Research in Child Development, Inc. All rights reserved. 0009-3920/2005/7602-0007

These data come from two large-scale German projects directed by Jürgen Baumert of the Max Planck Institute for Human Development: Learning Processes, Educational Careers and Psychosocial Development in Adolescence and the German component of the Third International Mathematics and Science Study. The present investigation was conducted while Herbert Marsh was a visiting scholar at the Center for Educational Research at the Max Planck Institute for Human Development and was supported in part by the University of Western Sydney, the Max Planck Institute, and the Australian Research Council.

Correspondence concerning this article should be addressed to Herbert W. Marsh, Director, SELF Research Centre, University of Western Sydney, Bankstown Campus, Locked Bag 1797 Penrith South DC NSW 1797, Australia. Electronic mail may be sent to [email protected].

Child Development, March/April 2005, Volume 76, Number 2, Pages 397–416

Development of Academic Self-Concept and Its Relation to Achievement

Developmental perspectives of self-concept. Self-concept, self-perceived competence, self-beliefs, and the role of gender are important in developmental perspectives of motivation such as expectancy-value theory. Consistent themes have emerged from reviews of the development of competence self-beliefs (Harter, 1990, 1992, 1998; Jacobs, Lanza, Osgood, Eccles, & Wigfield, 2002; Marsh, 1989; Marsh, Craven, & Debus, 1991, 1998; Marsh, Debus, & Bornholt, 2005; Watt, 2004; Wigfield, 1994; Wigfield & Eccles, 1992). With improved methodology (better measurement, stronger applications of confirmatory factor analyses [CFA]), researchers have demonstrated that even very young children are able to differentiate between different domains of self-concept (e.g., verbal, mathematics, physical ability, physical appearance, peer relations, relations with parents). There is clear evidence for increasing differentiation among these domains through age 12 (Marsh, 1989; Marsh & Ayotte, 2003), but not for older children (Marsh, 1989).

Age and gender differences in mean levels of self-concept are generally small but systematic. Self-concept declines from a young age through adolescence, levels out, and then increases at least through early adulthood (Marsh, 1989, 1993b; see also Crain, 1996; Jacobs et al., 2002; Marsh & Craven, 1997; Wigfield et al., 1997). There are also counterbalancing gender differences consistent with gender stereotypes. Consistent across preadolescent, adolescent, and late-adolescent/young adult periods, males report higher physical ability, physical appearance, and math self-concepts, whereas females report higher verbal self-concepts (Marsh, 1989; see also Crain, 1996; Wigfield et al., 1997). Contrary to gender intensification hypotheses, gender differences did not vary substantially with age. Based on longitudinal growth trajectories of children in Grades 1 through 12, Jacobs et al. (2002) reported gender stereotypic differences and age-related declines in competence perceptions but concluded that their results were broadly consistent with Marsh's (1993b) findings of no age-related changes in gender differences in self-concept.

Most self-concept studies have focused on gender and age differences in mean levels of self-concept but not on factor structure differences, including relations among key constructs. Byrne and Shavelson (1987), for example, concluded, "Clearly, interpretations of mean differences in SC [self-concept] between males and females are problematic unless the underlying construct has the same structure in the two groups" (p. 369). Hattie (1992) also emphasized that "the differences in means may not be as critical in the development of self-concept as changes in factor structure" (pp. 177–178). Testing how relations among these constructs vary with gender and age is even more complicated. Thus, for example, Marsh (1993b) tested the gender-stereotypic model that hypothesized that: (a) math self-concept would be more highly correlated with academic self-concept and global self-esteem for boys than for girls, (b) verbal self-concept would be more highly correlated with academic self-concept and global self-esteem for girls than for boys, and (c) the contrasting pattern of results would intensify and increase with age. Instead, however, he found support for the gender-invariant model in which relations among math, verbal, academic, and general self-concepts did not vary as a function of gender or age. More recently, Watt (2004; see also Jacobs et al., 2002) demonstrated that gender differences, favoring boys in math and girls in English, provided little support for either gender-intensification or gender-convergence hypotheses.

Academic self-concept and achievement: A reciprocal effects model. The causal ordering of academic self-concept and academic achievement has important theoretical and practical implications, and has been the focus of considerable research. Byrne (1996) emphasized that much of the interest in the self-concept/achievement relation stems from the belief that academic self-concept has motivational properties such that changes in academic self-concept will lead to changes in subsequent academic achievement. Calsyn and Kenny (1977) contrasted self-enhancement and skill development models. According to the self-enhancement model, academic self-concept is a primary determinant of academic achievement (ASC → ACH), whereas the skill development model implies that academic self-concept emerges principally as a consequence of academic achievement (ACH → ASC). However, Marsh and colleagues (Marsh, 1990, 1993a; Marsh, Byrne, & Yeung, 1999; Marsh & Craven, in press) argued that much of the early research was methodologically unsound and inconsistent with academic self-concept theory. Based on theory, a review of empirical research, and methodological advances in SEM, they argued for a reciprocal effects model in which prior self-concept affects subsequent achievement and prior achievement affects subsequent self-concept. In their meta-analysis of self-belief measures, Valentine, Dubois, and Cooper (2004) also found clear support for a reciprocal effects model. They concluded that the effects of self-beliefs on subsequent performance were stronger when the measure of self-belief was based on domain-specific measures rather than global measures, such as self-esteem, and when self-belief and achievement measures were matched in terms of subject area (e.g., mathematics achievement and math self-concept), as is typical in self-concept research.
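As an illustrative sketch only (not the specification estimated in this article), the contrast between the three models can be written as a two-wave cross-lagged system. The subscripts 1 and 2 for measurement waves, the path coefficients β, and the residuals ζ are notation assumed here for exposition; the block uses the `aligned` environment and so assumes the amsmath package.

```latex
\begin{aligned}
\mathrm{ASC}_{2} &= \beta_{11}\,\mathrm{ASC}_{1} + \beta_{12}\,\mathrm{ACH}_{1} + \zeta_{1},\\
\mathrm{ACH}_{2} &= \beta_{21}\,\mathrm{ASC}_{1} + \beta_{22}\,\mathrm{ACH}_{1} + \zeta_{2}.
\end{aligned}
```

In this notation, β11 and β22 are stability paths. A pure self-enhancement model corresponds to a nonzero path from prior self-concept to later achievement (β21) with β12 = 0, a pure skill development model to a nonzero path from prior achievement to later self-concept (β12) with β21 = 0, and the reciprocal effects model to both cross-lagged paths being nonzero.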

Academic achievement: School grades and standardized test scores. Academic self-concept, interest, and related motivation constructs should be substantially correlated with both school grades and standardized test scores. However, Wylie (1979) posited that self-concept should be more strongly related to school grades than to test scores because school grades are a more salient source of feedback to students that also reflect motivational properties likely to be related to self-concept (see also Hattie, 1992; Marsh, 1987, 1990, 1993a). Marsh (1987) extended this proposal to longitudinal causal modeling studies, suggesting that paths from self-concept to achievement should be stronger for school-based performance measures than for low-stakes standardized achievement measures. For low-stakes standardized tests, students have no opportunity and little incentive to study for the tests. Hence, characteristics such as study habits, effort, and persistence are unlikely to affect test performance. In contrast, these characteristics are likely to have more impact on examination performance when students are highly motivated to perform well on an examination and know the content of the examination, when these characteristics are an actual part of the grading process, as is typically the case with school grades (e.g., students are penalized for sloppy work habits or not completing assignments on time but are rewarded for conscientious effort). Thus, the effects of prior self-concept on subsequent achievement should be stronger when achievement is based on high-stakes school grades rather than low-stakes standardized tests (Marsh, 1987, 1990, 1993a; Marsh & Yeung, 1997a, 1998; but see also Helmke & van Aken, 1995). Here, we extend this hypothesis to include academic interest and evaluate whether this pattern of results generalizes over responses by boys and girls. This distinction between school grades and test scores is also relevant to the study of gender differences, as girls typically do better than boys on school grades, which reinforce conscientious effort and penalize poor work habits, compared with standardized test scores, which are purer measures of learning (e.g., Marsh & Yeung, 1998).

Developmental perspectives on the reciprocal effects model. Young children's understanding of competence changes with age and, compared with older children, their academic self-concepts are more positive and less related to objective outcomes (e.g., Marsh, 1989; Marsh & Craven, 1997; Marsh et al., 1998). Wigfield and Karpathian (1991; see also Wigfield, 1994) further argued that "once ability perceptions are more firmly established the relation likely becomes reciprocal: Students with high perceptions of ability would approach new tasks with confidence, and success on those tasks is likely to bolster their confidence in their ability" (p. 255). Consistent with these suggestions, Skaalvik and Hagtvet (1990) found support for a reciprocal effects model for older students (sixth and seventh grades) but a skill development (ACH → ASC) model for younger students. Marsh et al. (1999) also argued that although relations between academic self-concept and achievement become stronger with age, there was insufficient evidence to determine whether the causal relations between these variables change with age or whether differences reflect underlying processes or researchers' inability to measure these constructs with young children (see Marsh et al., 2005).

Guay, Marsh, and Boivin (2003) took up this question, using a multicohort–multioccasion design (i.e., three age cohorts: students in Grades 2, 3, and 4, each with three measurement waves separated by 1-year intervals). They found that as children grew older, their academic self-concept responses became more reliable, more stable, and more strongly correlated with academic achievement. However, the magnitude of these developmental differences was small. Importantly, there was stronger support for the self-enhancement model (ASC → ACH) than for the skill development model (ACH → ASC) for all three age cohorts, and support for the reciprocal effects model was invariant over age. This study provides good support for the generalizability of reciprocal effects to young children as well as adolescents.

Academic Interest and Achievement

Individual interest is hypothesized to be a relatively enduring predisposition to attend to certain objects and activities, and is associated with positive affect, persistence, and learning (Hidi & Ainley, 2002; Köller et al., 2001; Krapp, 2000; Renninger, 2000). Academic interests are postulated to be dispositions based on mental schemata associating the objects of interest with positive experiences and a personal value system that are activated in the form of interest-driven actions. Whereas there is a theoretical distinction between the value (affective) and commitment (importance) components of interest, researchers have been unable to distinguish between these components empirically (Köller et al., 2001). Interest-driven activities are characterized by the experience of competence and personal control; feelings of autonomy and self-determination; positive emotional states; and, under optimal circumstances, an experience of flow whereby the person and the object of interest merge (Csikszentmihalyi & Schiefele, 1993). Other motivational researchers (e.g., Wigfield & Eccles, 1992) posit interest as one of the components of task value in an expectancy-value framework.

Like academic self-concept, academic interest is domain specific; there are stereotypic gender differences such that boys have more interest in math and science whereas girls have more interest in verbal areas, and there are gradual declines in interest levels before adolescence and in early adolescence (e.g., Eccles, Wigfield, & Schiefele, 1998). It is surprising, however, that there is little research incorporating both academic self-concept and academic interest into longitudinal SEMs evaluating the reciprocal effects of these constructs on each other and on other academic outcomes.

On the basis of their meta-analysis, Schiefele, Krapp, and Winteler (1992) concluded that the overall correlation between interest and academic achievement was about .30 but that this relation was heterogeneous across different school subjects and indicators of achievement. In subsequent experimental studies, Schiefele (1996) demonstrated that interest was a significant predictor of subsequent achievement, mediated in part by activation; that is, interest increased activation, which in turn led to greater achievement. However, because most studies in this area are cross-sectional studies based on correlations, Schiefele (1998) concluded that there is no basis for drawing causal conclusions from this research or even for claiming that interest predicts subsequent achievement beyond what can be predicted by prior achievement.

Several authors have proposed that academic achievement or academic self-concept affect interest (e.g., Köller et al., 2001; Krapp, 2000). Marsh, Craven, and Debus (2000) demonstrated that cognitive and affective self-perceptions were highly correlated. In her theoretical model of self-concept development, Harter (1992, 1998) posited that students feel more intrinsically motivated in domains in which they feel competent. However, because her results were based on cross-sectional data, stronger tests of her causal ordering hypothesis require longitudinal designs like those in tests of the reciprocal effects model. In cognitive evaluation theory, Deci and Ryan (1985) also hypothesized that increased perceptions of competency lead to increased levels of intrinsic motivation. Hence, Baumert, Schnabel, and Lehrke (1998) suggested that the effect of achievement on interest might be mediated through academic self-concept. Based on responses by elementary school children, Bouffard, Marcoux, Vezeau, and Bordeleau (2003) reported that self-concept was consistently related to achievement in reading and mathematics at each year in school, whereas intrinsic motivation did not contribute to the prediction of achievement.

In her original expectancy-value model, Eccles (1983) posited no links between expectations of success and task value (including interest). However, she hypothesized academic self-concept to affect both expectations and value directly, and to affect achievement-related choices indirectly through its influences on expectations and value. Complicating these predictions further, subsequent research (Eccles & Wigfield, 1995; Wigfield & Eccles, 2002) indicated that academic self-concept and expectations for success could not be distinguished empirically. Putting together these different perspectives, expectancy-value theory posits academic self-concept to have a causal effect on both academic interest and achievement, and academic interest to have an effect on academic achievement. However, reciprocal effects in which prior achievement also affects subsequent interest and self-concept (mediated, perhaps, by other constructs such as causal attributions) are also apparently consistent with expectancy-value theory. Expectancy-value theory apparently does not, however, posit a direct effect of interest on self-concept.

In empirical research based on expectancy-value theory, Eccles, Wigfield, and colleagues (Eccles, 1983; Wigfield, 1994; Wigfield et al., 1997) showed that correlations between self-perceived competency and interest were evident for even very young children but that the size of this relation increased with age during early school years. Although self-perceived competence was related to several different value constructs in the expectancy-value model, the relations with interest were consistently strongest (Wigfield & Eccles, 2002). In particular, Wigfield et al. (1997; see also Wigfield & Eccles, 2002) evaluated patterns of relations between competence and interest with a multiwave–multicohort study for children ranging from second to sixth grades. Whereas competence perceptions were linked over time, as were interest ratings, there were few cross-construct links relating these two constructs over time. Where these links did occur, they tended to be from prior competence perceptions to subsequent interest, thus supporting expectancy-value predictions. For longitudinal growth trajectories of children in Grades 1 through 12, Jacobs et al. (2002) found positive relations between competency beliefs and task values that generalized over domains and age. Consistent with the expectancy-value assumption that competence causes task value, much of the variance in task values was explained by competence perceptions. However, Jacobs et al. also noted there might be reciprocal (bidirectional) effects between these two constructs over time and argued for longitudinal research that simultaneously considered competency beliefs and task values. Hence, a primary aim of the present investigation was to address this limitation in existing research.

Köller et al. (2001) argued that the role of interest was particularly relevant in mathematics because it is perceived to be a difficult subject; thus, motivational factors are important for enhancing academic achievement. In longitudinal research covering the high school years (Grades 7, 10, and 12), mathematics interest in Grade 7 had no direct effect on achievement in Grade 10. However, interest did have an effect on coursework selection, which in turn had an effect on achievement in Grade 12. Köller et al. also found that interest in Grade 10 did have a direct effect on achievement in Grade 12, suggesting that interest became more important in later school years, when the learning environment was not so highly structured and intrinsic motivation played a more important role in academic choice. The results may also be consistent with Wigfield's (1994; Wigfield & Eccles, 2002) conclusion that whereas prior competence perceptions are strong predictors of subsequent achievement, task values such as interest are the better predictors of decisions to enroll in mathematics and English classes.

The Present Investigation

Despite the substantial overlap in the historical development and juxtaposition of theoretical issues, there have been few longitudinal causal-ordering studies of relations among academic self-concept, academic interest, and academic achievement, and how these relations vary with gender. Marsh et al. (1999) noted methodological limitations of tests of the reciprocal effects model in self-concept research. Particularly relevant to the present investigation, they encouraged researchers to: (a) explore how effects associated with school grades and standardized achievement test scores differed by contrasting both achievement constructs in the same study, (b) incorporate additional motivational variables in these SEMs to determine their role in the reciprocal effects model, (c) consider a sufficiently large and diverse sample to justify the use of SEMs, (d) evaluate the generality of the findings across different subgroups of respondents based on characteristics such as gender, and (e) explore the implications of lags between different waves of data collection that varied in length of time, particularly those within a single academic year against those that span more than one academic school year. Reciprocal effects between academic self-concept and achievement have been established, but this research has not included academic interest even though these constructs have been juxtaposed in several theories. Hence, it is important to evaluate the reciprocal effects among constructs and academic achievement, using longitudinal designs and tests of causal ordering (Jacobs et al., 2002). In the present investigation we pursued these theoretically important issues in two large studies of math self-concept, interest, school grades, and standardized test scores for nationally representative samples of German Grade 7 students. Study 1 was based on one cohort of students using two waves of data collected in a single school year. Study 2 was based on a different cohort of Grade 7 students, with data collected approximately 3 years later; it comprised two waves spanning two different school years and included both the interest measure used in Study 1 and a new, potentially stronger measure of interest.

Study 1

Method

Data. Study 1 is based on the longitudinal study Learning Processes, Educational Careers and Psychosocial Development in Adolescence and Young Adulthood, which was conducted by the Max Planck Institute for Human Development in Berlin, Germany. Data were collected from large, representative samples from four German states in which schools were selected randomly and two classes within each school were sampled randomly. Most students (93%) were German citizens and most were Caucasian (95%). Because the sample was representative, it was heterogeneous in relation to socioeconomic status. The final sample used here consists of 5,649 seventh graders (M age = 13.4 years, 54% females) who were tested at two points during the same school year. Classes were excluded that participated in the study on only one of the two occasions or that had fewer than 10 responding students. (For more detailed descriptions of the study and resulting database, see Köller, 1998; see also http://www.biju.mpg.de).

Measures. Math self-concept and math interest were measured on each occasion (see the Appendix for the wording of the items). Scores on both measures were reliable (αs > .8) and have been shown in previous research to have convergent and discriminant validity in relation to classroom-based performance in different school subjects (Baumert et al., 1998). Standardized achievement in mathematics was measured at Time 1 (T1) and Time 2 (T2). Math achievement test items were taken from previous national and international studies, in particular the International Association for the Evaluation of Educational Achievement (IEA) First and Second International Mathematics Study. Although most of the items had a multiple-choice format, approximately 15% had a short-answer format. The content of the test items was based on mathematics curricula covered in Grades 5 to 7. Previous analyses (see Köller, 1998) based on item response theory revealed that a unidimensional model was appropriate for describing the latent variable underlying the test results. Therefore, we used the total mathematics scores at both points of measurement. The coefficient alpha estimates of reliability were greater than .75 for both measurement points. We characterized the test as a low-stakes test because the results did not contribute to the formal evaluation of individual students or school grades, students did not expect to receive feedback on their individual performance, and students had no incentive to study for the examination. School grades in mathematics were based on self-reports at the end of sixth grade (but reported at the start of seventh grade) and in the middle of seventh grade (T2).
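For readers less familiar with the coefficient alpha reliabilities reported above, the following is a minimal sketch of the standard formula in Python. It is an illustration under stated assumptions, not the procedure used in the study: the function name `cronbach_alpha` and the `example` responses are hypothetical, and the data are invented for demonstration.

```python
from statistics import variance


def cronbach_alpha(responses):
    """Coefficient alpha for a list of respondents, each a list of k item scores."""
    k = len(responses[0])
    # Sample variance of each item across respondents.
    item_variances = [variance([row[j] for row in responses]) for j in range(k)]
    # Sample variance of the summed scale score.
    total_variance = variance([sum(row) for row in responses])
    # alpha = k/(k-1) * (1 - sum of item variances / variance of total score)
    return (k / (k - 1)) * (1 - sum(item_variances) / total_variance)


# Hypothetical responses of five students to a three-item scale (not study data).
example = [[4, 5, 4], [2, 3, 2], [5, 5, 4], [1, 2, 2], [3, 3, 3]]
alpha = cronbach_alpha(example)
```

Alpha approaches 1 as the items covary more strongly relative to their individual variances, which is why scales with few but highly consistent items can still reach the thresholds cited here.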

Statistical analysis. SEMs were estimated with LISREL (version 8.54) using maximum likelihood estimation (for further discussion of SEM, see Bollen, 1989; Byrne, 1998; Jöreskog & Sörbom, 1993; Kaplan, 2000). Following recommendations by Marsh and Hau (1996; see also earlier discussion by Jöreskog, 1979), correlated uniquenesses were included for the matching items collected at T1 and T2 because their exclusion would positively bias the corresponding correlation estimates. Their inclusion, however, had no substantively important effect on the pattern of parameter estimates, suggesting that the inclusion of correlated uniquenesses was not a critical issue. To simplify the presentation of the results, only the models with correlated uniquenesses are presented.

In most educational research based on school settings, individual student characteristics are potentially confounded with those associated with classes or schools because individuals are not assigned randomly to groups (Raudenbush & Bryk, 2002). Our data had a multilevel or hierarchical structure in that students were nested within classes. Because students within the same class were likely to be more homogeneous than a truly random sample of students, standard errors of parameter estimates were likely to be biased in the direction of being too small and to result in inflated levels of Type I errors (i.e., false positives). Fortunately, this problem is typically less serious for parameter estimates based on relations between variables than means and typically not a serious problem for self-concept data where there is little variation among classes (e.g., Marsh, Hau, & Kong, 2002; Marsh & Rowe, 1996). Consistent with this previous research, preliminary analyses in the present investigation indicated that variance components associated with the self-concept and interest scores were very small, varying between .05 and .10 in both Studies 1 and 2. In the present investigation, we dealt with this problem by constructing a pooled within-class covariance matrix in which between-class differences were controlled. We did this by centering the means of all variables at the mean of the class from which the case came (i.e., computing the deviation between the raw score and the corresponding class mean for that score; for further discussion, see Goldstein, 2003; Raudenbush & Bryk, 2002).
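The class-mean centering described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the authors' procedure: the function names, the toy scores, and the use of N minus the number of classes as the divisor are all hypothetical, and missing values are not handled.

```python
from collections import defaultdict


def center_within_class(scores, class_ids):
    """Subtract each student's class mean from the raw score."""
    totals = defaultdict(float)
    counts = defaultdict(int)
    for score, cid in zip(scores, class_ids):
        totals[cid] += score
        counts[cid] += 1
    means = {cid: totals[cid] / counts[cid] for cid in totals}
    return [score - means[cid] for score, cid in zip(scores, class_ids)]


def pooled_within_covariance(x, y, class_ids):
    """Covariance of class-centered x and y.

    Assumption: the divisor is N minus the number of classes, one common
    convention for a pooled within-group covariance.
    """
    xc = center_within_class(x, class_ids)
    yc = center_within_class(y, class_ids)
    n_classes = len(set(class_ids))
    return sum(a * b for a, b in zip(xc, yc)) / (len(x) - n_classes)


# Hypothetical toy data: two classes of three students each.
class_ids = ["A", "A", "A", "B", "B", "B"]
self_concept = [2.0, 3.0, 4.0, 4.0, 5.0, 6.0]
interest = [1.0, 2.0, 3.0, 5.0, 6.0, 7.0]

centered = center_within_class(self_concept, class_ids)
within_cov = pooled_within_covariance(self_concept, interest, class_ids)
```

In the toy data the two classes differ in their means, but after centering the scores are expressed purely as deviations from the class mean, so between-class differences no longer contribute to the covariance.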

Particularly for longitudinal data, the inevitable missing data are a potentially important problem. In Study 1, for example, approximately 12% of the responses were missing for the total sample of students who responded at either T1 or T2. In the methodological literature on missing data (e.g., Graham & Hoffer, 2000; Little & Rubin, 1987), there is a growing consensus that the imputation of missing observations has several advantages over traditional listwise, and particularly pairwise, deletion methods. In the present investigation, we explored a variety of alternative approaches to this problem but chose to emphasize results based on the full information maximum likelihood (FIML) approach to missing data (e.g., Allison, 2001). This approach better represents the entire sample rather than just the subsample of students who have no missing data while still providing appropriate tests of statistical significance that reflect the amount of missing data for each variable. Whereas this distinction may be less important for studies based on samples of convenience, it is important for samples specifically selected to be representative of a larger population, as in the present investigation. For FIML analyses, LISREL provides only the root mean square error of approximation (RMSEA) to evaluate goodness of fit. Whereas tests of statistical significance and indexes of fit aid in evaluating the fit of a model, there is ultimately a degree of subjectivity and professional judgment in the selection of a "best" model (Marsh, Balla, & McDonald, 1988). Hence, our main focus is on the evaluation of parameter estimates. Although we argue a priori for the superiority of the FIML approach that is the focus of our presentation, it is important to emphasize that the results from the FIML analyses were very similar to unreported analyses based on imputation with expectation maximization, as well as to other approaches that we explored. In particular, both analyses resulted in fully proper solutions, well-defined factors with substantial factor loadings, standardized parameter estimates that were very similar, and satisfactory goodness of fit.
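To make the FIML idea concrete, the sketch below evaluates a casewise log-likelihood for two normally distributed variables, letting each student contribute only through the variables actually observed. It is a simplified illustration and not LISREL's implementation; the function names, parameter values, and data rows are hypothetical.

```python
import math


def normal_logpdf(x, mean, var):
    """Log density of a univariate normal distribution."""
    return -0.5 * (math.log(2 * math.pi * var) + (x - mean) ** 2 / var)


def bivariate_logpdf(x, y, mu, cov):
    """Log density of a bivariate normal with mean mu and 2x2 covariance cov."""
    (sxx, sxy), (_, syy) = cov
    det = sxx * syy - sxy ** 2
    dx, dy = x - mu[0], y - mu[1]
    quad = (syy * dx * dx - 2 * sxy * dx * dy + sxx * dy * dy) / det
    return -0.5 * (2 * math.log(2 * math.pi) + math.log(det) + quad)


def fiml_loglik(rows, mu, cov):
    """Sum casewise log-likelihoods, using only the observed variables per case."""
    total = 0.0
    for x, y in rows:
        if x is not None and y is not None:
            total += bivariate_logpdf(x, y, mu, cov)
        elif x is not None:
            total += normal_logpdf(x, mu[0], cov[0][0])
        elif y is not None:
            total += normal_logpdf(y, mu[1], cov[1][1])
        # A case with both values missing contributes nothing.
    return total


# Hypothetical data: None marks a missing response.
rows = [(0.5, None), (None, -1.0), (1.0, 2.0)]
mu = (0.0, 0.0)
identity = [[1.0, 0.0], [0.0, 1.0]]
ll = fiml_loglik(rows, mu, identity)
```

An estimation routine would maximize this quantity over the model parameters. The point of the sketch is only that no case is dropped and no value is imputed: incomplete cases still inform the parameters of the variables they do report.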

In CFA studies with multiple groups, it is possible to test the invariance of any one, any set, or all parameter estimates across the multiple groups. Tests of factorial invariance (see Bollen, 1989; Byrne, 1998; Jöreskog & Sörbom, 1993; Marsh, 1994) traditionally posit a series of nested models in which the endpoints are the least restrictive model with no invariance constraints and the most restrictive (total invariance) model with all parameters constrained to be the same across all groups. Testing for factor invariance essentially involves comparing a number of models in which aspects of the factor structure are systematically held invariant across groups (males and females in the present investigation), and assessing fit indexes when elements of these structures are constrained. If the introduction of increasingly stringent invariance constraints results in little or no change in goodness of fit, there is evidence in support of the invariance of the factor structure. In general, the minimal condition for factorial invariance is the equivalence of all factor loadings in the multiple groups (e.g., Bollen, 1989; Byrne, 1998; Jöreskog & Sörbom, 1993; Marsh, 1994), but our main focus is on the invariance of correlations among the latent constructs and path coefficients relating T1 constructs to T2 constructs.

Within this framework, it is also possible to extend tests of invariance of mean and covariance structures to include measured variable intercepts and latent mean differences in the constructs (Byrne, 1998; Kaplan, 2000; Jöreskog & Sörbom, 1993; Marsh & Grayson, 1994). Adapting terminology from item response theory (Marsh & Grayson, 1994), each measured variable (t) is related to the latent construct (T) by the equation t = a + bT, where b is the slope (or discrimination) parameter that reflects how changes in the observed variable are related to changes in the latent construct and a is the intercept (or difficulty) parameter that reflects the ease or difficulty of getting high manifest scores for a particular measured variable. Unless there is complete or at least partial invariance of both the a and b parameters across the multiple groups, the comparison of mean differences across the groups may be unwarranted. Following Meredith (1993), we adapt the terms strong and strict invariance. Strong invariance holds when factor loadings and measured variable intercepts are invariant across groups so that between-group differences in average item scores reflect differences in latent means. Strict invariance holds when measured variable uniquenesses also are invariant across groups so that item and scale variances are comparable across groups. Although latent mean differences are typically tested with CFAs, it is also possible to test for differences with an SEM in which subsequent mean differences are corrected for differences in variables occurring earlier in the causal ordering of longitudinal data (Marsh & Grayson, 1990). In the present investigation, for example, we tested for latent mean differences between responses by boys and girls in four constructs (self-concept, interest, grades, test scores) at T1 and T2 in a CFA model, and then evaluated differences at T2 controlling for differences at T1 in an SEM.
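The role of the intercept and slope in the equation t = a + bT can be shown with a small numerical sketch. All values below are hypothetical and chosen only for illustration; the function name is an assumption, not part of any SEM package.

```python
def expected_item_mean(intercept, loading, latent_mean):
    """Model-implied mean of a measured variable: a + b * (latent mean)."""
    return intercept + loading * latent_mean


# Case 1: intercept (a) and loading (b) are invariant across groups, so the
# observed gap is just the loading times the latent mean difference.
boys_invariant = expected_item_mean(3.0, 0.8, 0.5)
girls_invariant = expected_item_mean(3.0, 0.8, 0.0)
gap_invariant = boys_invariant - girls_invariant

# Case 2: latent means are identical, but the intercept differs by group.
# The item still shows an observed gap, which would be misread as a
# difference in the construct if intercept invariance were simply assumed.
boys_biased = expected_item_mean(3.3, 0.8, 0.0)
girls_biased = expected_item_mean(3.0, 0.8, 0.0)
gap_biased = boys_biased - girls_biased
```

The second case is why intercept invariance is checked before latent means are compared: an observed item difference can arise with no difference at all in the underlying construct.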

Results and Discussion

We began by evaluating the complete SEM that included T1 and T2 measures of math self-concept, interest, school grades, and test scores (see Table 1). Math self-concept and interest were both positively correlated with math grades and test scores (Table 1). However, consistent with a priori predictions, both self-concept and interest were systematically more highly correlated with school grades (about .40 for self-concept and .22 for interest) than with standardized test scores (about .30 for self-concept and .15 for interest). It is also important to note that correlations of self-concept with achievement were more positive than the corresponding correlations between interest and achievement. Finally, the pattern of these results and actual sizes of the correlations were consistent across T1 and T2 responses.

In the evaluation of path coefficients relating T1 and T2 constructs, we juxtaposed the results of several models. In each case we began with the overall model that contained all four constructs (self-concept, interest, grades, test scores). However, because there were positive correlations among all four constructs, multicollinearity might obscure the pattern of results. Thus, for example, it would be possible for both self-concept and interest to have significant effects on achievement when considered separately but for the effects of neither construct to be statistically significant when both are considered simultaneously (i.e., the unique effect of neither is significant when the effects of each are controlled for the effects of the other). For this reason, we also conducted a series of supplemental models in which we evaluated the causal ordering among various pairs of constructs (see Table 2 and Figure 1) that are more like traditional causal models used in previous reciprocal effects research.
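The difference between a unique effect in the overall model and an effect in a pairwise model can be illustrated with the rounded factor correlations reported in Table 1. The sketch below is not the LISREL estimation: it simply solves the normal equations for standardized coefficients, so its output only approximates the reported paths to T2 math grades. The helper name `solve` and the variable names are hypothetical.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gauss-Jordan elimination."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        # Partial pivoting: move the largest remaining entry onto the diagonal.
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [v - factor * p for v, p in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


# T1 factor correlations from Table 1 (order: MASC, MInt, MGrd, MTst).
r_t1 = [
    [1.00, 0.56, 0.41, 0.32],
    [0.56, 1.00, 0.22, 0.15],
    [0.41, 0.22, 1.00, 0.35],
    [0.32, 0.15, 0.35, 1.00],
]
# Correlations of T2 math grades with the four T1 constructs (Table 1).
r_t2_grades = [0.44, 0.25, 0.50, 0.35]

# Overall model: each T1 construct controlled for the other three.
full_model = solve(r_t1, r_t2_grades)

# Pairwise model: T1 self-concept and T1 grades only.
pair_model = solve([[1.00, 0.41], [0.41, 1.00]], [0.44, 0.50])
```

Under these assumptions the self-concept coefficient is roughly .24 in the four-predictor solution and roughly .28 in the two-predictor solution, which mirrors the way effects shrink when all four correlated constructs are controlled simultaneously.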

Math self-concept and achievement. Consistent with a priori predictions and previous research, reciprocal


Table 1
Factor Solution Relating Academic Self-Concept, Interest, School Grades, and Test Scores at Times 1 and 2 in Study 1: Full Information Maximum Likelihood Estimation

           Time 1 constructs               Time 2 constructs
           MASC    MInt    MGrd    MTst    MASC    MInt    MGrd    MTst

Factor loadings
T1MASC1    .63a
T1MASC2    .77***
T1MASC3    .80***
T1MASC4    .63***
T1MASC5    .80***
T1MINT1            .62a
T1MINT2            .75***
T1MINT3            .80***
T1MINT4            .62***
T1MGrade                   1.00a
T1MTst                             1.00a
T2MASC1                                    .72a
T2MASC2                                    .84***
T2MASC3                                    .86***
T2MASC4                                    .73***
T2MASC5                                    .84***
T2MINT1                                            .65a
T2MINT2                                            .78***
T2MINT3                                            .82***
T2MINT4                                            .65***
T2MGrade                                                   1.00a
T2MTst                                                             1.00a

Path coefficients
T2MASC     .57***  .04*    .03     .06***
T2MInt     .07***  .55***  .00     .02
T2MGrd     .24***  .01     .35***  .15***
T2MTst     .09***  .02     .17***  .40***

Residual variances/covariances
T1MASC     1.00
T1MInt     .56***  1.00
T1MGrd     .41***  .22***  1.00
T1MTst     .32***  .15***  .35***  1.00
T2MASC                                     .61***
T2MInt                                     .16***  .65***
T2MGrd                                     .09***  .06***  .66***
T2MTst                                     .07***  .03*    .10***  .71***

Correlations(b)
T1MASC     1.00
T1MInt     .56***  1.00
T1MGrd     .41***  .22***  1.00
T1MTst     .32***  .15***  .35***  1.00
T2MASC     .62***  .37***  .29***  .25***  1.00
T2MInt     .39***  .59***  .16***  .12***  .41***  1.00
T2MGrd     .44***  .25***  .50***  .35***  .38***  .23***  1.00
T2MTst     .29***  .17***  .35***  .49***  .28***  .15***  .37***  1.00

Note. All variables were given a label that identifies the Time (T1 or T2), the construct (MASC = math self-concept, MINT = math interest, MGrd = math grade, or MTst = math test), and, for the multiple indicators of each latent construct, the item number. All parameter estimates are presented in completely standardized form. Not presented are the uniquenesses and correlated uniquenesses. Although the full information maximum likelihood chi-square of 3516.3 (df = 176) was highly significant because of the large sample size, the root mean square error of approximation (RMSEA) of .058 demonstrated that the model was able to fit the data well. For comparison purposes, the same model was estimated, in which missing values were imputed using the expectation maximization algorithm. The parameter estimates were nearly identical to those presented here and the goodness-of-fit statistics indicated a good fit to the data (normal theory weighted least squares χ² = 4250.681, df = 176, RMSEA = .064, non-normed fit index = .965, comparative fit index = .973, standardized root mean square residual = .0383).
(a) In the unstandardized model, the first indicator of each construct was fixed to 1.0 to fix the metric of the factor.
(b) Factor correlations are based on the equivalent confirmatory factor analysis model in which all constructs are correlated. Time 1 correlations are the same as in the structural equation model (see residual variance and covariance estimates) but differ from Time 2 estimates in that the effects of Time 1 constructs are not partialed out of the correlations, whereas they are partialed out for the Time 2 residual variances and covariances.
*p < .05. ***p < .001.


effects were found between math self-concept and achievement in the overall model (see Table 1 and Figure 1). It is not surprising that the strongest effect of T1 math self-concept was on T2 math self-concept. However, the effect of T1 math self-concept was also statistically significant for both T2 math grades (.24) and T2 math test scores (.09), even after controlling for the effects of other T1 measures (interest, grades, test scores). Also consistent with a priori predictions, the effects of T1 math self-concept were greater for T2 school grades than for T2 test scores. The effects of the T1 achievement on T2 self-concept were smaller than the effects of T1 self-concept on T2 achievement. Whereas the effects of T1 math test scores on T2 math self-concept were small (.06) but highly significant (p < .001), the effects of T1 math grades were even smaller (.03) and marginally nonsignificant (.10 > p > .05). Although these results support the reciprocal effects model, the effects of prior self-concept on subsequent achievement were stronger than the corresponding effects of prior achievement on subsequent self-concept.

In supplemental analyses, we constructed separate models to evaluate the reciprocal effects of math self-concept with math grades and with math test scores (excluding math interest; see Figures 2.1 and 2.2 for Study 1). Although the pattern of results in these supplemental models was the same as in the overall model, the effects were stronger. Thus, for example, the effects of T1 math self-concept were .15 and .28 for the models of the test scores and grades, respectively (compared with .09 and .24 in the overall model; see Table 1). Also, the .04 effect of T1 grades on T2 math self-concept was statistically significant (p < .01), whereas it was marginally nonsignificant in the overall analysis. Hence, results of the supplemental analyses were consistent with those based on the overall model and supported predictions based on the reciprocal effects model of self-concept and achievement.

Table 2
Path Coefficients Relating Time 1 Constructs (Math Self-Concept, Interest, School Grades, Standardized Test Scores) to Corresponding Time 2 Constructs for Alternative Structural Equation Models Considered in Studies 1 and 2

                   Time 1 constructs
Time 2 constructs  T1MASC  T1MInt  T1MGrd  T1MTst

1. Study 1
T2MASC             .57***  .04*    .03     .06***
T2MInt             .07***  .55***  .00     .02
T2MGrd             .24***  .01     .35***  .15***
T2MTst             .09***  .02     .17***  .40***

2. Study 2, subject-specific interest
T2MASC             .55***  .07*    .08***  .03
T2MInt             .12***  .51***  .04     -.03
T2MGrd             .26***  .01     .37***  .06*
T2MTst             .16***  -.01    .14***  .35***

3. Study 2, domain-specific interest
T2MASC             .55***  .09***  .08***  .03
T2MInt             .16***  .51***  .01     .00
T2MGrd             .26***  -.01    .37***  .07***
T2MTst             .16***  -.01    .14***  .35***

4. Study 2, combined interest
T2MASC             .54***  .09*    .08*    .03
T2MInt             .13***  .52***  .04     -.02
T2MGrd             .26***  .00     .37***  .06*
T2MTst             .16***  -.01    .14***  .35***

Note. All variables were given a label that identifies the Time (T1 or T2), the construct (MASC = math self-concept, MInt = math interest, MGrd = math grade, or MTst = math test). All models were based on full information maximum likelihood. All parameter estimates are presented in completely standardized form. Results for Model 1 (see Table 1) and Model 3 (see Table 3) are presented in more detail and are the main focus of the present investigation, whereas critical parameter estimates for the alternative models are presented here for comparison purposes.
*p < .05. ***p < .001.

[Figure 1: path diagram relating the four Time 1 constructs (MSC, MInt, MGrd, MTest) to the same four Time 2 constructs. Coefficients are shown as Study 1/Study 2. Stability paths: MSC .57/.55, MInt .55/.51, MGrd .35/.37, MTest .40/.35. Paths between nonmatching constructs: MSC to MInt .07/.12, MSC to MGrd .24/.26, MSC to MTest .09/.16, MInt to MSC .04/.07, MGrd to MSC .03*/.08, MGrd to MTest .17/.14, MTest to MSC .06/.03*, MTest to MGrd .15/.06.]

Figure 1. Structural equation model paths relating Time 1 (T1) to Time 2 (T2) constructs. Stability (horizontal) paths between matching T1 and T2 constructs are presented in gray and all paths between nonmatching constructs are presented in black. The first coefficient in each box is based on Study 1 results (see also Table 1) and the second is based on Study 2 results (see also Table 5). Only statistically significant paths are presented (except where a path was significant in only one of the two studies, in which case the nonsignificant path is presented with an asterisk). MSC = math self-concept, MInt = math interest, MGrd = math grade, MTest = math test scores.


Math interest and achievement. Although math interest was correlated with math achievement, there was no support for any reciprocal effects between the two constructs based on the overall model (Table 1). Whereas T1 math interest had a substantial effect on T2 math interest, it had no significant effects on either T2 math test scores (.02) or T2 math grades (.01). Similarly, influences on T2 math interest were not statistically significant for either T1 math test scores (.02) or T1 math grades (.00). However, in different supplemental analyses that considered only pairs of the four constructs, some of these effects were statistically significant. In particular, there were statistically significant effects of T1 math interest on T2 math grades (.15; Figure 2.3) and T2 math test scores (.09; Figure 2.4). Hence, some of the small effects associated with math interest in these supplemental analyses were lost in the more demanding overall model in which the effects of all four T1 constructs were controlled. Although the effects were small, the supplemental analyses suggest that the effects of T1 interest on subsequent achievement were stronger than the effects of T1 achievement on subsequent T2 interest.

Math self-concept and interest. The evaluation of the causal ordering of academic self-concept and interest is apparently unique to the present investigation. Based on the full model (Table 1), there was some

[Figure 2: five two-construct path diagrams, each relating Time 1 to Time 2. Coefficients are shown as Study 1/Study 2.
Panel 1: Self-Concept and Grades. Stability paths .60/.60 and .39/.39; paths between constructs .28/.27 and .04/.09.
Panel 2: Self-Concept and Test Scores. Stability paths .60/.63 and .45/.39; paths between constructs .15/.21 and .06/.05.
Panel 3: Interest and Grades. Stability paths .59/.56 and .47/.50; paths between constructs .15/.14 and .03*/.08.
Panel 4: Interest and Test Scores. Stability paths .59/.58 and .48/.45; path between constructs .09/.10.
Panel 5: Self-Concept and Interest. Stability paths .60/.60 and .55/.60; paths between constructs .08/.08 and .04/.10.]

Figure 2. Structural equation model paths relating Time 1 (T1) to Time 2 (T2) constructs. Separate models were fitted to selected pairs of constructs. Stability (horizontal) paths are presented in gray and all statistically significant paths between different constructs are presented in black. The first coefficient in each box is based on Study 1 results (see also Table 1) and the second is based on Study 2 results (see also Table 5). Only statistically significant paths are presented (except where a path was significant in only one of the two studies, in which case the nonsignificant path is presented with an asterisk). MSC = math self-concept, MInt = math interest, MGrd = math grade, MTest = math test scores.


support for the reciprocal effects of math self-concept and math interest. The effect of T1 math self-concept on T2 math interest (.07) was highly significant, whereas the effect of T1 math interest on T2 math self-concept (.04) was marginally significant. In the separate analysis of the self-concept and interest constructs (Figure 2.5), the effect of T1 math self-concept on T2 math interest (.09) was slightly more positive than in the overall analysis (Table 1). It is interesting that the effect of T1 math interest on T2 math self-concept was slightly lower (.036 vs. .039) and was marginally nonsignificant (p = .051). Hence, whereas there was consistent support for the effect of prior self-concept on subsequent interest, the evidence supporting the effect of prior interest on subsequent self-concept was marginal.

Gender differences. We now extend the results to test the generalizability of the results across gender, exploring two sets of related questions. First, do the reciprocal effects of math self-concept, interest, and achievement vary as a function of gender? To evaluate this question we tested the invariance of the full factor model with all four constructs (math self-concept, interest, grades, test scores), but our main focus was on the invariance of the path coefficients relating T1 and T2 constructs. Second, are there gender differences in mean levels of the four constructs? As emphasized earlier, advances in SEM allow us to incorporate both questions into a latent factor model such that inferences about means are based on latent mean differences derived from an appropriate factor structure.

First, to evaluate the invariance of the SEM across gender, we pursued a traditional two-group analysis in which we constrained various sets of parameter estimates to be invariant over gender. The invariance of factor loadings is typically considered the minimal condition for factorial invariance. In the present investigation, we compared RMSEA indexes for models with a variety of different sets of invariance constraints ranging from model MG1 (no invariance constraints for any parameter estimates) to the most restrictive model MG9 (all parameter estimates, that is, factor loadings, factor variances and covariances, factor path coefficients, and measured variable uniquenesses, constrained to be the same in solutions for males and females). Although many models were considered, the results are easy to summarize. RMSEA values improved progressively for each set of invariance constraints such that the best model (with lowest RMSEA value) was the model imposing complete invariance across gender (MG9 in Table 3).

Table 3
Structural Equation Models of Gender Invariance of Factor Structure and Latent Means: Fit of Alternative Models

Model df   Study 1                     Study 2                     Invariance constraints
           χ²    RMSEA  95% CI         χ²    RMSEA  95% CI

Invariance of factor structure
MG1   352  3759  .060   (.058–.061)    1387  .051   (.048–.054)    FL = Free, FV/CV = Free, Uniq = Free, PC = Free
MG2   366  3825  .059   (.057–.061)    1432  .051   (.048–.054)    FL = Inv, FV/CV = Free, Uniq = Free, PC = Free
MG3   382  3882  .058   (.057–.060)    1467  .050   (.047–.053)    FL = Inv, FV/CV = Free, Uniq = Free, PC = Inv
MG4   386  4018  .059   (.057–.061)    1548  .051   (.049–.054)    FL = Inv, FV/CV = Inv, Uniq = Free, PC = Free
MG5   393  3978  .058   (.056–.060)    1517  .050   (.048–.053)    FL = Inv, FV/CV = Free, Uniq = Inv, PC = Free
MG6   402  4071  .058   (.056–.060)    1582  .051   (.048–.054)    FL = Inv, FV/CV = Inv, Uniq = Free, PC = Inv
MG7   409  4040  .057   (.056–.059)    1553  .050   (.047–.052)    FL = Inv, FV/CV = Free, Uniq = Inv, PC = Inv
MG8   413  4188  .058   (.057–.060)    1650  .051   (.049–.054)    FL = Inv, FV/CV = Inv, Uniq = Inv, PC = Free
MG9   429  4244  .057   (.056–.059)    1684  .051   (.048–.054)    FL = Inv, FV/CV = Inv, Uniq = Inv, PC = Inv

Invariance of latent means
MG10  429  4244  .057   (.056–.059)    1684  .051   (.048–.054)    FL = Free, FV/CV = Free, Uniq = Free, PC = Free, Inter = Free, LFMD = zero
MG11  443  4286  .056   (.055–.058)    1737  .051   (.048–.053)    FL = Free, FV/CV = Free, Uniq = Free, PC = Free, Inter = Inv, LFMD = Free
MG12  451  4665  .059   (.057–.060)    1919  .054   (.051–.056)    FL = Free, FV/CV = Free, Uniq = Free, PC = Free, Inter = Inv, LFMD = zero

Note. In each model, factor structures for responses by males and females were compared subject to constraints that some parameter estimates were the same (Inv = invariant) in the two solutions or were unconstrained and freely estimated in the two solutions (Free). χ² = full information chi-square; RMSEA = root mean square error of approximation; 95% CI = 95% confidence interval about the RMSEA; FL = factor loading; FV/CV = factor variance/covariance matrix; Uniq = measured variable uniqueness; PC = path coefficient; Inter = measured variable intercept; LFMD = latent factor mean differences (between males and females).


Hence, consistent with a priori predictions, the results provide strong support for the generalizability of results of Study 1 over gender.
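Because RMSEA is the fit index used throughout these comparisons, a rough sketch of how it relates to chi-square, degrees of freedom, and sample size may help. The code below uses a common single-group formula and the values reported in the note to Table 1. It is an illustration under those assumptions: the function name is hypothetical, LISREL's exact computation may differ in detail, and the two-group values in Table 3 involve a multiple-group analysis that this formula does not reproduce.

```python
import math


def rmsea(chi_square, df, n):
    """Single-group RMSEA point estimate.

    Assumption: sqrt(max(chi2 - df, 0) / (df * (n - 1))), a widely used form.
    """
    return math.sqrt(max(chi_square - df, 0.0) / (df * (n - 1)))


# Study 1 overall model (note to Table 1): FIML chi-square, df, and sample size.
fiml_rmsea = rmsea(3516.3, 176, 5649)

# Same model estimated after expectation maximization imputation (note to Table 1).
em_rmsea = rmsea(4250.681, 176, 5649)
```

With these inputs the formula returns values that round to the .058 and .064 reported in the note to Table 1. It also shows why a very large chi-square can coexist with acceptable fit in a sample of more than 5,000 students: the excess of chi-square over its degrees of freedom is scaled by sample size.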

Second, although tangential to our main focus, we evaluated gender differences in the latent means for factors representing our four constructs at T1 and T2. In pursuing this issue, we began with the model of complete invariance of the latent factor structure and evaluated whether item intercepts were invariant over gender. Item intercepts reflect the difficulty of an item in the sense that students at a given level of the underlying latent construct (e.g., math self-concept) will give systematically more positive responses to easy items and systematically less positive responses to difficult items. At least reasonable support for the invariance of item intercepts across males and females is typically taken to be the minimal condition for valid comparisons of mean differences. Unless there is such support, differences on latent means are not consistent across items used to define the factor. Fortunately, a comparison of the RMSEAs for models MG10 and MG11 provided good support for the invariance of item intercepts. Next, we compared models in which latent means are constrained to be equal across responses by males and females. In contrast to all other tests of invariance over gender, comparison of these models (MG11 and MG12 in Table 3) indicated that the latent means differed for males and females.

To evaluate the nature of these latent mean differences (Table 4), we compared responses for males and females on the four T1 constructs, the corresponding four T2 constructs without controlling for T1 constructs, and the corresponding four T2 constructs after controlling for T1 constructs. In general, males had substantially higher math self-concepts and interest at T1 and T2. Controlling for T1 constructs largely, but not completely, eliminated differences in these two constructs at T2. For math test scores, males scored higher than did females at T1, but their advantage was much smaller at T2 so that after controlling for T1 scores females actually did slightly better than males at T2. For math grades, there were no significant differences between males and females at T1, and females performed marginally better than did males at T2.

In summary, tests of gender differences largely supported a priori predictions that males would have substantially higher math self-concept and interest scores and moderately higher math test scores. However, also consistent with a priori predictions based on the gender-invariant model, the pattern of path coefficients linking T1 constructs to T2 constructs was remarkably similar for male and female students.

Study 2

The purpose of Study 2 was to evaluate the replicability of results from Study 1 and the generalizability of results across responses by students in two different school years to two different interest measures based on data from the German component of the Third International Mathematics and Science Study (TIMSS; Baumert et al., 1997). Unlike the international TIMSS study, the German component was longitudinal in that math achievement, interest, and self-concept were collected in both Grades 7 and 8. Study 2 contained the same math self-concept and interest items used in Study 1 and similar measures of math achievement (standardized test scores and school grades). It differed in that data were based on responses by students in Grades 7 and 8 so that the data collection waves were separated by a full academic year (Study 1 was based on responses from two occasions in Grade 7) and data in Study 2 were collected about 3 years after those in Study 1. In addition, two measures of interest were used in Study 2. To maintain comparability with Study 1, the same interest items were included in Study 2. However, a separate measure of interest was also included based on subsequent theoretical work by Krapp, Schiefele, and colleagues (e.g., Krapp, 2000;

Table 4
Latent Mean Differences for Males and Females in Four Math Constructs (Positive Values Reflect Higher Scores for Males)

                     Time 1        Time 2        Time 2
                     No control    No control    Control
                     Mn     SE     Mn     SE     Mn     SE

Study 1
Math self-concept    .46*   .03    .39*   .03    .12*   .03
Math interest        .29*   .03    .29*   .03    .10*   .03
Math grades          .00    .03    -.08*  .03    -.08*  .02
Math test scores     .26*   .03    .10*   .03    -.06*  .02

Study 2
Math self-concept    .46*   .05    .34*   .04    .06    .04
Math interest        .27*   .05    .35*   .05    .21*   .05
Math grades          .16*   .04    .06    .05    -.17*  .04
Math test scores     .33*   .04    .27*   .04    .06    .02

Note. Completely standardized mean differences (based on model MG12 with factor loading, factor variances/covariances, path coefficients, measured variable uniquenesses, and measured variable intercepts all invariant over solutions by males and females, but latent factor means freely estimated). Results based on Time 2 responses are presented without controlling for corresponding Time 1 factors (no control) and controlling for Time 1 responses consistent with the a priori path model (control).
*p < .05.


Krapp, Prenzel, & Schiefele, 1986; Schiefele, 1996, 1998; Schiefele et al., 1992) and others (e.g., Wigfield & Eccles, 1992), as described by Köller et al. (2001). The interest measure used in both Studies 1 and 2 referred to the mathematics course in which students were currently enrolled (class-specific interest), whereas interest in the second measure referred to interest in the mathematics domain more generally (domain-specific interest; see the Appendix for the wording of items from both measures). Particularly given the weak effects associated with math interest in Study 1, it is important to evaluate the generalizability of these effects with potentially stronger measures of interest.

Method

Data. Study 2 was based on a sample of German Grade 7 students who participated in the TIMSS (Baumert et al., 1997; Beaton et al., 1996). The sample was nationally representative with respect to region, school type, and gender. The German TIMSS study in Grades 7 and 8 contained some national extensions compared to the international design. The longitudinal, nationally representative sample consisted of 128 randomly selected schools in which one class per school was sampled randomly. Most students (90%) were born in Germany, including 94% with German citizenship (including 3.4% with dual citizenship). Students spoke German at home always or almost always (88%) or sometimes (10%), and most were Caucasian (95%). Because the sample was representative, it was heterogeneous in relation to socioeconomic status. The final sample in the present investigation included a total of 2,264 students (50% female, M age in Grade 7 = 13.7 years) who were tested at two points. Excluded from this final sample were schools that participated in the study on only one of the two occasions, or for which there were fewer than 10 students who responded.

Measures. The math self-concept and class-specific interest measures were the same as those used in Study 1, whereas the domain-specific interest measure was based on a new five-item scale (see the Appendix). Math achievement in Grade 8 was measured by 158 items that were part of the official TIMSS item base. The items were distributed over eight booklets. Each booklet contained between 30 and 40 items, some specific to that booklet and some anchor items common to all booklets. Students worked on one booklet each, thus allowing broad subject matter coverage without student exhaustion. All items were checked for curricular validity. Six content areas and four performance categories were covered. All responses were scaled using item response techniques (see Lüdtke, Köller, Marsh, & Trautwein, in press; see also Beaton et al., 1996). The 36 math items in Grade 7 (T1) were taken from previous studies by the International Association for the Evaluation of Educational Achievement, in particular from the First and Second International Mathematics Study (cf. Husén, 1967; Robitaille & Garden, 1989) and from an earlier investigation conducted by the Max Planck Institute for Human Development. Several items used at T1 were also administered in the official TIMSS study 1 year later. We were thus able to build a common achievement metric for T1 and T2 using item response theory applications.

Statistical analysis. As in Study 1, SEMs included correlated uniquenesses for matching items in the longitudinal data, data were centered within each class, full information maximum likelihood (FIML) estimation was used because of small amounts of missing data (approximately 3.4% of the responses were missing for the total sample of students who responded at either Time 1 or Time 2), and gender differences were evaluated in the pattern of causal relations among the four constructs measured at T1 and T2 and in the latent mean responses by boys and girls.
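Centering within each class amounts to subtracting the class mean from every student's score, so that only within-class differences remain. The following is a minimal sketch of that step, not the authors' code: the field names `class_id` and `math_grade` and the toy values are hypothetical, and missing responses are simply left missing on the assumption that they are handled later by FIML.

```python
from collections import defaultdict


def center_within_class(records, class_key, value_key):
    """Return copies of the records with value_key centered on its class mean."""
    sums = defaultdict(float)
    counts = defaultdict(int)
    for rec in records:
        value = rec[value_key]
        if value is not None:  # missing responses do not enter the class mean
            sums[rec[class_key]] += value
            counts[rec[class_key]] += 1

    centered = []
    for rec in records:
        new_rec = dict(rec)
        value = rec[value_key]
        if value is not None:
            class_mean = sums[rec[class_key]] / counts[rec[class_key]]
            new_rec[value_key] = value - class_mean
        centered.append(new_rec)  # missing values stay None
    return centered


# Hypothetical toy data: two classes, one missing response.
students = [
    {"class_id": "A", "math_grade": 2.0},
    {"class_id": "A", "math_grade": 4.0},
    {"class_id": "B", "math_grade": 1.0},
    {"class_id": "B", "math_grade": 3.0},
    {"class_id": "B", "math_grade": None},
]
centered = center_within_class(students, "class_id", "math_grade")
for rec in centered:
    print(rec)
```

After centering, every class has a mean of zero on the variable, so differences between classes no longer contribute to the relations among constructs.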

Results and Discussion

As in Study 1, we began with an evaluation of the complete SEM that included T1 and T2 measures of math self-concept, math interest, math school grades, and math test scores. Separate models were evaluated using class-specific measures of math interest that are directly comparable to Study 1 (Table 5), domain-specific measures of math interest, and models with both class- and domain-specific measures of interest. In preliminary analyses, results based on the class- and domain-specific measures of interest were similar; therefore, we focus on the class-specific measures of interest that are most comparable to those used in Study 1. In a subsequent model that contained both class- and domain-specific measures of interest as separate constructs, the two latent interest constructs were so highly correlated at each occasion (approximately .9 at T1 and T2) that their separate effects could not be reliably distinguished (because of multicollinearity). On this basis we also tested a model in which both class- and domain-specific interest items reflected a single interest factor, but the results were essentially the same as the separate models based on class- and domain-specific measures of interest. To facilitate presentation

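Table 5

Factor Solution Relating Academic Self-Concept, Interest, School Grades, and Test Scores at Times 1 and 2 in Study 2: Full Information Maximum Likelihood Estimation

| | T1 MASC | T1 MInt | T1 MGrd | T1 MTst | T2 MASC | T2 MInt | T2 MGrd | T2 MTst |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Factor loadings | | | | | | | | |
| T1MASC1 | .57a | | | | | | | |
| T1MASC2 | .74*** | | | | | | | |
| T1MASC3 | .82*** | | | | | | | |
| T1MASC4 | .51*** | | | | | | | |
| T1MASC5 | .83*** | | | | | | | |
| T1MINT1 | | .52a | | | | | | |
| T1MINT2 | | .76*** | | | | | | |
| T1MINT3 | | .80*** | | | | | | |
| T1MINT4 | | .54*** | | | | | | |
| T1MGrade | | | 1.00a | | | | | |
| T1Mtest | | | | 1.00a | | | | |
| T2MASC1 | | | | | .62a | | | |
| T2MASC2 | | | | | .79*** | | | |
| T2MASC3 | | | | | .84*** | | | |
| T2MASC4 | | | | | .64*** | | | |
| T2MASC5 | | | | | .85*** | | | |
| T2MINT1 | | | | | | .62a | | |
| T2MINT2 | | | | | | .76*** | | |
| T2MINT3 | | | | | | .81*** | | |
| T2MINT4 | | | | | | .62*** | | |
| T2MGrade | | | | | | | 1.00a | |
| T2MTst | | | | | | | | 1.00a |
| Path coefficients | | | | | | | | |
| T2MASC | .55*** | .07* | .08*** | .03 | | | | |
| T2Mint | .12*** | .51*** | .04 | -.03 | | | | |
| T2MGrd | .26*** | .01 | .37*** | .06* | | | | |
| T2MTst | .16*** | -.01 | .14*** | .35*** | | | | |
| Residual variances/covariances | | | | | | | | |
| T1MASC | 1.00 | | | | | | | |
| T1Mint | .58*** | 1.00 | | | | | | |
| T1MGrd | .52*** | .24*** | 1.00 | | | | | |
| T1MTst | .36*** | .17*** | .40*** | 1.00 | | | | |
| T2MASC | | | | | .58*** | | | |
| T2Mint | | | | | .25* | .64*** | | |
| T2MGrd | | | | | .21*** | .13*** | .66*** | |
| T2MTst | | | | | .14*** | .06* | .13*** | .73*** |
| Correlations b | | | | | | | | |
| T1MASC | 1.00 | | | | | | | |
| T1Mint | .58*** | 1.00 | | | | | | |
| T1MGrd | .52*** | .24*** | 1.00 | | | | | |
| T1MTst | .36*** | .17*** | .40*** | 1.00 | | | | |
| T2MASC | .64*** | .42*** | .39*** | .27*** | 1.00 | | | |
| T2MInt | .43*** | .58*** | .22*** | .12*** | .55*** | 1.00 | | |
| T2MGrd | .47*** | .25*** | .53*** | .31*** | .54*** | .33*** | 1.00 | |
| T2MTst | .35*** | .18*** | .36*** | .46*** | .39*** | .20*** | .38*** | 1.00 |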

Note. All variables were given a label that identifies the time (T1 or T2), the construct (MASC = math self-concept, MINT = math interest, MGrd = math grade, or MTst = math test), and, for the multiple indicators of each latent construct, the item number. All parameter estimates are presented in completely standardized form. Not presented are the uniquenesses and correlated uniquenesses. Although the full information maximum likelihood chi-square of 1106.477 (df = 176) was highly significant because of the large sample size, the root mean square error of approximation (RMSEA) of .054 demonstrated that the model was able to fit the data well. For comparison purposes, the same model was also estimated with missing values imputed using the expectation maximization algorithm. The parameter estimates were nearly identical to those presented here and the goodness-of-fit statistics indicated a good fit to the data (normal theory weighted least squares χ² = 1181.9, df = 176, RMSEA = 0.0563, non-normed fit index = 0.970, comparative fit index = 0.977, standardized root mean square residual = 0.0378).
a In the unstandardized model, the first indicator of each construct was fixed to 1.0 to fix the metric of the factor.
b Factor correlations are based on the equivalent confirmatory factor analysis model in which all constructs are correlated. Time 1 correlations are the same as in the structural equation model (see residual variance/covariance estimates) but differ from Time 2 estimates in that the effects of Time 1 constructs are not partialed out of the correlations, whereas they are partialed out for the Time 2 residual variances and covariances.
*p < .05. ***p < .001.
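The link between the path coefficients and the factor correlations in Table 5 can be checked by path tracing: in a completely standardized model in which every Time 2 construct is regressed on all four Time 1 constructs, the implied correlation between a Time 2 and a Time 1 construct is the sum of each path multiplied by the relevant Time 1 correlation. The sketch below is an illustrative check rather than part of the authors' analysis; the values are copied from Table 5 (the two negative paths entered as -.03 and -.01), and the function and variable names are hypothetical.

```python
# Order of constructs in every row and column: MASC, MInt, MGrd, MTst.
LABELS = ["MASC", "MInt", "MGrd", "MTst"]

# Correlations among the Time 1 constructs (Table 5).
T1_CORR = [
    [1.00, 0.58, 0.52, 0.36],
    [0.58, 1.00, 0.24, 0.17],
    [0.52, 0.24, 1.00, 0.40],
    [0.36, 0.17, 0.40, 1.00],
]

# Standardized paths: rows are Time 2 outcomes, columns are Time 1 predictors.
PATHS = [
    [0.55, 0.07, 0.08, 0.03],
    [0.12, 0.51, 0.04, -0.03],
    [0.26, 0.01, 0.37, 0.06],
    [0.16, -0.01, 0.14, 0.35],
]

# Reported correlations of Time 2 constructs (rows) with Time 1 constructs (columns).
REPORTED = [
    [0.64, 0.42, 0.39, 0.27],
    [0.43, 0.58, 0.22, 0.12],
    [0.47, 0.25, 0.53, 0.31],
    [0.35, 0.18, 0.36, 0.46],
]


def implied_cross_time_correlations(paths, t1_corr):
    """Path tracing: corr(T2_i, T1_j) = sum over k of paths[i][k] * t1_corr[k][j]."""
    size = len(t1_corr)
    return [
        [sum(row[k] * t1_corr[k][j] for k in range(size)) for j in range(size)]
        for row in paths
    ]


implied = implied_cross_time_correlations(PATHS, T1_CORR)
max_gap = max(
    abs(implied[i][j] - REPORTED[i][j]) for i in range(4) for j in range(4)
)

for label, row in zip(LABELS, implied):
    print("T2" + label, [round(value, 2) for value in row])
print("largest gap from the reported correlations:", round(max_gap, 3))
```

The implied values reproduce the reported Time 2 with Time 1 correlations to within rounding of the two-decimal inputs, which is the sense in which the structural model and the confirmatory factor analysis model in footnote b are equivalent.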

of the results, we focus on the model with class-specific measures of interest (Table 5) but also briefly summarize results based on the other models (see Table 2).
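The multicollinearity noted for the model with both class- and domain-specific interest can be gauged with a variance inflation factor. This is a rough sketch, assuming that the familiar regression formula carries over to latent predictors and that the correlation of about .9 between the two interest constructs is their only source of overlap; the authors do not report this statistic.

```latex
\mathrm{VIF} = \frac{1}{1 - r^{2}} = \frac{1}{1 - (.9)^{2}} = \frac{1}{.19} \approx 5.3,
\qquad \sqrt{\mathrm{VIF}} \approx 2.3
```

Under these assumptions the sampling variance of each interest path would be inflated about fivefold, and its standard error more than doubled, relative to uncorrelated predictors, which is consistent with the separate effects not being reliably distinguishable.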

Math self-concept and class-specific interest were both positively correlated with math grades and test scores (Table 5), but both self-concept and interest were more highly correlated with school grades than with standardized test scores. Also, correlations between self-concept and both achievement constructs (grades and test scores) were more positive than the corresponding correlations between interest and achievement. These patterns of results were consistent across T1 and T2 responses.

In evaluating path coefficients relating T1 and T2 constructs, we focused on the overall model that contained all four constructs (self-concept, interest, grades, test scores) and was based on the class-specific measure of interest. However, we also juxtaposed the results of models in which interest was based on responses to the class-specific items, the domain-specific items, or both sets of interest items. As in Study 1, we also conducted supplemental analyses in which we evaluated various pairs of constructs separately (see Figure 2) that were more like traditional causal models used in previous research.

Math self-concept and achievement. Consistent with previous self-concept research (and Study 1), there were reciprocal effects between math self-concept and achievement in the overall model that included all constructs (see Table 5 and Figure 1). Of particular relevance, the effect of T1 math self-concept was statistically significant for both T2 math grades (.26) and T2 math test scores (.16), and the effects of T1 math self-concept were greater for T2 school grades than for T2 test scores. The effects of T1 math grades on T2 math self-concept were small (.08) but highly significant (p < .001), whereas the effects of T1 math test scores were even smaller (.03) and not statistically significant. Hence, although the effects were reciprocal, the effects of self-concept on achievement were stronger than the effects of achievement on self-concept. In additional models based on domain-specific measures of interest or both class- and domain-specific measures of interest (Table 2), these path coefficients were nearly identical to those in Table 5 based on the class-specific measure of interest (none of the path coefficients differed by more than .05 and the pattern of significant and nonsignificant effects was the same in all three models).
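Written out, the effects discussed in this paragraph are three of the four structural equations of the overall model. The coefficients are the completely standardized path coefficients from Table 5; the subscripted construct labels and the residual terms ζ are notation added here for illustration, not the authors' own.

```latex
\begin{aligned}
\mathrm{MASC}_{T2} &= .55\,\mathrm{MASC}_{T1} + .07\,\mathrm{MInt}_{T1} + .08\,\mathrm{MGrd}_{T1} + .03\,\mathrm{MTst}_{T1} + \zeta_{1} \\
\mathrm{MGrd}_{T2} &= .26\,\mathrm{MASC}_{T1} + .01\,\mathrm{MInt}_{T1} + .37\,\mathrm{MGrd}_{T1} + .06\,\mathrm{MTst}_{T1} + \zeta_{3} \\
\mathrm{MTst}_{T2} &= .16\,\mathrm{MASC}_{T1} - .01\,\mathrm{MInt}_{T1} + .14\,\mathrm{MGrd}_{T1} + .35\,\mathrm{MTst}_{T1} + \zeta_{4}
\end{aligned}
```

The reciprocal pattern is visible in the cross-lagged terms: prior self-concept predicts later grades (.26) and test scores (.16) with prior achievement held constant, whereas prior grades (.08) and test scores (.03) contribute much less to later self-concept.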

As in Study 1, we constructed separate models to evaluate the reciprocal effects of math self-concept with math school grades and with math test scores (excluding math interest; see Figures 2.1 and 2.2 for Study 2). The pattern of results in these supplemental models was the same as in the overall model, but the effects tended to be stronger, particularly the effect of T1 math self-concept on T2 math grades (.27; Figure 2.1) and on T2 math test scores (.21; Figure 2.2). The effects of T1 math grades and T1 math test scores on T2 math self-concept, although clearly smaller than the effect of T1 math self-concept on T2 math achievement, were also statistically significant. Hence, these supplemental analyses supported the reciprocal effects model of self-concept and achievement.

Math interest and achievement. Math interest was correlated with both measures of math achievement. However, in the overall model, the effects of T1 math interest were nonsignificant for both T2 school grades and T2 test scores, as were the effects of T1 test scores and T1 school grades on T2 math interest (Table 5). Furthermore, this pattern of near-zero, nonsignificant effects between interest and achievement was consistent across different models based on the class-specific, domain-specific, and combined (class- and domain-specific) measures of math interest (see Table 2). However, in supplemental analyses that considered math interest in combination with either grades or test scores (excluding math self-concept), some effects involving math interest were statistically significant. In particular, math interest had statistically significant effects on both math grades (Figure 2.3) and math test scores (Figure 2.4), and the effects of math grades on math interest (Figure 2.3) were also significant.

Math self-concept and interest. Based on the full model (Table 5), there was also some support for the reciprocal effects of math self-concept and math interest; the effects of T1 math self-concept on T2 math interest (.12) and of T1 math interest on T2 math self-concept (.07) were both statistically significant. This same pattern of results was evident in additional models using class-specific, domain-specific, and combined (class- and domain-specific) measures of math interest (see Table 2). However, even in the supplemental analyses that excluded the math test scores and math grades (see Figure 2.5), the reciprocal effects of self-concept on interest and interest on self-concept did not exceed .10.

Gender differences. As in Study 1, we tested the generalizability of the results across gender. To facilitate comparisons between Studies 1 and 2, we considered only the subject-specific measure of interest in Study 2 that was the same as the interest measure in Study 1. We began with two-group invariance tests in which we constrained various sets of parameter estimates to be invariant over responses by males and females. Based on comparison of RMSEA values across models MG1 to MG9, there were almost no differences in fit between any of the models. In particular, the most restrictive model (in which all parameters were constrained to be the same for males and females) had an RMSEA of .051, equal to that in the least restrictive model in which there were no invariance constraints. Hence, consistent with a priori predictions, the results provide strong support for the generalizability of results over gender.

Next, we evaluated gender differences in the means of latent factors representing our four constructs at T1 and T2. A comparison of the RMSEAs for models MG10 and MG11 provided good support for the invariance of item intercepts, whereas comparison of models MG11 and MG12 (Table 3, Study 2) indicated that the latent means did differ. In general, males had substantially higher latent means for math self-concept, math interest, and math tests at T1 and T2 (Table 4, Study 2). Whereas males had slightly higher math grades at T1, there was no significant difference between males and females at T2, so that females had slightly higher school grades at T2 after controlling for T1 constructs. A comparison of the results for Studies 1 and 2 (see Figure 1) shows that the size and pattern of statistically significant and nonsignificant path coefficients from each study are similar.

General Discussion

Although developmental and educational psychologists posit interest and self-concept as primary determinants of outcomes such as achievement, performance, and choice of behavior, as emphasized by Wigfield and Eccles (2002), there is a need to integrate the developmental and educational psychology research traditions to provide a more complete picture. Even though there is substantial overlap of research into academic self-concept and academic interest, there have been few if any longitudinal causal-ordering studies of relations among self-concept, interest, achievement, and gender differences.

We began by arguing that a critical question in self-concept and motivation research is whether there are causal links from prior measures of academic self-concept, academic interest, and academic achievement to subsequent measures of these same constructs. Our results provide clear evidence that prior academic self-concept does predict subsequent academic achievement beyond what can be explained in terms of prior measures of academic interest, school grades, and standardized achievement test scores. Whereas the contribution of prior academic self-concept was stronger for school grades, the effects were also highly significant for standardized test scores. Although the causal effects of academic interest on subsequent achievement were largely nonsignificant, possibly there was some shared variance in subsequent measures of academic achievement that could not be uniquely explained by either self-concept or interest. There were stereotypic gender differences in the mean levels of math motivation constructs, but in contrast to predictions from gender-intensification and gender-stereotypic models (but in support of gender-invariance models), support for the reciprocal effects relating math self-concept, interest, school grades, and test scores was similar for boys and girls. Thus, for example, even though girls had lower math self-concepts than did boys, the positive effect of a high math self-concept on subsequent math achievement was similar for both boys and girls.

In relation to the generalizability of the results and future research, several characteristics of the present investigation warrant further consideration. It is important to evaluate the generalizability of these results in different settings, countries, and age groups. Our results showed similar findings in different studies where the data-collection waves fell within the same academic school year (Study 1) or spanned different school years (Study 2), suggesting that this methodological issue may not be as important as suggested by Marsh et al. (1999). We note, however, that the mathematics classes in Germany are reasonably similar for students in Grades 7 and 8. Hence, the critical feature might not be the length of the interval per se, but the amount of change that takes place between consecutive data collections. Future research should more fully attend to differences in the contextual characteristics as well as the length of time between waves. Also, because our study was based only on math constructs, it is important to test the generalizability of the results to other academic (and perhaps nonacademic) domains.

More complicated is the generalizability of our results to other age groups. Although some researchers have suggested that effects are stronger for older samples, Guay et al. (2003) have demonstrated strong support for the reciprocal effects model that generalized over elementary school years. However, Köller et al. (2001) speculated that the effect of interest would be stronger during later school years, when the curriculum was not so highly structured and students had more freedom in selecting courses. Marsh and Yeung (1997a, 1997b) similarly showed that whereas both achievement and academic self-concept were substantially related to coursework selection in different school subjects, domain-specific components of self-concept were much better predictors of course selection.

The focus of reciprocal effects models in academic self-concept research has been primarily on achievement. However, Wigfield and Eccles (1992, 2002) have suggested that whereas self-concept is more strongly related to actual achievement, task values may be more strongly related to choice behavior (e.g., coursework selection). There is a need for further research juxtaposing the combined effects of academic self-concept, interest, and achievement on a more varied set of academic choice behaviors. Theoretically, our research helps bridge gaps between the educational and developmental research literatures, and among large bodies of self-concept research, interest research, and more general approaches to motivation such as expectancy-value theory.

Of course, the direction of causality among academic self-concept, academic interest, and achievement has important practical implications for educators, as well as for educational and developmental psychologists who work in school settings. If the direction of causality were from academic self-concept and interest to achievement (a self-enhancement model), teachers should concentrate more effort on enhancing students' self-concepts and intrinsic interest rather than focusing on achievement. On the other hand, if causality were from achievement to self-concept and interest (a skill development model), teachers should focus on improving academic skills as the best way to improve self-concept and interest. In contrast to both these apparently simplistic (either-or) models, the reciprocal effects model implies that academic self-concept, interest, and academic achievement are reciprocally related and mutually reinforcing. Improved academic self-concepts and interest will lead to better achievement, and improved achievement will lead to better academic self-concepts and interest. Thus, for example, if teachers enhance students' academic self-concepts and interest without improving achievement, the gains in self-concept and interest are likely to be short-lived. However, if teachers improve students' academic achievement without fostering students' self-beliefs in their academic capabilities and intrinsic interest, the achievement gains are also unlikely to be long lasting. If teachers focus on one construct to the exclusion of the other, both are likely to suffer. The reciprocal effects model suggests that the most effective strategy is to improve academic self-concept, interest, and achievement simultaneously.

References

Allison, P. D. (2001). Missing data [Sage University Papers Series on Quantitative Applications in the Social Sciences, 07-136]. Thousand Oaks, CA: Sage.

Baumert, J., Lehmann, R. H., Lehrke, M., Schmitz, B., Clausen, M., Hosenfeld, I., et al. (1997). TIMSS: Mathematisch-Naturwissenschaftlicher Unterricht im internationalen Vergleich [TIMSS: Mathematics and science instruction in an international comparison]. Opladen, Germany: Leske & Budrich.

Baumert, J., Schnabel, K., & Lehrke, M. (1998). Learning math in school: Does interest really matter? In L. Hoffmann, A. Krapp, K. A. Renninger, & J. Baumert (Eds.), Interest and learning (pp. 327–336). Kiel, Germany: Institut für die Pädagogik der Naturwissenschaften (IPN).

Beaton, A. E., Mullis, I. V. S., Martin, M. O., Gonzales, E. J., Kelly, D. L., & Smith, T. A. (1996). Mathematics achievement in the middle school years: IEA's Third International Mathematics and Science Study. Chestnut Hill, MA: Boston College.

Bollen, K. A. (1989). Structural equations with latent variables. New York: Wiley.

Bouffard, T., Marcoux, M., Vezeau, C., & Bordeleau, L. (2003). Changes in self-perceptions of competence and intrinsic motivation among elementary schoolchildren. British Journal of Educational Psychology, 73, 171–186.

Byrne, B. M. (1996). Academic self-concept: Its structure, measurement, and relation to academic achievement. In B. A. Bracken (Ed.), Handbook of self-concept (pp. 287–316). New York: Wiley.

Byrne, B. M. (1998). Structural equation modeling with LISREL, PRELIS, and SIMPLIS: Basic concepts, applications and programming. Mahwah, NJ: Erlbaum.

Byrne, B. M., & Shavelson, R. J. (1987). Adolescent self-concept: Testing the assumption of equivalent structure across gender. American Educational Research Journal, 24, 365–385.

Calsyn, R., & Kenny, D. (1977). Self-concept of ability and perceived evaluations by others: Cause or effect of academic achievement? Journal of Educational Psychology, 69, 136–145.

Crain, R. M. (1996). The influence of age, race, and gender on child and adolescent multidimensional self-concept. In B. A. Bracken (Ed.), Handbook of self-concept: Developmental, social, and clinical considerations (pp. 395–420). Oxford, England: Wiley.

Csikszentmihalyi, M., & Schiefele, U. (1993). Die Qualität des Erlebens und der Prozeß des Lernens [The quality of experience and the process of learning]. Zeitschrift für Pädagogik, 39, 207–221.

Deci, E. L., & Ryan, R. M. (1985). Intrinsic motivation and self-determination in human behavior. New York: Plenum.

Eccles, J. S. (1983). Expectancies, values, and academic choice: Origins and changes. In J. Spence (Ed.), Achievement and achievement motivation (pp. 87–134). San Francisco: Freeman.

Eccles, J. S., & Wigfield, A. (1995). In the mind of the actor: The structure of adolescents' achievement task values and expectancy-related beliefs. Personality & Social Psychology Bulletin, 21, 215–225.

Eccles, J. S., Wigfield, A., & Schiefele, U. (1998). Motivation to succeed. In W. Damon (Series Ed.) & N. Eisenberg (Vol. Ed.), Handbook of child psychology: Vol. 3. Social, emotional, and personality development (5th ed., pp. 1017–1095). New York: Wiley.

Goldstein, H. (2003). Multilevel statistical models (3rd ed.). London: Hodder Arnold.

Graham, J. W., & Hofer, S. M. (2000). Multiple imputation in multivariate research. In T. D. Little, K. U. Schnabel, & J. Baumert (Eds.), Modeling longitudinal and multilevel data: Practical issues, applied approaches, and specific examples (pp. 201–218). Mahwah, NJ: Erlbaum.

Guay, F., Marsh, H. W., & Boivin, M. (2003). Academic self-concept and academic achievement: Developmental perspectives on their causal ordering. Journal of Educational Psychology, 95, 124–136.

Harter, S. (1990). Processes underlying adolescent self-concept formation. In R. Montemayor, G. Adams, & T. Gullotta (Eds.), From childhood to adolescence: A transitional period? (pp. 205–239). Thousand Oaks, CA: Sage.

Harter, S. (1992). The relationship between perceived competence, affect, and motivational orientation within the classroom: Processes and patterns of change. In A. K. Boggiano & T. S. Pittman (Eds.), Achievement and motivation: A social-developmental perspective (pp. 77–113). New York: Cambridge University Press.

Harter, S. (1998). Developmental perspectives on the self-system. In W. Damon (Series Ed.) & N. Eisenberg (Vol. Ed.), Handbook of child psychology: Vol. 3. Social, emotional, and personality development (5th ed., pp. 553–618). New York: Wiley.

Hattie, J. (1992). Self-concept. Hillsdale, NJ: Erlbaum.

Helmke, A., & van Aken, M. A. G. (1995). The causal ordering of academic achievement and self-concept of ability during elementary school: A longitudinal study. Journal of Educational Psychology, 87, 624–637.

Hidi, S., & Ainley, M. (2002). Interest and adolescence. In F. Pajares & T. Urdan (Eds.), Academic motivation of adolescents (pp. 247–275). Greenwich, CT: Information Age.

Husén, T. (1967). International study of achievement in mathematics. A comparison of 12 countries (Vols. I and II). Stockholm: Almqvist & Wiksell.

Jacobs, J. E., Lanza, S., Osgood, D. W., Eccles, J. S., & Wigfield, A. (2002). Changes in children's self-competence and values: Gender and domain differences across grades one through twelve. Child Development, 73, 509–527.

Jöreskog, K. G. (1979). Statistical estimation of structural models in longitudinal investigations. In J. R. Nesselroade & P. B. Baltes (Eds.), Longitudinal research in the study of behavior and development (pp. 303–351). New York: Academic Press.

Jöreskog, K. G., & Sörbom, D. (1993). LISREL 8: User's reference guide. Chicago: Scientific Software.

Kaplan, D. (2000). Structural equation modeling: Foundations and extensions. Newbury Park, CA: Sage.

Köller, O. (1998). Zielorientierungen und schulisches Lernen [Goal orientations and academic learning]. Münster, Germany: Waxmann.

Köller, O., Baumert, J., & Schnabel, K. (2001). Does interest matter? The relationship between academic interest and achievement in mathematics. Journal for Research in Mathematics Education, 32, 448–470.

Krapp, A. (2000). Interest and human development during adolescence: An educational-psychological approach. In J. Heckhausen (Ed.), Motivational psychology of human development (pp. 109–128). London: Elsevier.

Krapp, A., Prenzel, M., & Schiefele, H. (1986). Grundzüge einer pädagogischen Interessentheorie [Basic principles of an educational theory of interest]. Zeitschrift für Pädagogik, 32, 163–173.

Little, R. J. A., & Rubin, D. B. (1987). Statistical analysis with missing data. New York: Wiley.

Lüdtke, O., Köller, O., Marsh, H. W., & Trautwein, U. (in press). Teacher feedback and the big-fish-little-pond effect. Contemporary Educational Psychology.

Marsh, H. W. (1987). The big-fish-little-pond effect on academic self-concept. Journal of Educational Psychology, 79, 280–295.

Marsh, H. W. (1989). Age and sex effects in multiple dimensions of self-concept: Preadolescence to early adulthood. Journal of Educational Psychology, 81, 417–430.

Marsh, H. W. (1990). The causal ordering of academic self-concept and academic achievement: A multiwave, longitudinal panel analysis. Journal of Educational Psychology, 82, 646–656.

Marsh, H. W. (1993a). Academic self-concept: Theory, measurement, and research. In J. Suls (Ed.), Psychological perspectives on the self (Vol. 4, pp. 59–98). Hillsdale, NJ: Erlbaum.

Marsh, H. W. (1993b). The multidimensional structure of academic self-concept: Invariance over gender and age. American Educational Research Journal, 30, 841–860.

Marsh, H. W. (1994). Confirmatory factor analysis models of factorial invariance: A multifaceted approach. Structural Equation Modeling, 1, 5–34.

Marsh, H. W., & Ayotte, V. (2003). Do multiple dimensions of self-concept become more differentiated with age? The differential distinctiveness hypothesis. Journal of Educational Psychology, 95, 687–706.

Marsh, H. W., Balla, J. R., & McDonald, R. P. (1988). Goodness of fit indexes in confirmatory factor analysis: The effect of sample size. Psychological Bulletin, 103, 391–410.

Marsh, H. W., Byrne, B. M., & Yeung, A. S. (1999). Causal ordering of academic self-concept and achievement: Reanalysis of a pioneering study and revised recommendations. Educational Psychologist, 34, 154–157.

Marsh, H. W., & Craven, R. (1997). Academic self-concept: Beyond the dustbowl. In G. Phye (Ed.), Handbook of classroom assessment: Learning, achievement, and adjustment (pp. 131–198). Orlando, FL: Academic Press.

Marsh, H. W., & Craven, R. G. (in press). What comes first?A reciprocal effects model of the mutually reinforcingeffects of academic self-concept and achievement. InH. W. Marsh, R. G. Craven, & D. M. McInerney (Eds.),International advances in self research (Vol. 2). Greenwich,CT: Information Age.

Marsh, H. W., Craven, R. G., & Debus, R. (1991).Self-concepts of young children aged 5 to 8: Their mea-surement and multidimensional structure. Journal ofEducational Psychology, 83, 377 – 392.

Marsh, H. W., Craven, R. G., & Debus, R. (1998). Structure,stability, and development of young children’s self-concepts: A multicohort-multioccasion study. ChildDevelopment, 69, 1030 – 1053.

Marsh, H. W., Craven, R. G., & Debus, R. (2000). Separationof competency and affect components of multiple di-mensions of academic self-concept: A developmentalperspective. Merrill-Palmer Quarterly, 45, 567 – 560.

Marsh, H. W., Debus, R., & Bornholt, L. (2005). Validatingyoung children’s self-concept responses: Methodologi-cal ways and means to understand their responses. InD. M. Teti (Ed.), Handbook of research methods in develop-mental science (pp. 138 –160). Oxford, England: Blackwell.

Marsh, H. W., & Grayson, D. (1990). Public/Catholic dif-ferences in the high school and beyond data: A multi-group structural equation modeling approach to testingmean differences. Journal of Educational Statistics, 15,199 – 235.

Marsh, H. W., & Grayson, D. (1994). Longitudinal stabilityof latent means and individual differences: A unifiedapproach. Structural Equation Modeling, 1, 317 – 359.

Marsh, H. W., & Hau, K.-T. (1996). Assessing goodness offit: Is parsimony always desirable? Journal of Experi-mental Education, 64, 364 – 390.

Marsh, H. W., Hau, K. T., & Kong, K. W. (2002). Multilevelcausal ordering of academic self-concept and achieve-ment: Influence of language of instruction (English vs.Chinese) for Hong Kong Students. American EducationalResearch Journal, 37, 245 – 282.

Marsh, H. W., & Rowe, K. J. (1996). The negative effects ofschool-average ability on academic self-conceptFAnapplication of multilevel modeling. Australian Journal ofEducation, 40, 65 – 87.

Marsh, H. W., & Yeung, A. S. (1997a). Causal effects ofacademic self-concept on academic achievement: Struc-tural equation models of longitudinal data. Journal ofEducational Psychology, 89, 41 – 54.

Marsh, H. W., & Yeung, A. S. (1997b). Coursework selec-tion: The effects of academic self-concept and achieve-ment. American Educational Research Journal, 34, 691 – 720.

Marsh, H. W., & Yeung, A. S. (1998). Longitudinal struc-tural equation models of academic self-concept andachievement: Gender differences in the development ofmath and English constructs. American EducationalResearch Journal, 35, 705 – 738.

Meredith, W. (1993). Measurement invariance, factoranalysis and factorial invariance. Psychometrika, 58,525 – 543.

Raudenbush, S. W., & Bryk, A. S. (2002). Hierarchical linearmodels: Applications and data analysis methods (2nd ed.).Thousand Oaks, CA: Sage.

Renninger, K. A. (2000). How might the development of individual interest contribute to the conceptualization of intrinsic motivation? In C. Sansone & J. M. Harackiewicz (Eds.), Intrinsic and extrinsic motivation: The search for optimal motivation and performance. New York: Academic Press.

Robitaille, D., & Garden, R. (1989). The IEA Study of Mathematics II. Contents and outcomes of school mathematics. Oxford, England: Pergamon.

Schiefele, U. (1996). Motivation und Lernen mit Texten [Motivation and learning with texts]. Göttingen, Germany: Hogrefe.

Schiefele, U. (1998). Individual interest and learning: What we know and what we do not know. In L. Hoffmann, A. Krapp, K. A. Renninger, & J. Baumert (Eds.), Interest and learning (pp. 91–104). Kiel, Germany: Institut für die Pädagogik der Naturwissenschaften (IPN).

Schiefele, U., Krapp, A., & Winteler, A. (1992). Interest as predictor of academic achievement: A meta-analysis of research. In K. A. Renninger, S. Hidi, & A. Krapp (Eds.), The role of interest in learning and development (pp. 183–212). Hillsdale, NJ: Erlbaum.

Skaalvik, E. M., & Hagtvet, K. A. (1990). Academic achievement and self-concept: An analysis of causal predominance in a developmental perspective. Journal of Personality & Social Psychology, 58, 292–307.

Valentine, J. C., DuBois, D. L., & Cooper, H. (2004). The relations between self-beliefs and academic achievement: A systematic review. Educational Psychologist, 39, 111–133.

Watt, H. M. G. (2004). Development of adolescents’ self-perceptions, values, and task perceptions according to gender and domain in 7th- through 11th-grade Australian students. Child Development, 75, 1556–1574.

Wigfield, A. (1994). Expectancy-value theory of achievement motivation: A developmental perspective. Educational Psychology Review, 6, 49–78.

Wigfield, A., & Eccles, J. S. (1992). The development of achievement task values: A theoretical analysis. Developmental Review, 12, 265–310.

Wigfield, A., & Eccles, J. S. (2002). The development of competence beliefs, expectancies for success, and achievement values from childhood through adolescence. In A. Wigfield & J. S. Eccles (Eds.), Development of achievement motivation (pp. 173–195). San Diego, CA: Academic Press.

Wigfield, A., Eccles, J. S., Yoon, K. S., Harold, R. D., Arbreton, A., Freedman-Doan, K., et al. (1997). Changes in children’s competence beliefs and subjective task values across the elementary school years: A three-year study. Journal of Educational Psychology, 89, 451–469.

Wigfield, A., & Karpathian, M. (1991). Who am I and what can I do? Children’s self-concepts and motivation in achievement situations. Educational Psychologist, 26, 233–261.

Wylie, R. C. (1979). The self-concept (Vol. 2). Lincoln: University of Nebraska Press.


Appendix

Self-Concept and Interest Items Used in Studies 1 and 2

Math Self-Concept (Studies 1 and 2)

• I would much prefer math if it weren’t so hard. (1 = strongly disagree to 4 = strongly agree)

• Although I make a real effort, math seems to be harder for me than for my fellow students. (1 = strongly disagree to 4 = strongly agree)

• Nobody’s perfect, but I’m just not good at math. (1 = strongly disagree to 4 = strongly agree)

• Some topics in math are just so hard that I know from the start I’ll never understand them. (1 = strongly disagree to 4 = strongly agree)

• Math just isn’t my thing. (1 = strongly disagree to 4 = strongly agree)

Math Class-Specific Interest (Studies 1 and 2)

• How important is it for you to learn a lot in mathematics classes? (1 = not at all important to 5 = very important)

• Would you like mathematics classes to be taught more often? (1 = not at all to 5 = very much)

• How much do you look forward to mathematics classes? (1 = not at all to 5 = very much)

• How important is it for you to remember what you have learned in mathematics classes? (1 = not at all important to 5 = very important)

Math Domain-Specific Interest (Study 2 Only)

• It is important to me to be a good mathematician. (1 = strongly disagree to 4 = strongly agree)

• I enjoy working on mathematical problems. (1 = strongly disagree to 4 = strongly agree)

• Mathematics is one of the things that is important to me personally. (1 = strongly disagree to 4 = strongly agree)

• I would even give up some of my spare time to learn new topics in mathematics. (1 = strongly disagree to 4 = strongly agree)

• While working on a mathematical problem, it sometimes happens that I don’t notice time passing. (1 = strongly disagree to 4 = strongly agree)
