
On Interpreting Research on Stereotype Threat and Test Performance


threat as a factor to consider “beyond these threats” (p. 616).

Most important, the focus of stereotype threat is to explain the residual, that portion of variance left over in the racial achievement gap after prior preparation and skills (as roughly assessed by prior indicators such as college board scores) have been controlled. As Bowen and Bok (1998) documented, even at the most selective universities, there is a large race gap disfavoring Black students in graduation rates, grade point average, and class rank even after controlling for SAT scores, socioeconomic status, and high school grades (see also Jensen, 1980). Likewise, there is a large racial gap in SAT performance (of about 150 points) at every level of socioeconomic status as measured by family income (Hacker, 1995). This residual gap is called the “overprediction” or “underachievement” phenomenon. It is a large and persistent difference in scholastic success between the races that occurs even when extraneous factors are controlled. It is this gap that stereotype threat is aimed at explaining (see Steele & Aronson, 1995). It is this gap that has garnered the attention of the social sciences more generally. And it is this gap that Steele and Aronson observed in the ability-diagnostic condition of their studies when, even after controlling for student SAT scores, a large difference in test performance was found between Black students and White students. Given the context of the problem, controlling for prior differences in SAT is a perfectly appropriate laboratory analog to the real-world issue at hand.

The third point not conveyed by Sackett et al. (2004) is the existence of a growing body of work suggesting that the theoretical insights offered by stereotype threat can be applied to close the racial achievement gap in real classroom settings (Steele, 1997; see also Aronson, Fried, & Good, 2002; Good et al., 2003). When the perceived relevance and salience of negative stereotypes are reduced, African American students have been found to perform significantly better in school, sometimes dramatically. The utility of stereotype threat is the strongest gauge of its relevance and validity vis-à-vis understanding real race differences in intellectual attainment.

REFERENCES

Aronson, J., Fried, C. B., & Good, C. (2002). Reducing the effects of stereotype threat on African American college students by shaping theories of intelligence. Journal of Experimental Social Psychology, 38, 113–125.

Blascovich, J., Spencer, S. J., Quinn, D., & Steele, C. (2001). African Americans and high blood pressure: The role of stereotype threat. Psychological Science, 12, 225–229.

Bowen, W. G., & Bok, D. (1998). The shape of the river: Long-term consequences of considering race in college and university admissions. Princeton, NJ: Princeton University Press.

Croizet, J. C., & Claire, T. (1998). Extending the concept of stereotype and threat to social class: The intellectual underperformance of students from low socioeconomic backgrounds. Personality and Social Psychology Bulletin, 24, 588–594.

Good, C., Aronson, J., & Inzlicht, M. (2003). Improving adolescents’ standardized test performance: An intervention to reduce the effects of stereotype threat. Journal of Applied Developmental Psychology, 24, 645–662.

Hacker, A. (1995). Two nations: Black and white, separate, hostile, unequal. New York: Ballantine.

Jensen, A. R. (1980). Bias in mental testing. New York: Free Press.

Sackett, P. R., Hardison, C. M., & Cullen, M. J. (2004). On interpreting stereotype threat as accounting for African American–White differences on cognitive tests. American Psychologist, 59, 7–13.

Steele, C. M. (1997). A threat in the air: How stereotypes shape the intellectual identities and performance of women and African Americans. American Psychologist, 52, 613–629.

Steele, C. M., & Aronson, J. (1995). Stereotype threat and the intellectual test performance of African Americans. Journal of Personality and Social Psychology, 69, 797–811.

Correspondence concerning this comment should be addressed to Geoffrey Cohen, Department of Psychology, Yale University, 2 Hillhouse Avenue, Box 208205, New Haven, CT 06520-8205. E-mail: [email protected]

DOI: 10.1037/0003-066X.60.3.271

On Interpreting Research on Stereotype Threat and Test Performance

Paul R. Sackett and Chaitra M. Hardison
University of Minnesota, Twin Cities Campus

Michael J. Cullen
Personnel Decisions Research Institutes

We are gratified that our article (Sackett, Hardison, & Cullen, January 2004) prompted these thoughtful reactions. We offer comments on each. Wicherts (2005, this issue) focused on the assumptions underlying the use of analysis of covariance and noted possible violations of these assumptions in the use of prior test scores as covariates in stereotype threat research. We agree that violations are possible, though at this point the likelihood and impact of these violations are matters of speculation. We take this opportunity to note that including a prior test score as a covariate is not a critical element of testing stereotype threat theory. There are sound reasons for including a prior test score as a covariate, such as increasing the power to detect an effect via reduction of the error term. However, reporting results without the covariate would permit a straightforward examination of subgroup differences in threat versus nonthreat conditions.

A theme of both the Helms (2005, this issue) comment and the Cohen and Sherman (2005, this issue) comment is that studies other than the Steele and Aronson (1995) work that was the focus of our article are important for understanding the effects of stereotype threat. We fully agree that it is important to consider the full range of research on stereotype threat. We certainly do not believe that the question of the effects of stereotype threat on test scores in high-stakes settings is settled; in fact, our own work continues to explore this question (e.g., Cullen, Hardison, & Sackett, 2004). But we also urge careful examination of the research studies cited by Helms and by Cohen and Sherman.

Helms (2005) presented a reanalysis of data from McKay, Doverspike, Bowen-Hilton, and McKay (2003) and proposed a regression-based mediation analysis as a means of testing whether stereotype threat accounts for group differences in mean test scores. She reported data showing that race has no effect on test scores once measured stereotype threat is controlled, which leads her to the conclusion that in this data set, threat does explain the racial group difference. We raise two concerns here about the procedure and her interpretation of the McKay et al. (2003) data. First, readers may be confused because of an initial misstatement about the requirements for mediation. Helms stated that “for stereotype threat to account for racial-group differences in test scores, measures or manipulations of it only have to account for at least as much variance in racial group as racial group explains in test scores” (p. 269). In fact, the requirement for full mediation is not that race and threat covary to at least the same degree as race and test scores covary but rather that race accounts for the same variance in test scores as is accounted for by threat. In other words, the effect of race on test performance is transmitted via stereotype threat, such that race has no effect on test score once stereotype threat is controlled.
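The full-mediation requirement described above can be made concrete with a small regression sketch. The data below are invented for illustration and are not the McKay et al. (2003) data; the variable names (`group`, `threat`, `score`) and the helper `ols` are hypothetical. The scores are constructed to depend on threat alone, so the group coefficient should vanish once threat enters the model. The sketch shows only the statistical pattern; it says nothing about causal direction.

```python
# Synthetic illustration only: these numbers are invented, not McKay et al.'s data.

def ols(predictors, y):
    """Return least-squares coefficients [intercept, b1, b2, ...]."""
    n = len(y)
    X = [[1.0] + [p[i] for p in predictors] for i in range(n)]
    k = len(X[0])
    # Augmented normal equations [X'X | X'y].
    A = [[sum(X[r][i] * X[r][j] for r in range(n)) for j in range(k)]
         + [sum(X[r][i] * y[r] for r in range(n))]
         for i in range(k)]
    # Gauss-Jordan elimination with partial pivoting.
    for c in range(k):
        p = max(range(c, k), key=lambda r: abs(A[r][c]))
        A[c], A[p] = A[p], A[c]
        pivot = A[c][c]
        A[c] = [v / pivot for v in A[c]]
        for r in range(k):
            if r != c:
                f = A[r][c]
                A[r] = [vr - f * vc for vr, vc in zip(A[r], A[c])]
    return [A[i][k] for i in range(k)]

group = [0, 0, 0, 0, 1, 1, 1, 1]            # hypothetical group indicator
threat = [1, 2, 3, 2, 3, 4, 5, 4]           # hypothetical measured threat
score = [20 - 2 * t for t in threat]        # score depends on threat only

total = ols([group], score)[1]              # c: group effect, threat ignored
a = ols([group], threat)[1]                 # a: group difference in threat
_, direct, b = ols([group, threat], score)  # c' and b: both predictors entered

print(f"total effect c   = {total:.2f}")
print(f"direct effect c' = {direct:.2f}")
print(f"indirect a*b     = {a * b:.2f}")
```

In this constructed case the total group effect equals the indirect effect (a times b) and the direct effect is zero, which is the pattern full mediation requires. Real data would rarely be this clean and would call for standard errors on each coefficient.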

271 April 2005 ● American Psychologist

Second, Helms’s (2005) analytic method is based on the assumption of a causal link between stereotype threat and test performance. However, McKay et al. (2003) measured stereotype threat via self-report after taking the test in question. They acknowledged that the self-report of threat may be in response to perceived performance on the test (i.e., poor performance may cause reported threat, rather than threat causing poor performance). This is one reason for the use of experimental methods in this domain: Comparing Black–White test score differences in the presence and absence of experimentally induced threat resolves this critical causal ambiguity. Note that the McKay et al. (2003) article uses a subset of a larger data set used by McKay, Doverspike, Bowen-Hilton, and Martin (2002). That study assigned participants to threat and nonthreat conditions and reported a nonsignificant correlation of -.02 between the self-report measure of threat used in Helms’s reanalysis and assignment to experimental condition. This lack of correlation between manipulated and measured threat warrants caution in interpreting these data.

Cohen and Sherman (2005) raised three points. First, they cited three other studies that we did not discuss and that they asserted show large stereotype threat effects, reducing or eliminating group differences. Two of these studies did address race. Both are interesting and well done, but neither directly addresses the question of whether the mind-set with which examinees approach a test (e.g., induced threat vs. no threat) has an effect on the mean score gap on a test with a well-documented mean score gap. Blascovich, Spencer, Quinn, and Steele (2001) used the 20-item Remote Associates Test, commonly used as a measure of creativity. The Remote Associates Test is well suited for use in stereotype threat research, as it contains items of a type likely to be unfamiliar to students and is thus amenable to a construct labeling manipulation. Blascovich et al. found a significant threat effect for difficult items, though not for easy or moderate items. For the difficult items, African American students scored one half of one item higher in the nonthreat condition than in the threat condition; in the nonthreat condition, scores of African American and White students did not differ. No race differences were found for easy or moderate difficulty items. Thus, even in the threat condition the overall group mean difference was small.

The other study involving race was Good, Aronson, and Inzlicht’s (2003) examination of the effects of a mentoring program on end-of-year achievement scores of a sample of low-income, largely minority seventh graders. A control group was contrasted with three treatments designed to reduce stereotype threat (e.g., a condition in which mentors emphasized viewing intelligence as malleable rather than fixed). Two treatment conditions produced significantly higher reading scores than the control condition. Note that the intervention is much broader than altering the mind-set with which an examinee approaches a test. In this study, the intervention had the potential to alter the mind-set with which students approached their academic work throughout the course of a school year. Thus, although this is innovative and important research, it should not be interpreted as showing the effects of instructional set directly on test scores.

Cohen and Sherman’s (2005) second point was to note that Steele and colleagues have not argued that threat alone accounts for the racial achievement gap and to articulate a rationale for the use of a prior test score as a covariate in stereotype threat research: namely, to examine performance differences with and without induced threat after controlling for prior achievement. We agree with Cohen and Sherman on both points. We emphasized repeatedly in our article that any claims that removing threat eliminated racial differences were claims made by those interpreting Steele and Aronson’s (1995) work in the popular and scientific press, not claims made by Steele and Aronson. We do not take issue with using a prior test score as a covariate; our concerns were with commentators’ failure to recognize the implications of the use of a covariate (i.e., that “no difference in covariate-adjusted scores” is not the same thing as “no difference in test scores”).
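The distinction between covariate-adjusted and unadjusted differences can be illustrated with a minimal numerical sketch. The scores below are invented for two hypothetical groups (labeled A and B) and come from no study discussed here. The adjustment uses the classical analysis of covariance formula: the raw mean difference minus the pooled within-group slope times the group difference on the covariate.

```python
from statistics import mean

# Synthetic illustration only: invented scores for two hypothetical groups.
prior_a = [1000, 1100, 1200, 1300]   # covariate (prior test score), group A
test_a = [10.0, 11.0, 12.0, 13.0]    # outcome (new test score), group A
prior_b = [850, 950, 1050, 1150]     # covariate, group B
test_b = [8.5, 9.5, 10.5, 11.5]      # outcome, group B

def cross_dev(x, y):
    """Sum of cross-products of deviations from the group means."""
    mx, my = mean(x), mean(y)
    return sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))

# Pooled within-group regression slope of the outcome on the covariate.
slope = ((cross_dev(prior_a, test_a) + cross_dev(prior_b, test_b))
         / (cross_dev(prior_a, prior_a) + cross_dev(prior_b, prior_b)))

# Unadjusted gap in the outcome, then the gap after removing the part
# predicted by the groups' difference on the covariate.
raw_diff = mean(test_b) - mean(test_a)
adjusted_diff = raw_diff - slope * (mean(prior_b) - mean(prior_a))

print(f"raw difference      = {raw_diff:.2f}")
print(f"adjusted difference = {adjusted_diff:.2f}")
```

In this constructed example the groups differ by 1.5 points on the new test, yet the adjusted difference is zero because the new-test gap is exactly what the prior-score gap predicts. A report of "no adjusted difference" would therefore be accurate while the unadjusted gap remains.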

Cohen and Sherman’s (2005) third point was that insights from stereotype threat theory are proving useful in designing interventions in classroom settings aimed at reducing the racial achievement gap, and they cited studies in this domain that we did not mention in our article. Our focus was on misinterpretations of Steele and Aronson’s (1995) classic article examining the effects of stereotype threat on the test performance of African American and White students. We were particularly interested in the implications of stereotype threat for understanding the commonly observed African American–White mean test score difference on tests widely used for high-stakes decisions, such as educational admissions and employment. Given our narrow purpose, our article did not review the literature on stereotype threat and classroom intervention, but we concur with Cohen and Sherman as to its promise (see the discussion of Good et al., 2003, above).

In brief, we welcome the thoughtful insights of Wicherts (2005), Helms (2005), and Cohen and Sherman (2005), and we hope that these comments stimulate further critical analysis of methodological issues associated with stereotype threat research. We do not dispute that stereotype threat is a real phenomenon or that it remains a potentially important contributor to the racial achievement gap. We encourage researchers to continue their efforts to determine what role stereotype threat plays in contributing to that gap, especially in real-world testing situations.

REFERENCES

Blascovich, J., Spencer, S. J., Quinn, D., & Steele, C. (2001). African Americans and high blood pressure: The role of stereotype threat. Psychological Science, 12, 225–229.

Cohen, G. L., & Sherman, D. K. (2005). Stereotype threat and the social and scientific contexts of the race achievement gap. American Psychologist, 60, 270–271.

Cullen, M. J., Hardison, C. M., & Sackett, P. R. (2004). Using SAT-grade and ability-job performance relationships to test predictions derived from stereotype threat theory. Journal of Applied Psychology, 89, 220–230.

Good, C., Aronson, J., & Inzlicht, M. (2003). Improving adolescents’ standardized test performance: An intervention to reduce the effects of stereotype threat. Journal of Applied Developmental Psychology, 24, 645–662.

Helms, J. E. (2005). Stereotype threat might explain the Black–White test-score difference. American Psychologist, 60, 269–270.

McKay, P. F., Doverspike, D., Bowen-Hilton, D., & Martin, Q. D. (2002). Stereotype threat effects on the Raven Advanced Progressive Matrices scores of African-Americans. Journal of Applied Social Psychology, 32, 767–787.

McKay, P. F., Doverspike, D., Bowen-Hilton, D., & McKay, Q. D. (2003). The effects of demographic variables and stereotype threat on the Black/White differences in cognitive ability test performance. Journal of Business and Psychology, 18, 1–14.

Sackett, P. R., Hardison, C. M., & Cullen, M. J. (2004). On interpreting stereotype threat as accounting for African American–White differences on cognitive tests. American Psychologist, 59, 7–13.

Steele, C. M., & Aronson, J. (1995). Stereotype threat and the intellectual test performance of African Americans. Journal of Personality and Social Psychology, 69, 797–811.

Wicherts, J. M. (2005). Stereotype threat research and the assumptions underlying analysis of covariance. American Psychologist, 60, 267–269.

Correspondence concerning this comment should be addressed to Paul R. Sackett, Department of Psychology, University of Minnesota, Elliott Hall, 75 East River Road, Minneapolis, MN 55455. E-mail: [email protected]
