
Industrial and Organizational Psychology, 4 (2011), 494–514. Copyright © 2011 Society for Industrial and Organizational Psychology. 1754-9426/11

FOCAL ARTICLE

The Uniform Guidelines Are a Detriment to the Field of Personnel Selection

MICHAEL A. MCDANIEL, SVEN KEPES, AND GEORGE C. BANKS
Virginia Commonwealth University

Abstract

The primary federal regulation concerning employment testing has not been revised in over 3 decades. The regulation is substantially inconsistent with scientific knowledge and professional guidelines and practice. We summarize these inconsistencies and outline the problems faced by U.S. employers in complying with the regulations. We describe challenges associated with changing federal regulations and invite commentary as to how such changes can be implemented. We conclude that professional organizations, such as the Society for Industrial and Organizational Psychology (SIOP), should be much more active in promoting science-based federal regulation of employment practices.

For most of the history of the United States, the employment opportunities of ethnic and racial minorities, women, and older adults were substantially restricted. With the enactment of federal civil rights legislation, the U.S. government sought to end such employment discrimination. The Uniform Guidelines on Employee Selection Procedures (Equal Employment Opportunity Commission, Civil Service Commission, Department of Labor, & Department of Justice, 1978), hereafter ‘‘Uniform Guidelines,’’ are U.S. federal guidelines, ‘‘which are designed to assist employers [. . .] to comply with requirements of federal law prohibiting employment practices which discriminate on grounds of race, color, religion, sex, and national origin. They are designed to provide a framework for determining the proper use of tests and other selection procedures’’ (Section 1B). The Uniform Guidelines evolved from federal legislative actions and court decisions related to employment discrimination in the United States. As such, these 33-year-old guidelines have substantial influence on how employers, industrial and organizational (I–O) psychologists, and other practitioners in personnel selection conduct their work.

Correspondence concerning this article should be addressed to Michael A. McDaniel. E-mail: [email protected]

Address: Virginia Commonwealth University, Snead Hall, 301 W. Main St., PO Box 844000, Richmond, VA 23284-4000

This paper has benefited substantially from the feedback of several individuals. Their help has been appreciated.

In this article, we present arguments that the Uniform Guidelines are scientifically inaccurate and inconsistent with professional practice as summarized in the Standards for Educational and Psychological Testing (American Educational Research Association, American Psychological Association, & National Council on Measurement in Education, 1999), hereafter ‘‘Standards,’’ and the Principles for the Validation and Use of Personnel Selection Procedures (SIOP, 2003), hereafter ‘‘Principles.’’ We use these arguments to conclude that the Uniform Guidelines should be rescinded or at least extensively revised to be made consistent with current scientific knowledge and professional practice.

Encouraging Debate for the Betterment of Personnel Selection Practice

A discussion of the Uniform Guidelines is, in part, a discussion of mean racial differences. Past high-profile examinations of race-related issues (e.g., Herrnstein & Murray, 1994; Jensen, 1969) have been highly emotive. Within I–O psychology, the discussion of race is embedded in papers addressing high-stakes testing as well as personnel selection and job performance (e.g., McKay & McDaniel, 2006; Roth, Bevier, Bobko, Switzer, & Tyler, 2001; Sackett, Schmitt, Ellingson, & Kabin, 2001; Schmitt & Quinn, 2010), and these topics can also arouse emotion. In our experience, these topics tend not to be discussed in an open and professional manner and may degenerate into argumentum ad hominem, such as asserting that researchers who study demographic mean differences or who are critics of the Uniform Guidelines are racists, sexists, ageists, or are unsupportive of equal employment opportunity.

We note that nothing in our arguments for rescindment or extensive revision of the Uniform Guidelines is contrary to the authors’ full support of equal employment opportunity. Nor are the arguments contrary to affirmative action or diversity efforts. Furthermore, the authors are strong advocates of continued research in understanding and reducing demographic mean differences in personnel selection tests and in assessments of job performance.

By presenting our arguments for the rescindment or revision of the Uniform Guidelines, we are hoping to foster a professional and collegial debate. Our article draws in part from previous work that either critiques the Uniform Guidelines or highlights differences between the Uniform Guidelines and the Standards and/or Principles (e.g., Biddle, 2010; Cascio & Aguinis, 2001; Copus, 2006; Daniel, 2001; Ewoh & Guseh, 2001; Jeanneret, 2005; Kleiman & Faley, 1985; McDaniel, 2007, 2010; O’Boyle & McDaniel, 2008; Sharf, 2006, 2008). We suggest that the lack of professional debate concerning the Uniform Guidelines damages the profession of I–O psychology by encouraging the use of personnel selection practices unsupported by scientific evidence. The lack of debate also encourages the gerrymandering of personnel selection practices (McDaniel, 2009), and a general disregard of the ethics of such practices. Furthermore, we suggest that the continued inaction of our professional organizations (e.g., SIOP) with respect to the inconsistency of the Uniform Guidelines with scientific knowledge and professional practice is unwise.

We begin the article with the assertion that the authoring agencies of the Uniform Guidelines made unfulfilled promises to keep the Uniform Guidelines and their interpretation consistent with scientific knowledge and professional practice. We then review sections of the Uniform Guidelines that are most disparate with scientific knowledge and professional practice. We offer evidence concerning the prevalence of racial disparities in employment screening results and suggest that these disparities should not generally trigger federal interference in personnel selection practices. We offer examples of how science and federal regulatory agencies interact. Finally, we call on the authoring agencies of the Uniform Guidelines to initiate a revision and provide suggestions for how SIOP and other professional organizations can encourage science-based federal regulation of employment practices.

The Unfulfilled Promises of the Uniform Guidelines

There is precedent for the revision of federal regulations related to employee selection. Before the Uniform Guidelines were issued, the EEOC released employment testing regulations in 1966 (Guidelines on Employment Testing Procedures) and in 1970 (Guidelines on Employee Selection Procedures). The U.S. Civil Service Commission, the Department of Labor, and the Department of Justice had guidelines for similar purposes (Daniel, 2001). The issuance of successive guidelines may be viewed as an effort to maintain consistency with federal court decisions and scientific knowledge (Daniel, 2001). To avoid confusion among the differing guidelines issued by the four governmental agencies, the Uniform Guidelines were jointly issued in 1978 by the four agencies. They asserted that the Uniform Guidelines were intended to be consistent with professional practice and scientific findings. Specifically, in a section titled ‘‘Guidelines are consistent with professional standards,’’ the Uniform Guidelines state:

The provisions of these guidelines relating to validation of selection procedures are intended to be consistent with generally accepted professional standards for evaluating standardized tests and other selection procedures, such as those described in the Standards for Educational and Psychological Tests prepared by a joint committee of the American Psychological Association, the American Educational Research Association, and the National Council on Measurement in Education (American Psychological Association, Washington, D.C., 1974) (hereinafter ‘‘A.P.A. Standards’’) and standard textbooks and journals in the field of personnel selection. (Section 5C)

The Uniform Guidelines also asserted that new scientific findings would be evaluated. In Section 5A, they state that ‘‘new strategies for showing the validity of selection procedures will be evaluated as they become accepted by the psychological profession.’’ The Uniform Guidelines, when published in the Federal Register, included Supplementary Information, which includes the statement: ‘‘Validation has become highly technical and complex, yet is constantly changing [. . .] Once the guidelines are issued, they will have to be interpreted in light of changing factual, legal, and professional circumstances’’ (p. 38292). With respect to construct validity, it is stated that the ‘‘guidelines leave open the possibility that different evidence of construct validity may be accepted in the future, as new methodologies develop and become incorporated in professional standards and other professional literature’’ (p. 38295). Thus, the agency authors of the Uniform Guidelines indicated that the guidelines and their interpretation should recognize advances in scientific knowledge and professional practice.

Scientific Knowledge, Professional Practice, and the Uniform Guidelines

Unfortunately for those who work in personnel selection and for the U.S. employers to whom they provide services, the authoring agencies of the Uniform Guidelines have failed to keep their promises to maintain and update the Uniform Guidelines. Thus, the next sections examine aspects of the Uniform Guidelines that substantially deviate from scientific knowledge and professional practice, ranging from the Guidelines’ view of the situational specificity hypothesis to the lack of acknowledgement of the diversity–validity dilemma.

The Uniform Guidelines Embrace the Situational Specificity Hypothesis

Beginning in the 1920s and continuing into the 1970s, it was observed that the same employment test yielded different validity results across settings (Schmidt & Hunter, 1998). For example, a test to screen bank tellers in one bank would yield a high validity (i.e., a high magnitude correlation between the test and job performance) but could yield a much lower validity for bank tellers in a bank across the street. Such findings were frequent and led to speculation that there were as yet undiscovered characteristics of employment situations that caused a test to be valid for one location but not for another. This speculation became known as the situational specificity hypothesis, which was widely accepted as fact (Guion, 1975; Schmidt & Hunter, 2003).

Given that the situational specificity hypothesis suggested that there were unknown causes of validity differences despite apparently similar employment situations and jobs, professional practice emphasized the conduct of detailed job analyses. There was an assumption that conducting detailed job analyses would uncover differences among employment situations that caused validities to vary across similar situations and jobs. As knowledge of the validity of a test in one situation for a given job did not always predict the validity of the same test in a similar situation and job, professional practice emphasized conducting local validation studies. Consistent with this thinking, the Uniform Guidelines emphasized the practices of detailed job analyses and local validation studies.

Beginning in 1977, Schmidt and Hunter began publishing empirical evidence discrediting the situational specificity hypothesis. Specifically, they demonstrated that much of the variability in validity coefficients across studies was due to random sampling error. Any primary study examining the correlation between a test and job performance seeks to estimate the validity coefficient in the population. When sample sizes are relatively small (e.g., N < 500), the samples have a high probability of being nonrepresentative of the population and thus likely to offer an imprecise estimate of the population validity. Thus, the validity coefficient derived from a small sample might over- or underestimate the population validity. At the time of Guion’s classic text (Guion, 1965), the average sample size in a validity study was 68. We now know that this sample size is far too small to estimate the true validity of a test in the population accurately. For instance, a test with a population validity of .20 could easily yield sample validities ranging from −.04 to .42¹ based on sample sizes of 68. Thus, small sample studies make validity coefficients appear unstable even when they are constant in the population.
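
The interval cited above (population validity of .20, N = 68) can be reproduced with the Fisher z transformation. The short Python sketch below is offered for illustration only; the function name and the use of the 1.96 normal critical value are our assumptions, not part of the original article.

```python
import math

def validity_ci(r, n, z_crit=1.96):
    """Approximate 95% CI for a correlation via the Fisher z transformation."""
    z = math.atanh(r)                    # transform r to Fisher z
    se = 1.0 / math.sqrt(n - 3)          # standard error of z
    lo, hi = z - z_crit * se, z + z_crit * se
    return math.tanh(lo), math.tanh(hi)  # back-transform to the r metric

# A population validity of .20 estimated with the era-typical sample size of 68:
lo, hi = validity_ci(0.20, 68)
print(f"95% CI: [{lo:.2f}, {hi:.2f}]")
```

Running the sketch recovers roughly the −.04 to .42 range given in Footnote 1, illustrating how unstable a single small-sample validity coefficient can be.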

The Emphasis of the Uniform Guidelines on Local Validation Studies

The Uniform Guidelines require validity evidence when a test demonstrates adverse impact (i.e., differential hiring rates by race, gender, etc.). Yet, for most employers, local empirical validity studies are professionally ill-advised because of sample-size limitations. In contrast, the Uniform Guidelines are largely oblivious to sample size issues in test validation. The Principles acknowledge that ‘‘validation planning must consider the feasibility of the design requirements necessary to support an inference of validity. Validation efforts may be limited by time, resource availability, sample size, or other organization constraints including cost’’ (p. 10). From the perspective of precision in estimating a population validity coefficient, sample sizes below 100 are clearly inadequate, yet 79% of U.S. employers have fewer than 100 employees and 84% have fewer than 500 employees (U.S. Census Bureau, 2007). The employees of these small- to medium-sized businesses would likely be found in multiple occupations, further reducing the sample size available for a concurrent validation study of a single occupation. Similarly, such small employers are likely to hire relatively few employees in a given time period, making predictive validity studies infeasible as well. In brief, only a small percentage of employers have enough employees in a given occupation to permit credible local criterion-related validity documentation. Thus, with respect to criterion-related validity evidence, the Uniform Guidelines seek documentation that cannot be provided by the majority of U.S. employers.²

1. A point estimate of .2 with a sample size of 68 leads to a 95% confidence interval ranging from −.04 to .42.
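
The adverse-impact trigger mentioned above is commonly operationalized with the four-fifths rule of Uniform Guidelines Section 4D: a selection rate for any group that is less than four-fifths of the rate for the highest-scoring group is generally regarded as evidence of adverse impact. A minimal sketch, with fabricated applicant-flow counts:

```python
def four_fifths_ratio(hired_focal, applicants_focal, hired_ref, applicants_ref):
    """Ratio of the focal group's selection rate to the reference group's rate."""
    rate_focal = hired_focal / applicants_focal
    rate_ref = hired_ref / applicants_ref
    return rate_focal / rate_ref

# Hypothetical flow: 12 of 40 minority applicants hired vs. 48 of 80 majority applicants.
ratio = four_fifths_ratio(12, 40, 48, 80)
adverse_impact = ratio < 0.8  # four-fifths (80%) threshold
print(f"impact ratio = {ratio:.2f}; adverse impact indicated: {adverse_impact}")
```

Here the minority selection rate (.30) is half the majority rate (.60), so the .50 impact ratio falls below the four-fifths threshold and, under the Guidelines, validity evidence would be demanded.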

The Uniform Guidelines and Evidence for Validity Based on Content Similarity

We note that both the Principles and the Uniform Guidelines address standards for validity documentation.³ However, the Uniform Guidelines adopted a curious stance with respect to what job-related personal characteristics can and cannot be defended based on content evidence. Without any stated science-based justification, the Uniform Guidelines declare:

A selection procedure based upon inferences about mental processes cannot be supported solely or primarily on the basis of content validity. Thus, a content strategy is not appropriate for demonstrating the validity of selection procedures which purport to measure traits or constructs, such as intelligence, aptitude, personality, commonsense, judgment, leadership, and spatial ability. (Section C1)

We note that this section of the Uniform Guidelines appears to rule out a content validity defense for some very common selection constructs, including general and specific tests of cognitive ability and the Big Five personality traits. It would also appear to exclude content validity as a defense for most interviews, assessment centers, and situational judgment tests to the extent that the measures seek to assess constructs associated with cognitive ability, personality, and leadership.⁴ This situation leaves most U.S. employers in a very bad situation because few employers have sufficient employees or applicants to conduct a criterion-related validity study, and they are further precluded from using a content validity strategy to defend reasonable tests of cognitive ability or personality.

The Uniform Guidelines do not appear to appreciate problems created in organizations as a result of the regulation. For example, the Uniform Guidelines approach to content validity is problematic for many organizations with rapidly evolving work and flexible occupational structures. In contrast, the Principles note that organizations experiencing ‘‘rapid changes in the external environment, the nature of work, or processes for accomplishing work may find that traditional jobs no longer exist. In such cases, considering the competencies or broad requirements for a wider range or type of work activity may be more appropriate’’ (p. 9). In addition, the Principles note the value of a less detailed approach to job analysis than is found in the Uniform Guidelines:

A less detailed analysis may be appropriate when prior research about the job requirements allows the generation of sound hypotheses concerning the predictors or criteria across job families or organizations. When a detailed analysis of work is not required, the researcher should compile reasonable evidence establishing that the job(s) in question are similar in terms of work behavior and/or required knowledge, skills, abilities, and/or other characteristics, or falls into a group of jobs for which validity can be generalized. (p. 11)

2. We note that this requirement from the Uniform Guidelines has led to consortium groups (e.g., Edison Electric Institute and Mayflower) that conduct industry-wide selection validation studies. However, although these consortiums are useful to a few large industries (e.g., electric utilities), they have limited applicability to many U.S. employers.

3. We have some concerns regarding the use of the Uniform Guidelines as a cookbook for job analysis. However, these concerns are criticisms of job analysts and not so much the Uniform Guidelines.

4. We recognize that content validity documentation in practice is often offered for mental constructs and measurement methods such as assessment centers. This is done in part by changing what one calls constructs. Thus, an employment test assessing intelligence (i.e., general cognitive ability) by a composite of three ability tests (reading comprehension, numerical fluency through tables, and reasoning) would be presented as the following attributes: ability to read, ability to work with tables, and ability to solve problems.


We assert that cost and time constraints make the Uniform Guidelines’ content validity requirements burdensome for many U.S. employers because they may lack the financial and technical resources to fully comply with them. The Principles address feasibility limitations on job analysis for content validity: ‘‘Among these issues are the stability of the work and the worker requirements, the interference of irrelevant content, the availability of qualified and unbiased subject matter experts, and cost and time constraints’’ (p. 21).

The Uniform Guidelines and Evidence for Validity Based on Construct Validity

The Standards state that validation begins with ‘‘an explicit statement of the proposed interpretation of test scores, along with a rationale for the relevance of the interpretation to the proposed use. The proposed interpretation refers to the constructs or concepts the test is intended to measure’’ (p. 9). Thus, although all validation concerns constructs, the Uniform Guidelines adopted a curious position concerning construct approaches to validity evidence:

Construct validity is a more complex strategy than either criterion-related or content validity. Construct validation is a relatively new and developing procedure in the employment field, and there is at present a lack of substantial literature extending the concept to employment practices. The user should be aware that the effort to obtain sufficient empirical support for construct validity is both an extensive and arduous effort involving a series of research studies, which include criterion related validity studies and which may include content validity studies. Users choosing to justify use of a selection procedure by this strategy should therefore take particular care to assure that the validity study meets the standards set forth below. (Section D1)

This wording made it largely impossible to use construct evidence as a validity defense under the Uniform Guidelines. Counter to the statement in the Supplementary Information (p. 38295) of the Uniform Guidelines concerning the evaluation of new scientific approaches to construct validity, the Uniform Guidelines have never been revised with respect to construct validity.

In contrast to the nonscientific assertions of the Uniform Guidelines, the Principles and Standards recognize the importance of varied approaches to construct evidence in support of validity. The Principles highlight the value of validity evidence demonstrating the relationship between an employment test and other variables. For example, the Principles state that ‘‘evidence that two measures are highly related and consistent with the underlying construct can provide convergent evidence in support of the proposed interpretation of test scores as representing a candidate’s standing on the construct of interest’’ (p. 5). The Principles also discuss the usefulness of discriminant validity and the value of evidence relating to the internal structure of the test. For example, a high degree of item internal consistency would be supportive of a test argued to represent a single construct.
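
Internal consistency of the kind described above is commonly indexed with Cronbach's alpha. The sketch below computes it from scratch; the three-item scale and its responses are invented purely for illustration:

```python
def cronbach_alpha(items):
    """items: one list of scores per item, all rated by the same respondents."""
    def var(xs):  # unbiased sample variance
        m = sum(xs) / len(xs)
        return sum((x - m) ** 2 for x in xs) / (len(xs) - 1)

    k = len(items)
    totals = [sum(scores) for scores in zip(*items)]  # each respondent's total
    return (k / (k - 1)) * (1 - sum(var(i) for i in items) / var(totals))

# Hypothetical 3-item scale answered by five respondents (1-5 ratings):
items = [
    [4, 5, 3, 5, 2],
    [4, 4, 3, 5, 3],
    [5, 5, 2, 4, 2],
]
print(f"alpha = {cronbach_alpha(items):.2f}")
```

An alpha near .90, as these made-up responses yield, is the kind of internal-structure evidence the Principles treat as supporting a single-construct interpretation.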

The Uniform Guidelines and Its 1950s Perspective on Separate ‘‘Types’’ of Validity

The Principles note that in the early 1950s three different types of test validity were considered, these being content, criterion-related, and construct. The measurement literature has since adopted the perspective that validity is a unitary concept in which different sources of information can inform inferences about test scores. The Principles emphasize that ‘‘nearly all information about a selection procedure, and inferences about the resulting scores, contributes to an understanding of its validity. Evidence concerning content relevance, criterion relatedness, and construct meaning is subsumed within this definition of validity’’ (p. 4). In contrast to the professional practice summarized in the current Principles and Standards, the Uniform Guidelines continue to embrace the 1950s perspective on three distinct types of validity.

The Uniform Guidelines and Meta-Analysis as a Source of Validity Documentation

The early work of Schmidt, Hunter, and colleagues (e.g., Pearlman, Schmidt, & Hunter, 1980; Schmidt, Gast-Rosenberg, & Hunter, 1980; Schmidt & Hunter, 1977) concerning situational specificity evolved into psychometric meta-analysis procedures (Hunter & Schmidt, 2004). The application of meta-analysis to validity data became known as validity generalization, and a test was argued to show validity generalization when a large majority (typically 90%) of population validities were above 0. The Standards and the Principles endorse validity generalization as evidence of the validity of employment tests. The Principles, for instance, note:

Meta-analysis is the basis for the technique that is often referred to as ‘‘validity generalization.’’ In general, research has shown much of the variation in observed differences in obtained validity coefficients in different situations can be attributed to sampling error and other statistical artifacts (Ackerman & Humphreys, 1990; Barrick & Mount, 1991; Callender & Osburn, 1980, 1981; Hartigan & Wigdor, 1989; Hunter & Hunter, 1984; Schmidt, Hunter, & Pearlman, 1981). These findings are particularly well established for cognitive ability tests; additional recent research results also are accruing that indicate the generalizability of predictor-criterion relationships for noncognitive constructs in employment settings. (p. 28)

From the perspective of scientific knowledge, meta-analytic evidence largely eliminates the need for local validity studies. Specifically, only if ‘‘important conditions in the operational setting are not represented in the meta-analysis (e.g., the local setting involves a managerial job and the meta-analytic data base is limited to entry level jobs)’’ do the Principles state that local individual studies ‘‘may be more accurate than the average predictor-criterion relationship reported in a meta-analytic study’’ (p. 29). In addition to the acceptance of validity generalization in professional standards, courts have found in favor of generalizing validity evidence (see Sharf, 2006).

We recognize that most of the evidence concerning validity generalization was developed after the publication of the Uniform Guidelines. However, the Uniform Guidelines have never been revised to acknowledge the role of meta-analysis in demonstrating the validity of employment tests. Reliance on validity generalization evidence may be one of the most economical approaches to test validation, and its omission from the Uniform Guidelines is inappropriate.
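
A bare-bones version of the psychometric meta-analysis described above can be sketched in a few lines. The validity coefficients and sample sizes below are fabricated, and artifact corrections (e.g., range restriction, criterion unreliability) are deliberately omitted:

```python
# Bare-bones Hunter-Schmidt meta-analysis of hypothetical validity coefficients.
rs = [0.10, 0.25, 0.18, 0.30, 0.12, 0.22]   # observed validities from k studies
ns = [55, 80, 60, 120, 45, 70]              # their sample sizes

total_n = sum(ns)
r_bar = sum(r * n for r, n in zip(rs, ns)) / total_n          # weighted mean validity
var_obs = sum(n * (r - r_bar) ** 2 for r, n in zip(rs, ns)) / total_n
var_err = (1 - r_bar ** 2) ** 2 / (total_n / len(rs) - 1)     # expected sampling-error variance
var_rho = max(var_obs - var_err, 0.0)       # estimated variance of population validities

# 90% credibility value: if it exceeds 0, validity is said to generalize.
cv_90 = r_bar - 1.28 * var_rho ** 0.5
print(f"mean r = {r_bar:.2f}, residual variance = {var_rho:.4f}, 90% CV = {cv_90:.2f}")
```

In this fabricated example the expected sampling-error variance exceeds the observed variance, so the residual variance is zero and the 90% credibility value equals the mean validity: sampling error alone accounts for all of the apparent situational variation, which is the pattern Schmidt and Hunter documented.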

We speculate that a primary reason why the Uniform Guidelines have not been revised to incorporate validity generalization as an acceptable validity defense is that it might change the litigation landscape significantly. There are concerns that assessments with strong validity generalization support, such as general cognitive ability, will become more widely used and result in a less racially diverse workforce. There are also individuals and organizations, such as employment attorneys, expert witnesses, employment testing consultants, and enforcement agencies, whose business is driven, in part, by the Uniform Guidelines. If litigation becomes less frequent due to a wider acceptance of validity generalization as a validity defense, some individuals and organizations will suffer financial harm. Finally, there are some who are worried that validity generalization could be applied inappropriately as a validation defense. This concern could be reduced by more guidance, such as is found in the Principles, concerning how validity generalization results may be applied appropriately to specific testing situations (Banks & McDaniel, in press; McDaniel, 2007).


The Uniform Guidelines and Restrictions on Transportability of Evidence

Although applications of meta-analysis to validity data may be viewed as transportability of evidence supporting validity, the use of the word transportability often refers to using information from a primary validity study to generalize validity to the use of the test in a new situation. The Principles address the value of transportability evidence in the documentation of the validity of employment tests:

One approach to generalizing the validity of inferences from scores on a selection procedure involves the use of a specific selection procedure in a new situation based on results of a validation research study conducted elsewhere. This is referred to as demonstrating the ‘‘transportability’’ of validity evidence for the selection procedure. When proposing to ‘‘transport’’ use of a procedure, a careful review of the original validation study is warranted to ensure acceptability of the technical soundness of that study and to determine its relevance to the new situation. Key points for consideration when establishing the appropriateness of transportability are, most prominently, job comparability in terms of content or requirements, as well as, possibly, similarity of job context and candidate group. (p. 26)

We note that the transportability language in the Principles does not limit the type of validity evidence. Unfortunately, in the Uniform Guidelines, transportability is only mentioned with respect to criterion-related validity. With respect to content validity, a reviewer has advised us that the ‘‘transport’’ of content evidence devolves to the job analysis and demonstration of the job relevance of the content, effectively repeating the content evidence from the original study. In brief, the Uniform Guidelines make transportability of validity evidence based on content or construct relevance a difficult proposition and thus are, once again, inconsistent with scientific knowledge and professional guidelines.

The Uniform Guidelines Position With Respect to Differential Validity and Differential Prediction

Belief in the situational specificity hypothesis coupled with the very common observation of mean racial differences in test scores encouraged scientific inquiries regarding the possibility of differential validity and differential prediction (Boehm, 1977; Bray & Moses, 1972; Kirkpatrick, Ewen, Barrett, & Katzell, 1968). It was argued that the validity (i.e., differential validity) or the prediction accuracy (i.e., differential prediction) may vary by ethnic and racial group. However, during the late 1970s and early 1980s, it became evident that differential validity was rare (Schmidt, 1988; Schmidt & Hunter, 1981; Wigdor & Garner, 1982). Differential prediction might result from either differing slopes or differing intercepts. By the late 1970s, it was demonstrated that differential prediction by slope does not occur at higher levels than expected by chance (Bartlett, Bobko, Mosier, & Hannan, 1978). Differential prediction by intercept is less rare, but the error in prediction tends to favor minority groups (Hartigan & Wigdor, 1989; Schmidt, Pearlman, & Hunter, 1980).

Unfortunately, the most definitive scientific knowledge concerning differential validity and prediction developed largely after the publication of the Uniform Guidelines. However, already in 1978, many I–O psychologists believed that differential prediction did not exist (Daniel, 2001; Hunter, Schmidt, & Hunter, 1979). Thus, the differential prediction requirement in the Uniform Guidelines may have been included due to enforcement considerations rather than technical or scientific knowledge (Daniel, 2001). Nevertheless, even with the accumulation of scientific knowledge concluding that "differential validity does not exist" (Gatewood, Feild, & Barrick, 2008, p. 547) and that differential prediction typically does not occur, and when it does, it tends to favor minority groups (Hartigan & Wigdor, 1989; Schmidt, Pearlman, & Hunter, 1980), the Uniform Guidelines have not been revised to be consistent with current knowledge.

We note the recent resurgence of scientific interest in differential prediction (Aguinis, Culpepper, & Pierce, 2010; Borneman, 2010; Meade & Tonidandel, 2010). As with all areas concerning personnel selection and equal employment opportunity, we encourage continued research. For our discussion, we suggest that the most relevant aspect of this research concerns statistical power. Given that research generally argues that differential prediction studies are almost always underpowered, it makes little sense for the Uniform Guidelines to encourage differential prediction studies when the sample sizes available to the vast majority of employers are too small to detect whether differential prediction exists. This is yet one more area where the Uniform Guidelines are inconsistent with current scientific knowledge.
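The power problem can be made concrete with a small simulation. The sketch below is a hypothetical illustration: the effect size, sample sizes, and error variance are our assumptions, not estimates from the studies cited above. It estimates the power of the usual moderated-regression (group-by-predictor interaction) test for a slope difference at a sample size many employers would consider large:

```python
import numpy as np
from math import erfc, sqrt

def slope_diff_power(n_per_group=100, slope_gap=0.10, reps=1000, alpha=0.05, seed=0):
    """Monte Carlo power of the group-by-predictor interaction test
    (differential prediction by slope) in ordinary least squares."""
    rng = np.random.default_rng(seed)
    n = 2 * n_per_group
    g = np.repeat([0.0, 1.0], n_per_group)        # subgroup indicator
    hits = 0
    for _ in range(reps):
        x = rng.standard_normal(n)                # predictor (test) scores
        y = 0.3 * x + slope_gap * x * g + rng.standard_normal(n)
        X = np.column_stack([np.ones(n), x, g, x * g])
        beta, *_ = np.linalg.lstsq(X, y, rcond=None)
        resid = y - X @ beta
        sigma2 = resid @ resid / (n - X.shape[1])
        se = sqrt(sigma2 * np.linalg.inv(X.T @ X)[3, 3])
        z = beta[3] / se                          # t statistic; normal approx. is fine at df = 196
        if erfc(abs(z) / sqrt(2)) < alpha:        # two-sided p-value
            hits += 1
    return hits / reps

print(slope_diff_power())  # rejection rate far below the conventional .80 benchmark
```

With 100 applicants per group and a true slope difference of .10, the rejection rate hovers near .10 under these assumptions, so a nonsignificant result in a local study says almost nothing about whether differential prediction exists.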

The Uniform Guidelines and False Assumptions Concerning Adverse Impact

The Uniform Guidelines incorporate the 4/5ths rule to determine if adverse impact is present. If the minority hiring rate is less than 80% of the majority hiring rate, adverse impact is generally considered present. We note that the 4/5ths rule has no scientific basis, and there are debates concerning its value (Cohen, Aamodt, & Dunleavy, 2010; Roth, Bobko, & Switzer, 2006; Shoben, 1978). Although not mentioned in the Uniform Guidelines, federal enforcement agencies often use a "two standard deviation test," which is a statistical test for differences in proportions. Both the 4/5ths rule and the "two standard deviation test" have been criticized as techniques for assessing adverse impact (Morris & Lobsenz, 2000; Roth et al., 2006). When hiring decisions result in adverse impact, the Uniform Guidelines make it the responsibility of the employer to provide test validation documentation. Developing such documentation can be very expensive and labor intensive because it often requires the services of consulting firms, expert witnesses, and other specialists. Although we do not dispute that validation evidence is desirable for all selection procedures, compliance with the Uniform Guidelines documentation requirements can prove very expensive, particularly for small- and medium-sized employers, who comprise the large majority of U.S. employers.
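To make the two screens concrete, here is a minimal sketch (with hypothetical applicant counts) of the 4/5ths computation and a pooled two-proportion z-test of the kind enforcement agencies use as a "two standard deviation test":

```python
from math import erfc, sqrt

def adverse_impact_screens(minority_hired, minority_n, majority_hired, majority_n):
    """Return the 4/5ths impact ratio and a pooled two-proportion z statistic."""
    p_min = minority_hired / minority_n
    p_maj = majority_hired / majority_n
    impact_ratio = p_min / p_maj                 # 4/5ths rule flags ratios below .80
    p_pool = (minority_hired + majority_hired) / (minority_n + majority_n)
    se = sqrt(p_pool * (1 - p_pool) * (1 / minority_n + 1 / majority_n))
    z = (p_min - p_maj) / se                     # "two standard deviations" ~ |z| > 2
    p_value = erfc(abs(z) / sqrt(2))             # two-sided p-value
    return impact_ratio, z, p_value

# Hypothetical example: 20 of 100 minority and 30 of 100 majority applicants hired.
ratio, z, p = adverse_impact_screens(20, 100, 30, 100)
print(round(ratio, 2), round(z, 2))  # 0.67 -1.63
```

The example also illustrates one criticism of these screens: with these counts the 4/5ths rule flags a disparity (.67 < .80) while the two standard deviation test does not (|z| < 2), and which screen fires depends heavily on sample size.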

We suggest that an implicit assumption of the Uniform Guidelines is that adverse impact is an indication of a flawed test. We offer the alternative hypothesis that the employment test is an accurate assessment of subgroup differences in job-related attributes. Table 1 summarizes the field's cumulative knowledge on the extent of mean score differences by race and gender. It is clear that almost all selection procedures, possibly excepting personality, are likely to show mean racial differences of sufficient magnitude to typically result in adverse impact for any reasonable passing point. Thus, unfortunately, adverse impact is the norm and not the exception. We argue that the common finding of mean racial differences and the potential causes of the mean racial differences in employment tests are "the elephant in the room" of personnel selection (i.e., a large and obvious problem that is seldom discussed). We also argue that, given the pervasiveness of adverse impact, the presence of adverse impact should not result in federal interference in employment practices when such interference is based on regulations inconsistent with scientific knowledge. Note that we are strong advocates that all selection procedures should be job related. What we object to is a requirement that validation evidence must comply with scientifically inappropriate federal regulations.
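The arithmetic behind this claim is straightforward. Assuming normal score distributions with equal SDs in both groups (a simplification we adopt only for illustration), the expected impact ratio at any cut score follows directly from the subgroup d:

```python
from math import erf, sqrt

def norm_sf(x):
    """Standard normal survival function, 1 - Phi(x)."""
    return 0.5 * (1.0 - erf(x / sqrt(2.0)))

def expected_impact_ratio(d, cut):
    """Expected minority/majority passing-rate ratio when the majority mean
    is 0, the minority mean is -d, and 'cut' is in majority-SD units."""
    return norm_sf(cut + d) / norm_sf(cut)

# With d = 1.0 (roughly the White-Black difference on cognitive ability in
# Table 1) and a cut score at the majority mean:
print(round(expected_impact_ratio(1.0, 0.0), 2))  # 0.32
```

Even a modest d of .5 yields a ratio of about .62 at the same cut, which is why adverse impact under the 4/5ths rule is the expected outcome for most valid predictors.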

We offer that the primary cause of mean racial differences in employment test scores is mean racial differences in job-related attributes, not flawed employment tests. We suggest that employment tests are measuring mean racial differences in job-related attributes accurately. We offer the following


Table 1. Meta-Analytic Standardized Racioethnic and Gender Subgroup Differences and Validities

Predictor^a                                  d-value(s)          Criterion-related validity
General cognitive ability                                        .51^b
  White–Black                                .99^b
  White–Hispanic                             .58 to .83^b
  White–Asian                                −.20^b
  Male–female                                .00^b
Conscientiousness                                                .18^b
  White–Black                                .06^b and .07^c
  White–Hispanic                             .04^b and .08^c
  White–Asian                                .08^b and .11^c
  Male–female                                −.08^b
Conscientiousness, global measures
  White–Black                                .17^c
  White–Hispanic                             .20^c
  White–Asian                                .04^c
Conscientiousness, achievement
  White–Black                                −.03^c
  White–Hispanic                             .10^c
  White–Asian                                .14^c
Conscientiousness, dependability
  White–Black                                −.05^c
  White–Hispanic                             .00^c
  White–Asian                                −.01^c
Conscientiousness, cautiousness
  White–Black                                .16^c
Conscientiousness, order
  White–Black                                .01^c
  White–Hispanic                             .00^c
  White–Asian                                .50^c
Extraversion                                                     .11^b
  White–Black                                .10^b and −.16^c
  White–Hispanic                             −.01^b and −.02^c
  White–Asian                                .15^b and −.14^c
  Male–female                                .09^b
Extraversion, global measures
  White–Black                                −.21^c
  White–Hispanic                             .12^c
  White–Asian                                −.07^c
Extraversion, dominance
  White–Black                                −.03^c
  White–Hispanic                             −.04^c
  White–Asian                                −.19^c
Extraversion, sociability
  White–Black                                −.39^c
  White–Hispanic                             −.16^c
  White–Asian                                −.09^c
Emotional Stability                                              .13^b
  White–Black                                −.04^b and −.09^c
  White–Hispanic                             −.01^b and .03^c
  White–Asian                                .08^b and −.12^c
  Male–female                                .24^b
Emotional Stability, global measures
  White–Black                                −.12^c
  White–Hispanic                             −.04^c
  White–Asian                                −.16^c
Emotional Stability, self-esteem
  White–Black                                .17^c
  White–Hispanic                             .25^c
  White–Asian                                .30^c
Emotional Stability, low anxiety
  White–Black                                −.23^c
  White–Hispanic                             .25^c
  White–Asian                                .27^c
Emotional Stability, even tempered
  White–Black                                .06^c
  White–Hispanic                             .09^c
  White–Asian                                −.38^c
Agreeableness                                                    .08^b
  White–Black                                .02^b and −.03^c
  White–Hispanic                             .06^b and −.05^c
  White–Asian                                .01^b and .63^c
  Male–female                                −.39^b
Openness to Experience                                           .07^b
  White–Black                                .21^b and −.10^c
  White–Hispanic                             .10^b and −.02^c
  White–Asian                                .18^b and .11^c
  Male–female                                .07^b
Job knowledge                                                    .48^b
  White–Black                                .48^b
  White–Hispanic                             .47^b
Spatial ability                                                  .51^b
  White–Black                                .66^b
Psychomotor ability                                              .35^b
  White–Black                                −.72^d
  White–Hispanic                             −.11^d
  Male–female                                −1.06^d
Psychomotor ability, muscular strength                           .23^b
  Male–female                                1.66^b
Psychomotor ability, muscular power                              .26^b
  Male–female                                2.10^b
Psychomotor ability, muscular endurance                          .23^b
  Male–female                                1.02^b
Biodata                                                          .35^b
  White–Black                                .33^b
Structured interview                                             .51^b
  White–Black                                .23^b
Situational judgment test (SJT)
Video SJT                                                        .22 to .33^d
  White–Black                                .31^b
  White–Hispanic                             .41^b
  White–Asian                                .49^b
  Male–female                                −.06^b
Written SJT                                                      .34^b
  White–Black                                .40^b
  White–Hispanic                             .37^b
  White–Asian                                .47^b
  Male–female                                −.12^b
Accomplishment record                                            .17 to .25^d
  White–Minority                             .24^d
  Male–female                                .09^d
Work sample                                                      .33^b
  White–Black                                .52^b
  White–Hispanic                             .45^b
Assessment center                                                .37^b
  White–Black                                .60 or less^d

Note. Drawn from Ployhart and Holtz (2008) and Foldes, Duehr, and Ones (2008). ^a Predictors encompass predictor constructs that assess one construct (e.g., cognitive ability, Conscientiousness, and Extraversion) and predictor measurement methods that assess multiple constructs. For predictor measurement methods, the magnitude of group differences will be a function of the constructs assessed. For racial comparisons, a positive d indicates Whites score higher than the other group on average. For comparisons by gender, a positive d indicates men score higher than women on average. ^b Estimate from Ployhart and Holtz (2008); corrected unless otherwise indicated. ^c Estimate from Foldes, Duehr, and Ones (2008). ^d Estimate from Ployhart and Holtz (2008). Estimate is from primary studies; not meta-analytically derived.

lines of evidence in support of our position. First, mean differences are often substantial and present prior to the age at which people begin competing for jobs. For example, mean racial differences are found early in life (e.g., age 3; see Jencks & Phillips, 1998; Phillips, Brooks-Gunn, Duncan, Klebanov, & Crane, 1998). Clearly, mean racial differences at age 3 cannot be attributed to flawed employment tests.

In further support of our position, we describe two sources of data relevant to those currently in the workforce: high school graduation rates and prose literacy in U.S. adults. High school graduation rates by ethnicity and race are available from the National Center for Education Statistics (Stillwell, 2010). In these data, high school graduation is defined as receiving a high school diploma at the completion of 4 years of high school for the cohort graduating in the spring of 2008. Ninety-one percent of Asians, including Pacific Islanders, receive a high school diploma. Ten percentage points fewer Whites (81%) receive one. For American Indians, including Alaskan natives, the diploma recipient rate is 64%, which is tied with the Hispanic rate. The percentage of Blacks receiving a high school diploma is 62%. We assert that high school diploma status covaries with many job-related attributes, including general cognitive ability and Conscientiousness. Both of these attributes show validity generalization for virtually all jobs (Barrick & Mount, 1991; Barrick, Mount, & Judge, 2001; Hunter, Schmidt, & Le, 2006; Hurtz & Donovan, 2000; Schmidt & Hunter, 1998).

In 2011, individuals in this cohort are approximately 22 years of age, and most of them are likely employed or competing for employment. These individuals are also likely to be employed or to apply for employment for the next 43 years, at which time they will reach the age of 65. We suggest that the job-related attributes associated with high school diploma status will likely yield adverse impact for this age cohort for the next 43 years. Former Supreme Court Justice O'Connor, in her majority opinion in the Grutter v. Bollinger (2003) case concerning racial preferences in law school admission, wrote: "We expect that 25 years from now, the use of racial preferences will no longer be necessary to further the interest approved today." We respectfully suggest that her opinion was not based on a realistic appraisal of available data. We offer an opinion based on science: Mean racial differences in educationally relevant and job-related attributes will, unfortunately, not go away any time soon.

Our second data set concerns prose literacy for a representative sample of U.S. adults for the year 2003 (National Center for Education Statistics, 2010). This data source defines an intermediate level of literacy as "able to read and understand moderately dense, less commonplace prose text, as well as summarize, make simple inferences, determine cause and effect, and recognize author's purpose" (National Center for Education Statistics, 2010, footnote 1). We offer that most knowledge-worker occupations require incumbents to read and understand moderately dense prose, to make simple inferences, and to determine cause and effect. We suggest that one typically needs these skills to graduate from high school. The 2003 data from the National Center for Education Statistics indicate that 51% of Whites fall in this intermediate level of skills, compared to 42% for Asians, 31% for Blacks, and 23% for Hispanics. We suggest that until a time when mean racial differences in prose literacy are eliminated, regrettably, most valid employment tests are likely to have adverse impact.

We encourage educational and other interventions that would eliminate or reduce these mean racial differences in job-related attributes. However, we are not hopeful that these differences will be eliminated any time soon. Part of our pessimism is based on the intervention research summarized by Ceci and Papierno (2005). Even if there were an intervention that would dramatically improve job-related attributes, we should not assume that such an intervention would close the achievement gap between the less able and the more able. Rather, the intervention might increase the gap, partly because the more able have a greater capacity to benefit from the intervention and partly because the more able will be more likely to participate in the intervention (Ceci & Papierno, 2005; Walberg & Tsai, 1983). Thus, even with dramatically impressive interventions, mean racial differences may persist (Ceci & Papierno, 2005). Given the prevalence of mean racial differences, employers are typically in need of a validation defense consistent with federal regulations. Thus, it is imperative that federal regulations permit all scientifically based approaches to validity evidence. Currently, they do not.

The Uniform Guidelines and the Diversity–Validity Dilemma

The Uniform Guidelines are silent about the diversity–validity dilemma (Ployhart & Holtz, 2008; Pyburn, Ployhart, & Kravitz, 2008) that organizations face and how organizations should deal with this dilemma. When faced with the adverse impact of an employment test, the Uniform Guidelines encourage employers to search for alternative tests with the same or higher validity but less adverse impact. Such searches are almost always futile. Current employment tests seldom maximize diversity and validity goals because the validity of employment tests tends to covary with mean racial differences such that the most valid tests have the largest mean racial differences (Pyburn et al., 2008).

Organizations can use two strategies to deal with this diversity–validity dilemma (Pyburn et al., 2008). First, they can sacrifice validity and use less valid selection tests that do not result in adverse impact to achieve social, ethical, or business aims.5 Second, organizations can sacrifice diversity by ignoring the potential adverse impact of valid selection procedures to achieve different social, ethical, or business aims. Obviously, neither strategy is optimal because the first can sacrifice work quality and utility (Hunter & Hunter, 1984; Schmidt & Hunter, 1998), and the second can result in racial imbalance and discrimination lawsuits. Thus, both strategies ultimately impinge on important social, ethical, and economic objectives (Pyburn et al., 2008). Although the scientific community has debated this issue and provided recommendations on how to deal with the dilemma (e.g., Kravitz, 2008; Ployhart & Holtz, 2008; Pyburn et al., 2008), the legality of some of the proposed solutions is not clear. Unfortunately, the Uniform Guidelines do not address this vital issue. Thus, they implicitly deny any dilemma or tradeoff.

The Broader Political and Social Context and the Uniform Guidelines

In the previous sections, we reviewed the inconsistencies between scientific knowledge and the Uniform Guidelines. Next, we speculate about the forces influencing the inertia of the Uniform Guidelines and present ideas about how they could be revised to reflect current scientific knowledge and professional practice.

Resistance to Changing the Uniform Guidelines

Despite the overwhelming evidence that the Uniform Guidelines are not in compliance with important legal, technical, and scientific developments (Daniel, 2001; McDaniel, 2007), they have remained unchanged for over 3 decades. Table 2 summarizes inconsistencies between the Uniform Guidelines and science-based professional practice.

5. We acknowledge that a combination of a cognitive ability test and a noncognitive measure may improve the validity to some degree while reducing adverse impact to some extent. Our reading of the literature causes us to conclude that the improvements in validity and the reductions in adverse impact, when occurring, are typically relatively modest. Thus, the use of such composites provides, at best, only a limited reduction of the problems associated with the validity–diversity dilemma.
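Footnote 5's point about cognitive and noncognitive composites can be checked with standard composite-variable algebra. The sketch below uses illustrative inputs (a cognitive predictor with d = 1.0, a noncognitive predictor with d = .10, and an assumed zero intercorrelation), not estimates from this article:

```python
from math import sqrt

def composite_d(d1, d2, r12, w1=1.0, w2=1.0):
    """Standardized subgroup difference of a weighted composite of two
    standardized predictors with subgroup differences d1 and d2 and
    intercorrelation r12 (assumed equal across groups)."""
    return (w1 * d1 + w2 * d2) / sqrt(w1 ** 2 + w2 ** 2 + 2.0 * w1 * w2 * r12)

# Unit-weighting a d = 1.0 cognitive test with a d = .10 noncognitive measure:
print(round(composite_d(1.0, 0.10, 0.0), 2))  # 0.78
```

The composite's difference shrinks from 1.0 to about .78, a real but modest reduction, consistent with the footnote's conclusion that such composites only partially ease the dilemma.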

To address some of these issues, several attempts have been made to revise the Uniform Guidelines. For instance, the General Accounting Office proposed a review of the Guidelines in 1982 (Daniel, 2001). However, all efforts, including an oversight hearing on the Civil Rights Division of the U.S. Department of Justice and several hearings before the Committee on Education and Labor, Subcommittee on Employment Opportunities, regarding the Uniform Guidelines in 1985, yielded no tenable outcome (Daniel, 2001). Later efforts, in 1998, were equally fruitless (Daniel, 2001). A partisan political climate may have prevented a science-based revision of the Uniform Guidelines. We suggest that the best hope for the revision of the Uniform Guidelines lies with the Obama administration. Given President Obama's mixed-racial heritage, an Obama-endorsed congressional effort to force a revision of the Uniform Guidelines is less likely to be labeled as racially motivated.

The Role of Science in Federal Regulations

The failure to keep the Uniform Guidelines consistent with science and professional practice is unfortunate. Other federal laws and regulations are updated regularly to address new scientific evidence. For instance, consumer protection would have suffered if Congress had not passed the Food and Drug Administration Amendments Act of 2007. Similarly, businesses, potential applicants, current employees, and the I–O psychology profession are not well served by federal employment guidelines that are inconsistent with legal, technical, and scientific developments.

We believe that the appropriate role of science in federal employment regulations can be explored by examining nonemployment regulatory areas. Across


Table 2. Summary of Scientific and Practical Problems and Inconsistencies in the Uniform Guidelines

Problem/inconsistency | Uniform Guidelines | Scientific knowledge and professional practice

General
Issue date | 1978 | 1999 (Standards) and 2003 (Principles)

Scientific/practical
Situational specificity hypothesis | Endorsement of the situational specificity hypothesis | Rejection of the situational specificity hypothesis
Local validation studies | Requirement of local validation studies | No requirement of local validation studies
Content validity evidence | Rejection of content validity evidence-based defense strategies for some constructs | No construct limitations on what can be validated
Construct validity assessment | Practical rejection of construct validity evidence-based defense strategies | Practical endorsement of construct validity evidence-based defense strategies
View of validity | Outdated perspective on the concept of validity (i.e., there are three distinct types of validity) | Endorsement of validity as a unitary concept in which different sources of information can inform inferences about a selection approach
Validity generalization | Outdated perspective on validity generalization as evidence for the validity of employment tests | Endorsement of validity generalization as evidence of the validity of employment tests
Transportability of evidence | Transportability may only apply to criterion-related validity | Transportability applies to the concept of validity as a whole
Differential validity and differential prediction | Requirement of the assessment of differential validity and prediction evidence | Differential validity is unlikely to exist; no assessment is necessary
Assumptions concerning adverse impact | A flawed employment test leads to adverse impact | Multiple causes could lead to adverse impact
The diversity–validity dilemma | No clear guidance | Guidance is provided

scientific areas, from educational interventions to environmental protection and medical research, powerful economic and social interests are often at play (Steinbrook, 2004). Political entities can be driven to influence science for both economic and social reasons. However, scientific evidence is not an a la carte menu from which policymakers should be able to selectively pick popular research and avoid results that are unpopular (Schenkel, 2010). It is critical that a clear distinction be made between honest, scientifically based challenges and politically motivated attacks on scientific evidence (Rosenstock & Lee, 2002). To assist in this distinction, one must first recognize the influence tactics often used, including economic tactics, the manufacturing of uncertainty, and delay tactics (for a good overview of the influence and impact of such tactics, see Rosenstock & Lee, 2002). As a result of such tactics, federal regulations can be delayed and misguided, which can result in uncertainty, financial and economic loss (Michaels & Monforton, 2005; Rosenstock & Lee, 2002; Slavin, 2002), as well as human loss, as was the case when regulation requiring a simple warning label on aspirin bottles, indicating that aspirin could increase children's risk of Reye's syndrome, was successfully delayed by the aspirin industry (Michaels & Monforton, 2005).

We suggest that all three tactics (i.e., economic tactics, manufacturing uncertainty, and delay) will be used both for and against efforts to make the Uniform Guidelines consistent with scientific evidence and professional practice. First, employers can document the costs associated with complying with the Uniform Guidelines. These include labor and other monetary costs associated with defending employee selection systems. There are also economic costs associated with using lower validity selection measures in hopes of reducing adverse impact (Hunter & Hunter, 1984; Schmidt & Hunter, 1998). Second, employees of federal regulatory agencies, human resources consultants, and labor lawyers seeking to preserve their jobs can manufacture uncertainty about scientific findings. If the price is right, one can find a "scientist" to testify to almost anything. Third, regulatory agencies and other interested parties (e.g., consultants, lawyers, and expert witnesses) can engage in delay tactics (e.g., litigation, requiring parallel studies, and fighting over access to raw data) to avoid revising the Uniform Guidelines. Some might argue that delay tactics have contributed to the fact that no revisions have been made to the Uniform Guidelines in over 3 decades.

Changing Federal Regulations Concerning Employment Testing

The rescindment or revision of the Uniform Guidelines faces a variety of obstacles. First, employers may not like the Uniform Guidelines and the expense of complying with them, but they tend to like stability. Changes in the federal regulation of employment practices create uncertainty, which may not be welcomed by many employers. Second, courts have given deference to the Uniform Guidelines in hundreds of cases, and courts generally abide by precedent. Thus, courts may be unlikely to alter their practices to be consistent with scientific knowledge without changes to existing federal law, such as the Civil Rights Act of 1991. In addition, even if the Uniform Guidelines were revised to be consistent with scientific knowledge, there would still be a need to influence and alter a formidable body of case law. Third, there are political obstacles to acknowledging that adverse impact could reflect mean racial differences in job-related attributes and that the mean racial gap in such attributes is not going away any time soon. It is easier for Congress, the courts, and regulatory agencies to encourage the belief that employment tests with adverse impact are likely flawed than to admit that there are mean racial differences in job-related attributes. However, based on trends in the debates over educational testing, we have some hope that these organizations can accept conclusions based on clear data. In K-12 educational testing, there was once substantial debate concerning "biased tests." With the passage of the No Child Left Behind Act in 2001, there appears to be an implicit acceptance of the conclusion that K-12 educational tests are good indicators of student achievement and learning.

Although we claim no substantial expertise in how to resolve the unfortunate situation with the Uniform Guidelines, we offer some thoughts. We suggest that any reform of employment regulations be guided by scientific knowledge and professional practice. Thus, for example, all federal employment regulations should be fully consistent with the Standards and Principles. In addition, mechanisms should be established such that regulators rely on scientific knowledge as the basis for periodic revisions of regulations. Employment regulations would certainly benefit from scientific input. We call on regulatory agencies to issue an Advance Notice of Proposed Rulemaking (ANPR). An ANPR issued for the Uniform Guidelines would be an invitation for public discussion on whether and how the Uniform Guidelines need to be changed. Although we appreciate the role of attorneys in federal regulation, we assert that federal employment regulation will not improve until scientists unaffiliated with the federal government engage in a cooperative partnership with the regulatory process to alter the Uniform Guidelines so as to be consistent with science. We recommend that scientific organizations such as SIOP partner with other professional organizations (e.g., the Society for Human Resource Management, the Equal Employment Advisory Council, and the Employment and Labor Law section of the American Bar Association) in promoting revisions to the regulations and in educating the Congress and the courts. What good is science if no one pays attention to it?

We encourage commentaries on this article to offer guidance concerning how the problems with the Uniform Guidelines can be remedied. That is, what are the reasonable next steps to cause federal regulation to be consistent with science? We also encourage commentaries on how Congress and the courts can be influenced to rely on scientific knowledge, even when the knowledge is politically and socially uncomfortable. Finally, given the emotive nature of this topic, we encourage collegial debate. With emotive topics, it is easy to offer opinions that yield more heat than light; it takes more work to consider the merits of both sides of an argument and to engage in a constructive, professional, and collegial debate.

Science-Based Federal Regulations: A Role for SIOP

Unlike the agency authors of the Uniform Guidelines, many governmental agencies rely on science to form policy. For instance, the U.S. Food and Drug Administration's (FDA) mission depends on "science-led regulatory decisions" (FDA, 2011a). To ensure this, the FDA has 49 committees and panels to obtain expert advice on scientific, technical, and policy matters, including the Science Board to the FDA, whose role is to provide advice to FDA officials on scientific and technical issues. Currently, all board members have doctorate degrees, and most are affiliated with major research universities (FDA, 2011b). The other committees and panels are associated with specific divisions within the FDA (Food, Drugs, Medical Devices, etc.). Although scientific expertise is the top criterion in the selection process, other criteria, such as potential conflicts of interest, are also evaluated (FDA, 2006). We acknowledge that federal regulation of employment testing likely does not need as many scientific advisory committees as the FDA, but scientific input into federal employment regulations is clearly warranted.

In addition to scientific panels guiding federal regulation, consumer advocacy organizations, such as the Consumer Federation of America or the Center for Science in the Public Interest, both of which focus on nutrition, health, and food safety, lobby for changes in laws and regulations. As an example of the successful intersection among lawmakers, advocacy organizations, and science, provisions in the Patient Protection and Affordable Care Act of 2010 require restaurants to display calorie information. It is likely that influence from consumer advocacy groups and scientific evidence (e.g., Burton, Creyer, Kees, & Huggins, 2006) affected this law.

As another example of the intersection between science and federal regulations, several FDA guidelines specifically mention meta-analytic reviews as a means to assess the efficacy of drugs. For instance, the FDA guidelines for the evaluation of cardiovascular risk in new antidiabetic therapies to treat type-2 diabetes (FDA, 2008) specifically state that meta-analyses of important cardiovascular events across clinical trials should be conducted. If federal employment regulation recognized meta-analysis as a form of validity documentation, the bad situation imposed on U.S. employers by federal employment regulators would be substantially improved.
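For readers unfamiliar with the technique, a bare-bones psychometric meta-analysis of validity coefficients, in the Hunter and Schmidt tradition, can be sketched in a few lines. The study values below are hypothetical, chosen only for illustration:

```python
def bare_bones_meta(rs, ns):
    """Sample-size-weighted mean validity, observed variance, and the
    variance expected from sampling error alone (bare-bones meta-analysis)."""
    total_n = sum(ns)
    r_bar = sum(n * r for r, n in zip(rs, ns)) / total_n
    var_obs = sum(n * (r - r_bar) ** 2 for r, n in zip(rs, ns)) / total_n
    n_bar = total_n / len(ns)
    var_err = (1.0 - r_bar ** 2) ** 2 / (n_bar - 1.0)  # expected from sampling error
    var_res = max(var_obs - var_err, 0.0)              # residual variance across studies
    return r_bar, var_obs, var_err, var_res

# Five hypothetical local validity studies of the same test:
r_bar, var_obs, var_err, var_res = bare_bones_meta(
    rs=[0.20, 0.35, 0.28, 0.40, 0.15], ns=[60, 120, 80, 150, 50])
print(round(r_bar, 2))  # 0.31
```

When most of the observed variance across studies is attributable to sampling error, as in this example, the small individual studies disagree only superficially, which is the logic behind accepting meta-analytic validity generalization evidence in place of a local study.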

We argue that the EEOC and related regulatory agencies could learn from the structure and processes used by the FDA.


In particular, a scientific advisory committee structure could guide the EEOC in the protection and advancement of equal employment opportunity laws and regulations. Currently, employment-related enforcement agencies appear to lack such an advisory committee structure. Certainly, such committees with independent experts would help to ensure that the regulatory process is transparent, which should increase the acceptance of science-led regulatory decisions by U.S. courts, Congress, businesses, employees, and the scientific communities.

SIOP’s mission is to ‘‘enhance human well-being and performance in organizational and work settings by promoting the science, practice, and teaching of industrial and organizational psychology’’ (SIOP, n.d., p. A-1). Toward this end, SIOP has several objectives, including support of ‘‘SIOP members in their efforts to study, apply, and teach the principles, findings, and methods of industrial and organizational psychology’’; the identification of ‘‘opportunities for expanding and developing the science and practice of industrial and organizational psychology’’; the monitoring and addressing of ‘‘challenges to the understanding and practice of industrial and organizational psychology in organizational and work settings’’; the promotion of ‘‘public awareness of the field of industrial and organizational psychology’’; and the fostering of ‘‘cooperative relations with allied groups and professions’’ (SIOP, n.d., p. A-1).

Many of these objectives require the education of regulatory agencies, businesses, and the general public regarding the science and practice of I–O psychology. These objectives thus seem to call for an active role in the regulatory processes that affect scientists, practitioners, and businesses. To this end, SIOP has several committees, including the committee on Professional Practice, whose role is to ‘‘promote the interests of [SIOP] and its members by concerning itself with matters of professional practice and by developing relationships with other professional groups, business and government leaders, and the public in general to advance the professional practice of industrial and organizational psychology’’ (SIOP, n.d., p. A-6). Other committees, such as Scientific Affairs and State Affairs, may also interact with external organizations, including federal and other regulatory agencies, to fulfill their roles.

Thus, SIOP’s mission calls for, and its committee structure permits, the education of organizations, including the employment regulatory agencies, the U.S. Congress, and U.S. courts. It is thus somewhat surprising that SIOP has not managed to build support from business and other organizations (e.g., the Society for Human Resource Management, the Equal Employment Advisory Council, and the Employment and Labor Law section of the American Bar Association) to voice the concerns of the scientific and business communities regarding the Uniform Guidelines. SIOP’s inaction is counter to its mission. To fulfill its mission and maintain its scientific credibility, we recommend that SIOP become more proactive and involved in regulatory decision-making processes, new U.S. employment laws, and U.S. court decisions.

References

Aguinis, H., Culpepper, S. A., & Pierce, C. A. (2010). Revival of test bias research in preemployment testing. Journal of Applied Psychology, 95, 648–680. doi:10.1037/a0018714.

American Educational Research Association, American Psychological Association, & National Council on Measurement in Education. (1999). Standards for educational and psychological testing (2nd ed.). Washington, DC: American Educational Research Association.

Banks, G. C., & McDaniel, M. A. (in press). Meta-analyses and selection procedures. In N. Schmitt (Ed.), The Oxford handbook of personnel assessment and selection. Oxford, UK: Oxford University Press.

Barrick, M. R., & Mount, M. K. (1991). The Big Five personality dimensions and job performance: A meta-analysis. Personnel Psychology, 44, 1–23. doi:10.1111/j.1744-6570.1991.tb00688.x.

Barrick, M. R., Mount, M. K., & Judge, T. A. (2001). Personality and performance at the beginning of the new millennium: What do we know and where do we go next? International Journal of Selection and Assessment, 9, 9–30. doi:10.1111/1468-2389.00160.


Bartlett, C. J., Bobko, P., Mosier, S. B., & Hannan, R. (1978). Testing for fairness with a moderated multiple regression strategy: An alternative to differential analysis. Personnel Psychology, 31, 233–241. doi:10.1111/j.1744-6570.1978.tb00442.x.

Biddle, D. A. (2010). Should employers rely on local validation studies or validity generalization (VG) to support the use of employment tests in Title VII situations? Public Personnel Management, 39, 307–326.

Boehm, V. R. (1977). Differential prediction: A methodological artifact? Journal of Applied Psychology, 62, 146–154. doi:10.1037/0021-9010.62.2.146.

Borneman, M. J. (2010). Using meta-analysis to increase power in differential prediction analyses. Industrial and Organizational Psychology: Perspectives on Science and Practice, 3, 224–227. doi:10.1111/j.1754-9434.2010.01228.x.

Bray, D. W., & Moses, J. L. (1972). Personnel selection. Annual Review of Psychology, 23, 545–576. doi:10.1146/annurev.ps.23.020172.002553.

Burton, S., Creyer, E. H., Kees, J., & Huggins, K. (2006). Attacking the obesity epidemic: The potential health benefits of providing nutrition information in restaurants. American Journal of Public Health, 96, 1669–1675. doi:10.2105/AJPH.2004.054973.

Cascio, W. E., & Aguinis, H. (2001). The federal uniform guidelines on employee selection procedures (1978): An update on selected issues. Review of Public Personnel Administration, 21, 200. doi:10.1177/0734371X0102100303.

Ceci, S. J., & Papierno, P. B. (2005). The rhetoric and reality of gap closing: When the ‘‘have-nots’’ gain but the ‘‘haves’’ gain even more. American Psychologist, 60, 149–160. doi:10.1037/0003-066x.60.2.149.

Cohen, M. S., Aamodt, M. G., & Dunleavy, E. M. (2010). Technical advisory committee report on best practices in adverse impact analyses. Washington, DC: Center for Corporate Equality.

Copus, D. A. (2006). Validation of cognitive ability tests. Letter to Charles James, Office of Federal Contract Compliance Programs (March 27, 2006). Morristown, NJ: Ogletree Deakins.

Daniel, C. (2001). Separating law and professional practice from politics. Review of Public Personnel Administration, 21, 175. doi:10.1177/0734371X0102100301.

Equal Employment Opportunity Commission (EEOC). (1966). Guidelines on employment testing procedures. Federal Register, 31, 6414.

Equal Employment Opportunity Commission (EEOC). (1970). Guidelines on employee selection procedures. Federal Register, 35, 12333–12336.

Equal Employment Opportunity Commission (EEOC), Civil Service Commission, Department of Labor, & Department of Justice. (1978). Uniform guidelines on employee selection procedures. Federal Register, 43, 38290–38315.

Ewoh, A. I. E., & Guseh, J. S. (2001). The status of the uniform guidelines on employee selection procedures. Review of Public Personnel Administration, 21, 185. doi:10.1177/0734371X0102100302.

Foldes, H. J., Duehr, E. E., & Ones, D. S. (2008). Group differences in personality: Meta-analyses comparing five U.S. racial groups. Personnel Psychology, 61, 579–616. doi:10.1111/j.1744-6570.2008.00123.x.

Food and Drug Administration (FDA). (2006). FDA announces plan to strengthen advisory committee processes. Retrieved from http://www.fda.gov/NewsEvents/Newsroom/PressAnnouncements/2006/ucm108697.htm

Food and Drug Administration (FDA). (2008). Guidance for industry: Diabetes mellitus—Evaluating cardiovascular risk in new antidiabetic therapies to treat type 2 diabetes. U.S. Food and Drug Administration.

Food and Drug Administration (FDA). (2011a). About science & research at FDA. Retrieved from http://www.fda.gov/ScienceResearch/AboutScienceResearchatFDA/default.htm

Food and Drug Administration (FDA). (2011b). Science board to the Food and Drug Administration. Retrieved from http://www.fda.gov/AdvisoryCommittees/CommitteesMeetingMaterials/ScienceBoardtotheFoodandDrugAdministration/default.htm

Gatewood, R. D., Feild, H. S., & Barrick, M. R. (2008). Human resource selection (6th ed.). Mason, OH: South-Western.

Grutter v. Bollinger. (2003). 539 U.S. 306.

Guion, R. M. (1965). Personnel testing. New York, NY: McGraw-Hill.

Guion, R. M. (1975). Recruitment, selection and job placement. In M. D. Dunnette (Ed.), Handbook of industrial and organizational psychology. Chicago, IL: Rand McNally.

Hartigan, J. A., & Wigdor, A. K. (Eds.). (1989). Fairness in employment testing: Validity generalization, minority issues, and the General Aptitude Test Battery. Washington, DC: National Academy Press.

Herrnstein, R. J., & Murray, C. (1994). The bell curve: Intelligence and class structure in American life. New York, NY: Free Press.

Hunter, J. E., & Hunter, R. F. (1984). Validity and utility of alternative predictors of job performance. Psychological Bulletin, 96, 72–98. doi:10.1037/0033-2909.96.1.72.

Hunter, J. E., & Schmidt, F. L. (2004). Methods of meta-analysis: Correcting error and bias in research findings (2nd ed.). Newbury Park, CA: Sage.

Hunter, J. E., Schmidt, F. L., & Hunter, R. (1979). Differential validity of employment tests by race: A comprehensive review and analysis. Psychological Bulletin, 86, 721–735. doi:10.1037/0033-2909.86.4.721.

Hunter, J. E., Schmidt, F. L., & Le, H. (2006). Implications of direct and indirect range restriction for meta-analysis methods and findings. Journal of Applied Psychology, 91, 594–612. doi:10.1037/0021-9010.91.3.594.

Hurtz, G. M., & Donovan, J. J. (2000). Personality and job performance: The Big Five revisited. Journal of Applied Psychology, 85, 869–879. doi:10.1037/0021-9010.85.6.869.

Jeanneret, P. R. (2005). Professional and technical authorities and guidelines. In F. J. Landy (Ed.), Employment discrimination litigation: Behavioral, quantitative, and legal perspectives (pp. 47–100). San Francisco, CA: Wiley.


Jencks, C., & Phillips, M. (Eds.). (1998). The Black–White test score gap. Washington, DC: Brookings Institution Press.

Jensen, A. R. (1969). How much can we boost IQ and scholastic achievement? Harvard Educational Review, 39, 1–123.

Kirkpatrick, J. J., Ewen, R. B., Barrett, R. S., & Katzell, R. A. (1968). Testing and fair employment. New York, NY: New York University Press.

Kleiman, L. S., & Faley, R. H. (1985). The implications of professional and legal guidelines for court decisions involving criterion-related validity: A review and analysis. Personnel Psychology, 38, 803–833. doi:10.1111/j.1744-6570.1985.tb00568.x.

Kravitz, D. A. (2008). The diversity-validity dilemma: Beyond selection—the role of affirmative action. Personnel Psychology, 61, 173–193. doi:10.1111/j.1744-6570.2008.00110.x.

McDaniel, M. A. (2007). Validity generalization as a test validation approach. In S. M. McPhail (Ed.), Alternative validation strategies: Developing new and leveraging existing validity evidence (pp. 159–180). Hoboken, NJ: Wiley.

McDaniel, M. A. (2009). Gerrymandering in personnel selection: A review of practice. Human Resource Management Review, 19, 263–270. doi:10.1016/j.hrmr.2009.03.004.

McDaniel, M. A. (2010, July). Abolish the uniform guidelines. Paper presented at the annual meeting of the International Personnel Assessment Council, Newport Beach, CA.

McKay, P. F., & McDaniel, M. A. (2006). A reexamination of Black–White mean differences in work performance: More data, more moderators. Journal of Applied Psychology, 91, 538–554. doi:10.1037/0021-9010.91.3.538.

Meade, A. W., & Tonidandel, S. (2010). Not seeing clearly with Cleary: What test bias analyses do and do not tell us. Industrial and Organizational Psychology: Perspectives on Science and Practice, 3, 192–205. doi:10.1111/j.1754-9434.2010.01223.x.

Michaels, D., & Monforton, C. (2005). Manufacturing uncertainty: Contested science and the protection of the public’s health and environment. American Journal of Public Health, 95, S39–S45. doi:10.2105/AJPH.2004.043059.

Morris, S. B., & Lobsenz, R. E. (2000). Significance tests and confidence intervals for the adverse impact ratio. Personnel Psychology, 53, 89–111. doi:10.1111/j.1744-6570.2000.tb00195.x.

National Center for Education Statistics. (2010). Digest of education statistics: Table 386. Literacy skills of adults, by type of literacy, proficiency levels, and selected characteristics: 1992 and 2003. Retrieved from http://nces.ed.gov/programs/digest/d09/tables/dt09_386.asp

O’Boyle, E. H., & McDaniel, M. A. (2008). Criticisms of employment testing: A commentary. In R. P. Phelps (Ed.), Correcting fallacies about educational and psychological testing (pp. 181–197). Washington, DC: American Psychological Association.

Pearlman, K., Schmidt, F. L., & Hunter, J. E. (1980). Validity generalization results for tests used to predict job proficiency and training success in clerical occupations. Journal of Applied Psychology, 65, 373–406. doi:10.1037/0021-9010.65.4.373.

Phillips, M., Brooks-Gunn, J., Duncan, G. J., Klebanov, P., & Crane, J. (1998). Family background, parenting practices, and the Black–White test score gap. In C. Jencks & M. Phillips (Eds.), The Black–White test score gap. Washington, DC: Brookings Institution Press.

Ployhart, R. E., & Holtz, B. C. (2008). The diversity-validity dilemma: Strategies for reducing racioethnic and sex subgroup differences and adverse impact in selection. Personnel Psychology, 61, 153–172. doi:10.1111/j.1744-6570.2008.00109.x.

Pyburn, K. M., Jr., Ployhart, R. E., & Kravitz, D. A. (2008). The diversity-validity dilemma: Overview and legal context. Personnel Psychology, 61, 143–151. doi:10.1111/j.1744-6570.2008.00108.x.

Rosenstock, L., & Lee, L. (2002). Attacks on science: The risks to evidence-based policy. American Journal of Public Health, 92, 14–18. doi:10.2105/AJPH.92.1.14.

Roth, P. L., Bevier, C. A., Bobko, P., Switzer, F. S., & Tyler, P. (2001). Ethnic group differences in cognitive ability in employment and educational settings: A meta-analysis. Personnel Psychology, 54, 297–330. doi:10.1111/j.1744-6570.2001.tb00094.x.

Roth, P. L., Bobko, P., & Switzer, F. S., III. (2006). Modeling the behavior of the 4/5ths rule for determining adverse impact: Reasons for caution. Journal of Applied Psychology, 91, 507–522. doi:10.1037/0021-9010.91.3.507.

Sackett, P. R., Schmitt, N., Ellingson, J. E., & Kabin, M. B. (2001). High-stakes testing in employment, credentialing, and higher education: Prospects in a post-affirmative-action world. American Psychologist, 56, 302–318. doi:10.1037/0003-066x.56.4.302.

Schenkel, R. (2010). The challenge of feeding scientific advice into policy-making. Science, 330, 1749–1751. doi:10.1126/science.1197503.

Schmidt, F. L. (1988). The problem of group differences in ability test scores in employment selection. Journal of Vocational Behavior, 33, 272–292. doi:10.1016/0001-8791(88)90040-1.

Schmidt, F. L., Gast-Rosenberg, I., & Hunter, J. E. (1980). Validity generalization results for computer programmers. Journal of Applied Psychology, 65, 643–661. doi:10.1037/0021-9010.65.6.643.

Schmidt, F. L., & Hunter, J. E. (1977). Development of a general solution to the problem of validity generalization. Journal of Applied Psychology, 62, 529–540. doi:10.1037/0021-9010.62.5.529.

Schmidt, F. L., & Hunter, J. E. (1981). Employment testing: Old theories and new research findings. American Psychologist, 36, 1128–1137. doi:10.1037/0003-066x.36.10.1128.

Schmidt, F. L., & Hunter, J. E. (1998). The validity and utility of selection methods in personnel psychology: Practical and theoretical implications of 85 years of research findings. Psychological Bulletin, 124, 262–274. doi:10.1037/0033-2909.124.2.262.

Schmidt, F. L., & Hunter, J. E. (2003). History, development, evolution, and impact of validity generalization and meta-analysis methods, 1975–2001. In K. R. Murphy (Ed.), Validity generalization: A critical review (pp. 31–65). Mahwah, NJ: Erlbaum.


Schmidt, F. L., Pearlman, K., & Hunter, J. E. (1980). The validity and fairness of employment and educational tests for Hispanic Americans: A review and analysis. Personnel Psychology, 33, 705–724. doi:10.1111/j.1744-6570.1980.tb02364.x.

Schmitt, N., & Quinn, A. (2010). Reductions in measured subgroup mean differences: What is possible? In J. L. Outtz (Ed.), Adverse impact: Implications for organizational staffing and high stakes selection (pp. 425–451). New York, NY: Routledge.

Sharf, J. (2006). Letter to Cari M. Dominguez, Chair, Equal Employment Opportunity Commission (May 10, 2006). Alexandria, VA: Author.

Sharf, J. (2008, February). Enforcement agencies’ response to validity generalization. Paper presented at the annual meeting of the Personnel Testing Council of Metropolitan Washington, Washington, DC.

Shoben, E. W. (1978). Differential pass-fail rates in employment testing: Statistical proof under Title VII. Harvard Law Review, 91, 793–813.

Slavin, R. E. (2002). Evidence-based education policies: Transforming educational practice and research. Educational Researcher, 31, 15–21. doi:10.3102/0013189X031007015.

Society for Industrial and Organizational Psychology (SIOP). (2003). Principles for the validation and use of personnel selection procedures (4th ed.). Bowling Green, OH: Author.

Society for Industrial and Organizational Psychology (SIOP). (n.d.). SIOP bylaws. Retrieved from http://www.siop.org/reportsandminutes/bylaws.pdf

Steinbrook, R. (2004). Peer review and federal regulations. New England Journal of Medicine, 350, 103–104. doi:10.1056/NEJMp038230.

Stillwell, R. (2010). Public school graduates and dropouts from the common core of data: School year 2007–08 (NCES 2010-341). Washington, DC: National Center for Education Statistics, Institute of Education Sciences, U.S. Department of Education.

U.S. Census Bureau. (2007). Latest SUSB annual data: U.S. & states, totals. Retrieved from http://www.census.gov/econ/susb/

Walberg, H. J., & Tsai, S.-L. (1983). Matthew effects in education. Educational Research Quarterly, 20, 359–373. doi:10.2307/1162605.

Wigdor, A. K., & Garner, W. R. (Eds.). (1982). Ability testing: Use, consequences, and controversies. Washington, DC: National Academy Press.