Kane Test Specifications (1)

  • 8/13/2019 Kane Test Specifications (1)

    Combining Data on Criticality and Frequency in Developing Test Plans for Licensure and Certification Examinations
    Author(s): Michael T. Kane, Carole Kingsbury, Dean Colton and Carmen Estes
    Source: Journal of Educational Measurement, Vol. 26, No. 1 (Spring, 1989), pp. 17-27
    Published by: National Council on Measurement in Education
    Stable URL: http://www.jstor.org/stable/1434620

    Accessed: 25/01/2014 18:15

    This content downloaded from 131.94.16.10 on Sat, 25 Jan 2014 18:15:47 PMAll use subject to JSTOR Terms and Conditions

    Journal of Educational Measurement
    Spring 1989, Vol. 26, No. 1, pp. 17-27

    Combining Data on Criticality and Frequency in Developing Test Plans for Licensure and Certification Examinations

    Michael T. Kane
    American College Testing Program

    Carole Kingsbury
    National League for Nursing

    Dean Colton
    American College Testing Program

    Carmen Estes
    American College Testing Program

    Job analysis is a critical component in evaluating the validity of many high-stakes testing programs, particularly those used for licensure or certification. The ratings of criticality and frequency of various activities that are derived from such job analyses can be combined in a number of ways. This paper develops a multiplicative model as a natural and effective way to combine ratings of frequency and criticality in order to obtain estimates of the relative importance of different activities for practice. An example of the model's use is presented. The multiplicative model incorporates adjustments to ensure that the effective weights of frequency and criticality are appropriate.

    There are several types of high-stakes, large-scale testing programs that are designed to assess readiness to engage in some type of work. The examinations used in making decisions about professional and occupational licensure and certification are particularly prominent examples of this type of examination. These tests have a direct impact on the work opportunities of the many candidates for licensure or certification, and a less direct, but pervasive, influence on the quality and availability of a variety of important services. The content of these examinations also tends to exert a strong influence on the content of educational programs at both the graduate and undergraduate levels in programs that prepare students for these professions and occupations. For example, one of the arguments for introducing certification examinations for teachers in many states is to upgrade the quality of teacher preparation programs. Some types of employment tests (i.e., those used to evaluate readiness to perform a particular job rather than readiness for training) are also obvious examples of this kind of test.

    Licensure and certification tests are intended to provide assurance that passing candidates have the knowledge and skills necessary to perform safely and effectively in some profession or occupation. A major concern in developing
    licensure and certification examinations and in evaluating their quality is the validity of the proposed interpretation in terms of readiness for the type of work for which a license or certification is being awarded. A fairly detailed specification of the activities performed in the work area (e.g., the results of an empirical job analysis) is generally an integral part of any effort to develop validity data for such tests (Kane, 1982; Shimberg, 1981).

    The importance of empirical job analyses in validating these examinations is reflected in several sets of professional and legal standards/guidelines. For example, the Standards for Educational and Psychological Testing (American Psychological Association, American Educational Research Association, & National Council on Measurement in Education, 1985) suggest that in validating professional and occupational licensure and certification examinations, primary reliance usually must be placed on content-related evidence, and that an argument based on content-related evidence should be supported by a job analysis.

    Under a content-related strategy, the detailed definition of the area of activity can be used directly in developing test specifications for the tests. Alternatively, the definition of the area of activity might be used to develop a criterion of performance in the area as a basis for examining the predictive validity of the test scores. In either case, it is necessary to translate information about the area of activity into specifications for a measurement procedure of some kind. To the extent that job analysis data are used to inform curriculum decisions, an appropriate weighting of curriculum content also will depend on an appropriate weighting of different types of activities in the job analysis.

    Although the importance of job analysis in examining validity issues is widely recognized, the methods for developing detailed descriptions of work activity and for translating such descriptions into test specifications are not well developed. In this paper we examine some of the issues involved in developing specifications for licensure and certification tests from information about the frequency and criticality of specific activities, and propose a method for transforming information about frequency and criticality of activities into test specifications. The discussion is in terms of licensure examinations, but the general approach also would apply to certification tests and to some kinds of employment tests.

    The next section provides a brief discussion of the advantages and disadvantages of basic additive and multiplicative models for combining frequency and criticality. The third section develops a more sophisticated multiplicative model, which makes it possible to control the relative impact of frequency and criticality on the final weights. Controlling the relative impact of frequency and criticality is an important issue, because questions about the frequency of activities tend to generate much more variability in responses than questions about criticality, and, therefore, in the absence of appropriate adjustments, frequency tends to dominate criticality in determining the weights assigned to activities.

    The fourth section presents an example based on a job analysis of the practice patterns of entry-level registered nurses, and illustrates the usefulness of controlling the impact of frequency and criticality in the multiplicative model. For motivational reasons, it may be advisable to examine the example in the
    fourth section before reading the more technical development in the third section.

    Models for Combining Data on Frequency and Criticality

    One of the most common approaches to job analysis for licensure and certification examinations involves the use of activity inventories, or task inventories (Gael, 1983; McCormick, 1979). Basically, the use of an activity inventory involves three steps. First, activity statements are developed and verified as reflecting potentially important parts of practice. The list of activities should be as comprehensive as possible. Second, a questionnaire based on the list of activities is developed and administered to job incumbents and/or supervisors. In general, the questionnaire asks at least two questions about each activity: how often the activity occurs (its frequency), and how much difference it makes in terms of client outcomes if the activity is performed well or badly (its criticality). Third, data provided by job incumbents are analyzed to weight activities in terms of their overall importance for practice.

    The relative importance of any activity in practice will depend on the frequency of the activity (how often it is performed) and the criticality of the activity (the difference that it makes in terms of client outcomes). The results of the job analysis can be summarized in terms of the average frequency of occurrence of each activity over respondents and the average rating of criticality over respondents. The central task is then to combine average frequency and average criticality in order to get an overall index of the importance of the activity. We examine several models for combining frequency and criticality in developing a test plan.

    An additive model of the form

    $$I_i = C_i + F_i, \tag{1}$$

    where $I_i$ represents the importance of the ith activity, $F_i$ represents the frequency of the ith activity, and $C_i$ represents the criticality of the ith activity, is the simplest type of model to use.

    However, the additive model yields an index of importance that is hard to interpret in a coherent way. The scales for frequency and criticality are different in their interpretation. Adding the number of times an activity occurs to its perceived consequences results in an index that has no clear interpretation. The main advantage of the additive model in Equation 1 is that it is simple.

    A multiplicative model is more statistically complicated (as we shall see) but makes more sense. We can think of the criticality of an activity as a measure of the consequences that may result from the activity, on the average, each time the activity is performed. That is, criticality can be viewed as importance per occurrence of the activity. The overall importance of the activity for practice then could be estimated by summing criticality over all occurrences or, more simply, by multiplying the criticality by the frequency

    $$I_i = C_i F_i. \tag{2}$$

    The multiplicative model in Equation 2 is a particularly natural way to combine
    data on criticality and frequency in assigning different levels of importance to different activities.
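The contrast between Equations 1 and 2 can be illustrated with a small numerical sketch. The three activities, their ratings, and the function names below are hypothetical and chosen only for illustration; they are not data from the study. The sketch shows one way of seeing why the additive index is hard to interpret: re-expressing frequency in different units can change the ranking of activities under Equation 1, whereas under Equation 2 every index is scaled by the same factor and the ranking is unchanged.

```python
def additive_index(criticality, frequency):
    """Equation 1: I_i = C_i + F_i."""
    return [c + f for c, f in zip(criticality, frequency)]


def multiplicative_index(criticality, frequency):
    """Equation 2: I_i = C_i * F_i."""
    return [c * f for c, f in zip(criticality, frequency)]


def ranking(index):
    """Activity positions ordered from most to least important."""
    return sorted(range(len(index)), key=lambda i: index[i], reverse=True)


# Hypothetical ratings for three activities (illustration only).
criticality = [0.9, 0.5, 0.3]
frequency = [0.4, 1.0, 1.1]
# The same frequencies expressed in a unit five times as long.
frequency_rescaled = [f * 5 for f in frequency]

add_rank = ranking(additive_index(criticality, frequency))
add_rank_rescaled = ranking(additive_index(criticality, frequency_rescaled))
mul_rank = ranking(multiplicative_index(criticality, frequency))
mul_rank_rescaled = ranking(multiplicative_index(criticality, frequency_rescaled))

print("additive:      ", add_rank, "->", add_rank_rescaled)
print("multiplicative:", mul_rank, "->", mul_rank_rescaled)
```

Because the proportional weights used later depend only on ratios of the index values, the multiplicative index also yields the same weights whatever unit is used for frequency.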

    Nominal and Effective Activity Weights

    One problem that arises in using any model to combine frequency and criticality is that the effective contributions of frequency and criticality to importance may be quite different from the nominal, or apparent, contributions of these two variables (see Jarjoura & Brennan, 1982; Wang & Stanley, 1970). That is, although $F_i$ and $C_i$ play parallel roles in Equation 2, the impact of these two variables on the importance assigned to different activities is determined by the statistical properties of the two variables. In most job analysis studies, the statistical properties of the frequency scale and the criticality scale are likely to be quite different, and therefore the effective contributions of $F_i$ and $C_i$ in Equation 2 in general would not be equal.

    The relative emphasis that should be given to frequency and criticality in determining importance is a matter of judgment. However, for licensure that is intended to protect the public from harm or unnecessary risk, criticality would seem to be of at least as much concern as frequency. As Rakel (1979) has suggested,

        The temptation to achieve content validity in examinations by matching test items to the frequency of problems encountered in practice could also be counter-productive. There is a justifiable need to test more heavily on problems that have a high morbidity and fall into the uncommon but harmful if missed category. Because of their serious nature, they deserve greater representation in an examination than practice surveys indicate. (p. 93)

    Activities that are critical in the sense that their omission or inadequate performance would pose substantial risk to clients are directly related to the purposes of licensure, even if they have relatively low frequency. By contrast, activities that are performed frequently but have very low criticality would be less important for the protection of the public than their frequency might suggest. Although, as noted earlier, the contributions to be made by criticality and frequency are a matter of judgment rather than an empirical question, it seems clear that the relative contributions of these two variables should not be determined by the properties of data collection procedures.

    In examining the effective contributions of the two variables in Equation 2, it is convenient to convert Equation 2 into a linear equation by taking the natural logarithms of both sides of the equation:

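    $$\ln I_i = \ln C_i + \ln F_i. \tag{3}$$

    The effective weights of frequency and criticality then can be found by partitioning the variance in $\ln I_i$ into two parts:

    $$\begin{aligned}
    \operatorname{var}(\ln I_i) &= \operatorname{cov}(\ln I_i, \ln I_i) \\
    &= \operatorname{cov}(\ln C_i + \ln F_i, \ln I_i) \\
    &= \operatorname{cov}(\ln C_i, \ln I_i) + \operatorname{cov}(\ln F_i, \ln I_i).
    \end{aligned} \tag{4}$$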

    The first term in Equation 4 can be interpreted as the effective contribution of criticality in Equation 2 and can be expanded as

    $$\begin{aligned}
    \operatorname{cov}(\ln C_i, \ln I_i) &= \operatorname{cov}(\ln C_i, \ln C_i + \ln F_i) \\
    &= \operatorname{var}(\ln C_i) + \operatorname{cov}(\ln C_i, \ln F_i).
    \end{aligned} \tag{5}$$

    Similarly, the effective contribution of the frequency in Equation 2 is found by expanding the second term in Equation 4:

    $$\begin{aligned}
    \operatorname{cov}(\ln F_i, \ln I_i) &= \operatorname{cov}(\ln F_i, \ln C_i + \ln F_i) \\
    &= \operatorname{var}(\ln F_i) + \operatorname{cov}(\ln C_i, \ln F_i).
    \end{aligned} \tag{6}$$

    We can alter the relative contributions of frequency and criticality by transforming one or both of these two variables. Because it is the relative contributions that are significant rather than the absolute values of the variables, it is necessary to transform only one of the two variables.

    Of the two variables, it seems natural to transform criticality rather than frequency. The frequency scale has a natural interpretation as a count of the number of times the activity is performed, and most transformations of this scale would interfere with this interpretation. The criticality scale is essentially ordinal, and any transformation that did not change the ordering of activities on the criticality scale would not interfere with its interpretability.

    Given that we are using a multiplicative model, an exponential transformation of the criticality scale of the form

    $$C_i' = C_i^{a} \tag{7}$$

    is convenient. Using the transformed criticality, we can determine the effective contributions of criticality and frequency in the new model,

    $$I_i' = C_i' F_i = C_i^{a} F_i, \tag{8}$$

    as we did earlier for Equation 2. Taking the logarithm of Equation 8, we have

    $$\ln I_i' = \ln(C_i' F_i) = a \ln C_i + \ln F_i. \tag{9}$$

    The variance of $\ln I_i'$ then can be expanded as

    $$\begin{aligned}
    \operatorname{var}(\ln I_i') &= \operatorname{cov}(\ln I_i', \ln I_i') \\
    &= \operatorname{cov}(a \ln C_i + \ln F_i, \ln I_i') \\
    &= a \operatorname{cov}(\ln C_i, \ln I_i') + \operatorname{cov}(\ln F_i, \ln I_i').
    \end{aligned} \tag{10}$$

    The first term on the right side of Equation 10 represents the contribution of the criticality variable (transformed) to estimates of importance, and is given by

    $$\begin{aligned}
    a \operatorname{cov}(\ln C_i, \ln I_i') &= a \operatorname{cov}(\ln C_i, a \ln C_i + \ln F_i) \\
    &= a^2 \operatorname{var}(\ln C_i) + a \operatorname{cov}(\ln C_i, \ln F_i).
    \end{aligned} \tag{11}$$

    The second term on the right side of Equation 10 represents the contribution of
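    the frequency variable to the estimates of importance, and is given by

    $$\begin{aligned}
    \operatorname{cov}(\ln F_i, \ln I_i') &= \operatorname{cov}(\ln F_i, a \ln C_i + \ln F_i) \\
    &= \operatorname{var}(\ln F_i) + a \operatorname{cov}(\ln C_i, \ln F_i).
    \end{aligned} \tag{12}$$

    If we wish the effective weights of criticality and frequency to be equal in determining overall importance, we can set Equation 12 equal to Equation 11, and solve for the appropriate value of a:

    $$a^2 \operatorname{var}(\ln C_i) + a \operatorname{cov}(\ln C_i, \ln F_i) = \operatorname{var}(\ln F_i) + a \operatorname{cov}(\ln C_i, \ln F_i),$$

    or

    $$a^2 \operatorname{var}(\ln C_i) = \operatorname{var}(\ln F_i),$$

    $$a = \left[\operatorname{var}(\ln F_i) / \operatorname{var}(\ln C_i)\right]^{1/2}. \tag{13}$$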
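The computation in Equations 11 through 13 can be sketched in a few lines of Python. This is a minimal illustration rather than the authors' procedure: the helper names are assumptions, the ratings are randomly generated hypothetical values, and the sample covariance with an n - 1 denominator is one reasonable choice that the paper does not specify.

```python
import math
import random


def covariance(x, y):
    """Sample covariance (n - 1 denominator); covariance(x, x) is the variance."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    return sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y)) / (n - 1)


def log_moments(criticality, frequency):
    """Return var(ln C), var(ln F), and cov(ln C, ln F) across activities."""
    ln_c = [math.log(c) for c in criticality]
    ln_f = [math.log(f) for f in frequency]
    return covariance(ln_c, ln_c), covariance(ln_f, ln_f), covariance(ln_c, ln_f)


def effective_contributions(criticality, frequency, a=1.0):
    """Equations 11 and 12; with a = 1 they reduce to Equations 5 and 6."""
    var_c, var_f, cov_cf = log_moments(criticality, frequency)
    return a * a * var_c + a * cov_cf, var_f + a * cov_cf


def equalizing_exponent(criticality, frequency):
    """Equation 13: a = sqrt(var(ln F) / var(ln C))."""
    var_c, var_f, _ = log_moments(criticality, frequency)
    return math.sqrt(var_f / var_c)


# Hypothetical average ratings for 50 activities (illustration only).
rng = random.Random(0)
criticality = [rng.uniform(0.4, 1.0) for _ in range(50)]
frequency = [rng.uniform(0.2, 3.0) for _ in range(50)]

before = effective_contributions(criticality, frequency)
a = equalizing_exponent(criticality, frequency)
after = effective_contributions(criticality, frequency, a)

print("a =", round(a, 3))
print("contributions (criticality, frequency) before:", before)
print("contributions (criticality, frequency) after: ", after)
```

With the exponent from Equation 13, the two contributions returned by `effective_contributions` agree up to floating-point rounding, and their sum equals the variance of the log of the adjusted importance, as Equation 10 requires.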

    That is, the contributions of criticality and frequency (relative to the total variance of $\ln I_i'$) can be made equal by transforming all criticality values by raising them to the power a, where a is given by Equation 13.

    Similarly, if we want the effective weight of criticality to be k times that of frequency (where k is any positive value), we can set Equation 11 equal to k times Equation 12 and solve the resulting quadratic equation for a.

    Weights for the Test Plan

    The most obvious way to assign weights in the test plan, based on the importance of each activity, would be to make the weights proportional to the estimated importance

    $$W_i = \alpha I_i. \tag{14}$$

    Because the weights, interpreted as proportions, must sum to 1, the constant $\alpha$ in Equation 14 can be determined by setting the sum of the weights over all activities equal to 1 and solving for $\alpha$, which yields

    $$\alpha = 1 \Big/ \sum_j I_j, \tag{15}$$

    and the proportional weight, $W_i$, assigned to the ith activity, would be equal to

    $$W_i = \frac{I_i}{\sum_j I_j}, \tag{16}$$

    where the value of the $I_i$'s could be found from the basic model in Equation 2 or the more sophisticated model in Equation 8.

    A word of caution is appropriate at this point. The data generated by empirical job analyses, to which these analyses would be applied, are based on what practitioners say they are doing. In many cases, it can be argued that the distribution of effort in current practice is inappropriate, or that future needs will be different from current needs. Job analysis data describe what is, and not necessarily what should be. Such data can provide guidance in developing test plans and in designing educational programs, but should not be used mechanically.
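Two computations from this section can be sketched together: solving for the exponent a when criticality is to carry k times the effective weight of frequency, and normalizing importance values into proportional weights as in Equation 16. The function names and the variance and covariance values below are hypothetical. Setting Equation 11 equal to k times Equation 12 and rearranging gives var(ln C) a^2 + (1 - k) cov(ln C, ln F) a - k var(ln F) = 0; the sketch returns only the positive root, on the assumption that a negative exponent is unwanted because it would reverse the ordering of activities on the criticality scale.

```python
import math


def exponent_for_ratio(var_ln_c, var_ln_f, cov_ln_cf, k=1.0):
    """Positive root a of  a^2 var(ln C) + a cov = k * (var(ln F) + a cov).

    Rearranged: var_ln_c * a^2 + (1 - k) * cov_ln_cf * a - k * var_ln_f = 0.
    Assumes var_ln_c > 0, var_ln_f > 0, and k > 0.
    """
    b = (1.0 - k) * cov_ln_cf
    discriminant = b * b + 4.0 * var_ln_c * k * var_ln_f
    return (-b + math.sqrt(discriminant)) / (2.0 * var_ln_c)


def proportional_weights(importance):
    """Equation 16: each importance value divided by the sum of all values."""
    total = sum(importance)
    return [value / total for value in importance]


# Hypothetical variances and covariance of ln C and ln F (illustration only).
var_c, var_f, cov_cf = 0.1, 0.9, 0.05

a_equal = exponent_for_ratio(var_c, var_f, cov_cf, k=1.0)   # Equation 13 case
a_double = exponent_for_ratio(var_c, var_f, cov_cf, k=2.0)  # criticality x2

contribution_c = a_double ** 2 * var_c + a_double * cov_cf  # Equation 11
contribution_f = var_f + a_double * cov_cf                  # Equation 12

weights = proportional_weights([2.0, 1.0, 1.0])

print("a (k = 1):", a_equal)
print("a (k = 2):", a_double)
print("contribution ratio:", contribution_c / contribution_f)
print("weights:", weights)
```

For k = 1 the linear term vanishes and the root reduces to Equation 13, which makes a convenient check on the quadratic.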

    An Example

    In this section, the implications of adjusting criticality ratings to make their impact equivalent to that of the frequency ratings will be discussed in terms of a job analysis of the entry-level practice of registered nurses. The data reported in this section are derived from a job analysis study of registered nurses (Kane, Kingsbury, Colton, & Estes, 1986).

    As part of this study, responses from 1,375 newly licensed registered nurses who passed the NCLEX-RN (the licensing examination for registered nurses, prepared by the National Council of State Boards of Nursing) in July 1984 were analyzed. In addition to questions about work setting, educational background, and related topics, the participants were asked to rate 222 activities in terms of their frequency of performance and the criticality of the activity for client well-being (see Kane et al., 1986, for details on questionnaire development and sampling design).

    The range of values for the average frequencies of activities was relatively large: Some activities had very low frequencies (e.g., administering CPR), whereas other activities had high frequencies (e.g., taking vital signs). Therefore, the variability among activities in their average frequencies tended to be quite large. By contrast, the range of criticality ratings tended to be narrow.

    For the sample of newly licensed registered nurses, the variance of $\ln C_i$ was .084, the variance of $\ln F_i$ was .877, and their covariance was .057. Therefore, from Equations 5 and 6, the effective contribution of criticality was .141, and the effective contribution of frequency was .934. As expected, the effective contribution of frequency to the index of importance was much larger than the effective contribution of criticality.

    Using Equation 13, it was found that the constant a should be equal to 3.235 in order for the impact of frequency and criticality to be equal in determining importance. Assuming $I_i = C_i F_i$, Equation 16 can be used to estimate the unadjusted proportional weight, $W_i$, assigned to any activity.
    Similarly, substituting $I_i' = C_i^{a} F_i$ for $I_i$, Equation 16 can be used to determine the adjusted proportional weight, $W_i'$, assigned to any activity. The use of $I_i'$ with a = 3.235 instead of $I_i$ increases the weighting of activities with relatively high criticality ratings and decreases the weights of activities with relatively low criticality ratings.

    Before examining data for task statements from the study, it may be useful to consider the effect of adjusting weights on four hypothetical tasks. In the study of entry-level registered nursing practice, the mean frequency for the 222 activities was 1.33, and the standard deviation was .83; therefore, an activity with an average frequency of .50 would be one standard deviation below the mean, and an activity with an average frequency of 2.16 would be one standard deviation above the mean. For criticality, the average was .73, and the standard deviation was .18; therefore, an activity with an average criticality of .55 would be one standard deviation below the mean criticality, and an activity with an average criticality of .91 would be one standard deviation above the mean.

    Table 1 contains the weights that would be assigned to four hypothetical activities if no adjustment were made in the criticality ratings (i.e., weights based on I = CF), and if the criticality ratings were adjusted to weight criticality and frequency equally (i.e., weights based on I' = C^a F). Two of the four hypothetical activities have relatively low frequencies (one standard deviation below the mean, or F = .50), and two have relatively high frequencies (one standard deviation above the mean, or F = 2.16). For each value of frequency, one activity has low criticality (C = .55), and one has high criticality (C = .91). In computing the weights in Table 1, the values for I for the four activities were divided by the sum of the four values of I in order to obtain proportional weights. Similarly, the values of I' were divided by the sum of the four values of I'.

    Table 1
    Weights Assigned to Four Hypothetical Activities Using I = CF, and I' = C^a F with a = 3.235

                            I = CF                    I' = C^a F
                   Average     Average           Average     Average
    Average F      C = 0.55    C = 0.91          C = 0.55    C = 0.91
    0.50           0.071       0.117             0.031       0.157
    2.16           0.306       0.506             0.133       0.678

    There are two general features of the two sets of weights in Table 1 that should be noted. First, the values of the adjusted weights, $W_i'$, for the two low criticality activities are smaller than the corresponding values of the unadjusted weights, $W_i$, and the values of $W_i'$ for the two high criticality activities are larger than the corresponding values of $W_i$. The adjusted weights, $W_i'$, give more emphasis to criticality than the unadjusted weights, $W_i$.

    Second, for the unadjusted weights, the weight ($W_i$ = .306) assigned to the high frequency-low criticality activity is almost three times as large as the weight ($W_i$ = .117) for the low frequency-high criticality activity. Because the impact of frequency on $W_i$ is much larger than the impact of criticality, the weighting of the high frequency activities tends to be much larger than the weighting of low frequency activities, regardless of the criticality ratings.
    When the criticality scale is adjusted so that the impact of frequency and criticality on importance is equal, as in the right side of Table 1, the weight, $W_i'$, of the high frequency-low criticality activity is roughly equal to the weight of the low frequency-high criticality activity.

    Table 2 presents percentage weights for five pairs of activities (from the study of entry-level RN practice) representing different values of frequency and criticality. (Note that the weights in Table 2 are percentage weights based on the 222 activities in the study. The weights and the adjusted weights for the 222 activities sum to approximately 100.) In each pair of activities, one activity has a relatively low frequency, and one has a relatively high frequency. The criticality ratings in each pair are approximately equal, but the pairs of criticality ratings increase from top to bottom in Table 2.

    Table 2
    Percentage Weights Assigned to Ten Actual Activities Using I = CF, and I' = C^a F with a = 3.235

           Activity Ratings    Weights
    No.    C       F           W       W'       Activity
    219    0.39    0.63        0.112   0.022    Help clients choose recreational activities that fit their age and condition
    12     0.37    1.94        0.324   0.057    Weigh a client
    191    0.57    0.55        0.141   0.064    Teach a client with poor inter-personal skills to communicate more effectively
    59     0.54    2.18        0.532   0.219    Record observations of behavior that indicate a client's mood
    28     0.76    0.59        0.205   0.180    Administer an immunizing agent
    211    0.76    1.91        0.653   0.573    Suggest revising or discontinuing a medication order
    14     0.87    0.62        0.242   0.286    Evaluate the impact of therapeutic interventions on a client's potential for suicide
    184    0.88    2.24        0.885   1.068    Administer oxygen
    97     0.95    0.66        0.283   0.412    Assess the environment of a suicidal client for potential hazards
    96     0.96    2.25        0.969   1.422    Maintain asepsis for clients at risk

    The weights in Table 2 follow the pattern discussed earlier. For both sets of weights, the largest weights are for activities with high frequency and high criticality, and the smallest weights are for activities with low frequency and low criticality. For a given level of criticality, the activity with a higher frequency has a larger weight than the activity with a lower frequency. Likewise, for a given frequency level, the weights increase as a function of criticality.

    As was the case for the artificial data in Table 1, the major difference between the unadjusted weights, $W_i$, and adjusted weights, $W_i'$, can be seen most clearly by comparing high frequency-low criticality activities to low frequency-high criticality activities. The unadjusted weight, $W_i$, assigned to activity 12, Weigh a client, which has a low criticality but a high frequency, is higher than $W_i$ for activity 97, Assess the environment of a suicidal client for potential hazards, which has a high criticality but a low frequency. This occurs because the unadjusted weights, $W_i$, depend mainly on frequency: Note that the activity, Weigh a client, has a larger value of $W_i$ than any of the five activities with low frequencies, regardless of their criticality.

    By contrast, the adjusted weight, $W_i'$, for activity 12, Weigh a client, is much smaller than its unadjusted weight, $W_i$, because of its very low criticality rating. Similarly, the value of $W_i'$ for activity 97, Assess the environment of a suicidal client for potential hazards, is larger than the corresponding value of $W_i$ because of its high criticality rating. Because of these changes, the adjusted weight for activity 97, which has low frequency and high criticality, is much larger than the adjusted weight for activity 12, which has low criticality and high frequency.

    Given the purpose of licensure, to protect the public, it is desirable that a licensure examination emphasize activities that would pose a serious threat to clients if they were omitted or done improperly. Therefore, it would seem appropriate to give the criticality ratings at least as much emphasis as the frequency ratings in evaluating overall importance, and this suggests that adjusted weights, rather than unadjusted weights, should be used.
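The summary figures in this section can be checked with a short sketch. It uses only numbers reported in the text (the variances and covariance of ln C and ln F, the four hypothetical frequency and criticality values, and a = 3.235); the variable and function names are assumptions made for illustration. Because the published variances are rounded to three decimals, the exponent recomputed from them comes out close to, but not exactly, the reported 3.235, and the recomputed weights may differ from Table 1 in the third decimal place.

```python
import math

# Summary statistics reported for the newly licensed RN sample.
VAR_LN_C = 0.084
VAR_LN_F = 0.877
COV_LN_CF = 0.057

contribution_c = VAR_LN_C + COV_LN_CF            # Equation 5
contribution_f = VAR_LN_F + COV_LN_CF            # Equation 6
a_from_rounded = math.sqrt(VAR_LN_F / VAR_LN_C)  # Equation 13


def table_weights(freqs, crits, a=1.0):
    """Proportional weights (Equation 16) for I = C**a * F over a grid."""
    importance = {(f, c): (c ** a) * f for f in freqs for c in crits}
    total = sum(importance.values())
    return {key: value / total for key, value in importance.items()}


FREQS = (0.50, 2.16)   # one SD below and above the mean frequency
CRITS = (0.55, 0.91)   # one SD below and above the mean criticality

unadjusted = table_weights(FREQS, CRITS)          # I = CF
adjusted = table_weights(FREQS, CRITS, a=3.235)   # I' = C^a F

print("effective contributions:", contribution_c, contribution_f)
print("a from rounded variances:", round(a_from_rounded, 3))
for key in unadjusted:
    print(key, round(unadjusted[key], 3), round(adjusted[key], 3))
```

The loop prints one line per hypothetical activity, keyed by its (frequency, criticality) pair, with the unadjusted weight followed by the adjusted weight for comparison with Table 1.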

    Conclusions

    In combining data on criticality and frequency as a basis for developing a licensure or certification examination, a multiplicative model would seem to be particularly appropriate. Criticality ratings provide estimates of the importance of the activity per occurrence, and frequency data provide estimates of the rate of occurrence of the activity in practice. By multiplying criticality by frequency, the overall importance of the activity in practice can be estimated.

    In addition, the difference between the nominal weights of these two variables and their effective weights in determining estimates of importance needs to be considered. This is particularly true because the variability in average frequencies is likely to be much larger than the variability in criticality ratings in task analyses of professional practice. In this paper, we have outlined procedures for analyzing the effective weights of criticality and frequency in a multiplicative
    model and for controlling the relative impact of frequency and criticality in estimating overall importance.

    References

    American Psychological Association, American Educational Research Association, & National Council on Measurement in Education. (1985). Standards for educational and psychological testing. Washington, DC: American Psychological Association.

    Gael, S. (1983). Job analysis: A guide to assessing work activities. San Francisco: Jossey-Bass.

    Jarjoura, D., & Brennan, R. (1982). A variance components model for measurement procedures associated with a table of specifications. Applied Psychological Measurement, 6, 161-171.

    Kane, M. (1982). The validity of licensure examinations. American Psychologist, 37, 911-918.

    Kane, M., Kingsbury, C., Colton, D., & Estes, C. (1986). A study of nursing practice and role delineation and job analysis of entry-level performance of registered nurses. Chicago: National Council of State Boards of Nursing.

    McCormick, E. (1979). Job analysis: Methods and applications. New York: American Management Association.

    Rakel, R. (1979). Defining competence in specialty practice: The need for relevance. In Definitions of competence in specialities of medicine, conference proceedings. Chicago: American Board of Medical Specialties.

    Shimberg, B. (1981). Testing for licensure and certification. American Psychologist, 36, 1138-1146.

    Wang, M., & Stanley, J. (1970). Differential weighting: A review of methods and empirical studies. Review of Educational Research, 40, 663-705.

    Authors

    MICHAEL T. KANE, Senior Research Scientist, American College Testing Program, P.O. Box 168, Iowa City, IA 52243. Degrees: BS, Manhattan College; MA, State University of New York at Stony Brook; MS, PhD, Stanford University. Specialization: measurement theory.

    CAROLE KINGSBURY, Director of Test Construction, National League for Nursing, 10 Columbus Circle, New York, NY 10019. Degrees: BS, EdM, EdD, Columbia University. Specialization: evaluation in nursing education and practice.

    DEAN COLTON, Research Specialist, American College Testing Program, P.O. Box 168, Iowa City, IA 52243. Degrees: BS, MA, University of Iowa. Specialization: educational measurement.

    CARMEN A. ESTES, Director, Program Support and Research, Contract Services Area, Test Development Division, American College Testing Program, P.O. Box 168, Iowa City, IA 52243. Degrees: BSN, University of Maryland; MS, Pennsylvania State University; MEd, Towson State College; PhD, Pennsylvania State University. Specializations: test development, test specifications, job analysis, role delineation studies.
