A Method for Scaling Scores on Tests that Prove Too Difficult

A Method for Scaling Difficult Tests 597

humidity of 80%. The moisture content is a little less than .5 ounces/100 cubic feet. Heat this air to 75° without changing the moisture content. Move along the .5 ounce/100 cubic foot moisture content line to point F. We note that the relative humidity has been lowered to 20%.

b. How is the rate of evaporation of moisture from available water sources dependent upon the relative humidity of the air?

Wet 3 or 4 towels, weigh them and hang them in the room which holds dry air of 75°F. temperature. Turn on the fan. After a period of time weigh these towels again and note the loss of water due to evaporation. Next compute from the nomograph the moisture content of the air in the model room and the change in temperature due to evaporation. Does the gain in moisture agree in these two methods? Repeat this experiment in moist air of the same temperature and compare the rate of evaporation with that in the dry air. Is the number of degrees drop in air temperature proportional to the amount of water evaporated from the towels? Investigate the effect on your skin of dry heated air in your home in the winter.

A Method for Scaling Scores on Tests that Prove Too Difficult

John Schmitt and Mary Griffin
Boston College, Chestnut Hill, Mass. 02167

Anyone who regularly constructs and administers achievement tests knows that it is not unusual, on occasion, to find half of the class achieving percentage scores below the level normally accepted as passing for the particular school or school system in question. This article describes a simple method of "curving" scores upward that has proved satisfactory to a number of teachers and their pupils.

Low test performance may be attributed to several factors. It may be the primary responsibility of the pupils; it may be due to the conditions of instruction (which include the teacher); or it may be due to the test itself. In most instances it is probably the result of some interaction among these factors, since there are few classrooms populated by perfect pupils or optimal instructional conditions (including perfect teaching) or perfect tests.

Regardless of the cause or causes of low test scores, however, there can be no legitimate argument with the teacher who maintains that tests set the standards, and pupils should get what they score; nor can anyone contest the position that difficult tests are balanced by easy ones, so that everything comes out even, eventually. Both of these are defensible points of view, and strict adherents of either will probably find little of value in what follows.

598 School Science and Mathematics

For the teacher seeking a defensible method of raising scores on difficult tests, several approaches are available. Most statisticians would recommend some form of standard score: z-scores, T-scores, stanines, or scales for which the teacher determines the mean and standard deviation. Some would recommend "normalizing" the distribution to ensure comparability among scores on various tests, and both of these procedures are statistically sound and equitable.

The difficulty is that, although these methods have been known and recommended for years by college professors, hardly any teachers use them. The computations are laborious, and most teachers cannot or will not perform them. Many teachers do not understand standard scores, and almost no pupils do. Finally, if a standard scale resembling a local percentage-grading scale is used, grades above 100 can, and do, occur, and this is distressing to all concerned.
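The point about grades above 100 can be illustrated with a small sketch. The target mean of 80, the target standard deviation of 10, and the ten raw scores are all assumptions chosen for illustration; none of them comes from the article.

```python
from statistics import mean, pstdev

def standard_scale(raw_scores, target_mean=80.0, target_sd=10.0):
    """Rescale scores to a chosen mean and standard deviation via z-scores.

    target_mean and target_sd are illustrative assumptions, not values
    prescribed by the article.
    """
    m = mean(raw_scores)
    sd = pstdev(raw_scores)  # population standard deviation of the class
    return [target_mean + target_sd * (x - m) / sd for x in raw_scores]

# Hypothetical raw scores on a 50-item test (illustrative data only).
raw = [22, 28, 30, 32, 34, 34, 35, 37, 38, 50]
scaled = standard_scale(raw)

print([round(s, 1) for s in scaled])
print("highest scaled score exceeds 100:", max(scaled) > 100)
```

With these numbers the pupil who answered all 50 items correctly lands above 100 on the standard scale, which is the outcome described above.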

Teachers with whom the writers have worked are generally satisfied with the achievement of two objectives, where difficult tests are concerned. The average score should be raised to a level consistent with local practice, or some other level determined by the teacher, and a maximum score on the test (all items correct) should yield a scaled score of 100.

The method of solution of simultaneous equations is ideally suited to achieving these goals. In a test of 50 items, for example, where the average number of correct responses is 34, and the desired average score is 80, the following equations describe the problem:

50a + k = 100

34a + k = 80

Then, by subtraction:

16a = 20

a = 20/16 = 1.25

And with the value of a known, k can be readily determined from:

50a + k = 100

Thus:

k = 100 - 50a = 100 - (50 × 1.25) = 100 - 62.5 = 37.5

The equation to produce the desired scaled-score results, where S is the raw score (number of items correct), is then:

Score = 1.25 S + 37.5
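The elimination steps for the 50-item example can be retraced in a few lines. This is a minimal sketch; the function name `solve_pair` is hypothetical, and it assumes whole-number inputs so that exact fractions can be used.

```python
from fractions import Fraction

def solve_pair(c1, r1, c2, r2):
    """Solve c1*a + k = r1 and c2*a + k = r2 for (a, k) by subtraction.

    Assumes integer inputs so the arithmetic stays exact.
    """
    if c1 == c2:
        raise ValueError("the two equations need different coefficients of a")
    a = Fraction(r1 - r2, c1 - c2)  # subtracting the equations eliminates k
    k = r1 - c1 * a                 # substitute a back into the first equation
    return a, k

# 50a + k = 100 (perfect paper) and 34a + k = 80 (class average).
a, k = solve_pair(50, 100, 34, 80)
print(float(a), float(k))  # 1.25 37.5
```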

And a raw score of 27 items correct on the test (percentage score of 54) would yield a scaled score of:

(1.25 × 27) + 37.5 = 33.75 + 37.5 = 71.25

Following normal rounding procedures, this would be recorded as a scaled score of 71 on the test.
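A short sketch of the conversion for several raw scores follows. It assumes that "normal rounding procedures" means rounding halves upward, which the article does not spell out; the function name `scaled_score` is hypothetical.

```python
from decimal import Decimal, ROUND_HALF_UP

def scaled_score(raw, a="1.25", k="37.5"):
    """Apply Score = a*raw + k and round to a whole number.

    Rounding halves upward is an assumption about what the article
    means by normal rounding.
    """
    exact = Decimal(a) * Decimal(raw) + Decimal(k)
    return int(exact.quantize(Decimal("1"), rounding=ROUND_HALF_UP))

# Perfect paper, class average, the worked example, and a blank paper.
for raw in (50, 34, 27, 0):
    print(raw, scaled_score(raw))
```

Decimal arithmetic is used here only so that values such as 71.25 and 37.5 are represented exactly before rounding.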

If a is the coefficient by which the raw score on the test is to be multiplied, and the constant to be added is k, then the following equations can be employed to determine their respective values:

a = (100 - Z) / (M - X)

and:

k = 100 - aM

where:

Z = the desired average score
M = the maximum possible score on the test
X = the average raw score on the test.
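The two general formulas translate directly into a small helper. This is a sketch only; the function and parameter names are hypothetical, with `max_score`, `mean_raw`, and `desired_mean` standing for M, X, and Z.

```python
def scaling_constants(max_score, mean_raw, desired_mean):
    """Return (a, k) with a*max_score + k = 100 and a*mean_raw + k = desired_mean."""
    if not mean_raw < max_score:
        raise ValueError("the average raw score must be below the maximum score")
    a = (100 - desired_mean) / (max_score - mean_raw)  # a = (100 - Z) / (M - X)
    k = 100 - a * max_score                            # k = 100 - aM
    return a, k

# The 50-item example: M = 50, X = 34, Z = 80.
a, k = scaling_constants(max_score=50, mean_raw=34, desired_mean=80)
print(a, k)  # 1.25 37.5
```

The guard against an average equal to the maximum reflects the division by M - X; in that case every pupil has a perfect paper and no scaling is needed.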

To complicate matters a bit for illustrative purposes, if a test consists of 77 items, and average performance on the test is 40 items correct, and the desired scaled-score average is 80, then a can be determined from:

a = (100 - 80) / (77 - 40) = 20/37 = 0.54

And:

k = 100 - (0.54 × 77) = 100 - 41.58 = 58.42

The equation for scaling scores on this test, then, is:

Score = 0.54 S + 58.42

For ease of computation this can be simplified to 0.5 S + 61.5 without doing too much violence to the original requirements, making the average converted score 81.5 instead of the specifically desired 80.
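The effect of rounding and of the simplification can be checked by evaluating each version of the equation at the class average (40) and at the maximum (77). This is a sketch for comparison only; the helper name `check` and the variant labels are hypothetical.

```python
def check(a, k, max_score=77, mean_raw=40):
    """Return (scaled mean, scaled maximum) for the line Score = a*S + k."""
    return a * mean_raw + k, a * max_score + k

exact_a = 20 / 37
variants = {
    "exact": (exact_a, 100 - exact_a * 77),  # unrounded coefficient
    "rounded": (0.54, 58.42),                # values used in the text
    "simplified": (0.5, 61.5),               # easier mental arithmetic
}

for name, (a, k) in variants.items():
    scaled_mean, scaled_max = check(a, k)
    print(f"{name:10s} mean={scaled_mean:.2f} max={scaled_max:.2f}")
```

All three versions send a perfect paper to 100. The rounded coefficients give an average of about 80.02, and the simplified ones give 81.5, as stated above.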

Since the method suggested here involves only linear transformation, no change takes place in the relative positions of individual scores on the distribution in standard-deviation units. A score at the mean on the original distribution will be at the mean of the scaled distribution, and scores will fall in identical quartiles on both distributions. Pupils seldom raise objections to increasing the distribution mean, since all scores are raised by the transformation except for those of pupils who answered all questions correctly; the latter group is small or unpopulated on most difficult tests and, in any event, is not usually given to complaining.
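The claim about standard-deviation units can be verified numerically. The sketch below uses ten hypothetical raw scores (not data from the article) and the 50-item equation Score = 1.25 S + 37.5; it relies on the multiplier being positive.

```python
from statistics import mean, pstdev

def z_scores(scores):
    """Express each score as a distance from the mean in standard deviations."""
    m, sd = mean(scores), pstdev(scores)
    return [(x - m) / sd for x in scores]

# Hypothetical raw scores on a 50-item test (illustrative data only).
raw = [22, 28, 30, 32, 34, 34, 35, 37, 38, 50]
scaled = [1.25 * s + 37.5 for s in raw]

# Compare standard-deviation positions before and after scaling.
same = all(abs(zr - zs) < 1e-9 for zr, zs in zip(z_scores(raw), z_scores(scaled)))
print(same)  # True
print(mean(raw), mean(scaled))
```

The raw mean of 34 maps to a scaled mean of 80, and each pupil keeps the same position relative to the rest of the class.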

Teachers employing this method should avoid referring to the transformed scores as percentages. They are not percentages, of course, in any sense, and simply calling them scores will usually prove satisfactory.
