
EMPIRICAL TEST OF A MODEL RELATING MAGNITUDE AND CATEGORY SCALES


HANNES EISLER

Psychological Laboratory, University of Stockholm, Sweden

EISLER, H. Empirical test of a model relating magnitude and category scales. Scand. J. Psychol., 1962, 3, 88-96. The function $K = \alpha \log(\psi + q/k) + \beta$ seems to describe the relation between category scale values $K$ and subjective magnitudes $\psi$. The additive constant $q/k$ is obtained from the SDs of the magnitude estimates. The model was empirically confirmed for the loudness and softness of white noise scaled by the methods of magnitude estimation and category rating.

When sensation is measured by direct methods, two kinds of scales can be obtained, depending on the procedures used (Stevens, 1960): (1) the magnitude scale, which presupposes a zero-point on the subjective continuum and is thus a ratio scale; (2) the partition scale, in which the zero-point is arbitrary and the scale is only an interval scale. A typical procedure for obtaining a magnitude scale is magnitude estimation; for obtaining a partition scale, the method of category rating has often been used.

This paper is concerned with the mathematical relation between these two subjective scales. The problem is by no means new; Stevens & Galanter (1957) demonstrated that for what Stevens (1957) defines as prothetic continua the function relating the category scale to the magnitude scale is concave downward. Since the category scale proved very susceptible to context effects produced by stimulus spacing and the relative number of presentations of the stimuli, Stevens & Galanter concluded that three factors interact in category rating: (1) the intent of the observer to equate intervals, (2) his discrimination, which, being better at the lower end of the range than at the upper end, biases the observer’s estimations, and (3) his expectations regarding the procedure, a factor that usually disposes the observer to give the same number of judgments for each category. The interaction among these factors is so complicated that a quantitative prediction of the category scale appears difficult to make.

In most of the continua they investigated, Stevens & Galanter did not use the same set of stimuli when they scaled by magnitude estimation and category rating. The relation between subjective scales may, however, prove invariant with respect to context effects, even if the relation between physical and subjective scales varies with the particular experimental condition. In the present experiments, in which loudness and softness of white noise were scaled, the same set of stimuli served for both scaling procedures. Implied is the assumption that the magnitude scale is not quite invulnerable to context effects either. That this assumption may be correct has been demonstrated in an experiment by J. C. Stevens (1958), whose observers gave a wider range of magnitude estimations for a small portion of the continuum scaled when he increased the number of stimuli within this portion.

Scand. J. Psychol., Vol. 3, 1962


Almost all studies of the relation between category and magnitude scales show that the category scale lies between a linear and a logarithmic function of the magnitude scale for prothetic continua. Elsewhere I have shown that this relation is to be expected, if (a) the category scale is a function of discrimination alone, and (b) discrimination is described as the linear generalization of Weber’s law for subjective continua (Eisler, 1962).

Let us assume that

$$\sigma = k\psi + q,$$

where $\sigma$ = standard deviation of $\psi$, $\psi$ = subjective magnitude, and $k$ and $q$ are constants. If we now substitute $d\psi$ for $\sigma$ and carry out a Fechnerian integration we obtain

$$K = \alpha \log(k\psi + q) + \beta', \tag{1}$$

where $K$ is the category scale, and $\alpha$ and $\beta'$ are constants.

Another way of looking at the problem was suggested by an experiment of Torgerson’s (1960) in which he scaled lightness and darkness of Munsell gray paper chips by the methods of magnitude estimation and category rating. As I have shown (Eisler, 1962), the category scale must be a logarithmic function of the magnitude scale,

$$K = \alpha \log \psi + \beta, \tag{2}$$

if the following three assumptions hold: (i) that a reciprocal (hyperbolic) relation holds between the two magnitude scales; (ii) that a complementary relation holds between the two category scales; and (iii) that the relation between the category scale and its corresponding magnitude scale is the same whether the attribute or its reciprocal is scaled.

The first assumption means that, if one stimulus of a pair is judged twice as light as the other, it ought also to be judged half as dark. The second assumption means that, if a stimulus is judged 2 on a 7-point category scale of lightness, it ought to be judged 6 on the corresponding scale of darkness. The third assumption means that the function relating category scale to magnitude scale ought to be the same for lightness and darkness.
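Both routes to the logarithmic form can be spelled out in a few lines. The sketch below is an illustration only: the proportionality constant $c_0$ of the Fechnerian step and the reciprocity constant $c$ are names introduced here, and part (b) merely checks that a logarithmic function is compatible with assumptions (i) to (iii); the proof that it is the only such function is given in Eisler (1962), not here.

```latex
% (a) Fechnerian integration of the linear generalization of Weber's law.
%     Assumption: equal category steps correspond to equal numbers of SDs,
%     dK = c_0 \, d\psi / \sigma, with c_0 a constant introduced for this sketch.
\begin{align*}
  \sigma &= k\psi + q, \\
  K &= c_0 \int \frac{d\psi}{k\psi + q}
     = \frac{c_0}{k}\,\ln(k\psi + q) + C
     = \alpha \log(k\psi + q) + \beta' .
\end{align*}

% (b) Compatibility of a logarithmic category scale with assumptions (i)-(iii).
%     (i)   \psi_S = c/\psi_L,
%     (iii) the same function K = \alpha\log\psi + \beta applies to both attributes.
\begin{align*}
  K_L + K_S
    &= (\alpha \log \psi_L + \beta) + (\alpha \log (c/\psi_L) + \beta) \\
    &= \alpha \log c + 2\beta = \text{const},
\end{align*}
% which is the complementary relation required by assumption (ii).
```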

Torgerson’s data seem to throw some light on the disagreement between eqs. (1) and (2). Assumption (i), requiring a reciprocal relation between the two magnitude scales, does not quite hold. If, however, the zero-point for the magnitude scale is defined as the point where variability vanishes (cf. Thurstone, 1928), that is to say, if

$$\psi' = \psi + q/k$$

is substituted for $\psi$ in eq. (2), eq. (1), written slightly differently, is recovered:

$$K = \alpha \log(\psi + q/k) + \beta. \tag{1}$$

At the same time, the reciprocal relation that did not hold for $\psi$ can be expected to hold for $\psi'$.

In order to test the model and study its invariance as well as invariances of some of the parameters, three series of experiments were carried out. Each series consisted of four experiments: the loudness and softness of white noise were each scaled by the methods of magnitude estimation and category rating. In what I would like to call the basic experimental series, series I, the stimuli were spaced approximately logarithmically, though with a slight bunching at the higher end of the range in order to approximate what Stevens & Galanter (1957) refer to as the pure category scale. The series was carried out in an ordinary room whose open door led to a noisy corridor. In series II approximately sone spacing was used. Series III used the stimulus spacing of series I, but took place in a sound-proofed room. Within each series the same set of stimuli was used for all four experiments. The number of stimuli used (10) and the physical range covered by the stimuli (30 to 100 db re 0.0002 μbar) was unchanged between the series.


The following predictions were tested:

(1) The intra-individual standard deviations computed from magnitude estimation data can be described adequately as a linear function of the magnitude scale for loudness and softness in all three conditions, i.e., the linear generalization of Weber’s law holds for the subjective continua studied ($\sigma = k\psi + q$).

(2) Category values for loudness and softness are complementary for all three series, i.e., $K_L + K_S = \text{const}$. (The subscripts L and S refer to loudness and softness, and the category values are taken for the same stimulus.)

(3) The reciprocal relation $\psi_L = c/\psi_S$ or $\log \psi_L + \log \psi_S = \text{const}$ (where $\psi_L$ and $\psi_S$ are always taken for the same stimulus) will not hold for the magnitude estimates obtained. It will hold for all three series after the magnitude estimates have been ‘corrected’ by adding the constant $q/k$.

(4) The relation expressed in eq. (1), $K = \alpha \log(\psi + q/k) + \beta$, will hold for loudness and softness under all three conditions and thus be invariant with respect to stimulus spacing.

(5) The $\psi$ intercept $-(q/k)$ in the linear generalization of Weber’s law will be related to the threshold. The parameter $q/k$ will be smaller for loudness and unchanged for softness in series III (carried out in the sound-proofed room) compared with series I and II.

Predictions (1) to (4) derive from the model, whereas prediction (5) derives from an attempt to interpret the parameter $q/k$. As is shown below, all the predictions except the last proved correct.

PROCEDURES

The observers listened to a band of white noise (75 to 2400 cps) through earphones. They presented themselves with the stimulus by means of a switch after indication by the experimenter and listened to the noise as long as they wanted. They gave their judgments orally after turning off the noise. In series III a loudspeaker had to be used for communication from observer to experimenter, since the observer was alone in the sound-proofed room.

In the magnitude estimation experiments, a stimulus of medium intensity (the same intensity was used for the loudness and for the softness estimations: 70 db for series I and III, 88 db for series II) was presented to the observer and called 10 (the standard). The observer was asked to estimate a series of noise intensities so that the proportion between the numbers given and 10 reflected the proportion between the stimulus presented and the standard. In the category rating experiments, the observer was presented with the weakest and strongest intensities (30 and 100 db in all three series) and informed that they were called 0 and 6 (for loudness) or 6 and 0 (for softness). The observers were asked to assign to each stimulus noise an integer between 0 and 6 so that subjective intervals between successive numbers were equal.

Each of the experiments of the first two series was started with preliminary trials since the subjects found it difficult to estimate softness. Most of the subjects slipped occasionally into loudness judgments. Clear mistakes were corrected by the experimenter and the experiment proper was not started (i.e., the judgments were not recorded) until the experimenter was convinced that the observer had developed the appropriate attitude. After that no more corrections were made, and every judgment was recorded.

The same twelve observers were used within each series of four experiments and in the first two series. One observer was replaced by a new observer in the third series. No observer took part in more than one experiment on any one day. The order of experiments within a series was different for each observer. In each experiment the 10 stimuli were presented four times in random order that differed for each observer. The intervals between the three series of experiments were 1 to 2 months; the series were carried out in the order I, II, III and thus were not randomized between observers.


TABLE 1. Geometric means of subjective magnitudes (ψ) and their intra-individual SDs (σ), and arithmetic means of category values (K), for loudness and softness of white noise under three experimental conditions. Stimulus levels are in db re 0.0002 μbar.

Series I
          Loudness                    Softness
db        ψ        σ       K          ψ        σ       K
30        0.671    0.22    0.02       54.43    8.65    5.96
40        1.728    0.64    0.52       35.88    7.82    5.46
50        3.303    1.37    1.23       25.23    6.17    4.96
60        5.714    1.82    1.79       18.41    4.19    4.46
70        9.650    2.87    2.50       11.56    2.65    3.75
80        17.69    3.53    3.50       7.771    2.00    2.71
85        25.54    5.08    4.13       4.499    1.47    2.17
90        35.51    5.17    4.81       2.626    1.00    1.33
95        51.13    8.24    5.56       1.098    0.56    0.42
100       62.41    8.59    5.98       0.409    0.21    0.04

Series II
          Loudness                    Softness
db        ψ        σ       K          ψ        σ       K
30        0.795    0.24    0.00       60.19    7.97    6.00
69        4.064    1.17    1.54       23.68    4.54    4.48
79        7.289    1.74    2.69       13.55    4.37    3.52
84        10.40    2.14    3.21       10.03    2.64    2.94
88        13.95    2.80    3.83       7.820    1.79    2.31
92        21.10    5.30    4.73       5.364    1.59    1.46
94        25.25    5.80    5.06       2.971    1.68    1.33
96        31.12    5.03    5.46       2.296    0.87    0.79
98        37.44    6.68    5.85       1.120    0.52    0.25
100       44.35    7.03    5.96       0.811    0.35    0.06

Series III
          Loudness                    Softness
db        ψ        σ       K          ψ        σ       K
30        0.863    0.32    0.00       52.00    8.51    5.96
40        1.637    0.91    0.48       35.34    7.60    5.60
50        3.454    1.31    1.10       26.27    9.29    4.98
60        5.429    1.71    1.73       19.77    5.96    4.50
70        9.295    2.35    2.33       12.63    3.15    3.88
80        17.44    4.23    3.23       7.327    1.98    2.90
85        24.49    5.37    3.94       4.761    1.58    2.35
90        33.45    7.06    4.48       3.129    1.08    1.60
95        48.44    9.81    5.44       1.530    0.68    0.75
100       69.75    8.85    5.92       0.662    0.35    0.06


TABLE 2. Number of assignments to each category of loudness and softness of white noise under three experimental conditions.

            Series I              Series II             Series III
Category    Loudness  Softness    Loudness  Softness    Loudness  Softness
0           82        80          48        109         83        65
1           69        59          26        91          69        63
2           57        55          50        79          73        49
3           56        60          58        64          60        63
4           67        62          74        63          66        72
5           71        81          100       26          58        82
6           78        83          124       48          71        86

RESULTS

The results of the 12 experiments are given in Table 1. For magnitude estimations, the geometric means were computed for all observers and trials, and for category ratings the arithmetic means.

The computation of the intra-individual standard deviations from magnitude estimation data raised some problems. The distributions are skewed (if the linear generalization of Weber’s law is assumed), slightly so for low values and more markedly for high values. Whereas SDs computed on the magnitude estimations themselves give the better approximation for low values, a logarithmic transformation may give a closer approximation for high values. But since the variability within single observers is relatively small and the logarithmic transform thus would have little effect, the following compromise was decided on, in which one calculating procedure was used throughout. The four estimates given by an observer for a stimulus were multiplied by a factor that rendered the arithmetic mean of the four values 1. From these four new values the variance was computed for every stimulus and every observer separately. The variances were averaged over observers, and the square root of the averaged variance was multiplied by the corresponding magnitude in order to obtain the SD ($\sigma$). (The corresponding procedure for averaging magnitudes would have been to calculate the arithmetic mean for each observer and the geometric mean between observers instead of computing the geometric mean over all values. Because of the comparatively small variability within observers, the results of this procedure would not have been much different.)
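The averaging procedure just described can be sketched as follows. This is a minimal illustration, not the original computation: the function name `relative_sd` and the example estimates are hypothetical, and the use of the sample variance (n - 1 denominator) is an assumption, since the text does not say which variance formula was used.

```python
from math import sqrt
from statistics import geometric_mean, mean, variance


def relative_sd(estimates_by_observer):
    """Pooled relative SD for ONE stimulus.

    estimates_by_observer: one list of repeated magnitude estimates per observer.
    """
    variances = []
    for estimates in estimates_by_observer:
        m = mean(estimates)
        # Rescale so that the arithmetic mean of this observer's estimates is 1.
        scaled = [e / m for e in estimates]
        # Assumption: sample variance; the source does not specify the denominator.
        variances.append(variance(scaled))
    # Average the variances over observers, then take the square root.
    return sqrt(mean(variances))


# Hypothetical data: three observers, four estimates each, for a single stimulus.
estimates = [[8, 10, 12, 10], [18, 20, 22, 20], [5, 5, 5, 5]]

# Scale value for the stimulus: geometric mean over all observers and trials.
psi = geometric_mean([e for observer in estimates for e in observer])

# Intra-individual SD on the magnitude scale.
sigma = relative_sd(estimates) * psi
print(f"psi = {psi:.3f}, sigma = {sigma:.3f}")
```

Because each observer's estimates are normalized before the variance is taken, an observer who uses numbers twice as large as another contributes the same relative variability, which is why the result can be multiplied by the group magnitude afterwards.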

Table 2 gives the number of judgments obtained for each category. As can be seen, the category scales obtained for both loudness and softness in series I and III are not very different from the ‘pure’ category scale, whereas the sone spacing of series II gave a scale that deviated conspicuously from it. [Stevens & Galanter (1957) have suggested that the pure category scale may elicit the same number of judgments for each category after an iterative procedure in which the stimulus spacing is adjusted on the basis of previous category judgments.]
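One rough way to quantify "not very different from the pure category scale" is to compare each column of Table 2 with a uniform distribution of judgments over the seven categories. The sketch below uses the counts from Table 2; the deviation measure (largest absolute departure from the uniform count) is a choice made for this illustration, not a statistic used in the paper.

```python
# Counts per category (0..6), transcribed from Table 2.
counts = {
    "I L": [82, 69, 57, 56, 67, 71, 78],
    "I S": [80, 59, 55, 60, 62, 81, 83],
    "II L": [48, 26, 50, 58, 74, 100, 124],
    "II S": [109, 91, 79, 64, 63, 26, 48],
    "III L": [83, 69, 73, 60, 66, 58, 71],
    "III S": [65, 63, 49, 63, 72, 82, 86],
}

# 12 observers x 10 stimuli x 4 presentations per experiment.
TOTAL_JUDGMENTS = 12 * 10 * 4


def max_deviation_from_uniform(column):
    """Largest absolute difference between a category count and the uniform count."""
    uniform = sum(column) / len(column)
    return max(abs(n - uniform) for n in column)


for label, column in counts.items():
    # Each column should account for every judgment in its experiment.
    assert sum(column) == TOTAL_JUDGMENTS
    print(f"{label:6s} max deviation = {max_deviation_from_uniform(column):.1f}")
```

On this measure the two series II columns depart from uniformity far more than any column of series I or III, in line with the description above.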

Fig. 1 shows the standard deviation as a function of subjective magnitude. In all the figures Roman numerals refer to the experimental series and L and S to loudness and softness. The linear generalization of Weber’s law seems to hold pretty well, when we take into account that the scatter around the straight lines is due to second-order variability. The straight lines are calculated by means of a variation of the method of least squares, in which the sum of the squares of the relative deviations $\sum[(y - y_1)/y]^2$ is minimized rather than the sum of the squares of the absolute deviations $\sum(y - y_1)^2$ ($y$ = experimental and $y_1$ = computed value). Because of the great range covered by the dependent variable, the common method of least squares would have practically neglected the points at the lower end of the range.

FIG. 1. Intra-individual SDs as a function of subjective magnitude for loudness (L) and softness (S) of white noise under experimental conditions I, II and III. The straight lines are computed by a variation of the method of least squares.

In the computation of the lines the two extreme points were not included. The justification of this procedure is that the SDs for the extreme stimuli seem too low compared with the other eight stimuli, possibly because they are recognized more often by the observers, which would yield a spuriously low SD. This point of view seems to be strengthened by the data from series II. Because of the stimulus spacing in series II, the lowest stimulus is much more conspicuous than the highest, and the point corresponding to it falls far below the lines in both II L and II S in Fig. 1, whereas the point corresponding to the highest stimulus does not deviate noticeably more from the lines than the other eight points.
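The relative least-squares fit can be sketched as a weighted linear regression with weights $1/y^2$, where $y$ is the observed SD. The example below uses the series I loudness values from Table 1 with the two extreme stimuli left out, as described above; the function name and the closed-form normal equations are choices made for this illustration and are not taken from the paper.

```python
def fit_relative_least_squares(psi, sigma):
    """Fit sigma = k * psi + q by minimizing sum(((sigma - fit) / sigma) ** 2).

    This is ordinary weighted least squares with weights 1 / sigma**2.
    """
    w = [1.0 / s**2 for s in sigma]
    sw = sum(w)
    swx = sum(wi * x for wi, x in zip(w, psi))
    swxx = sum(wi * x * x for wi, x in zip(w, psi))
    swy = sum(wi * y for wi, y in zip(w, sigma))
    swxy = sum(wi * x * y for wi, x, y in zip(w, psi, sigma))
    det = sw * swxx - swx**2
    k = (sw * swxy - swx * swy) / det       # slope
    q = (swxx * swy - swx * swxy) / det     # intercept on the sigma axis
    return k, q


# Series I, loudness (Table 1), stimuli 40 to 95 db: the 30 and 100 db
# points are excluded, as in the text.
psi_I_L = [1.728, 3.303, 5.714, 9.650, 17.69, 25.54, 35.51, 51.13]
sigma_I_L = [0.64, 1.37, 1.82, 2.87, 3.53, 5.08, 5.17, 8.24]

k, q = fit_relative_least_squares(psi_I_L, sigma_I_L)
print(f"k = {k:.3f}, q/k = {q / k:.2f}")
```

Weighting by $1/y^2$ gives the small SDs at the low end of the range as much influence as the large ones at the high end, which is the stated reason for preferring relative to absolute deviations.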

Table 3 gives the slopes $k$ and $\psi$ intercepts $-(q/k)$ for all magnitude estimation experiments. It can be seen that the changes in the parameters with the particular experimental condition are small and probably not significant.

TABLE 3. Slope k and ψ intercept q/k for the linear function relating intra-individual SDs to subjective magnitudes for loudness and softness of white noise under three experimental conditions.

        Series I              Series II             Series III
        Loudness  Softness    Loudness  Softness    Loudness  Softness
k       0.16      0.21        0.17      0.22        0.19      0.24
q/k     3.1       1.7         2.7       1.6         3.1       1.3


FIG. 2. Category scale of softness as a function of category scale of loudness under three experimental conditions. The straight lines have a slope of -1 and are fitted by eye.

Fig. 2 illustrates the validity of the complementarity assumption, i.e., prediction (2). The straight lines have a slope of -1 and are fitted by eye. It is worth pointing out that the lines do not pass through the points (0; 6) and (6; 0), as would be expected, a discrepancy called the ‘end effect’.

Fig. 3 illustrates the validity of the reciprocity assumption, i.e., prediction (3). Crosses refer to the magnitude estimates obtained, and circles refer to magnitudes ‘corrected’ by the addition of the parameter $q/k$. The straight lines have a slope of -1 and are fitted by eye.
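Predictions (2) and (3) can also be checked numerically from the tabled values. The sketch below does this for series I, using the means from Table 1 and the $q/k$ values from Table 3; summarizing each relation by the spread (maximum minus minimum) of the quantity that should be constant is a choice made for this illustration, whereas the paper judges the fit graphically.

```python
from math import log10

# Series I means from Table 1, stimuli ordered from 30 to 100 db.
psi_L = [0.671, 1.728, 3.303, 5.714, 9.650, 17.69, 25.54, 35.51, 51.13, 62.41]
psi_S = [54.43, 35.88, 25.23, 18.41, 11.56, 7.771, 4.499, 2.626, 1.098, 0.409]
K_L = [0.02, 0.52, 1.23, 1.79, 2.50, 3.50, 4.13, 4.81, 5.56, 5.98]
K_S = [5.96, 5.46, 4.96, 4.46, 3.75, 2.71, 2.17, 1.33, 0.42, 0.04]

# q/k for series I loudness and softness, from Table 3.
QK_L, QK_S = 3.1, 1.7


def spread(values):
    """Range of a list: 0 would mean the quantity is exactly constant."""
    return max(values) - min(values)


# Prediction (2): K_L + K_S should be constant across stimuli.
category_sums = [kl + ks for kl, ks in zip(K_L, K_S)]

# Prediction (3): log psi_L + log psi_S should be constant only after correction.
log_sums_raw = [log10(l) + log10(s) for l, s in zip(psi_L, psi_S)]
log_sums_corrected = [log10(l + QK_L) + log10(s + QK_S) for l, s in zip(psi_L, psi_S)]

print(f"spread of K_L + K_S:                 {spread(category_sums):.2f}")
print(f"spread of log sums, uncorrected:     {spread(log_sums_raw):.2f}")
print(f"spread of log sums, q/k added:       {spread(log_sums_corrected):.2f}")
```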

Testing the correctness of the relation expressed in eq. (1), $K = \alpha \log(\psi + q/k) + \beta$, was the main objective of the present investigation. This relation can be regarded in two ways: as a Fechner integral of the generalized Weber law, or as a logarithmic function of the (in some sense) ‘true’ subjective magnitudes (i.e., estimates corrected by the addition of $q/k$).

These two methods are illustrated in Fig. 4. In illustration of the logarithmic function, the category values are plotted against the logarithm of magnitude values obtained directly (crosses) and ‘corrected’ (circles). In these graphs, the straight lines are fitted by eye. To illustrate the Fechnerian integral, category values have been computed from eq. (1). The parameters $\alpha$ and $\beta$ were calculated in two ways: (a) by assuming that the extreme stimuli would yield exactly the category values 0 and 6 (solid curves constituting a pure prediction of the category scales from magnitude estimation data); and (b) by the method of least squares (dashed curves). It is clear that eq. (1) adequately describes the relation between the two scales, and that this relation is invariant with stimulus spacing, as stated in prediction (4).
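The two ways of obtaining $\alpha$ and $\beta$ can be sketched for one condition. The example uses the series I loudness means from Table 1 and $q/k = 3.1$ from Table 3; base-10 logarithms and the function names are assumptions of this sketch, since the paper does not state the base (a different base only rescales $\alpha$).

```python
from math import log10

# Series I loudness means from Table 1, stimuli ordered from 30 to 100 db.
psi = [0.671, 1.728, 3.303, 5.714, 9.650, 17.69, 25.54, 35.51, 51.13, 62.41]
K_obs = [0.02, 0.52, 1.23, 1.79, 2.50, 3.50, 4.13, 4.81, 5.56, 5.98]
Q_OVER_K = 3.1  # from Table 3

# Abscissa of eq. (1): log of the 'corrected' magnitudes.
x = [log10(p + Q_OVER_K) for p in psi]


def fit_endpoints(x, k_low=0.0, k_high=6.0):
    """Method (a): force the extreme stimuli onto categories 0 and 6."""
    alpha = (k_high - k_low) / (x[-1] - x[0])
    beta = k_low - alpha * x[0]
    return alpha, beta


def fit_least_squares(x, y):
    """Method (b): ordinary least-squares line through (x, y)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    alpha = sxy / sxx
    beta = my - alpha * mx
    return alpha, beta


alpha_a, beta_a = fit_endpoints(x)
alpha_b, beta_b = fit_least_squares(x, K_obs)

# Category values predicted from magnitude data alone (method a).
K_pred = [alpha_a * xi + beta_a for xi in x]
max_error = max(abs(p - o) for p, o in zip(K_pred, K_obs))

print(f"(a) alpha = {alpha_a:.2f}, beta = {beta_a:.2f}")
print(f"(b) alpha = {alpha_b:.2f}, beta = {beta_b:.2f}")
print(f"largest |predicted - observed| under (a): {max_error:.2f}")
```

Method (a) uses no category data beyond the two anchor values, so its agreement with the observed means is a prediction; method (b) is a descriptive fit and will generally differ slightly from (a).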

FIG. 3. Log subjective softness as a function of log subjective loudness for experimental conditions I, II and III. Crosses refer to the uncorrected magnitudes, whereas magnitudes corrected by adding the parameter $q/k$ are described by circles. The straight lines have a slope of -1 and are fitted by eye.

FIG. 4. The category scale as a function of the magnitude scale for loudness (L) and softness (S) of white noise under three experimental conditions. The relation is described in two ways: (a) A linear relation obtains between the category scale and the log of the magnitude scale corrected by the addition of the parameter $q/k$ (circles). For comparison, the uncorrected values of the magnitude scale are given as crosses in the same plot. The straight lines are fitted by eye. (b) A function that is concave downward relates the category scale to the magnitude scale in linear coordinates. The curves constitute predictions made from magnitude estimation data and their intra-individual SDs. The continuous curve assumes that the extreme stimuli yield exactly categories 0 and 6, whereas the dashed curves constitute a least squares fit.

DISCUSSION

The proposed model is confirmed in that eq. (1) seems to describe the category scale rather well. The small systematic discrepancies between the continuous and the dashed curves in Fig. 4 are probably due to the end effect. The end effect may be related to the finding that the SDs for the two extreme stimuli fall below the lines of Fig. 1. The proposed model in its present form makes no provision for the end effect.

Of course, the validity of the model for other continua has still to be investigated. Also problematic at present is the invariance of the model with (1) stimulus range, (2) the position of the range on the continuum investigated (i.e., whether near or far from the absolute threshold), and (3) number of stimuli.

An interesting problem is raised by an experiment carried out by Judith Rich in this laboratory, in which roughness and smoothness of sandpaper were scaled. Miss Rich’s results indicate that the reciprocal relation between opposite continua improves when the magnitude estimates are carried out without a designated standard stimulus. The model described here would predict a corresponding decrease of the parameter q/k with other relations unchanged. This prediction needs to be tested.

The parameter q/k constitutes an interesting problem. I thought it might be related to the noise level on which the stimulus noise is superimposed. Hence the prediction that q/k would be smaller in the sound-proofed room, where the general noise level is lower. As can be seen from Table 3, q/k seemed to be largely unaffected by the experimental conditions.

This research was supported by a grant from the National Science Foundation (Psycho-Acoustic Laboratory Report PNR-262) and was done while the author was Research Fellow at the Psycho-Acoustic Laboratory, Harvard University, Cambridge, Mass., U.S.A.

REFERENCES

EISLER, H. (1962). On the problem of category scales in psychophysics. Scand. J. Psychol., 3, 81-87.

STEVENS, J. C. (1958). Stimulus spacing and the judgment of loudness. J. exp. Psychol., 56, 246-250.

STEVENS, S. S. (1957). On the psychophysical law. Psychol. Rev., 64, 153-181.

STEVENS, S. S. (1960, 1961). The psychophysics of sensory function. Amer. Scientist, 48, 226-253. Also in W. A. ROSENBLITH (Ed.), Sensory communication. New York: MIT Press & Wiley. Pp. 1-33.

STEVENS, S. S. & GALANTER, E. H. (1957). Ratio scales and category scales for a dozen perceptual continua. J. exp. Psychol., 54, 377-411.

THURSTONE, L. L. (1928). The absolute zero in intelligence measurement. Psychol. Rev., 35, 175-197.

TORGERSON, W. S. (1960). Quantitative judgment scales. In H. GULLIKSEN & S. MESSICK (Eds.), Psychological scaling. New York: Wiley. Pp. 21-31.
