DOCUMENT RESUME ED 419 005 TM 028 302 AUTHOR van der … · 2014. 5. 19. · TITLE Multidimensional Adaptive Testing with a Minimum. Error-Variance Criterion. INSTITUTION Twente Univ.,

DOCUMENT RESUME

ED 419 005 TM 028 302

AUTHOR van der Linden, Wim J.TITLE Multidimensional Adaptive Testing with a Minimum

Error-Variance Criterion.INSTITUTION Twente Univ., Enschede (Netherlands). Faculty of Educational

Science and Technology.REPORT NO RR-97-03PUB DATE 1997-00-00NOTE 20p.

AVAILABLE FROM Faculty of Educational Science and Technology, University ofTwente, P.O. Box 217, 7500 AE Enschede, The Netherlands.

PUB TYPE Reports Evaluative (142)EDRS PRICE MF01/PC01 Plus Postage.DESCRIPTORS Ability; *Adaptive Testing; Algorithms; *Computer Assisted

Testing; Criteria; Foreign Countries; *Maximum LikelihoodStatistics; Selection

IDENTIFIERS *Multidimensionality (Tests)

ABSTRACTThe case of adaptive testing under a multidimensional

logistic response model is addressed. An adaptive algorithm is proposed thatminimizes the (asymptotic) variance of the maximum-likelihood (ML) estimatorof a linear combination of abilities of interest. The item selectioncriterion is a simple expression in closed form. In addition, it is shown howthe algorithm can be adapted if the interest is in a test with a "simpleinformation structure." The statistical properties of the adaptive MLestimator are demonstrated for a two-dimensional item pool with severallinear combinations of the two abilities. (Contains 1 figure and 15references.) (Author/SLD)

********************************************************************************* Reproductions supplied by EDRS are the best that can be made *

* from the original document. *

********************************************************************************

Oo

Multidimensional Adaptive Testing Researchwith a Minimum Error-Variance Criterion Report

97-03

Wim J. van der Linden

U.S. DEPARTMENT OF EDUCATIONOffice of Educational Research and Improvement

EDUCATIONAL RESOURCES INFORMATIONCENTER (ERIC)0.."document has been reproduced as

received from the person or organizationoriginating it.

1:1 Minor changes have been made toimprove reproduction quality.

Points of view or opinions stated in thisdocument do not necessarily representofficial OERI position or policy.

BEST COPY AVAILABLE

2

Multidimensional Adaptive Testing

with a Minimum Error-Variance Criterion

Wim J. van der Linden

3

Abstract

The case of adaptive testing under a multidimensional logistic response model is addressed.

An adaptive algorithm is proposed that minimizes the (asymptotic) variance of the maximum-

likelihood estimator of a linear combination of abilities of interest. The item selection

criterion is a simple expression in closed form. In addition, it is shown how the algorithm can

be adapted if the interest is in a test with a "simple information structure". The statistical

properties of the adaptive ML estimator are demonstrated for a two-dimensional item pool

with several linear combinations of the two abilities.

Multidimensional Adaptive Testing - 2

Multidimensional Adaptive Testing

with a Minimum Error Variance Criterion

Adaptive testing algorithms for item pools calibrated under a unidimensional item

response theory (IRT) model have been well investigated (e.g., Lord, 1980; Wainer, 1990),

and several large-scale testing programs are in the process of introducing adaptive testing as

an alternative to traditional paper-and-pencil tests. Since these programs need large item pools

to guarantee measurement precision, in particular if measures to balance test content and

control item exposure are implemented, violations of the assumption of unidimensionality of

the item pool can be expected. Study of algorithms for adaptive testing under amultidimensional model seems therefore a timely matter.

The present paper is a sequel to van der Linden (1996) in which the problem ofoptimal assembly of a fixed test form from an item pool measuring multiple abilities is

addressed. The emphasis in the work underlying this earlier paper was on an algorithm for

assembling the fixed form to match optimally a set of targets for the (aymptotic) error

variance functions for the abilities subject to a large variety of constraints on the composition

of the test. It is the purpose of the present paper to study the use of the error variance as an

item-selection criterion in tests with an adaptive format. Independent results on

multidimensional adaptive testing are presented in Fan and Hsu (April, 1996) and Segall

(1996). The interest in the former is in investigating the differences between item selection

criteria based on various types of multivariate information measures rather than the error

variance of the estimator(s). Also, these measures are evaluated over random sampling of

correlated abilities. In the latter, the volumes of the confidence ellipsoid and the posterior

credibility ellipsoid are proposed as multivariate item selection criteria. The posterior

credibility ellipsoid is an attractive criterion because it allows for the possibility to build prior

knowledge on dependencies between the ability variables into the item selection procedure.

We will return to this point later in the paper.

The paper is organized as follows: The following section introduces the

multidimensional IRT logistic model used in the presentation of the algorithm and motivates a

linear combination of abilities as the parameter of interest in multidimensional adaptive

testing. The subsequent section discusses the (asymptotic) variance of the estimator of a linear

combination of ability parameters. Then it is proposed to minimize the variance of thisestimator as a criterion for multidimensional adaptive testing, and an adaptive algorithm

minimizing the variance is presented. The algorithm involves expressions of the item

parameters which are easy to evaluate. The last section demonstrates the use of the algorithm

for a two-dimensional item pool and investigates the statistical properties of the adaptive

estimator for various linear combinations of the two abilities.

5

Multidimensional Adaptive Testing 3

Multidimensional Model

Dichotomous response variables U; are used to denote the responses of an examinee to

item i=1,...,n. The variables take the value 1 if the response is correct and the value 0 if it is

incorrect. The model is the following multivariate logistic response function:

exp(ai 'a di)Pi(a) = Prob{U, =110,ai,di} =

1+ exp(ai '0-d;)(1)

where 0 =( ), with -oo < 0, <oo for j=1,...,m, is a vector of m ability variables,

ai with a,i>0 for j=1,...,m, is the vector of loadings of item i on these abilities

(item discriminations), and -.0<di<oo is a scalar representing a linear combination of the

difficulties of the item along the ability dimensions. Detailed information about the model is

given in McKinley and Reckase (1983), Reckase (1985, 1996), and Samejima (1974).

It is assumed that the item parameters a, and d, have already been estimated, and that

the estimates are sufficiently accurate to consider them as the true parameter values. The

parameters can be estimated using the Bayesian methods implemented in the program

TESTFACT (Wilson, Wood, & Gibbons, 1984), or through McDonald's (1996) harmonic

analysis applied to a normal approximation to the logistic function implemented in the

program NOHARM (Fraser & McDonald, 1988).

Parameter of Interest

. It is assumed that the parameter to be estimated by the adaptive testing procedure is a

linear combination of the abilities, a 19, where 2 =( A, ,..., ) is a vector of nonnegative

weights. The choice of this parameter is motivated by the following practical cases:

1. The item pool is intentionally designed to measure more than one ability. However,

the consumers of the test scores only want a single number to be reported. An

obvious example is an item pool for a test to predict a future criterion of success in a

selection problem, where the criterion is multifaceted. In this case, the weights A, are

to reflect the relative importance of the individual abilities with respect to thecriterion.

2. The item pool is designed to measure only one ability but the items are sensitive to

some "nuisance abilities" as well. A well-know example is a test for mathematical

ability depending on verbal abilities required to understand the items. This case can

6 'nor 7, 0 FY AVAILABLE


be dealt with by setting the weights Ai =0 for all nuisance abilities. As will become

clear below, this measure does not neutralize the effect of the values of the nuisance

parameters on the variance of the estimator of the intended ability but does allow for

direct minimization of this variance.

3. Even though the item pool measures several abilities, different sections of the test

may be required to be maximally informative with respect different abilities, for

example, because identifiable subsections of the tests are to be used for diagnosing

individual abilities. As will be shown below, an adaptive test with a "simple ability

structure" over the ability space can be realized by choosing different values for the

weights A. at different stages of the procedure. Note that this case is not equivalent

to the one of choosing items from different unidimensional item pools; rather than

rotating the selection of the items across unidimensional pools the weights A, are

rotated across different preselected values while selecting items from a

multidimensional pool.

For an extended description of the above cases of multidimensional testing using the format of

a fixed-form test, see van der Linden (1996).

Ability Estimation

It is assumed that A 'a is estimated by the method of MLE. As is well known, for

MLE it holds that

A '0 = A '6. (2)

For a response pattern (u1,...,un), the likelihood of 6 is defined as

L(0;u1,...,un,a1,...,an,d1,...,d.) = f Prob{U, = u,I0,ai ,c1,} (3)i=1

The joint MLE of 0,, j=1,...,m, is the vector of values maximizing this likelihood. The

likelihood equations are obtained setting the partial derivatives of the log of (3) equal to zero:

a InL n

a 0- IL In p

I

(6) + (1- th)ln(1 -pi (0)) = 0, j=1,...,m,,

7

(4)

Using

La79, Pi (6)[1- Pi (0)i,

the likelihood equations can be written as


(5)

Pi (2)] = 0, j=1,...,m,. (6)

which is the common form known to exist for a model belonging to the exponential family

(Andersen, 1980, sect. 3.2). The system can be solved using Newton's method.

Bo-o _[}{(00_,)) TI 100-0)

where H(B` -D) is the Hessian of the log-likelihood function with elements

a' lnL - -/a, a h P P, (0)] , g,h = 1,...,m.ae ggaeb

(7)

(8)

and y(0(1-1) is the gradient of the log-likelihood function, i.e., the vector with the first

derivatives set equal to zero in (6), both evaluated at step t-l. Substitution of the results from

(7) into (2) gives the MLE of A '0.

Variance of A '0

The asymptotic covariance matrix of the MLE of 0 is given by the inverse of Fisher's

information matrix

I(0 ) (-Ea2InL(0;UI,...,Un,a1,...,an,c11,...,dn)),

aegaoh(9)

with I, 0; U1, , tin ai,.. , an , di ,...,d.) being the likelihood statistics associated with the

random response vector and 0 p and 0, any two components of 0 . From (8), it follows that

8


-ED2 lnL(0; Ul, , ar, , ,..., dr,)

Ea', ad, (12A1 p, (0)] .aegaeh 1=1

Standard techniques for matrix inversion yield the (asymptotic) covariance matrix

V a Var(610 ) = I(0)1,

(10)

where the determinant of the information matrix, 140)1, is assumed to be not vanishing. For

the linear combination A '0

Var( A '010) = A 'VA. (12)

For a model with two ability parameters, 0 = ( 01,82 ), and (Al ,A2) (A,1- A) , the

result in (12) boils down to

where

Var(A.6,+ (I- a..)6218,,e2)

A2var(6,1e.,02) + (1-A)2var(621e,,e2) + 2A.(1--)cov(6,,6210,02)n n

= [A? y P, (01,02){ 1 (61,02)} + (1- A.)2E4Pi(ei,02){ 1 (01,02)}1=1 1=1

+ 2,1(1 A) E ail a,2 P, (01,02) { 1 Pi (01,02) } VII(01,02)1,1=1

n n

140,,e2)1 = [E4P,(8,,e2){ P, (e,,e2)}1[E4 (e,,e2){ P, (01,82) }]1=1 1=1

[E ail ai2 P,(01,02){ Pi (o, ,e2)}121=1

Note that for n=2 the two items should not be parallel, i.e., it should not hold that

ali=a21,

a12 =a22,

d1=d2,

9

(13)

(14)

(15)


because then the determinant in (13) vanishes.

Adaptive Testing Algorithm

For notational convenience, the adaptive testing algorithm is also presented for the

case of two ability parameters. The following definitions are needed: The items in the pool are

indexed by 1=1,...,1. The adaptive testing procedure is assumed to be stopped after n items

have been administered. The order of the items in the test is indexed by k=1,...,n. Thus, ik is

the index of the item in the pool administered as the kth item in the test. Suppose k-1 items

have been selected. Let Sk={ ii,,ik-I} denote this set of items. Then, Rk= \Sk is the set

of items rejected so far, and item ik has to be selected from this set. Finally, let 6; and 62 be

the estimators of 01 and 02 after k items have been administered.

The kth item is selected according to the following criterion:

mink, {Var(A 6; + (1 -A.)(k 16:`-' ,111-1)} , (16)

that is, the item is selected to minimize the variance of A + (1- A) 62 evaluated at the

current estimates, which is (13) for (el , 02) = (-1,6k2-1)

To implement the criterion, define

k-I

= 14, Pig (61k 1,6Z ')( Pit ,6,`g=1

+ Pi, (61'1,612'1)f 1- P (6;4,612'4)/

k-1vj Ea 01C-I ,62i){ (61k-1,62k-I))

g=1

+ a?213,,(6r ,6 2') 1 Pi (61" ,6 12'4)}

k-I

Wi = ay aid Pi, (6i-1,6Z-1){ 1- Pis (6r ,6Z-1)}g=1

+ a,,21),, (611 ,612) I ,V21))

The criterion can thus be expressed in closed form as

..,17 0

(17)

(18)

(19)


ik = min,{[A2vi+(1-A) u -2A(1-A)w1/[u )2]; j e Rk} (20)

In order to select the next item, for each item in Rk one term is added to the sums in (17)-(19).

These term involves both the parameters a11, ap, and d, and the probability P;(01,02), where

the last quantity is evaluated at the current estimates e,=6," and 02=62" . The item

minimizing the expression in (20) is selected.

Algorithm. The algorithm can be summarized as follows:

1. Choose a value for A reflecting the validity of the test;

2. Select item i, and i2 according to some external criterion;

3. Estimate e, and 02 using the MLEs from (7);

4. Enter the values of the parameters of item ii and i2 into (17)-(19);

5. Evaluate (17)-(19) for i E R2, and select the minimizer of (20);

6. Repeat Steps 3-5 for k=3,...,n.

In selecting the items in Step 2, the item parameters should be avoided to approximate

the conditions in (15) because of instability in (20). As long as the examinee produces

responses which are uniformly correct or incorrect, the MLEs of e have to be bounded by

well-chosen value.

Simple Structure. As already noted, different sections of the test may be required to

have good measurement qualities with respect to different abilities ("simple ability structure").

For two abilities, let ni and n2 be the required numbers of items in the two sections. The best

method to obtain error variances of 6, and 62 proportional to the ratio of ni to n2 seems to set

A equal to I n, times and to 0 n2 times while alternating between the two sections from the

beginning of the test seems. It should be noted, however, that with a multidimensional item

pool the responses to each item contribute to the variance of 6, as well as 62, and therefore

both variances must be calculated over all no-n2 items in the test. Therefore, either variance

may become more favorable than strictly required.

Numerical Examples

A pool of 500 items was simulated drawing random values for the parameters ad and

a,2 from U(0.0,1.3) and for d, from U(-1.3,1.3). The ranges of these distributions correspond

roughly to the ranges of the parameter values in a two-dimensional ACT Assessment Program

VEST COPY HAMA


Mathematics Item Pool used in van der Linden (1966) to study the performance of a linear

programming model for assembling fixed-form tests. The adaptive algorithm was applied to

simulated responses to a 50-item test of examinees with abilities on a two-dimensional grid

defined by 01,02=-2.0, -1.8, ..., 2.0. Because the model in (1) permits MLE only when at least

three responses are available, the full adaptive procedure was started only after responses to

the first three items were simulated. The first two items were defined to have the same

parameter values (a11,a21,cli)=(1.2,0.1,0.0) and (0.1,1.2,0.0) for all examinees. The third item

was selected from the pool applying (13) to simulated responses on the first two items. The

log-likelihood function for the model in (1) is known to have an occasional unbounded

maximum. In such cases, which happened predominantly for the combination of short test

lengths and extreme values of (01,02), the ability estimates were truncated at ±2. For each

combination of ability values, 100 replications were produced. The study was repeated for

=0.250, 0.375, and 0.500 (larger values of A were omitted because of symmetry).

Figure 1 shows the estimated bias and mean-squared error (MSE) of A 61; + (1- A) 012

as a function of el and 02 for k=10, 30, and 50 items and the different values of A . The

dominant impression from these plots is that test length is a decisive factor but the choice of a

value for weight A. hardly has any effect on the behavior of the estimator. At 10 items the

estimator has a unfavorable MSE for all values of A 01+ (/ - A.)02. At the extremes, the large

MSE is in part due to an intolerably large bias in the estimator. However, for 30 items the

procedure seems to work reasonably well, and at 50 items both the MSE and the bias seem to

be ignorable for all practical purposes. The fact that the results are robust with respect to

implies that from a statistical point of view the size of A is hardly important when setting up

a program of multidimensional adaptive testing and that this factor can vary freely across

applications without having any impact on the statistical properties of the ability estimator.

Conclusions

The main conclusion from this study is that use of the procedure in this paper seems

practically feasible, provided the number of items in the test is set not too small. For short

tests, the ML estimators of 0, and 02 are strongly biased and unstable, even when combined

into a linear combination as in this paper. In this case, it seems better to resort to a Bayesian

procedure as the one in Segall (1996). If empirical information about the correlation between

the ability variables is available from external sources, a Bayesian procedure allows for the

possibility of building this information into the (multivariate) prior distribution for the

abilities. As a consequence, the adaptive estimator can be expected to stabilize quicker as a

function of the test length.

1 2


Figure 1.

Bias and MSE functions of A..6`,'+ (1- A.,) 6i for =0.250, 0.375, 0.500 (k=10, 30, 50)

15k = 50 Bias o .

0.51

...44,..,4.. 44..."4.+,........r to ....404:'4,14.40....4...'4...Ar.4 4 A ,..... ,'..' .. ***,4111.1: .S."

.. 4,Z740..4,4,4k, .4404....... 4........ 1*.jr....? ftr...,4.O. ........ 41.4,614"..... 4,,,W .rAb.... ,ht,,OP4:4$1,44:41,4*Z4IptiNiterirIfiliettgr

041r4 litir Al* NI,4Nlity.4,41, 04., gril,,,Nr,..t, .

40.4111%, 404" 44r4.4p-

1 0

el 15 20

4,.......4.......- ....- .....424,41.t.4.7,...,

.410 r: -4:41,,,P4.'4`.......794 A....Ir.., 40 ,....."4.;47 4 V 4 .r.ot 94' ..

1 ..."4.44....444.-.4.4444.J...44,...4.4144.74-,4. 44 4. 4.4444. 4.04-r-. 4....4-4k = 30

....4,,.........r.4,..+47,...4...."Bias ° 50

v., '.4. ' . 11 4.4 * ANI. 1114 -.4110.40.41.* tli:::"-iNNO*41Nrilt4r. ''''. r4:41.4,-4-r,01,41..40,442r41,4r4: 20

0.5 --44.PNittpte...4b.,_414,44....tra.......,."40.t, ..... 41,. 15.047.4.4.,. AN,. .....,74.41 No, ...

aqz4.474teir**to,

15

= .250

1 . 5

20 MSE 10.

109

10

e 1 15

..444.0"...v.,.....7:41rzo.--04-4T.,

;Air4 6.4'4F ZAP' .Atirw ir. 4,41'0,1

''''''''''''''''' '" 44..it-r-At4

1.010grit ."41:4-47" 1wit

4'N0vtv ...Bias .0-'*44At,,,A4V-

lic-tt'..,;,-;;41:01,0-1,44.4si.--0.5k = 10 o- .. ,a4,4r --4,-,........-7 0- 15

- 1 ' -41.4,P Mr411. 4f4r

20

10e2

20

20

1 0 02

..........

..,. 4,.......,,........s.- - - 2 0

tire, .4- 4.44.4,44,7t' 4

.,.*.44 . 4,-

4.11,,,:4.tr-1......

WIPP

0 er ,,,,i):44,_474,4:41t7

44.11g61,0141PWW474.4 ,..... ...,t4 .44..47._-4.4.44.4,4...744,-4.44"/Ar:

ij 15

10 .41142A14. ,,4;5

0

131 15 20

1.0 02

44:14..--4kirok-..ok.-46-...t--41.7:4-1,..ir-4:24,4041e44,444..--4:7-44V12.F.42 47,p rAttr42.4- 4 ."..4

"V'skitt,".:',..144APP.441Peer-&474',ft *agbie-: ".!::::4:1"41...7.4k4"t '-

feIN.14-4411t0t0.*46',..w.r4,4:--tr.* -..-cs-er4144.4443.-t=1;pk7141P4rt,..404,4**-Zr-e.:firk's'4`4P4,04k4r441:44f6134,42,41

44.4.-.4,-....., 4,-4.

'IP*, 40440* allr41°44.4:,

k = 50

k = 30

k = 10

0.15Bias

-0.

ef.t*".......-,.:1,14::::,4..,........,4,..'''''.47447..74.4:elk05:7421.44:tir-41.".*......- ...tr....MK.

..."414740**,Al*44404,4"4"474,441:***'`7"110,

--4104.4.-riokeltir.41,47. 0

20

15


= .375

1.5

20 MSE 10.5

1 0 02

st;`,.....

....444.44141.--747474-A-440.44.:&I.04..40....,,, ..,40,&,04,3/47;»4,.W.,11.2,40,,r4,06,700 '..o..7.,.....14 40.ilpip.4.4.1.44.040-4.-,NrApizezt., "..,,,,,,tr

-.2.7,44,r4,.y4p,24704, ."r-. aw,.....opr.-.44214.41afttirtir

...,. , .-... . 4 .:....

-"*-4-4r 47.-....,Ii774,=4.....7"4:.1,4ifeite1:74::-Ir,,,- ...

Ai,gr:45.42"--1PraAiorlr'4%747.10. 4101, .11,

v4i7 VIPAr4P-<4:0414.404firlitt-,1111111%Vr=4.--.0

4,40P-14,4*04-woo "..4,,,...."-i.-1,-,..."774. 4r4; `. ,iftio ir

1 twIr414:44 1 ". 24404 14e4 448-74,°'....- . 4.,."4"4147.T»..-.1,44ilira*,

Jur Ater10

/31 15

MST COPY AVAILME

20

5

20

'02

46-7<r.row

7444:744117"242:Stet........., ...,,,,,.........,..... -44.

41,44:4144)1P*4 :1%47,4%.14:3/41 1342 ,,,,,,r4,4044, 40..... ,, .44 +104 . /4 ;474,4t..., rZliii iti 'kg t .4*il V; tp7; %'41' littlIter#40.4.41741114,4r4a4,121//tN,4',

10111P.

. .10. 41i7pt VOA..44V,. .. 474 rA, 1...A/4r ..*QV

0

10e 15

1 20

5

20

15

10 02

,Ar. . 4 7 1 .. . w . ..4716-4......... ... AV, 40. ..4......,

..."147;et4:17,.., 4..0,10,4423.1.420.4.4 Wairew4 4:.;$14%4 it*. tr.47:4 it 'If: r, 43 I7

Alliele:47-4,4112.41,'"411::..;44, ,4 V :W4 47:06'4.411.474P.4:144,4, '

\If 4.11Nr4l,r4,11..,4t=4*.t..ApPeli410, ',Or 411. ANII, * ,r4reg 4...4r4r....*..441,4044. 4tiff Wirit4.4747411,4741 Ak. 41,

46:04/, 40.4.41,4V4Niqr'W 94,4r4W411,4**

'KV 4,11,41,414141, 4.4:PNI,


= .500

0 .0k = 5 0 Bias

-0.5-1

4.40r,r,4.4....,71,. ....r...,................ ........._.4.---------4-4-4,4...4.44.-__-,4, ..., ...... ,41,,,, ..... ........,4-,--......-....-.. A..., -..,....., -. -..

..&,-...............4;442,41".4-4.4r 4..4.-... ..... ...., ...... .......... 4....,...7.-......".........4.4.,...4....-2.4,,IN...i,r4,47,.....-.....--....4"...40:40.....4,-...t.-4,14.4....1,44-44-41:002,44.474.

,3/44.r.r..z.fo4,4,4,-",10.410.47044-.4,20 z.

--%. 414Aril,

1015

1 20

15

2

100

4#724:4,....

-.1``....rva. .....liplp....+2.:4042441r4-447.7....4,144:147..."...P.s,aw40.7.744,,P40,-,4,..- INfr 44'+4.W;411"04N.Z.P., ""'"46

04, .I. 4* 4.4.4..... "......4,4, ...4,../4,1/4k,4, 4414, 414, A., 440,6;- ,,,pr ..... -rOmr,r,lolike!

412.464.

,,....,4404.4ptom/

+. o,''. ....

......,-...-4.4....47.:2-...4...-.........4-.4,-.. 4......4.-........14-,1,....e..4;,40,..4.4*.:41,444'...,...4

A..."....,42,:-...-41..-4,--4.--4-4--.0.047-1.4 CV: A":"41,4 ee,e4r4 Oatt.........t.o.....42.0.42.,14-.0. ...,,Noaut"itI.

Op

-6-..4., .......... .44,4".4,4...A.1,,,te4.04,7,. ...r."'-.wite474 *04** 4.41.,, 4.4.Nt,, 0-4,

41%, it... .14.0.4474:907

........-...7.-.,

..1.....4...-..-z..1.::-.),-.....,.1....._.,.........,_....w........... a 41 ......4.....---444+-.424111,444,0a, 4-ze,

',..40,441'4.4.....4--4.4,'-;;;74.4, 44,4-41u7442.474r474,...--4,4,

.1,-AV.t,*k . 10 lip...., .....,....041,

...44,tt....4.......he ., __.....,...0..94...74.0

Bias as 0 O -1***-20.-4"...A4e4.44.4*44011%- o . 5 ,_ ....."4....-- 443/4,11.2.4

4IP W.kobibt,'40.4*-..4, 4' 4" 4- a. 1110,11gr44041,

dv 41,404,, 4.4iktir litta,41*'''.41,411:firNI,

1 . 5

M S E 10.

0

,ty.Zki.t;;.:Ai.:,.t..ptzr.t_..t.,1pp.,Otor

4r4,44,4" ''' . Al'474"' ...

tlilitlitlit.).-475.41,41.117;17:44111:1112rr441,41,4**4474iit 1 5

*-,t+,N.4P4F4r4r-t'4r4tw

1 0

44"4570-444-Lirli 10 0 2

*411 e15

20


Multidimensional IRT has a tradition of using a multivariate information measure as a

tool for analyzing a test searching the ability space for directions in which information is

maximal. These directions then define the ability composite which the test is assumed to

measure best (Ackerman, 1994; Fan, 1996; Reckase & McKinley, 1991). The orientation in

the current paper has been different. The composite was defined to be the parameter of

interest. Next, the adaptive algorithm was used to have uniform measurement accuracy across

the ability space. As demonstrated by the flat MSE functions for k=30 and 50 in Figure 1, it is

possible indeed to have the same favorable measurement precision for all ability points and

not only for those points that "define the composite".

16


References

Ackerman, T.A. (1994). Creating a test information profile for a two-dimensional

latent space. Applied Psychological Measurement, 18, 257-275.

Andersen, E.B. (1980). Discrete statistical models with social science applications.

Amsterdam: North-Holland.

Fan, M., & Hsu, Y. (April, 1996). Multidimensional computer adaptive testing. Paper

presented at the annual meeting of the American Educational Testing Association, New York

City, NY:

Fraser, C. & McDonald, R.P (1988). NOHARM II: A FORTRAN program for fitting

unidimensional and multidimensional normal ogive models of latent trait theory. Armindale,

Australia: Centre for Behavioral Studies, University of New England.

Lord, F.M. (1980). Applications if item response theory to practical testing problems.

Hillsdale, NJ: Erlbaum.

McDonald, R.P. (1966). Normal-ogive multidimensional model. In W.J. van der

Linden & R.K. Hambleton (Eds.), Handbook of modern item response theory. New York City,

NY: Springer-Verlag.

McKinley, R.L., & Reckase, M.N. (1983). An extension of the two-parameter logistic

model to the multidimensional latent space (Research report ONR 83-2). Iowa City, IA:

American College Testing.

Reckase, M.D. (1985). The difficulty of test items that measure more than one ability.

Applied Psychological Measurement, 9, 401-412.

Reckase, M.D. (1966). A linear logistic multidimensional model for dichotomous item

response data. In W.J. van der Linden & R.K. Hambleton (Eds.), Handbook of modem item

response theory (pp. 271-286). New York City, NY: Springer-Verlag.

Reckase, M.D., & McKinley, R.L. (1991). The discriminating power of items that

measure more than one dimension. Applied Psychological Measurement, 15, 361-374.

Samejima, F. (1974). Normal ogive model for the continuous response level in the

multidimensional latent space. Psychometrika, 39, 111-121.

Segall, D.O. (1996). Multidimensional adaptive testing. Psychometrika 61, 331-335.

van der Linden, W.J. (1996). Assembling tests for the measurement of multiple traits.

Applied Psychological Measurement, 20, 373-388.

Wainer, H. (Ed.) (1990). Computerized adaptive testing: A primer. Hillsdale, NJ:

Erlbaum.

Wilson, D. Wood, R, & Gibbons, R. (1984) TESTFACT: Test scoring, item statistics,

and item factor analysis. Mooresville, IN: Scientific Software.

17 3E07 CON AVAIIA ILE


Author Note

The author is most indebted to Wim M.M. Tie len for his computational support in preparing

. the numerical examples and to Cees A.W. Glas for his comments on an earlier version of this

paper. Send all correspondence to: W.J. van der Linden, Department of EducationalMeasurement and Data Analyses, University of Twente, P.O. Box 217, 7500 AE Enschede,

The Netherlands. Email: [email protected].

Titles of Recent Research Reports from the Department ofEducational Measurement and Data Analysis.

University of Twente, Enschede,The Netherlands.

RR-97-03 W.J. van der Linden, Multidimensional Adaptive Yesting with a Minimum Error-Variance Criterion

RR-97-02 W.J. van der Linden, A Procedure for Empirical Initialization of AdaptiveTesting Algorithms

RR-97-01 W.J. van der Linden & Lynda M. Reese, A Model for Optimal ConstrainedAdaptive Testing

RR-96-04 C.A.W. Glas & A.A. Beguin, Appropriateness of IRT Observed Score Equating

RR-96-03 C.A.W. Glas, Testing the Generalized Partial Credit ModelRR-96-02 C.A.W. Glas, Detection of Differential Item Functioning using Lagrange

Multiplier TestsRR-96-01 W.J. van der Linden, Bayesian Item Selection Criteria for Adaptive TestingRR-95-03 W.J. van der Linden, Assembling Tests for the Measurement of Multiple Abilities

RR-95-02 W.J. van der Linden, Stochastic Order in Dichotomous Item Response Modelsfor Fixed Tests, Adaptive Tests, or Multiple Abilities

RR-95-01 W.J. van der Linden, Some decision theory for course placementRR-94-17 H.J. Vos, A compensatory model for simultaneously setting cutting scores for

selection-placement-mastery decisionsRR-94-16 H.J. Vos, Applications of Bayesian decision theory to intelligent tutoring systemsRR-94-15 H.J. Vos, An intelligent tutoring system for classifying students into Instructional

treatments with mastery scoresRR-94-13 W.J.J. Veerkamp & M.P.F. Berger, A simple and fast item selection procedure

for adaptive testingRR-94-12 R.R. Meijer, Nonparametric and group-based person-fit statistics: A validity

study and an empirical exampleRR-94-10 W.J. van der Linden & M.A. Zwarts, Robustness of judgments in evaluation

researchRR-94-9 L.M.W. Alckermans, Monte Carlo estimation of the conditional Rasch model

RR-94-8 R.R. Meijer & K. Sijtsma, Detection of aberrant item score patterns: A review ofrecent developments

RR-94-7 W.J. van der Linden & R.M. Luecht, An optimization model for test assembly tomatch observed-score distributions

RR-94-6 W.J.J. Veerkamp & M.P.F. Berger, Some new item selection criteria for adaptivetesting

RR-94-5 R.R. Meijer, K. Sijtsma & I.W. Molenaar, Reliability estimation for singledichotomous items

1 9 3207 COPY AVAZA Dr

RR-94-4 M.P.F. Berger & W.J.J. Veerkamp, A review of selection methods for optimaldesign

RR-94-3 W.J. van der Linden, A conceptual analysis of standard setting in large-scaleassessments

RR-94-2 W.J. van der Linden & H.J. Vos, A compensatory approach to optimal selectionwith mastery scores

RR-94-1 R.R. Meijer, The influence of the presence of deviant item score patterns on thepower of a person-fit statistic

RR-93-1 P. Westers & H. Kelderman, Generalizations of the Solution-Error Response-Error Model

RR-91-1 H. Kelderman, Computing Maximum Likelihood Estimates of Loglinear Modelsfrom Marginal Sums with Special Attention to Loglinear Item Response Theory

RR-90-8 M.P.F. Berger & D.L. Knol, On the Assessment of Dimensionality in

Multidimensional Item Response Theory ModelsRR-90-7 E. Boekkooi-Timminga, A Method for Designing IRT-based Item BanksRR-90-6 J.J. Adema, The Construction of Weakly Parallel Tests by Mathematical

ProgrammingRR-90-5 J.J. Adema, A Revised Simplex Method for Test Construction ProblemsRR-90-4 J.J. Adema, Methods and Models for the Construction of Weakly Parallel TestsRR-90-2 H. Tobi, Item Response Theory at subject- and group-levelRR-90-1 P. Westers & H. Kelderman, Differential item functioning in multiple choice

items

Research Reports can be obtained at costs, Faculty of Educational Science and Technology,University of Twente, Mr. J.M.J. Nelissen, P.O. Box 217, 7500 AE Enschede, The Netherlands.

2 0

U.S. DEPARTMENT OF EDUCATIONorric of Educational A ch and improvement tOERA

Educational RYbOUICe Information Cantor (ERIC)

NOTICE,

REPRODUCTION BASIS

This document is covered by a signed "Reproduction Release(Blanket)" form (on file within the ERIC system), encompassing allor classes of documents from its source organization and, therefore,does not require a "Specific Document" Release form.

This document is Federally-funded, or carries its own permission toreproduce, or is otherwise in the public domain and, therefore, maybe reproduced by ERIC without a signed Reproduction Releaseform (either "Specific Document" or "Blanket").

PO.

Documents

DOCUMENT RESUME ED 419 005 TM 028 302 AUTHOR van der … · 2014. 5. 19. · TITLE Multidimensional Adaptive Testing with a Minimum. Error-Variance Criterion. INSTITUTION Twente Univ.,