Upload
others
View
1
Download
0
Embed Size (px)
Citation preview
DOCUMENT RESUME
ED 419 005 TM 028 302
AUTHOR van der Linden, Wim J.TITLE Multidimensional Adaptive Testing with a Minimum
Error-Variance Criterion.INSTITUTION Twente Univ., Enschede (Netherlands). Faculty of Educational
Science and Technology.REPORT NO RR-97-03PUB DATE 1997-00-00NOTE 20p.
AVAILABLE FROM Faculty of Educational Science and Technology, University ofTwente, P.O. Box 217, 7500 AE Enschede, The Netherlands.
PUB TYPE Reports Evaluative (142)EDRS PRICE MF01/PC01 Plus Postage.DESCRIPTORS Ability; *Adaptive Testing; Algorithms; *Computer Assisted
Testing; Criteria; Foreign Countries; *Maximum LikelihoodStatistics; Selection
IDENTIFIERS *Multidimensionality (Tests)
ABSTRACTThe case of adaptive testing under a multidimensional
logistic response model is addressed. An adaptive algorithm is proposed thatminimizes the (asymptotic) variance of the maximum-likelihood (ML) estimatorof a linear combination of abilities of interest. The item selectioncriterion is a simple expression in closed form. In addition, it is shown howthe algorithm can be adapted if the interest is in a test with a "simpleinformation structure." The statistical properties of the adaptive MLestimator are demonstrated for a two-dimensional item pool with severallinear combinations of the two abilities. (Contains 1 figure and 15references.) (Author/SLD)
********************************************************************************* Reproductions supplied by EDRS are the best that can be made *
* from the original document. *
********************************************************************************
Oo
Multidimensional Adaptive Testing Researchwith a Minimum Error-Variance Criterion Report
97-03
Wim J. van der Linden
U.S. DEPARTMENT OF EDUCATIONOffice of Educational Research and Improvement
EDUCATIONAL RESOURCES INFORMATIONCENTER (ERIC)0.."document has been reproduced as
received from the person or organizationoriginating it.
1:1 Minor changes have been made toimprove reproduction quality.
Points of view or opinions stated in thisdocument do not necessarily representofficial OERI position or policy.
BEST COPY AVAILABLE
2
Multidimensional Adaptive Testing
with a Minimum Error-Variance Criterion
Wim J. van der Linden
3
Abstract
The case of adaptive testing under a multidimensional logistic response model is addressed.
An adaptive algorithm is proposed that minimizes the (asymptotic) variance of the maximum-
likelihood estimator of a linear combination of abilities of interest. The item selection
criterion is a simple expression in closed form. In addition, it is shown how the algorithm can
be adapted if the interest is in a test with a "simple information structure". The statistical
properties of the adaptive ML estimator are demonstrated for a two-dimensional item pool
with several linear combinations of the two abilities.
Multidimensional Adaptive Testing - 2
Multidimensional Adaptive Testing
with a Minimum Error Variance Criterion
Adaptive testing algorithms for item pools calibrated under a unidimensional item
response theory (IRT) model have been well investigated (e.g., Lord, 1980; Wainer, 1990),
and several large-scale testing programs are in the process of introducing adaptive testing as
an alternative to traditional paper-and-pencil tests. Since these programs need large item pools
to guarantee measurement precision, in particular if measures to balance test content and
control item exposure are implemented, violations of the assumption of unidimensionality of
the item pool can be expected. Study of algorithms for adaptive testing under amultidimensional model seems therefore a timely matter.
The present paper is a sequel to van der Linden (1996) in which the problem ofoptimal assembly of a fixed test form from an item pool measuring multiple abilities is
addressed. The emphasis in the work underlying this earlier paper was on an algorithm for
assembling the fixed form to match optimally a set of targets for the (aymptotic) error
variance functions for the abilities subject to a large variety of constraints on the composition
of the test. It is the purpose of the present paper to study the use of the error variance as an
item-selection criterion in tests with an adaptive format. Independent results on
multidimensional adaptive testing are presented in Fan and Hsu (April, 1996) and Segall
(1996). The interest in the former is in investigating the differences between item selection
criteria based on various types of multivariate information measures rather than the error
variance of the estimator(s). Also, these measures are evaluated over random sampling of
correlated abilities. In the latter, the volumes of the confidence ellipsoid and the posterior
credibility ellipsoid are proposed as multivariate item selection criteria. The posterior
credibility ellipsoid is an attractive criterion because it allows for the possibility to build prior
knowledge on dependencies between the ability variables into the item selection procedure.
We will return to this point later in the paper.
The paper is organized as follows: The following section introduces the
multidimensional IRT logistic model used in the presentation of the algorithm and motivates a
linear combination of abilities as the parameter of interest in multidimensional adaptive
testing. The subsequent section discusses the (asymptotic) variance of the estimator of a linear
combination of ability parameters. Then it is proposed to minimize the variance of thisestimator as a criterion for multidimensional adaptive testing, and an adaptive algorithm
minimizing the variance is presented. The algorithm involves expressions of the item
parameters which are easy to evaluate. The last section demonstrates the use of the algorithm
for a two-dimensional item pool and investigates the statistical properties of the adaptive
estimator for various linear combinations of the two abilities.
5
Multidimensional Adaptive Testing 3
Multidimensional Model
Dichotomous response variables U; are used to denote the responses of an examinee to
item i=1,...,n. The variables take the value 1 if the response is correct and the value 0 if it is
incorrect. The model is the following multivariate logistic response function:
exp(ai 'a di)Pi(a) = Prob{U, =110,ai,di} =
1+ exp(ai '0-d;)(1)
where 0 =( ), with -oo < 0, <oo for j=1,...,m, is a vector of m ability variables,
ai with a,i>0 for j=1,...,m, is the vector of loadings of item i on these abilities
(item discriminations), and -.0<di<oo is a scalar representing a linear combination of the
difficulties of the item along the ability dimensions. Detailed information about the model is
given in McKinley and Reckase (1983), Reckase (1985, 1996), and Samejima (1974).
It is assumed that the item parameters a, and d, have already been estimated, and that
the estimates are sufficiently accurate to consider them as the true parameter values. The
parameters can be estimated using the Bayesian methods implemented in the program
TESTFACT (Wilson, Wood, & Gibbons, 1984), or through McDonald's (1996) harmonic
analysis applied to a normal approximation to the logistic function implemented in the
program NOHARM (Fraser & McDonald, 1988).
Parameter of Interest
. It is assumed that the parameter to be estimated by the adaptive testing procedure is a
linear combination of the abilities, a 19, where 2 =( A, ,..., ) is a vector of nonnegative
weights. The choice of this parameter is motivated by the following practical cases:
1. The item pool is intentionally designed to measure more than one ability. However,
the consumers of the test scores only want a single number to be reported. An
obvious example is an item pool for a test to predict a future criterion of success in a
selection problem, where the criterion is multifaceted. In this case, the weights A, are
to reflect the relative importance of the individual abilities with respect to thecriterion.
2. The item pool is designed to measure only one ability but the items are sensitive to
some "nuisance abilities" as well. A well-know example is a test for mathematical
ability depending on verbal abilities required to understand the items. This case can
6 'nor 7, 0 FY AVAILABLE
Multidimensional Adaptive Testing 4
be dealt with by setting the weights Ai =0 for all nuisance abilities. As will become
clear below, this measure does not neutralize the effect of the values of the nuisance
parameters on the variance of the estimator of the intended ability but does allow for
direct minimization of this variance.
3. Even though the item pool measures several abilities, different sections of the test
may be required to be maximally informative with respect different abilities, for
example, because identifiable subsections of the tests are to be used for diagnosing
individual abilities. As will be shown below, an adaptive test with a "simple ability
structure" over the ability space can be realized by choosing different values for the
weights A. at different stages of the procedure. Note that this case is not equivalent
to the one of choosing items from different unidimensional item pools; rather than
rotating the selection of the items across unidimensional pools the weights A, are
rotated across different preselected values while selecting items from a
multidimensional pool.
For an extended description of the above cases of multidimensional testing using the format of
a fixed-form test, see van der Linden (1996).
Ability Estimation
It is assumed that A 'a is estimated by the method of MLE. As is well known, for
MLE it holds that
A '0 = A '6. (2)
For a response pattern (u1,...,un), the likelihood of 6 is defined as
L(0;u1,...,un,a1,...,an,d1,...,d.) = f Prob{U, = u,I0,ai ,c1,} (3)i=1
The joint MLE of 0,, j=1,...,m, is the vector of values maximizing this likelihood. The
likelihood equations are obtained setting the partial derivatives of the log of (3) equal to zero:
a InL n
a 0- IL In p
I
(6) + (1- th)ln(1 -pi (0)) = 0, j=1,...,m,,
7
(4)
Using
La79, Pi (6)[1- Pi (0)i,
the likelihood equations can be written as
Multidimensional Adaptive Testing 5
(5)
Pi (2)] = 0, j=1,...,m,. (6)
which is the common form known to exist for a model belonging to the exponential family
(Andersen, 1980, sect. 3.2). The system can be solved using Newton's method.
Bo-o _[}{(00_,)) TI 100-0)
where H(B` -D) is the Hessian of the log-likelihood function with elements
a' lnL - -/a, a h P P, (0)] , g,h = 1,...,m.ae ggaeb
(7)
(8)
and y(0(1-1) is the gradient of the log-likelihood function, i.e., the vector with the first
derivatives set equal to zero in (6), both evaluated at step t-l. Substitution of the results from
(7) into (2) gives the MLE of A '0.
Variance of A '0
The asymptotic covariance matrix of the MLE of 0 is given by the inverse of Fisher's
information matrix
I(0 ) (-Ea2InL(0;UI,...,Un,a1,...,an,c11,...,dn)),
aegaoh(9)
with I, 0; U1, , tin ai,.. , an , di ,...,d.) being the likelihood statistics associated with the
random response vector and 0 p and 0, any two components of 0 . From (8), it follows that
8
Multidimensional Adaptive Testing - 6
-ED2 lnL(0; Ul, , ar, , ,..., dr,)
Ea', ad, (12A1 p, (0)] .aegaeh 1=1
Standard techniques for matrix inversion yield the (asymptotic) covariance matrix
V a Var(610 ) = I(0)1,
(10)
where the determinant of the information matrix, 140)1, is assumed to be not vanishing. For
the linear combination A '0
Var( A '010) = A 'VA. (12)
For a model with two ability parameters, 0 = ( 01,82 ), and (Al ,A2) (A,1- A) , the
result in (12) boils down to
where
Var(A.6,+ (I- a..)6218,,e2)
A2var(6,1e.,02) + (1-A)2var(621e,,e2) + 2A.(1--)cov(6,,6210,02)n n
= [A? y P, (01,02){ 1 (61,02)} + (1- A.)2E4Pi(ei,02){ 1 (01,02)}1=1 1=1
+ 2,1(1 A) E ail a,2 P, (01,02) { 1 Pi (01,02) } VII(01,02)1,1=1
n n
140,,e2)1 = [E4P,(8,,e2){ P, (e,,e2)}1[E4 (e,,e2){ P, (01,82) }]1=1 1=1
[E ail ai2 P,(01,02){ Pi (o, ,e2)}121=1
Note that for n=2 the two items should not be parallel, i.e., it should not hold that
ali=a21,
a12 =a22,
d1=d2,
9
(13)
(14)
(15)
Multidimensional Adaptive Testing 7
because then the determinant in (13) vanishes.
Adaptive Testing Algorithm
For notational convenience, the adaptive testing algorithm is also presented for the
case of two ability parameters. The following definitions are needed: The items in the pool are
indexed by 1=1,...,1. The adaptive testing procedure is assumed to be stopped after n items
have been administered. The order of the items in the test is indexed by k=1,...,n. Thus, ik is
the index of the item in the pool administered as the kth item in the test. Suppose k-1 items
have been selected. Let Sk={ ii,,ik-I} denote this set of items. Then, Rk= \Sk is the set
of items rejected so far, and item ik has to be selected from this set. Finally, let 6; and 62 be
the estimators of 01 and 02 after k items have been administered.
The kth item is selected according to the following criterion:
mink, {Var(A 6; + (1 -A.)(k 16:`-' ,111-1)} , (16)
that is, the item is selected to minimize the variance of A + (1- A) 62 evaluated at the
current estimates, which is (13) for (el , 02) = (-1,6k2-1)
To implement the criterion, define
k-I
= 14, Pig (61k 1,6Z ')( Pit ,6,`g=1
+ Pi, (61'1,612'1)f 1- P (6;4,612'4)/
k-1vj Ea 01C-I ,62i){ (61k-1,62k-I))
g=1
+ a?213,,(6r ,6 2') 1 Pi (61" ,6 12'4)}
k-I
Wi = ay aid Pi, (6i-1,6Z-1){ 1- Pis (6r ,6Z-1)}g=1
+ a,,21),, (611 ,612) I ,V21))
The criterion can thus be expressed in closed form as
..,17 0
(17)
(18)
(19)
Multidimensional Adaptive Testing - 8
ik = min,{[A2vi+(1-A) u -2A(1-A)w1/[u )2]; j e Rk} (20)
In order to select the next item, for each item in Rk one term is added to the sums in (17)-(19).
These term involves both the parameters a11, ap, and d, and the probability P;(01,02), where
the last quantity is evaluated at the current estimates e,=6," and 02=62" . The item
minimizing the expression in (20) is selected.
Algorithm. The algorithm can be summarized as follows:
1. Choose a value for A reflecting the validity of the test;
2. Select item i, and i2 according to some external criterion;
3. Estimate e, and 02 using the MLEs from (7);
4. Enter the values of the parameters of item ii and i2 into (17)-(19);
5. Evaluate (17)-(19) for i E R2, and select the minimizer of (20);
6. Repeat Steps 3-5 for k=3,...,n.
In selecting the items in Step 2, the item parameters should be avoided to approximate
the conditions in (15) because of instability in (20). As long as the examinee produces
responses which are uniformly correct or incorrect, the MLEs of e have to be bounded by
well-chosen value.
Simple Structure. As already noted, different sections of the test may be required to
have good measurement qualities with respect to different abilities ("simple ability structure").
For two abilities, let ni and n2 be the required numbers of items in the two sections. The best
method to obtain error variances of 6, and 62 proportional to the ratio of ni to n2 seems to set
A equal to I n, times and to 0 n2 times while alternating between the two sections from the
beginning of the test seems. It should be noted, however, that with a multidimensional item
pool the responses to each item contribute to the variance of 6, as well as 62, and therefore
both variances must be calculated over all no-n2 items in the test. Therefore, either variance
may become more favorable than strictly required.
Numerical Examples
A pool of 500 items was simulated drawing random values for the parameters ad and
a,2 from U(0.0,1.3) and for d, from U(-1.3,1.3). The ranges of these distributions correspond
roughly to the ranges of the parameter values in a two-dimensional ACT Assessment Program
VEST COPY HAMA
Multidimensional Adaptive Testing 9
Mathematics Item Pool used in van der Linden (1966) to study the performance of a linear
programming model for assembling fixed-form tests. The adaptive algorithm was applied to
simulated responses to a 50-item test of examinees with abilities on a two-dimensional grid
defined by 01,02=-2.0, -1.8, ..., 2.0. Because the model in (1) permits MLE only when at least
three responses are available, the full adaptive procedure was started only after responses to
the first three items were simulated. The first two items were defined to have the same
parameter values (a11,a21,cli)=(1.2,0.1,0.0) and (0.1,1.2,0.0) for all examinees. The third item
was selected from the pool applying (13) to simulated responses on the first two items. The
log-likelihood function for the model in (1) is known to have an occasional unbounded
maximum. In such cases, which happened predominantly for the combination of short test
lengths and extreme values of (01,02), the ability estimates were truncated at ±2. For each
combination of ability values, 100 replications were produced. The study was repeated for
=0.250, 0.375, and 0.500 (larger values of A were omitted because of symmetry).
Figure 1 shows the estimated bias and mean-squared error (MSE) of A 61; + (1- A) 012
as a function of el and 02 for k=10, 30, and 50 items and the different values of A . The
dominant impression from these plots is that test length is a decisive factor but the choice of a
value for weight A. hardly has any effect on the behavior of the estimator. At 10 items the
estimator has a unfavorable MSE for all values of A 01+ (/ - A.)02. At the extremes, the large
MSE is in part due to an intolerably large bias in the estimator. However, for 30 items the
procedure seems to work reasonably well, and at 50 items both the MSE and the bias seem to
be ignorable for all practical purposes. The fact that the results are robust with respect to
implies that from a statistical point of view the size of A is hardly important when setting up
a program of multidimensional adaptive testing and that this factor can vary freely across
applications without having any impact on the statistical properties of the ability estimator.
Conclusions
The main conclusion from this study is that use of the procedure in this paper seems
practically feasible, provided the number of items in the test is set not too small. For short
tests, the ML estimators of 0, and 02 are strongly biased and unstable, even when combined
into a linear combination as in this paper. In this case, it seems better to resort to a Bayesian
procedure as the one in Segall (1996). If empirical information about the correlation between
the ability variables is available from external sources, a Bayesian procedure allows for the
possibility of building this information into the (multivariate) prior distribution for the
abilities. As a consequence, the adaptive estimator can be expected to stabilize quicker as a
function of the test length.
1 2
Multidimensional Adaptive Testing - 10
Figure 1.
Bias and MSE functions of A..6`,'+ (1- A.,) 6i for =0.250, 0.375, 0.500 (k=10, 30, 50)
15k = 50 Bias o .
0.51
...44,..,4.. 44..."4.+,........r to ....404:'4,14.40....4...'4...Ar.4 4 A ,..... ,'..' .. ***,4111.1: .S."
.. 4,Z740..4,4,4k, .4404....... 4........ 1*.jr....? ftr...,4.O. ........ 41.4,614"..... 4,,,W .rAb.... ,ht,,OP4:4$1,44:41,4*Z4IptiNiterirIfiliettgr
041r4 litir Al* NI,4Nlity.4,41, 04., gril,,,Nr,..t, .
40.4111%, 404" 44r4.4p-
1 0
el 15 20
4,.......4.......- ....- .....424,41.t.4.7,...,
.410 r: -4:41,,,P4.'4`.......794 A....Ir.., 40 ,....."4.;47 4 V 4 .r.ot 94' ..
1 ..."4.44....444.-.4.4444.J...44,...4.4144.74-,4. 44 4. 4.4444. 4.04-r-. 4....4-4k = 30
....4,,.........r.4,..+47,...4...."Bias ° 50
v., '.4. ' . 11 4.4 * ANI. 1114 -.4110.40.41.* tli:::"-iNNO*41Nrilt4r. ''''. r4:41.4,-4-r,01,41..40,442r41,4r4: 20
0.5 --44.PNittpte...4b.,_414,44....tra.......,."40.t, ..... 41,. 15.047.4.4.,. AN,. .....,74.41 No, ...
aqz4.474teir**to,
15
= .250
1 . 5
20 MSE 10.
109
10
e 1 15
..444.0"...v.,.....7:41rzo.--04-4T.,
;Air4 6.4'4F ZAP' .Atirw ir. 4,41'0,1
''''''''''''''''' '" 44..it-r-At4
1.010grit ."41:4-47" 1wit
4'N0vtv ...Bias .0-'*44At,,,A4V-
lic-tt'..,;,-;;41:01,0-1,44.4si.--0.5k = 10 o- .. ,a4,4r --4,-,........-7 0- 15
- 1 ' -41.4,P Mr411. 4f4r
20
10e2
20
20
1 0 02
..........
..,. 4,.......,,........s.- - - 2 0
tire, .4- 4.44.4,44,7t' 4
.,.*.44 . 4,-
4.11,,,:4.tr-1......
WIPP
0 er ,,,,i):44,_474,4:41t7
44.11g61,0141PWW474.4 ,..... ...,t4 .44..47._-4.4.44.4,4...744,-4.44"/Ar:
ij 15
10 .41142A14. ,,4;5
0
131 15 20
1.0 02
44:14..--4kirok-..ok.-46-...t--41.7:4-1,..ir-4:24,4041e44,444..--4:7-44V12.F.42 47,p rAttr42.4- 4 ."..4
"V'skitt,".:',..144APP.441Peer-&474',ft *agbie-: ".!::::4:1"41...7.4k4"t '-
feIN.14-4411t0t0.*46',..w.r4,4:--tr.* -..-cs-er4144.4443.-t=1;pk7141P4rt,..404,4**-Zr-e.:firk's'4`4P4,04k4r441:44f6134,42,41
44.4.-.4,-....., 4,-4.
'IP*, 40440* allr41°44.4:,
k = 50
k = 30
k = 10
0.15Bias
-0.
ef.t*".......-,.:1,14::::,4..,........,4,..'''''.47447..74.4:elk05:7421.44:tir-41.".*......- ...tr....MK.
..."414740**,Al*44404,4"4"474,441:***'`7"110,
--4104.4.-riokeltir.41,47. 0
20
15
Multidimensional Adaptive Testing - 11
= .375
1.5
20 MSE 10.5
1 0 02
st;`,.....
....444.44141.--747474-A-440.44.:&I.04..40....,,, ..,40,&,04,3/47;»4,.W.,11.2,40,,r4,06,700 '..o..7.,.....14 40.ilpip.4.4.1.44.040-4.-,NrApizezt., "..,,,,,,tr
-.2.7,44,r4,.y4p,24704, ."r-. aw,.....opr.-.44214.41afttirtir
...,. , .-... . 4 .:....
-"*-4-4r 47.-....,Ii774,=4.....7"4:.1,4ifeite1:74::-Ir,,,- ...
Ai,gr:45.42"--1PraAiorlr'4%747.10. 4101, .11,
v4i7 VIPAr4P-<4:0414.404firlitt-,1111111%Vr=4.--.0
4,40P-14,4*04-woo "..4,,,...."-i.-1,-,..."774. 4r4; `. ,iftio ir
1 twIr414:44 1 ". 24404 14e4 448-74,°'....- . 4.,."4"4147.T»..-.1,44ilira*,
Jur Ater10
/31 15
MST COPY AVAILME
20
5
20
'02
46-7<r.row
7444:744117"242:Stet........., ...,,,,,.........,..... -44.
41,44:4144)1P*4 :1%47,4%.14:3/41 1342 ,,,,,,r4,4044, 40..... ,, .44 +104 . /4 ;474,4t..., rZliii iti 'kg t .4*il V; tp7; %'41' littlIter#40.4.41741114,4r4a4,121//tN,4',
10111P.
. .10. 41i7pt VOA..44V,. .. 474 rA, 1...A/4r ..*QV
0
10e 15
1 20
5
20
15
10 02
,Ar. . 4 7 1 .. . w . ..4716-4......... ... AV, 40. ..4......,
..."147;et4:17,.., 4..0,10,4423.1.420.4.4 Wairew4 4:.;$14%4 it*. tr.47:4 it 'If: r, 43 I7
Alliele:47-4,4112.41,'"411::..;44, ,4 V :W4 47:06'4.411.474P.4:144,4, '
\If 4.11Nr4l,r4,11..,4t=4*.t..ApPeli410, ',Or 411. ANII, * ,r4reg 4...4r4r....*..441,4044. 4tiff Wirit4.4747411,4741 Ak. 41,
46:04/, 40.4.41,4V4Niqr'W 94,4r4W411,4**
'KV 4,11,41,414141, 4.4:PNI,
Multidimensional Adaptive Testing 12
= .500
0 .0k = 5 0 Bias
-0.5-1
4.40r,r,4.4....,71,. ....r...,................ ........._.4.---------4-4-4,4...4.44.-__-,4, ..., ...... ,41,,,, ..... ........,4-,--......-....-.. A..., -..,....., -. -..
..&,-...............4;442,41".4-4.4r 4..4.-... ..... ...., ...... .......... 4....,...7.-......".........4.4.,...4....-2.4,,IN...i,r4,47,.....-.....--....4"...40:40.....4,-...t.-4,14.4....1,44-44-41:002,44.474.
,3/44.r.r..z.fo4,4,4,-",10.410.47044-.4,20 z.
--%. 414Aril,
1015
1 20
15
2
100
4#724:4,....
-.1``....rva. .....liplp....+2.:4042441r4-447.7....4,144:147..."...P.s,aw40.7.744,,P40,-,4,..- INfr 44'+4.W;411"04N.Z.P., ""'"46
04, .I. 4* 4.4.4..... "......4,4, ...4,../4,1/4k,4, 4414, 414, A., 440,6;- ,,,pr ..... -rOmr,r,lolike!
412.464.
,,....,4404.4ptom/
+. o,''. ....
......,-...-4.4....47.:2-...4...-.........4-.4,-.. 4......4.-........14-,1,....e..4;,40,..4.4*.:41,444'...,...4
A..."....,42,:-...-41..-4,--4.--4-4--.0.047-1.4 CV: A":"41,4 ee,e4r4 Oatt.........t.o.....42.0.42.,14-.0. ...,,Noaut"itI.
Op
-6-..4., .......... .44,4".4,4...A.1,,,te4.04,7,. ...r."'-.wite474 *04** 4.41.,, 4.4.Nt,, 0-4,
41%, it... .14.0.4474:907
........-...7.-.,
..1.....4...-..-z..1.::-.),-.....,.1....._.,.........,_....w........... a 41 ......4.....---444+-.424111,444,0a, 4-ze,
',..40,441'4.4.....4--4.4,'-;;;74.4, 44,4-41u7442.474r474,...--4,4,
.1,-AV.t,*k . 10 lip...., .....,....041,
...44,tt....4.......he ., __.....,...0..94...74.0
Bias as 0 O -1***-20.-4"...A4e4.44.4*44011%- o . 5 ,_ ....."4....-- 443/4,11.2.4
4IP W.kobibt,'40.4*-..4, 4' 4" 4- a. 1110,11gr44041,
dv 41,404,, 4.4iktir litta,41*'''.41,411:firNI,
1 . 5
M S E 10.
0
,ty.Zki.t;;.:Ai.:,.t..ptzr.t_..t.,1pp.,Otor
4r4,44,4" ''' . Al'474"' ...
tlilitlitlit.).-475.41,41.117;17:44111:1112rr441,41,4**4474iit 1 5
*-,t+,N.4P4F4r4r-t'4r4tw
1 0
44"4570-444-Lirli 10 0 2
*411 e15
20
Multidimensional Adaptive Testing 13
Multidimensional IRT has a tradition of using a multivariate information measure as a
tool for analyzing a test searching the ability space for directions in which information is
maximal. These directions then define the ability composite which the test is assumed to
measure best (Ackerman, 1994; Fan, 1996; Reckase & McKinley, 1991). The orientation in
the current paper has been different. The composite was defined to be the parameter of
interest. Next, the adaptive algorithm was used to have uniform measurement accuracy across
the ability space. As demonstrated by the flat MSE functions for k=30 and 50 in Figure 1, it is
possible indeed to have the same favorable measurement precision for all ability points and
not only for those points that "define the composite".
16
Multidimensional Adaptive Testing - 14
References
Ackerman, T.A. (1994). Creating a test information profile for a two-dimensional
latent space. Applied Psychological Measurement, 18, 257-275.
Andersen, E.B. (1980). Discrete statistical models with social science applications.
Amsterdam: North-Holland.
Fan, M., & Hsu, Y. (April, 1996). Multidimensional computer adaptive testing. Paper
presented at the annual meeting of the American Educational Testing Association, New York
City, NY:
Fraser, C. & McDonald, R.P (1988). NOHARM II: A FORTRAN program for fitting
unidimensional and multidimensional normal ogive models of latent trait theory. Armindale,
Australia: Centre for Behavioral Studies, University of New England.
Lord, F.M. (1980). Applications if item response theory to practical testing problems.
Hillsdale, NJ: Erlbaum.
McDonald, R.P. (1966). Normal-ogive multidimensional model. In W.J. van der
Linden & R.K. Hambleton (Eds.), Handbook of modern item response theory. New York City,
NY: Springer-Verlag.
McKinley, R.L., & Reckase, M.N. (1983). An extension of the two-parameter logistic
model to the multidimensional latent space (Research report ONR 83-2). Iowa City, IA:
American College Testing.
Reckase, M.D. (1985). The difficulty of test items that measure more than one ability.
Applied Psychological Measurement, 9, 401-412.
Reckase, M.D. (1966). A linear logistic multidimensional model for dichotomous item
response data. In W.J. van der Linden & R.K. Hambleton (Eds.), Handbook of modem item
response theory (pp. 271-286). New York City, NY: Springer-Verlag.
Reckase, M.D., & McKinley, R.L. (1991). The discriminating power of items that
measure more than one dimension. Applied Psychological Measurement, 15, 361-374.
Samejima, F. (1974). Normal ogive model for the continuous response level in the
multidimensional latent space. Psychometrika, 39, 111-121.
Segall, D.O. (1996). Multidimensional adaptive testing. Psychometrika 61, 331-335.
van der Linden, W.J. (1996). Assembling tests for the measurement of multiple traits.
Applied Psychological Measurement, 20, 373-388.
Wainer, H. (Ed.) (1990). Computerized adaptive testing: A primer. Hillsdale, NJ:
Erlbaum.
Wilson, D. Wood, R, & Gibbons, R. (1984) TESTFACT: Test scoring, item statistics,
and item factor analysis. Mooresville, IN: Scientific Software.
17 3E07 CON AVAIIA ILE
Multidimensional Adaptive Testing 15
Author Note
The author is most indebted to Wim M.M. Tie len for his computational support in preparing
. the numerical examples and to Cees A.W. Glas for his comments on an earlier version of this
paper. Send all correspondence to: W.J. van der Linden, Department of EducationalMeasurement and Data Analyses, University of Twente, P.O. Box 217, 7500 AE Enschede,
The Netherlands. Email: [email protected].
Titles of Recent Research Reports from the Department ofEducational Measurement and Data Analysis.
University of Twente, Enschede,The Netherlands.
RR-97-03 W.J. van der Linden, Multidimensional Adaptive Yesting with a Minimum Error-Variance Criterion
RR-97-02 W.J. van der Linden, A Procedure for Empirical Initialization of AdaptiveTesting Algorithms
RR-97-01 W.J. van der Linden & Lynda M. Reese, A Model for Optimal ConstrainedAdaptive Testing
RR-96-04 C.A.W. Glas & A.A. Beguin, Appropriateness of IRT Observed Score Equating
RR-96-03 C.A.W. Glas, Testing the Generalized Partial Credit ModelRR-96-02 C.A.W. Glas, Detection of Differential Item Functioning using Lagrange
Multiplier TestsRR-96-01 W.J. van der Linden, Bayesian Item Selection Criteria for Adaptive TestingRR-95-03 W.J. van der Linden, Assembling Tests for the Measurement of Multiple Abilities
RR-95-02 W.J. van der Linden, Stochastic Order in Dichotomous Item Response Modelsfor Fixed Tests, Adaptive Tests, or Multiple Abilities
RR-95-01 W.J. van der Linden, Some decision theory for course placementRR-94-17 H.J. Vos, A compensatory model for simultaneously setting cutting scores for
selection-placement-mastery decisionsRR-94-16 H.J. Vos, Applications of Bayesian decision theory to intelligent tutoring systemsRR-94-15 H.J. Vos, An intelligent tutoring system for classifying students into Instructional
treatments with mastery scoresRR-94-13 W.J.J. Veerkamp & M.P.F. Berger, A simple and fast item selection procedure
for adaptive testingRR-94-12 R.R. Meijer, Nonparametric and group-based person-fit statistics: A validity
study and an empirical exampleRR-94-10 W.J. van der Linden & M.A. Zwarts, Robustness of judgments in evaluation
researchRR-94-9 L.M.W. Alckermans, Monte Carlo estimation of the conditional Rasch model
RR-94-8 R.R. Meijer & K. Sijtsma, Detection of aberrant item score patterns: A review ofrecent developments
RR-94-7 W.J. van der Linden & R.M. Luecht, An optimization model for test assembly tomatch observed-score distributions
RR-94-6 W.J.J. Veerkamp & M.P.F. Berger, Some new item selection criteria for adaptivetesting
RR-94-5 R.R. Meijer, K. Sijtsma & I.W. Molenaar, Reliability estimation for singledichotomous items
1 9 3207 COPY AVAZA Dr
RR-94-4 M.P.F. Berger & W.J.J. Veerkamp, A review of selection methods for optimaldesign
RR-94-3 W.J. van der Linden, A conceptual analysis of standard setting in large-scaleassessments
RR-94-2 W.J. van der Linden & H.J. Vos, A compensatory approach to optimal selectionwith mastery scores
RR-94-1 R.R. Meijer, The influence of the presence of deviant item score patterns on thepower of a person-fit statistic
RR-93-1 P. Westers & H. Kelderman, Generalizations of the Solution-Error Response-Error Model
RR-91-1 H. Kelderman, Computing Maximum Likelihood Estimates of Loglinear Modelsfrom Marginal Sums with Special Attention to Loglinear Item Response Theory
RR-90-8 M.P.F. Berger & D.L. Knol, On the Assessment of Dimensionality in
Multidimensional Item Response Theory ModelsRR-90-7 E. Boekkooi-Timminga, A Method for Designing IRT-based Item BanksRR-90-6 J.J. Adema, The Construction of Weakly Parallel Tests by Mathematical
ProgrammingRR-90-5 J.J. Adema, A Revised Simplex Method for Test Construction ProblemsRR-90-4 J.J. Adema, Methods and Models for the Construction of Weakly Parallel TestsRR-90-2 H. Tobi, Item Response Theory at subject- and group-levelRR-90-1 P. Westers & H. Kelderman, Differential item functioning in multiple choice
items
Research Reports can be obtained at costs, Faculty of Educational Science and Technology,University of Twente, Mr. J.M.J. Nelissen, P.O. Box 217, 7500 AE Enschede, The Netherlands.
2 0
U.S. DEPARTMENT OF EDUCATIONorric of Educational A ch and improvement tOERA
Educational RYbOUICe Information Cantor (ERIC)
NOTICE,
REPRODUCTION BASIS
This document is covered by a signed "Reproduction Release(Blanket)" form (on file within the ERIC system), encompassing allor classes of documents from its source organization and, therefore,does not require a "Specific Document" Release form.
This document is Federally-funded, or carries its own permission toreproduce, or is otherwise in the public domain and, therefore, maybe reproduced by ERIC without a signed Reproduction Releaseform (either "Specific Document" or "Blanket").
PO.