Estimating Growth when Content Specifications Change: A Multidimensional IRT Approach Mark D. Reckase Tianli Li Michigan State University


Estimating Growth when Content Specifications Change: A Multidimensional IRT Approach

Mark D. Reckase
Tianli Li
Michigan State University


The Problem

State curriculum frameworks often change from one grade to the next, reflecting the addition of new instructional content. For example, algebra may be introduced as an instructional goal at grade 7, while at grade 6 algebra is not an important component of the curriculum.

Tests at the two grades reflect the instructional content, so the 6th grade test does not include algebra and the 7th grade test does.

How can the score scales of these tests be linked?


Research Questions

What do changes on the linked score scale mean, when the scale is produced using the usual unidimensional IRT models?

Can multidimensional IRT be used to form vertical scales? If so, how do the results compare to the unidimensional results?


The Approach

State testing data were analyzed using multidimensional IRT to develop a realistic model for the test data at two grade levels.

The results of the real data analyses were idealized to create the specifications for simulating the tests at two grade levels.

Data with known structure were simulated to determine how unidimensional and multidimensional procedures function.


The Simulated Data Design

Grade 6 – two major constructs: Arithmetic, Problem Solving

Grade 7 – three major constructs: Arithmetic, Problem Solving, Algebra


Simulated Test Structure

Test Level   Algebra   Arithmetic   Problem Solving   Total
Grade 6      0         17 (4)       23 (6)            40 (10)
Grade 7      11 (0)    11 (4)       18 (6)            40 (10)

Note: The numbers in parentheses are the common items between the two forms of the tests.


Mean Vectors at each Grade Level

Class Level   Algebra        Arithmetic   Problem Solving
Grade 6       -1.5 (-1.50)   .5 (.51)     -.2 (-.21)
Grade 7       0 (.03)        .7 (.73)     0 (.01)

Note: Values in parentheses are the observed means from the simulated data


Covariance Matrices

Covariance Matrix for Grade 6

                  Algebra     Arithmetic   Problem Solving
Algebra           .25 (.25)   0 (.00)      0 (.00)
Arithmetic        0 (.00)     .8 (.84)     .7 (.76)
Problem Solving   0 (.00)     .7 (.76)     1.2 (1.29)

Covariance Matrix for Grade 7

                  Algebra     Arithmetic   Problem Solving
Algebra           1 (1.05)    .4 (.42)     .6 (.64)
Arithmetic        .4 (.42)    .6 (.60)     .3 (.32)
Problem Solving   .6 (.64)    .3 (.32)     1 (1.02)

Note: Values in parentheses are estimated from the simulated data.
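The generating distributions can be illustrated with a short sampling sketch. It assumes the abilities are multivariate normal (the slides give only means and covariances), and it assumes the mean table reads column by column, so that arithmetic moves from .5 to .7 and problem solving from -.2 to 0. The function names, the seed, and the use of a hand-written Cholesky factor are illustrative choices, not part of the original study.

```python
import math
import random

# Generating values as read from the slides (order: algebra, arithmetic, problem solving).
# The arithmetic / problem-solving means are an assumed reading of a garbled table.
MEANS = {
    6: [-1.5, 0.5, -0.2],
    7: [0.0, 0.7, 0.0],
}
COVS = {
    6: [[0.25, 0.0, 0.0], [0.0, 0.8, 0.7], [0.0, 0.7, 1.2]],
    7: [[1.0, 0.4, 0.6], [0.4, 0.6, 0.3], [0.6, 0.3, 1.0]],
}


def cholesky(matrix):
    """Return lower-triangular L with L * L^T == matrix (positive definite input)."""
    n = len(matrix)
    lower = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(lower[i][k] * lower[j][k] for k in range(j))
            if i == j:
                lower[i][j] = math.sqrt(matrix[i][i] - s)
            else:
                lower[i][j] = (matrix[i][j] - s) / lower[j][j]
    return lower


def sample_thetas(grade, n, rng):
    """Draw n ability vectors from N(MEANS[grade], COVS[grade])."""
    lower = cholesky(COVS[grade])
    mean = MEANS[grade]
    draws = []
    for _ in range(n):
        z = [rng.gauss(0.0, 1.0) for _ in mean]
        draws.append([mean[i] + sum(lower[i][k] * z[k] for k in range(i + 1))
                      for i in range(len(mean))])
    return draws


rng = random.Random(2024)  # arbitrary seed
thetas = {grade: sample_thetas(grade, 2000, rng) for grade in (6, 7)}
sample_means = {g: [sum(row[d] for row in thetas[g]) / len(thetas[g]) for d in range(3)]
                for g in thetas}
```

With 2000 simulated students per grade, the sample means and covariances should land near the generating values, which is the pattern the parenthesized entries in the tables show.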


Orientation of Items

[Figure: item orientations plotted in three dimensions]


Effect Size Built into Data

Algebra   Arithmetic   Problem Solving
1.9       .26          .21
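The slides do not state how the effect size is defined. One reading that reproduces the algebra value is a standardized mean difference whose denominator is the root mean of the two grade-level variances. The worked equation below uses the observed algebra means (-1.50 and .03) and variances (.25 and 1.05); the formula itself is an assumption.

```latex
d = \frac{\bar{\theta}_{7} - \bar{\theta}_{6}}{\sqrt{\left(s_{6}^{2} + s_{7}^{2}\right)/2}}
  = \frac{0.03 - (-1.50)}{\sqrt{(0.25 + 1.05)/2}}
  = \frac{1.53}{0.806}
  \approx 1.90
```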


Unidimensional Basis for Comparison

Imagine that the full set of 70 items from both test levels is administered to the students at both grade levels.

The matrix of 2000 + 2000 students from the two grades by 70 items can be analyzed with the unidimensional models to serve as a basis for comparison for the vertical scaling result.

Analyze the matrix using the 2PL and Rasch models.
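A students-by-items matrix like the one described above has to be generated from some multidimensional item response model. The slides do not say which, so the sketch below assumes a compensatory logistic model, P(correct) = 1 / (1 + exp(-(a·θ + d))). The item parameters and the placeholder abilities are made up for illustration.

```python
import math
import random


def m2pl_probability(theta, a, d):
    """Compensatory logistic MIRT: P(correct) = 1 / (1 + exp(-(a . theta + d)))."""
    z = sum(a_k * t_k for a_k, t_k in zip(a, theta)) + d
    return 1.0 / (1.0 + math.exp(-z))


def simulate_responses(thetas, items, rng):
    """Return a persons-by-items matrix of 0/1 scores."""
    return [[1 if rng.random() < m2pl_probability(theta, a, d) else 0
             for (a, d) in items]
            for theta in thetas]


rng = random.Random(7)  # arbitrary seed

# Hypothetical items: (a-vector over algebra, arithmetic, problem solving; intercept d).
items = [
    ([1.2, 0.0, 0.0], 0.5),
    ([0.0, 1.0, 0.0], 0.0),
    ([0.0, 0.3, 0.9], -0.4),
]

# Placeholder abilities; a full simulation would draw them from the grade-level distributions.
thetas = [[rng.gauss(0.0, 1.0) for _ in range(3)] for _ in range(5)]
responses = simulate_responses(thetas, items, rng)
```

Scaling this up to 4000 ability vectors and 70 items would give a matrix of the size the slide describes.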


2PL Solution

[Figure: three-dimensional plot for the 2PL solution]


Rasch Model Solution

[Figure: three-dimensional plot for the Rasch model solution]


Vertical Scaling Analysis

Common-item concurrent calibration with BILOG-MG

Off-grade items coded as not reached

Both 2PL and Rasch models used for analysis

Determine the effect size of the difference in means between the two grade levels
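The data layout for a concurrent calibration can be sketched from the item counts on the Simulated Test Structure slide: 30 items unique to each grade plus 10 common items, 70 in all. The sketch below blanks out off-grade items for each student. The `NOT_REACHED` marker and the function name are hypothetical; the actual missing-data code expected by the calibration program is not given in the slides.

```python
import random

# Item counts derived from the simulated test structure: (construct, grades administered, count).
ITEM_BLOCKS = [
    ("arithmetic", {6}, 13), ("problem solving", {6}, 17),       # grade 6 only
    ("arithmetic", {6, 7}, 4), ("problem solving", {6, 7}, 6),   # common items
    ("algebra", {7}, 11), ("arithmetic", {7}, 7), ("problem solving", {7}, 12),  # grade 7 only
]
ITEMS = [(construct, grades) for construct, grades, count in ITEM_BLOCKS for _ in range(count)]

NOT_REACHED = None  # stand-in for whatever code the calibration program expects


def mask_off_grade(full_row, grade):
    """Keep responses to items on the student's own form; blank out the rest."""
    return [score if grade in grades else NOT_REACHED
            for score, (_, grades) in zip(full_row, ITEMS)]


rng = random.Random(1)  # arbitrary seed
full_row = [rng.randint(0, 1) for _ in ITEMS]  # placeholder scores on all 70 items
grade6_row = mask_off_grade(full_row, 6)
grade7_row = mask_off_grade(full_row, 7)
```

Each masked row keeps 40 scored items, and only the 10 common items are scored in both grades, so those items carry the entire link between the two scales.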


Vertically Scaled Effect Sizes

                    2PL Model    Rasch Model   2PL Model     Rasch Model
                    70 Items     70 Items      Concurrent    Concurrent
Mean (SD) Grade 6   -.54 (.78)   -.42 (.93)    -.22 (1.16)   -.14 (1.06)
Mean (SD) Grade 7   .56 (1.13)   .45 (1.15)    .26 (1.20)    .21 (1.38)
Effect Size         1.13         .83           .41           .28
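The effect-size row can be checked against the means and standard deviations in the table above. The sketch assumes the effect size is the grade 7 minus grade 6 mean divided by the square root of the average of the two variances; the slides do not state the formula, but this reading reproduces all four reported values to two decimals.

```python
import math


def effect_size(mean_lo, sd_lo, mean_hi, sd_hi):
    """Standardized mean difference using the root mean of the two variances (assumed formula)."""
    pooled_sd = math.sqrt((sd_lo ** 2 + sd_hi ** 2) / 2)
    return (mean_hi - mean_lo) / pooled_sd


# (grade 6 mean, grade 6 SD, grade 7 mean, grade 7 SD) as reported on the slide
SCALES = {
    "2PL, 70 items": (-0.54, 0.78, 0.56, 1.13),
    "Rasch, 70 items": (-0.42, 0.93, 0.45, 1.15),
    "2PL, concurrent": (-0.22, 1.16, 0.26, 1.20),
    "Rasch, concurrent": (-0.14, 1.06, 0.21, 1.38),
}

effect_sizes = {name: effect_size(*stats) for name, stats in SCALES.items()}
for name, d in effect_sizes.items():
    print(f"{name}: {d:.2f}")
```

The concurrent columns differ from the 70-item columns both in a smaller mean difference and in larger grade 6 standard deviations, and both changes pull the effect size down.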


Vertically Scaled Effect Sizes

Linked effect size is smaller than full data effect size.

Rasch effect size is less than 2PL effect size.

Full data set effect size is less than modeled effect size.


Alternative Linking Method

Common-item, separate calibration

Common item parameter relationship was poor

[Figure: scatter plot of common-item b-parameters, Grade x (horizontal axis) versus Grade x + 1 (vertical axis)]


MIRT Analysis

Full data analysis with TESTFACT

Three-dimensional analysis

Determine effect size for each dimension

Correlate each estimated θ with the generating θs to determine the meaning of the results.
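The last step, correlating estimated with generating abilities, amounts to a cross-correlation table with one row per generating dimension and one column per estimated dimension. A minimal sketch follows; the data are made up, and the second estimated column is deliberately the first generating dimension with its sign flipped, to show how a reversed scale appears as a correlation near -1.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def cross_correlations(true_cols, est_cols):
    """Rows index generating dimensions, columns index estimated dimensions."""
    return [[pearson(t, e) for e in est_cols] for t in true_cols]


# Made-up values for illustration only.
true_cols = [[-1.0, 0.0, 0.5, 2.0], [0.3, -0.2, 1.1, 0.4]]
est_cols = [[0.2, -0.1, 1.0, 0.5], [1.0, 0.0, -0.5, -2.0]]
table = cross_correlations(true_cols, est_cols)
```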


MIRT Effect Sizes

                    θ1           θ2           θ3
Mean (SD) Total     .01 (.95)    -.01 (.90)   .05 (.72)
Mean (SD) Grade 6   -.57 (.54)   .16 (.99)    .03 (.74)
Mean (SD) Grade 7   .60 (.90)    -.19 (.77)   .06 (.69)
Effect Size         1.56         -.40         .05


Correlation between True and Estimated Values

Est θ1 Est θ2 Est θ3

True θ1 .92 -.08 .02

True θ2 .47 .50 -.18

True θ3 .46 .80 -.03


Interpretation of MIRT Solution

Results are difficult to interpret because of the default procedures in TESTFACT.

Solution needs to be rotated to have axes align with content dimensions.

Current solution shows that θ1 is related to algebra and shows the big algebra effect.

θ2 is a combination of arithmetic and problem solving with the emphasis on problem solving. Most likely it has the sign of the a-parameters reversed.
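The slides leave open how the solution should be rotated. One option that is available in a simulation, where the generating abilities are known, is an orthogonal Procrustes rotation of the estimated abilities toward the generating ones. The sketch below is an illustration of that idea, not the authors' procedure; it uses NumPy, and the synthetic "estimates" are built by rotating and reflecting made-up targets.

```python
import numpy as np


def procrustes_rotation(estimated, target):
    """Orthogonal R minimizing ||estimated @ R - target||_F (reflections allowed)."""
    u, _, vt = np.linalg.svd(estimated.T @ target)
    return u @ vt


rng = np.random.default_rng(0)  # arbitrary seed
target = rng.normal(size=(500, 3))  # stand-in for generating abilities

angle = np.deg2rad(40)
mix = np.array([[np.cos(angle), -np.sin(angle), 0.0],
                [np.sin(angle), np.cos(angle), 0.0],
                [0.0, 0.0, -1.0]])  # rotation in two axes plus a reversed third axis
estimated = target @ mix + rng.normal(scale=0.1, size=(500, 3))  # stand-in for estimates

rotation = procrustes_rotation(estimated, target)
aligned = estimated @ rotation
```

Because reflections are allowed, a reversed dimension is corrected along with the rotation. An orthogonal transformation cannot change the correlations among the estimated dimensions, so if the estimates are close to uncorrelated while the constructs are correlated, an oblique transformation may be needed instead.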


Concurrent MIRT Analysis

Use concurrent calibration of data from the two grade levels.

Three-dimensional solution

No rotation

Determine effect sizes and correlations with true values.


Concurrent MIRT Calibration

                    θ1           θ2           θ3
Mean (SD) Total     .06 (.75)    -.09 (.57)   -.38 (1.01)
Mean (SD) Grade 6   -.02 (.87)   -.29 (.56)   .18 (.64)
Mean (SD) Grade 7   .14 (.59)    .10 (.50)    -.94 (.99)
Effect Size         .22          .74          -1.34


Concurrent MIRT Calibration

Est θ1 Est θ2 Est θ3

True θ1 .16 .57 -.87

True θ2 .54 .02 -.40

True θ3 .77 -.05 -.43


Concurrent MIRT Calibration

Scale on Dimension 3 is reversed and it has a large effect size (algebra).

Dimension 1 is most related to arithmetic and problem solving with a moderate effect size.

Dimension 2 is moderately related to algebra and has a large effect size.

The overall result gives a reasonable estimate of effects, but the dimensions need to be rotated to match the constructs.


Conclusions

Unidimensional linking of the two test levels underestimates the effect size.

The Rasch model gives a smaller effect size than the two-parameter logistic model.

The MIRT solution shows promise.

Need to determine how to rotate the solution to match the constructs.

TESTFACT has problems converging on estimates because of a mismatch between assumptions and reality.