JOURNAL OF LEARNING DISABILITIES
VOLUME 38, NUMBER 4, JULY/AUGUST 2005, PAGES 353–363
Curriculum-Based Measurement in the Content Areas:
Vocabulary Matching as an Indicator of Progress in Social Studies Learning
Christine A. Espin, Jongho Shin, and Todd W. Busch
Abstract
The purpose of this study was to examine the reliability and validity of curriculum-based measures as indicators of growth in content-area learning. Participants were 58 students in 2 seventh-grade social studies classes. CBM measures were student- and administrator-read vocabulary-matching probes. Criterion measures were performance on a knowledge test, the social studies subtest of the Iowa Tests of Basic Skills (ITBS), and student grades. Both the student- and examiner-read measures reflected change in performance; however, only the student-read measure revealed interindividual differences in growth rates. Significant relations were found between the growth rates generated by the student-read vocabulary measure and student course grades, ITBS scores, and growth on the knowledge test. These results support the validity of a vocabulary-matching measure as an indicator of student learning in the content areas. The results are discussed in terms of the use of CBM as a system for monitoring performance and evaluating interventions for students with learning disabilities in content-area classrooms.
One of the most important yet most difficult components of education is the measurement of change. By measuring change in performance, teachers can reliably evaluate student learning and the effects of instructional interventions on that learning. Yet despite its importance, change measurement is rarely the focus of educational assessment, where the measurement of performance at a single point in time is the dominant approach. In few other areas of education is this emphasis more prevalent than in the field of learning disabilities (LD), where the identification of students for services is often based on the discrepancy between two single scores: an intelligence score and an achievement score.
The lack of attention given to change measurement in education has been due in part to the difficulties associated with measuring change in performance, including a lack of statistical methods for handling multiple data points (Willett, 1989) and a lack of assessment tools available for producing repeated measures within short time periods (Francis, Shaywitz, Stuebing, Shaywitz, & Fletcher, 1994). With regard to statistical methods, recent developments have opened up new possibilities for incorporating students' change in performance as part of educational assessment (see Bryk & Raudenbush, 1987, 1992). Francis et al. (1994) illustrated the use of these advanced statistical procedures in the area of LD. These authors and others (D. Fuchs, Fuchs, McMaster, & Al Otaiba, 2003; L. S. Fuchs & Fuchs, 1998) have proposed that change measurement be involved in defining and diagnosing LD as well as in determining students' responses to interventions. With regard to the availability of assessment tools, there exists a measurement system specifically designed to measure change in student performance by producing repeated measures within short time periods. This system of measurement, referred to as curriculum-based measurement (CBM), has a strong body of research to support its validity and reliability.
Curriculum-Based Measurement
Curriculum-based measurement is an ongoing data collection system that is designed to provide teachers with information on student progress and on the effects of instructional interventions on that progress. The measures developed for use as a part of CBM are simple, efficient, easy to understand, and inexpensive, and allow for repeated measurement of student performance over time (Deno, 1985). More than 25 years of research have supported the validity and reliability of CBM measures as indicators of performance for elementary school students in the basic skill areas of reading, mathematics, spelling, and written expression. Correlations between CBM indicators and a variety of criterion measures generally range from .60 to .90, and test-retest and alternate-form reliabilities are generally above .80 (see Marston, 1989, for a review). The treatment validity of CBM measures at the elementary school level has also been supported. When teachers use CBM measures to evaluate and modify their instruction, student achievement improves (L. S. Fuchs, Deno, & Mirkin, 1984; L. S. Fuchs, Fuchs, & Hamlett, 1989a, 1989b, 1989c; L. S. Fuchs, Fuchs, Hamlett, & Allinder, 1991; L. S. Fuchs, Fuchs, Hamlett, & Ferguson, 1992; L. S. Fuchs, Fuchs, Hamlett, & Stecker, 1990; Stecker & Fuchs, 2000; Wesson et al., 1988). Most recently, CBM has been combined with statistical techniques such as hierarchical linear modeling (HLM) to generate student growth curves and to use these growth curves to answer questions about the relation between student progress and instructional variables (Compton, 2000; Shin, Deno, & Espin, 2000).
The initial work in the area of CBM was conducted at the elementary school level; later, this work was extended to the secondary school level (see Espin & Tindal, 1998, for a review). With that extension came an interest in the development of CBM measures in content areas such as social studies and science.
CBM in the Content Areas
The initial research on the development of curriculum-based measurement in the content areas was conducted by Tindal and Nolet (see Nolet & Tindal, 1993, 1994, 1995; Tindal & Nolet, 1995, 1996). Tindal and Nolet identified the critical thinking skills (e.g., explanation of concepts, illustration of facts) needed to understand and use content-area information and created measures to represent these thinking skills. The measures were appropriate for determining student learning within a given unit of study, but they were less appropriate for showing growth across study units (see Tindal & Nolet, 1995).
Espin and Deno (1993a, 1993b, 1994–1995) and Espin and Foegen (1996) took a somewhat different approach and focused on the identification of a single indicator that would represent general performance in the content areas. The first step in this research was to establish the reliability and validity of a single indicator for predicting student performance on content-area tasks. Espin and Deno (1993b, 1994–1995) examined the validity of two measures as potential indicators of performance on content-area study tasks in English and science. Tenth-grade participants read aloud for 1 minute from English and science textbook passages. In addition, participants were given 10 minutes to complete a vocabulary-matching task with terms selected from each passage. The criterion task was a study task in which students searched through the text for answers to comprehension questions. Correlations between the predictor and criterion measures were in the low moderate to moderate range (r = .37–.44). Correlations were similar for vocabulary matching and for reading aloud from text, but in a regression analysis, vocabulary matching accounted for the largest proportion of variance in the criterion task, with reading aloud not adding to the variance.
In a follow-up study, Espin and Foegen (1996) compared the validity of three CBM measures for predicting student performance on three criterion tasks. The CBM measures were reading aloud from text, vocabulary matching, and maze selection. The maze selection measure was included in this study because it was known to be a good predictor of reading performance at the elementary school level (Espin, Deno, Maruyama, & Cohen, 1989; L. S. Fuchs, Fuchs, & Maxwell, 1988). The criterion measures in the study were representative of the tasks required of students in content-area classes and included comprehension, acquisition, and retention of content information. Participants in the study were 186 middle school students. The results revealed moderate to moderately strong correlations between the CBM and criterion tasks (r = .52–.65). Once again, in a regression analysis, vocabulary matching accounted for the greatest proportion of variance in the criterion tasks, with neither of the other measures contributing substantially to the variance.
Although the results of Espin and Deno's (1993b, 1994–1995) and Espin and Foegen's (1996) studies suggested that vocabulary matching was a valid indicator of content-area performance, neither study had been conducted in an actual content-area classroom. In our research, we wished to extend this early work to examine vocabulary matching in a middle school social studies classroom. We conducted two related studies. In the first, we examined the technical adequacy of a vocabulary-matching measure as an indicator of performance in social studies; in the second, we examined the technical adequacy of the same vocabulary-matching measure as an indicator of progress in social studies.
In this article, we report the results of the progress study. We begin, however, by summarizing the results of the performance study (see Espin, Busch, Shin, & Kruschwitz, 2001, for details). Participants in the performance study were 58 seventh-grade students from two social studies classrooms. Based on the results of previous research, only the vocabulary-matching measure was included in this study. In order to examine the role of reading in the prediction of performance, two versions of the vocabulary-matching measure were compared: a student-read version, in which students read words and definitions to themselves, and an examiner-read version, in which the examiner read the words and definitions to the students. Criterion measures were a researcher-made pre- and post-knowledge test, social studies grades, and scores on the social studies subtest of a standardized achievement test.
The results of the performance study revealed that alternate-form reliabilities for the vocabulary-matching measures ranged from .58 to .87, with a mean reliability of .70 for student- and administrator-read forms. Reliability increased to .84 and .78 for student- and administrator-read probes, respectively, when scores were combined across two probes. Analysis with respect to the validity of the measures lent support to the use of both measures as indicators of student performance in social studies. Correlations between vocabulary matching and the knowledge and standardized achievement tests ranged from .56 to .84. Correlations with class grades were lower, ranging from .27 to .51, in part due to the restricted range of scores for course grades (most students earned grades of C to A); however, the correlation between the student-read probe and course grades was moderately strong (r = .51). Finally, although the sample of students with LD was small, performance on the vocabulary-matching probe differentiated students with and without LD.
Purpose and Research Questions
The results of Espin et al. (2001) confirmed the validity of vocabulary matching as an indicator of performance. This result is not surprising, given the literature on the importance of vocabulary knowledge for both reading comprehension and content-area learning (e.g., Baumann & Kameenui, 1991; Beck & McKeown, 1991; Blachowicz, 1991; Blachowicz & Fisher, 2000; Konopak, 1989; Nagy & Scott, 2000; Scruggs, Mastropieri, & Boon, 1998). However, the 2001 study did not address a key question: Would vocabulary matching prove to be a reliable and valid indicator of student progress? In other words, would the growth trajectories produced by repeated measurement on alternate forms of the vocabulary-matching probes adequately model student growth and learning in social studies? A review of the literature reveals that the answer to this question is not obvious; that is, despite the recognition of the importance of vocabulary for reading comprehension and content-area learning, the way in which vocabulary terms are learned and the relation between such learning and the comprehension and acquisition of text material is not clear (see Baumann & Kameenui, 1991; Beck & McKeown, 1991; Blachowicz & Fisher, 2000; Nagy & Scott, 2000). Thus, in this second study, we wished to explore the question of whether student performance on the vocabulary measures would change, and whether this change would occur concomitantly with learning.
We addressed the following two research questions in the study:

1. What is the validity of vocabulary matching as an indicator of progress (i.e., learning) in a social studies class?
2. Does the validity of vocabulary matching differ as a function of the administration format (i.e., student vs. administrator read)?
To address these questions, three issues were examined: (a) the sensitivity of the vocabulary-matching measures to improvement in student performance over time; (b) the sensitivity of the vocabulary-matching measures to interindividual differences in growth rates; and (c) the validity of the growth rates generated by the vocabulary-matching measures with respect to course grades, performance on a standardized achievement test, and improvement on a content knowledge test.
Method
Participants
Participants in this study were 58 seventh-grade students (32 boys and 26 girls; mean age = 13.6 years). These students had also participated in the earlier study on vocabulary matching as a performance measure (Espin et al., 2001). Students were recruited from two social studies classes in a suburban school in the Midwest. The majority of participants were European American (95%), with a small percentage of students who were African American or Asian American (5%); 28% of the school population received free or reduced-price lunches.
Five of the participants were identified as having LD according to district standards (4 boys and 1 girl; mean age = 13.4 years). The identification standards for LD included a history of underachievement, a discrepancy between ability and achievement, and an information-processing deficit. All five students were receiving services in reading and written expression, and one was receiving additional services in mathematics. Mean percentile scores for the students with LD on the Iowa Tests of Basic Skills, Form K, Level 12 (ITBS; Hoover, Hieronymus, Frisbie, & Dunbar, 1993) were as follows: 30.4 (range = 3–46) on the Reading Vocabulary subtest, 26.8 (range = 2–49) on the Reading Total subtest, and 13.2 (range = 1–20) on the Social Studies subtest.
Procedure
During the winter and spring quarters, students were tested weekly with two types of vocabulary-matching probes: student read and administrator read. The student-read version of the probe consisted of 22 vocabulary terms, including two distractors, and 20 definitions. Terms and definitions were chosen at random from a master list of 146 terms created from the social studies textbook and the teacher's lecture notes. Definitions were modified if necessary so that each definition would have fewer than 15 words. Vocabulary terms appeared on the left side of the page and were arranged alphabetically to help students easily locate terms. Definitions appeared on the right side of the page. Students were given 5 minutes to read the terms and definitions and to match each term with its definition.
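As an illustration of this sampling scheme, the sketch below builds one such probe from a master list. The helper name, list contents, and seed are hypothetical, not the authors' materials; it simply mirrors the construction described above.

```python
import random

def make_probe(master_list, n_items=20, n_distractors=2, seed=None):
    """Build one vocabulary-matching probe: n_items term-definition
    pairs plus n_distractors extra terms with no matching definition."""
    rng = random.Random(seed)
    drawn = rng.sample(master_list, n_items + n_distractors)
    matched = drawn[:n_items]
    # All drawn terms (including the distractors) are alphabetized on
    # the left side of the page so students can locate them easily.
    terms = sorted(term for term, _ in drawn)
    # Only the matched items' definitions appear, in shuffled order.
    definitions = [definition for _, definition in matched]
    rng.shuffle(definitions)
    return terms, definitions

# Hypothetical stand-in for the study's 146-term master list.
master = [(f"term{i:03d}", f"definition of term {i}") for i in range(146)]
terms, definitions = make_probe(master, seed=7)
```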
The administrator-read version of the vocabulary probe was developed from the same set of terms and definitions as the student-read version. On the administrator-read version, only the vocabulary terms were given. The test administrator read the definitions, and the students identified which terms matched the definition being read. Definitions were read one at a time, with 15-second intervals between items. Each probe lasted a total of 5 minutes.
Vocabulary-matching probes were administered weekly by the third author. Students were given the two types of probes consecutively each week. To control for order effects, the order in which the probes were given was alternated each week. The number of correct matches on each probe was tallied and used in data analysis. In total, each individual completed 11 administrator-read and 11 student-read probes.
In addition to the vocabulary-matching probes, students were administered a knowledge test at pre- and posttest to measure the amount learned during the study. The knowledge test was composed of 36 questions in the areas of sociology, psychology, and geography. Questions were developed on the basis of textbook content, the teacher's lecture notes, and teacher-made worksheets and tests. The social studies teacher was asked to review all items on the knowledge test to ensure that the items matched the information presented to the students in class.
Items were classified into two types of questions: applied (27 items) and factual (9 items). Applied questions were those in which students were asked to apply social concepts and principles to specific social events or phenomena. Factual questions were those in which students were asked to make simple one-to-one relations between concepts and events (see Table 1 for examples of these two types of questions). A heavier emphasis was placed on the applied questions to ensure that the relation between the vocabulary-matching tasks and the knowledge test would not be solely a function of the similarity in the task requirements.
requirements.Following the development of the
knowledge test items, the items were
given to a graduate student in special
education, who was not involved inthe study, and to the social studies
teacher involved in the study. The
graduate student and social studies
teacher were asked to classify eachitem as either applied or factual. Inter-
rater agreement between each rater
and the third author was calculated by
dividing agreements by agreementsplus disagreements. Interrater agree-
ment was .95 and .89 for the graduate
student and the social studies teacher,
respectively. Items that were not cor-
rectly classified by the graduate stu-
dent or the social studies teacher were
modified. Students were given theknowledge test at the beginning and
end of the study. The number of correct
answers was used for data analysis.Students social studies grades
and their scores on the ITBS were also collected. Social studies grades represented students' mean grade in the class across three grading periods. Letter grades were assigned to each student. For our purposes, the letter grades were converted to numeric values on a 13-point scale, with 13 representing an A+, 12 an A, 11 an A-, 10 a B+, and so on. A score of F was assigned a 0. If students failed a class, they were able to retake it in a 4-week makeup session. If students passed this makeup session, they were assigned a grade of P. These passing grades were assigned a value of 1. Course grades were based equally on homework, quizzes, unit tests, and current events reporting.
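As a concrete illustration of this conversion, the sketch below encodes the scale in Python. The values between A- and D- are interpolated from the "and so on" above and should be read as an assumption; F = 0 and P = 1 come directly from the text.

```python
# 13-point scale: A+ = 13, A = 12, A- = 11 are stated in the text;
# the values down to D- = 2 are interpolated (assumption);
# F = 0 and P = 1 are stated.
GRADE_POINTS = {"A+": 13, "A": 12, "A-": 11,
                "B+": 10, "B": 9, "B-": 8,
                "C+": 7, "C": 6, "C-": 5,
                "D+": 4, "D": 3, "D-": 2,
                "P": 1, "F": 0}

def mean_course_grade(letter_grades):
    """Mean numeric grade across grading periods (three in this study)."""
    return sum(GRADE_POINTS[g] for g in letter_grades) / len(letter_grades)

print(mean_course_grade(["B+", "A-", "B"]))  # (10 + 11 + 9) / 3 = 10.0
```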
Scores on the ITBS were obtained from students' school records. Students completed the ITBS the spring prior to the beginning of the study. Form K, Level 12 of the ITBS was administered. The Social Studies subtest of the ITBS consists of 42 questions covering history, geography, economics, political science, sociology/anthropology, and related social sciences (e.g., ethics, human values). Standard scores were used for all analyses. The internal consistency of the ITBS, as reported in the technical manual, ranges from .61 to .93. Salvia and Ysseldyke (1998) reported that the items of the ITBS were reviewed for content fit and item bias by field experts and then tested on a large sample across the United States. The results of this testing were used for final item selection.
Statistical Analysis
Hierarchical linear models (HLM) were used to address three issues with respect to the vocabulary-matching measures: (a) sensitivity to improvement of student performance over time, (b) sensitivity to interindividual differences in growth rates, and (c) validity of the growth rates produced by the measures with respect to the criterion measures.
TABLE 1
Examples of Applied and Factual Questions on the Knowledge Test

Applied example:
José comes from a working-class home. He married Judy, who is very wealthy, and moved into an upper-class neighborhood. José's change in status is an example of
a. mobility
b. sanctions
c. mores
d. primary group

Factual example:
The process by which a member learns the rules of his or her group is called
a. socialization
b. community
c. role play
d. mobility
To address the first two issues, unconditional HLM models were employed to examine the sensitivity of the vocabulary probes for measuring student growth over time and for revealing interindividual differences in growth rates. In these analyses, the significance of the mean growth rates and the growth parameter variances estimated by each type of vocabulary measure was statistically investigated. To address the third issue, course grades, scores on the Social Studies subtest of the ITBS, and performance change on the content knowledge test were used as Level 2 variables in HLM to examine the validity of the growth measures estimated on the vocabulary probes. In this analysis, the relations between the growth rates estimated on the vocabulary probes and the criterion measures were statistically examined.
Results
Sensitivity to Improvement Over Time and Individual Differences
The first step in our analysis was to determine whether the student- and administrator-read vocabulary-matching measures were sensitive to improvement over time, and whether they revealed interindividual differences in growth rates. Descriptive statistics for the repeated measures of students' performance on the student- and administrator-read probes are displayed in Table 2. Observed mean scores on both types of vocabulary probes increased over time. Moreover, interindividual differences in student performance increased over time on the student-read probes, as evidenced by the increase in standard deviations.
The improvement in performance scores and the interindividual differences in growth rates were further examined using hierarchical linear models (Bryk & Raudenbush, 1987, 1992). Specifically, the statistical significance of the mean growth rate and of the growth parameter variance estimated by each type of vocabulary measure was tested. The statistical test of the significance of the mean growth rate addressed the question of whether the growth rate for the entire group of students was statistically different from a null growth rate (i.e., a growth rate of zero). The statistical test of the significance of the growth parameter variance addressed the question of whether individual students differed in their rates of growth over time.
We hypothesized that, as a group, the students would improve significantly in social studies knowledge over the school year due to the teacher's instruction. Moreover, we hypothesized that students would not share the same growth rates, because of individual characteristics (e.g., intelligence, prior knowledge, and motivation to learn). To test these hypotheses, the following unconditional models were used in this study:
Yti = π0i + π1i(Week)ti + eti
(within-individual model) and

π0i = β00 + u0i, π1i = β10 + u1i
(between-individual models),
where Yti is the observed score for student i at time t, π0i the intercept of the growth line for student i, π1i the weekly growth rate for student i, β00 the mean intercept for the entire group of students, β10 the mean growth rate for the entire group of students, eti the error related to student i, and u0i and u1i the random errors related to the mean intercept and growth rate, respectively. In the analysis, the intercept was centered at the first occasion of data collection; therefore, it showed individual differences in the students' vocabulary knowledge at the beginning of the study.

The statistical test of the significance of the mean growth rates revealed that the mean growth rates estimated by both the student- and administrator-read vocabulary probes were statistically different from a null growth rate; that is to say, both were sensitive for detecting significant improvement in students' performance over time (see Table 3). The mean growth rate estimated by the student-read vocabulary probe showed an increase of .65 correct matches per week, whereas the mean growth rate estimated by the administrator-read probe showed an increase of .22 correct matches per week.
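For readers who want to try this kind of analysis, the following sketch fits a random-intercept, random-slope model of the same form to simulated weekly probe data using Python's statsmodels. The original analysis used HLM software, not Python, and all simulation parameters and column names here are invented.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(0)
n_students, n_weeks = 58, 11

# Simulate weekly probe scores with person-specific intercepts and slopes.
rows = []
for i in range(n_students):
    intercept = rng.normal(5.2, 3.0)   # starting level, varies by student
    slope = rng.normal(0.65, 0.35)     # weekly growth rate, varies by student
    for week in range(n_weeks):        # week 0 = first occasion (centering)
        rows.append({"student": i, "week": week,
                     "score": intercept + slope * week + rng.normal(0, 2.0)})
data = pd.DataFrame(rows)

# Unconditional growth model: the fixed effects correspond to beta00 and
# beta10 in the text; the random effects correspond to u0i and u1i.
model = smf.mixedlm("score ~ week", data, groups=data["student"],
                    re_formula="~week")
print(model.fit().summary())
```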
The statistical test of the significance of the growth parameter variance revealed that the growth parameter variance estimated by the student-read vocabulary probe was statistically different from no variability in students' growth rates (see Table 4); that is, there were individual differences in growth rates on the student-read measure. In contrast, the growth parameter variance estimated by the administrator-read probe was not statistically different from no variability, indicating that all students shared the same growth rate (i.e., the mean growth rate). Thus, the results of this analysis revealed that only the student-read vocabulary-matching probe was sensitive enough to reveal interindividual differences in growth rates among students. Given this finding, only the student-read probe was entered into subsequent analyses (see Bryk & Raudenbush, 1987, 1992).

Regarding performance at the beginning of the study, the mean intercepts were statistically significant for both types of probes (see Table 3). The mean intercept for the administrator-read probe, however, was slightly higher than that for the student-read probe. Both types of probes also sensitively reflected the existence of interindividual differences in vocabulary knowledge at the beginning of the study (see Table 4).
Validity of Growth Rates
The validity of the growth rates estimated by the student-read vocabulary-matching measure was examined by investigating the relations between the growth rates generated by the vocabulary-matching measure and the residualized gain scores on the content knowledge test, course grades in social studies, and scores on the Social Studies subtest of the ITBS. Means and standard deviations for the criterion measures were as follows: knowledge pretest, M = 20.27, SD = 5.07; knowledge posttest, M = 24.86, SD = 5.62; ITBS, M = 218.72, SD = 32.27; social studies grades, M = 9.38, SD = 3.24.

The three criterion measures were separately included in the between-individual model as a Level 2 predictor because the main interest of our analysis was in examining the direct relation between growth rates on the student-read vocabulary probe and each criterion measure, not the relative contribution of the criterion measures to predicting the students' growth rates. The three between-individual models used in the analysis were as follows:
π1i = β10 + β11(GainScore)i + u1i,
π1i = β10 + β11(CourseGrade)i + u1i, and
π1i = β10 + β11(ITBS)i + u1i,
where π1i is the linear growth rate for student i, β10 the mean growth rate for the entire group of students, β11 the regression coefficient showing the relation between growth rates on the student-read vocabulary probe and the corresponding criterion measure, and u1i the random error related to the mean growth rate.
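As an illustration of the mechanics, the residualized gain score and one such Level 2 model can be sketched as follows. The data are simulated and the effect size is invented; the point is only to show how β11 can be estimated as a cross-level (week-by-criterion) interaction.

```python
import numpy as np
import pandas as pd
import statsmodels.api as sm
import statsmodels.formula.api as smf

rng = np.random.default_rng(1)
n = 58
crit = pd.DataFrame({"student": np.arange(n),
                     "pretest": rng.normal(20.3, 5.1, n),
                     "posttest": rng.normal(24.9, 5.6, n)})
# Residualized gain: the residual from regressing posttest on pretest.
crit["gain"] = sm.OLS(crit["posttest"],
                      sm.add_constant(crit["pretest"])).fit().resid

# Simulate probes so that growth rates covary with the gain score.
slopes = 0.65 + 0.02 * crit["gain"].to_numpy() + rng.normal(0, 0.2, n)
rows = [{"student": i, "week": w,
         "score": 5.0 + slopes[i] * w + rng.normal(0, 2.0)}
        for i in range(n) for w in range(11)]
probes = pd.DataFrame(rows).merge(crit[["student", "gain"]], on="student")

# The week:gain coefficient plays the role of beta11 -- the relation
# between the criterion measure and the weekly growth rate.
fit = smf.mixedlm("score ~ week * gain", probes,
                  groups=probes["student"], re_formula="~week").fit()
print(fit.params["week:gain"])
```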
Prior to examining the relations between growth rates and criterion measures, the reliability of the growth rate parameter was explored. This was done to ensure that the relations between growth rates and criterion measures could be examined reliably. The reliability of the growth parameter in HLM is defined as the proportion of total observed variance in the parameter estimates that is attributable to true parameter variance. Low parameter reliability (e.g., less than .30) indicates that estimates of the growth parameter are unstable and that their relations to other variables cannot be examined in a dependable way.
TABLE 3
Sensitivity of Student- and Administrator-Read Vocabulary Probes for Revealing Growth Over Time (Fixed-Effect Model)

Probe/effect Coefficient Standard error t p
Student-read
Intercept (β00) 5.16 .46 11.23 .00
Mean growth (β10) .65 .06 10.70 .00
Administrator-read
Intercept (β00) 7.98 .46 17.24 .00
Mean growth (β10) .22 .04 5.86 .00
TABLE 4
Sensitivity of Student- and Administrator-Read Vocabulary Probes for Revealing Interindividual Differences in Growth Rates (Random-Effect Model)

Probe/effect Variance χ2 p
Student-read
Intercept (u0i) 9.07 222.01 .00
Mean growth (u1i) .12 133.06 .00
Administrator-read
Intercept (u0i) 9.84 276.97 .00
Mean growth (u1i) .01 59.00 .37

Note. Chi-square df = 56, N = 57.
TABLE 2
Means and Standard Deviations for Student- and Administrator-Read
Vocabulary-Matching Probes
Student-read Administrator-read
Probe n M SD M SD
1 53 5.23 3.41 8.41 3.83
2 54 6.02 3.48 8.75 4.33
3 53 7.11 5.35 6.25 3.61
4 44 6.00 4.09 10.80 4.52
5 48 7.04 4.13 8.62 4.23
6 53 9.83 5.24 10.64 4.86
7 49 9.35 4.99 8.62 4.87
8 50 8.84 5.26 8.88 4.51
9 50 9.78 5.14 10.24 4.41
10 52 14.10 5.38 9.47 3.80
11 51 9.71 5.67 10.88 4.20
The reliability of the growth rate parameter was .52 in the null model, indicating that 52% of the total growth rate parameter variance estimated by the student-read vocabulary probe could be attributed to true parameter variance (see Bryk & Raudenbush, 1987, 1992). This result suggests that the relations between growth rates and criterion measures could be examined reliably in this study.
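The article does not print the formula, but in the Bryk and Raudenbush (1987, 1992) framework the reliability of the estimated slope for student i is conventionally the ratio of true parameter variance to total observed variance (true plus sampling variance); take the following as that standard definition rather than the authors' exact estimator:

\[
\text{reliability}(\hat{\pi}_{1i}) = \frac{\tau_{11}}{\tau_{11} + V_{1i}},
\]

where τ11 is the variance of the true growth rates π1i and V1i is the sampling variance of student i's estimated slope. The reported .52 would typically be the average of these ratios across students, meaning roughly half of the observed slope variance reflects true interindividual differences.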
The results of the validity analyses revealed that the growth rates estimated by the student-read vocabulary probe were significantly related to residualized gain scores on the knowledge test, to students' course grades in social studies, and to ITBS Social Studies subtest scores (see Table 5). In other words, students who had larger gain scores on the knowledge test, higher course grades, and higher test scores on the ITBS also had higher growth rates on the student-read vocabulary probe. These results support the validity of the student-read vocabulary measure as an indicator of growth in learning.
Discussion
The results of this study indicate that only the student-read version of the vocabulary-matching probe produced growth trajectories that were valid and reliable predictors of student performance in social studies. Both the student- and administrator-read versions of the vocabulary-matching probes produced significant group growth rates, although the student-read measure revealed greater growth over time. However, only the student-read vocabulary-matching measure was sensitive to interindividual differences in growth rates. Because we can assume that not all students participating in the study had identical growth rates, our findings imply that only the student-read version is sufficiently sensitive to individual growth over time.

Examination of Table 2 may help
to explain the contrast between the two measures in terms of their sensitivity to interindividual differences in growth rates. As illustrated in Table 2, standard deviations for the student-read probes tended to increase gradually across the duration of the study, whereas standard deviations for the administrator-read probes did not. If the vocabulary-matching measures were sensitive to individual changes in performance, one would expect the standard deviations for the measures to increase over the course of the year, as some students learn more whereas others learn less. The restricted variability in scores for the administrator-read probes most likely served to artificially restrict the variability in the slopes for the administrator-read scores, leading to a lack of sensitivity to interindividual differences. In other words, reading the probes aloud to the students produced less individual variation in performance as the year progressed.

Conceptually, our results imply
that reading is an important factor in the measurement of student performance and progress in the content areas. Based on previous research (Espin et al., 2001), we had expected no differences in the validity of the growth trajectories created by the two types of vocabulary-matching measures. However, the vocabulary-matching task that incorporated reading was more sensitive to overall learning than the measure that removed reading as a factor. Recall, however, that we conducted this study in the classroom of only one teacher. It is possible that reading was a more important factor in this teacher's classroom than it would be in other classrooms. It will be important in future research to replicate these findings across different teachers and students and to directly examine the role of reading.
Once we had established that the student-read measure was sensitive both to group growth and to individual differences in growth over time, we examined the reliability and validity of the growth trajectories created by the student-read measure. The results revealed that the growth trajectories created by the student-read measure were both reliable and valid. The growth trajectories were stable and proved to be significantly related to growth and performance on other criterion measures, including gain scores on the knowledge test, course grades, and scores on the ITBS. In other words, students who demonstrated greater growth on the student-read vocabulary-matching measure also showed more growth on the knowledge test, had higher course grades, and had higher scores on the ITBS. This is the pattern of relations we would expect if the vocabulary-matching measures were valid measures of performance and progress.
Examples of Progress Monitoring
Although our research supports the technical adequacy of vocabulary matching as an indicator of performance and progress in the content areas, it does not address the treatment validity (Messick, 1994) of the measure; that is, our research does not address the effect that the use of progress monitoring might have on teacher instruction and student performance.
TABLE 5
Relation Between Growth Rates on Student-Read Vocabulary Probes and Criterion Measures

Criterion measure Coefficient Standard error t p
Knowledge test gain score .010 .005 2.00 .05
Course grades .053 .017 3.22 .00
ITBS Social Studies scores .002 .001 2.34 .02

Note. ITBS = Iowa Tests of Basic Skills (Hoover, Hieronymus, Frisbie, & Dunbar, 1993). df = 53 for all t tests.
In this section, we illustrate the ways in
which CBM measures could be used in
the content areas to aid special educa-
tion teachers in their decision making.
Research is needed to address theeffects of such implementation on
teacher instruction and student perfor-
mance, especially for students with LD
who spend a large portion of their
school day in general education classes
in the content areas (Lovitt, Plavins, & Cushing, 1999; Wagner, 1990).
At the beginning of the school year, the special and general education teachers would administer vocabulary-matching probes to all students in the classroom. These data would be used by the general education teacher to identify students who might experience difficulty in the class, and by the special education teacher to evaluate the appropriateness of the class for his or her students. Following this initial assessment, students who are identified as at risk for difficulty in class would be monitored by the special or general education teacher to evaluate student learning in the content class.
One way to evaluate the learning of students with disabilities would be to compare it to the learning of their peers without disabilities. For example, in Figures 1 through 3, the scores for three students with LD from our study are graphed with the mean score for all students without LD. Slope lines indicating rates of progress are displayed for each student and for the class mean. The data from our study indicate that Student 1 is performing successfully in this social studies class; as Figure 1 shows, the level and rate of performance for Student 1 are commensurate with those of nondisabled peers. Student 2 is also performing successfully; although Student 2's level of performance is below that of nondisabled peers, the rate of growth is equal to that of nondisabled peers (see Figure 2). Student 3, in contrast, is not performing successfully in this social studies class; both the level and rate of performance for this student are substantially below those of the class mean and, for that matter, below those of other peers with LD. The graph in Figure 3 indicates a need for additional accommodations or modifications for Student 3. If such accommodations or modifications do not result in improved growth, an alternative placement could be considered.
FIGURE 1. Progress graph and trendline for student with LD: Level and growth rate of performance commensurate with that of peers.

FIGURE 2. Progress graph and trendline for student with LD: Level below that of peers, but growth rate of performance commensurate with that of peers.

FIGURE 3. Progress graph and trendline for student with LD: Level and growth rate of performance below that of peers.

(In each figure, the vertical axis shows the number of correct matches.)
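The trendlines in these figures are simply least-squares slopes fit to each data series. The sketch below mimics such a graph, using the Table 2 student-read means as the class series and an invented score series for the monitored student; np.polyfit is one simple slope estimator, not necessarily what was used for the published figures.

```python
import numpy as np
import matplotlib.pyplot as plt

weeks = np.arange(1, 12)
# Class series: student-read means from Table 2; student series: invented.
class_mean = np.array([5.23, 6.02, 7.11, 6.00, 7.04,
                       9.83, 9.35, 8.84, 9.78, 14.10, 9.71])
student = np.array([2, 3, 3, 2, 4, 3, 5, 4, 4, 6, 5])

for label, y in (("Class mean", class_mean),
                 ("Student with LD", student)):
    slope, intercept = np.polyfit(weeks, y, 1)  # least-squares trendline
    plt.scatter(weeks, y)
    plt.plot(weeks, intercept + slope * weeks,
             label=f"{label}: {slope:.2f} matches/week")

plt.xlabel("Week")
plt.ylabel("Number of Correct Matches")  # axis label from Figures 1-3
plt.legend()
plt.show()
```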
This example illustrates how content-area and special education teachers can use progress monitoring data to make instructional decisions regarding students with LD in content-area classes. Progress monitoring provides a data source different from, but complementary to, the typical evaluation based on course grades. Course grades often reflect factors other than learning, such as attendance, behavior, and homework completion (e.g., Miller, Leinhardt, & Zigmond, 1988), and the meaning of course grades is sometimes unclear, especially when there have been modifications in the grading system (Olson, 1989; Rojewski, Pollard, & Meers, 1991). Progress monitoring, on the other hand, focuses solely on learning and answers the question of whether students' good behavior, hard work, and homework completion, and teachers' accommodations and modifications, are having positive effects on student learning.
Conclusion
In conclusion, our results support the use of a student-read vocabulary-matching probe as an indicator of student learning in social studies. Taken in combination with the earlier results of Espin et al. (2001), our results indicate that a simple vocabulary-matching measure can be used as an indicator of student performance and progress over time in social studies. This measure can be administered to students in groups, takes only 5 minutes to administer, and can be scored relatively quickly.
On a more general level, our results provide further support for the use of CBM measures as measures of change. As indicated by Francis et al. (1994), L. S. Fuchs and Fuchs (1998), and D. Fuchs et al. (2003), such measures have potential for use in decision making for students with LD. That is, the measures can be used not only to determine to what extent students are discrepant from their peers at a single time point, but also to examine to what extent students are progressing relative to their peers. Students who are discrepant both in performance and in progress would be those most in need of intensive interventions.
Our study is only a first step in the research on the implementation of progress monitoring procedures in content-area classes. Several questions remain, including (a) Will special and content-area teachers be willing to implement and rely on progress monitoring data? (b) Can progress monitoring data serve as a conduit for communication and collaboration between general and special education teachers? and (c) Will the implementation of progress monitoring procedures result in improved achievement for students at risk and students with learning disabilities?
ABOUT THE AUTHORS
Christine A. Espin, PhD, is a professor in the Department of Educational Psychology at the University of Minnesota. Her research focuses on the development of progress-monitoring procedures in reading, written expression, and content-area learning for secondary school students with learning disabilities. Jongho Shin, PhD, is an assistant professor in the Department of Education at Seoul National University. His current research interests include reading comprehension, learning strategies, and motivation. Todd W. Busch, PhD, is an assistant professor of special populations at Minnesota State University, Mankato. His current interests include teacher training, progress monitoring for secondary-level students, and reading comprehension. Address: Jongho Shin, Department of Education, Seoul National University, Shinrim-Dong Kwanak-Gu, Seoul 151-748, Korea; e-mail: [email protected]
AUTHORS' NOTES
1. The research reported here was funded in part by the Guy Bond Foundation, University of Minnesota.
2. We wish to thank the teachers, administrators, and students of the Maplewood schools for their participation in the study. We wish to thank Dana Frederick for assistance in data coding and Ron Kruschwitz for help with the data collection. Finally, we wish to acknowledge the Netherlands Institute for Advanced Study in the Humanities and Social Sciences for its support in the preparation of this article.
REFERENCES
Baumann, J. F., & Kameenui, E. J. (1991). Research on vocabulary instruction: Ode to Voltaire. In J. Flood, J. M. Jensen, D. Lapp, & J. R. Squire (Eds.), Handbook of research on teaching the English language arts (pp. 604–632). New York: Macmillan.
Beck, I., & McKeown, M. (1991). Conditions of vocabulary acquisition. In R. Barr, M. L. Kamil, P. B. Mosenthal, & P. D. Pearson (Eds.), Handbook of reading research (Vol. II, pp. 784–814). New York: Longman.
Blachowicz, C. L. Z. (1991). Vocabulary instruction in content classes for special needs learners: Why and how? Journal of Reading, Writing, and Learning Disabilities International, 7, 297–308.
Blachowicz, C. L. Z., & Fisher, P. (2000). Vocabulary instruction. In M. L. Kamil, P. B. Mosenthal, P. D. Pearson, & R. Barr (Eds.), Handbook of reading research (Vol. III, pp. 503–524). Mahwah, NJ: Erlbaum.
Bryk, A. S., & Raudenbush, S. W. (1987). Application of hierarchical linear models to assessing change. Psychological Bulletin, 101, 147–158.
Bryk, A. S., & Raudenbush, S. W. (1992). Hierarchical linear models: Applications and data analysis methods. Newbury Park, CA: Sage.
Compton, D. L. (2000). Modeling the growth of decoding skills in first-grade children. Scientific Studies of Reading, 4, 219–259.
Deno, S. L. (1985). Curriculum-based measurement: The emerging alternative. Exceptional Children, 52, 219–232.
Espin, C. A., Busch, T., Shin, J., & Kruschwitz, R. (2001). Curriculum-based measures in the content areas: Validity of vocabulary-matching measures as indicators of performance in social studies. Learning Disabilities Research & Practice, 16, 142–151.
Espin, C. A., & Deno, S. L. (1993a). Content-specific and general reading disabilities of secondary-level students: Identification and educational relevance. The Journal of Special Education, 27, 321–337.
Espin, C. A., & Deno, S. L. (1993b). Performance in reading from content-area text as an indicator of achievement. Remedial and Special Education, 14(6), 47–59.
Espin, C. A., & Deno, S. L. (1994–1995). Curriculum-based measures for secondary students: Utility and task specificity of text-based reading and vocabulary measures for predicting performance on content-area tasks. Diagnostique, 20, 121–142.
Espin, C. A., Deno, S. L., Maruyama, G., & Cohen, C. (1989, April). The Basic Academic Skills Samples: An instrument for screening and identifying children at risk for failure in the regular education classroom. Paper presented at the meeting of the American Educational Research Association, San Francisco, CA.
Espin, C. A., & Foegen, A. (1996). Validity of three general outcome measures for predicting secondary students' performance on content-area tasks. Exceptional Children, 62, 497–514.
Espin, C. A., & Tindal, G. (1998). Curriculum-based measurement for secondary students. In M. R. Shinn (Ed.), Advanced applications of curriculum-based measurement (pp. 214–253). New York: Guilford Press.
Francis, D. J., Shaywitz, S. E., Stuebing, K. K., Shaywitz, B. A., & Fletcher, J. M. (1994). Measurement of change: Assessing behavior over time and within developmental context. In G. R. Lyon (Ed.), Frames of reference for the assessment of learning disabilities: New views on measurement issues (pp. 29–58). Baltimore: Brookes.
Fuchs, D., Fuchs, L. S., McMaster, K. N., & Al Otaiba, S. (2003). Identifying children at risk for reading failure: Curriculum-based measurement and the dual-discrepancy approach. In H. L. Swanson, K. R. Harris, & S. Graham (Eds.), Handbook of learning disabilities (pp. 431–449). New York: Guilford Press.
Fuchs, L. S., Deno, S. L., & Mirkin, P. (1984). Effects of frequent curriculum-based measurement and evaluation on pedagogy, student achievement, and student awareness of learning. American Educational Research Journal, 21, 449–460.
Fuchs, L. S., & Fuchs, D. (1998). Treatment validity: A unifying concept for reconceptualizing the identification of learning disabilities. Learning Disabilities Research & Practice, 13, 204–219.
Fuchs, L. S., Fuchs, D., & Hamlett, C. L. (1989a). Effects of alternative goal structures within curriculum-based measurement. Exceptional Children, 55, 429–438.
Fuchs, L. S., Fuchs, D., & Hamlett, C. L. (1989b). Effects of instrumental use of curriculum-based measurement to enhance instructional programs. Remedial and Special Education, 10(2), 43–52.
Fuchs, L. S., Fuchs, D., & Hamlett, C. L. (1989c). Monitoring reading growth using student recalls: Effects of two teacher feedback systems. Journal of Educational Research, 83, 103–111.
Fuchs, L. S., Fuchs, D., Hamlett, C. L., & Allinder, R. M. (1991). The contribution of skills analysis within curriculum-based measurement in spelling. Exceptional Children, 57, 443–452.
Fuchs, L. S., Fuchs, D., Hamlett, C. L., & Ferguson, C. (1992). Effects of expert system consultation within curriculum-based measurement, using a reading maze task. Exceptional Children, 58, 436–450.
Fuchs, L. S., Fuchs, D., Hamlett, C. L., & Stecker, P. M. (1990). The role of skills analysis in curriculum-based measurement in math. School Psychology Review, 19, 6–22.
Fuchs, L. S., Fuchs, D., & Maxwell, L. (1988). The validity of informal reading comprehension measures. Remedial and Special Education, 9(2), 20–28.
Hoover, H., Hieronymus, A., Frisbie, D., & Dunbar, S. (1993). Iowa Tests of Basic Skills. Chicago: Riverside.
Konopak, B. C. (1989). Effects of inconsiderate text on eleventh graders' vocabulary learning. Reading Psychology, 10, 339–355.
Lovitt, T. C., Plavins, M., & Cushing, S. (1999). What do pupils with disabilities have to say about their experience in high school? Remedial and Special Education, 20, 67–76, 83.
Marston, D. B. (1989). A curriculum-based measurement approach to assessing academic performance: What it is and why do it. In M. R. Shinn (Ed.), Curriculum-based measurement: Assessing special children (pp. 18–78). New York: Guilford Press.
Messick, S. (1994). The interplay of evidence and consequences in the validation of performance assessments. Educational Researcher, 23(2), 13–23.
Miller, S. E., Leinhardt, G., & Zigmond, N. (1988). Influencing engagement through accommodations: An ethnographic study of at-risk students. American Educational Research Journal, 25, 465–487.
Nagy, W. E., & Scott, J. A. (2000). Vocabulary processes. In M. L. Kamil, P. B. Mosenthal, P. D. Pearson, & R. Barr (Eds.), Handbook of reading research (Vol. III). Mahwah, NJ: Erlbaum.
Nolet, V., & Tindal, G. (1993). Special education in content area classes: Development of a model and practical procedures. Remedial and Special Education, 14, 36–48.
Nolet, V., & Tindal, G. (1994). Instruction and learning in middle school science classes: Implications for students with learning disabilities. The Journal of Special Education, 28, 166–187.
Nolet, V., & Tindal, G. (1995). Essays as valid measures of learning in middle school science classes. Learning Disability Quarterly, 18, 311–324.
Olson, G. H. (1989). On the validity of performance grades: The relationship between teacher-assigned grades and standard measures of subject matter acquisition. (ERIC Document Reproduction Service No. ED 307 290)
Rojewski, J. W., Pollard, R. R., & Meers, G. D. (1991). Grading mainstreamed special needs students: Determining practices and attitudes of secondary vocational educators using a qualitative approach. Remedial and Special Education, 12(1), 7–15, 28.
Salvia, J., & Ysseldyke, J. (1998). Assessment (6th ed.). Boston: Houghton Mifflin.
Scruggs, T. E., Mastropieri, M. A., & Boon, R. (1998). Science education for students with disabilities: A review of recent research. Studies in Science Education, 32, 21–44.
Shin, J., Deno, S. L., & Espin, C. A. (2000). Technical adequacy of the maze task for curriculum-based measurement of reading growth. The Journal of Special Education, 34, 164–172.
Stecker, P. M., & Fuchs, L. S. (2000). Effecting superior achievement using curriculum-based measurement: The importance of individual progress monitoring. Learning Disabilities Research & Practice, 15, 128–134.
Tindal, G., & Nolet, V. (1995). Curriculum-based measurement in middle and high schools: Critical thinking skills in content areas. Focus on Exceptional Children, 27(7), 1–22.
Tindal, G., & Nolet, V. (1996). Serving students in middle school content classes: A heuristic study of critical variables linking instruction and assessment. The Journal of Special Education, 29, 414–432.
Wagner, M. (1990). The school programs and school performance of secondary students classified as learning disabled: Findings from the National Longitudinal Transition Study of Special Education Students. Paper presented at the annual meeting of the American Educational Research Association, Boston.
Wesson, C., Deno, S., Mirkin, P., Maruyama, G., Sevcik, B., Skiba, R., et al. (1988). A causal analysis of the relationships among on-going measurement and evaluation, the structure of instruction, and student achievement. Journal of Special Education, 22(3), 330–343.
Willett, J. B. (1989). Questions and answers in the measurement of change. Review of Research in Education, 15, 345–421.