JOURNAL OF LEARNING DISABILITIES
VOLUME 38, NUMBER 4, JULY/AUGUST 2005, PAGES 353–363
Curriculum-Based Measurement in the Content Areas:
Vocabulary Matching as an Indicator of Progress in Social Studies Learning
Christine A. Espin, Jongho Shin, and Todd W. Busch
Abstract
The purpose of this study was to examine the reliability and validity of curriculum-based measures as indicators of growth in content-area learning. Participants were 58 students in 2 seventh-grade social studies classes. CBM measures were student- and administrator-read vocabulary-matching probes. Criterion measures were performance on a knowledge test, the social studies subtest of the Iowa Tests of Basic Skills (ITBS), and student grades. Both the student- and examiner-read measures reflected change in performance; however, only the student-read measure revealed interindividual differences in growth rates. Significant relations were found between the growth rates generated by the student-read vocabulary measure and student course grades, ITBS scores, and growth on the knowledge test. These results support the validity of a vocabulary-matching measure as an indicator of student learning in the content areas. The results are discussed in terms of the use of CBM as a system for monitoring performance and evaluating interventions for students with learning disabilities in content-area classrooms.
One of the most important yet most difficult components of education is the measurement of change. By measuring change in performance, teachers can reliably evaluate student learning and the effects of instructional interventions on that learning. Yet despite its importance, change measurement is rarely the focus of educational assessment, where the measurement of performance at a single point in time is the dominant approach. In few other areas of education is this emphasis more prevalent than in the field of learning disabilities (LD), where the identification of students for services is often based on the discrepancy between two single scores: an intelligence score and an achievement score.
The lack of attention given to change measurement in education has been due in part to the difficulties associated with measuring change in performance, including a lack of statistical methods for handling multiple data points (Willett, 1989) and a lack of assessment tools available for producing repeated measures within short time periods (Francis, Shaywitz, Stuebing, Shaywitz, & Fletcher, 1994). With regard to statistical methods, recent developments have opened up new possibilities for incorporating students' change in performance as part of educational assessment (see Bryk & Raudenbush, 1987, 1992). Francis et al. (1994) illustrated the use of these advanced statistical procedures in the area of LD. These authors and others (D. Fuchs, Fuchs, McMaster, & Al Otaiba, 2003; L. S. Fuchs & Fuchs, 1998) have proposed that change measurement be involved in defining and diagnosing LD as well as in determining students' responses to interventions. With regard to the availability of assessment tools, there exists a measurement system specifically designed to measure change in student performance by producing repeated measures within short time periods. This system of measurement, referred to as curriculum-based measurement (CBM), has a strong body of research to support its validity and reliability.
Curriculum-Based Measurement
Curriculum-based measurement is an ongoing data collection system that is designed to provide teachers with information on student progress and on the effects of instructional interventions on that progress. The measures developed for use as a part of CBM are simple, efficient, easy to understand, and inexpensive, and allow for repeated measurement of student performance over time (Deno, 1985). More than 25 years of research have supported the validity and reliability of CBM measures as indicators of performance for elementary school students in the basic skill areas of reading, mathematics, spelling, and written expression. Correlations between CBM indicators and a variety of criterion measures generally range from .60 to .90, and test-retest and alternate-form reliabilities are generally above .80 (see Marston, 1989, for a review). The treatment validity of CBM measures at the elementary school level has also been supported. When teachers use CBM measures to evaluate and modify their instruction, student achievement improves (L. S. Fuchs, Deno, & Mirkin, 1984; L. S. Fuchs, Fuchs, & Hamlett, 1989a, 1989b, 1989c; L. S. Fuchs, Fuchs, Hamlett, & Allinder, 1991; L. S. Fuchs, Fuchs, Hamlett, & Ferguson, 1992; L. S. Fuchs, Fuchs, Hamlett, & Stecker, 1990; Stecker & Fuchs, 2000; Wesson et al., 1988). Most recently, CBM has been combined with statistical techniques such as hierarchical linear modeling (HLM) to generate student growth curves and to use these growth curves to answer questions about the relation between student progress and instructional variables (Compton, 2000; Shin, Deno, & Espin, 2000).
The initial work in the area of CBM was conducted at the elementary school level; later, this work was extended to the secondary school level (see Espin & Tindal, 1998, for a review). With that extension came an interest in the development of CBM measures in content areas such as social studies and science.
CBM in the Content Areas
The initial research on the development of curriculum-based measurement in the content areas was conducted by Tindal and Nolet (see Nolet & Tindal, 1993, 1994, 1995; Tindal & Nolet, 1995, 1996). Tindal and Nolet identified the critical thinking skills (e.g., explanation of concepts, illustration of facts) needed to understand and use content-area information and created measures to represent these thinking skills. The measures were appropriate for determining student learning within a given unit of study, but they were less appropriate for showing growth across study units (see Tindal & Nolet, 1995).
Espin and Deno (1993a, 1993b, 1994–1995) and Espin and Foegen (1996) took a somewhat different approach and focused on the identification of a single indicator that would represent general performance in the content areas. The first step in this research was to establish the reliability and validity of a single indicator for predicting student performance on content-area tasks. Espin and Deno (1993b, 1994–1995) examined the validity of two measures as potential indicators of performance on content-area study tasks in English and science. Tenth-grade participants read aloud for 1 minute from English and science textbook passages. In addition, participants were given 10 minutes to complete a vocabulary-matching task with terms selected from each passage. The criterion task was a study task in which students searched through the text for answers to comprehension questions. Correlations between the predictor and criterion measures were in the low moderate to moderate range (r = .37–.44). Correlations were similar for vocabulary matching and for reading aloud from text, but in a regression analysis, vocabulary matching accounted for the largest proportion of variance in the criterion task, with reading aloud not adding to the variance.
In a follow-up study, Espin and Foegen (1996) compared the validity of three CBM measures for predicting student performance on three criterion tasks. The CBM measures were reading aloud from text, vocabulary matching, and maze selection. The maze selection measure was included in this study because it was known to be a good predictor of reading performance at the elementary school level (Espin, Deno, Maruyama, & Cohen, 1989; L. S. Fuchs, Fuchs, & Maxwell, 1988). The criterion measures in the study were representative of the tasks required of students in content-area classes and included comprehension, acquisition, and retention of content information. Participants in the study were 186 middle school students. The results revealed moderate to moderately strong correlations between the CBM and criterion tasks (r = .52–.65). Once again, in a regression analysis, vocabulary matching accounted for the greatest proportion of variance in the criterion tasks, with neither of the other measures contributing substantially to the variance.
Although the results of Espin and Deno's (1993b, 1994–1995) and Espin and Foegen's (1996) studies suggested that vocabulary matching was a valid indicator of content-area performance, neither study had been conducted in an actual content-area classroom. In our research, we wished to extend this early work to examine vocabulary matching in a middle school social studies classroom. We conducted two related studies. In the first, we examined the technical adequacy of a vocabulary-matching measure as an indicator of performance in social studies; in the second, we examined the technical adequacy of the same vocabulary-matching measure as an indicator of progress in social studies.
In this article, we report the results of the progress study. We begin, however, by summarizing the results of the performance study (see Espin, Busch, Shin, & Kruschwitz, 2001, for details). Participants in the performance study were 58 seventh-grade students from two social studies classrooms. Based on the results of previous research, only the vocabulary-matching measure was included in this study. In order to examine the role of reading in the prediction of performance, two versions of the vocabulary-matching measure were compared: a student-read version, in which students read words and definitions to themselves, and an examiner-read version, in which the examiner read the words and definitions to the students. Criterion measures were a researcher-made pre- and post-knowledge test, social studies grades, and scores on the social studies subtest of a standardized achievement test.
The results of the performance study revealed that alternate-form reliabilities for the vocabulary-matching measures ranged from .58 to .87, with a mean reliability of .70 for student- and administrator-read forms. Reliability increased to .84 and .78 for student- and administrator-read probes, respectively, when scores were combined across two probes. Analysis with respect to the validity of the measures lent support to the use of both measures as indicators of student performance in social studies. Correlations between vocabulary matching and the knowledge and standardized achievement tests ranged from .56 to .84. Correlations with class grades were lower, ranging from .27 to .51, in part due to the restricted range of scores for course grades (most students earned grades of C to A); however, the correlation between the student-read probe and course grades was moderately strong (r = .51). Finally, although the sample of students with LD was small, performance on the vocabulary-matching probe differentiated students with and without LD.
Purpose and Research Questions
The results of Espin et al. (2001) confirmed the validity of vocabulary matching as an indicator of performance. This result is not surprising, given the literature on the importance of vocabulary knowledge for both reading comprehension and content-area learning (e.g., Baumann & Kameenui, 1991; Beck & McKeown, 1991; Blachowicz, 1991; Blachowicz & Fisher, 2000; Konopak, 1989; Nagy & Scott, 2000; Scruggs, Mastropieri, & Boon, 1998). However, the 2001 study did not address a key question: Would vocabulary matching prove to be a reliable and valid indicator of student progress? In other words, would the growth trajectories produced by repeated measurement on alternate forms of the vocabulary-matching probes adequately model student growth and learning in social studies? A review of the literature reveals that the answer to this question is not obvious; that is, despite the recognition of the importance of vocabulary for reading comprehension and content-area learning, the way in which vocabulary terms are learned and the relation between such learning and the comprehension and acquisition of text material is not clear (see Baumann & Kameenui, 1991; Beck & McKeown, 1991; Blachowicz & Fisher, 2000; Nagy & Scott, 2000). Thus, in this second study, we wished to explore the question of whether student performance on the vocabulary measures would change, and whether this change would occur concomitantly with learning.
We addressed the following two research questions in the study:

1. What is the validity of vocabulary matching as an indicator of progress (i.e., learning) in a social studies class?
2. Does the validity of vocabulary matching differ as a function of the administration format (i.e., student vs. administrator read)?
To address these questions, three issues were examined: (a) the sensitivity of the vocabulary-matching measures to improvement in student performance over time; (b) the sensitivity of the vocabulary-matching measures to interindividual differences in growth rates; and (c) the validity of the growth rates generated by the vocabulary-matching measures with respect to course grades, performance on a standardized achievement test, and improvement on a content knowledge test.
Method
Participants
Participants in this study were 58 seventh-grade students (32 boys and 26 girls; mean age = 13.6 years). These students had also participated in the earlier study on vocabulary matching as a performance measure (Espin et al., 2001). Students were recruited from two social studies classes in a suburban school in the Midwest. The majority of participants were European American (95%), with a small percentage of students who were African American or Asian American (5%); 28% of the school population received free or reduced-price lunches.
Five of the participants were identified as having LD according to district standards (4 boys and 1 girl; mean age = 13.4 years). The identification standards for LD included a history of underachievement, a discrepancy between ability and achievement, and an information-processing deficit. All five students were receiving services in reading and written expression, and one was receiving additional services in mathematics. Mean percentile scores for the students with LD on the Iowa Tests of Basic Skills, Form K, Level 12 (ITBS; Hoover, Hieronymus, Frisbie, & Dunbar, 1993) were as follows: 30.4 (range = 3–46) on the Reading Vocabulary subtest, 26.8 (range = 2–49) on the Reading Total subtest, and 13.2 (range = 1–20) on the Social Studies subtest.
Procedure
During the winter and spring quarters, students were tested weekly with two types of vocabulary-matching probes: student read and administrator read. The student-read version of the probe consisted of 22 vocabulary terms, including two distractors, and 20 definitions. Terms and definitions were chosen at random from a master list of 146 terms created from the social studies textbook and the teacher's lecture notes. Definitions were modified if necessary so that each definition would have fewer than 15 words. Vocabulary terms appeared on the left side of the page and were arranged alphabetically to help students easily locate terms. Definitions appeared on the right side of the page. Students were given 5 minutes to read the terms and definitions and to match each term with its definition.
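As an illustration of this sampling scheme, the sketch below builds one such probe from a master list. The helper name, list contents, and seed are hypothetical, not the authors' materials; it simply mirrors the construction described above.

```python
import random

def make_probe(master_list, n_items=20, n_distractors=2, seed=None):
    """Build one vocabulary-matching probe: n_items term-definition
    pairs plus n_distractors extra terms with no matching definition."""
    rng = random.Random(seed)
    drawn = rng.sample(master_list, n_items + n_distractors)
    matched = drawn[:n_items]
    # All drawn terms (including the distractors) are alphabetized on
    # the left side of the page so students can locate them easily.
    terms = sorted(term for term, _ in drawn)
    # Only the matched items' definitions appear, in shuffled order.
    definitions = [definition for _, definition in matched]
    rng.shuffle(definitions)
    return terms, definitions

# Hypothetical stand-in for the study's 146-term master list.
master = [(f"term{i:03d}", f"definition of term {i}") for i in range(146)]
terms, definitions = make_probe(master, seed=7)
```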
The administrator-read version of the vocabulary probe was developed from the same set of terms and definitions as the student-read version. On the administrator-read version, only the vocabulary terms were given. The test administrator read the definitions, and the students identified which terms matched the definition being read. Definitions were read one at a time, with 15-second intervals between items. Each probe lasted a total of 5 minutes.
Vocabulary-matching probes were administered weekly by the third author. Students were given the two types of probes consecutively each week. To control for order effects, the order in which the probes were given was alternated each week. The number of correct matches on each probe was tallied and used in data analysis. In total, each individual completed 11 administrator-read and 11 student-read probes.
In addition to the vocabulary-matching probes, students were administered a knowledge test at pre- and posttest to measure the amount learned during the study. The knowledge test was composed of 36 questions in the areas of sociology, psychology, and geography. Questions were developed on the basis of textbook content, the teacher's lecture notes, and teacher-made worksheets and tests. The social studies teacher was asked to review all items on the knowledge test to ensure that the items matched the information presented to the students in class.
Items were classified into two types of questions: applied (27 items) and factual (9 items). Applied questions were those in which students were asked to apply social concepts and principles to specific social events or phenomena. Factual questions were those in which students were asked to make simple one-to-one relations between concepts and events (see Table 1 for examples of these two types of questions). A heavier emphasis was placed on the applied questions to ensure that the relation between the vocabulary-matching tasks and the knowledge test would not be solely a function of the similarity in the task requirements.
requirements.Following the development of the
knowledge test items, the items were
given to a graduate student in special
education, who was not involved inthe study, and to the social studies
teacher involved in the study. The
graduate student and social studies
teacher were asked to classify eachitem as either applied or factual. Inter-
rater agreement between each rater
and the third author was calculated by
dividing agreements by agreementsplus disagreements. Interrater agree-
ment was .95 and .89 for the graduate
student and the social studies teacher,
respectively. Items that were not cor-
rectly classified by the graduate stu-
dent or the social studies teacher were
modified. Students were given theknowledge test at the beginning and
end of the study. The number of correct
answers was used for data analysis.Students social studies grades
and their scores on the ITBS were also collected. Social studies grades represented students' mean grade in the class across three grading periods. Letter grades were assigned to each student. For our purposes, the letter grades were converted to numeric values on a 13-point scale, with 13 representing an A+, 12 an A, 11 an A-, 10 a B+, and so on. A score of F was assigned a 0. If students failed a class, they were able to retake it in a 4-week makeup session. If students passed this makeup session, they were assigned a grade of P. These passing grades were assigned a value of 1. Course grades were based equally on homework, quizzes, unit tests, and current events reporting.
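As a concrete illustration of this conversion, the sketch below encodes the scale in Python. The values between A- and D- are interpolated from the "and so on" above and should be read as an assumption; F = 0 and P = 1 come directly from the text.

```python
# 13-point scale: A+ = 13, A = 12, A- = 11 are stated in the text;
# the values down to D- = 2 are interpolated (assumption);
# F = 0 and P = 1 are stated.
GRADE_POINTS = {"A+": 13, "A": 12, "A-": 11,
                "B+": 10, "B": 9, "B-": 8,
                "C+": 7, "C": 6, "C-": 5,
                "D+": 4, "D": 3, "D-": 2,
                "P": 1, "F": 0}

def mean_course_grade(letter_grades):
    """Mean numeric grade across grading periods (three in this study)."""
    return sum(GRADE_POINTS[g] for g in letter_grades) / len(letter_grades)

print(mean_course_grade(["B+", "A-", "B"]))  # (10 + 11 + 9) / 3 = 10.0
```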
Scores on the ITBS were obtained from students' school records. Students completed the ITBS the spring prior to the beginning of the study. Form K, Level 12 of the ITBS was administered. The Social Studies subtest of the ITBS consists of 42 questions covering history, geography, economics, political science, sociology/anthropology, and related social sciences (e.g., ethics, human values). Standard scores were used for all analyses. The internal consistency of the ITBS, as reported in the technical manual, ranges from .61 to .93. Salvia and Ysseldyke (1998) reported that the items of the ITBS were reviewed for content fit and item bias by field experts and then tested on a large sample across the United States. The results of this testing were used for final item selection.
Statistical Analysis
Hierarchical linear models (HLM) were used to address three issues with respect to the vocabulary-matching measures: (a) sensitivity to improvement of student performance over time, (b) sensitivity to interindividual differences in growth rates, and (c) validity of the growth rates produced by the measures with respect to the criterion measures.
TABLE 1
Examples of Applied and Factual Questions on the Knowledge Test

Applied example:
José comes from a working-class home. He married Judy, who is very wealthy, and moved into an upper-class neighborhood. José's change in status is an example of
a. mobility
b. sanctions
c. mores
d. primary group

Factual example:
The process by which a member learns the rules of his or her group is called
a. socialization
b. community
c. role play
d. mobility
To address the first two issues, unconditional HLM models were employed to examine the sensitivity of the vocabulary probes for measuring student growth over time and for revealing interindividual differences in growth rates. In these analyses, the significance of the mean growth rates and the growth parameter variances estimated by each type of vocabulary measure was statistically investigated. To address the third issue, course grades, scores on the Social Studies subtest of the ITBS, and performance change on the content knowledge test were used as Level 2 variables in HLM to examine the validity of the growth measures estimated on the vocabulary probes. In this analysis, the relations between the growth rates estimated on the vocabulary probes and the criterion measures were statistically examined.
Results
Sensitivity to Improvement Over Time and Individual Differences
The first step in our analysis was to determine whether the student- and administrator-read vocabulary-matching measures were sensitive to improvement over time, and whether they revealed interindividual differences in growth rates. Descriptive statistics for the repeated measures of students' performance on the student- and administrator-read probes are displayed in Table 2. Observed mean scores on both types of vocabulary probes increased over time. Moreover, interindividual differences in student performance increased over time on the student-read probes, as evidenced by the increase in standard deviations.
The improvement in performance scores and the interindividual differences in growth rates were further examined using hierarchical linear models (Bryk & Raudenbush, 1987, 1992). Specifically, the statistical significance of the mean growth rate and of the growth parameter variance estimated by each type of vocabulary measure was tested. The statistical test of the significance of the mean growth rate addressed the question of whether the growth rate for the entire group of students was statistically different from a null growth rate (i.e., a growth rate of zero). The statistical test of the significance of the growth parameter variance addressed the question of whether individual students differed in their rates of growth over time.
We hypothesized that, as a group, the students would improve significantly in social studies knowledge over the school year due to the teacher's instruction. Moreover, we hypothesized that students would not share the same growth rates, because of individual characteristics (e.g., intelligence, prior knowledge, and motivation to learn). To test these hypotheses, the following unconditional models were used in this study:
Yti = π0i + π1i(Week)ti + eti
(within-individual model) and

π0i = β00 + u0i, π1i = β10 + u1i
(between-individual models),
where Yti is the observed score for student i at time t, π0i the intercept of the growth line for student i, π1i the weekly growth rate for student i, β00 the mean intercept for the entire group of students, β10 the mean growth rate for the entire group of students, eti the error related to student i, and u0i and u1i the random errors related to the mean intercept and growth rate, respectively. In the analysis, the intercept was centered at the first occasion of data collection; therefore, it showed individual differences in the students' vocabulary knowledge at the beginning of the study.

The statistical test of the significance of the mean growth rates revealed that the mean growth rates estimated by both the student- and administrator-read vocabulary probes were statistically different from a null growth rate; that is to say, both were sensitive for detecting significant improvement in students' performance over time (see Table 3). The mean growth rate estimated by the student-read vocabulary probe showed an increase of .65 correct matches per week, whereas the mean growth rate estimated by the administrator-read probe showed an increase of .22 correct matches per week.
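For readers who want to try this kind of analysis, the following sketch fits a random-intercept, random-slope model of the same form to simulated weekly probe data using Python's statsmodels. The original analysis used HLM software, not Python, and all simulation parameters and column names here are invented.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(0)
n_students, n_weeks = 58, 11

# Simulate weekly probe scores with person-specific intercepts and slopes.
rows = []
for i in range(n_students):
    intercept = rng.normal(5.2, 3.0)   # starting level, varies by student
    slope = rng.normal(0.65, 0.35)     # weekly growth rate, varies by student
    for week in range(n_weeks):        # week 0 = first occasion (centering)
        rows.append({"student": i, "week": week,
                     "score": intercept + slope * week + rng.normal(0, 2.0)})
data = pd.DataFrame(rows)

# Unconditional growth model: the fixed effects correspond to beta00 and
# beta10 in the text; the random effects correspond to u0i and u1i.
model = smf.mixedlm("score ~ week", data, groups=data["student"],
                    re_formula="~week")
print(model.fit().summary())
```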
The statistical test of the significance of the growth parameter variance revealed that the growth parameter variance estimated by the student-read vocabulary probe was statistically different from no variability in students' growth rates (see Table 4); that is, there were individual differences in growth rates on the student-read measure. In contrast, the growth parameter variance estimated by the administrator-read probe was not statistically different from no variability, indicating that all students shared the same growth rate (i.e., the mean growth rate). Thus, the results of this analysis revealed that only the student-read vocabulary-matching probe was sensitive enough to reveal interindividual differences in growth rates among students. Given this finding, only the student-read probe was entered into subsequent analyses (see Bryk & Raudenbush, 1987, 1992).

Regarding performance at the beginning of the study, the mean intercepts were statistically significant for both types of probes (see Table 3). The mean intercept for the administrator-read probe, however, was slightly higher than that for the student-read probe. Both types of probes also sensitively reflected the existence of interindividual differences in vocabulary knowledge at the beginning of the study (see Table 4).
Validity of Growth Rates
The validity of the growth rates estimated by the student-read vocabulary-matching measure was examined by investigating the relations between the growth rates generated by the vocabulary-matching measure and the residualized gain scores on the content knowledge test, course grades in social studies, and scores on the Social Studies subtest of the ITBS. Means and standard deviations for the criterion measures were as follows: knowledge pretest, M = 20.27, SD = 5.07; knowledge posttest, M = 24.86, SD = 5.62; ITBS, M = 218.72, SD = 32.27; social studies grades, M = 9.38, SD = 3.24.

The three criterion measures were separately included in the between-individual model as a Level 2 predictor because the main interest of our analysis was in examining the direct relation between growth rates on the student-read vocabulary probe and each criterion measure, not the relative contribution of the criterion measures to predicting the students' growth rates. The three between-individual models used in the analysis were as follows:
π1i = β10 + β11(GainScore)i + u1i,
π1i = β10 + β11(CourseGrade)i + u1i, and
π1i = β10 + β11(ITBS)i + u1i,
where π1i is the linear growth rate for student i, β10 the mean growth rate for the entire group of students, β11 the regression coefficient showing the relation between growth rates on the student-read vocabulary probe and the corresponding criterion measure, and u1i the random error related to the mean growth rate.
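As an illustration of the mechanics, the residualized gain score and one such Level 2 model can be sketched as follows. The data are simulated and the effect size is invented; the point is only to show how β11 can be estimated as a cross-level (week-by-criterion) interaction.

```python
import numpy as np
import pandas as pd
import statsmodels.api as sm
import statsmodels.formula.api as smf

rng = np.random.default_rng(1)
n = 58
crit = pd.DataFrame({"student": np.arange(n),
                     "pretest": rng.normal(20.3, 5.1, n),
                     "posttest": rng.normal(24.9, 5.6, n)})
# Residualized gain: the residual from regressing posttest on pretest.
crit["gain"] = sm.OLS(crit["posttest"],
                      sm.add_constant(crit["pretest"])).fit().resid

# Simulate probes so that growth rates covary with the gain score.
slopes = 0.65 + 0.02 * crit["gain"].to_numpy() + rng.normal(0, 0.2, n)
rows = [{"student": i, "week": w,
         "score": 5.0 + slopes[i] * w + rng.normal(0, 2.0)}
        for i in range(n) for w in range(11)]
probes = pd.DataFrame(rows).merge(crit[["student", "gain"]], on="student")

# The week:gain coefficient plays the role of beta11 -- the relation
# between the criterion measure and the weekly growth rate.
fit = smf.mixedlm("score ~ week * gain", probes,
                  groups=probes["student"], re_formula="~week").fit()
print(fit.params["week:gain"])
```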
Prior to examining the relations between growth rates and criterion measures, the reliability of the growth rate parameter was explored. This was done to ensure that the relations between growth rates and criterion measures could be examined reliably. The reliability of the growth parameter in HLM is defined as the proportion of total observed variance in the parameter estimates that is attributable to true parameter variance. Low parameter reliability (e.g., less than .30) indicates that estimates of the growth parameter are unstable and that their relations to other variables cannot be examined in a dependable way.
TABLE 3
Sensitivity of Student- and Administrator-Read Vocabulary Probes for Revealing Growth Over Time (Fixed-Effect Model)

Probe/effect Coefficient Standard error t p
Student-read
Intercept (β00) 5.16 .46 11.23 .00
Mean growth (β10) .65 .06 10.70 .00
Administrator-read
Intercept (β00) 7.98 .46 17.24 .00
Mean growth (β10) .22 .04 5.86 .00
TABLE 4
Sensitivity of Student- and Administrator-Read Vocabulary Probes for Revealing Interindividual Differences in Growth Rates (Random-Effect Model)

Probe/effect Variance χ2 p
Student-read
Intercept (u0i) 9.07 222.01 .00
Mean growth (u1i) .12 133.06 .00
Administrator-read
Intercept (u0i) 9.84 276.97 .00
Mean growth (u1i) .01 59.00 .37

Note. Chi-square df = 56, N = 57.
TABLE 2
Means and Standard Deviations for Student- and Administrator-Read
Vocabulary-Matching Probes
Student-read Administrator-read
Probe n M SD M SD
1 53 5.23 3.41 8.41 3.83
2 54 6.02 3.48 8.75 4.33
3 53 7.11 5.35 6.25 3.61
4 44 6.00 4.09 10.80 4.52
5 48 7.04 4.13 8.62 4.23
6 53 9.83 5.24 10.64 4.86
7 49 9.35 4.99 8.62 4.87
8 50 8.84 5.26 8.88 4.51
9 50 9.78 5.14 10.24 4.41
10 52 14.10 5.38 9.47 3.80
11 51 9.71 5.67 10.88 4.20
The reliability of the growth rate parameter was .52 in the null model, indicating that 52% of the total growth rate parameter variance estimated by the student-read vocabulary probe could be attributed to true parameter variance (see Bryk & Raudenbush, 1987, 1992). This result suggests that the relations between growth rates and criterion measures could be examined reliably in this study.
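The article does not print the formula, but in the Bryk and Raudenbush (1987, 1992) framework the reliability of the estimated slope for student i is conventionally the ratio of true parameter variance to total observed variance (true plus sampling variance); take the following as that standard definition rather than the authors' exact estimator:

\[
\text{reliability}(\hat{\pi}_{1i}) = \frac{\tau_{11}}{\tau_{11} + V_{1i}},
\]

where τ11 is the variance of the true growth rates π1i and V1i is the sampling variance of student i's estimated slope. The reported .52 would typically be the average of these ratios across students, meaning roughly half of the observed slope variance reflects true interindividual differences.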
The results of the validity analyses revealed that the growth rates estimated by the student-read vocabulary probe were significantly related to residualized gain scores on the knowledge test, to students' course grades in social studies, and to ITBS Social Studies subtest scores (see Table 5). In other words, students who had larger gain scores on the knowledge test, higher course grades, and higher test scores on the ITBS also had higher growth rates on the student-read vocabulary probe. These results support the validity of the student-read vocabulary measure as an indicator of growth in learning.
Discussion
The results of this study indicate that only the student-read version of the vocabulary-matching probe produced growth trajectories that were valid and reliable predictors of student performance in social studies. Both the student- and administrator-read versions of the vocabulary-matching probes produced significant group growth rates, although the student-read measure revealed greater growth over time. However, only the student-read vocabulary-matching measure was sensitive to interindividual differences in growth rates. Because we can assume that not all students participating in the study had identical growth rates, our findings imply that only the student-read version is sufficiently sensitive to individual growth over time.

Examination of Table 2 may help
to explain the contrast between the two measures in terms of their sensitivity to interindividual differences in growth rates. As illustrated in Table 2, standard deviations for the student-read probes tended to increase gradually across the duration of the study, whereas standard deviations for the administrator-read probes did not. If the vocabulary-matching measures were sensitive to individual changes in performance, one would expect the standard deviations for the measures to increase over the course of the year, as some students learn more whereas others learn less. The restricted variability in scores for the administrator-read probes most likely served to artificially restrict the variability in the slopes for the administrator-read scores, leading to a lack of sensitivity to interindividual differences. In other words, reading the probes aloud to the students produced less individual variation in performance as the year progressed.

Conceptually, our results imply
that reading is an important factor in the measurement of student performance and progress in the content areas. Based on previous research (Espin et al., 2001), we had expected no differences in the validity of the growth trajectories created by the two types of vocabulary-matching measures. However, the vocabulary-matching task that incorporated reading was more sensitive to overall learning than the measure that removed reading as a factor. Recall, however, that we conducted this study in the classroom of only one teacher. It is possible that reading was a more important factor in this teacher's classroom than it would be in other classrooms. It will be important in future research to replicate these findings across different teachers and students and to directly examine the role of reading.
Once we had established that the student-read measure was sensitive both to group growth and to individual differences in growth over time, we examined the reliability and validity of the growth trajectories created by the student-read measure. The results revealed that the growth trajectories created by the student-read measure were both reliable and valid. The growth trajectories were stable and proved to be significantly related to growth and performance on other criterion measures, including gain scores on the knowledge test, course grades, and scores on the ITBS. In other words, students who demonstrated greater growth on the student-read vocabulary-matching measure also showed more growth on the knowledge test, had higher course grades, and had higher scores on the ITBS. This is the pattern of relations we would expect if the vocabulary-matching measures were valid measures of performance and progress.
Examples of Progress Monitoring
Although our research supports the technical adequacy of vocabulary matching as an indicator of performance and progress in the content areas, it does not address the treatment validity (Messick, 1994) of the measure; that is, our research does not address the effect that the use of progress monitoring might have on teacher instruction and student performance.
TABLE 5
Relation Between Growth Rates on Student-Read Vocabulary Probes and Criterion Measures

Criterion measure Coefficient Standard error t p
Knowledge test gain score .010 .005 2.00 .05
Course grades .053 .017 3.22 .00
ITBS Social Studies scores .002 .001 2.34 .02

Note. ITBS = Iowa Tests of Basic Skills (Hoover, Hieronymus, Frisbie, & Dunbar, 1993). df = 53 for all t tests.
In this section, we illustrate the ways in
which CBM measures could be used in
the content areas to aid special educa-
tion teachers in their decision making.
Research is needed to address theeffects of such implementation on
teacher instruction and student perfor-
mance, especially for students with LD
who spend a large portion of their
school day in general education classes
in the content areas (Lovitt, Plavins, & Cushing, 1999; Wagner, 1990).
At the beginning of the school year, the special and general education teachers would administer vocabulary-matching probes to all students in the classroom. These data would be used by the general education teacher to identify students who might experience difficulty in the class, and by the special education teacher to evaluate the appropriateness of the class for his or her students. Following this initial assessment, students who are identified as at risk for difficulty in class would be monitored by the special or general education teacher to evaluate student learning in the content class.
One way to evaluate the learning of students with disabilities would be to compare it to the learning of their peers without disabilities. For example, in Figures 1 through 3, the scores for three students with LD from our study are graphed with the mean score for all students without LD. Slope lines indicating rates of progress are displayed for each student and for the class mean. The data from our study indicate that Student 1 is performing successfully in this social studies class; as Figure 1 shows, the level and rate of performance for Student 1 are commensurate with those of nondisabled peers. Student 2 is also performing successfully; although Student 2's level of performance is below that of nondisabled peers, the rate of growth is equal to that of nondisabled peers (see Figure 2). Student 3, in contrast, is not performing successfully in this social studies class; both the level and rate of performance for this student are substantially below those of the class mean and, for that matter, below those of other peers with LD. The graph in Figure 3 indicates a need for additional accommodations or modifications for Student 3. If such accommodations or modifications do not result in improved growth, an alternative placement could be considered.
FIGURE 1. Progress graph and trendline for student with LD: Level and growth rate of performance commensurate with that of peers.

FIGURE 2. Progress graph and trendline for student with LD: Level below that of peers, but growth rate of performance commensurate with that of peers.

FIGURE 3. Progress graph and trendline for student with LD: Level and growth rate of performance below that of peers.

(In each figure, the vertical axis shows the number of correct matches.)
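The trendlines in these figures are simply least-squares slopes fit to each data series. The sketch below mimics such a graph, using the Table 2 student-read means as the class series and an invented score series for the monitored student; np.polyfit is one simple slope estimator, not necessarily what was used for the published figures.

```python
import numpy as np
import matplotlib.pyplot as plt

weeks = np.arange(1, 12)
# Class series: student-read means from Table 2; student series: invented.
class_mean = np.array([5.23, 6.02, 7.11, 6.00, 7.04,
                       9.83, 9.35, 8.84, 9.78, 14.10, 9.71])
student = np.array([2, 3, 3, 2, 4, 3, 5, 4, 4, 6, 5])

for label, y in (("Class mean", class_mean),
                 ("Student with LD", student)):
    slope, intercept = np.polyfit(weeks, y, 1)  # least-squares trendline
    plt.scatter(weeks, y)
    plt.plot(weeks, intercept + slope * weeks,
             label=f"{label}: {slope:.2f} matches/week")

plt.xlabel("Week")
plt.ylabel("Number of Correct Matches")  # axis label from Figures 1-3
plt.legend()
plt.show()
```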
This example illustrates how content-area and special education teachers can use progress monitoring data to make instructional decisions regarding students with LD in content-area classes. Progress monitoring provides a data source different from, but complementary to, the typical evaluation based on course grades. Course grades often reflect factors other than learning, such as attendance, behavior, and homework completion (e.g., Miller, Leinhardt, & Zigmond, 1988), and the meaning of course grades is sometimes unclear, especially when there have been modifications in the grading system (Olson, 1989; Rojewski, Pollard, & Meers, 1991). Progress monitoring, on the other hand, focuses solely on learning and answers the question of whether students' good behavior, hard work, and homework completion, and teachers' accommodations and modifications, are having positive effects on student learning.
Conclusion
In conclusion, our results support the use of a student-read vocabulary-matching probe as an indicator of student learning in social studies. Taken in combination with the earlier results of Espin et al. (2001), our results indicate that a simple vocabulary-matching measure can be used as an indicator of student performance and progress over time in social studies. This measure can be administered to students in groups, takes only 5 minutes to administer, and can be scored relatively quickly.
On a more general level, our results provide further support for the use of CBM measures as measures of change. As indicated by Francis et al. (1994), L. S. Fuchs and Fuchs (1998), and D. Fuchs et al. (2003), such measures have potential for use in decision making for students with LD. That is, the measures can be used not only to determine to what extent students are discrepant from their peers at a single time point, but also to examine to what extent students are progressing relative to their peers. Students who are discrepant both in performance and in progress would be those most in need of intensive interventions.
Our study is only a first step in the research on the implementation of progress monitoring procedures in content-area classes. Several questions remain, including (a) Will special and content-area teachers be willing to implement and rely on progress monitoring data? (b) Can progress monitoring data serve as a conduit for communication and collaboration between general and special education teachers? and (c) Will the implementation of progress monitoring procedures result in improved achievement for students at risk and students with learning disabilities?
ABOUT THE AUTHORS
Christine A. Espin, PhD, is a professor in the Department of Educational Psychology at the University of Minnesota. Her research focuses on the development of progress-monitoring procedures in reading, written expression, and content-area learning for secondary school students with learning disabilities. Jongho Shin, PhD, is an assistant professor in the Department of Education at Seoul National University. His current research interests include reading comprehension, learning strategies, and motivation. Todd W. Busch, PhD, is an assistant professor of special populations at Minnesota State University, Mankato. His current interests include teacher training, progress monitoring for secondary-level students, and reading comprehension. Address: Jongho Shin, Department of Education, Seoul National University, Shinrim-Dong Kwanak-Gu, Seoul 151-748, Korea; e-mail: [email protected]
AUTHORS' NOTES
1. The research reported here was funded in part by the Guy Bond Foundation, University of Minnesota.
2. We wish to thank the teachers, administrators, and students of the Maplewood schools for their participation in the study. We wish to thank Dana Frederick for assistance in data coding and Ron Kruschwitz for help with the data collection. Finally, we wish to acknowledge the Netherlands Institute for Advanced Study in the Humanities and Social Sciences for its support in the preparation of this article.
REFERENCES
Baumann, J. F., & Kameenui, E. J. (1991). Research on vocabulary instruction: Ode to Voltaire. In J. Flood, J. M. Jensen, D. Lapp, & J. R. Squire (Eds.), Handbook of research on teaching the English language arts (pp. 604–632). New York: Macmillan.
Beck, I., & McKeown, M. (1991). Conditions of vocabulary acquisition. In R. Barr, M. L. Kamil, P. B. Mosenthal, & P. D. Pearson (Eds.), Handbook of reading research (Vol. II, pp. 784–814). New York: Longman.
Blachowicz, C. L. Z. (1991). Vocabulary instruction in content classes for special needs learners: Why and how? Journal of Reading, Writing, and Learning Disabilities International, 7, 297–308.
Blachowicz, C. L. Z., & Fisher, P. (2000). Vocabulary instruction. In M. L. Kamil, P. B. Mosenthal, P. D. Pearson, & R. Barr (Eds.), Handbook of reading research (Vol. III, pp. 503–524). Mahwah, NJ: Erlbaum.
Bryk, A. S., & Raudenbush, S. W. (1987). Application of hierarchical linear models to assessing change. Psychological Bulletin, 101, 147–158.
Bryk, A. S., & Raudenbush, S. W. (1992). Hierarchical linear models: Applications and data analysis methods. Newbury Park, CA: Sage.
Compton, D. L. (2000). Modeling the growth of decoding skills in first-grade children. Scientific Studies of Reading, 4, 219–259.
Deno, S. L. (1985). Curriculum-based measurement: The emerging alternative. Exceptional Children, 52, 219–232.
Espin, C. A., Busch, T., Shin, J., & Kruschwitz, R. (2001). Curriculum-based measures in the content areas: Validity of vocabulary-matching measures as indicators of performance in social studies. Learning Disabilities Research & Practice, 16, 142–151.
Espin, C. A., & Deno, S. L. (1993a). Content-specific and general reading disabilities of secondary-level students: Identification and educational relevance. The Journal of Special Education, 27, 321–337.
Espin, C. A., & Deno, S. L. (1993b). Performance in reading from content-area text as an indicator of achievement. Remedial and Special Education, 14(6), 47–59.
Espin, C. A., & Deno, S. L. (1994–1995). Curriculum-based measures for secondary students: Utility and task specificity of text-based reading and vocabulary measures for predicting performance on content-area tasks. Diagnostique, 20, 121–142.
Espin, C. A., Deno, S. L., Maruyama, G., & Cohen, C. (1989, April). The Basic Academic Skills Samples: An instrument for screening and identifying children at risk for failure in the regular education classroom. Paper presented at the meeting of the American Educational Research Association, San Francisco, CA.
Espin, C. A., & Foegen, A. (1996). Validity of three general outcome measures for predicting secondary students' performance on content-area tasks. Exceptional Children, 62, 497–514.
Espin, C. A., & Tindal, G. (1998). Curriculum-based measurement for secondary students. In M. R. Shinn (Ed.), Advanced applications of curriculum-based measurement (pp. 214–253). New York: Guilford Press.
Francis, D. J., Shaywitz, S. E., Stuebing, K. K., Shaywitz, B. A., & Fletcher, J. M. (1994). Measurement of change: Assessing behavior over time and within developmental context. In G. R. Lyon (Ed.), Frames of reference for the assessment of learning disabilities: New views on measurement issues (pp. 29–58). Baltimore: Brookes.
Fuchs, D., Fuchs, L. S., McMaster, K. N., & Al Otaiba, S. (2003). Identifying children at risk for reading failure: Curriculum-based measurement and the dual-discrepancy approach. In H. L. Swanson, K. R. Harris, & S. Graham (Eds.), Handbook of learning disabilities (pp. 431–449). New York: Guilford Press.
Fuchs, L. S., Deno, S. L., & Mirkin, P. (1984). Effects of frequent curriculum-based measurement and evaluation on pedagogy, student achievement, and student awareness of learning. American Educational Research Journal, 21, 449–460.
Fuchs, L. S., & Fuchs, D. (1998). Treatment validity: A unifying concept for reconceptualizing the identification of learning disabilities. Learning Disabilities Research & Practice, 13, 204–219.
Fuchs, L. S., Fuchs, D., & Hamlett, C. L. (1989a). Effects of alternative goal structures within curriculum-based measurement. Exceptional Children, 55, 429–438.
Fuchs, L. S., Fuchs, D., & Hamlett, C. L. (1989b). Effects of instrumental use of curriculum-based measurement to enhance instructional programs. Remedial and Special Education, 10(2), 43–52.
Fuchs, L. S., Fuchs, D., & Hamlett, C. L. (1989c). Monitoring reading growth using student recalls: Effects of two teacher feedback systems. Journal of Educational Research, 83, 103–111.
Fuchs, L. S., Fuchs, D., Hamlett, C. L., & Allinder, R. M. (1991). The contribution of skills analysis within curriculum-based measurement in spelling. Exceptional Children, 57, 443–452.
Fuchs, L. S., Fuchs, D., Hamlett, C. L., & Ferguson, C. (1992). Effects of expert system consultation within curriculum-based measurement, using a reading maze task. Exceptional Children, 58, 436–450.
Fuchs, L. S., Fuchs, D., Hamlett, C. L., & Stecker, P. M. (1990). The role of skills analysis in curriculum-based measurement in math. School Psychology Review, 19, 6–22.
Fuchs, L. S., Fuchs, D., & Maxwell, L. (1988). The validity of informal reading comprehension measures. Remedial and Special Education, 9(2), 20–28.
Hoover, H., Hieronymus, A., Frisbie, D., & Dunbar, S. (1993). Iowa Tests of Basic Skills. Chicago: Riverside.
Konopak, B. C. (1989). Effects of inconsiderate text on eleventh graders' vocabulary learning. Reading Psychology, 10, 339–355.
Lovitt, T. C., Plavins, M., & Cushing, S. (1999). What do pupils with disabilities have to say about their experience in high school? Remedial and Special Education, 20, 67–76, 83.
Marston, D. B. (1989). A curriculum-based measurement approach to assessing academic performance: What it is and why do it. In M. R. Shinn (Ed.), Curriculum-based measurement: Assessing special children (pp. 18–78). New York: Guilford Press.
Messick, S. (1994). The interplay of evidence and consequences in the validation of performance assessments. Educational Researcher, 23(2), 13–23.
Miller, S. E., Leinhardt, G., & Zigmond, N. (1988). Influencing engagement through accommodations: An ethnographic study of at-risk students. American Educational Research Journal, 25, 465–487.
Nagy, W. E., & Scott, J. A. (2000). Vocabulary processes. In M. L. Kamil, P. B. Mosenthal, P. D. Pearson, & R. Barr (Eds.), Handbook of reading research (Vol. III). Mahwah, NJ: Erlbaum.
Nolet, V., & Tindal, G. (1993). Special education in content area classes: Development of a model and practical procedures. Remedial and Special Education, 14, 36–48.
Nolet, V., & Tindal, G. (1994). Instruction and learning in middle school science classes: Implications for students with learning disabilities. The Journal of Special Education, 28, 166–187.
Nolet, V., & Tindal, G. (1995). Essays as valid measures of learning in middle school science classes. Learning Disability Quarterly, 18, 311–324.
Olson, G. H. (1989). On the validity of performance grades: The relationship between teacher-assigned grades and standard measures of subject matter acquisition. (ERIC Document Reproduction Service No. ED 307 290)
Rojewski, J. W., Pollard, R. R., & Meers, G. D. (1991). Grading mainstreamed special needs students: Determining practices and attitudes of secondary vocational educators using a qualitative approach. Remedial and Special Education, 12(1), 7–15, 28.
Salvia, J., & Ysseldyke, J. (1998). Assessment (6th ed.). Boston: Houghton Mifflin.
Scruggs, T. E., Mastropieri, M. A., & Boon, R. (1998). Science education for students with disabilities: A review of recent research. Studies in Science Education, 32, 21–44.
Shin, J., Deno, S. L., & Espin, C. A. (2000). Technical adequacy of the maze task for curriculum-based measurement of reading growth. The Journal of Special Education, 34, 164–172.
Stecker, P. M., & Fuchs, L. S. (2000). Effecting superior achievement using curriculum-based measurement: The importance of individual progress monitoring. Learning Disabilities Research & Practice, 15, 128–134.
Tindal, G., & Nolet, V. (1995). Curriculum-based measurement in middle and high schools: Critical thinking skills in content areas. Focus on Exceptional Children, 27(7), 1–22.
Tindal, G., & Nolet, V. (1996). Serving students in middle school content classes: A heuristic study of critical variables linking instruction and assessment. The Journal of Special Education, 29, 414–432.
Wagner, M. (1990). The school programs and school performance of secondary students classified as learning disabled: Findings from the National Longitudinal Transition Study of Special Education Students. Paper presented at the annual meeting of the American Educational Research Association, Boston.
Wesson, C., Deno, S., Mirkin, P., Maruyama, G., Sevcik, B., Skiba, R., et al. (1988). A causal analysis of the relationships among on-going measurement and evaluation, the structure of instruction, and student achievement. Journal of Special Education, 22(3), 330–343.
Willett, J. B. (1989). Questions and answers in the measurement of change. Review of Research in Education, 15, 345–421.