
The Correlation of Upper Elementary Spelling Levels and Morphological Awareness as

Measured Using Student Interviews

Abstract

Eighty-seven 4th, 5th and 6th grade students were administered the Derivational Relatedness

Interview (DRI) to explore students’ understanding of derivational morphology. During the same

week, the subjects were also administered an Upper-Level Spelling Inventory (USI) to determine

the students’ level of orthographic development. The DRI interviews were coded for higher level

cognitive responses and then compared to each student’s performance on the USI. MAXQDA

was used to code performance on the DRI using L3, a learning level lens developed for this

study. Correlations between the coded categories of the DRI and each student’s qualitative

spelling inventory (USI) were determined using Pearson r. Results showed a high correlation

between the increasing gradients of morphological comprehension as measured by L3© coding

and increasing USI scores.

The Correlation of Upper Elementary Spelling Levels and Morphological Awareness as

Measured Using Student Interviews

Table of Contents

Abstract

Table of contents

INTRODUCTION

What is morphology and why is it important

Correlations between morphology and reading comprehension

Correlations between morphology and orthography

Relationships between morphology, orthography and reading comprehension

Morphology’s current inclusion in literacy development

Assessments in morphological understanding

Interviews as assessments

Purpose of this study

METHOD

Pre-existing data

Upper-level spelling inventory

Table 1 Predictive and concurrent validity for USI

Table 2 Upper-level spelling inventory (USI)

Table 3 Reliability and validity for USI

Derivational Relatedness Interview

Table 4 Derivational relatedness interview flow chart

Dissertation research design

Figure 1 Explanatory design of study

MAXQDA coding

L3© coding

Coding of pauses

Table 5 L3© (Lens for Learning Levels) coding parameters

RESULTS

Table 6 Guilford’s suggested interpretations for values of r

L3© coding results

Table 7 Interview Pearson r correlations

Table 8 Interview coefficient of determination

DISCUSSION

Limitations of the present study

Conclusions and future prospects

REFERENCES

The Correlation of Upper Elementary Spelling Levels and Morphological Awareness as

Measured Using Student Interviews

What is Morphology and Why is it Important?

For centuries, spelling was used to capture the sounds of speech in order to place them on

documents for the purpose of sharing thoughts with others, and so the first step in learning to

read was to learn spelling (Mathews, 1966). English spelling has been similarly viewed simply as

a mirror of speech, with those words that diverged from the sound-to-letter correspondence

considered only deviations (Venezky, 1967). Since the 1980’s, English spelling has been

recognized as a combination of phonology (sound) and meaning, and reading comprehension has

been understood to depend on one’s grasp of this orthographic knowledge. Sounding out the

word phonetically and following rules of spelling patterns can trigger word recognition (tap and

tape), but recognizing spelling units that are related to another more familiar word is also useful

in word recognition (oppose and opposition) (Berninger, Abbott, Nagy, & Carlisle, 2010;

Templeton, 1989; Templeton & Morris, 2000).

The spelling of words captures their morphology – how they are assembled through the

combination of meaningful word parts (Templeton, Bear, Invernizzi, & Johnston, 2010).

Morphology has been variously defined in the literature (Bryant & Nunes, 2006; Carlisle, 2003;

Jarmulowicz & Hay, 2009; Mahony, Singson, & Mann, 2000; Nippold & Sun, 2008; Templeton

et al., 2010) as the study of how words are built and includes the study of morphemes, the

smallest units of meaning in a language. These can be free morphemes – free-standing words

such as soft, or bound morphemes – non-word units with meaning that are bound to words, such

as the suffix (–en in soften). Morphology also includes compounding (combining words to make

a single concept like strawberry), inflectional morphology which involves the grammatical

processes of inflection (marking categories like person, tense, and case; for example -s, -ed, -er),

and derivational morphology which involves deriving new words by adding prefixes and suffixes

(affixes) to existing root words (e.g., acceptable from accept, signal from sign) and Latin and

Greek roots (e.g., inspect from in- + -spect-; phonic from phon- + -ic). All prefixes and suffixes,

root words, and Greek and Latin roots are morphemes. A word itself can consist of one

morpheme (e.g., salamander, house) or two or more morphemes (e.g., predictable – pre- + -dict-

+ -able). The transparency of the word can be either phonologically neutral, where the root

word’s pronunciation does not change, or phonologically nonneutral (opaque), where the root

changes in stress and pronunciation. For example, open to openness, flavor to flavorful, and

small to smaller, maintain their phonological transparency, while vain to vanity, active to

activity, and atom to atomic, do not, and are therefore opaque. The combination of derivational

suffixes and root words can create nouns from adjectives and verbs (e.g., happy to happiness and

invent to invention) as well as adjectives from nouns and verbs (joy to joyful and read to

readable).

Whether morphemes are recognizable in print or in speech, their identification can be a

useful tool in the decoding of words, and being able to recognize a false relation in a foil pair is

also important (e.g., tail to tailor, ear to earth, and numb to numbers) (Anglin, 1993; Freyd &

Baron, 1982; Henry, 1988; Henry, 1989; Templeton, 1992; Templeton, 2004; White, Power, &

White, 1989; Verhoeven & Perfetti, 2011). By isolating and correctly assigning meaning to a

morpheme, the reader theoretically can limit the effort required for comprehension of the

otherwise unfamiliar word (Baumann et al., 2002; Berninger, Abbott, Nagy, & Carlisle, 2010;

Carlisle, McBride-Chang, Nagy & Nunes, 2010; Cunningham, 2006; Ebbers & Denton, 2008;

Harris, Schumaker, & Deshler, 2011; Kieffer & Lesaux, 2007; Kieffer & Lesaux, 2010; Larsen

& Nippold, 2007).

Correlations between Morphology and Reading Comprehension

Reading’s ultimate objective has been comprehension, and schools continuously refine

approaches to find the best way to help all students achieve that goal. The correlation between

derivational morphological knowledge and reading comprehension is well documented (Anglin,

1993; Berninger, Abbott, Nagy, & Carlisle, 2010; Carlisle, 2000; Freyd & Baron, 1982; Mahony,

1994; Mahony et al., 2000). In school text books, the morphologically-derived words increase

dramatically between grades four and seven, and the student’s knowledge of derivational

morphological relationships among words presumably increases in tandem (White, Power, &

White, 1989). Having an accurate measure of a student’s understanding of morphology could be

used to confirm the best methods to teach morphology and, by extension, accelerate the increase

in student’s reading comprehension.

A few studies have shown that understanding morphology is related to reading

comprehension (Carlisle, 2000; Freyd & Baron, 1982; Nagy, Berninger, & Abbott, 2006).

Baumann, Kame’enui, and Ash (2003) reviewed the research on vocabulary learning and

reiterated that the evidence of a causal link between vocabulary (including morphology) and

comprehension was historically long but empirically soft. Though they reported only a few

studies supporting the effectiveness of directly teaching morphology, all basal reading programs

contain instruction in structural elements such as base or root words, inflections, contractions,

derivations, and compound words. While they reported some studies that suggested that teaching

specific morphemic elements could show a statistically significant increase in the comprehension

of morphemically similar words, the efficacy of such instruction was documented to be limited.

They note that morphological knowledge has been shown to increase with increasing level of

reading comprehension, and vice-versa, so the question is raised whether the administered testing

is adequate to measure the improvement.

Correlations between Morphology and Orthography

Early research on the relationship between morphological awareness and spelling is

found in Rubin (1988). The study finds that early elementary students (kindergarten and first

grade) are less likely to include the sound of the final morpheme in two morpheme words in their

spelling when they demonstrated low scores on oral morphological awareness tasks. Carlisle

(1988) reports that with typically developing upper elementary students (Grade 4, 5, and 6) their

spelling of derived forms of a word lags behind their ability to orally generate the derived form.

These early researchers demonstrated a general relationship between morphology and spelling.

Almost thirty years later, Deacon, Kirby, and Casselman-Bell (2009) measured

morphological awareness in typically developing early elementary students in Grade 2, and

then assessed their spelling two years later in Grade 4. They report that while the results of

their study of children in grades 2 and 4 offer a first step in determining that morphological

awareness is a robust variable in determining spelling outcomes, “the research field is still in the

process of determining whether morphological awareness deserves a place as a robust factor in

literacy development” (p. 301). Casalis, Deacon, and Pacton (2011), reporting on French

children in Grades 3 and 4, write that while their results suggest that morphological awareness

exerts a broad influence on spelling, the link between the two did seem to be general because the

morphological awareness scores correlated with spelling words regardless of whether they

involved morphology or not. The word’s phonological structure and developmental level of the

child were also suggested to play a part in the link between morphological awareness and

spelling, thus agreeing with prior research that morphological awareness appears to be generally

connected to spelling outcomes. This general connection suggests “that the greater a child’s

awareness of morphology, the more accurately (and possibly fluently) he or she will spell” (p.

509).

It is noteworthy that there are models of spelling development that include morphological

awareness as a driver of spelling progress (e.g., Bear, Invernizzi, Templeton, & Johnston, 2008)

that are widely used in public schools.

Relationships between Morphology, Orthography and Reading Comprehension

Sensitivity to derivational morphemes not only is related to proficiency in reading and

spelling but is also a developmental skill that increases with age into adolescence (Anglin, 1993;

Carlisle, 2000; Carlisle & Fleming, 2003; Singson et al., 2000; Windsor, 1994). Morphology

should be considered as just one part of an overall comprehensive vocabulary and reading

comprehension program for upper elementary students (Kieffer & Lesaux, 2007).

Kieffer and Lesaux, (2007) report that morphology was equally important for reading

comprehension in native English speakers and English language learners. They suggest that

teachers can get a sense of the level of morphological understanding their students possess by

administering a developmental spelling inventory to inform the teacher’s instruction. By

correctly identifying where their students are on this developmental continuum, teachers can rely

on informed decision making to help their students better understand word structure in academic

vocabulary, thus aiding comprehension.

The Upper Level Spelling Inventory (USI) (Bear, Ivernizzi, Templeton, & Johnston,

2008), by tracking increasing accuracy in spelling, including the morphological aspects of

spelling, has been positively correlated with reading comprehension in upper elementary students

(Center for Research in Educational Policy [CREP], 2007).

Mann and Singson (2003) suggest that by Grade 4 (10 years of age), knowledge about the

structure of words is a better predictor of decoding ability than is phonological awareness.

Carlisle (2003) concludes that as children encounter longer, more complex words, morphological

awareness becomes critical for developing good literacy skills.

Morphology’s Current Inclusion in Literacy Development

Carlisle et al.’s (2010) integrative review of 16 studies notes that the current models of

literacy development seldom include morphological awareness. The studies in this review

represent a beginning effort to link morphological awareness to literacy development. The

various studies included instruction using activities, direct instruction of affixes and base words,

morphological problem solving, and instruction accompanied by morphological analysis

addressing word meanings, but most of the studies did not provide documentation of the transfer

or maintenance of the skills, and the research quality was subject to questions concerning reliable

and valid results. “(L)ittle has been done since 1970 to investigate the nature and value of

instruction in morphological awareness…(and) much work remains to be done” (pg. 481).

What is believed about morphology based on the research conducted to date? First,

morphological awareness can support literacy skills like word identification, comprehension and

reading fluency. Second, morphological knowledge can increase literacy skills by addressing

vocabulary skills and comprehension skills, and may support spelling patterns. Third,

phonologically opaque morphological tasks are more strongly associated with good reading than

phonologically transparent tasks (Green, 2009). However, there is still no compelling body of

research evidence for the efficacy of instruction in contextual and morphemic analysis that

would expand the vocabulary or the reading comprehension of children to date. This could

indicate a number of open questions: (a) whether the length of intervention is sufficient, (b) whether the time between pre- and post-testing is sufficient, (c) whether the morphological knowledge a student acquires is gained as it is needed rather than when it is taught by a teacher as the curriculum requires, and (d) whether the instruments of measurement are sensitive enough to accurately quantify the increase in knowledge.

Assessments in Morphological Understanding

A consistent theme to the discussion of morphological interventions has been variability,

including assessment measures. Most measures have reported validity and reliability, but the

reliability of some of the assessment measures used in morphological studies has also been

questioned (Goodwin & Ahn 2010). Some of the usual methods include cloze, multiple choice,

fill in the blank, and looking for errors by keeping track of right/wrong responses, analyzing for

the presence of word or word parts, frequency of words or word parts, and spelling of words or

word parts (Anglin 1993; Apel & Thomas-Tate 2009; Carlisle 2000; Collins 2005; Conti-

Ramsden, 2011; Jarmulowicz, 2006; Jarmulowicz & Hay 2009; Larsen & Nippold, 2007;

Mahoney, 1994). While these measures vary, they are based on an answer being judged correct

or incorrect, present or absent.

Eleven assessment tasks commonly used in educational research are employed by Siegel

(2008) to correlate morphological awareness with various reading and spelling tasks. The

sample size was large (1238 6th grade students) and all correlations in the study are significant (p

< .0001) but are limited to the moderate range and below. The correlations of morphological

awareness with reading and spelling range from .300 - .520, with a mean of .434, SD = .06.

These moderate correlations are reported to be consistent with prior studies that have shown

morphological awareness to be related to both reading and spelling, but while moderate

correlations indicate that the tasks are related, they still measure abilities that are somewhat

different. If the coefficient of determination is calculated, the highest reported morphological r

value (.520) equals an r² of .27, indicating a shared variance of only 27%; the remaining 73% is unexplained.

Finding a measure that accurately tracks incremental increases in morphological

understanding occurring in upper elementary students has been problematic (Pressley, Disney, &

Anderson, 2007; Reed, 2008). While direct teaching methods are research based, the quantitative

paper and pencil testing outcomes that would demonstrate a statistically significant correlation to

the students’ understanding have not been consistent. Conducting interviews in addition to the

usual testing has been tried, but the categorization methods have not proven sensitive enough to

demonstrate that long-term learning has taken place (Nagy, Herman, & Anderson, 1985a).

Typically those categories have been based on identification of specific knowledge, such as base

words, prefix and suffix. This type of analysis can be superficial and might be a reflection of

memorized or short term knowledge (Tyler & Nagy, 1990).

Interviews as Assessments

Typically, student answers are considered to be either correct or incorrect. Testing is

built on the assumption that a correct answer means that the student knows the answer. In the

intervention studies reviewed to this point, morphology has been taught directly, and then tested

after the teaching and/or intervention, most frequently with forced-choice instruments such as

Likert scale tests, multiple choice, yes/no questions, and fill-in-the-blank formats that check more

for memory than understanding (Constantinou, Hadjilouca, & Papadouris, 2010; Otterman, 1955;

Swanborn & de Glopper, 1999; Thompson, 1958, in Baumann, Kame’enui, & Ash, 2003). It is

also possible that research testing results reported as positive are subject to test sensitization

where higher scores are associated with familiarization (partial word knowledge) rather than

understanding (Swanborn & de Glopper, 1999). Teaching to the test does give the most effective

short term results (Pressley, Disney, & Anderson, 2007) and so immediate effects are usually

larger than delayed effects. Baumann et al. (2002) reported that the effect of the morphological

instruction provided to the 88 5th grade students in their study seemed to degrade over time so

there were no residual instructional effects on the 5 week delayed post-tests checking for an

increase in both morphemic and contextual analysis.

With respect to assessing or probing knowledge more broadly, Maclellan (2004)

explores the history of assessment tasks accompanied by the thoughts of twelve contemporary

academics on the likelihood of a truly authentic assessment. It was agreed that assessment tasks

themselves drive student learning and “the reproduction of cued knowledge, the application of

algorithms and the performance of drills and exercises do not of themselves reflect

understanding” (pg. 20), so most agree that a more accurate assessment should allow students to

demonstrate what they could do through both written and oral opportunities. Oral assessments

are given higher import than written. Even though the ability to accurately and reliably assess

critical thinking and reflection has yet to be fully developed, the need to delineate higher order

skills such as justification, analysis, and evaluation is deemed desirable, with the current

practices of assessing regurgitated or “knowledge” recall needing to be reduced. Marker

consistency and a simple marking system to differentiate between “trivial and non-trivial”

learning are high on their list to make the assessment workable. They also acknowledge that

cognitive activity is not directly observable, so there are many problems involved with

interpreting competence simply from observation. Nonetheless, talking with students remains the

best way to accurately assess what they are really thinking.

Student interviews are used infrequently to assess morphological awareness/knowledge

(Berninger, Abbott, Nagy, & Carlisle, 2010; Carlisle, McBride-Chang, Nagy, & Nunes, 2010;

Goodwin & Ahn, 2010; Larsen & Nippold, 2007) and typically yield less conclusive information

than multiple choice tests (Nagy, Herman, & Anderson, 1985b). Researchers have asked whether

the testing used was accurately measuring the knowledge obtained by students (Wysocki &

Jenkins, 1987). Nagy and Scott (2000) suggested that new research should consider that gauging

student development of word understanding should not be all-or-nothing (correct or incorrect)

but should gauge the growth incrementally. The National Research Council (NRC) calls for three

elements in all assessments: cognition, observation, and interpretation (NRC, 2001). Adams and

Wieman (2011) acknowledge that student interviews are rarely used in educational testing but

argue that interviews are valuable for their ability to examine how new knowledge makes sense

in terms of the student’s prior knowledge. “Student interviews are necessary to verify that

students interpret the question consistently and as intended” (pg. 1301).

Devising categories to accurately identify the higher learning or comprehension

demonstrated in children’s interviews has led to a variety of criteria, including variations on the

right/wrong reasoning approaches. Constantinou, Hadjilouca, and Papadouris (2010), investigating students’ understanding of the distinction between science and technology, code student responses in written tests (multiple-choice with an open-ended question) and

follow-up interviews, using a variety of coding categories. Written responses were coded as

either correct or incorrect, while interviews and open-ended written responses were subjected to

phenomenographic analysis. The 36 interviewed students’ responses are judged using six

different categories based on variations of perceived differences, with each main category being

subdivided into as many as five additional sub-categories. Within the 24 different forms of

coding, responses were evaluated using criteria including (a) depth and level of detail, (b)

insufficient or ambiguous discrimination, (c) irrelevant answers, (d) circular reasoning, (e)

incorrect answers, and (f) no answer. While the coding procedure is complex and time

consuming, the authors hold that their assessment approach using interviews to explore

variations of distinction could assist the shift from only assessing conceptual understanding to

help monitor student epistemological understandings.

Using a different approach, Larsen and Nippold (2007) used a teacher-directed interview (DATMA: Dynamic Assessment Task of Morphological Analysis) in which dynamic assessment techniques were used to prompt additional student information and to rate student answers on a scale of zero to five points, with five points given for a correct definition and a correct reference to the root morpheme in each question without further prompting, down to zero points when the child could not select the correct meaning after six different prompts. Each answer was coded as either correct or incorrect, weighted depending on the number of prompts, and then collectively analyzed.

While the teacher directed interview did reveal a range of skills, the relationship of the interview

results to other aspects of literacy (word knowledge and reading comprehension) were positive,

statistically significant, but moderate. When word scores were analyzed for the high, medium

and low student subgroups, the score of 1 was obtained more frequently in the low subgroup than

the high subgroup, but no other word scores were found to lie in any one subgroup more than

another. Other than the score of 1, there were no significant differences reported between

subgroups with their scoring method when analyzed by Tukey’s procedure.

The DATMA is a structured interview for 5th grade students where the answers are either

right or wrong. The interviewer has scripted prompts for incorrect answers, and the more

prompts needed to obtain the correct answer, the fewer points were given. When correlating the

DATMA results to the Oregon standardized achievement test for literacy development (OSA)

and the Peabody Picture Vocabulary Test (PPVT), DATMA to OSA was r = 0.50, p = .0002, and

DATMA to PPVT was r = 0.36, p = .01, both reported as moderate positive correlations that

indicated that DATMA was moderately and positively related to the children’s literacy levels.

The published correlation between the OSA and the PPVT was r = 0.65, p < .0001, which was

reported as strongly positively correlated (Larsen & Nippold, 2007).

Teasing out the components of comprehension has allowed a comparison to skill

acquisition typically obtained through direct teaching. Sesma, Mahone, Levine, Eason, and

Cutting (2009) examine higher level executive brain functions in 60 children aged 9 to 15 years. Using seven different standardized measures from different instruments thought to measure

cognitive processes supporting reading comprehension, they concluded that the executive

functions make a significant contribution to reading comprehension, but not to word recognition

skills. Their definition of executive function included goal-directed behavior that included

holding and manipulating information in working memory, multistep task planning, and coming

up with the “big picture” from details. Rapp, Van den Broek, McMaster, Kendeou, and Espin

(2007) also address investigation of higher-order comprehension processes as a key to helping

struggling readers. The traits, (a) comparing similarities and differences, (b) extrapolation to

associated concepts, and (c) summarization or integration (gist), are outlined as processes

underlying higher-order, coherence-oriented comprehension that could be used in research as

vehicles to investigate instructional approaches.

Taken together, these varied research perspectives suggest that coding should be as

simple as possible and that the student learning should be gauged as a range rather than simply

deemed right or wrong. The variety of coding categories that were reviewed fell into four major

categories: (a) correct answers using higher learning responses that verbalized similarity,

difference, gist, or extrapolation; (b) confused responses where partially correct, vague or

circular answers were given; (c) one word answers or simple phrases retelling information

recently taught directly (memorized by the student) or recollected (for example: Do you know

this word?); and (d) incorrect answers, questions that were answered with “I don’t know” or a

similar phrase, or questions that were simply not answered.

Purpose of this Study

Student interviews possess the potential to verify student knowledge. If categories based

on the degrees of higher cognitive reflections could accurately tap a student’s derivational

morphological knowledge-base, then the statistics generated should correlate either positively or

negatively to the results of their Upper Level Spelling Inventory (USI), a well-known

orthographic measure, and a proven correlate of reading comprehension.

Method

Pre-existing data

An overview of the unpublished study by Templeton, Smith, Moloney, VanPelt, and Ives

(2009), which was presented at the 59th annual meeting of the Literacy Research Association,

will describe the participant selection and the measures used to collect the data. In the 2007-08

school year, working with the Director of Elementary Curriculum and Instruction in a large

western United States urban school district, a number of schools were identified, contacted and

invited to participate in a study to investigate morphological knowledge and vocabulary. Of the

63 elementary schools in the district, eight principals in schools representing a cross-section of

socioeconomic neighborhoods within the district expressed interest and subsequently

participated in the study. Within those schools, the 4th, 5th and 6th grade teachers met with the

investigators where the project was described. A total of 28 teachers consented to participate. Of

the eight participating schools, five were Title I, and three were low-middle SES. In all

classrooms, student assent and parental consent was obtained. In addition, in order to administer

additional one-on-one assessments, three students were randomly selected from each of the 28

classrooms for additional assessments of morphological knowledge and vocabulary. This

subpopulation of 84 students provided the one-on-one interviews that are the subject of this

study. Data were not gathered on whether students were second language learners, though

students who were identified as special needs were not selected. All participating students were

administered the Upper Level Spelling Inventory (USI) (Bear, Invernizzi, Templeton, &

Johnston, 2008) which provides information about the level of students’ orthographic knowledge

and specific orthographic features that characterize those different developmental levels. Three

additional measures were individually administered with the randomly selected subpopulation of

84 students. These measures are the Test of Morphological Structure (Carlisle, 2000), the

Peabody Picture Vocabulary Test (Dunn, Dunn, & Pearson Assessments, 2007), and the

Derivational Relatedness Interview (DRI). This study examines each student’s performance on

the USI in relation to their performance on the DRI using MAXQDA (www.maxqda.com), a

computer assisted mixed-methods software program. Pearson r is the statistical function selected

to measure the correlation between the described measures.

Upper-Level Spelling Inventory

The Upper-Level Spelling Inventory (USI) (Bear, Invernizzi, Templeton, & Johnston,

2008) has a predictive validity of 0.617 with p < .001 for 5th grade reading comprehension,

indicating reliable differentiation between higher and lower performing students (Center for

Research in Educational Policy [CREP], 2007) (see Table 1).

Table 1. Predictive and Concurrent Validity for USI

Upper Form Spelling Inventory (USI) Predictive and Concurrent Validity

                                    Includes all students        Excludes ELL, SPED, Gifted students
                                    Predictive   Concurrent      Predictive   Concurrent
                                    Validity     Validity        Validity     Validity
Fifth Grade
Reading                             0.638        0.611           0.559        0.544
Word Analysis Vocabulary            0.647        0.660           0.577        0.637
Reading Comprehension               0.617        0.583           0.535        0.467
Literacy Response and Analysis      0.543        0.534           0.463        0.459
Writing Conventions                 0.522        0.533           0.500        0.539
Writing Strategies                  0.480        0.464           0.355        0.376
CST ss ELA                          0.633        0.589           0.519        0.492
CST pl ELA                          0.585        0.601           0.525        0.568

Center for Research in Educational Policy [CREP], 2007, pg. 19.

According to Bear, Invernizzi, Templeton, and Johnston (2012) this upper form USI is

used widely from upper elementary and middle school through college level. Used to assess

students’ orthographic knowledge, the score from this upper-level list (see Table 2) is an

indication of a student’s ability to make meaning connections using the word’s orthography,

whether there are similarities in sound (sail to interpret sailor) or differences in sound (medicine

to interpret medicinal). Based on the increasing word difficulty, both the number of words

spelled correctly and the correct spelling of the specified word parts combine to provide a score

that indicates the ability of a student to make morphological connections between words, and to

target areas where a student would need additional assistance. These USI scores are also used to

identify the student’s spelling stage which then suggests the student’s reading stage. The USI is

highly reliable (see Table 3). For example, scores of 183 fifth graders on the USI significantly

predicted their scores on the Word Analysis subtest of the CST four months later (Bear et al.,

2012).

Table 2. Upper-Level Spelling Inventory (USI)

Bear, Invernizzi, Templeton and Johnston, 2012, p. 257

Table 3. Reliability and Validity for USI

Bear, Invernizzi, Templeton and Johnston, 2012, p. 29

Derivational Relatedness Interview

The Derivational Relatedness Interview (DRI) was developed as a measure of explicit

morphological knowledge (Templeton, Smith, Moloney, & Ives, 2007; Templeton, Smith,

Moloney, VanPelt, & Ives, 2009). The instrument consists of seven word pairs that are

derivationally related plus two unrelated word pairs that graphically appear similar (see Table 4).

For administration, the researcher visually presents each word pair with bases or roots aligned in

order to make the relatedness of the words more salient. The researcher asks the participant to

read each word; the word is supplied if there is no response within three seconds, or if the word

is mispronounced. The participant is asked if s/he has seen or heard the word before. The

researcher then initiates the conversation of each word pair with the prompt: “Tell me a way

these two words are similar to each other.” A flow chart is followed in order to standardize

subsequent probing questions; however, the exact phrasing of follow-up questions is based on

the wording of individual responses. Before moving on to the next word pair, the researcher asks,

“What does the word ______mean to you?” for each word (see Table 4).

Table 4. Derivational Relatedness Interview Flow Chart

READ: I am going to be asking you questions about words. There is no right or wrong answer to these questions; we really just want to know what you think about certain pairs of words. Your thoughts about these words will help teachers learn how to teach vocabulary better.

First, I will show you a pair of words and ask you to read them. Then, I will ask you some questions about that pair of words. Let’s try a couple of practice word pairs.

POINT: Show word pair. Wait two seconds; if the participant does not read the words, read them aloud. Point to each word as you say it aloud.

Sample 1: hunt / hunted

Sample 2: trip / triple

Pair 1: trust / distrust

Pair 2: predict / predictable

Pair 3: admire / admiral

Pair 4: cave / cavity

Pair 5: equal / equality

Pair 6: fraction / fracture

Pair 7: mistake / unmistakable

The word pairs, other than the two unrelated pairs, are derivationally related with varying

levels of transparency, and range from simple affixation (trust/distrust) to affixation with sound

and/or spelling change (equal/equality; cave/cavity). To mitigate the influence of vocabulary,

base words were chosen that, according to Dale and O’Rourke (1981), are known by a majority

of fourth through sixth-grade students. Audio recordings were made of each interview and later

transcribed. Each taped interview was approximately 10-15 minutes in length.

Dissertation Research Design

The study uses a mixed methods research design, which is defined by Creswell and Plano Clark

(2007) as follows:

As a method, it focuses on collecting, analyzing, and mixing both quantitative and qualitative data in a single study or series of studies. Its central premise is that the use of quantitative and qualitative approaches in combination provides a better understanding of research problems than either approach alone. (p. 5)

In this study, qualitative data is collected (student interviews) and it is transformed by

MAXQDA into quantitative data by counting the number of coded responses and by counting the

number of keystrokes in each response. Concurrent with the qualitative data collection,

quantitative data is collected (USI). According to Creswell and Plano Clark (2007) this type of

content analysis study is included as a ‘gray area’ of mixed methods studies because “both

qualitative and quantitative data analysis is going on” (pg. 11, 12).

In this study, the Explanatory Design is used to help explain and to build upon the initial

quantitative results. The Explanatory Design is described as a four-phase mixed-methods approach. First,

the quantitative portion of the research was conducted and the data were analyzed. Quantitative

data analysis helped identify results that would confirm or deny the interpretation of the collected

qualitative data and the validity of the coding. Figure 1 presents the schematic Explanatory

Design of this mixed-methods study. In Figure 1, the term quantitative is referred to as “quan”,

and the term qualitative is referred to as “QUAL”. Using capitalization of the letters in mixed-

methods research is recommended by Tashakkori and Teddlie (1998) to more clearly illustrate

the relative priority that each method contributes to the study. In this study the emphasis was

placed on collection and analysis of qualitative data to explore its relationship to the quantitative

data.

Figure 1. Explanatory Design of study

MAXQDA Coding

MAXQDA is a software program used to analyze qualitative data by integrating both

qualitative and quantitative methods (www.MAXQDA.com). The collected qualitative data can

be coded so that the data can be classified and transformed into quantitative data with

quantitative aspects such as number of segments and frequency of occurrence. In this way, a

mixed type of analysis can be performed by transforming the occurrence of codes as well as their

frequencies into categorical variables. The numerical data of codes and attributes can then be

exported into Microsoft Excel and/or PASW Statistics (formerly SPSS) to be analyzed in terms

of statistical correlation. “Text search tools, such as MAXQDA…allows researchers to

interrogate the data set, look for co-occurrences of codes or themes, relationships between codes,

and to play with ideas in an exploratory fashion” (Lewins & Silver, 2008, p. 11, as cited in D’Andrea,

Waters, & Rudd, 2011, p. 49). In this study, MAXQDA is used to tabulate and record each

coded answer (‘number’ of responses), as well as the number of typed characters contained in

each coded answer (‘amount’ in response).
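
To make the ‘number’ and ‘amount’ measures concrete, the following Python sketch tallies, for each L3© category, the count of coded responses and the total keystrokes per response. The short coded segments are drawn from the examples in Table 5; in the study itself these tallies were produced by MAXQDA and exported for statistical analysis, so this is only an illustration of the bookkeeping, not the software.

    # Minimal sketch of the 'number' and 'amount' tabulation (illustrative data).
    from collections import defaultdict

    coded_segments = [  # (L3 code, transcribed student response)
        ("GOOD", "That you did something wrong that you didn't want to do."),
        ("INCOMP", "Because they might kind of mean the same thing."),
        ("LACKING", "Um like a diamond or something."),
    ]

    number = defaultdict(int)  # count of coded responses per category
    amount = defaultdict(int)  # total keystrokes (characters) per category

    for code, response in coded_segments:
        number[code] += 1
        amount[code] += len(response)  # keystrokes approximated by character count

    print(dict(number), dict(amount))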

L3© Coding (Lens for Learning Levels)

To code the student interviews, each interview was first listened to in its entirety, to gain

insight into prosody and meaning that is captured by the student’s spoken word. Mauthner and

Doucet (1998) suggest that by listening, the researcher can “retain some grasp over the blurred

boundary between their narratives and our interpretation” (pg. 127-128). It has also been

suggested that listening to the taped interview not only checks for accuracy of transcription, but

aids the coder in forming initial reactions and informed assumptions that influence interpretation.

This active listening allows a better sense of how the person interviewed wants to be understood

(McCormack, 2004).

All interviews were coded to identify the levels of learning (L3© - lens for learning levels)

using the following guidelines:

1. GOOD COMPREHENSION – Student verbiage indicates some higher knowledge, where

there is probably no need to reteach. Student use of words indicates some higher learning

connections (similarity, difference, gist and/or extrapolation). It is important to keep in

mind that these are elementary students with grade school vocabularies. For instance,

when students were asked to define a word, if they used their own words to correctly

explain the meaning and/or related the word to an experience as a correct explanation, it

was coded GOOD. If the student was asked if they knew another word that started with

the prefix or suffix being discussed, and then gave a correct word, it was coded GOOD.

When there was even a small amount of higher learning connection found in a student’s

correct response, it was coded GOOD.

2. INCOMPLETE COMPREHENSION – Student verbiage indicates somewhat correct

answers but appears to be “using but confusing”: either it is probable that a mini-lesson would be sufficient to correct the misunderstanding or to link it to a higher learning understanding, or there is some correct basis verbalized, indicating existing knowledge to build on that would likely require only a small effort on the part of both student and teacher to reach good understanding. Most of the student responses fell into

this coding category.

3. LACKING COMPREHENSION – Student verbiage indicates a wrong answer, and is

usually accompanied by the impression that it would take time to discover a basis for

knowledge building, or to discover what the misunderstanding is.

Coding of Pauses

Listening to students’ interviews brought an additional student response to light that had not

been addressed by the previously referenced studies. Many students included frequent pauses of

varying lengths that were either silent or filled with vocalizations such as “erm’s”, “um’s” and

“er’s”. A literature search suggested some options for coding. Over the last 50 years, the

understanding of the purpose of pauses has evolved from being primarily viewed as performance

errors and as a nuisance, to being recognized as correlating to the processes involved with speech

production (including grammatical, stylistic and articulatory function) as well as being linked to

brain functions accessing higher learning centers for thought production (Edrington, Buder, &

Jarmulowicz, 2009; Hawkins, 1971; Ruhlemann, Bagoutinov, & O’Donnell, 2011). In

spontaneous speech, when speakers found themselves at a loss for words, pauses were the

result, and it was reported that pause length increased to longer than three seconds with

increasing indecision on the part of the speaker (Goldman-Eisler, 1958a; Goldman-Eisler, 1958b;

Goldman-Eisler, 1960). Conversational narrative has had four different types of pauses identified

and investigated: two types of filled pauses, the short silent pause, and the long silent pause. The

short filled pauses (both the “er” and the “erm”) served as discourse markers and placeholders

for planning/recall time, the short silent pause (up to five seconds) served both as planning and

time to recall facts from memory as well as time for necessary articulatory processes, and long

silent pauses (longer than five seconds) were used to temporarily suspend conversation

(Ruhlemann, Bagoutinov, & O’Donnell, 2011). Schonpflug (2008) reported that in retelling a

story, children between the ages of 8 and 10 used fewer and shorter pauses when using gist than

when trying to recall the story verbatim. The conclusion was that fluent retrieval with the shorter

pauses was an indication of higher cognitive processing. However, very short pauses were also

caused by speech production processes (Goldman-Eisler, 1960). Traditionally, minimum cut-off

points for pause durations have ranged from 0.25 seconds to close to 2 seconds since proper

articulation requires short periods of silence (Kowal, Wiese, & O’Connell, 1983).

The timing of student pauses in response to teacher questions revealed that the more difficult

questions were related to an increase in the student’s pause, with the average pause length being

just over 3 seconds for all answered questions (Arnold, Atwood, & Rogers, 1974). Providing

adequate wait-time to allow a sufficient amount of time for the student to think has proven

difficult for teachers, with wait-times reported in normal school teaching situations lasting

around one second (Arnold, Atwood, & Rogers, 1974; Gambrell, 1983). When wait-time was

increased to a value just above three seconds, student discourse improved and higher cognitive

achievement was reported (Honea, 1982; Tobin, 1987). The proper amount of wait time

appropriate for the student was reported to vary depending on the difficulty of the question and

the knowledge base of the student, and was determined to be best decided by the questioner.

Lengthy pauses could be unproductive. As a guideline, uncomfortable lengthy pauses were

reported to last around 5 seconds or more, so the instructor needed to remain sensitive to every

student’s needs in order to avoid an increase of confusion or heightened frustration (Stahl, 1994).

Based on the research on pauses presented above, very short pauses (2 seconds and less) have been

associated with thinking time when there was good comprehension, as well as with speech

production issues. Pauses lasting around 3-4 seconds, whether filled or unfilled, have been

associated with incomplete higher comprehension, where more recall time was needed to access

information. Longer unfilled pauses that were determined by the questioner to be uncomfortable

had been associated with an absence of knowledge.

For this study, all short unfilled pauses of less than four seconds will not be coded. These

short pauses have been identified with processes of speech production as well as adequate and

inadequate knowledge, but differentiation cannot be accurately determined in this study’s

interviews. Short, filled pauses of less than four seconds will be coded as incomplete

comprehension as they have been associated with more time needed for recall. Long unfilled

pauses (more than 4 seconds) will be coded as lacking comprehension based on Stahl’s

observation of unproductive wait-time.
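
These pause-coding rules can be summarized as a small decision function. The sketch below only restates the stated thresholds; the filled/unfilled flag and the handling of a long filled pause (left uncoded here, since the text does not assign it a category) are assumptions of the illustration.

    # Sketch of the pause-coding rules stated above (durations in seconds).
    def code_pause(duration_s, filled):
        # Short pauses (< 4 s): filled pauses suggest extra recall time,
        # so they are coded INCOMPLETE; short unfilled pauses are not coded.
        if duration_s < 4:
            return "INCOMPLETE" if filled else None
        # Long unfilled pauses (> 4 s) are treated as unproductive wait-time.
        if not filled:
            return "LACKING"
        return None  # long filled pauses are not assigned a category in the text

    # Example: a 5-second silent pause would be coded as LACKING.
    assert code_pause(5, filled=False) == "LACKING"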

The coding parameters for L3©, including examples, are presented in Table 5.

Table 5. L3© (Lens for Learning Levels) Coding Parameters

Quantitative Parameters / Qualitative Parameters / Examples

GOOD Response

Correct answers accompanied by higher level learning comments.

Higher level learning is identified by verbalization containing similarity, difference, gist, and/or extrapolation

Student verbiage includes higher knowledge, where there is probably no need to reteach.

It is important to keep in mind that these are elementary students with grade school vocabularies.

When a student is asked to define a word, and they use their own words (synonyms for the original word) to correctly explain the meaning and/or relate the word to an experience as a correct explanation (gist/extrapolation).

Specific example: When asked “What does mistake mean to you?” a GOOD response is “That you did something wrong that you didn’t want to do.” (gist/extrapolation)

When a student is asked if they knew another word that started with the prefix or suffix being discussed and gives a correct word (extrapolation).

INCOMP Response

Generally correct answers that are not accompanied by higher level learning comments.

Can appear as one word answers or simple phrases that appear to be simple memorization.

Confused responses where partially correct, vague or circular answers are given.

Short filled pauses of less than a 4 second duration.

Student verbiage shows that it is probable that a mini-lesson would be sufficient to either correct the misunderstanding or link to a higher learning level.

Verbiage includes somewhat correct answers but appears to be ‘using but confusing’.

Verbiage shows some correct basis, where an existing knowledge could be used to build on that would likely require a small effort on the part of both student and teacher to take the student to GOOD understanding.

Specific example: When asked “Why do you think equal are both together in both of these words?” (equal & equality), an INCOMP response is “Because they might kind of mean the same thing”.

Specific example: When asked “Can you read the bottom word for me?”, the student response of the correct pronunciation of the written word such as, “trust”, is INCOMP because while correct, it does not indicate higher level learning.

Specific example: When asked “What does hunt mean to you?” and the student response defines only by using the word as in, “It means to hunt something.”

LACKING Response

Incorrect answers.

Questions answered with “I don’t know” or similar response.

Unfilled long pauses lasting for more than 4 seconds.

Student verbiage leaves the impression that it would take a chunk of time to discover a basis for knowledge building.

Verbiage indicates it would take a chunk of time to discover what the misunderstanding is.

Specific example: When asked “Take a guess at what admiral might mean” a LACKING response is “Um like a diamond or something.”

Inter-rater reliability was determined following a protocol suggested by Armstrong et al.

(1997). Fifteen percent of the DRI interviews were randomly selected from the pool of student

interviews and the coding was done together as a group. The disagreements averaged less than

10% and were discussed by the group until consensus was reached.
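
As a rough illustration of checking coder agreement on a double-coded subsample, the sketch below computes a simple percent agreement from two hypothetical code lists; it is not the Armstrong et al. (1997) protocol itself, and the codes shown are invented.

    # Sketch: percent agreement between two coders on a subsample (hypothetical codes).
    coder_a = ["GOOD", "INCOMPLETE", "LACKING", "GOOD", "INCOMPLETE"]
    coder_b = ["GOOD", "INCOMPLETE", "GOOD", "GOOD", "INCOMPLETE"]

    agreement = 100 * sum(a == b for a, b in zip(coder_a, coder_b)) / len(coder_a)
    print(f"{agreement:.0f}% agreement")  # disagreements were then resolved by group consensus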

The Pearson r correlation is used to measure the correlation between the DRI and the

USI. In addition, Pearson r² is calculated to report the coefficient of determination, which

explains the percent of variance shared between the DRI and the USI.
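
A minimal sketch of this computation, assuming placeholder score lists rather than the study’s actual data, is:

    # Sketch: Pearson r between a coded DRI percentage and USI totals, then r squared.
    from statistics import mean, stdev

    pct_good = [40.0, 55.0, 30.0, 62.0, 48.0]  # % GOOD responses per student (hypothetical)
    usi_total = [18, 25, 12, 27, 22]           # total USI score per student (hypothetical)

    n = len(pct_good)
    mx, my = mean(pct_good), mean(usi_total)
    r = sum((x - mx) * (y - my) for x, y in zip(pct_good, usi_total)) / (
        (n - 1) * stdev(pct_good) * stdev(usi_total))
    r_squared = r ** 2  # coefficient of determination: percent of shared variance

    print(round(r, 3), round(r_squared, 3))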

Results

This study examines each student’s performance on the USI in relation to their

performance on the DRI as measured with coding by the lens for learning levels (L3©) using

MAXQDA (www.maxqda.com), a computer assisted mixed-methods software program. Pearson

r is the statistical function selected to measure the correlation between the described measures.

The suggested interpretations for values of r that will be used to describe the results of this study

were taken from Sprinthall (2003), pg. 287, and are presented in Table 6.

Table 6. Guilford’s suggested interpretations for values of r

r Value          Interpretation
Less than .20    Slight; almost negligible relationship
.20-.40          Low correlation; definite but small relationship
.40-.70          Moderate correlation; substantial relationship
.70-.90          High correlation; marked relationship
.90-1.00         Very high correlation; very dependable relationship

L3© Coding Results

A total of 70 DRI student interviews were coded using the L3© protocol. Each interview

lasted approximately 10-15 minutes. The basic structure of the interviews remained constant, but

students talked at different rates, did different amounts of talking for their responses, and the

researcher probed with different numbers of questions to explore a specific student’s knowledge

base. These differences account for the differing interview times. Because the interview is semi-

structured, the number and content of the questions and student responses varied with each

interview. The total number of coded student responses ranged from 45 to 91 and averaged 68.

There is zero correlation between the student total number of responses and the total USI. The

amount of student talking is measured by counting the number of keystrokes in the typed

transcript of the student verbal responses. The total quantity of keystrokes (reported as amount)

per interview ranged from 980 to 4842, with an average of 2168 keystrokes. There is zero

correlation between the student total amount (quantity of keystrokes) and the total USI.

Because the interviews varied considerably by number of responses and the amount of

talking done by the student, each coding category is reported as a percent of the total for each

student’s interview. The ‘number’ of responses for GOOD, INCOMPLETE, and LACKING is

expressed as a percent, with the number of responses within each category divided by the total

number of responses coded for each interview. The ‘amount’ in the response for GOOD,

INCOMPLETE, and LACKING is expressed as a percent, with the number of keystrokes

recorded for each category divided by the total number of keystrokes coded for each interview.
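
A brief sketch of this normalization for a single interview, using invented raw counts, is:

    # Sketch: express each L3 category as a percent of the interview's totals.
    number = {"GOOD": 20, "INCOMPLETE": 38, "LACKING": 10}      # responses per category
    amount = {"GOOD": 900, "INCOMPLETE": 1000, "LACKING": 268}  # keystrokes per category

    pct_number = {k: 100 * v / sum(number.values()) for k, v in number.items()}
    pct_amount = {k: 100 * v / sum(amount.values()) for k, v in amount.items()}

    print(pct_number)  # each category as a percent of all coded responses
    print(pct_amount)  # each category as a percent of all coded keystrokes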

The researcher’s coding validity was verified with an initial inter-rater reliability of 91%,

with all disagreements being discussed until consensus was reached. Using Excel, Pearson r was

calculated for each L3© coded parameter to attain the correlations with the student’s total USI

score. The data is presented in Table 7.

Table 7. Interview Pearson r correlations (N = 70) – Pearson r correlations between L3© coding categories and the total USI score (% in Number and % of Amount for GOOD, INCOMPLETE, and LACKING responses)

% No.     % No.     % No.      % Amount   % Amount   % Amount
GOOD      INCOMP    LACKING    GOOD       INCOMP     LACKING
+0.66**   +0.07     -0.56**    +0.73**    -0.53**    -0.47**

**p < .01

The number of GOOD responses (r = 0.66, p < .01) indicates a statistically significant

moderate positive correlation with the USI. The number of LACKING responses (r = -0.56, p

< .01) indicates a statistically significant moderate negative correlation with the USI. The

number of INCOMPLETE responses did not correlate with the USI (r = 0.07).

The amount in GOOD responses (r = 0.73, p < .01) indicates a statistically significant

high positive correlation with the USI. The amount in INCOMPLETE responses (r = -0.53, p

< .01) and the amount in LACKING responses (r = -0.47, p < .01) both indicated statistically

significant moderate negative correlations with the USI.

In summary, all coding categories but one, the number of INCOMPLETE, indicate

statistically significant moderate to high correlations to the USI. All of the coding categories for

the amount in the responses indicate statistically significant moderate to high correlations with

the student USI.

Pearson r expresses the strength of an association, but r values are not proportional. To determine proportional strength, for instance whether one measure has twice the predictive power of another, Pearson r must be squared to produce r², the coefficient of determination (Sprinthall,

2003). The coefficient of determination data is presented in Table 8.

Table 8. Interview coefficient of determination (N = 70) – Pearson r² between L3© coding categories and the total USI score (% in Number and % of Amount for GOOD, INCOMPLETE, and LACKING responses)

% No.    % No.     % No.      % Amount   % Amount   % Amount
GOOD     INCOMP    LACKING    GOOD       INCOMP     LACKING
0.43     0.0       0.31       0.53       0.28       0.22

Viewing the L3© coding parameters as coefficients of determination, only the percent of the amount of GOOD student responses explains a little over half of the variance in the total USI score. The number of GOOD responses explains close to half, but the remaining parameters fall below one-third.

all students had the same USI score, the variation in the coding in GOOD responses (both

amount and number) would decrease by around 50%, meaning more uniformity of the coding for

any level of USI score (Sprinthall, 2003, p. 289). The remaining coding parameters would

decrease variations by less than 33%, which does explain an amount of the variation, but not as

much as the GOOD responses. The coefficient of determination provides a more precise

explanation of the high correlations of the USI with the L3© coding by pointing out the amount of

variations that are/are not explained by the coding parameters (Sprinthall, 2003, p. 287-89).

The coefficient of determination, because it is based on equal intervals (unlike the Pearson r), also allows the relative predictive power of one category to be compared to another. In this way, the amount in GOOD responses is about 23% more effective than the number of GOOD responses in predicting the USI (0.53 / 0.43 = 1.23, i.e., 23% more). The number of LACKING responses is about 41% more effective than the amount of talking a student does with a LACKING response (0.31 / 0.22 = 1.41, i.e., 41% more). The ratio of amount to number for INCOMPLETE responses could not be calculated because the number of INCOMPLETE responses showed no correlation (0.0) with the USI; however, the amount in INCOMPLETE responses (r² = 0.28) was more highly correlated with the USI than the number of INCOMPLETE responses (r² = 0.0).

Discussion

The purpose of this mixed methods study was to attempt to construct a lens for coding

interviews exploring students’ knowledge of morphology that could tease out the higher learning

levels of student knowledge. If categories based on the degrees of higher cognitive reflections

could accurately tap a student’s derivational morphological knowledge-base, then the statistics

generated should correlate either positively or negatively to the results of their Upper Level

Spelling Inventory (USI), a proven correlate of reading comprehension.

The coding lens that resulted from those efforts, L3© , was constructed by first looking for

aspects of higher learning that were based on evidence of similarity, difference, gist and

extrapolation within each individual student response in order to tease out the higher learning

functions (GOOD). The remaining student responses were categorized as wrong answers

(LACKING) or as falling somewhere in between (INCOMPLETE). The final coding rubric, a previously unpublished rubric for coding interviews, contains both quantitative and qualitative aspects. All three coding categories were figured in two ways: by ‘amount’ in responses and by ‘number’ of responses. The amount of talking was compiled by counting the number of keystrokes within each transcribed response and then totaling the amounts for each L3© category; the number of responses was tallied by counting the responses coded for each L3© category.
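The study used MAXQDA for this coding and counting; purely as an illustration of the bookkeeping just described, the sketch below shows how the per-category ‘number’ and ‘amount’ tallies could be produced from a list of coded, transcribed responses. The (category, text) pair format and example responses are hypothetical, not MAXQDA’s export format or study data.

```python
# Illustration only (the study used MAXQDA): produce the per-category tallies
# of 'number' of responses and 'amount' of talking (keystrokes) from coded,
# transcribed responses. The (category, text) pairs are a hypothetical format.
from collections import defaultdict

def tally_l3_categories(coded_responses):
    totals = defaultdict(lambda: {"number": 0, "amount": 0})
    for category, text in coded_responses:
        totals[category]["number"] += 1          # one more coded response
        totals[category]["amount"] += len(text)  # keystrokes in the transcription
    return dict(totals)

print(tally_l3_categories([
    ("GOOD", "They both come from 'create', so 'creation' keeps the meaning of making something."),
    ("INCOMPLETE", "It has 'sign' in it, so maybe it is about signing?"),
    ("LACKING", "I don't know."),
]))
```

These tallies are exactly the inputs assumed by the percentage sketch shown earlier.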

The percent of amount in GOOD responses was the only coding category that was highly correlated

with the USI. This marked relationship to the USI suggests that the amount of talking in a

GOOD response could be used as a tool to assist accurate assessment of a student’s

morphological knowledge (see Table 7).

The number of GOOD responses and the number of LACKING responses were moderately correlated with the USI, as were the amount in INCOMPLETE and the amount in LACKING responses. Although all four of these categories are statistically significant, with absolute r values ranging from 0.47 to 0.66, their correlations with the USI are only moderate; they are not expected to be as reliable for grouping students homogeneously in the classroom as the amount of talking in a GOOD response, with its high correlation and marked relationship to the USI. Overall, results showed a high correlation between the increasing gradients of morphological comprehension, as measured by L3© coding, and increasing Upper-Level Spelling Inventory scores.

The literature reviewed suggests that there is still no compelling body of research evidence for the efficacy of instruction in contextual and morphemic analysis in expanding children’s vocabulary or reading comprehension. Whether this indicates that knowledge of morphology must be acquired by a student as it is needed, rather than taught by a teacher when the curriculum requires it, or whether the instruments of measurement are simply not sensitive enough to accurately quantify the increase in knowledge, remains to be determined through continued research.

Alternative assessments have been used in assessing student achievement, but they are

time consuming to administer and evaluate. A central problem with alternative assessments is

figuring out how to accomplish the assessment in a manageable time period (Maeroff, 1991).

Student interviews, which are time consuming to administer, transcribe and code, are used

infrequently in educational research to assess morphological awareness/knowledge (Berninger,

Abbott, Nagy, & Carlisle, 2010; Carlisle, McBride-Chang, Nagy & Nunes, 2010; Goodwin &

Ahn, 2010; Larsen & Nippold, 2007) and have typically yielded less conclusive information than

multiple choice tests (Nagy, Herman, & Anderson, 1985b). These concerns are valid

impediments for general use of student interviews as an assessment of student knowledge, and

the question of the relative sensitivity of the measurement is ongoing. Our understanding of what

learning entails and how to accurately assess the learning continues to evolve.

Traditionally, only the number or percentage of right and wrong answers has been used

to determine the extent of student knowledge. L3© parameters redefine the traditional right/wrong

lens and go a step further. By factoring in the amount of student talking (i.e., the number of keystrokes in the words the child said), the results from this study indicate that the accuracy of determining student knowledge could be improved compared to merely tallying the number of responses (see Table 7). The percentage of GOOD responses does show a substantial correlation with the USI, but judged by the coefficient of determination, the amount of student talking about a concept the student really knows shows a correlation more than 20% greater than that obtained using only the number of responses. The amount of talking in the L3© GOOD category correlates more highly with the USI than the number of GOOD responses.

The INCOMPLETE category was developed to include student responses that indicated

they were “using but confusing.” When tallying the percent of the number of those responses,

there was no significant correlation to the student’s total USI. When the amount of talking was

taken into consideration, there was a significant and moderate, negative correlation. The more a

student talked and tried to explain when they did not completely understand a concept, the lower

their USI, and conversely, the less a student had to say about what they were not sure of, the

higher their USI. The data indicates that when a student understood that they were confused or

lacked sufficient knowledge, they gave short answers and did not elaborate. When the student

was unsure or did not explicitly recognize that they were not correct, their responses were

extended by giving circular answers, talking about something related to the subject, being

redundant, or going into details that gave insight into their thinking. This extended amount of talking was hidden from, and could not be described by, a simple tally of the number of INCOMPLETE answers (r² = 0.0) (see Table 8), but it is clearly shown to exist by using the L3© code with MAXQDA, which quantifies the amount of student talking (r² = 0.28; r = -0.53, p < .01, a moderate negative correlation of the amount in INCOMPLETE responses with the USI). The

amount of talking in the INCOMPLETE L3© category shows an increased negative correlation

with the USI when compared to the number of INCOMPLETE responses. The higher amount of

talking contained in the INCOMPLETE category was a more accurate indicator of student lack

of understanding than the simple number of responses in the INCOMPLETE category, which

had no correlation to the USI.

Both L3© LACKING categories correlated moderately and negatively to the USI. But

unlike the GOOD and INCOMPLETE categories, the number of LACKING responses was more strongly associated with a lower USI score (r² = 0.31) than was the amount of talking the student did to explain a LACKING response (r² = 0.22). The number of LACKING responses was 41% more closely correlated with the student’s USI score than the amount of talking (0.31 / 0.22 = 1.41, i.e., 41% more). Similar to the INCOMPLETE response results, the less a student talked, the

higher their USI score tended to be, and conversely, the more a student talked, the lower their

USI score tended to be. In this L3© LACKING category, however, the number of responses

showed a higher correlation to the USI than the amount in the responses.

The moderate negative correlation of the LACKING category with the USI is a significant contribution to the study’s results, and more can be learned by looking at the

difference between the amount in the responses and the number of responses in this category.

The increased predictive power of number over amount was not expected in the LACKING

category, because most students gave short answers, or answered ‘No’ or not at all when

responding to a question where they lacked knowledge. In reviewing the data, the coders

remarked that while the LACKING responses by students demonstrated a complete lack of

understanding, occasionally there were students who gave long answers that justified a logical,

yet completely incorrect response. Upon further examination, many of those responses did come

from students with higher USI scores. Those students’ long answers are a confounding factor for

the LACKING L3© category, but because they were recognized after the coding process had been

completed, they were not removed or recoded for this study. It was not expected that the negative

correlations of both L3© LACKING categories with the USI would radically change with the

elimination or recoding of those particular responses, but it was considered possible that the

amount in the response would show a different relative correlation to the number of responses. A

restructuring of the parameters for coding the LACKING category to better account for this phenomenon and remove it from the category could change the results for the L3© LACKING category. The possibility of fine-tuning a student assessment by taking into consideration the amount that is spoken is an unexpected positive result of this study.

Pearson r correlations between each L3© category and the total USI score demonstrated statistically significant moderate to high correlations in five of the six L3© coding categories. The L3© coding of the amount of talking contained in the student responses to the interviewer’s questions provided more significant results (all were significant) and generally larger correlations with the USI than the data derived from the number of responses that a student gave for each L3© category. The only L3© category that was not significantly related to the USI was the number of INCOMPLETE responses.

To summarize, in the GOOD responses the amount of talking that the student does in

responding to the question seems to indicate that the more they talk, the more they know: a

positive and high correlation with the USI. The amount of talking demonstrates 23% greater predictive power for the amount of knowledge than does tallying the number of responses that contained higher thinking parameters. The INCOMPLETE responses indicate that the more the

student talked, the less they knew. The amount in INCOMPLETE responses correlated

moderately and negatively with the USI, while the number of INCOMPLETE responses has

close to a 0.0 correspondence with the USI. The amount of talking, again, provides a more

accurate indication of the amount of higher thinking in this category. In the LACKING category,

the data indicates that for both number of responses as well as the amount in responses, the more

entries, the lower the student level of knowledge. Unlike the GOOD and INCOMPLETE

categories, the number of LACKING responses correlates more highly with the USI than the amount in the responses. Pearson r² shows that, in the LACKING category, the number of responses has 41% greater predictive power for the lack of knowledge than the amount of talking.

Limitations of the Present Study

L3© coding entails structuring the interview, recording, transcribing, coding, and interpreting the results. As such, it is too time intensive for regular classroom use.

Conclusions and Future Prospects

The insights gained using Pearson r and r² help explain learning levels within a population of 4th, 5th and 6th grade students. The high correlation (r = .73) between the amount in GOOD student responses and the USI suggests that cognitive reflections coded using L3© could be used to assess the orthographic knowledge of 4th, 5th and 6th grade students, as measured by the USI, more accurately than other more customary assessments, linking orthography and morphology.

The coding lens for learning levels, L3©, offers a method to assess student interviews that expands the more traditional right/wrong differentiation into variations of learning levels. The high correlation of L3© with the USI suggests that interviews could be used as an additional assessment measure relating a student’s knowledge of morphology and orthography.

Teachers customarily use informal student interviews to assess student knowledge in the

classroom, but there has been no reliable way to report the findings. If teachers can listen for the

higher learning levels as coded by L3© and find that their results correlate with the other usual

and customary assessment measures used in the classroom, interviews could then be more easily used as a valid assessment vehicle, eliminating the transcription and coding that make interviews time consuming to administer and evaluate. Further study is needed to explore the efficacy of using L3© simply to listen to student responses in order to determine a student’s knowledge level.

References

Adams, W. K., & Wieman, C. E. (2011). Development and validation of instruments to measure

learning of expert-like thinking. International Journal of Science Education, 33(9), 1289-

1312.

Anglin, J. M. (1993). Vocabulary development: A morphological analysis. Monographs of the

Society for Research in Child Development, 58(10, Serial No. 238).

Apel, K., & Thomas-Tate, S. (2009). Morphological awareness skills of fourth-grade African

American students. Language, Speech, and Hearing Services in Schools, 40, 312-324.

Arnold, D. S., Atwood, R. K., & Rogers, V. M. (1974). Question and response levels and lapse

time intervals. The Journal of Experimental Education, 43(1), 11-15.

Armstrong, D., Gosling, A., Weinman, J., & Marteau, T. (1997). The place of inter-rater reliability in

qualitative research: An empirical study. Sociology, 31(3), 597-606.

Baumann, J. F., Edwards, E. E., Font, G., Tereshinski, C. A., Kame’enui, E. J., & Olejnik, S.

(2002). Teaching morphemic and contextual analysis to fifth-grade students. Reading

Research Quarterly, 37(2), 150-176.

Baumann, J. F., Kame’enui, E. J., & Ash, G. E. (2003). Research on vocabulary instruction:

Voltaire redux. In J. Flood, J. Jensen, P. Lapp, & J. R. Squire (Eds.), Handbook of

research on teaching the English language arts (pp. 752-785). Mahwah, NJ: Erlbaum.

Bear, D., Invernizzi, M., Templeton, S., & Johnston, F. (2008). Words their way: Word study for

phonics, vocabulary, and spelling instruction. Upper Saddle River, NJ: Prentice Hall.

Bear, D. R., Invernizzi, M., Templeton, S., & Johnston, F. J. (2012). Words their way (5th ed.).

Boston, MA: Pearson/Allyn & Bacon.

Berninger, V. W., Abbott, R. D., Nagy, W., & Carlisle, J. (2010). Growth in phonological,

orthographic, and morphological awareness in grades 1 to 6. Journal of Psycholinguistic

Research, 39, 141-163.

Bryant, P., & Nunes, T. (2006). Morphemes and literacy: A starting point. In T. Nunes, P.

Bryant, U. Pretzlik, & J. Hurry (Eds.), Improving Literacy by Teaching Morphemes (pp.

3-33). New York, NY: Routledge.

Carlisle, J.F. (1988). Knowledge of derivational morphology and spelling ability in fourth, sixth

and eighth grades. Applied Psycholinguistics, 9, 247-266.

Carlisle, J. F. (2000). Awareness of the structure and meaning of morphologically complex

words: Impact on reading. Reading and Writing Journal, 12, 169-190.

Carlisle, J. F. (2003). Morphology matters in learning to read: A commentary. Reading

Psychology, 24, 291-322.

Carlisle, J. F., McBride-Chang, C., Nagy, W., & Nunes, T. (2010). Effects of instruction in

morphological awareness on literacy achievement: An integrative review. Reading

Research Quarterly, 45(4), 464-487.

Carlisle, J. F., & Fleming, J. (2003). Lexical processing of morphologically complex words in

the elementary years. Scientific Studies of Reading, 7, 230-254.

Casalis, S., Deacon, S.H., & Pacton, S. (2011). How specific is the connection between

morphological awareness and spelling? A study of French children. Applied

Psycholinguistics, 32, 499-511.

Center for Research in Educational Policy. (2007, February). Words Their Way spelling

inventories: Reliability and validity analysis (Fact Sheet). Memphis, TN: University of

Memphis.

Collins, L. (2005). Accessing second language learners’ understanding of temporal morphology.

Language Awareness, 14(4), 207-220.

Constantinou, C., Hadjilouca, R., & Papadouris, N. (2010). Students’ epistemological awareness

concerning the distinction between science and technology. International Journal of

Science Education, 32(2), 143-172.

Conti-Ramsden, G., Durkin, K., Simkin, Z., Lum, J. A., & Marchman, V. (2011). The PTT-20:

UK normative data for 5- to 11-year-olds on a 20-item past-tense task. International

Journal of Language Communication Disorders, 46(2), 243-248.

Creswell, J., & Plano Clark, V. (2007). Research design: Qualitative, quantitative, and mixed

methods approaches (2nd ed.). Thousand Oaks, CA: Sage.

Cunningham, P. (2006). What if they can say the words but don’t know what they mean? The

Reading Teacher, 59(7), 708-711.

Dale, E., & O’Rourke, J. (1981). Living word vocabulary. Chicago, IL: World Book/Childcraft

International.

Deacon, S.H., Kirby, J.R., & Casselman-Bell, M. (2009). How robust is the contribution of

morphological awareness to general spelling outcomes? Reading Psychology, 30, 301-

318.

Dunn, L. M., Dunn, D. M., & Pearson Assessments (2007). Peabody picture vocabulary test

[educational test]. Published instrument. Minneapolis, MN: Pearson Assessments.

D’Andrea, L. M., Waters, C., & Rudd, R. (2011). Using computer assisted qualitative software

(CAQDAS) to evaluate a novel teaching method for introductory statistics. International

Journal of Technology in Teaching and Learning, 7(1), 48-60.

Ebbers, S. M., & Denton, C. A. (2008). A root awakening: Vocabulary instruction for older

students with reading difficulties. Learning Disabilities Research & Practice, 23(2), 90-

102.

Edrington, J. L., Buder, E. H., & Jarmulowicz, L. (2009). Hesitation patterns in third grade

children’s derived word productions. Clinical Linguistics & Phonetics, 23(5), 348-374.

Freyd, P., & Baron, J. (1982). Individual differences in acquisition of derivational morphology.

Journal of Verbal Learning and Verbal Behavior, 21, 282-295.

Gambrell, L. B. (1983). The occurrence of think-time during reading comprehension instruction.

The Journal of Educational Research, 77(2), 77-80.

Goldman-Eisler, F. (1958a). Speech production and the predictability of words in context.

Quarterly Journal of Experimental Psychology, 10, 96-106.

Goldman-Eisler, F. (1958b). The predictability of words in context and the length of pauses in

speech. Language & Speech, 1(3), 226-231.

Goldman-Eisler, F. (1960). The distribution of pause durations in speech. Language & Speech,

4(1), 232-237.

Goodwin, A. P., & Ahn, S. (2010). A meta-analysis of morphological interventions: effects on

literacy achievement of children with literacy difficulties. Annals of Dyslexia, 60, 183-

208.

Green, L. (2009). Morphology and literacy: Getting our heads in the game. Language, Speech,

and Hearing Services in Schools, 40, 283-285.

Harris, M. L., Schumaker, J. B., & Deshler, D. D. (2011). The effects of strategic morphological

analysis instruction on the vocabulary performance of secondary students with and

without disabilities. Learning Disability Quarterly, 34(1), 17-33.

Hawkins, P. R. (1971). The syntactic location of hesitation pauses. Language & Speech, 14(3),

277-288.

Henry, M. K. (1988). Beyond phonics: Integrated decoding and spelling instruction based on

word origin and structure. Annals of Dyslexia, 38, 259-275.

Henry, M. K. (1989). Children’s word structure knowledge: Implications for decoding and

spelling instruction. Reading and Writing: An Interdisciplinary Journal, 2, 135-152.

Honea, Jr, J. M. (1982). Wait-time as an instructional variable: An influence on teacher and

student. The Clearing House, 56(4), 167-170.

Jarmulowicz, L. (2006). School-aged children’s phonological production of derived English

words: Theoretical/review article. Journal of Speech, Language and Hearing Research,

49, 294-308.

Jarmulowicz, L., & Hay, S. E. (2009). Derivational morphophonology: Exploring errors in third

graders’ productions. Language, Speech and Hearing Services in Schools, 40, 299-311.

Kieffer, M. J., & Lesaux, N. K. (2007). Breaking down words to build meaning: Morphology,

vocabulary, and reading comprehension in the urban classroom. The Reading Teacher,

61(2), 134-144.

Kieffer, M. J., & Lesaux, N. K. (2010). Morphing into adolescents: Active word learning for

English-language learners and their classmates in middle school. Journal of Adolescent &

Adult Literacy, 54(1), 47-56.

Kowal, S., Wiese, R., & O’Connell, D. C. (1983). The use of time in storytelling. Language and

Speech, 26(4), 377-392.

Larsen, J. A., & Nippold, M. A. (2007). Morphological analysis in school-age children: Dynamic

assessment of a word learning strategy. Language, Speech, and Hearing Services in

Schools, 38, 201-212.

MAXQDA-10 (2007). MAXQDA - The Professional Tool for Qualitative Data Analysis

[Software]. Published instrument. Available at www.MAXQDA.com

Maclellan, E. (2004). Authenticity in assessment tasks: a heuristic exploration of academics’

perceptions. Higher Education Research & Development, 23(1), 19-33.

Maeroff, G. I. (1991). Assessing alternative assessment. The Phi Delta Kappan, 73(4), 272-281.

Mahony, D. (1994). Using sensitivity to word structure to explain variance in high school and

college level reading ability. Reading and Writing: An Interdisciplinary Journal, 6, 19-

44.

Mahony, D., Singson, M., & Mann, V. (2000). Reading ability and sensitivity to morphological

relations. Reading and Writing: An Interdisciplinary Journal, 12, 191-218.

Mann, V., & Singson, M. (2003). Linking morphological knowledge to English decoding

ability: Large effects of little suffixes. In E. Assink & D. Sandra (Eds.), Reading complex

words: Cross-language studies (pp. 1-25). Dordrecht, the Netherlands: Kluwer.

Mathews, M. M. (1966). Teaching to read: Historically considered. Chicago, IL: The University

of Chicago Press.

Mauthner, N., & Doucet, A. (1998). Reflections on a voice-centred relational method. In J.

Ribbens & R. Edwards (Eds.), Feminist dilemmas in qualitative research (pp. 119-146).

London: Sage Publications.

McCormack, C. (2004). Storying stories: a narrative approach to in-depth interview

conversations. International Journal of Social Research Methodology, 7(3), 219-236.

NRC (National Research Council). (2001). Knowing what students know: The science and design

of educational assessment. In J. W. Pellegrino, N. Chudowsky, & R. Glaser (Eds.),

Committee on the foundations of assessment (Board on Testing and Assessment Center

for Education Division of Behavioral and Social Sciences and Education) (pp. 1-14).

Washington, DC: National Academy Press.

Nagy, W., Berninger, V. W., & Abbott, R. D. (2006). Contributions of morphology beyond

phonology to literacy outcomes of upper elementary and middle-school students. Journal

of Educational Psychology, 98, 134-147.

Nagy, W. E., Herman, P. A., & Anderson, R. C. (1985). Learning word meanings from context:

How broadly generalizable? (Tech. Rep. No. 347). Urbana-Champaign: University of

Illinois: Center for the Study of Reading.

Nagy, W. E., & Scott, J. A. (2000). Vocabulary processes. In M. I. Kamil, P. Mosenthal, P. D.

Pearson, & R. Barr (Eds.), Handbook of reading research (Vol. 3, pp. 269-284).

Mahwah, NJ: Erlbaum.

Nagy, W., Herman, P., & Anderson, R. (1985b). Learning words from context. Reading

Research Quarterly, 20, 233-253.

Nippold, M. A., & Sun, L. (2008). Knowledge of morphologically complex words: A

developmental study of older children and young adolescents. Language, Speech, and

Hearing Services in Schools, 39, 365-373.

Otterman, L. M. (1955). The value of teaching prefixes and word-roots. The Journal of

Educational Research, 48(8), 611-616.

Pressley, M., Disney, L., & Anderson, K. (2007). Landmark vocabulary instructional research

and the vocabulary instructional research that makes sense now. In R. K. Wagner, A. E.

Muse, & K. R. Tannenbaum (Eds.), Vocabulary acquisition: Implications for reading

comprehension (pp. 205-232). New York, NY: The Guilford Press.

Rapp, D. N., Van den Broek, P., McMaster, K. L., Kendeou, P., & Espin, C. A. (2007). Higher-

order comprehension processes in struggling readers: A perspective for research and

intervention. Scientific Studies of Reading, 11(4), 289-312.

Reed, D. K. (2008). A synthesis of morphology interventions and effects on reading outcomes

for students in grades K-12. Learning Disabilities Research & Practice, 23, 36-49.

Rubin, H. (1988). Morphological knowledge and early writing ability. Language and Speech,

31(4), 337-355.

Ruhlemann, C., Bagoutinov, A., & O’Donnell, M. B. (2011). Windows on the mind: Pauses in

conversational narrative. International Journal of Corpus Linguistics, 16(2), 198-230.

Schonpflug, U. (2008). Pauses in elementary school children’s verbatim and gist free recall of a

story. Cognitive Development, 23, 385-394.

Sesma, H. W., Mahone, E. M., Levine, T., Eason, S. H., & Cutting, L. E. (2009). The

contribution of executive skills to reading comprehension. Child Neuropsychology, 15(3),

232-246.

Siegel, L.S. (2008). Morphological awareness skills of English language learners and children

with dyslexia. Topics in Language Disorders, 28 (1),15-27.

Singson, M., Mahony, D., & Mann, V. (2000). The relation between reading ability and

morphological skills: Evidence from derivation suffixes. Reading and Writing, 12, 191-

218.

Sprinthall, R.C. (2003). Basic Statistical Analysis. (7th ed.). Boston, MA: A and B.

Stahl, R. J. (1994). Using “think-time” and “wait-time” skillfully in the classroom (ERIC

Development Team-ED370885 1994-05-00). Retrieved from Eric Digests:

http://www.eric.ed.gov

Swanborn, M. S., & de Glopper, K. (1999). Incidental word learning while reading: A meta-

analysis. Review of Educational Research, 69(3), 261-285.

Tashakkori, A., & Teddlie, C. (1998). Mixed-methodology: Combining qualitative and

quantitative approaches. Thousand Oaks, CA: Sage.

Templeton, S. (1989). Tacit and explicit knowledge of derivational morphology: Foundations for

a unified approach to spelling and vocabulary development in the intermediate grades and

beyond. Reading Psychology, 10(3), 233-253.

Templeton, S. (1992). Theory, nature, and pedagogy of higher-order orthographic development

in older students. In S. Templeton, & D. R. Bear (Eds.), Development of the orthographic

knowledge and the foundations of literacy: A memorial festschrift for Edmund H.

Henderson (pp. 253-277). Hillsdale, NJ: Erlbaum.

Templeton, S. (2004). The vocabulary-spelling connection: orthographic development and

morphological knowledge at the intermediate grades and beyond. In J. F. Baumann, & E.

J. Kame’enui (Eds.), Vocabulary instruction: Research to practice (pp. 118-138). New

York, NY: The Guilford Press.

Templeton, S., Bear, D. R., Invernizzi, M., & Johnston, F. (2010). Vocabulary their way: Word

study with middle and secondary students. Boston, MA: Pearson.

Templeton, S., & Morris, D. (2000). Spelling. In M. Kamil, P. Mosenthal, P. D. Pearson, & R.

Barr (Eds.), Handbook of reading research: Vol. 3 (pp. 525-543). Mahwah, NJ:

Erlbaum.

Templeton, S., Smith, D., Maloney, K., & Ives, B. (2007, December). The nature of morphology

in a developmental model of word knowledge. Paper presented at the 57th annual meeting

of the National Reading Conference, Orlando, FL.

Templeton, S., Smith, D., Moloney, K., Van Pelt, J., & Ives, B. (2009, December). Generative

vocabulary knowledge: Learning and teaching higher-order morphological aspects of

word structure in grades 4, 5, and 6. Paper presented at the 59th annual meeting of the

National Reading Conference, Albuquerque, NM.

Tobin, K. (1987). The role of wait time in higher cognitive level learning. Review of Educational

Research, 57(1), 69-95.

Tyler, A., & Nagy, W. (1990). Use of derivational morphology during reading. Cognition, 36,

17-34.

Venezky, R. L. (1967). English orthography: Its graphical structure and its relation to sound.

Reading Research Quarterly, 1, 59-85.

Verhoeven, L., & Perfetti, C. A. (2011). Morphological processing in reading acquisition: A

cross-linguistic perspective. Applied Psycholinguistics, 32, 457-466.

White, T. G., Power, M. A., & White, S. (1989). Morphological analysis: Implications for

teaching and understanding vocabulary growth. Reading Research Quarterly, 24(3), 283-

304.

Windsor, J. (1994). Children’s comprehension and production of derivational suffixes. Journal

of Speech and Hearing Research, 37, 408-417.

Wysocki, K., & Jenkins, J. R. (1987). Deriving word meanings through morphological

generalization. Reading Research Quarterly, 22(1), 66-81.