
Developing Theory-Based Diagnostic Tests of Grammar: Application of Processability Theory
Rosalie Hirch
April 26, 2013


Order of the Presentation

Introduction

Literature Review: Processability Theory (PT) & Diagnostic Language Tests
  Hierarchies
  Errors
  Task Types

Method
  Participants
  Instruments
  Analyses

Results

Discussion, Limitations, & Conclusions


Introduction: Background & Motivation

Bridging the gap between testing and the classroom

Previous research in diagnostic language assessment
  Empirical-based
  Theory-based

Processability Theory
  Already used for tests (RapidProfile)
  Is it sufficient for diagnostic tests?


Introduction: Major Goals & Aims of the Study

To evaluate the reliability of a diagnostic grammar test for middle school students

To explore theoretical approaches to diagnostic language assessment

To investigate the application of Processability Theory for diagnostic grammar tests

Literature Review: Processability Theory & Diagnostic Language Tests


Hierarchies: Processability Theory

Based on Lexical Functional Grammar

Levels are implicational

Levels come from the grammar tree

Problem: the PT hierarchy is very limited


Hierarchies: Processability Theory

Example: Susan decorated a cake while John was playing tennis.

[Diagram: the sentence parsed word by word (N V D N SC N V PrP N), with each element mapped onto the PT processing procedures: word/lemma access, category procedure, phrasal procedure, S-procedure, and S'-procedure for the subordinate clause.]
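A minimal sketch (not from the original slides) of the same idea in code, treating the procedures as an ordered, implicational list; the structure glosses and the highest_level_reached helper are illustrative assumptions rather than the study's materials.

```python
# Sketch: the PT processing procedures as an ordered, implicational hierarchy.
# Level names follow the slide; the structure glosses are illustrative assumptions.
PT_HIERARCHY = [
    ("word/lemma access", "single words and formulas"),
    ("category procedure", "lexical morphemes, e.g. past -ed"),
    ("phrasal procedure", "phrase-internal agreement, e.g. plural marking inside the NP"),
    ("S-procedure", "inter-phrasal agreement, e.g. 3rd person singular -s"),
    ("S'-procedure", "subordinate clauses, e.g. 'while John was playing tennis'"),
]

def highest_level_reached(mastered: set[str]) -> int:
    """Highest level such that every lower level is also mastered
    (the implicational reading of the hierarchy)."""
    level = 0
    for name, _gloss in PT_HIERARCHY:
        if name not in mastered:
            break
        level += 1
    return level

# Example: a learner who controls only the first three procedures.
print(highest_level_reached({"word/lemma access", "category procedure", "phrasal procedure"}))  # 3
```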


Hierarchies: Diagnostic Tests

Other educational diagnostic tests also use hierarchies
  Used for analyzing problems
  Some are implicational

Tend to be very broad (covering as much as possible)
  Suggestion that grammar, in particular, must cover a lot


Errors: Processability Theory

Learners tend to make 2 types of errors

These account for interlanguages

Is she at home? (Target Sentence)
She Ø at home? (Deletion)
She is at home? (Overuse)


Errors: Diagnostic Tests

The primary focus of diagnostic tests

Can potentially show 2 elements in learner performance
  Where the problem lies (error: the observable outcome)
  What thinking led to the error (weakness: the underlying problem)

Requires careful planning
  Before: Item Design
  After: Rubric Design


Types of Tasks: Processability Theory

Emphasis on implicit knowledge (automaticity)

Based on Levelt’s Speaking Model

Tasks tend to be productive (speaking, writing)

Analysis is done afterwards


Types of Tasks: Diagnostic Tests

It is possible to use productive tasks, but not optimal
  Difficult to control contexts

More likely to be discrete and, as a result, “inauthentic”

Tasks from Norris (2005) and Chapelle et al. (2010)
  Some qualities of multiple choice
  Attempt to imitate productive tasks


Research Questions

1. Can we achieve an acceptable level of reliability for the grammatical diagnostic test used for this study?

2. Do the items for the grammatical diagnostic test work well at an item level in terms of item discrimination and difficulty? Were there unexpected patterns?

3. What is the relationship between the subtest, full test, and self-assessment?

4. Were mastery and non-mastery patterns consistent with predictions based on the Processability Theory hierarchy?


Method: Participants, Instruments, Analyses


Participants—Subjects

219 middle school students

Outside Seoul

No overseas education

Group    N    % Girls  % Boys  Grammar Test                Writing Test
                               Mean   StDev  Range         Mean  StDev  Range
Gr. 3-5  72   52.7     47.2    0.46   0.18   0.10-0.85     3.3   1.8    0-7.5
Gr. 6    89   59.6     40.4    0.50   0.20   0.13-0.87     3.3   1.8    0-8
Gr. 7    39   51.3     48.7    0.47   0.19   0.02-0.79     3.8   1.6    0-7
Gr. 8&9  19   36.8     63.2    0.58   0.22   0.04-0.90     4.2   2.4    0-7
Total    219  53.9     46.1    0.49   0.19   0.02-0.90     3.5   1.8    0-8


Participants—Raters

2 rounds of rating

Round 1: Grammar
  6 raters
  All experienced in teaching; 4 in preparing tests
  Scored the grammar tests and writing tests for the specific grammar points
  Rated once (absolute answers)

Round 2: Holistic
  5 raters
  All experienced in scoring writing tests
  Rated twice (3 times where raters differed by 2 or more)


Instruments

Grammar Test (see handout)

Writing test: picture task
  For comparison purposes

PT grammar and additional levels


Analyses

Descriptive Statistics
  Central tendency & dispersion measures
  T-unit analysis

Test and subsection reliability (Alpha)

Item difficulty and discrimination

Correlation with the writing test

Fit to PT hierarchy


Results


Descriptive Statistics: Grammar Test & Writing Test

            N    Items  Mean  SD    Median  Mode  Range
Version 1   219  52     25.6  10.1  25      15    1-47
Version 2   219  42     20.3  9.0   20      19    0-40
Writing     219  1      3.0   1.8   4.0     4.5   0-8

Writing sample measures

N    Ave. Word  Range Word  Ave. t-unit  Words per  Words per  Clauses per  Target
     Count      Count       Count        t-unit     Clause     t-unit       Clauses
219  67.83      0-242       10.78        6.30       5.69       0.11         0.19
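The t-unit ratios above follow mechanically once each writing sample has been segmented (in the study this would be done by the raters). A minimal sketch, where the WritingSample class and the example counts are assumptions for illustration:

```python
from dataclasses import dataclass

@dataclass
class WritingSample:
    words: int      # total word count
    clauses: int    # total clause count
    t_units: int    # total t-unit count

def t_unit_ratios(sample: WritingSample) -> dict[str, float]:
    """Length/complexity ratios of the kind reported in the table above."""
    return {
        "words_per_t_unit": sample.words / sample.t_units if sample.t_units else 0.0,
        "words_per_clause": sample.words / sample.clauses if sample.clauses else 0.0,
        "clauses_per_t_unit": sample.clauses / sample.t_units if sample.t_units else 0.0,
    }

# Invented example: a 68-word response segmented into 12 clauses and 11 t-units.
print(t_unit_ratios(WritingSample(words=68, clauses=12, t_units=11)))
```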


Reliability Statistics: Grammar Test and Subsections & Writing Test

Section  Items  Alpha
Det      5      0.18
NC       5      0.7
PN       5      0.88
Past     5      0.85
PrC      5      0.93
SVsg     6      0.92
SVpl     4      0.73
Prep     5      0.76
SCA      4      0.73
SCB      4      0.74
SCC      4      0.61
SCT      12     0.83
Test     52     0.92
PTest    42     0.93

Writing Test (N = 219)
Correlation           0.92
Kappa                 0.41
Perfect Agreement     0.49
Adjacent Scores       0.49
Perfect + Adjacent    0.99
Rho                   0.91
Alpha                 0.96
P-B Proph (3-rater)   0.98
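A minimal sketch of how the alpha values above could be computed, assuming the responses for a subsection are scored 0/1 in an examinee-by-item matrix; the simulated data are purely illustrative.

```python
import numpy as np

def cronbach_alpha(scores: np.ndarray) -> float:
    """Cronbach's alpha for an (n_examinees, n_items) matrix of item scores."""
    k = scores.shape[1]
    item_variances = scores.var(axis=0, ddof=1)
    total_variance = scores.sum(axis=1).var(ddof=1)
    return (k / (k - 1)) * (1 - item_variances.sum() / total_variance)

# Invented illustration: 219 examinees on 5 dichotomous items that all depend on a
# shared ability, so the items are correlated and alpha comes out moderately high.
rng = np.random.default_rng(0)
ability = rng.normal(size=219)
section = (ability[:, None] + rng.normal(size=(219, 5)) > 0).astype(int)
print(round(cronbach_alpha(section), 2))
```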


Item Difficulty and Discrimination: Grammar Test

[Chart: item difficulty and discrimination indices plotted by item number.]
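A minimal sketch of the classical indices behind this chart: difficulty as the proportion correct and discrimination as a corrected point-biserial (item score against the rest of the test). The simulated response matrix is an assumption for illustration.

```python
import numpy as np

def item_statistics(scores: np.ndarray) -> tuple[np.ndarray, np.ndarray]:
    """scores: (n_examinees, n_items) 0/1 matrix.
    Returns per-item difficulty (proportion correct) and point-biserial discrimination."""
    difficulty = scores.mean(axis=0)
    total = scores.sum(axis=1)
    discrimination = np.empty(scores.shape[1])
    for i in range(scores.shape[1]):
        rest = total - scores[:, i]  # leave the item out of its own criterion score
        discrimination[i] = np.corrcoef(scores[:, i], rest)[0, 1]
    return difficulty, discrimination

# Crude simulated responses: 219 examinees, 52 items, driven by one ability dimension.
rng = np.random.default_rng(1)
ability = rng.normal(size=219)
responses = (ability[:, None] + rng.normal(size=(219, 52)) > 0).astype(int)
p, r_pb = item_statistics(responses)
print(p[:5].round(2), r_pb[:5].round(2))
```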


Correlation with the Writing Test: Grammar Test and Subsections

         PlN    Past   PrC    SVsg   SVpl   Prep   SCA    SCB    SCC    SCT    Test   Writing
PlN      1
Past     .37**  1
PrC      .29**  .34**  1
SVsg     .28**  .42**  .43**  1
SVpl     .38**  .36**  .27**  .25**  1
Prep     .28**  .33**  .46**  .45**  .26**  1
SCA      .21**  .28**  .40**  .40**  .26**  .53**  1
SCB      .23**  .34**  .27**  .38**  .23**  .53**  .56**  1
SCC      .15*   .18**  .26**  .39**  .11    .39**  .50**  .42**  1
SCT      .25**  .34**  .39**  .48**  .26**  .60**  .87**  .86**  .69**  1
Test     .55**  .65**  .70**  .75**  .51**  .73**  .67**  .64**  .51**  .76**  1
Writing  .36**  .43**  .44**  .37**  .33**  .47**  .42**  .46**  .31**  .50**  .61**  1

**. Correlation is significant at the 0.01 level (2-tailed).

*. Correlation is significant at the 0.05 level (2-tailed).
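A minimal sketch of the correlations reported above, using Pearson's r with two-tailed p-values (the significance flags follow the table's convention); the two variables below are invented stand-ins for a subsection score and the writing score.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(2)
subsection = rng.integers(0, 6, size=219).astype(float)   # e.g. a 5-item subsection score
writing = subsection + rng.normal(scale=2.0, size=219)     # loosely related writing score

r, p = stats.pearsonr(subsection, writing)
flag = "**" if p < 0.01 else "*" if p < 0.05 else ""
print(f"r = {r:.2f}{flag} (two-tailed p = {p:.3f})")
```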


Fit to Implicational Hierarchies

Coefficient of Scalability:

PT Only=94.1%

PT + Proposed Levels=89.3%

[Table: mastery patterns across the three hierarchy levels (columns: level 1, level 2, level 3, N). 3 levels mastered: N = 5; 2 levels: N = 8 and 2; 1 level: N = 10 and 3; 0 levels: N = 2.]

Coefficient of scalability = (total no. of cells - exceptions) / (total no. of cells) = 90%+
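A minimal sketch of the calculation above: exceptions are the cells that would have to change for a learner's mastery pattern to fit the implicational hierarchy, and the coefficient is the share of cells that need no change. The example mastery matrix is invented.

```python
import numpy as np

def exceptions_in_row(row: np.ndarray) -> int:
    """Minimum number of cells to flip so the row matches a perfect implicational
    pattern (everything mastered up to some level k, nothing above it)."""
    n = len(row)
    return min(
        int(np.sum(row != np.array([1] * k + [0] * (n - k))))
        for k in range(n + 1)
    )

def coefficient_of_scalability(mastery: np.ndarray) -> float:
    """mastery: (n_learners, n_levels) 0/1 matrix, levels ordered low to high."""
    exceptions = sum(exceptions_in_row(row) for row in mastery)
    return 1 - exceptions / mastery.size

patterns = np.array([
    [1, 1, 1],  # all three levels mastered: conforms
    [1, 1, 0],  # conforms
    [1, 0, 1],  # one exception: a higher level mastered over a gap
    [0, 0, 0],  # conforms
])
print(f"{coefficient_of_scalability(patterns):.1%}")  # 11 of 12 cells fit -> 91.7%
```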

Discussion, Limitations, & Conclusion


Discussion

Overall reliability was quite good

The determiner and non-count noun sections did not work
  Exposed a problem with determiners generally

Task-types have good potential for diagnostic information

Grammar correlated fairly well with writing scores
  Follows from complexity and accuracy
  May also explain determiners & non-count nouns

Fit of the proposed levels to the PT hierarchy suggests the tasks are plausible


Limitations

Results are generalizable only to Korean learners
  Methods may be universal

Should have had a larger writing sample
  Also, more feedback from students and teachers
  More high-level students

Conclusions

Most of the grammar tasks can work well, but require more planning & research
  Particular attention to error types

It may be possible to expand the PT hierarchy
  Needed in order to be useful for diagnostic purposes
  Recommended