Improving Learning from Peer Review with NLP and ITS Techniques
(July 2009 – June 2011)
Kevin Ashley, Diane Litman, Chris Schunn
Thank You for the Support!
New interdisciplinary research group
Research outcomes
– Refereed publications
– Pending IES and NSF proposals
Technology development
– New version of SWoRD
– “Intelligent” scaffolding components
Outline
– SWoRD
– Intelligent Scaffolding for Reviewers and Authors
– AI-supported Argument Diagramming
– Summary
SWoRD [Cho & Schunn, 2007]
– Authors submit papers
– Reviewers submit (anonymous) feedback
– Authors revise and resubmit papers
– Authors provide back-ratings to reviewers regarding feedback helpfulness
SWoRD Rebuild
SWoRD 3.5@LRDC dying, v4.0@Missouri struggling
New SWoRD v5.0@LRDC
– Rebuilt from scratch (more stable, expandable)
– More instructional flexibility
» # and type of rating dimensions, reviewing dimensions, # of drafts, grading options, …
– Better instructor oversight of students
» Missing papers & reviews, high-conflict reviews, inter-rater accuracy, …
– Better research support
» Can directly download ‘research’ data
SWoRD 5.0 Users
Active classes in Spring 2011: 39
Users in Spring 2011: ~2000
TOTAL User accounts: ~3900
Countries:
– USA
– Canada
– United Kingdom
– Netherlands
– Estonia
– Hungary
– Turkey
– China
Disciplines:
– Psychology
– Astronomy & Physics
– Computer Science
– Biology
– Economics
– Engineering
– Speech-Language Pathology
– English & Rhetoric
– Philosophy
– Women's Health
Levels:
– University
– High School
– Middle School
Some Remaining Weaknesses
1. Feedback is often not stated in effective ways
2. Feedback and papers often do not focus on core aspects
Feedback Features and Positive Writing Performance [Nelson & Schunn, 2008]
Solutions
Summarization
Localization
Understanding of the Problem
Implementation
Our Approach: Detect and Scaffold
1. Detect and direct reviewer attention to key feedback features such as solutions
2. Detect and direct reviewer and author attention to thesis statements in papers and feedback
Detecting Key Features of Text Using Educational Data Mining
Natural Language Processing (NLP) to extract attributes from text, e.g.
– Regular expressions (e.g. “the section about”)
– Domain lexicons (e.g. “federal”, “American”)
– Syntax (e.g. demonstrative determiners)
– Overlapping lexical windows (quotation identification)
Machine Learning (ML) to predict whether feedback contains localization and solutions, and whether papers contain a thesis statement
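A minimal sketch of how the four attribute types listed above could be extracted for localization detection. This is an illustration under assumptions, not the project's implementation: the patterns, the word-list proxy for demonstrative determiners, the window size of 4, and all function names are hypothetical. The paper sentence in the usage note is also invented for the example.

```python
import re

# Illustrative patterns only; the project's real feature set is not on the slides.
LOCATION_PATTERN = re.compile(
    r"\b(?:the section about|page \d+|paragraph \d+)\b", re.IGNORECASE
)
DOMAIN_LEXICON = {"federal", "american"}             # examples taken from the slide
DEMONSTRATIVES = {"this", "that", "these", "those"}  # crude stand-in for a syntactic check


def tokenize(text):
    """Lowercase the text and split it into word and number tokens."""
    return re.findall(r"[a-z0-9']+", text.lower())


def shares_window(comment_tokens, paper_tokens, size=4):
    """True if any run of `size` consecutive comment tokens also occurs in the paper."""
    paper_windows = {
        tuple(paper_tokens[i:i + size])
        for i in range(len(paper_tokens) - size + 1)
    }
    return any(
        tuple(comment_tokens[i:i + size]) in paper_windows
        for i in range(len(comment_tokens) - size + 1)
    )


def extract_features(comment, paper_text):
    """Return one attribute per technique named on the slide."""
    tokens = tokenize(comment)
    return {
        "has_location_phrase": bool(LOCATION_PATTERN.search(comment)),
        "domain_word_count": sum(t in DOMAIN_LEXICON for t in tokens),
        "demonstrative_count": sum(t in DEMONSTRATIVES for t in tokens),
        "quotes_paper": shares_window(tokens, tokenize(paper_text)),
    }
```

For example, `extract_features("Page 2 says that the 13th amendment ended the war.", "The 13th amendment ended the war in 1865.")` flags a location phrase and a quoted window. A dictionary like this could then be passed to any standard classifier to predict whether the comment is localized.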
Quantitative Model Evaluation (10-fold cross-validation)
Feedback Feature   Classroom Corpus   N      Baseline Accuracy   Model Accuracy   Model Kappa   Human Kappa
Localization       History            875    53%                 78%              .55           .69
Localization       Psychology         3111   75%                 85%              .58           .63
Solution           History            1405   61%                 79%              .55           .79
Solution           CogSci             5831   67%                 85%              .65           .86
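To make the column meanings concrete, here is a small sketch of how a majority-class baseline accuracy and a kappa value can be computed from two label sequences. It assumes the kappa columns are Cohen's kappa, which the slide does not state. The labels below are toy data invented for illustration and do not come from the corpora above.

```python
from collections import Counter


def majority_baseline_accuracy(labels):
    """Accuracy obtained by always predicting the most frequent label."""
    return Counter(labels).most_common(1)[0][1] / len(labels)


def cohen_kappa(a, b):
    """Chance-corrected agreement between two label sequences of equal length.

    Undefined (division by zero) when chance agreement is exactly 1.
    """
    n = len(a)
    observed = sum(x == y for x, y in zip(a, b)) / n
    counts_a, counts_b = Counter(a), Counter(b)
    expected = sum(counts_a[k] * counts_b[k] for k in set(a) | set(b)) / (n * n)
    return (observed - expected) / (1 - expected)


# Toy labels (1 = feature present, 0 = absent); hypothetical, not project data.
human = [1, 1, 1, 0, 0, 0, 1, 0, 1, 1]
model = [1, 1, 0, 0, 0, 1, 1, 0, 1, 1]

baseline = majority_baseline_accuracy(human)
kappa = cohen_kappa(human, model)
```

Read this way, model kappa compares the model with human labels, and human kappa compares two human annotators on the same items.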
Predicting Feedback Helpfulness [Xiong & Litman, under review]
Recall that SWoRD supports numerical back ratings of feedback helpfulness
– My concerns come from some of the claims that are put forth. Page 2 says that the 13th amendment ended the war. Is this true? Was there no more fighting or problems once this amendment was added? … (rating 5)
– Your paper and its main points are easy to find and to follow. (rating 1)
Predicting Expert Ratings (Average of Writing and Domain Experts)
Structural attributes (e.g. review length, number of questions), lexical statistics, and meta-data (e.g. paper ratings) developed for product reviews (e.g. Amazon) are also useful for peer feedback
Features specialized for peer-review (e.g. localization) can further improve performance
Current work: student helpfulness ratings
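As a rough illustration of the structural attributes mentioned above (review length, number of questions), the sketch below computes them for the two example comments shown earlier. The function name and the exact set of counts are assumptions made for the example. The lexical statistics and meta-data features are not shown.

```python
import re


def structural_features(comment):
    """Count words, sentences, and question marks in one feedback comment."""
    words = re.findall(r"\w+", comment)
    sentences = [s for s in re.split(r"(?<=[.?!])\s+", comment.strip()) if s]
    return {
        "word_count": len(words),
        "sentence_count": len(sentences),
        "question_count": comment.count("?"),
    }


# Quoted portions of the two example comments (back-ratings 5 and 1).
high_rated = (
    "My concerns come from some of the claims that are put forth. "
    "Page 2 says that the 13th amendment ended the war. Is this true? "
    "Was there no more fighting or problems once this amendment was added?"
)
low_rated = "Your paper and its main points are easy to find and to follow."

high = structural_features(high_rated)
low = structural_features(low_rated)
```

In this pair the comment rated 5 is longer and contains two questions, while the comment rated 1 is a single short statement. Two comments cannot show a trend; they only illustrate what the attributes measure.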
The Problem
Students unable to synthesize what the sources say…
… or to apply them in solving the problem.
Our Solution
Source texts
Author creates Argument Diagram
Peers review Argument Diagrams
Author revises Argument Diagram
Author writes paper
Peers review papers
Author revises paper
AI: Guides preparing diagram and using it in writing
AI: Guides reviewing
Argument diagram student created with LASAD
1 · Hypothesis Link: 1
If: Participants are assigned to the active condition
Then: they will be better at correctly identifying stimuli than participants in the passive condition.
2 · Hypothesis Link: 2
If: The participant has small hands
Then: they will be better at recognizing objects than regardless of what condition they’re in.
9 · (+) supports Link: 1
Active touch participants were able to more accurately identify objects because they had the use of sensitive fingertips in exploring the objects
7 · (+) supports Link: 1
Active touch is more effective than passive touch
11 · (+) supports Link: 2
Active touch improved through the development levels but passive touch stayed the same (hand size may play role)
20 · (+) supports Link: 2
Sensory perceptors in smaller hands are closer together, allowing for more accurate object acuity
8 · Citation Link: 1
(Craig 2001)
6 · Citation Link: 1
(Gibson 1962)
10 · Citation Link: 2
(Cronin 1977)
17 · Citation Link: 2
(Peters 2009)
LASAD analyzes diagrams
With even a small set of types of argument nodes and relations and of constraint-defining rules…
Even simple argument diagrams provide pedagogical information that can be automatically analyzed. E.g., has the student:
– Addressed all sources and hypotheses? (No)
– Indicated that citations support claims/hypotheses? (Not vice versa as here)
– Related all sources and hypotheses under a single claim? (No)
– Related some citations to more than one hypothesis? (No interactions here)
– Included oppositional relations as well as supports? (No)
– Avoided isolated citations? (Yes)
– Avoided disjoint sub-arguments? (No)
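Three of these checks can be sketched as simple graph tests on the student diagram shown earlier. This is a hypothetical illustration, not LASAD's rule engine: it assumes each node's "Link" number attaches it to the hypothesis with that number, and the node-type names and function names are invented for the example.

```python
from collections import defaultdict

# Node id -> type, transcribed from the student diagram above.
NODES = {
    1: "hypothesis", 2: "hypothesis",
    9: "supports", 7: "supports", 11: "supports", 20: "supports",
    8: "citation", 6: "citation", 10: "citation", 17: "citation",
}
# Assumption: "Link: n" connects a node to hypothesis n.
EDGES = [(9, 1), (7, 1), (8, 1), (6, 1), (11, 2), (20, 2), (10, 2), (17, 2)]


def connected_components(nodes, edges):
    """Group node ids into sets that are reachable from one another."""
    neighbours = defaultdict(set)
    for a, b in edges:
        neighbours[a].add(b)
        neighbours[b].add(a)
    seen, components = set(), []
    for start in nodes:
        if start in seen:
            continue
        stack, component = [start], set()
        while stack:
            node = stack.pop()
            if node in component:
                continue
            component.add(node)
            stack.extend(neighbours[node] - component)
        seen |= component
        components.append(component)
    return components


def analyze(nodes, edges):
    """Report isolated citations, oppositional relations, and sub-argument count."""
    linked = {node for edge in edges for node in edge}
    return {
        "isolated_citations": sorted(
            n for n, kind in nodes.items() if kind == "citation" and n not in linked
        ),
        "has_opposition": any(kind == "opposes" for kind in nodes.values()),
        "sub_argument_count": len(connected_components(nodes, edges)),
    }


report = analyze(NODES, EDGES)
```

Under these assumptions the report agrees with the slide's answers for this diagram: no isolated citations, no oppositional relations, and two disjoint sub-arguments, one per hypothesis.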
Prototype SWoRD Interface for feedback to reviewer pre-review submission
Claims or reasons are unconnected to the research question or hypothesis.
Lippman, 2010 is not organized around a hypothesis.
Siler 2009 is more focused on the response to the task not focused on the actual type of task which is what the hypothesis for the effect of IV2. Doesn’t support the research question.
H2 needs reasoning to connect prior research with the hypothesis, e.g. “because multi-step algebra problems are perceived as more difficult, people are more likely to fail in solving them.”
Support 2 is weak because it’s basically citing a study as the reason itself. Instead, it should be a general claim, that uses Jones, 2007 to back it up.
Lippman, 2010 is free floating and needs to be linked to either the research question or a hypothesis.
Say where these issues happen! (like the green text in other comments)
Suggest how to fix these problems! (like the blue text in other comments)
Legend: green text = localization hints; blue text = solution hints
Prototype tool to translate student argument diagrams into text
A Translation of Your Argument Diagram (click to edit)
The first hypothesis is, “If participants are assigned to the active condition, then they will be better at correctly identifying stimuli than participants in the passive condition.” This hypothesis is supported by (Craig 2001) where it was found that “Active touch participants were able to more accurately identify objects because they had the use of sensitive fingertips in exploring the objects.” The hypothesis is also supported by (Gibson 1962) where …
The second hypothesis is, …
Next Steps
Possible things to improve your argument:
• Add a missing citation
• Add a third hypothesis
• Indicate which hypothesis is an interaction hypothesis and specify the interaction variable(s)
• Relate one or more hypotheses along with their supporting sources under a single sub-claim
• Include any oppositional relations between citations and a hypothesis
• Relate the disjoint sub-arguments concerning the hypotheses under one overall argument
[Export text] [Save progress] [Quit]
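The translated text above follows a regular pattern, so one plausible way to produce it is template filling. The sketch below is an assumption about how such a tool might work, not the prototype's actual code. The data layout and function name are hypothetical, and only the citation and finding pairing that the slide spells out (Craig 2001) is included.

```python
ORDINALS = ["first", "second", "third"]

# Hypothesis id -> (condition, outcome), taken from the student diagram.
HYPOTHESES = {
    1: (
        "participants are assigned to the active condition",
        "they will be better at correctly identifying stimuli "
        "than participants in the passive condition",
    ),
}
# Hypothesis id -> list of (citation, finding) pairs stated on the slide.
SUPPORT = {
    1: [(
        "(Craig 2001)",
        "Active touch participants were able to more accurately identify objects "
        "because they had the use of sensitive fingertips in exploring the objects",
    )],
}


def translate(hypotheses, support):
    """Render each hypothesis and its supporting citations as one paragraph."""
    paragraphs = []
    for position, (hyp_id, (condition, outcome)) in enumerate(hypotheses.items()):
        text = f"The {ORDINALS[position]} hypothesis is, “If {condition}, then {outcome}.”"
        for index, (citation, finding) in enumerate(support.get(hyp_id, [])):
            opener = (
                "This hypothesis is supported by"
                if index == 0
                else "The hypothesis is also supported by"
            )
            text += f" {opener} {citation} where it was found that “{finding}.”"
        paragraphs.append(text)
    return paragraphs


paragraphs = translate(HYPOTHESES, SUPPORT)
```

Keeping the templates separate from the diagram data means the wording can be adjusted without touching the analysis, and the student can still edit the generated text afterwards.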