On the Validity of Peer Grading and a Cloud Teaching Assistant System
Tim Vogelsang - March 2015
Lara Ruppertz
Paper (ACM): http://dx.doi.org/10.1145/2723576.2723633
INTRODUCTION
MOOC
“MOOC, every letter is negotiable” by Mathieu Plourde is licensed under CC-BY.
METHODS
Reliability and Validity
“Validity & Reliability” by Nevit Dilmen is licensed under CC BY-SA 3.0.
INTRODUCTION
Reliability and Validity
Diagram labels: Assignment; Many Peers, Some CTAs, Few Profs; Compute Reliability; Compute Validity.
EXPERIMENT SETUP
Professor: Anja Mihr
Screenshot taken from iversity.org platform. Image courtesy Anja Mihr.
EXPERIMENT SETUP
Sample Size
- 30k enrolments
- 4620 active students
- 476 exam participants
- 25 paid exam participants
Final Exam
- 16 multiple choice questions
- 1 essay question
- Graded by: Many Peers (476), Some CTAs (4), Few Profs (1)
- Compute Reliability, Compute Validity
EXPERIMENT SETUP
Final Exam
Screenshot taken from iversity.org platform. Image courtesy Anja Mihr.
EXPERIMENT SETUP
Grading Rubrics
I. THESIS / CENTRAL IDEA
II. ORGANISATION
III. CONTENT
IV. BALANCED ARGUMENTATION
V. SOURCES
METHODS
Research Questions
Q1: How are grades given by peers, the CTAs and the professor distributed?
Q2: How does the grading vary among the grading rubrics?
Q3: How valid are peer and CTA grading?
RESULTS
How are grades distributed?
Histogram of grades given by peers, CTAs and professor.
Range: 0 (worst grade) - 4 (best grade).
Axis: Grade. Series: CTAs, Peers, Professor.
RESULTS
How does the grading vary among the rubrics?
Histogram of grades given by peers, CTAs and professor, distinguished by grading rubrics.
Range: 0 (worst grade) - 4 (best grade).
Panels: Thesis, Organisation, Content, Argumentation, Sources. Axis: Grade. Series: CTAs, Peers, Professor.
RESULTS
How does the grading vary among the rubrics?
Average grade assigned by peers, CTAs and professor to a specific criterion.
Range: 0 (worst grade) - 4 (best grade).
Series: Peers, CTAs, Professor.
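As a rough illustration of the per-criterion averages on this slide, the sketch below groups grades by grader group and criterion and takes the mean. The record layout, the function name `average_by_group_and_criterion` and all grade values are hypothetical and are not the study's data or code.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical records: (exam_id, grader_group, criterion, grade on the 0-4 scale).
# These values are made up for illustration only.
grades = [
    (1, "peers", "thesis", 3),
    (1, "peers", "thesis", 4),
    (1, "ctas", "thesis", 3),
    (1, "professor", "thesis", 2),
    (1, "peers", "sources", 2),
    (1, "ctas", "sources", 1),
    (1, "professor", "sources", 1),
]


def average_by_group_and_criterion(records):
    """Return {(grader_group, criterion): mean grade} for the given records."""
    buckets = defaultdict(list)
    for _exam_id, group, criterion, grade in records:
        buckets[(group, criterion)].append(grade)
    return {key: mean(values) for key, values in buckets.items()}


averages = average_by_group_and_criterion(grades)
for (group, criterion), value in sorted(averages.items()):
    print(f"{group:<10} {criterion:<10} {value:.2f}")
```

Comparing the rows for one criterion across the three groups gives the kind of side-by-side view the slide describes.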
RESULTS
How valid are peer and CTA grading?
Correlation matrix (Peers, CTAs, Prof.) using Pearson’s correlation.
375 data points: (25 exams) x (5 criteria) x (3 grading groups).
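A minimal sketch of how such a correlation matrix could be computed, assuming each grading group contributes one grade per exam and criterion, so that the three series are aligned item by item. The helper names `pearson` and `correlation_matrix` and the grade values are hypothetical; the slides do not show the actual computation.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson's correlation coefficient of two equally long sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / sqrt(var_x * var_y)


def correlation_matrix(series):
    """Return a nested dict with the pairwise Pearson correlation of each series."""
    names = list(series)
    return {a: {b: pearson(series[a], series[b]) for b in names} for a in names}


# Hypothetical grades (0-4 scale), one entry per (exam, criterion) pair.
# In the setup described on the slide, each list would hold 25 x 5 = 125 values.
grades = {
    "peers": [3.0, 2.5, 4.0, 1.5, 3.5],
    "ctas": [2.5, 2.0, 3.5, 1.0, 3.0],
    "professor": [2.0, 2.0, 4.0, 1.0, 3.0],
}

matrix = correlation_matrix(grades)
for row_name, row in matrix.items():
    print(row_name, {col: round(value, 2) for col, value in row.items()})
```

Under this reading, treating the professor's grades as the reference makes the Peers/Prof. and CTAs/Prof. entries the ones that speak to validity.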
RESULTS
How valid are peer and CTA grading?
Scatterplot: one dot represents one final exam graded by peers, CTAs and professor.
Range: 0 (worst grade) - 4 (best grade).
Axis: Average CTA Grade.
DISCUSSION
Hypotheses / Future Work
Repeat Experiment / Adjust Experiment
- Disagreement on ‘argumentation’?
- Peers’ bias?
- Advanced CTA selection process.
- Peer calibration process.
- How to define validity in the context of massive grading?
- How to design a valid grading process including professor, peers and CTAs?