Fundamentals of Automated Essay Scoring Mark D. Shermis, Ph.D. Professor and Dean, College of Education The University of Akron

ABCs of Automated Essay Scoring - the Conference Exchange · • Grading by an “ideal” or “gold standard” essay. • More work with LSA-like approaches to evaluating content




What is Automated Essay Scoring?

• Software technology that automatically grades written English. Graders for other languages have already been developed.
• Has been applied successfully to short essays (high- and low-stakes tests) and longer documents.
• Presently a web-based performance assessment.
• Provides both holistic and trait scores.
• Can provide discourse analysis.

CSSO Conference


AES--How Does It Work?

• Most grading engines use rater behavior as the criterion for their predictions.
• The computer doesn’t “understand” what is written, but can be programmed to evaluate keywords and synonyms.
• It is possible to write a nonsensical essay that gets a good score, but you have to be a good writer to accomplish this.
• Can evaluate both content and writing ability.
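
Using rater behavior as the criterion can be pictured as fitting a statistical model that predicts human scores from measurable features of an essay. The following is a minimal sketch under stated assumptions, not any vendor's method: the single feature (word count), the 1-6 scale, the data, and the names `fit_line` and `predict` are all hypothetical.

```python
from statistics import mean


def fit_line(xs, ys):
    """Ordinary least squares with one predictor: returns (intercept, slope)."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope


# Made-up training data: word counts of essays and the scores human raters gave.
word_counts = [100, 200, 300, 400, 500]
rater_scores = [1, 2, 3, 4, 5]

# The human scores are the criterion: the model is fit to reproduce them.
intercept, slope = fit_line(word_counts, rater_scores)


def predict(word_count):
    """Predict a score for a new essay, rounded and clipped to the 1-6 scale."""
    raw = intercept + slope * word_count
    return min(6, max(1, round(raw)))


print(predict(420))
```

A real engine would combine many features; the point of the sketch is only that the target of the prediction is the human rating, not a judgment the software forms on its own.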


Parsers Invested Heavily in Content

• Intelligent Essay Assessor™ (Pearson Knowledge Technologies)
• e-Rater® (Educational Testing Service)
• IntelliMetric™ (Vantage Learning)


Content is Slippery, However

• Christopher Columbus – Queen America sailed to Santa Maria with 1492 ships. Her husband, King Columbus, looked to the Indian explorer, Nina Pinta, to find vast wealth on the beaches of Isabella, but would settle for spices from the continent of Ferdinand.
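
To see why content is slippery, consider a naive keyword check applied to the passage above. This is a sketch only; the `expected` keyword list is a hypothetical one chosen for illustration, not taken from any scoring engine.

```python
import re

passage = (
    "Queen America sailed to Santa Maria with 1492 ships. "
    "Her husband, King Columbus, looked to the Indian explorer, Nina Pinta, "
    "to find vast wealth on the beaches of Isabella, but would settle for "
    "spices from the continent of Ferdinand."
)

# Hypothetical content words a Columbus prompt might look for.
expected = {
    "columbus", "isabella", "ferdinand", "1492", "nina",
    "pinta", "santa", "maria", "america", "spices",
}

# Lowercase the text and split it into word tokens.
tokens = set(re.findall(r"[a-z0-9]+", passage.lower()))

matched = expected & tokens
coverage = len(matched) / len(expected)

print(f"keyword coverage: {coverage:.0%}")
```

Every expected word is present, so a check based on keywords alone gives the passage full marks for content even though the facts are scrambled.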


Tape Measure Analogy


If you ask a person how to measure length…


Reliability

• Most studies show exact agreement in the 80s and adjacent agreement in the 90s for the three major vendors.
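
The two agreement statistics can be computed directly from paired scores. The sketch below assumes integer scores, assumes that "adjacent" means within one point with exact matches included (conventions vary), and uses made-up data; `agreement_rates` is a hypothetical helper name.

```python
def agreement_rates(human, machine):
    """Return (exact, adjacent) agreement as proportions of all essays.

    Assumption: adjacent agreement counts scores within one point of each
    other, which includes exact matches.
    """
    if len(human) != len(machine) or not human:
        raise ValueError("score lists must be non-empty and the same length")
    n = len(human)
    exact = sum(h == m for h, m in zip(human, machine)) / n
    adjacent = sum(abs(h - m) <= 1 for h, m in zip(human, machine)) / n
    return exact, adjacent


# Made-up scores on a 1-6 scale for ten essays.
human = [4, 3, 5, 2, 4, 6, 3, 4, 5, 1]
machine = [4, 3, 4, 2, 4, 6, 3, 4, 5, 3]

exact, adjacent = agreement_rates(human, machine)
print(f"exact: {exact:.0%}, adjacent: {adjacent:.0%}")  # exact: 80%, adjacent: 90%
```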


Validity

• Validity demonstrated through true score analysis, correlations with other (objective) tests, and prediction studies (Keith, 2003).
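
One of the validity checks named above, correlation with another test, reduces to a Pearson correlation between two sets of scores. A minimal sketch, assuming made-up data and a hypothetical helper `pearson_r`:

```python
from math import sqrt
from statistics import mean


def pearson_r(xs, ys):
    """Pearson product-moment correlation between two equal-length lists."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    syy = sum((y - y_bar) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


# Made-up data: automated essay scores and scores on a separate objective test.
aes_scores = [1, 2, 3, 4, 5]
objective_test = [52, 55, 61, 70, 72]

r = pearson_r(aes_scores, objective_test)
print(f"r = {r:.2f}")
```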


Writing A Prompt

• A good prompt is a good prompt; no different in the automated world.
• Focused Topic and Expectations
• Clear Task/Charge
• Other Characteristics
  – Generates enough content
  – Scorability
  – Stimulates original writing
  – Unemotional/unbiased


Rating Rubrics

• Scoring mechanism that evaluates essays holistically, analytically, or via traits.
• Most analytic and trait rubrics don’t seem to differentiate all that much from holistic scoring, but people like them (Shermis et al., 2002).
• May miss important (unarticulated) aspects of the writing enterprise (Bennett & Bejar, 1999).


6+1 Traits™

• Ideas
• Organization
• Voice
• Word Choice
• Sentence Fluency
• Conventions
• +1 Presentation (not used)

Source: Northwest Educational Research Laboratory, Eugene, OR. 6+1™ is a trademark of NWREL.


6+1 Traits™ Scoring Rubric


Intelligent Essay Assessor

• http://www.pearsonkt.com


e-Rater® and Criterion℠

• http://www.ets.org/criterion


IntelliMetric™

• http://www.vantagelearning.com


Developing The Model

• Ideal: 300 typical scored responses drawn from the population
• Ideal: Strong representation at the tails of the distribution
• Ideal: Scored by two well-trained scorers
• Cross-validated
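
Cross-validation here means fitting the scoring model on part of the scored sample and checking it on essays it has not seen. The sketch below is illustrative only: the one-feature least-squares model, the averaging of two raters into one target, the data, and the names `k_fold_indices`, `fit_line`, and `cross_validate` are all assumptions. It uses a handful of essays rather than the roughly 300 the slide recommends.

```python
from statistics import mean


def k_fold_indices(n, k):
    """Split range(n) into k contiguous folds of near-equal size."""
    folds, start = [], 0
    for i in range(k):
        size = n // k + (1 if i < n % k else 0)
        folds.append(list(range(start, start + size)))
        start += size
    return folds


def fit_line(xs, ys):
    """Ordinary least squares with one predictor: returns (intercept, slope)."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope


def cross_validate(xs, ys, k):
    """Mean absolute error of predictions on the held-out folds."""
    errors = []
    for fold in k_fold_indices(len(xs), k):
        held_out = set(fold)
        train_x = [x for i, x in enumerate(xs) if i not in held_out]
        train_y = [y for i, y in enumerate(ys) if i not in held_out]
        intercept, slope = fit_line(train_x, train_y)
        errors.extend(abs(intercept + slope * xs[i] - ys[i]) for i in fold)
    return mean(errors)


# Made-up data: one feature value per essay and the scores from two raters.
feature = [1, 2, 3, 4, 5, 6]
rater_a = [1, 2, 3, 4, 5, 6]
rater_b = [1, 3, 3, 4, 6, 6]
resolved = [(a + b) / 2 for a, b in zip(rater_a, rater_b)]

mae = cross_validate(feature, resolved, k=3)
print(f"held-out mean absolute error: {mae:.2f}")
```

In practice the essays would be shuffled before being split into folds, so that each fold covers the full score range, including the tails.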


Portfolios for Document Storage/Evaluation

• View Reports/Reporting Options
• Set up Assignments/Assignments
• View Setup Options (Tools, feedback)


The Florida Proposal

• Develop norms for automated essay scoring & assess for “vulnerable” groups; replace FCAT+ Writing


[Figure: chart of Score (axis ticks 0 to 6) by Assignment (Assign 1 through Assign 15)]


Future Directions

• Development of general writing models that will speed up formulation of specific statistical models for grading.
• Grading by an “ideal” or “gold standard” essay.
• More work with LSA-like approaches to evaluating content.
• Writing tutorials that will provide additional feedback.
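
Grading against a "gold standard" essay could, in its simplest form, compare word usage between a student essay and the reference. The sketch below is an assumption-laden illustration: it uses cosine similarity over raw word counts, whereas an LSA-like approach would first project the texts into a reduced semantic space. The texts and the names `bag_of_words` and `cosine_similarity` are hypothetical.

```python
import math
import re
from collections import Counter


def bag_of_words(text):
    """Count lowercase word tokens in a text."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine_similarity(a, b):
    """Cosine of the angle between two word-count vectors (0.0 if either is empty)."""
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


# Made-up texts for illustration.
gold = bag_of_words("the voyage crossed the ocean in search of a trade route")
student_a = bag_of_words("the voyage crossed the ocean to find a trade route")
student_b = bag_of_words("my favorite food is pizza")

sim_a = cosine_similarity(gold, student_a)
sim_b = cosine_similarity(gold, student_b)
print(f"student A: {sim_a:.2f}, student B: {sim_b:.2f}")
```

Because the comparison relies on shared words only, it inherits the weakness shown by the scrambled Columbus passage: word overlap does not guarantee correct content.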


For Further Information…

Lawrence Erlbaum Associates, Inc.

http://www.erlbaum.com