Hello world!

Chounta@paws

Page 1: Chounta@paws

Hello world!

Page 2: Chounta@paws

My baby steps…

PhD: Methods and tools for the evaluation of collaborative learning activities using time series (2014, University of Patras)

Greece!!!

Page 3: Chounta@paws

Growing up…

Page 4: Chounta@paws

Here we are!

Postdoc @ HCII since April, 2016!

Page 5: Chounta@paws

“Linking Dialogue with Student Modeling to Create an Enhanced Micro-adaptive Tutoring System”

University of Pittsburgh (LRDC) & CMU (HCII)

Page 6: Chounta@paws

Dialogue as a means for learning

• We aim to develop an adaptive tutorial dialogue system, guided by a student model that will support students in learning physics

Page 7: Chounta@paws

Research questions

• RQ: What makes tutorial dialogues successful?

Teachers adapt the level of discussion to the student’s “zone of proximal development” (Vygotsky)

Page 8: Chounta@paws

Research questions

• RQ: What makes tutorial dialogues successful?

Teachers adapt the level of discussion to the student’s “zone of proximal development” (Vygotsky)

• RQ2: How do tutorial dialogues adapt to different student characteristics and prior knowledge?

Degree of Teacher Control (van de Pol)
Contingent Tutoring (Pino-Pasternak)
Cognitive Complexity (Nystrand, Graesser)

Page 9: Chounta@paws

An example would be nice…

RQ: What minimum acceleration must the climber have in order for the rope not to break while she is rappelling down the cliff? (You do not have to come up with a numerical answer. Just solve for "a" without any substitution of numbers.)

Chip: a = f / m
T: what's f ?
Chip: f = mg
T: just mg ? how many forces act on the climber ?
Chip: mg + T
T: is mg down or up ?
Chip: down and T is up
T: ok so now solve for a again plugging in T and mg

RQ: What minimum acceleration must the climber …….

Dale: 500/55 kg=a m/s^2
T: I don't agree - that's the acceleration that just the pull from the rope would produce (well once the units are straightened out it would be). Think a little more. What is the general rule for finding acceleration from forces?
Dale: F/m=a
T: and what is the F there?
Dale: tension?
T: No.. the F in F=ma is always the net force on the object (or group of objects). The vector sum of all the forces on the object. I prefer to say "Sum of F= ma" because it's easier to get it right. So.. if she is sliding down and the rope is just short of breaking, what is the *net* force on her?

High performer

Page 10: Chounta@paws

Research Objective

To integrate a student model into a tutorial dialogue system in order to guide the dialogue more effectively according to students’ needs

“more effectively”

Page 11: Chounta@paws

• Analyze human-to-human tutorial dialogues

• Build a coding scheme to operationalize Level of Support

The mechanics of tutorial discussions

“we need to identify and group the features of dialogic discourse to differentiate the levels of support”

Page 12: Chounta@paws

Method of the study

3 human-to-human dialogues on Physics / 1 per overall learning gains level [low/medium/high]

LOS Coding Scheme

• Level of Control [3-step scale]
• Question Category [18 types]
• Level of Specificity [3-step scale]
• Contingent Tutoring [binary]

Page 13: Chounta@paws

Coding scheme - Application

• 4 coders
• 3 dialogues [low/medium/high]
• 19 tutor turns
• Introduction to the coding scheme
• Rating handbook & template

Page 14: Chounta@paws

Coding scheme - Results

Dimension              Fleiss’ Kappa   p-value
Level of control       0.404           4.13e-11
Question category      0.395           0
Level of specificity   0.141           0.0245
Contingency            0.0764          0.415
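For readers who want to reproduce this kind of agreement check, here is a minimal sketch of Fleiss' kappa for several coders rating the same tutor turns. The ratings below are made-up illustration data, not the study's data, and the function name is our own.

```python
from collections import Counter

def fleiss_kappa(ratings, categories):
    """Fleiss' kappa for a list of items, each rated by the same number of raters.

    ratings: list of lists; ratings[i] holds the labels given to item i.
    Undefined (division by zero) when every rating falls in one category.
    """
    n_items = len(ratings)
    n_raters = len(ratings[0])
    # counts[i][c] = number of raters who assigned category c to item i
    counts = [Counter(item) for item in ratings]
    # Observed agreement per item, then averaged over items
    p_items = [
        (sum(c[cat] ** 2 for cat in categories) - n_raters)
        / (n_raters * (n_raters - 1))
        for c in counts
    ]
    p_bar = sum(p_items) / n_items
    # Chance agreement from the overall share of each category
    p_cat = [sum(c[cat] for c in counts) / (n_items * n_raters) for cat in categories]
    p_e = sum(p ** 2 for p in p_cat)
    return (p_bar - p_e) / (1 - p_e)

# Hypothetical example: 4 coders rate 5 tutor turns on a 3-step scale
example = [
    [1, 1, 1, 2],
    [2, 2, 3, 3],
    [3, 3, 3, 3],
    [1, 2, 2, 2],
    [1, 1, 2, 3],
]
print(round(fleiss_kappa(example, [1, 2, 3]), 3))
```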

Lessons learned:

Still unclear how teachers effectively regulate the level of support

Page 15: Chounta@paws

Coding scheme: Lessons learned

• Not easy to interpret the goal of the intervention
• One intervention, multiple goals
• Crucial features:
– New content
– Feedback Information / Information meant to push student forward
– Degree of detail

Page 16: Chounta@paws

Coding scheme: Adaptation and Evaluation

Before                 After
Level of Control       Information related to student’s answer (Backward/Forward); Hints Provision
Question Category      Question Category
Level of Specificity   Feedback on Correctness; Information related to feedback
Contingency            Contingency

Page 17: Chounta@paws

Application and Evaluation

• 10 human-to-human tutorial dialogues (Physics) – 3 High, 3 Low, 4 Medium
• 2 raters per dialogue
• The raters were given a tutorial on the coding scheme and detailed instructions

Dimension                             Cohen's kappa
D1. Information Provision (B)         0.871
D1. Information Provision (F)         0.843
D2. Hints Provision                   0.843
D3. Feedback on correctness           0.826
D4. Information related to feedback   0.764
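With two raters per dialogue, agreement is measured with Cohen's kappa. A minimal sketch follows, assuming each rater's codes arrive as equal-length lists; the sample codes are invented for illustration.

```python
from collections import Counter

def cohen_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters who coded the same items.

    Undefined (division by zero) when chance agreement equals 1.
    """
    n = len(rater_a)
    # Share of items on which both raters gave the same code
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    freq_a = Counter(rater_a)
    freq_b = Counter(rater_b)
    # Agreement expected by chance from each rater's marginal frequencies
    labels = set(freq_a) | set(freq_b)
    expected = sum(freq_a[c] * freq_b[c] for c in labels) / (n * n)
    return (observed - expected) / (1 - expected)

# Hypothetical binary codes (e.g. "hint provided" yes/no) for 8 tutor turns
rater_1 = [1, 1, 0, 0, 1, 0, 1, 1]
rater_2 = [1, 1, 0, 1, 1, 0, 1, 0]
print(round(cohen_kappa(rater_1, rater_2), 3))
```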

How to author dialogues

Page 18: Chounta@paws

So, you’ve coded it… Now what?

Remember the Research Objective?

“To integrate a student model into a tutorial dialogue system in order to guide the dialogue more effectively according to students’ needs”

Use a student model to decide on what step and which piece of dialogue to give next

Page 19: Chounta@paws

The line of reasoning

*Jordan, P., Albacete, P., & Katz, S. Exploring Contingent Step Decomposition in a Tutorial Dialogue System.

Can you please tell me what is the vertical net force on the arrow?

What does that mean with respect to the arrow’s vertical acceleration?

RQ: “Suppose the archer is standing at the edge of a high cliff and shoots his arrow perfectly horizontally with an initial velocity of 50 m/s. Neglecting air resistance, how will the vertical velocity of the arrow vary during its flight until it eventually hits the ground?”

Page 20: Chounta@paws

Move through the line of reasoning

*Jordan, P., Albacete, P., & Katz, S. Exploring Contingent Step Decomposition in a Tutorial Dialogue System.

We need an assessment of the specific skills (KCs) involved in the specific steps (nodes) (Conati, 2010)

RQ: “Suppose the archer is standing at the edge of a high cliff and shoots his arrow perfectly horizontally with an initial velocity of 50 m/s. Neglecting air resistance, how will the vertical velocity of the arrow vary during its flight until it eventually hits the ground?”

High Support: I disagree. Look at Newton's second law, Fnet = m*a. Is there a vertical net force acting on the arrow when it leaves the bow? In our case, the Fnet is the force of gravity and is applied on the arrow in the vertical direction. So Fnet = mg and mg = ma

Low Support: Recall that you just said that the force of gravity is applied on the arrow. Would this force produce an acceleration on the arrow as it leaves the bow?
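One way to picture the choice between the two variants above: the student model predicts how likely the student is to answer the next step correctly, and the tutor picks the support level from that. This is only a sketch. The function name and the 0.5 threshold are hypothetical placeholders, not values from the project.

```python
def choose_support(p_correct, threshold=0.5):
    """Return "high" or "low" support for the next tutor turn.

    p_correct: the student model's predicted probability that the student
    answers the next step correctly. The default threshold is a hypothetical
    placeholder, not a value from the talk.
    """
    if not 0.0 <= p_correct <= 1.0:
        raise ValueError("p_correct must be between 0 and 1")
    # Low predicted success: spell things out; high predicted success: nudge
    return "high" if p_correct < threshold else "low"

for p in (0.2, 0.8):
    print(p, choose_support(p))
```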

Page 21: Chounta@paws

A couple of things to consider…

• How to model The Model?
• And how to evaluate it?

Page 22: Chounta@paws

How do we model the Model?

• Regression models
• Bayesian networks
• Rule-based models
• Time-series models
• Overlay models (using NLP)

Page 23: Chounta@paws

What challenges we face

• Not enough observations per skill

Page 24: Chounta@paws

What challenges we face

• Not enough student input

Page 25: Chounta@paws

What challenges we face

• Not enough observations per skill
• Not enough student content
• Real-time assessment and prediction requires efficiency

Page 27: Chounta@paws

How do we model the Model?

• Regression models
• Bayesian model (essentially Bayesian KT)
• Rule-based models
• Time-series models
• Overlay models (using NLP)

Page 28: Chounta@paws

Regression models

• Advantages
– Good match for our datasets (multiple skills per step)
– Easy to implement/maintain (for new skills, scenarios)
• Disadvantages
– Multicollinearity issues might affect performance (i.e. when the predictors highly correlate with each other)
– Overfitting
– Never used for real-time predictions before!

*Cen, H. (2009). Generalized learning factors analysis: improving cognitive models with machine learning.
*MacLellan, C. J., Liu, R., & Koedinger, K. R. (2015). Accounting for Slipping and Other False Negatives in Logistic Models of Student Learning. International Educational Data Mining Society.
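To make "multiple skills per step" concrete, here is a minimal sketch of the Additive Factors Model (AFM) prediction: the log-odds of a correct step are the student's proficiency plus, for each skill in the step, an easiness term and a learning-rate term scaled by prior practice opportunities. All parameter values and skill names below are hypothetical. In practice the parameters are fitted with logistic regression.

```python
import math

def afm_probability(theta, skills, beta, gamma, opportunities):
    """Predicted probability of a correct step under AFM.

    theta: student proficiency
    skills: skills (KCs) involved in the step
    beta, gamma: per-skill easiness and learning rate (dicts)
    opportunities: prior practice count per skill for this student (dict)
    """
    logit = theta + sum(
        beta[k] + gamma[k] * opportunities.get(k, 0) for k in skills
    )
    return 1.0 / (1.0 + math.exp(-logit))

# Hypothetical parameters for a step that involves two skills
beta = {"net_force": -0.5, "newton_2nd": 0.2}
gamma = {"net_force": 0.3, "newton_2nd": 0.1}
seen = {"net_force": 2, "newton_2nd": 5}
print(round(afm_probability(0.1, ["net_force", "newton_2nd"], beta, gamma, seen), 3))
```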

Page 29: Chounta@paws

Bayesian models

• Advantages
– Good way to describe complex mechanisms
– Provide better understanding/reasoning for diagnosis
• Disadvantages
– Need for a pre-defined network for every scenario/activity
– Difficult to adapt/maintain (when new skills are added)

BUT: Bayesian Knowledge Tracing has been used “online” for inferring mastery of “main” skills per step… *recent work on individualization/introducing student-specific parameters into BKT

*Yudelson, M. V., Koedinger, K. R., & Gordon, G. J. (2013, July). Individualized Bayesian knowledge tracing models. In International Conference on Artificial Intelligence in Education (pp. 171-180). Springer Berlin Heidelberg.
*Pardos, Z. A., & Heffernan, N. T. (2010, June). Modeling individualization in a Bayesian networks implementation of knowledge tracing. In International Conference on User Modeling, Adaptation, and Personalization (pp. 255-266). Springer Berlin Heidelberg.
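The “online” use of BKT comes down to one cheap update per observed answer. Below is a minimal sketch of the standard BKT update for a single skill; the slip, guess and transit values are hypothetical, not fitted parameters from our data.

```python
def bkt_update(p_known, correct, p_slip, p_guess, p_transit):
    """Update the probability that a skill is known after one observed answer."""
    if correct:
        evidence = p_known * (1 - p_slip)
        posterior = evidence / (evidence + (1 - p_known) * p_guess)
    else:
        evidence = p_known * p_slip
        posterior = evidence / (evidence + (1 - p_known) * (1 - p_guess))
    # Chance of learning the skill at this opportunity
    return posterior + (1 - posterior) * p_transit

def bkt_predict_correct(p_known, p_slip, p_guess):
    """Probability that the next answer on this skill is correct."""
    return p_known * (1 - p_slip) + (1 - p_known) * p_guess

# Hypothetical parameters and a short answer sequence (True = correct)
p = 0.3
for answer in (False, True, True):
    p = bkt_update(p, answer, p_slip=0.1, p_guess=0.2, p_transit=0.15)
    print(round(p, 3), round(bkt_predict_correct(p, 0.1, 0.2), 3))
```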

Page 30: Chounta@paws

Evaluation

• How to identify the best student model (or models)?

– Model fit statistics (when available): AIC, BIC
– Error rates (RMSE, MAE) and learning curves (Corbett et al., 1995)
– Students’ input: observe students’ behavior, interviews, etc.
– Performance indicators [e.g. time, complexity] / cost
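The fit and error measures above can be computed from a model's predicted probabilities. A minimal sketch, using AIC = 2k - 2 ln L and BIC = k ln n - 2 ln L, where k is the number of parameters and n the number of observations. The answers, predictions and parameter count below are invented for illustration.

```python
import math

def log_likelihood(y_true, p_pred, eps=1e-12):
    """Bernoulli log-likelihood of 0/1 outcomes given predicted probabilities."""
    total = 0.0
    for y, p in zip(y_true, p_pred):
        p = min(max(p, eps), 1 - eps)  # guard against log(0)
        total += math.log(p) if y == 1 else math.log(1 - p)
    return total

def aic(ll, k):
    return 2 * k - 2 * ll

def bic(ll, k, n):
    return k * math.log(n) - 2 * ll

def rmse(y_true, p_pred):
    return math.sqrt(sum((y - p) ** 2 for y, p in zip(y_true, p_pred)) / len(y_true))

def mae(y_true, p_pred):
    return sum(abs(y - p) for y, p in zip(y_true, p_pred)) / len(y_true)

# Hypothetical observed answers and model predictions
y = [1, 0, 1, 1, 0]
p = [0.8, 0.3, 0.6, 0.9, 0.4]
ll = log_likelihood(y, p)
print(aic(ll, k=3), bic(ll, k=3, n=len(y)), rmse(y, p), mae(y, p))
```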

Page 31: Chounta@paws

Work in progress: Early application of modeling techniques on existing data

• 3 datasets (Dynamics, Revoice, Summary), combined into 1 super set; 217 skills

• We applied Regression & BKT to all datasets

Page 32: Chounta@paws

Work in Progress: Results (per study)

Dynamics (N=5272)
        BKT     AFM     PFM     IFM
AIC     6816    4946    5029    4946
BIC     7922    6812    6895    7200
RMSE    0.458   0.396   0.401   0.399

Revoice (N=4613)
        BKT     AFM     PFM     IFM
AIC     6621    3954    3908    3913
BIC     7759    5746    5700    6117
RMSE    0.439   0.396   0.391   0.392

Summary (N=6209)
        BKT     AFM     PFM     IFM
AIC     7110    5821    5773    5797
BIC     8456    7861    7813    8302
RMSE    0.425   0.392   0.390   0.391

*The Akaike information criterion (AIC) and the Bayesian information criterion (BIC) are measures of the relative quality of statistical models for a given set of data.

Page 33: Chounta@paws

Work in Progress: Results (super set)

Global (N=15644)
                        BKT     AFM     PFM     IFM
AIC                     7110    13103   13123   13056
BIC                     8456    17229   17249   17836
RMSE                    0.425   0.42    0.421   0.419
Prediction time (sec)   0.125   0.253   0.223   0.229

Page 34: Chounta@paws

Notes

• Overfitting
– For regression models: the more skills, the more overfitting
– For BKT: practically the same even for large datasets
• Error
– For regression models, the error increases along with the dataset size
– For BKT, practically the same (or even better…)
– For large datasets, error(regression) ≈ error(BKT)
• Execution Time
– Regression models take almost twice as long as BKT

Page 35: Chounta@paws

Tutor architecture and Student Model

[Diagram: Tutor and Student Model. Labels: answer; Evaluation (answer); UpdateProfile(student); Profile(student); Compute Next Steps’ Probabilities; Run Rules; Next Steps alternatives; Next step]
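A rough sketch of how the pieces of the architecture might fit together in one tutoring turn. Every function name, the profile update rule, the "all skills needed" assumption and the selection rule are hypothetical illustrations, not the system's implementation.

```python
def evaluate(answer, expected):
    """Evaluation (answer): hypothetical exact-match check."""
    return answer.strip().lower() == expected.strip().lower()

def update_profile(profile, skill, correct, rate=0.3):
    """UpdateProfile(student): move the skill estimate toward 1 or 0.

    A stand-in for a real student model update; `rate` is an assumption.
    """
    old = profile.get(skill, 0.5)
    target = 1.0 if correct else 0.0
    profile[skill] = old + rate * (target - old)
    return profile

def next_step_probabilities(profile, candidate_steps):
    """Compute Next Steps' Probabilities for each alternative step.

    candidate_steps maps a step name to the skills it involves. Assumes,
    for illustration, that a step succeeds only if all its skills are known.
    """
    probs = {}
    for step, skills in candidate_steps.items():
        p = 1.0
        for skill in skills:
            p *= profile.get(skill, 0.5)
        probs[step] = p
    return probs

def run_rules(probabilities, target=0.5):
    """Run Rules: pick the step whose predicted success is closest to `target`."""
    return min(probabilities, key=lambda step: abs(probabilities[step] - target))

# One hypothetical turn
profile = {}
correct = evaluate("Down", "down")
update_profile(profile, "gravity", correct)
steps = {"ask_net_force": ["gravity"], "ask_acceleration": ["gravity", "newton_2nd"]}
print(run_rules(next_step_probabilities(profile, steps)))
```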

Page 36: Chounta@paws

Future Work

• Implementing the infrastructure for real-time studies
• Field trials in lab and in classroom
• Further refinements…

But KEEP THE CONVERSATION GOING!!!!

Page 37: Chounta@paws

The end.

If you want to know more, get in touch!

NSH [email protected]

(plus cat pictures!)