48
Using Machine Learning to Monitor Collaborative Interactions Carolyn Penstein Rosé Language Technologies Institute/ Human-Computer Interaction Institute

Using Machine Learning to Monitor Collaborative Interactions

  • Upload
    suchi

  • View
    34

  • Download
    5

Embed Size (px)

DESCRIPTION

Using Machine Learning to Monitor Collaborative Interactions. Carolyn Penstein Ros é Language Technologies Institute/ Human-Computer Interaction Institute. VMT-Basilica (Kumar & Ros é, 2010). Labeled Texts. Labeled Texts. TagHelper. Behavior. Unlabeled Texts. - PowerPoint PPT Presentation

Citation preview

Page 1: Using Machine Learning to Monitor  Collaborative Interactions

Using Machine Learning to Monitor Collaborative Interactions

Carolyn Penstein Rosé

Language Technologies Institute/ Human-Computer Interaction

Institute

Page 2: Using Machine Learning to Monitor  Collaborative Interactions

VMT-Basilica (Kumar & Rosé, 2010)

Page 3: Using Machine Learning to Monitor  Collaborative Interactions

Download tools at:http://www.cs.cmu.edu/~cprose/TagHelper.htmlhttp://www.cs.cmu.edu/~cprose/SIDE.html

Monitoring Collaboration with Machine Learning Technology

TagHelper

Labeled Texts

Unlabeled Texts

Labeled Texts

A Model that can Label More Texts

Time

Beh

avio

r

<Triggered Intervention>

Page 4: Using Machine Learning to Monitor  Collaborative Interactions

TagHelper Tools and SIDE

TagHelper Tools uses text miningtechnology to automate annotationof conversational data SIDE facilitates rapid prototyping of reporting

interfaces for group learning facilitators

Define Summaries

Annotate Data

Visualize Annotated Data

http://www.cs.cmu.edu/~cprose/TagHelper.htmlhttp://www.cs.cmu.edu/~cprose/SIDE.html

Page 5: Using Machine Learning to Monitor  Collaborative Interactions

Important caveat!! Machine learning isn’t magic But it can be useful for

identifying meaningful patterns in your data when used properly

Proper use requires insight into your data

?

Page 6: Using Machine Learning to Monitor  Collaborative Interactions

Naïve Approach: When all you have is a hammer…

TargetRepresentationData

Page 7: Using Machine Learning to Monitor  Collaborative Interactions

Naïve Approach: When all you have is a hammer…

TargetRepresentation

Problem: there isn’t one universally best approach!!!!!

Data

Page 8: Using Machine Learning to Monitor  Collaborative Interactions

Slightly less naïve approach: Aimless wandering…

TargetRepresentationData

Page 9: Using Machine Learning to Monitor  Collaborative Interactions

Slightly less naïve approach: Aimless wandering…

TargetRepresentation

Problem 1: It takes too long!!!

Data

Page 10: Using Machine Learning to Monitor  Collaborative Interactions

Slightly less naïve approach: Aimless wandering…

TargetRepresentation

Problem 2: You might not realize all of the options that are available to you!

Data

Page 11: Using Machine Learning to Monitor  Collaborative Interactions

Expert Approach: Hypothesis driven

TargetRepresentationData

Page 12: Using Machine Learning to Monitor  Collaborative Interactions

Expert Approach: Hypothesis driven

TargetRepresentation

You might end up with the same solution in the end, but you’ll get there faster.

Data

Page 13: Using Machine Learning to Monitor  Collaborative Interactions

Expert Approach: Hypothesis driven

TargetRepresentation

Today we’ll start to learn how!

Data

Page 14: Using Machine Learning to Monitor  Collaborative Interactions

What is machine learning?

Automatically or semi-automatically Inducing concepts (i.e., rules) from dataFinding patterns in dataExplaining dataMaking predictions

Data Learning Algorithm Model

New Data

PredictionClassification Engine

Page 15: Using Machine Learning to Monitor  Collaborative Interactions

How does machine learning work?

The simplest rule learner willlearn to predict whatever isthe most frequent result class.This is called the majorityClass.

What will the rule be in this case?

It will always predict yes.

A slightly more sophisticated rule learner will find the feature that gives the mostinformation about the result class. Whatdo you think that would be in this case?

Outlook:Sunny -> NoOvercast -> YesRainy-> Yes

<Feature Name>:<value> -> <prediction><value> -> <prediction>…

Page 16: Using Machine Learning to Monitor  Collaborative Interactions

What will be the prediction?

Outlook:Sunny -> NoOvercast -> YesRainy-> Yes

Model

New Data

Yes

Page 17: Using Machine Learning to Monitor  Collaborative Interactions

More Complex Algorithm… Two simple algorithms

last time0R – Predict the

majority class1R – Use the most

predictive single feature Today – Intro to

Decision TreesToday we will stay at a

high levelWe’ll investigate more

details of the algorithm next time

* Only makes 2 mistakes!

Page 18: Using Machine Learning to Monitor  Collaborative Interactions

More Complex Algorithm… Two simple algorithms

last time0R – Predict the

majority class1R – Use the most

predictive single feature Today – Intro to

Decision TreesToday we will stay at a

high levelWe’ll investigate more

details of the algorithm next time

* Only makes 2 mistakes!

Page 19: Using Machine Learning to Monitor  Collaborative Interactions

More Complex Algorithm… Two simple algorithms

last time0R – Predict the

majority class1R – Use the most

predictive single feature Today – Intro to

Decision TreesToday we will stay at a

high levelWe’ll investigate more

details of the algorithm next time

* Only makes 2 mistakes!

What will it do with this example?

Page 20: Using Machine Learning to Monitor  Collaborative Interactions

More Complex Algorithm… Two simple algorithms

last time0R – Predict the

majority class1R – Use the most

predictive single feature Today – Intro to

Decision TreesToday we will stay at a

high levelWe’ll investigate more

details of the algorithm next time

* Only makes 2 mistakes!

What will it do with this example?

Page 21: Using Machine Learning to Monitor  Collaborative Interactions

More Complex Algorithm… Two simple algorithms

last time0R – Predict the

majority class1R – Use the most

predictive single feature Today – Intro to

Decision TreesToday we will stay at a

high levelWe’ll investigate more

details of the algorithm next time

* Only makes 2 mistakes!

What will it do with this example?

Page 22: Using Machine Learning to Monitor  Collaborative Interactions

More Complex Algorithm… Two simple algorithms

last time0R – Predict the

majority class1R – Use the most

predictive single feature Today – Intro to

Decision TreesToday we will stay at a

high levelWe’ll investigate more

details of the algorithm next time

* Only makes 2 mistakes!

What will it do with this example?

Page 23: Using Machine Learning to Monitor  Collaborative Interactions

Why is it better?

Not because it is more complexSometimes more complexity makes

performance worse What is different in what the three rule

representations assume about your data?0R1RTrees

The best algorithm for your data will give you exactly the power you need

Page 24: Using Machine Learning to Monitor  Collaborative Interactions

Why is it better?

Not because it is more complexSometimes more complexity makes

performance worse What is different in what the three rule

representations assume about your data?0R1RTrees

The best algorithm for your data will give you exactly the power you need

Let’s say you know the rule you are trying to learnis a circle and you have these points. What rulewould you learn?

Page 25: Using Machine Learning to Monitor  Collaborative Interactions

Why is it better?

Not because it is more complexSometimes more complexity makes

performance worse What is different in what the three rule

representations assume about your data?0R1RTrees

The best algorithm for your data will give you exactly the power you need

Let’s say you know the rule you are trying to learnis a circle and you have these points. What rulewould you learn?

Page 26: Using Machine Learning to Monitor  Collaborative Interactions

Why is it better?

Not because it is more complexSometimes more complexity makes

performance worse What is different in what the three rule

representations assume about your data?0R1RTrees

The best algorithm for your data will give you exactly the power you need

Let’s say you know the rule you are trying to learnis a circle and you have these points. What rulewould you learn?

Now lets say you don’t know the shape, now what would you learn?

Page 27: Using Machine Learning to Monitor  Collaborative Interactions

Why is it better?

Not because it is more complexSometimes more complexity makes

performance worse What is different in what the three rule

representations assume about your data?0R1RTrees

The best algorithm for your data will give you exactly the power you need

Let’s say you know the rule you are trying to learnis a circle and you have these points. What rulewould you learn?

Now lets say you don’t know the shape, now what would you learn?

Page 28: Using Machine Learning to Monitor  Collaborative Interactions

Why is it better?

Not because it is more complexSometimes more complexity makes

performance worse What is different in what the three rule

representations assume about your data?0R1RTrees

The best algorithm for your data will give you exactly the power you need

Let’s say you know the rule you are trying to learnis a circle and you have these points. What rulewould you learn?

Now lets say you don’t know the shape, now what would you learn?

If you know the shape, you have fewer degreesof freedom – less room to make a mistake.

Page 29: Using Machine Learning to Monitor  Collaborative Interactions

Why is it better?

Not because it is more complexSometimes more complexity makes

performance worse What is different in what the three rule

representations assume about your data?0R1RTrees

The best algorithm for your data will give you exactly the power you need

Let’s say you know the rule you are trying to learnis a circle and you have these points. What rulewould you learn?

Now lets say you don’t know the shape, now what would you learn?

If you know the shape, you have fewer degreesof freedom – less room to make a mistake.

Page 30: Using Machine Learning to Monitor  Collaborative Interactions

Why is it better?

Not because it is more complexSometimes more complexity makes

performance worse What is different in what the three rule

representations assume about your data?0R1RTrees

The best algorithm for your data will give you exactly the power you need

Let’s say you know the rule you are trying to learnis a circle and you have these points. What rulewould you learn?

Now lets say you don’t know the shape, now what would you learn?

If you know the shape, you have fewer degreesof freedom – less room to make a mistake.

Page 31: Using Machine Learning to Monitor  Collaborative Interactions

Why is it better?

Not because it is more complexSometimes more complexity makes

performance worse What is different in what the three rule

representations assume about your data?0R1RTrees

The best algorithm for your data will give you exactly the power you need

Let’s say you know the rule you are trying to learnis a circle and you have these points. What rulewould you learn?

Now lets say you don’t know the shape, now what would you learn?

If you know the shape, you have fewer degreesof freedom – less room to make a mistake.

Page 32: Using Machine Learning to Monitor  Collaborative Interactions

Why is it better?

Not because it is more complexSometimes more complexity makes

performance worse What is different in what the three rule

representations assume about your data?0R1RTrees

The best algorithm for your data will give you exactly the power you need

Page 33: Using Machine Learning to Monitor  Collaborative Interactions

What do concepts look like?

Page 34: Using Machine Learning to Monitor  Collaborative Interactions

Clarification: Concepts as Lines

R B

S

T

C

X

X

X

X

X

X

Page 35: Using Machine Learning to Monitor  Collaborative Interactions

Machine Learning Process Overview Get to know your data

What distinguishes messages from different categories

Represent messages in terms of features Use feature table tab

Build machine learning model Use machine learning tab

Learn from mistakes, and try again Use feature analyzer tab

Features Coding

Page 36: Using Machine Learning to Monitor  Collaborative Interactions

Machine Learning

Page 37: Using Machine Learning to Monitor  Collaborative Interactions

Algorithms you will use

Decision Trees (J48): good with small feature sets, can find contingencies between features

Naïve Bayes: fast, makes decisions based on probabilities

Support Vector Machines (SMO), makes decisions based on weights, usually works well on text

Page 38: Using Machine Learning to Monitor  Collaborative Interactions

Setting Up Your Data

Page 39: Using Machine Learning to Monitor  Collaborative Interactions

How do you know when you have coded enough data?

What distinguishesQuestions and Statements?

Not all questionsend in a questionmark.

Not all WH wordsoccur in questionsI versus you isnot a reliable predictor

You need to codeenough to avoidlearning rules thatwon’t work

Page 40: Using Machine Learning to Monitor  Collaborative Interactions

Basic Idea

Represent text as a vector where each position corresponds to a term

This is called the “bag of words” approach

Cows make cheese

110001

Hens lay eggs 001110

CheeseCowsEggsHensLayMake

But same representationBut same representationfor “Cheese makes cows.”!for “Cheese makes cows.”!

Page 41: Using Machine Learning to Monitor  Collaborative Interactions

What can’t you conclude from “bag of words” representations?

Causality: “X caused Y” versus “Y caused X”

Roles and Mood: “Which person ate the food that I prepared this morning and drives the big car in front of my cat” versus “The person, which prepared food that my cat and I ate this morning, drives in front of the big car.” Who’s driving, who’s eating, and who’s preparing

food?

Page 42: Using Machine Learning to Monitor  Collaborative Interactions

Part of Speech Tagging

1. CC Coordinating conjunction

2. CD Cardinal number 3. DT Determiner 4. EX Existential there 5. FW Foreign word 6. IN Preposition/subord 7. JJ Adjective 8. JJR Adjective,

comparative 9. JJS Adjective, superlative 10.LS List item marker 11.MD Modal

12.NN Noun, singular or mass

13.NNS Noun, plural 14.NNP Proper noun,

singular 15.NNPS Proper noun, plural 16.PDT Predeterminer 17.POS Possessive ending 18.PRP Personal pronoun 19.PP Possessive pronoun 20.RB Adverb 21.RBR Adverb, comparative 22.RBS Adverb, superlative

http://www.ldc.upenn.edu/Catalog/docs/treebank2/cl93.html

Page 43: Using Machine Learning to Monitor  Collaborative Interactions

Part of Speech Tagging

23.RP Particle

24.SYM Symbol

25.TO to

26.UH Interjection

27.VB Verb, base form

28.VBD Verb, past tense

29.VBG Verb, gerund/present participle

30.VBN Verb, past participle

31.VBP Verb, non-3rd ps. sing. present

32.VBZ Verb, 3rd ps. sing. present

33.WDT wh-determiner

34.WP wh-pronoun

35.WP Possessive wh-pronoun

36.WRB wh-adverb

http://www.ldc.upenn.edu/Catalog/docs/treebank2/cl93.html

Page 44: Using Machine Learning to Monitor  Collaborative Interactions

Feature Space Design

Feature Space DesignThink like a computer!Machine learning algorithms look for features that

are good predictors, not features that are necessarily meaningful

Look for approximations If you want to find questions, you don’t need to do a

complete syntactic analysis Look for question marks Look for wh-terms that occur immediately before an

auxilliary verb

Page 45: Using Machine Learning to Monitor  Collaborative Interactions

Feature Space Design

Feature Space DesignPunctuation can be a “stand in” for mood

“you think the answer is 9?” “you think the answer is 9.”

Bigrams capture simple lexical patterns “common denominator” versus “common multiple”

POS bigrams capture syntactic or stylistic information

“the answer which is …” vs “which is the answer”Line length can be a proxy for explanation

depth

Page 46: Using Machine Learning to Monitor  Collaborative Interactions

Feature Space Design

Feature Space DesignContains non-stop word can be a predictor of

whether a conversational contribution is contentful

“ok sure” versus “the common denominator”Remove stop words removes some distracting

featuresStemming allows some generalization

Multiple, multiply, multiplicationRemoving rare features is a cheap form of

feature selection Features that only occur once or twice in the corpus

won’t generalize, so they are a waste of time to include in the vector space

Page 47: Using Machine Learning to Monitor  Collaborative Interactions

Error Analysis

Page 48: Using Machine Learning to Monitor  Collaborative Interactions

Any Questions?