
Page 1: Analyzing Argumentative Discourse Units in Online Interactions

Analyzing Argumentative Discourse Units in Online Interactions

Debanjan Ghosh, Smaranda Muresan, Nina Wacholder, Mark Aakhus and Matthew Mitsui

First Workshop on Argumentation Mining, ACL, June 26, 2014

Page 2: Analyzing Argumentative Discourse Units in Online Interactions

But when we first tried the iPhone it felt natural immediately, we didn't have to 'unlearn' old habits from our antiquated Nokias & Blackberrys. That happened because the iPhone is a truly great design.

That's very true. With the iPhone, the sweet goodness part of the UI is immediately apparent. After a minute or two, you’re feeling empowered and comfortable.

It's the weaknesses that take several days or weeks for you to really understanding and get frustrated by.

I disagree that the iPhone just "felt natural immediately"... In my opinion it feels restrictive and over simplified, sometimes to the point of frustration.

User1

User2

User3

when we first tried the iPhone it felt natural immediately,

That's very true. With the iPhone, the sweet goodness part of the UI is immediately apparent. After a minute or two, you're feeling empowered and comfortable.

I disagree that the iPhone just "felt natural immediately"… In my opinion it feels restrictive and over simplified, sometimes to the point of frustration.

1. Segmentation 2. Segment Classification 3. Relation Identification

Argumentative Discourse Units (ADU; Peldszus and Stede, 2013)


Page 3: Analyzing Argumentative Discourse Units in Online Interactions

3

Annotation Challenges

• A complex annotation scheme seems infeasible
  – The problem of high *cognitive load* (annotators have to read all the threads)
  – High complexity demands two or more annotators
  – Use of expert annotators for all tasks is costly

Page 4: Analyzing Argumentative Discourse Units in Online Interactions

4

Our Approach: Two-tiered Annotation Scheme

• Coarse-grained annotation
  – Expert annotators (EAs)
  – Annotate entire thread

• Fine-grained annotation
  – Novice annotators (Turkers)
  – Annotate only text labeled by EAs


Page 6: Analyzing Argumentative Discourse Units in Online Interactions

6

Coarse-grained Expert Annotation

Pragmatic Argumentation Theory (PAT; Van Eemeren et al., 1993) based annotation

[Diagram: a thread of posts (Post1-Post4) with a Target segment in an earlier post linked to Callout segments in later posts (Post2, Post3)]

Page 7: Analyzing Argumentative Discourse Units in Online Interactions

7

ADUs: Callout and Target

• A Callout is a subsequent action that selects all or some part of a prior action (i.e., Target) and comments on it in some way.

• A Target is a part of a prior action that has been called out by a subsequent action.

Page 8: Analyzing Argumentative Discourse Units in Online Interactions

But when we first tried the iPhone it felt natural immediately, we didn't have to 'unlearn' old habits from our antiquated Nokias & Blackberrys. That happened because the iPhone is a truly great design.

That's very true. With the iPhone, the sweet goodness part of the UI is immediately apparent. After a minute or two, you’re feeling empowered and comfortable.

It's the weaknesses that take several days or weeks for you to really understanding and get frustrated by.

I disagree that the iPhone just "felt natural immediately"... In my opinion it feels restrictive and over simplified, sometimes to the point of frustration.

User1

User2

User3

Target: when we first tried the iPhone it felt natural immediately,

Callout: That's very true. With the iPhone, the sweet goodness part of the UI is immediately apparent. After a minute or two, you're feeling empowered and comfortable.

Callout: I disagree that the iPhone just "felt natural immediately"… In my opinion it feels restrictive and over simplified, sometimes to the point of frustration.

Page 9: Analyzing Argumentative Discourse Units in Online Interactions

9

More on Expert Annotations and Corpus

• Five annotators were free to choose any text segment to represent an ADU

• Four blogs and their first one hundred comments are used as our argumentative corpus
  – Android (iPhone vs. Android phones)
  – iPad (usability of the iPad as a tablet)
  – Twitter (use of Twitter as a micro-blogging platform)
  – Job Layoffs (layoffs and outsourcing)

Page 10: Analyzing Argumentative Discourse Units in Online Interactions

10

Inter-Annotator Agreement (IAA) for Expert Annotations

Thread     F1_EM   F1_OM   Krippendorff's α
Android    54.4    87.8    0.64
iPad       51.2    86.0    0.73
Layoffs    51.9    87.5    0.87
Twitter    53.8    88.5    0.82

• P/R/F1-based IAA (Wiebe et al., 2005)
  – exact match (EM)
  – overlap match (OM)
• Krippendorff's α (Krippendorff, 2004)
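
For concreteness, a minimal sketch of span-based exact-match vs. overlap-match precision/recall/F1 between two annotators, in the spirit of Wiebe et al. (2005). This is not the authors' code; the (start, end) character offsets, helper names, and toy spans are hypothetical.

```python
# Minimal sketch (not the authors' code): exact-match vs. overlap-match
# agreement between two annotators' Callout spans, treating one annotator's
# spans as "gold" and the other's as "predicted". Spans are hypothetical
# (start, end) character offsets into the thread text.

def overlaps(a, b):
    """True if two (start, end) spans share at least one character."""
    return a[0] < b[1] and b[0] < a[1]

def prf(pred, gold, match):
    """Precision/recall/F1 of pred spans against gold spans under a match test."""
    tp_p = sum(any(match(p, g) for g in gold) for p in pred)
    tp_r = sum(any(match(g, p) for p in pred) for g in gold)
    precision = tp_p / len(pred) if pred else 0.0
    recall = tp_r / len(gold) if gold else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1

annotator1 = [(0, 52), (120, 180)]              # toy spans from annotator 1
annotator2 = [(0, 60), (125, 178), (300, 340)]  # toy spans from annotator 2

print("exact-match   P/R/F1:", prf(annotator2, annotator1, match=lambda a, b: a == b))
print("overlap-match P/R/F1:", prf(annotator2, annotator1, match=overlaps))
```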

Page 11: Analyzing Argumentative Discourse Units in Online Interactions

11

Issues

• Different IAA metrics have different outcomes

• It is difficult to infer from IAA which segments of the text are easier or harder to annotate

Page 12: Analyzing Argumentative Discourse Units in Online Interactions

12

Our Solution: Hierarchical Clustering

We use a hierarchical clustering technique to cluster ADUs that are variants of the same Callout (a minimal clustering sketch follows the table below).

Thread     Total clusters    Clusters with ADUs from N expert annotators
                             N=5    N=4    N=3    N=2    N=1
Android    91                52     16     11     7      5
iPad       88                41     17     7      13     10
Layoffs    86                41     18     11     6      10
Twitter    84                44     17     14     4      5

• Clusters with 5 and 4 annotators show Callouts that are plausibly easier to identify

• Clusters selected by only one or two annotators are harder to identify
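
As referenced above, a minimal sketch of how such clustering might work, assuming Callout spans are compared by token (Jaccard) overlap and grouped with average-link agglomerative clustering. The paper does not specify these details; the example spans, the 0.5 distance threshold, and the helper names are illustrative assumptions.

```python
# Minimal sketch (assumptions, not the authors' implementation): group Callout
# spans selected by different expert annotators into clusters of variants of
# the same Callout, using average-link agglomerative clustering over a
# token-overlap (Jaccard) distance. The spans below are invented.
from itertools import combinations
from scipy.cluster.hierarchy import fcluster, linkage

callouts = [
    "I disagree too. some things they get right, some things they do not.",
    "I disagree too. some things they get right",
    "Hi there, I disagree too. some things they get right, some things they do not.",
    "Just because the iPhone has a huge amount of apps, doesn't mean they're all worth having.",
]

def jaccard_distance(a, b):
    """1 minus the token-level Jaccard similarity of two text spans."""
    ta, tb = set(a.lower().split()), set(b.lower().split())
    return 1.0 - len(ta & tb) / len(ta | tb)

# Condensed pairwise distance vector in the order expected by linkage().
dists = [jaccard_distance(a, b) for a, b in combinations(callouts, 2)]
cluster_ids = fcluster(linkage(dists, method="average"), t=0.5, criterion="distance")
print(cluster_ids)  # e.g. the first three spans land in one cluster, the last in another
```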

Page 13: Analyzing Argumentative Discourse Units in Online Interactions

13

Example of a Callout Cluster

Page 14: Analyzing Argumentative Discourse Units in Online Interactions

14

Motivation for a finer-grained annotation

• What is the nature of the relation between a Callout and a Target?

• Can we identify finer-grained ADUs in a Callout?

Page 15: Analyzing Argumentative Discourse Units in Online Interactions

15

Our Approach: Two-tiered Annotation Scheme

• Coarse-grained annotation
  – Expert annotators (EAs)
  – Annotate entire thread

• Fine-grained annotation
  – Novice annotators (Turkers)
  – Annotate only text labeled by EAs

Page 16: Analyzing Argumentative Discourse Units in Online Interactions

16

Novice Annotation: task 1

[Diagram: Target (T) / Callout (CO) pairs whose relation is to be labeled]

This is related to research on annotating and identifying agreement/disagreement (Misra and Walker, 2013; Andreas et al., 2012).

Agree/Disagree/Other

Page 17: Analyzing Argumentative Discourse Units in Online Interactions

But when we first tried the iPhone it felt natural immediately, we didn't have to 'unlearn' old habits from our antiquated Nokias & Blackberrys. That happened because the iPhone is a truly great design.

That's very true. With the iPhone, the sweet goodness part of the UI is immediately apparent. After a minute or two, you’re feeling empowered and comfortable.

It's the weaknesses that take several days or weeks for you to really understanding and get frustrated by.

I disagree that the iPhone just "felt natural immediately"... In my opinion it feels restrictive and over simplified, sometimes to the point of frustration.

User1

User2

User3

Target: when we first tried the iPhone it felt natural immediately,

Callout: That's very true. With the iPhone, the sweet goodness part of the UI is immediately apparent. After a minute or two, you're feeling empowered and comfortable.

Callout: I disagree that the iPhone just "felt natural immediately"… In my opinion it feels restrictive and over simplified, sometimes to the point of frustration.

Page 18: Analyzing Argumentative Discourse Units in Online Interactions

18

More on the Agree/Disagree Relation Label

• For each Target/Callout pair we employed five Turkers

• Fleiss' kappa shows moderate agreement between the Turkers (a computation sketch follows this list)

• 143 Agree / 153 Disagree / 50 Other data instances

• We ran preliminary experiments for predicting the relation label (rule-based, BoW, lexical features…)

• Best results (F1): 66.9% (Agree), 62.9% (Disagree)
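
As noted above, a minimal sketch of the Fleiss' kappa computation over the five Turkers' Agree/Disagree/Other labels, assuming the judgments are arranged as an items-by-raters matrix; the matrix below is invented and the use of statsmodels is an illustrative choice, not the authors'.

```python
# Minimal sketch (not the authors' code): Fleiss' kappa over five Turkers'
# relation labels for each Target/Callout pair. Rows = pairs, columns = Turkers;
# 0 = Agree, 1 = Disagree, 2 = Other. The label matrix is invented.
import numpy as np
from statsmodels.stats.inter_rater import aggregate_raters, fleiss_kappa

labels = np.array([
    [0, 0, 0, 1, 0],
    [1, 1, 1, 1, 2],
    [2, 0, 2, 2, 1],
    [1, 1, 0, 1, 1],
])

table, _ = aggregate_raters(labels)   # per-item counts for each category
print("Fleiss' kappa:", fleiss_kappa(table, method="fleiss"))
```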

Page 19: Analyzing Argumentative Discourse Units in Online Interactions

19

Novice Annotation: task 2

2: Identifying Stance vs. Rationale

This is related to the justification identification task (Biran and Rambow, 2011)

[Diagram: a Callout (CO) split into Stance (S) and Rationale (R), linked to its Target (T), plus a Difficulty judgment]

Page 20: Analyzing Argumentative Discourse Units in Online Interactions

That's very true. With the iPhone, the sweet goodness part of the UI is immediately apparent. After a minute or two, you’re feeling empowered and comfortable.

It's the weaknesses that take several days or weeks for you to really understanding and get frustrated by.

I disagree that the iPhone just "felt natural immediately"... In my opinion it feels restrictive and over simplified, sometimes to the point of frustration.

User2

User3

Callout: That's very true. With the iPhone, the sweet goodness part of the UI is immediately apparent. After a minute or two, you're feeling empowered and comfortable.

Callout: I disagree that the iPhone just "felt natural immediately"… In my opinion it feels restrictive and over simplified, sometimes to the point of frustration.

Stance: "That's very true"
Stance: "I disagree that the iPhone just 'felt natural immediately'"
(the remainder of each Callout is its Rationale)

Page 21: Analyzing Argumentative Discourse Units in Online Interactions

21

Examples of Callout/Target pairs with difficulty level (majority voting)

1. Target: "the iPhone is a truly great design."
   Callout: "I disagree too. some things they get right, some things they do not."
   Stance: "I…too"   Rationale: "Some things…do not"   Difficulty: Easy

2. Target: "the dedicated `Back' button"
   Callout: "that back button is key. navigation is actually much easier on the android."
   Stance: "That back button is key"   Rationale: "Navigation is…android"   Difficulty: Moderate

3. Target: "It's more about the features and apps and Android seriously lacks on latter."
   Callout: "Just because the iPhone has a huge amount of apps, doesn't mean they're all worth having."
   Stance: -   Rationale: "Just because the iPhone has a huge amount of apps, doesn't mean they're all worth having."   Difficulty: Difficult

4. Target: "I feel like your comments about Nexus One is too positive …"
   Callout: "I feel like your poor grammar are to obvious to be self thought..."
   Stance: -   Rationale: -   Difficulty: Too difficult / unsure

Page 22: Analyzing Argumentative Discourse Units in Online Interactions

22

Difficulty judgment (majority voting)

Difficulty (%)           Number of expert annotators per cluster
                         5      4      3      2      1
Easy                     81.0   70.8   60.9   63.6   25.0
Moderate                 7.7    7.0    17.1   6.1    25.0
Difficult                5.9    5.9    7.3    9.1    12.5
Too difficult to code    5.4    16.4   14.6   21.2   37.5
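
The difficulty label reported above is the majority vote over the five Turkers' judgments for each Target/Callout pair; a minimal sketch (the labels are invented, and tie-breaking is left unspecified):

```python
# Minimal sketch: majority vote over five Turkers' difficulty judgments for one
# Target/Callout pair (invented labels); a tie-breaking rule would be needed in practice.
from collections import Counter

judgments = ["easy", "easy", "moderate", "easy", "too difficult to code"]
label, votes = Counter(judgments).most_common(1)[0]
print(label, votes)  # easy 3
```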

Page 23: Analyzing Argumentative Discourse Units in Online Interactions

23

Conclusion

• We propose a two-tiered annotation scheme for argument annotation in online discussion forums

• Expert annotators detect Callout/Target pairs, while crowdsourcing is employed to discover finer units such as Stance/Rationale

• Our study also helps detect text that is easy or hard to annotate

• Preliminary experiments to predict agreement/disagreement between ADUs

Page 24: Analyzing Argumentative Discourse Units in Online Interactions

24

Future Work

• Qualitative analysis of the Callout phenomenon to support finer-grained analysis

• Study the different uses of ADUs in different situations

• Annotate different domains (e.g., healthcare forums) and adjust our annotation scheme

• Predictive modeling of the Stance/Rationale phenomenon

Page 25: Analyzing Argumentative Discourse Units in Online Interactions

25

Thank you!

Page 26: Analyzing Argumentative Discourse Units in Online Interactions

26

Example from the discussion thread

[Diagram: the Callouts from User2 and User3 with their Stance and Rationale segments highlighted]

Page 27: Analyzing Argumentative Discourse Units in Online Interactions

27

Predicting the Agree/Disagree Relation Label

• Training data (143 Agree / 153 Disagree)

• Salient features for the experiments
  – Baseline: rule-based (`agree', `disagree')
  – Mutual Information (MI): MI is used to select words to represent each category
  – LexFeat: lexical features based on sentiment lexicons (Hu and Liu, 2004), lexical overlaps, initial words of the Callouts…

• 10-fold CV using SVM (see the sketch below)
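
As referenced in the last bullet, a minimal sketch of a comparable setup: 10-fold cross-validation with a linear SVM over MI-selected unigram features. This is not the authors' implementation; the LexFeat features (sentiment lexicons, lexical overlaps, Callout-initial words) are omitted, and the toy Callout texts, labels, and parameter choices below are invented.

```python
# Minimal sketch (assumptions, not the authors' setup): 10-fold CV with a linear
# SVM over unigrams filtered by mutual information, for Agree/Disagree prediction.
# The toy texts/labels are invented; k and other parameters are arbitrary.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.feature_selection import SelectKBest, mutual_info_classif
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import Pipeline
from sklearn.svm import LinearSVC

callouts = ["That's very true, the UI is immediately apparent.",
            "I disagree that the iPhone just felt natural immediately.",
            "Exactly my experience as well, great design.",
            "No, it feels restrictive and over simplified."] * 25   # toy data
labels = ["agree", "disagree", "agree", "disagree"] * 25

clf = Pipeline([
    ("unigrams", CountVectorizer(lowercase=True)),
    ("mi_select", SelectKBest(mutual_info_classif, k=10)),  # MI-based word selection
    ("svm", LinearSVC()),
])
scores = cross_val_score(clf, callouts, labels, cv=10, scoring="f1_macro")
print("mean F1 over 10 folds:", scores.mean())
```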

Page 28: Analyzing Argumentative Discourse Units in Online Interactions

28

Predicting the Agree/Disagree Relation Label (preliminary results)

• Lexical features result in F1 scores between 60% and 70% for the Agree/Disagree relations

• Ablation tests show that the initial words of the Callout are the strongest feature

• The rule-based system shows very low recall (7%), which indicates that many Target-Callout relations are *implicit*

• Limitation – lack of data (currently in the process of annotating more data…)

Page 29: Analyzing Argumentative Discourse Units in Online Interactions

29

# of Clusters for each Corpus

Thread     Total clusters    Clusters with ADUs from N expert annotators
                             N=5    N=4    N=3    N=2    N=1
Android    91                52     16     11     7      5
iPad       88                41     17     7      13     10
Layoffs    86                41     18     11     6      10
Twitter    84                44     17     14     4      5

• Clusters with 5 and 4 annotators show Callouts that are plausibly easier to identify

• Clusters selected by only one or two annotators are harder to identify

Page 30: Analyzing Argumentative Discourse Units in Online Interactions

30

[Diagram: the example thread with User1's Target and two Callouts (Callout1, Callout2) from User2 and User3]


Page 32: Analyzing Argumentative Discourse Units in Online Interactions

Fine-Grained Novice Annotation

32

[Diagram: Target (T) / Callout (CO) pairs; relation identification between ADUs (e.g., Agree/Disagree/Other) and finer-grained annotation within a Callout (e.g., Stance & Rationale)]

Page 33: Analyzing Argumentative Discourse Units in Online Interactions

33

Motivation and Challenges

[Diagram: a discussion thread of posts Post1-Post4]

1. Segmentation
2. Segment Classification
3. Relation Identification

Argumentative Discourse Units (ADU; Peldszus and Stede, 2013)

Page 34: Analyzing Argumentative Discourse Units in Online Interactions

34

Why do we propose a two-layer annotation?

• A two-layer annotation schema
  – Expert Annotation
    • Five annotators who received extensive training for the task
    • Primary task includes selecting discourse units from users' posts (argumentative discourse units: ADUs)
    • Peldszus and Stede (2013)
  – Novice Annotation
    • Use of the Amazon Mechanical Turk (AMT) platform to detect the nature and role of the ADUs selected by the experts

Page 35: Analyzing Argumentative Discourse Units in Online Interactions

35

Annotation Schema for Expert Annotators

• Callout
  A Callout is a subsequent action that selects all or some part of a prior action (i.e., Target) and comments on it in some way.

• Target
  A Target is a part of a prior action that has been called out by a subsequent action.

Page 36: Analyzing Argumentative Discourse Units in Online Interactions

36

Motivation and Challenges

• User-generated conversational data provides a wealth of naturally generated arguments

• Argument mining of such online interactions, however, is still in its infancy…

Page 37: Analyzing Argumentative Discourse Units in Online Interactions

37

Details on the Corpora

• Four blog posts and their responses (i.e., the first 100 comments) from Technorati, between 2008 and 2010

• We selected blog postings on the general topic of technology, which contain many disputes and arguments

• Together they are denoted as the argumentative corpus

Page 38: Analyzing Argumentative Discourse Units in Online Interactions

38

Motivation and Challenges (cont.)

• A detailed single annotation scheme seems infeasible
  – The problem of high *cognitive load* (e.g., annotators have to read all the threads)
  – Use of expert annotators for all tasks is costly

• We propose a scalable and principled two-tier scheme to annotate corpora for arguments

Page 39: Analyzing Argumentative Discourse Units in Online Interactions

39

Annotation Schema(s)

• A two-layer annotation schema
  – Expert Annotation
    • Five annotators who received extensive training for the task
    • Primary task includes a) segmentation, b) segment classification, and c) relation identification of argumentative discourse units (ADUs) selected from users' posts
  – Novice Annotation
    • Use of the Amazon Mechanical Turk (AMT) platform to detect the nature and role of the ADUs selected by the experts

Page 40: Analyzing Argumentative Discourse Units in Online Interactions

40

Example from the discussion thread

Page 41: Analyzing Argumentative Discourse Units in Online Interactions

41

A picture is worth…

Page 42: Analyzing Argumentative Discourse Units in Online Interactions

42

Motivation and Challenges

1. Segmentation
2. Segment Classification
3. Relation Identification

Argument annotation includes three tasks (Peldszus and Stede, 2013)

Page 43: Analyzing Argumentative Discourse Units in Online Interactions

43

Summary of the Annotation Schema(s)

• First stage of annotation
  – Annotators: expert (trained) annotators
  – A coarse-grained annotation scheme inspired by Pragmatic Argumentation Theory (PAT; Van Eemeren et al., 1993)
  – Segment, label, and link Callout and Target

• Second stage of annotation
  – Annotators: novice (crowd) annotators
  – A finer-grained annotation to detect the Stance and Rationale of an argument

Page 44: Analyzing Argumentative Discourse Units in Online Interactions

44

Expert Annotation

Expert Annotators
• Segmentation
• Labeling
• Linking
(Peldszus and Stede, 2013)

Coarse-grained annotation
• Five expert (trained) annotators detect two types of ADUs
• ADU: Callout and Target

Page 45: Analyzing Argumentative Discourse Units in Online Interactions

45

The Argumentative Corpus

Blogs and comments extracted from Technorati (2008-2010)


Page 46: Analyzing Argumentative Discourse Units in Online Interactions

46

Novice Annotations: Identifying Stance and Rationale

[Diagram: each Callout is sent to crowdsourcing]

• Identify the task difficulty (very difficult … very easy)
• Identify the text segments (Stance and Rationale)

Page 47: Analyzing Argumentative Discourse Units in Online Interactions

47

Novice Annotations: Identifying the relation between ADUs

[Diagram: Callout/Target pairs are sent to crowdsourcing to obtain a relation label]

Relation label (%)    Number of expert annotators per cluster
                      5      4      3      2      1
Agree                 39.4   43.3   42.5   35.5   48.4
Disagree              56.9   31.7   32.5   25.8   19.4
Other                 3.70   25.0   25.0   38.7   32.3

Page 48: Analyzing Argumentative Discourse Units in Online Interactions

48

More on Expert Annotations

• Annotators were free to choose any text segment to represent an ADU

[Diagram: "splitters" vs. "lumpers" among the annotators]

Page 49: Analyzing Argumentative Discourse Units in Online Interactions

49

Novice Annotation: task 1

1: Identifying the relation (agree/disagree/other)

This is related to annotation of agreement/disagreement (Misra and Walker, 2013; Andreas et al., 2012) and classification of stances (Somasundaran and Wiebe, 2010) in online forums.

Page 50: Analyzing Argumentative Discourse Units in Online Interactions

50

ADUs: Callout and Target

Page 51: Analyzing Argumentative Discourse Units in Online Interactions

51

Examples of Clusters

# of EAs: 5
  Callout: "I disagree too. some things they get right, some things they do not."
  Target: "the iPhone is a truly great design."
  Callout: "I disagree too…they do not."
  Target: "That happened because the iPhone is a truly great design."

# of EAs: 2
  Callout: "These iPhone Clones are playing catchup. Good luck with that."
  Target: "griping about issues that will only affect them once in a blue moon"

# of EAs: 1
  Callout: "Do you know why the Pre ...various hand-set/builds/resolution issues?"
  Target: "Except for games?? iPhone is clearly dominant there."

Page 52: Analyzing Argumentative Discourse Units in Online Interactions

52

More on Expert Annotations

• Annotators were free to choose any text segment to represent an ADU

Page 53: Analyzing Argumentative Discourse Units in Online Interactions

53

Example from the discussion thread

Page 54: Analyzing Argumentative Discourse Units in Online Interactions

54

Coarse-grained Expert Annotation

[Diagram: a Target in an earlier post linked to a Callout in a later post]

Pragmatic Argumentation Theory (PAT; Van Eemeren et al., 1993) based annotation

Page 55: Analyzing Argumentative Discourse Units in Online Interactions

55

ADUs: Callout and Target

Page 56: Analyzing Argumentative Discourse Units in Online Interactions

56

More on Expert Annotations and Corpus

• Five annotators were free to choose any text segment to represent an ADU

• Four blogs and their first one hundred comments are used as our argumentative corpus
  – Layoffs
  – Android
  – Twitter
  – iPad

Page 57: Analyzing Argumentative Discourse Units in Online Interactions

57

Examples of a Cluster (# of EAs = 5)

  Callout: "I disagree too. some things they get right, some things they do not."
  Target: "the iPhone is a truly great design."

  Callout: "I disagree too…they do not."
  Target: "That happened because the iPhone is a truly great design."

  Callout: "I disagree too."
  Target: "But when we first tried the iPhone it felt natural immediately . . . iPhone is a truly great design."

  Callout: "Hi there, I disagree too . . . they do not. Same as OSX."
  Target: -Same as above-

  Callout: "I disagree too. . . Same as OSX . . . no problem."
  Target: -Same as above-

Page 58: Analyzing Argumentative Discourse Units in Online Interactions

58

Predicting the Agree/Disagree Relation Label

Features           Categ.     P      R      F1
Baseline           Agree      83.3   6.90   12.9
                   Disagree   50.0   5.20   9.50
Unigrams           Agree      57.9   61.5   59.7
                   Disagree   61.8   58.2   59.9
MI-based unigram   Agree      60.1   66.4   63.1
                   Disagree   65.2   58.8   61.9
LexF               Agree      61.4   73.4   66.9
                   Disagree   69.6   56.9   62.6

Page 59: Analyzing Argumentative Discourse Units in Online Interactions

59

Novice Annotation: task 2

2: Identifying Stance vs. Rationale

This is related to the claim/justification identification task (Biran and Rambow, 2011)