Subjectivity and Sentiment Analysis. Jan Wiebe, Department of Computer Science, CERATOPS: Center for the Extraction and Summarization of Events and Opinions in Text, University of Pittsburgh.



  • What is Subjectivity? The linguistic expression of somebody's opinions, sentiments, emotions, evaluations, beliefs, speculations (private states)

    Wow, this is my 4th Olympus camera

    Staley declared it to be one wicked song

    Most voters believe he won't raise their taxes

  • One Motivation: Automatic question answering

  • Fact-Based Question Answering
    Q: When is the first day of spring in 2007? A: March 21
    Q: Does the US have a tax treaty with Cuba? A: "Thus, the U.S. has no tax treaties with nations like Iraq and Cuba."

  • Opinion Question Answering
    Q: What is the international reaction to the reelection of Robert Mugabe as President of Zimbabwe?

    A: African observers generally approved of his victory while Western Governments strongly denounced it.

  • More Motivations

    Product review mining: What features of the ThinkPad T43 do customers like and which do they dislike?
    Review classification: Is a review positive or negative toward the movie?
    Information Extraction: Is "killing two birds with one stone" a terrorist event?
    Tracking sentiments toward topics over time: Is anger ratcheting up or cooling down?
    Etcetera!

  • [Roadmap diagram: clue words such as condemn, great, wicked feed applications (QA, Review Mining, Opinion Tracking); contextual examples include "wicked visuals", "loudly condemned", and "The building has been condemned"]

  • Outline
    Subjectivity annotation scheme (MPQA)
    Learning subjective expressions from unannotated texts
    Contextual polarity of sentiment expressions
    Word sense and subjectivity
    Conclusions and pointers to related work

  • Corpus Annotation

    Wiebe, Wilson & Cardie 2005; Wilson & Wiebe 2005; Somasundaran, Wiebe, Hoffmann & Litman 2006; Wilson 2007

  • Outline for Section 1
    Overview
    Frame types
    Nested sources
    Extensions

  • Overview
    Fine-grained: expression level rather than sentence or document level
    Annotate expressions of opinions, sentiments, emotions, evaluations, speculations, and material attributed to a source but presented objectively

  • Overview
    Opinions, evaluations, emotions, and speculations are private states.
    They are expressed in language by subjective expressions.
    Private state: a state that is not open to objective observation or verification.

    Quirk, Greenbaum, Leech, Svartvik (1985). A Comprehensive Grammar of the English Language.

  • Overview
    Focus on three ways private states are expressed in language:

    Direct subjective expressions
    Expressive subjective elements
    Objective speech events

  • Direct Subjective Expressions
    Direct mentions of private states: The United States fears a spill-over from the anti-terrorist campaign.

    Private states expressed in speech events: "We foresaw electoral fraud but not daylight robbery," Tsvangirai said.

  • Expressive Subjective Elements [Banfield 1982]
    "We foresaw electoral fraud but not daylight robbery," Tsvangirai said.

    The part of the US human rights report about China is full of absurdities and fabrications.

  • Objective Speech Events
    Material attributed to a source, but presented as objective fact (without evaluation)

    The government, it added, has amended the Pakistan Citizenship Act 10 of 1951 to enable women of Pakistani descent to claim Pakistani nationality for their children born to foreign husbands.

  • Nested Sources
    "The report is full of absurdities," Xirao-Nima said the next day.
    The writer's speech event has source (writer); "said" and "full of absurdities" have the nested source (writer, Xirao-Nima).

  • "The report is full of absurdities," Xirao-Nima said the next day.
    Objective speech event: anchor: the entire sentence; source: (writer); implicit: true
    Direct subjective: anchor: said; source: (writer, Xirao-Nima); intensity: high; expression intensity: neutral; attitude type: negative; target: report

    Expressive subjective element: anchor: full of absurdities; source: (writer, Xirao-Nima); intensity: high; attitude type: negative

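The annotation frames above can be pictured as a small data structure. This is an illustrative sketch only: the class and field names follow the slide's attributes (anchor, source, intensity, attitude type, target) but are not the MPQA corpus's actual file format.

```python
from dataclasses import dataclass
from typing import List, Optional

@dataclass
class Frame:
    frame_type: str              # e.g., "direct subjective"
    anchor: str                  # text span the annotation attaches to
    source: List[str]            # nested source chain, outermost first
    intensity: Optional[str] = None
    expression_intensity: Optional[str] = None
    attitude_type: Optional[str] = None
    target: Optional[str] = None

# The three frames for: "The report is full of absurdities," Xirao-Nima said.
frames = [
    Frame("objective speech event", "the entire sentence", ["writer"]),
    Frame("direct subjective", "said", ["writer", "Xirao-Nima"],
          intensity="high", expression_intensity="neutral",
          attitude_type="negative", target="report"),
    Frame("expressive subjective element", "full of absurdities",
          ["writer", "Xirao-Nima"], intensity="high",
          attitude_type="negative"),
]
```

Note how the nested source chain makes "full of absurdities" the writer's report of Xirao-Nima's private state, not the writer's own.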

  • "The US fears a spill-over," said Xirao-Nima, a professor of foreign affairs at the Central University for Nationalities.
    Nested sources: (writer) for the writer's speech event; (writer, Xirao-Nima) for "said"; (writer, Xirao-Nima, US) for "fears".

  • "The US fears a spill-over," said Xirao-Nima, a professor of foreign affairs at the Central University for Nationalities.
    Objective speech event: anchor: the entire sentence; source: (writer); implicit: true
    Objective speech event: anchor: said; source: (writer, Xirao-Nima)
    Direct subjective: anchor: fears; source: (writer, Xirao-Nima, US); intensity: medium; expression intensity: medium; attitude type: negative; target: spill-over

  • Corpus @ www.cs.pitt.edu/mpqa

    English language versions of articles from the world press (187 news sources)

    Also includes contextual polarity annotations

    Themes of the instructions:
    No rules about how particular words should be annotated.
    Don't take expressions out of context and think about what they could mean; judge them as they are used in that sentence.

  • Extensions [Wilson 2007]
    I think people are happy because Chavez has fallen.
    Direct subjective: span: think; attitude: span: think; type: positive arguing; intensity: medium; target: span: people are happy because Chavez has fallen
    Direct subjective: span: are happy; attitude: span: are happy; type: pos sentiment; intensity: medium; target: span: Chavez has fallen
    Inferred attitude: span: are happy because Chavez has fallen; type: neg sentiment; intensity: medium; target: span: Chavez

  • Outline
    Subjectivity annotation scheme (MPQA)
    Learning subjective expressions from unannotated texts
    Contextual polarity of sentiment expressions
    Word sense and subjectivity
    Conclusions and pointers to related work

  • Outline for Section 2
    Learning subjective nouns with extraction pattern bootstrapping
    Automatically generating training data with high-precision classifiers
    Learning subjective and objective expressions (not simply words or n-grams)

    Riloff, Wiebe & Wilson 2003; Riloff & Wiebe 2003; Wiebe & Riloff 2005; Riloff, Patwardhan & Wiebe 2006

  • Information Extraction
    Information extraction (IE) systems identify facts related to a domain of interest.
    Extraction patterns are lexico-syntactic expressions that identify the role of an object. For example: was killed; assassinated; murder of

  • Learning Subjective Nouns
    Goal: learn subjective nouns from unannotated text
    Method: apply IE-based bootstrapping algorithms that were designed to learn semantic categories
    Hypothesis: extraction patterns can identify subjective contexts that co-occur with subjective nouns
    Example: expressed concern, hope, support

  • Extraction Examples
    expressed: condolences, hope, grief, views, worries
    indicative of: compromise, desire, thinking
    inject: vitality, hatred
    reaffirmed: resolve, position, commitment
    voiced: outrage, support, skepticism, opposition, gratitude, indignation
    show of: support, strength, goodwill, solidarity
    was shared: anxiety, view, niceties, feeling

  • Meta-Bootstrapping [Riloff & Jones 1999]
    [Diagram: the best extraction pattern (e.g., "expressed") yields extractions (nouns such as hope, grief, joy, concern, worries, happiness, relief, condolences), which feed back into the bootstrapping loop]

  • Subjective Seed Words

  • Subjective Noun Results
    Bootstrapping corpus: 950 unannotated news articles
    We ran both bootstrapping algorithms for several iterations
    We manually reviewed the words and labeled them as strong, weak, or not subjective
    1052 subjective nouns were learned (454 strong, 598 weak), included in our subjectivity lexicon @ www.cs.pitt.edu/mpqa

  • Examples of Strong Subjective Nouns
    anguish, antagonism, apologist, atrocities, barbarian, belligerence, bully, condemnation, denunciation, devil, diatribe, exaggeration, exploitation, evil, fallacies, genius, goodwill, humiliation, ill-treatment, injustice, innuendo, insinuation, liar, mockery, pariah, repudiation, revenge, rogue, sanctimonious, scum, smokescreen, sympathy, tyranny, venom

  • Examples of Weak Subjective Nouns

  • Outline for Section 2
    Learning subjective nouns with extraction pattern bootstrapping
    Automatically generating training data with high-precision classifiers
    Learning subjective and objective expressions

  • Training Data Creation
    [Diagram: unlabeled texts and subjective clues feed a rule-based subjective sentence classifier and a rule-based objective sentence classifier, which output subjective & objective sentences]

  • Subjective Clues
    Subjectivity lexicon @ www.cs.pitt.edu/mpqa
    Entries from manually developed resources (e.g., General Inquirer, FrameNet) + entries automatically identified (e.g., the nouns just described)

  • Creating High-Precision Rule-Based Classifiers
    GOAL: use subjectivity clues from previous research to build high-precision (low-recall) rule-based classifiers
    A sentence is subjective if it contains at least 2 strong subjective clues.
    A sentence is objective if it contains no strong subjective clues, the previous and next sentences contain at most 1 strong subjective clue, and the current, previous, and next sentences together contain at most 2 weak subjective clues.

  • Accuracy of Rule-Based Classifiers (measured against MPQA annotations)
    Subj RBC: SubjRec 34.2, SubjPrec 90.4, SubjF 49.6
    Obj RBC: ObjRec 30.7, ObjPrec 82.4, ObjF 44.7
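The rules above can be sketched in Python. This is a minimal sketch: the clue sets below are toy stand-ins for the subjectivity lexicon (the real clue lists are at www.cs.pitt.edu/mpqa), and whitespace tokenization is a simplification.

```python
def count_clues(sentence, clues):
    """Count clue words in a sentence (case-insensitive, whitespace tokens)."""
    return sum(1 for w in sentence.lower().split() if w in clues)

def classify(prev, cur, nxt, strong, weak):
    """Return 'subjective', 'objective', or None when neither rule fires."""
    if count_clues(cur, strong) >= 2:
        return "subjective"
    if (count_clues(cur, strong) == 0
            and count_clues(prev, strong) + count_clues(nxt, strong) <= 1
            and sum(count_clues(s, weak) for s in (prev, cur, nxt)) <= 2):
        return "objective"
    return None  # the classifier abstains; only confident labels are kept

strong = {"outrage", "absurdities", "condemned"}
weak = {"support", "concern"}
print(classify("", "sheer outrage over the absurdities", "", strong, weak))
# prints: subjective
```

Abstaining on unclear sentences is what buys the high precision at low recall shown in the table above.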

  • Generated Data
    We applied the rule-based classifiers to ~300,000 sentences from (unannotated) news articles

    ~53,000 were labeled subjective

    ~48,000 were labeled objective

    training set of over 100,000 labeled sentences!

  • Generated Data
    The generated data may serve as training data for supervised learning, and as initial data for self-training [Wiebe & Riloff 2005]

    The rule-based classifiers are part of OpinionFinder @ www.cs.pitt.edu/mpqa

    Here, we use the data to learn new subjective expressions

  • Outline for Section 2
    Learning subjective nouns with extraction pattern bootstrapping
    Automatically generating training data with high-precision classifiers
    Learning subjective and objective expressions

  • Representing Subjective Expressions with Extraction Patterns
    EPs can represent expressions that are not fixed word sequences:
    drove [NP] up the wall: drove him up the wall; drove George Bush up the wall; drove George Herbert Walker Bush up the wall

    step on [modifiers] toes: step on her toes; step on the mayor's toes; step on the newly elected mayor's toes
    gave [NP] a [modifiers] look: gave his annoying sister a really really mean look

  • The Extraction Pattern Learner
    Used AutoSlog-TS [Riloff 96] to learn EPs
    AutoSlog-TS needs relevant and irrelevant texts as input; statistics are generated measuring each pattern's association with the relevant texts
    The subjective sentences are the relevant texts, and the objective sentences are the irrelevant texts

  • Example pattern types (with instances): passive-vp (was satisfied); active-vp (complained); active-vp dobj (dealt blow); active-vp infinitive (appears to be); passive-vp infinitive (was thought to be); auxiliary dobj (has position)

  • AutoSlog-TS (Step 1)

    [The World Trade Center], [an icon] of [New York City], was intentionally attacked very early on [September 11, 2001].
    The parser output yields Extraction Patterns: was attacked; icon of; was attacked on

  • AutoSlog-TS (Step 2)
    Extraction Patterns: was attacked; icon of; was attacked on

  • Identifying Subjective Patterns
    Subjective patterns: Freq > X and Probability > Y
    ~6,400 learned on the 1st iteration
    Evaluation against the MPQA corpus: direct evaluation of performance as simple classifiers; evaluation of classifiers using the patterns as additional features
    Does the system learn new knowledge? Adding the patterns to the strong subjective set and re-running the rule-based classifier raised recall by about 20 points while precision dropped by about 7 on the 1st iteration.

  • Patterns with Interesting Behavior
    PATTERN: asked; FREQ: 128; P(Subj | Pattern): .63
    PATTERN: was asked; FREQ: 11; P(Subj | Pattern): 1.0
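The Freq > X and P(Subj | Pattern) > Y filter can be sketched as follows. The threshold values and the shape of the `stats` dictionary are illustrative; the toy counts echo the "asked" / "was asked" slide above.

```python
def subjective_patterns(stats, freq_min=5, prob_min=0.8):
    """stats: {pattern: (freq, freq_in_subjective_sentences)}.
    Keep patterns that are frequent and mostly occur in subjective sentences."""
    kept = {}
    for pattern, (freq, subj_freq) in stats.items():
        if freq >= freq_min and subj_freq / freq >= prob_min:
            kept[pattern] = subj_freq / freq
    return kept

stats = {"asked": (128, 81), "was asked": (11, 11)}
print(subjective_patterns(stats))  # prints: {'was asked': 1.0}
```

The passive "was asked" survives the probability filter while the active "asked" (81/128 ≈ .63) does not, which is exactly the subtle distinction the slide highlights.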

  • Conclusions for Section 2
    Extraction pattern bootstrapping can learn subjective nouns (unannotated data)
    Extraction patterns can represent richer subjective expressions
    Learning methods can discover subtle distinctions that are important for subjectivity (unannotated data)
    Ongoing work: a lexicon representation integrating different types of entries and enabling comparisons (e.g., subsumption relationships); learning and bootstrapping processes applied to much larger unannotated corpora; processes applied to learning positive and negative subjective expressions

  • Outline
    Subjectivity annotation scheme (MPQA)
    Learning subjective expressions from unannotated texts
    Contextual polarity of sentiment expressions (briefly) [Wilson, Wiebe & Hoffmann 2005; Wilson 2007; Wilson & Wiebe forthcoming]
    Word sense and subjectivity
    Conclusions and pointers to related work

  • Prior Polarity versus Contextual Polarity
    Most approaches use a lexicon of positive and negative words
    Prior polarity: out of context, positive or negative (beautiful: positive; horrid: negative)
    A word may appear in a phrase that expresses a different polarity in context

    Contextual polarity: Cheers to Timothy Whitfield for the wonderfully horrid visuals.

  • Goal of This Work
    Automatically distinguish prior and contextual polarity

  • Approach
    Use machine learning and a variety of features
    Achieve significant results for a large subset of sentiment expressions

  • Manual Annotations
    Subjective expressions of the MPQA corpus annotated with contextual polarity

  • Annotation Scheme
    Mark the polarity of subjective expressions as positive, negative, both, or neutral:
    African observers generally approved [positive] of his victory while Western governments denounced [negative] it.
    Besides, politicians refer to good and evil [both]...
    Jerome says the hospital feels no different than a hospital in the states. [neutral]

  • Annotation Scheme
    Judge the contextual polarity of the sentiment ultimately being conveyed:

    They have not succeeded, and will never succeed, in breaking the will of this valiant people.


  • Features
    Many inspired by Polanyi & Zaenen (2004): Contextual Valence Shifters. Example: little threat; little truth
    Others capture dependency relationships between words. Example: wonderfully horrid (posmod)

  • Word features: word token (terrifies); word part-of-speech (VB); context (that terrifies me); prior polarity (negative); reliability (strong subjective)

  • Modification features (from a dependency parse tree), binary: preceded by adjective / adverb (other than not) / intensifier; self intensifier; modifies strong clue / weak clue; modified by strong clue / weak clue

  • Structure features, binary: in subject ([The human rights report] poses); in copular (I am confident); in passive voice (must be regarded)

  • Sentence features: count of strong clues in the previous, current, and next sentence; count of weak clues in the previous, current, and next sentence; counts of various parts of speech

  • Document feature: document topic (15 topics, e.g., economics, health, Kyoto protocol, presidential election in Zimbabwe). Example: The disease can be contracted if a person is bitten by a certain tick or if a person comes into contact with the blood of a congo fever sufferer.

  • Polarity features: word token (terrifies); word prior polarity (negative)

  • Negation features, binary: negated (for example: not good; does not look very good; not only good but amazing); negated subject (No politically prudent Israeli could support either of them.)

  • Modifies polarity (5 values: positive, negative, neutral, both, not mod), e.g., substantial: negative

    Modified by polarity (5 values: positive, negative, neutral, both, not mod), e.g., challenge: positive

  • Conjunction polarity (5 values: positive, negative, neutral, both, not mod), e.g., good: negative

  • Polarity shifters: general polarity shifter (have few risks/rewards); negative polarity shifter (lack of understanding); positive polarity shifter (abate the damage)

  • Findings
    Statistically significant improvements can be gained, but they require combining all feature types
    Ongoing work: richer lexicon entries; compositional contextual processing

  • S/O and Pos/Neg: both Important

    Several studies have found two steps beneficial: Yu & Hatzivassiloglou 2003; Pang & Lee 2004; Wilson et al. 2005; Kim & Hovy 2006

  • S/O and Pos/Neg: both Important

    S/O can be more difficult:
    Manual annotation of phrases [Takamura et al. 2006]
    Contextual polarity [Wilson et al. 2005]
    Sentiment tagging of words [Andreevskaia & Bergler 2006]
    Sentiment tagging of word senses [Esuli & Sebastiani 2006]
    Later: evidence that S/O is more significant for senses

  • S/O and Pos/Neg: both ImportantDesirable for NLP systems to find a wide range of private states, including motivations, thoughts, and speculations, not just positive and negative sentiments

  • S/O and Pos/Neg: both Important
    [Diagram: Subjectivity divides into Sentiment (Pos, Neg, Both) and Other; Objective stands apart]

  • Outline
    Subjectivity annotation scheme (MPQA)
    Learning subjective expressions from unannotated texts
    Contextual polarity of sentiment expressions
    Word sense and subjectivity [Wiebe & Mihalcea 2006]
    Conclusions and pointers to related work

  • [Roadmap diagram, revisited at the sense level: the clue "condemn" splits into senses condemn#1, condemn#2, condemn#3]

  • Introduction
    Continuing interest in word sense
    Sense-annotated resources are being developed for many languages (www.globalwordnet.org)
    Active participation in evaluations such as SENSEVAL

  • Word Sense and Subjectivity
    Though both are concerned with text meaning, until recently they have been investigated independently

  • Subjectivity Labels on Senses

    Alarm, dismay, consternation (fear resulting from the awareness of danger) S

    Alarm, warning device, alarm system (a device that signals the occurrence of some undesirable event) O

  • Subjectivity Labels on Senses
    Interest, involvement (a sense of concern with and curiosity about someone or something; "an interest in music") S
    Interest (a fixed charge for borrowing money; usually a percentage of the amount borrowed; "how much interest do you pay on your mortgage?") O

  • WSD using Subjectivity Tagging
    Sense 4 (a sense of concern with and curiosity about someone or something): S. Sense 1 (a fixed charge for borrowing money): O.
    The notes do not pay interest. WSD System: Sense 4 or Sense 1? Subjectivity Classifier: O, so Sense 1.
    He spins a riveting plot which grabs and holds the reader's interest. WSD System: Sense 1 or Sense 4? Subjectivity Classifier: S, so Sense 4.

  • Subjectivity Tagging using WSD
    The notes do not pay interest. Subjectivity Classifier: O or S? WSD System: Sense 1, so O.
    He spins a riveting plot which grabs and holds the reader's interest. Subjectivity Classifier: S or O? WSD System: Sense 4, so S.

  • Refining WordNet: Semantic Richness
    Find inconsistencies and gaps
    Verb assault: attack, round, assail, last out, snipe, assault (attack in speech or writing): The editors of the left-leaning paper attacked the new House Speaker.
    But there is no such sense for the noun, as in: His verbal assault was vicious.

  • Goals
    Explore interactions between word sense and subjectivity
    Can subjectivity labels be assigned to word senses? Manually; automatically
    Can subjectivity analysis improve word sense disambiguation?
    Can word sense disambiguation improve subjectivity analysis? (Current work)

  • Outline for Section 4
    Motivation and goals
    Assigning subjectivity labels to word senses: manually; automatically
    Word sense disambiguation using automatic subjectivity analysis
    Conclusions

  • Annotation Scheme
    Assigning subjectivity labels to WordNet senses:
    S: subjective
    O: objective
    B: both

  • Examples
    Alarm, dismay, consternation (fear resulting from the awareness of danger); Fear, fearfulness, fright (an emotion experienced in anticipation of some specific pain or danger (usually accompanied by a desire to flee or fight)) S

    Alarm, warning device, alarm system (a device that signals the occurrence of some undesirable event); Device (an instrumentality invented for a particular purpose; "the device is small enough to wear on your wrist"; "a device intended to conserve water") O

  • Subjective Sense DefinitionWhen the sense is used in a text or conversation, we expect it to express subjectivity, and we expect the phrase/sentence containing it to be subjective.

  • Subjective Sense Examples
    His alarm grew. Alarm, dismay, consternation (fear resulting from the awareness of danger); Fear, fearfulness, fright (an emotion experienced in anticipation of some specific pain or danger (usually accompanied by a desire to flee or fight))

    He was boiling with anger. Seethe, boil (be in an agitated emotional state; "The customer was seething with anger"); Be (have the quality of being; (copula, used with an adjective or a predicate noun); "John is rich"; "This is not a good answer")

  • Subjective Sense Examples
    What's the catch? Catch (a hidden drawback; "it sounds good but what's the catch?"); Drawback (the quality of being a hindrance; "he pointed out all the drawbacks to my plan")

    That doctor is a quack. Quack (an untrained person who pretends to be a physician and who dispenses medical advice); Doctor, doc, physician, MD, Dr., medico

  • Objective Sense Examples
    The alarm went off. Alarm, warning device, alarm system (a device that signals the occurrence of some undesirable event); Device (an instrumentality invented for a particular purpose; "the device is small enough to wear on your wrist"; "a device intended to conserve water")

    The water boiled. Boil (come to the boiling point and change from a liquid to vapor; "Water boils at 100 degrees Celsius"); Change state, turn (undergo a transformation or a change of position or action)

  • Objective Sense Examples
    He sold his catch at the market. Catch, haul (the quantity that was caught; "the catch was only 10 fish"); Indefinite quantity (an estimated quantity)

    The duck's quack was loud and brief. Quack (the harsh sound of a duck); Sound (the sudden occurrence of an audible event)

  • Objective Senses: Observation
    We don't necessarily expect phrases/sentences containing objective senses to be objective:
    Will someone shut that darn alarm off?
    Can't you even boil water?

    Subjective, but not due to alarm and boil

  • Objective Sense Definition
    When the sense is used in a text or conversation, we don't expect it to express subjectivity and, if the phrase/sentence containing it is subjective, the subjectivity is due to something else.

  • Inter-Annotator Agreement Results
    Overall: Kappa = 0.74; percent agreement = 85.5%

    Without the 12.3% of cases where a judge is uncertain: Kappa = 0.90; percent agreement = 95.0%

  • S/O and Pos/Neg
    Hypothesis: moving from the word to the word sense is more important for S versus O than it is for Positive versus Negative
    Pilot study: nouns of the SENSEVAL-3 English lexical sample task have both subjective and objective senses; only 1 has both positive and negative subjective senses

  • Outline for Section 4
    Motivation and goals
    Assigning subjectivity labels to word senses: manually; automatically
    Word sense disambiguation using automatic subjectivity analysis
    Conclusions

  • Overview
    Main idea: assess the subjectivity of a word sense based on information about the subjectivity of a set of distributionally similar words in a corpus annotated with subjective expressions

  • Preliminaries
    Suppose the goal were to assess the subjectivity of a word w, given an annotated corpus.
    We could consider how often w appears in subjective expressions.
    Or, we could take into account more evidence: the subjectivity of a set of distributionally similar words.

  • Lin's Distributional Similarity [Lin 1998]
    [Diagram: each word is represented by its dependency triples (relation R, word W); two words are distributionally similar when their (R, W) features overlap]

  • Subjectivity of word w
    Over the instances of the distributionally similar words (DSW) in a corpus annotated with subjective expressions (SE), scores fall in [-1, 1], from highly objective to highly subjective:

    subj(w) = (#insts(DSW) in SE − #insts(DSW) not in SE) / #insts(DSW)

    (The distributionally similar words themselves are found in an unannotated corpus, the BNC.)

  • Subjectivity of word sense wi
    Rather than adding or subtracting 1 per instance, add or subtract the similarity sim(wi, dswj): e.g., +sim(wi, dsw1), −sim(wi, dsw1), +sim(wi, dsw2)

  • Method Step 1
    Given word w, find its distributionally similar words: DSW = {dswj | j = 1 .. n}

  • Method Step 2
    Find the similarity between each word sense and each distributionally similar word: sim(wi, dswj)

    wnss can be any concept-based similarity measure between word senses; we use Jiang & Conrath 1997

  • Method Step 3

    Input: word sense wi of word w; DSW = {dswj | j = 1..n}; sim(wi, dswj); MPQA Opinion Corpus

    Output: subjectivity score subj(wi)

  • Method Step 3

    totalsim = Σj #insts(dswj) × sim(wi, dswj)

    evi_subj = 0
    for each dswj in DSW:
        for each instance k in insts(dswj):
            if k is in a subjective expression:
                evi_subj += sim(wi, dswj)
            else:
                evi_subj -= sim(wi, dswj)
    subj(wi) = evi_subj / totalsim
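A direct Python rendering of Step 3, assuming the inputs have already been gathered: `insts` maps each distributionally similar word to per-instance booleans (True when that instance falls inside a subjective expression in the MPQA corpus), and `sim` holds sim(wi, dswj). Both names and the toy data are illustrative.

```python
def sense_subjectivity(insts, sim):
    """Score a word sense in [-1, 1]: -1 highly objective, +1 highly subjective."""
    # totalsim = sum over j of #insts(dswj) * sim(wi, dswj)
    totalsim = sum(len(ks) * sim[d] for d, ks in insts.items())
    evi_subj = 0.0
    for d, ks in insts.items():
        for in_subjective_expression in ks:
            if in_subjective_expression:
                evi_subj += sim[d]
            else:
                evi_subj -= sim[d]
    return evi_subj / totalsim

# Toy example: one similar word with 3 of 4 instances subjective.
score = sense_subjectivity({"dismay": [True, True, True, False]}, {"dismay": 0.5})
print(score)  # (3*0.5 - 1*0.5) / (4*0.5) = 0.5
```

Each instance votes with weight sim(wi, dswj), so evidence from words closer to the target sense counts for more.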

  • Evaluation
    Calculate subj scores for each word sense, and sort them
    While 0 is a natural candidate for the division between S and O, we perform the evaluation for different thresholds in [-1, +1]
    Determine the precision of the algorithm at different points of recall
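The threshold sweep above amounts to sorting senses by score and reading off precision at each recall level; a minimal sketch (the scored data here is illustrative, not the paper's):

```python
def precision_recall(scored):
    """scored: list of (subj_score, is_subjective_gold) pairs, any order.
    Returns (recall, precision) after each prediction, sweeping the threshold
    down from the highest score."""
    scored = sorted(scored, key=lambda x: -x[0])
    total_subj = sum(1 for _, gold in scored if gold)
    points, tp = [], 0
    for i, (_, gold) in enumerate(scored, start=1):
        if gold:
            tp += 1
        points.append((tp / total_subj, tp / i))
    return points

pts = precision_recall([(0.9, True), (0.4, True), (0.2, False), (-0.3, False)])
print(pts[1])  # after two predictions: (1.0, 1.0)
```

Lowering the threshold trades precision for recall, which is what produces curves like the one on the next slide.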

  • Evaluation: precision/recall curves (number of distributionally similar words = 160)

    Recall:    0.0   0.1   0.2   0.3   0.4   0.5   0.6   0.7   0.8   0.9   1.0
    baseline:  0.27  0.27  0.27  0.27  0.27  0.27  0.27  0.27  0.27  0.27  0.27
    selected:  1.00  0.81  0.77  0.55  0.51  0.50  0.42  0.41  0.38  0.37  0.27
    all:       1.00  0.47  0.45  0.45  0.45  0.38  0.37  0.37  0.36  0.34  0.27

  • Outline for Section 4
    Motivation and goals
    Assigning subjectivity labels to word senses: manually; automatically
    Word sense disambiguation using automatic subjectivity analysis
    Conclusions

  • Overview: Augment an existing data-driven WSD system with a feature reflecting the subjectivity of the context of the ambiguous word. Compare the performance of the original and subjectivity-aware WSD systems on the ambiguous nouns of the SENSEVAL-3 English Lexical Task data.

  • Original WSD System: Integrates local and topical features. Local: context of three words to the left and right, and their part-of-speech. Topical: top five words occurring at least three times in the context of a word sense [Ng & Lee, 1996], [Mihalcea, 2002]. Naïve Bayes classifier [Lee & Ng, 2003].
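    A minimal sketch of the kind of feature extraction described above — a local window of three words on each side plus topical keyword indicators. The function name, feature keys, and the tiny example are illustrative assumptions, not the actual system.

```python
# Sketch: local (±3 words and POS) plus topical features for a Naive Bayes
# WSD classifier, in the style of the system described above.

def extract_features(tokens, pos_tags, target_idx, topical_words):
    feats = {}
    # Local features: words and POS tags up to three positions away.
    for offset in range(-3, 4):
        if offset == 0:
            continue
        i = target_idx + offset
        if 0 <= i < len(tokens):
            feats[f"word_{offset}"] = tokens[i]
            feats[f"pos_{offset}"] = pos_tags[i]
    # Topical features: which of a sense's top keywords occur in context.
    context = set(tokens)
    for w in topical_words:
        feats[f"topic_{w}"] = w in context
    return feats

feats = extract_features(
    tokens=["the", "tense", "atmosphere", "in", "the", "room"],
    pos_tags=["DT", "JJ", "NN", "IN", "DT", "NN"],
    target_idx=2,  # "atmosphere"
    topical_words=["mood", "room"],
)
```

    Each feature dictionary would then be fed to a Naïve Bayes classifier; the subjectivity-aware variant simply adds one more feature (the S/O/B tag of the sentence).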

  • Subjectivity Classifier: Rule-based automatic sentence classifier from Wiebe & Riloff 2005. Harvests subjective and objective sentences it is certain about from unannotated data. Part of OpinionFinder: www.cs.pitt.edu/mpqa/

  • Subjectivity Tagging for WSD: Sentences of the SENSEVAL-3 data that contain a target noun (e.g., "atmosphere") are tagged with S, O, or B by the Subjectivity Classifier. The tags are fed as input to the subjectivity-aware WSD system.

  • WSD using Subjectivity TaggingHypothesis: instances of subjective senses are more likely to be in subjective sentences, so sentence subjectivity is an informative feature for WSD of words with both subjective and objective senses

  • Words with S and O Senses: 4.3% error reduction; significant (p < 0.05, paired t-test). ("S sense not in data" flags words whose subjective sense does not occur in the data.)

    Classifier accuracy per word (baseline vs. basic vs. basic + subjectivity feature):

    Word        Senses  Baseline  basic   + subj
    argument       5     49.4%    51.4%   54.1%
    atmosphere     6     65.4%    65.4%   66.7%
    difference     5     40.4%    54.4%   57.0%
    difficulty     4     17.4%    47.8%   52.2%
    image          7     36.5%    41.2%   43.2%
    interest       7     41.9%    67.7%   68.8%
    judgment       7     28.1%    40.6%   43.8%
    plan           3     81.0%    81.0%   81.0%
    sort           4     65.6%    66.7%   67.7%
    source         9     40.6%    40.6%   40.6%
    Average              46.6%    55.6%   57.5%

  • Words with Only O Senses: Overall 2.2% error reduction; significant (p < 0.1).

  • Conclusions for Section 4: Can subjectivity labels be assigned to word senses? Manually: good agreement (Kappa = 0.74), and very good when uncertain cases are removed (Kappa = 0.90). Automatically: the method substantially outperforms the baseline. This shows the feasibility of assigning subjectivity labels at the fine-grained level of word senses.
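    The Kappa figures above measure chance-corrected agreement between two annotators. As a reference point, here is a small sketch of Cohen's kappa; the label lists are made-up examples, not the study's annotations.

```python
# Sketch: Cohen's kappa for two annotators labeling senses S or O.
from collections import Counter

def cohens_kappa(labels_a, labels_b):
    n = len(labels_a)
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    ca, cb = Counter(labels_a), Counter(labels_b)
    # Chance agreement: probability both pick the same label independently.
    expected = sum((ca[l] / n) * (cb[l] / n)
                   for l in set(labels_a) | set(labels_b))
    return (observed - expected) / (1 - expected)

kappa = cohens_kappa(["S", "S", "O", "O", "S", "O"],
                     ["S", "O", "O", "O", "S", "O"])
```

    Values around 0.74 are conventionally read as good agreement; 0.90 as very good.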

  • Conclusions for Section 4: Can subjectivity analysis improve word sense disambiguation? The quality of a WSD system can be improved with subjectivity information. It improves performance mainly for words with both S and O senses (4.3% error reduction; significant, p < 0.05); performance largely remains the same or degrades for words that don't. Once senses have been assigned subjectivity labels, a WSD system could consult them to decide whether to consider the subjectivity feature.

  • Pointers To Related Work: Tutorial held at EUROLAN 2007, "Semantics, Opinion, and Sentiment in Text", Iaşi, Romania, August. Slides and bibliographies: www.cs.pitt.edu/~wiebe/EUROLAN07

  • Conclusions: Subjectivity is common in language, and recognizing it is useful in many NLP tasks. It comes in many forms and is often context-dependent. A wide variety of features seem to be necessary for opinion and polarity recognition. Subjectivity may be assigned to word senses, promising improved performance for both subjectivity analysis and WSD, and promising as well for multi-lingual subjectivity analysis (Mihalcea, Banea & Wiebe 2007).

  • Acknowledgements

    CERATOPS, Center for the Extraction and Summarization of Events and Opinions in Text. Pittsburgh: Paul Hoffmann, Josef Ruppenhofer, Swapna Somasundaran, Theresa Wilson. Cornell: Claire Cardie, Eric Breck, Yejin Choi, Ves Stoyanov. Utah: Ellen Riloff, Sidd Patwardhan, Bill Phillips. UNT: Rada Mihalcea, Carmen Banea. NLP@Pitt: Wendy Chapman, Rebecca Hwa, Diane Litman,

    Let us look for a moment at question answering.

    Consider the difference between fact-based and opinion-oriented questions.

    Answering fact-based questions allows a different strategy than opinion-oriented ones: a single instance of the relevant fact is often enough for factual questions.

    The corpus annotation scheme we follow is presented in the reference above. This and other references mentioned in this talk are Listed in the bibliography that comes with the slides.

    Finally, we also annotate reports of speech that present a statement in objective form.

    The way we record the private states that are referred to in a text is via annotation frames that have appropriate attributes.

    The anchor is the linguistic expressionthe stretch of textthat tells us that there is a private state.

    The source is the person to whom the private state is attributed. Note that this can be a path.

    The target is the content of the private state or what the private state is about.

    We also annotate an attitude type. If not specified, it is to be understood as neutral but can be set to positive or negative as required.

    Intensity records the intensity of the private state as a whole.

    The expression intensity is different; it is just a property of the anchor. Cf. "fumed the next day".

    The annotation scheme that I just described has been used in building a corpus of annotated news articles.

    535 documents, 11,114 sentences. The corpus also includes contextual polarity annotations, which I haven't mentioned yet and which we'll see later.

    In lexicons used for sentiment analysis, words are tagged with what we call prior polarity: out of context, does a word seem to evoke something positive or something negative? For example, "beautiful" versus "horrid".

    However, words may appear in phrases that express a different polarity in context. Although the prior polarity of "horrid" is negative, "wonderfully horrid" has positive polarity in its context. We call this contextual polarity.

    So we're asking two questions here: is the word from the lexicon that we find in a text used subjectively? If yes, is it used in the way we expected? The idea was to take a two-step approach to the problem, beginning with a large lexicon of words marked with prior polarity. The first step takes all instances and determines whether the phrase containing each one is neutral or polar; the second step takes all those that are polar and disambiguates their contextual polarity.

    We use machine learning and a variety of features, and achieve significant results for automatically identifying the polarity of a large subset of sentiment expressions. Because we are doing supervised learning, we need sentiment expressions with contextual polarity judgments. What we had were subjective expressions annotated in the Multi-Perspective Question Answering (MPQA) Opinion Corpus: words and phrases expressing opinions, emotions, evaluations, stances, speculations, and other types of subjectivity.

    However, sentiment expressions are a subset of subjective expressions

    So we decided to annotate the subjective expressions in the MPQA corpus with their contextual polarity. Specifically, each subjective expression was marked with a contextual polarity of positive, negative, both, or neutral. For example, in these sentences the subjective expressions are in blue. We have already seen the first sentence: "generally approved" has positive contextual polarity and "denounced" has negative contextual polarity. In the second sentence, "good" and "evil" are marked with a contextual polarity of both. In the last sentence, "feels" is marked as neutral.

    A final note on the annotation scheme: annotators were asked to judge the contextual polarity of the sentiment that was ultimately being conveyed. So, in this sentence, the phrase "have not succeeded, and will never succeed" has a positive contextual polarity: although "not succeed" is often negative, to not succeed in breaking the will of a valiant people is positive.

    In each of the classification steps we use a variety of features. Many were inspired by the paper on contextual valence shifters by Polanyi and Zaenen, which discusses a number of things that influence contextual polarity. For example, more things than simple negation can flip polarity: "little" before "threat" shifts negative to positive, while "little" before "truth" shifts positive to negative.

    Others capture dependency relationships between words, based on the dependency parse of the sentence. For example, whether a word is being modified by a positive word can influence the polarity of the expression: as we saw before, "wonderfully" modifying "horrid" creates a positive expression, even though "horrid" itself has a negative prior polarity.

    Word features include things about the word instance, such as its part of speech and the words around it, plus information from the lexicon: the word's prior polarity and its reliability. The modification features are binary features. Some capture whether the word is preceded by certain parts of speech or certain types of words, like intensifiers. Others capture when words from the lexicon modify each other, taking into account their reliability class; these are determined by looking in the dependency parse tree for the sentence. For example, we have a feature for "challenge" that captures that it is modified by the clue "substantial", which in this case is also an intensifier.

    Details of which items co-occur are important: "He was treating me awfully" versus "That's awfully nice, sweet". There are three structure features that represent some information about a word's place in the parse tree. For example, many subjective expressions involve the copula, and previous work showed that a particular word may be more or less likely to be subjective if it is in the passive. We also used some contextual features from previous work: sentence features that were found useful for sentence-level subjectivity classification.

    Finally, we have one document feature representing the document topic. We introduced this feature because we found that documents on certain topics often contained many subjective words that were being used in objective contexts. For example, a document on health may contain the word "fever", but the word is not being used to express a sentiment as in, e.g., "football fever". The word token and word prior polarity features are the same as before. We have two features to capture different types of negation. The negated feature captures negations that are more local, such as "not good" or "does not look very good". However, certain phrases, such as "not only", intensify rather than negate an expression: in "not only good, but amazing", "not" is not negating the positive attitude. We check for these when determining the value of this feature.

    With the negated-subject feature we look for a longer-distance dependency. For example, in the example sentence, "support" is being negated because the subject is being negated, giving it a negative sentiment rather than a positive one. As before, we have features to capture when words from the lexicon modify each other, this time taking into account their prior polarity; these are again determined using the dependency parse tree for the sentence. For "substantial challenge", we have a feature capturing that "challenge" is modified by a word with positive prior polarity, and a feature for "substantial" capturing that it modifies a clue with negative prior polarity. The conjunction polarity feature is similar: it captures whether a word is in a conjunction with another word from the lexicon; for "good", it captures that it is in a conjunction with a negative word. The last three features look in a window of four words before, searching for particular types of polarity influencers: general polarity shifters flip the polarity of a word or phrase, negative polarity shifters shift polarity to negative, and positive polarity shifters shift polarity to positive.
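    The "look back four words for a polarity shifter" features can be sketched as below. The tiny shifter sets are made-up stand-ins for the real lexicon, and the function name is an assumption.

```python
# Sketch: binary features that check the four preceding words for general,
# negative, and positive polarity shifters. Shifter lists are illustrative.
GENERAL_SHIFTERS = {"little", "lack", "hardly"}
NEGATIVE_SHIFTERS = {"barely"}
POSITIVE_SHIFTERS = {"abundant"}

def shifter_features(tokens, target_idx, window=4):
    preceding = tokens[max(0, target_idx - window):target_idx]
    return {
        "general_shifter": any(w in GENERAL_SHIFTERS for w in preceding),
        "negative_shifter": any(w in NEGATIVE_SHIFTERS for w in preceding),
        "positive_shifter": any(w in POSITIVE_SHIFTERS for w in preceding),
    }

# "little" before "truth" fires the general-shifter feature.
feats = shifter_features(["there", "is", "little", "truth", "here"], 3)
```

    A classifier can then learn that, say, a positive-prior-polarity word preceded by a general shifter often has negative contextual polarity.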

    Read through the examples. There is continued research on enhancing dictionaries such as WordNet; this link points to close to 50 WordNet projects for various languages. A growing number of research groups participate in large-scale evaluations such as SENSEVAL.

    And, if we care about improving our lexical resources, attention to the subjectivity of senses could, for example, reveal missing senses. Note that we show a specific sense: we are considering only the sense listing in WordNet, not looking at a corpus.

    There are a number of methods for distributional similarity that have been used in NLP; I used Dekang Lin's work. The general idea is to assess word similarity based on the distributional pattern of words. First, the corpus is parsed with Lin's dependency grammar, and dependency triples are extracted. To assess the similarity between word1 and word2, look at the words correlated with word1 and the words correlated with word2. Intuitively, the more overlap there is between these sets, the more similar the words are judged to be. The similarity metric takes into account not only the size of the intersection set, but also the mutual information between each word and the members of the intersection set.

    For reference, more precisely: sum the mutual information measures of each word with the members of the intersection, and divide by the sum of the mutual information measures of each word with all of its correlated words. And of course we evaluated the system on the data used for SENSEVAL-3, which I'll refer to as the S3 data.
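    The measure described above can be sketched as a short function. This is an illustrative Lin-style similarity under assumed inputs (dictionaries mapping correlated words to mutual-information values); the numbers in the example are made up.

```python
# Sketch of the distributional similarity described above:
# sim(w1, w2) = (MI of w1 and of w2 with words in the intersection of their
# correlated-word sets) / (total MI of w1 plus total MI of w2).

def lin_similarity(mi1, mi2):
    """mi1, mi2: dicts mapping correlated words to mutual information."""
    shared = set(mi1) & set(mi2)
    numer = sum(mi1[w] for w in shared) + sum(mi2[w] for w in shared)
    denom = sum(mi1.values()) + sum(mi2.values())
    return numer / denom if denom else 0.0

sim = lin_similarity(
    mi1={"bank": 2.0, "money": 1.0, "river": 1.0},
    mi2={"bank": 1.5, "money": 0.5},
)
```

    Large overlap with high-MI correlates pushes the score toward 1; disjoint correlate sets give 0.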