A lexicon model for deep sentiment analysis and opinion ... lexicon model for deep sentiment analysis and opinion mining applications Isa Maks Piek Vossen VU University, Faculty of

1

A lexicon model for deep sentiment analysis and opinion mining

applications

Isa Maks Piek Vossen

VU University, Faculty of Arts

De Boelelaan 1105, 1081 HV Amsterdam,

The Netherlands

VU University, Faculty of Arts

De Boelelaan 1105, 1081 HV Amsterdam,

The Netherlands

[email protected] [email protected]

Abstract

This paper presents a lexicon model for the description of verbs, nouns and adjectives to be used in appli-

catons like sentiment analysis and opinion mining. The model aims to describe the detailed subjectivity re-

lations that exist between the actors in a sentence expressing separate attitudes for each actor. Subjectivity

relations that exist between the different actors are labeled with information concerning both the identity of

the attitude holder and the orientation (positive vs. negative) of the attitude. The model includes a categori-

zation into semantic categories relevant to opinion mining and sentiment analysis and provides means for

the identification of the attitude holder and the polarity of the attitude and for the description of the emo-

tions and sentiments of the different actors involved in the text. Special attention is paid to the role of the

speaker/writer of the text whose perspective is expressed and whose views on what is happening are con-

veyed in the text. Finally, validation is provided by an annotation study that shows that these subtle subjec-

tivity relations are reliably identifiable by human annotators.

1 Introduction

Subjectivity can be conveyed in various ways, at various text levels, and by various types of expressions.

We explore to what extent subjectivity is a property that can be associated with words and word sense

taking into consideration the detailed and subtle subjectivity relations that exist between words in a sen-

tence.

Consider the following example:

2

Ex. (1) … Damilola’s killers were boasting about his murder...

This sentence expresses a positive sentiment of the killers towards the fact they murdered Damilola and it

expresses the negative attitude on behalf of the speaker/writer towards the murderers of Damilola. Both

attitudes are part of the semantic profile of the word and should be modelled in a subjectivity lexicon.

As opinion mining and sentiment analysis applications tend to utilize more and more the composition of

sentences (Moilanen (2007), Choi and Cardie (2008), Jia et al. (2009)) and to use the value and properties

of the words expressed by its dependency trees, there is a need for specialized lexicons where this infor-

mation can be found. For the analysis of more complex opinionated text like news, political documents,

and (online) debates the identification of the attitude holder and topic are of crucial importance. Applica-

tions that exploit the relations between the word meaning and its arguments can better determine senti-

ment at sentence-level and trace emotions and opinions to their holders.

Our model seeks to combine the insights from a rather complex model like Framenet (Ruppenhofer et al.

(2010)) with operational models like Sentiwordnet where simple polarity values (positive, negative, neu-

tral) are applied to the entire lexicon. Subjectivity relations that exist between the different actors are

labeled with information concerning both the identity of the attitude holder and the orientation (positive

vs. negative) of the attitude. The model accounts for the fact that words may express multiple attitudes. It

includes a categorization into semantic categories relevant to opinion mining and sentiment analysis and

provides means for the identification of the attitude holder and the polarity of the attitude and for the de-

scription of the emotions and sentiments of the different actors involved in the text. Attention is paid to

the role of the speaker/writer of the event whose perspective is expressed and whose views on what is

happening are conveyed in the text.

As we wish to provide a model for a lexicon that is operational and can be exploited by tools for deeper

sentiment analysis and rich opinion mining, the model is validated by an annotation study of 580 verb,

595 noun and 609 adjective lexical units.

3

2 Related Work

In recent years many sentiment analysis and opinion mining applications have been developed to

analyse opinions, feelings and attitudes about products, brands, and news, and the like. These

applications mine opinions from different sources like online forums and news sites (Soma-

sundaran and Wiebe 2010) and from movie, product and hotel reviews (Duan et al. 2008, Kai-

quan et al. 2011). Many of these tools rely on manually built or automatically derived polarity and

subjectivity lexicons and, in particular for English, a couple of smaller and larger lexicons are available.

These lexicons are lists of words (for example, Hatzivassiloglou and McKeown (1997), Kamps et al.

(2004), Kim and Hovy (2004)) or list of word senses (for example, Esuli and Sebastiani (2006), Wiebe

and Mihalcea (2006), Su and Markert, (2008)) annotated for negative or positive polarity. Important as-

pects – as illustrated by examples (2a-d) - of these existing annotation schemata are (1) tagging at word

sense level instead of word level (2) distinction between objective and subjective words and (3) not only

subjective words but also objective words can have a negative or positive connotation.

Ex. (2)

(a) (Subjective:Negative) angry—feeling or showing anger; “angry at the weather; “angry customers; an angry

silence”

(b) (Subjective:Positive) beautiful—aesthetically pleasing

(c) (Objective:NoPolarity) alarm clock, alarm – a clock that wakes the sleeper at a preset time

(d) (Objective: negative) war, warfare – the waging of armed conflict against an enemy; ”thousands of people

were killed in the war”

All the above annotation schemata attribute single polarity le polarity values (positive, negative, neutral)

to words and therefore cannot account for more complex cases like boast (cf. ex. 1) which carry both

negative and positive polarity.

A more elaborate model is proposed by Strapparava and Valitutti (2004 and 2010) forwho developed

Wordnet-Affect, an affective extension of Wordnet. It describes ‘direct’ affective synsets which denote

4

emotions and ‘indirect’ affective synsets which refer to emotion carriers. Direct affective synsets are clas-

sified into emotion, cognitive state, trait, behavior, attitude and feeling. ‘Indirect’ affective synsets are

subdivided according to a specific appraisal model of emotions (OCC) that defines the effects of the emo-

tion to the subjects (actor, actee and observer) semantically connected to it. For example, the word vic-

tory, if localized in the past, can be used to express pride (related to the actor or “winner”), and

disappointment (related to the actee or “loser”). If victory is a future event the expressed emotion is hope.

Their model is similar to ours, as we both relate attitude to the participants of the event. However, their

model focuses on a rich description of different aspects and implications of emotions for each actor

whereas we infer a single positive or negative attitude for each actor. Their model seems to focus on the

cognitive aspects of emotion whereas we aim to also model the linguistic aspects by including specifically

the attitude of the Speaker/Writer in our model. Moreover, our description is not at the level of the synset

but at the lexical unit level which enables us to differentiate gradations of the strength of emotions of the

synonyms within the synsets. Moreover, it enables us to relate the attitudes directly to the syntactic-

semantic patterns of the lexical unit.

Framenet (Ruppenhofer et al. (2010)) is another important resource that is used in opinion mining and

sentiment analysis (Kim and Hovy (2006)). Framenet (FN) is an online lexical resource for English that

contains more than 11,600 lexical units. The aim is to classify words into categories (frames) which give

for each lexical unit the range of semantic and syntactic combinatory possibilities. The semantic roles

range from general ones like Agent, Patient and Theme to specific ones such as Speaker, Message and

Addressee for Verbs of Communication. FN includes frames such as Communication, Judgment, Opin-

ion, Emotion_Directed and semantic roles such as Judge, Experiencer, Communicator which are highly

relevant for opinion mining and sentiment analysis. However, subjectivity is not systematically and not

(yet) exhaustively encoded in Framenet. For example, the verb gobble (eat hurriedly and noisily) belongs

to the frame Ingestion (consumption of food, drink or smoke) and neither the frame nor the frame ele-

ments account for the negative connotation of gobble. Yet, we think that a resource like FN with rich and

5

corpus based valency patterns is an ideal base/ starting point for including subjectivity descriptions in

frame structures that can then be used to recognize subjectivity relations in text.

None of these theories, models or resources is specifically tailored for the subjectivity description of

verbs. Studies which focus on verbs for sentiment analysis, usually refer to smaller subclasses like, for

example, emotion verbs (Mathieu, 2005, Mathieu and Fellbaum, 2010) or quotation verbs (Chen 2005,

2007).

3 Subjectivity in the Lexicon

In the context of existing sentiment and subjectivity lexicons, the notion of subjectivity focuses on posi-

tive and negative polarity. Other types of subjectivity, like for example epistemic subjectivity which deals

with how certain or uncertain a speaker is about a given state-of-affairs, or speaker subjectivity which

deals with the ‘self-expression’ of the speaker, are considered less important. Obviously, this is related to

the applications these lexicons are intended for like sentiment and opinion mining which focus on views,

emotions and beliefs pro or against a certain product, film or point-of-view. However, if we consider

definitions of subjectivity in other linguistic areas, the focus is quite different (Pit 2003). Much attention

is paid to the presence of the speaker in the text and to what extent the speaker is needed as the reference

to be able to interpret the meaning of the sentence (Pit 2003). A distinction is made between speaker sub-

jectivity and character subjectivity (which for the remainder of this paper will be called actor subjectiv-

ity). Actor subjectivity refers to the representation of a character’s thoughts, emotions, utterances, and

perceptions and Speaker subjectivity refers to the representation of the thoughts, emotions, utterances, and

perceptions of the speaker. The following examples illustrate the different types.

Ex. (3)

a) He is a liar

Negative attitude of Speaker/Writer towards ‘he’

b) Bush is bad for the economy …

6

Negative attitude of Speaker/Writer towards Bush

Sentences (3a) and (3b) are illustrations of speaker subjectivity. Both express a negative attitude of the

speaker towards the participants, i.e. ‘he’ and Bush, respectively.

c) His hatred for religion ….

Negative Attitude of ‘his’ (he) towards ‘religion’

d) Bush is angry over Obama's leeking of private conversation …

Negative attitude of Bush towards Obama

e) T.F. upset her daughter by going …

Negative Attitude of ‘her daughter’ towards ‘T.F’

f) He urges her to turn off the phone …

Negative attitude of ‘he’ towards ‘her’and Positive attitude of ‘he’ towards ‘turning off her phone’ Negative at-

titude of ‘her’ towards the ‘urging’ action

g) They are enthusiastic about the service …

Positive attitude of ´they´ towards ´the service´

h) He is boasting about his daughter

Positive attitude of ‘he’ towards ‘his daughter’ and Negative attitude of the Speaker/Writer towards ‘he’.

Sentences (3c) to (3g) are illustrations of actor subjectivity. In all cases, the attitude of one or more actors

is expressed. The speaker is present – being the reporter of the participants’ attitudes – but he is neutral.

The actors may have different syntactic functions in the sentence, e.g. subject (cf. 3d, 3f, 3g ), object (cf.

3e, 3f) , or possessive (3c ), depending on the argument structure of the sentiment word. Sentence (3h) is

an illustration of both actor and speaker subjectivity combined in one word, i.e. boast.

In sentences (3a) to (3h) the identification of the kind of subjectivity and the possible inferences about

attitude holder and attitude target, are part of the semantics of the sentiment-bearing words. For example,

if we know that enthusiastic is an adjective which triggers actor subjectivity then we know that the atti-

7

tude holder is realized either by the subject of the sentence (e.g. they are enthusiastic …) or by the modi-

fied noun (e.g. enthusiastic supporters...). We also know that the PP-complement with about realizes the

attitude target. Likewise, if we know that bad is an emotion adjective which triggers speaker subjectivity

then we know that the attitude holder is the speaker or writer of the sentence and that the attitude target is

either realized by the subject (e.g. Bush is bad) or by the modified noun (e.g. bad children..). Thus, the

subjectivity type in combination with the argument structure, can be used to identify the attitude holder

and the attitude target.

This kind of information is important for subjectivity analysis. According to Banea et al. (2010) subjectiv-

ity analysis is “the task of identifying when a private state is being expressed and identifying attributes of

the private state. Attributes of private states include who is expressing the private state, the type(s) of

attitude being expressed, about whom or what the private state is being expressed, the polarity of the pri-

vate state, (i.e., whether it is positive or negative), and so on”.

In the following section we explain how we want to incorporate these aspects in a subjectivity lexicon.

4 Design of the Lexicon Model

The proposed model is built as an extension of an already existing lexical database for Dutch, i.e. Cor-

netto (Vossen et al. 2008). Cornetto combines two resources with different semantic organizations: the

Dutch Wordnet (DWN) which has, like the Princeton Wordnet, a synset organization and the Dutch Ref-

erence Lexicon (RBN) which is organized in form-meaning composites or lexical units. The description

of the lexical units includes definitions, usage constraints, selectional restrictions, syntactic behaviors and

illustrative contexts. DWN and RBN are linked to each other by the constraint that each synonym in a

synset is linked also a lexical unit in RBN. As argued in Maks and Vossen (2010), all synonyms (lexical

units) in a synset have the same denotation and the same overall polarity (positive, negative or neutral).

8

Lexical units within the synsets can however differ with in the gradation of the polarity or the frame reali-

zation of the subjectivity. The latter subjectivity information is therefore modeled as an extra layer for

each lexical unit representing subjectivity at word sense level.

In the next sessions we will further describe the lexical model for the subjectivity relations at word sense

level. The model consists of a categorization into semantic classes (section 4.1) and complementation

patterns (section 4.2 ).

4.1 Semantic Classes

4.1.1 Verbs

For the identification of relevant semantic classes we adopt – and broaden – the definition of subjective

language by Wiebe et al. (2006). Subjective expressions are defined as words and phrases that are used to

express private states like opinions, emotions, evaluations, and speculations.

Three main types are distinguished:

• Type I :

Direct reference to private states (e.g. his alarm grew, he was boiling with anger). We include in this cate-

gory emotion verbs (like feel, love and hate) and cognitive verbs (like defend, dare, realize etc.) . Ac-

cordintype coincides with an actor subjectivity category. Type I members have actor subjectivity.

• Type II:

Reference to speech or writing events that express private states (e.g. he condemns the president, they

attack the speaker). According to our schema, this category includes all speech and writing events and the

annotation schema points out if they are neutral (say, ask) or bear polarity (condemn, praise). Type II

members have actor subjectivity.

• Type III:

9

The third category contains 'expressive subjective elements' which express indirectly the private state of

the speaker/writer (e.g. superb, that doctor is a quack). Verbs that fall in this category are often also a

member of one of the other categories. For example, boast (cf. ex. 1) is both a Type II (i.e. speech act

verb) verb and a Type III verb as it expresses the negative attitude of the speaker/writer towards the

speech event. By considering this category as combinational, we can make a distinction between

Speaker/Writer subjectivity and Actor subjectivity.

Type Semantic Category Description Example

I (+III) EXPERIENCER Verbs that denote emotions. Included are both experiencer

subject and experiencer object verbs.

hate, love, enjoy, enter-

tain, frighten, upset,

frustrate, cry

I(+III) ATTITUDE A cognitive action performed by one of the actors, in

general the structural subject of the verb. The category is

relevant as these cognitive actions may imply attitudes

between actors.

defend, think, dare,

ignore, avoid, feign,

pretend, patronize,

devote, dedicate

II(+III) JUDGMENT A judgment (mostly positive or negative) that someone

may have towards something or somebody. The verbs

directly refer to the thinking or speech act of judgment.

praise, admire, rebuke,

criticize, scold, re-

proach, value, rate,

estimate

II(+III) COMM-S A speech act that denotes the transfer of a spoken or writ-

ten message from the perspective of the sender or speaker

(S) of the message. The sender or speaker is the structural

subject of the verb.

speak, say, write, grum-

ble, stammer, talk,

email, cable, chitchat,

nag, inform

II(+III) COMM-R A speech act that denotes the transfer of a spoken or writ-

ten message from the perspective of the receiver(R) of the

message. The receiver is the structural subject of the verb

read, hear, observe,

record, watch, compre-

hend

IV(+III) ACTION A physical action performed by one of the actors, in gen-

eral the structural subject of the verb. The category is

relevant as in some cases actors express an attitude by

performing this action.

run, ride, disappear, hit,

strike, stagger, stumble,

beat

IV(+III) PROCESS_STATE This is a broad and underspecified category of state and

process verbs (non-action verbs) and may be considered as

a rest category as it includes all verbs which are not in-

cluded in other categories.

grow, be, disturb, driz-

zle, muzzle,

Table (1) Semantic categorization of verbs

10

• Type IV (actor subjectivity)

In addition, we specify a fourth category. Consider the following examples:

Ex. (4) the teacher used to beat the students

Ex. (5) C.A is arrested for public intoxication by the police

Both beat and arrest are observable actions and therefore usually considered 'objective'. However, in

many contexts these verbs refer to the private state of one of the actors. In ex. (4) the teacher and the stu-

dents will have bad feelings towards each other and also in ex. (5) C.A. will have negative feelings about

the situation. Type IV thus make reference to a private state that is the source or the consequence of an

event (action, state or process) while the event is explicitly mentioned.

Verb senses which are categorized as Type I, II or III are always considered as subjective; verb senses

categorized as Type IV are only subjective if one of the annotation categories (see below for more details)

has a non-zero value; otherwise they are considered as objective.

The actor subjectivity types (type I, I and IV) tend to correlate with global semantic categories. We there-

fore include the semantic category to support more consistent manual annotation and automatic acquisi-

tion of the subjectivity of words. Table (1) presents the different semantic categories with examples for

each. The first column lists the potential subjectivity classes that can apply.

4.1.2 Nouns

Nouns are in the Cornetto Database categorized in a semantic classification which consists of the follow-

ing classes: Human (+animate), Nonhuman (+animate), Institution, Artifact (+concrete), Sub-

stance(+concrete), Concrother (+concrete: concrete entities other than artifact or substance), Time (-

animate), Place (-animate), Measure (-animate), Institution, and Abstract (-animate) .

We added 10 more subclasses to Abstract for their relevance to the subjectivity lexicon: Social Object,

Mental Object, Quality, Attitude, Behavior, Situation, Emotion, Physical State, Social State and Event.

Finally, we merged the classes Substance and Concrother into one class labeled Physical Object. Table

11

(2) presents the resulting categories with examples for each.

Name Description Examples

ARTIFACT Something made or given shape by man table, house, bicycle, sack

PHYSICAL

OBJECT

A tangible and visible entity (other than artifact) arm, sand, coffee, trash

INSTITUTION Institution or group of persons school, advisory board

NONHUMAN Reference to an animal cat, mongrel

HUMAN Reference to a person girl, nurse, monster

PHYSICAL

STATE

The physical condition of a person healthiness, virginity

SOCIAL STATE The social situation of a person or group of persons wealth, unemployment

SOCIAL OBJECT A product of groups or of their social behavior punk, law, lifestyle, religion

EMOTION Nouns that denote emotions and emotional responses joy, feeling, smile

MENTAL

OBJECT

A cognitive product of an individual reaction, idea, recommendation

ATTITUDE A mental state to act in certain ways supervision, resistance, falling out

BEHAVIOR The way a person behaves arrogance, friendliness

SITUATION Description of a state-of-affairs catastrophe, disaster

QUALITY A property of persons, objects or other entities size, intelligence, beauty

TIME A period of time hour, day, school time

PLACE Location Belgium, suburb

MEASURE

How much there is of how many there are of

something

degree, meter, yard

EVENT

Something that happens at a given time or place (and

not categorized in any of the other classes)

growth, meeting, orthodontics

Table (2) Semantic categorization of nouns

4.2 Attitude

In our model, subjectivity is defined in terms of attitude holders carrying attitude towards each other or

towards the events in the text (cf. section 3). For example, experiencers hold attitudes towards targets and

communicators express a judgment about an evaluee. We developed an annotation schema (see Table 2

12

below) which enables us to relate the attitude holders and the orientation of the attitude (positive, negative

or neutral). In the case of nouns and verbs, the syntactic valencies of the verb or noun are associated with

the respective attitude holders.

To be able to attribute the attitudes to the relevant actors we identify for each form-meaning unit the se-

mantic category, the semantic-syntactic the arguments, the associated semantic roles and some coarse

grained selection restrictions.

4.2.1 Actor’s Attitude

• Verbs

We formulated simple argument structure for the verbs specifying the number of arguments, some coarse

grained selectional restrictions, and the use of prepositions. For example:

SMB (A1) give (A0) SMT (A2) to SMB (A3)

A1, A2 and A3 are the arguments of the verbs which are lexicalized by the structural subject (A1), the

direct object (A2 or A3) and indirect/prepositional object (A2 or A3). A2 and A3 both can be syntacti-

cally realized as an NP, a PP, that-clause or infinitive clause. Each actor is associated with coarse-grained

selection restrictions: SMB (somebody; +human), SMT (something; -human) or SMB/SMT (some-

body/something; +/– human). If the syntactic realization consists of a PP-phrase the preposition is in-

cluded in the argument structure (cf. to in the argument structure of give).

Attitude (positive, negative and neutral) can be attributed to arguments if they represent +human entities

and are targeted towards the other categories either +animate or -animate entities. Annotation categories

are labeled as:

• A1A2 which means that the attitude is held by A1 and is targeted towards A2;

• A1A3 which means that the attitude is held by A1 and is targeted towards A3

• A1A0, A2A0 or A3A0, which means that the attitude is held by A1, A2 or A3, respectively and is

targeted towards A0 which is the event or situation itself.

13

The following examples illustrate how the annotations refer to the complementation patterns. Other ex-

amples can be found in table (5) below.

• Verdedigen (to defend: argue or speak in defense of)

Transliteration: He (A1) defends (A0) his decision (A2) against critique (A3)

Complementation Pattern: SMB(A1) defend (A0) SMT/SMB (A2) against SMT/SMB (A3)

Annotations: A1A2 positive; A1A3 negative; A1A0 NONE; A2A0 NONE; A3A0 NONE

• Verliezen (to lose: miss from one's possessions)

Transliteration: He (A1) loses (A0) his sunglasses (A2) like crazy

Complementation Pattern: SMB (A1) lose (A0) SMT (A2)

Annotations: A1A2 NONE; A1A3 not relevant ; A1A0 negative; A2A0 NONE;

A3A0 not relevant

• Nouns

Nouns can denote concrete objects or substances (so-called first-order entities) or they can denote events,

states, properties and relations (so-called higher-order entities). Complementation patterns of first-order

entities are usually simple, whereas patterns of higher-order entities are similar to those of verbs, but - in

our model - do not have more than two arguments. For example:

SMB’s (A1) complaint (A0) against SMT/SMB (A2)

A1 is the logical subject in the case of an event noun (like complaint ) and A2 is always realized through

a prepositioanl phrase. In the case of other semantic types, A1 may have another function, for example

‘someone having’ (A1’s arrogance, A1’s friend).

Attitude (positive, negative and neutral) can be attributed to arguments if they represent +human entities

and are targeted towards the other arguments representing either +animate or -animate entities. Annota-

tion categories are labeled as

- A1A2/0 which means that the attitude is held by A1 and is targeted towards A2 or – if there is no

A2 – towards A0;

14

- A2A0/1 which means that the attitude is held by A2 and is targeted towards A0 or A1; applied

only when different from A1A2/0.

The following examples illustrate how the annotations refer to the complementation patterns. Other ex-

amples can be found in table (4) below.

• verdriet (sadness : emotions experienced when not in a state of well-being )

Transliteration: The Dalai Lama (A1) expresses his (A1) sadness (A0) over the recent earth

quake in New Zealand (A2)

Complementation Pattern: SMB’s(A1) sadness (A0) over SMT/SMB (A2)

Annotations: A1A0/2 negative ; A2A0/1 none

• vriend (friend : a person you know well and regard with affection and trust)

Transliteration: he was my (A1) friend (A0)

Complementation Pattern: SMB’s(A1) vriend

Annotations: A1A0/2 - positive ; A2A0/1 n.r.

• tabel (table: a set of data arranged in rows and columns)

Transliteration: as can be seen from the following table (A0)

Complementation Pattern: no complementation pattern

Annotations: no annotations

• Adjectives

With respect to adjectives no complementation patterns are specified. If the adjective is labeled with

actor subjectivity and used attributively (cf. ex. 6) the noun of the adjective-noun combination is consid-

ered the attitude holder; if the adjective is labeled with actor subjectivity and used predicatively (cf. ex. 7,

and cf. section 3) the subject of the sentence is considered the attitude holder. Actor Subjectivity of ad-

jectives includes two types: the Experiencer who inactively undergoes an emotion (cf. ex. 6) and the

Agent who actively takes a stance or attitude (cf. ex. 7). In the annotation schema these two types are

both labeled as actor subjectivity (A1).

15

ex. (6) …excited fans welcome Obama…

ex. (7) McCain was critical of Iraq war from the beginning ….

Other examples can be found in table (2) below.

4.2.2 Speaker/Writer’s attitude (SW)

The attitude of the speaker is conveyed in a way that differs considerably from that of the actors. There is

no reference to the attitude holder (unless the text is written in first person like, for example, I am against

the ban on English shirt during the World Cup). The Speaker/Writer (SW) expresses his attitude by

choosing words with affective connotation (cf. ex. 8, 9 and 10) or by conveying his subjective interpreta-

tion of what happens (cf. ex. 11 and 12).

ex. 8: He gobbles down three hamburgers

ex. 9: She is a quack

ex. 10: That quack couldn’t help her

ex. 11: awesome here!

ex. 12: B. S. misleads district A voters

In ex. (8), the SW describes the eating behavior of the ‘he’ and he also expresses his negative attitude

towards this behavior by choosing the negative connotation word gobble. Since the behaviour is carried

out by the A1 (the structural subject), the judgment also applies to the A1. In ex.(9), a noun that denotes

people is used as a predicate of the subject and likewise expresses the judgment of the SW of the subject.

In the case of ex. (10), quack is the subject of the sentence and therefore SW’s judgment applies directly

to the denotation of subject. In ex. (11) the speaker expresses his positive attitude about his current situa-

tion by just describing it. Finally in ex. (12), the SW expresses his negative attitude towards the behavior

of the subject of the sentence by pointing to the negative implications of the event.

16

We use a separate property (SW) to indicate the explicit judgment of the Speaker/Writer, as can be seen

from tables 3, 4 and 5 below.

4.2.3 Everybody’s Attitude (ALL)

Some concepts are considered negative by a vast majority of people and therefore express a more general

attitude shared by most people. For example, to die will be considered negative by everybody, i.e. observ-

ers, speakers and listeners. These concepts are ‘objective’ in the sense that they refer to measurable and

observable concepts. They do not express sentiment but they may evoke it.

Examples are:

ex. (13) …this suit is water-resistant ..

ex. (14) … thousands of casualties after an earthquake in …

In general, people are positive about water-resistant suits, and negative about earthquakes and casualties.

In the annotation schema Everybody’s attitude is indicated by the feature ALL.

4.3 Example of the Full Annotation Schema

The annotation model for adjectives, nouns and verbs is illustrated by the table (3), (4) and (5), respec-

tively.

Adjectives

FORM SUMMARY A1 SW ALL

gelukkig (happy) experiencing pleasure or joy +4 - -

kritisch (critical) inclined to criticize -4 0 -

mooi (beautiful) aesthetically pleasing 0 4 -

slecht (bad) having undesirable or negative qualities 0 -5 -

waterbestendig (water-

resistant) hindering the penetration of water 0 0 +2

kreupel (crippled) disabled in the feet or legs 0 0 -3

burgerlijk (civil, civilian) applying to citizens 0 0 0

17

Explanation:

Form the annotated item (A0)

Summary short description of the meaning of the annotated item

A1 A1 has a positive or negative attitude

SW SW has a positive or negative attitude

ALL there is a general positive or negative attitude towards the state-of-affairs described by the annotated item

Table (3) Annotation Schema Adjectives

Nouns

FORM SUMMARY Complementation Pattern A1A0/2 A2A0/1 SW ALL

oplawaai (wallop) a powerful stroke with the fist

or a weapon

oplawaai (A0) van SMB(A1) aan

SMB/SMT (A2) -3 0 -2 0

arrogantie (arrogance)

overbearing pride evidenced

by a superior manner toward

inferiors

SMB’s (A1) arrogantie (A0) 0 0 -4 0

gezwam (folderol, rub-

bish) nonsensical talk or writing SMB’s (A1) gezwam (A0) 0 0 -3 0

storing (disturbance) activity that is an intrusion or

interruption - 0 0 0 -2

obsessie (obsession) an unhealthy preoccupation

with something or someone

SMB’s (A1) obsessie (A0) voor SMB/SMT

(A2) (SMB’s obsession for SMB/SMT) +5 0 -3 0

schoenmaker (shoe-

maker)

a person who makes or repairs

shoes - 0 0 0 0

dader (wrongdoer) a person who transgresses

moral or civil law

dader (A0) van SMT (A1) op/tegen

SMB/SMT (A2) 0 -3 0 0

vriend (friend) SMB’s (A1) friend (A0) +3 0 0 0

Explanation:



A1A0/2 A1 has a positive or negative attitude either towards A0 (if there is no A2) or towards A2

A2A0/1 A2 has a positive or negative attitude towards A0 or A1 (only if different from A1A0/2)

SW SW has a positive or negative attitude towards A0 and/or A1


Table (4) Annotation Schema Nouns

18

Verbs

FORM SUMMARY SEMTYPE COMPLEMENTATION A1A2 A1A3 A1A0 A2A0 A3A0 SW ALL

vreten

(devour, gobble)

eat immoderately

and hurriedly ACTION SMT (A2) 2 0 0 0 0 -4 0

afpakken

(take away)

take without the

owner’s consent ACTION SMT(A2) van SMB (A3) 0 0 0 0 -3 0 0

verliezen (lose) lose: fail to keep

or to maintain PROCESS SMT (A2) 0 0 -3 0 0 0 0

dwingen (force)

urge a person to

an action ATTITUDE SMB (A2) tot SMT (A3) -3 2 0 0 0 0 0

opscheppen (boast)

to speak with

exaggeration and

excessive pride

COMM-S over SMB/SMT (A2) 3 0 0 0 0 -4 0

helpen (help)

give help or

assistance ; be of

service

ACTION SMB(A2) met SMT (A3) 2 1 0 0 0 0 0

bekritiseren(criticize) express criticism

of COMM-S SMB (A2) -3 0 0 0 0 0 0

zwartmaken (slander)

charge falsely or

with malicious

intent

COMM-S SMB (A2) -3 0 0 0 0 -4 0

verwaarlozen (neglect) fail to attend to ATTITUDE SMB (A2) -3 0 0 0 0 -4 0

afleggen

(lay out)

prepare a dead

body ACTION SMB (A2) 0 0 0 0 0 0 -1

Explanation:



A1A2 A1 has a positive or negative attitude towards A2





SW SW has a positive or negative attitude towards A0 and/or A1


Table (5) Annotation Schema Verbs

19

5 Inter-annotator Agreement Study

To explore our hypothesis that different attitudes associated with the different attitude holders can be

modeled in an operational lexicon and to explore how far we can stretch the description of subtle subjec-

tivity relations, we performed an inter-annotator agreement study to assess the reliability of the annotation

schema.

We are aware of the fact that it is a rather complex annotation schema and that high agreement rates are

not likely to be achieved. The main goal of the annotation task is to determine to what extent this kind of

subjectivity information can be reliably identified, which parts of the annotation schema are more difficult

than others and perhaps need to be redefined. This information is especially valuable to evaluate the

automatic acquisition of the (parts of) the information specified by the annotation schema.

Annotation is performed by two linguists (i.e. both authors of this paper). We did a first annotation task

for training and discussed the problems before the gold standard annotation task was carried out. The

annotation is based upon the full description of the lexical units including glosses and illustrative exam-

ples.

5.1 Composition of the Sample

We aim at a sample that is representative for the whole lexicon and relevant for subjectivity identification

and annotation. As the sample also serves as a benchmark for evaluation and comparison of the auto-

matically annotated sentiment word lists and must help to improve the methods which derive these word

lists, we specified the following requirements:

• Frequency: most recently used goldstandards for English, like General Inquirer (Stone, 1966) and

Micro-Wnop (Cerini, 2007) seem to be unbalanced with regard to frequency which may lead to a non

representative high degree of subjectivity-ambiguous words (Su et al., 2008). Moreover, as the number of

infrequent or rare words is relatively high in subjective text (Wiebe et al., 2004), we think it is important

to include high frequency as well as low frequency words in the gold standard.

20

• Polysemy is mentioned by Andreevskaia et al. (2006) as one of the major origins of errors of their

sentiment tagging system. Especially distinguishing sentiment-laden meanings of a given polysemous

word from neutral meanings, seriously hampers the performance of their system. We therefore include

high and low polysemous words in our sample.

• Synset Size: in our own experience (Maks and Vossen (2010)) sentiment-bearing words tend to

cluster in large synsets, which should be internally consistent with respect to polarity. To be able to test

for this hypothesis, we included also some members of large synsets in the test set.

We aim at an equal distribution of test items of nouns (N), verbs (V) and adjectives (A) across these three

dimensions and across three values (high, medium and low). Thus, each item focuses on a single phe-

nomenon which distinguishes it from all other test items and the performance of systems can be evaluated

for isolated phenomena. Table (1) shows the results of this selection procedure: the test items are distrib-

uted in such a way that each dimension-value combination includes a representative number of word

senses. Word frequency is based upon two different Dutch corpora. Polysemy numbers are based on

Cornetto data: monosemous items are categorised as ‘low’; items with with 2 to 4 senses are categorised

as ‘mid’ and items with more than 4 senses are considered highly polysemous, Synset size is also based

upon Cornetto data: synsets with one synonym are considered to have ‘low’ synset-size, 2 to 5 synonyms

as ‘mid’ and synsets with more than 5 synonyms are considered ‘high’ (cf. table 6).

frequency polysemy synset-size

A N V A N V A N V

high 179 251 241 129 183 192 202 104 154

mid 164 113 178 239 236 199 256 150 241

low 266 231 162 241 176 190 151 341 186

totals 609 595 581 609 595 581 609 595 581

Table 6: Distribution of gold standard items

21

5.2 Inter-annotator Agreement Results

5.2.1 Overall inter-annotator agreement

Overall percent agreement for all annotation categories in the model is 79% (perc.) with a Cohen kappa

of 0.73 for adjectives (ADJ), 74% with a Cohen kappa of 0.62 for nouns and 66% with a Cohen kappa

of 0.62 for verbs (cf. table 7).

perc. kappa number of items

ADJ 79% 0.73k 609

NOUNS 74% 0.63k 595

VERBS 66% 0.62k 574

Table 7: Overall Results

5.2.2 Agreement with respect to Semantic Class

Table (8) gives the inter-annotator percentage agreement for each semantic class separately. ‘Number’

gives the number of test items for that particular semantic class. It can be seen that agreement differs

considerably across the different classes with low scores for the noun class ‘quality’ and the verb class

‘state-process’ and good scores for the other classes.

NOUNS Percentage Number VERBS Percentage Number

All classes 74% 595 All classes 66% 581

physical obj 79% 107 action 66% 302

human 80% 103 attitude 68% 71

artefact 86% 74 comm-r 81% 16

event 68% 73 comm-s 77% 57

mental obj 84% 50 experiencer 83% 28

behavior 69% 43 judgment 84% 25

emotion 70% 27 state_process 55% 82

situation 80% 20

22

attitude 82% 17

institution 81% 16

social obj 80% 15

nonhuman 84% 13

quality 30% 10

Table 8: Inter-annotator agreement for semantic classes

5.2.3 Inter-annotator Agreement with respect to Attitude Holder Categories

Table (9) gives the inter-annotator percentage agreement (perc.) and kappa agreement (kappa) for each

separate attitude holder category. Column 3 (with sentiment) gives the average number of items that have

been tagged as positive or negative, for the given category. This number may be seen as an indication of

the relevance of the annotation category for subjectivity analysis.

Adjectives

perc. kappa with sentiment

A1 97% 0.81k 46

SW 83% 0.75k 385

ALL 86% 0.67k 35

Nouns

A1A0/1 94% 0.81k 117

A2A1/0 98% 0.40k 11

SW 90% 0.76k 163

ALL 89% 0.32k 48

Verbs

A1A2 89% 0.73k 143

A1A3 98% 0.73k 15

A1A0 93% 0.41k 37

A2A0 94% 0.56k 42

A3A0 98% 0.54k 10

SW 91% 0.76k 137

ALL 87% 0.37k 64

Table 9: Inter-annotator agreement for attitude holder categories

23

Categories SW, A1, A1A0/1, A1A2 and A1A3 are reliably identifiable with kappa scores between 0.73

and 0.81. The smaller actor subjectivity categories A2A1/0, A1A0, A2A0, and A3A0 and the category

ALL are not reliably identifiable with kappa scores lower than 0.60.

5.2.4 Single Polarity

One single polarity value for each item is derived by collapsing all attitude holder polarity values into one

single value. If an item is tagged with different polarity values, SW subjectivity is considered as the most

important one, and precedes over the other categories. As can be seen from table (10), observed agree-

ment is 86%, 87% and 83% with kappa agreement of 0.80, 0.80 and 0.74 for adjectives, nouns and verbs,

respectively. Separate polarity computation (positive, negative, neutral and posneg) – with one polarity

value of interest and the other values combined into one non-relevant category - shows that all polarity

values, except for posneg, are reliably identifiable.

A N V

perc. kappa perc. kappa perc. kappa

single polarity 86% 0.80k 87% 0.80k 83% 0.74k

negative 95% 0.90k 93% 0.84k 91% 0.83k

positive 91% 0.81k 92% 0.75k 88% 0.69k

neutral 92% 0.76k 90% 0.79k 86% 0.71k

posneg 94% 0.38k

Table 10: agreement rates for polarity categories

5.3 Disagreement Analysis

The results differ considerably depending on the parts-of-speech and they decline as the annotation

schema becomes more complex (cf. table 7). Overall kappa scores for adjectives (0.73) are quite good

and the scores for verbs and nouns are reasonable considering the complexity of their schemas. Moreover,

we can see in table (8) that most semantic classes that are important for subjectivity analysis, also have

24

rather high percentage agreement scores. In general, relevant semantic classes seem to perform better than

less relevant classes, like Action and Process-state.

Likewise, results differ depending on the type of attitude holder (cf. table 9). SW subjectivity and ‘main’

actor (A1) subjectivity are reliably identifiable. When the attitude is attributed to a second or third actor

(category A2A1/0 for nouns and categories A1A0, A2A0 and A3A0 for verbs) the annotations are less

reliable. Confusion is found also between between verb annotations A2A0 and A1A2. For example, with

respect to misleiden (mislead), annotators agree about a negative attitude from A1 towards A2, but one

annotator marks additionally a negative attitude on behalf of A2 (A2A0: negative) whereas the other does

not. Similar cases are found with nouns. However, since the low scoring categories are rather small, we

think that the overall results for attitude-holdership are quite useful and reliable.

The category ALL seems not to be defined well. Examples of disagreements of this kind are ontwateren

(drain), droogte (dryness/waterlesness) , breuk (fraction, break)). Annotators do not agree about

whether some general positive or negative feelings are associated to them or not because the nega-

tive/positive implication depends on the context. Even though there may agreement on the most common

context, the fact that it can vary leads to differences in the annotation.

Single polarity performs quite good across all word classes (cf. table 10). It shows that a complicated and

layered annotation does not hamper overall agreement and may also produce lexicons for applications that

use single polarity only. The different performance for negative polarity on the one hand and positive and

neutral on the other, are possibly due to the linguistically unmarkedness of the neutral and positive forms

(e.g. honest, understand), whereas the category ‘negative’ includes many items that are morphologically

recognizable by a negativity marker (e.g. misunderstand, unreliable, dishonest). In fact, Kim and Hovy

(2004) reported that inter-annotator agreement of their annotations increases from 76% to 89% when

positive and neutral categories are merged.

A closer look at the data shows that disagreement also occurs when collocational information may lead

one annotator to see subjectivity in a sense and the other not. For example, houden (keep - conform one’s

action or practice to) associated with collocations like to keep appointments and to keep one’s promises is

25

considered positive (A1A2) by one annotator and neutral by the other. This seems to apply in particular

to all frequent light verbs with little semantic content like make, do and take.

Summarizing, we conclude that overall agreement is good, especially with regard to most semantic cate-

gories relevant for subjectivity analysis and with respect to the most important attitude holder categories.

When defining an operational model the small and low scoring categories will be collapsed into one un-

derspecified attitude holder category.

5.4 Different Lexicon dimensions

In this section we explore the correlations between the human annotations and the different lexicon di-

mensions: frequency, polysemy and synset-size. Table 11 gives inter-annotator percentage agreement

rates for the different word classes nouns (N), verbs (V) and adjectives (A) with regard to their classifica-

tion into the low, mid and high values of the different dimensions. For example, from column 2 (fre-

quency-NVA) can be seen that, with respect to all three word classes together, inter-annotator agreement

increases as the frequency of the item decreases (from 73% for items with high frequency to 75% for

items with low frequency). However, this is not a continuous increase as mid-frequent items have an

agreement score of 77%. The scores imply that highly frequent items are harder to identify with regard to

the annotations of the current study than less frequent items. Likewise, from column 6 (polysemy-NVA)

can be seen that agreement decreases when polysemy increases, and from column 10 (synset-size-NVA)

can be seen that agreement increases when synset-size increases.

These upward and downward trends (cf. row 6 – column 2, 6 and 10) of all items regardless of their part-

of-speech, are in line with our expectations. Agreement decreases as polysemy increases, i.e. the usually

more or less related meanings of polysemous words may be hard to separate from each other and lead to

ambiguous annotations. Polysemy and frequency are often related as frequent items are often polyse-

mous. Likewise, we see a negative correlation between agreement rates and frequency: the more frequent

the items are, the harder they are to annotate in a reliable manner.

26

frequency polysemy synset-size

ANV A N V ANV A N V ANV A N V

high 73% 72% 82% 64% 73% 78% 80% 61% 77% 85% 77% 67%

mid 77% 84% 77% 70% 76% 79% 80% 66% 73% 76% 78% 65%

low 75% 80% 75% 64% 76% 80% 75% 70% 74% 75% 77% 66%

trend +2% +8% -7% 0% +3% +2% -5% +9% -3% -10% 0% -1%

Table 11: Correlation of annotation results and lexicon dimensions

With regard to large synset membership we expect that agreement increases as synset-size increases:

items that are members of large synsets are easier to annotate than items that are not. This is explained by

the observation (Maks and Vossen (2010)) that subjective items tend to cluster in large synsets. These

‘really’ subjective items are characterised by a lack of denotation – for which they are easily grouped

together - and a high degree of connotational polarity – for which they are easy to annotate.

Whereas adjectives and verbs follow the expected trends, the results with regard to nouns differ consid-

erably from our expectations. Frequency and polysemy scores for nouns show downward trends where

upward ones are expected. Moreover, as in all cases the mid-frequent items unexpectedly perform better

than the low-frequent items, there seems to be no obvious correlation between lexicon dimension and

human annotation scores.

6 Conclusions

In this paper we presented a lexicon model for the description of verbs to be used in applications like

deeper sentiment analysis and opinion mining, describing the detailed and subtle subjectivity relations

that exist between the different participants of a verb, noun or adjective. The relations can be labeled with

subjectivity information concerning the identity of the attitude holder, the orientation (positive vs. nega-

tive) of the attitude and its target. Special attention is paid to the role of the speaker/writer of the event

whose perspective is expressed and whose views on what is happening are conveyed in the text.

27

We measured the reliability of the annotation. The results show that speaker subjectivity and, to a certain

extent, actor subjectivity can be reliably identified. As the unreliable categories have a relatively small

number of members, we think that the annotation schema is sufficiently validated. In an operational

model of the lexicon the small low scoring categories, including ALL, will be merged into one category.

An additional outcome to our study is that we created a gold standard of approximately 600 items for each

part-of-speech. In the future we will use this gold standard to test methods for the automatic detection of

subjectivity and polarity properties of word senses in order to build a rich subjectivity lexicon for Dutch.

7 Acknowledgments

This research has been carried out within the project From Text To Political Positions (http: //www2.let.vu.nl/oz/cltl/t2pp/). It is funded by the VUA Interfaculty Research Institute CAMeRA.

8 References

Andreevskaia, A. and S. Bergler (2006) Mining WordNet for Fuzzy Sentiment:Sentiment Tag Extraction from-

WordNet Glosses. In: EACL-2006, Trento, Italy.

Chen, L. (2005) Transitivity in Media Texts: negative verbal process sub-functions and narrator bias. In Interna-

tional Review of Applied Linguistics in Teaching, (IRAL-vol. 43) Mouton De Gruyter, The Hague, The Nether-

lands.

Cerini, S., Compagnoni, V., Demontis, A., Formentelli, M., and Gandini, G. (2007). Language resources and lin-

guistic theory: Typology, second language acquisition, English linguistics (Forthcoming), chapter Micro-WNOp:

A gold standard for the evaluation of automatically compiled lexical resources for opinion mining. Milano, Italy.

Choi Y. and C. Cardie (2008). Learning with Composi

tional Semantics as Structural Inference for subsentential Sentiment Analysis. Proceedings of Recent Advances

in Natural Language Processing (RANLP), Hawaii.

Duan W., Bin Gu, and A. Whinston (2008). An empirical investigation of panel data. In Decision Support Systems,

Volume 45-2008. Elsevier B.V.

Esuli, Andrea and Fabrizio Sebastiani. (2006). SentiWordNet: A Publicly Available Lexical Resource for Opinion

Mining. In: Proceedings of LREC-2006, Genova, Italy.

Hatzivassiloglou, V., McKeown, K.B. (1997) Predicting the semantic orientation of adjectives. In Proceedings of

ACL-97, Madrid, Spain.

Jia, L., Yu, C.T., Meng, W. (2009) The effect of negation on sentiment analysis and retrieval effectiveness. In

CIKM-2009, China.

28

Kaiquan X., Stephen Shaoy Liao, Jiexun Li and Yukia Song (2011) . Do online reviews matter? - An empirical

investigation of panel data. In Decision Support Systems Volume 51-2011.

Kamps, J., R. J. Mokken, M. Marx, and M. de Rijke (2004). Using WordNet to measure semantic orientation of

adjectives. In Proceedings LREC-2004, Paris.

Kim, S. and E. Hovy (2004) Determining the sentiment of opinions. In Proceedings of COLING, Geneva, Swtizer-

land.

Kim, S. and E. Hovy (2006) Extracting Opinions Expressed in Online News Media Text with Opinion Holders and

Topics. In: Proceedings of the Workshop on Sentiment and Subjectivity in Text (SST-06). Sydney, Australia.

Maks, I.and P. Vossen (2010) Modeling Attitude, Polarity and Subjectivity in Wordnet. In Proceedings of Fifth

Global Wordnet Conference, Mumbai, India.

Mathieu, Y. Y. (2005). A Computational Semantic Lexicon of French Verbs of Emotion. In: Computing Attitude

and Affect in Text: Theory and Applications J. Shanahan, Yan Qu, J.Wiebe (Eds.). Springer, Dordrecht, The

Netherlands.

Mathieu,Y.Y. and C. Felbaum (2010). Verbs of emotion in French and English. In: Proceedings of GWC-2010,

Mumbai, India.

Moilanen K. and S. Pulman. (2007). Sentiment Composition. In Proceedings of Recent Advances in Natural Lan-

guage Processing (RANLP), Bulgaria.

Ruppenhofer, J. , M. Ellsworth, M. Petruck, C. Johnson, and J. Scheffzcyk (2010) Framenet II: Theory and Practice

(e-book) http://framenet.icsi. berkeley.edu/ book/book.pdf.

C. Strapparava and A. Valitutti (2004). WordNet-Affect: an affective extension of WordNet. In Proceedings LREC

2004, Lisbon, Portugal

Pit. M. (2003). How to express yourself with a causal connective. Dissertation, University Utrecht. Rodopi, Amster-

dam.

Somasundaran S. and Janyce Wiebe, (2010), Recognizing Stances in Ideological On-line Debates. , In Proceedings

of the NAACL HLT 2010 Workshop on Computational Approaches to Analysis and Generation of Emotion in

Text, Los Angeles, CA. Association for Computational Linguistics, 2010

Su, F.and K. Markert (2008). Eliciting Subjectivity and Polarity Judgements on Word Senses. In Proceedings of

Coling-2008, Manchester, UK.

Valitutti, A. and C. Strapparava (2010). Interfacing Wordnet-Affect withj OCC model of emotions. In Proceedings

of EMOTION-2010, Valletta, Malta.

Wiebe J., Theresa Wilson , Rebecca Bruce , Matthew Bell , and Melanie Martin (2004). Learning subjective lan-

guage. Computational Linguistics 30 (3).

Wiebe, Janyce and Rada Micalcea.(2006) . Word Sense and Subjectivity. In Proceedings of ACL’06, Sydney, Aus-

tralia.

.

Documents

A lexicon model for deep sentiment analysis and opinion ... lexicon model for deep sentiment analysis and opinion mining applications Isa Maks Piek Vossen VU University, Faculty of