Upload
phungminh
View
244
Download
3
Embed Size (px)
Citation preview
1
A lexicon model for deep sentiment analysis and opinion mining
applications
Isa Maks Piek Vossen
VU University, Faculty of Arts
De Boelelaan 1105, 1081 HV Amsterdam,
The Netherlands
VU University, Faculty of Arts
De Boelelaan 1105, 1081 HV Amsterdam,
The Netherlands
[email protected] [email protected]
Abstract
This paper presents a lexicon model for the description of verbs, nouns and adjectives to be used in appli-
catons like sentiment analysis and opinion mining. The model aims to describe the detailed subjectivity re-
lations that exist between the actors in a sentence expressing separate attitudes for each actor. Subjectivity
relations that exist between the different actors are labeled with information concerning both the identity of
the attitude holder and the orientation (positive vs. negative) of the attitude. The model includes a categori-
zation into semantic categories relevant to opinion mining and sentiment analysis and provides means for
the identification of the attitude holder and the polarity of the attitude and for the description of the emo-
tions and sentiments of the different actors involved in the text. Special attention is paid to the role of the
speaker/writer of the text whose perspective is expressed and whose views on what is happening are con-
veyed in the text. Finally, validation is provided by an annotation study that shows that these subtle subjec-
tivity relations are reliably identifiable by human annotators.
1 Introduction
Subjectivity can be conveyed in various ways, at various text levels, and by various types of expressions.
We explore to what extent subjectivity is a property that can be associated with words and word sense
taking into consideration the detailed and subtle subjectivity relations that exist between words in a sen-
tence.
Consider the following example:
2
Ex. (1) … Damilola’s killers were boasting about his murder...
This sentence expresses a positive sentiment of the killers towards the fact they murdered Damilola and it
expresses the negative attitude on behalf of the speaker/writer towards the murderers of Damilola. Both
attitudes are part of the semantic profile of the word and should be modelled in a subjectivity lexicon.
As opinion mining and sentiment analysis applications tend to utilize more and more the composition of
sentences (Moilanen (2007), Choi and Cardie (2008), Jia et al. (2009)) and to use the value and properties
of the words expressed by its dependency trees, there is a need for specialized lexicons where this infor-
mation can be found. For the analysis of more complex opinionated text like news, political documents,
and (online) debates the identification of the attitude holder and topic are of crucial importance. Applica-
tions that exploit the relations between the word meaning and its arguments can better determine senti-
ment at sentence-level and trace emotions and opinions to their holders.
Our model seeks to combine the insights from a rather complex model like Framenet (Ruppenhofer et al.
(2010)) with operational models like Sentiwordnet where simple polarity values (positive, negative, neu-
tral) are applied to the entire lexicon. Subjectivity relations that exist between the different actors are
labeled with information concerning both the identity of the attitude holder and the orientation (positive
vs. negative) of the attitude. The model accounts for the fact that words may express multiple attitudes. It
includes a categorization into semantic categories relevant to opinion mining and sentiment analysis and
provides means for the identification of the attitude holder and the polarity of the attitude and for the de-
scription of the emotions and sentiments of the different actors involved in the text. Attention is paid to
the role of the speaker/writer of the event whose perspective is expressed and whose views on what is
happening are conveyed in the text.
As we wish to provide a model for a lexicon that is operational and can be exploited by tools for deeper
sentiment analysis and rich opinion mining, the model is validated by an annotation study of 580 verb,
595 noun and 609 adjective lexical units.
3
2 Related Work
In recent years many sentiment analysis and opinion mining applications have been developed to
analyse opinions, feelings and attitudes about products, brands, and news, and the like. These
applications mine opinions from different sources like online forums and news sites (Soma-
sundaran and Wiebe 2010) and from movie, product and hotel reviews (Duan et al. 2008, Kai-
quan et al. 2011). Many of these tools rely on manually built or automatically derived polarity and
subjectivity lexicons and, in particular for English, a couple of smaller and larger lexicons are available.
These lexicons are lists of words (for example, Hatzivassiloglou and McKeown (1997), Kamps et al.
(2004), Kim and Hovy (2004)) or list of word senses (for example, Esuli and Sebastiani (2006), Wiebe
and Mihalcea (2006), Su and Markert, (2008)) annotated for negative or positive polarity. Important as-
pects – as illustrated by examples (2a-d) - of these existing annotation schemata are (1) tagging at word
sense level instead of word level (2) distinction between objective and subjective words and (3) not only
subjective words but also objective words can have a negative or positive connotation.
Ex. (2)
(a) (Subjective:Negative) angry—feeling or showing anger; “angry at the weather; “angry customers; an angry
silence”
(b) (Subjective:Positive) beautiful—aesthetically pleasing
(c) (Objective:NoPolarity) alarm clock, alarm – a clock that wakes the sleeper at a preset time
(d) (Objective: negative) war, warfare – the waging of armed conflict against an enemy; ”thousands of people
were killed in the war”
All the above annotation schemata attribute single polarity le polarity values (positive, negative, neutral)
to words and therefore cannot account for more complex cases like boast (cf. ex. 1) which carry both
negative and positive polarity.
A more elaborate model is proposed by Strapparava and Valitutti (2004 and 2010) forwho developed
Wordnet-Affect, an affective extension of Wordnet. It describes ‘direct’ affective synsets which denote
4
emotions and ‘indirect’ affective synsets which refer to emotion carriers. Direct affective synsets are clas-
sified into emotion, cognitive state, trait, behavior, attitude and feeling. ‘Indirect’ affective synsets are
subdivided according to a specific appraisal model of emotions (OCC) that defines the effects of the emo-
tion to the subjects (actor, actee and observer) semantically connected to it. For example, the word vic-
tory, if localized in the past, can be used to express pride (related to the actor or “winner”), and
disappointment (related to the actee or “loser”). If victory is a future event the expressed emotion is hope.
Their model is similar to ours, as we both relate attitude to the participants of the event. However, their
model focuses on a rich description of different aspects and implications of emotions for each actor
whereas we infer a single positive or negative attitude for each actor. Their model seems to focus on the
cognitive aspects of emotion whereas we aim to also model the linguistic aspects by including specifically
the attitude of the Speaker/Writer in our model. Moreover, our description is not at the level of the synset
but at the lexical unit level which enables us to differentiate gradations of the strength of emotions of the
synonyms within the synsets. Moreover, it enables us to relate the attitudes directly to the syntactic-
semantic patterns of the lexical unit.
Framenet (Ruppenhofer et al. (2010)) is another important resource that is used in opinion mining and
sentiment analysis (Kim and Hovy (2006)). Framenet (FN) is an online lexical resource for English that
contains more than 11,600 lexical units. The aim is to classify words into categories (frames) which give
for each lexical unit the range of semantic and syntactic combinatory possibilities. The semantic roles
range from general ones like Agent, Patient and Theme to specific ones such as Speaker, Message and
Addressee for Verbs of Communication. FN includes frames such as Communication, Judgment, Opin-
ion, Emotion_Directed and semantic roles such as Judge, Experiencer, Communicator which are highly
relevant for opinion mining and sentiment analysis. However, subjectivity is not systematically and not
(yet) exhaustively encoded in Framenet. For example, the verb gobble (eat hurriedly and noisily) belongs
to the frame Ingestion (consumption of food, drink or smoke) and neither the frame nor the frame ele-
ments account for the negative connotation of gobble. Yet, we think that a resource like FN with rich and
5
corpus based valency patterns is an ideal base/ starting point for including subjectivity descriptions in
frame structures that can then be used to recognize subjectivity relations in text.
None of these theories, models or resources is specifically tailored for the subjectivity description of
verbs. Studies which focus on verbs for sentiment analysis, usually refer to smaller subclasses like, for
example, emotion verbs (Mathieu, 2005, Mathieu and Fellbaum, 2010) or quotation verbs (Chen 2005,
2007).
3 Subjectivity in the Lexicon
In the context of existing sentiment and subjectivity lexicons, the notion of subjectivity focuses on posi-
tive and negative polarity. Other types of subjectivity, like for example epistemic subjectivity which deals
with how certain or uncertain a speaker is about a given state-of-affairs, or speaker subjectivity which
deals with the ‘self-expression’ of the speaker, are considered less important. Obviously, this is related to
the applications these lexicons are intended for like sentiment and opinion mining which focus on views,
emotions and beliefs pro or against a certain product, film or point-of-view. However, if we consider
definitions of subjectivity in other linguistic areas, the focus is quite different (Pit 2003). Much attention
is paid to the presence of the speaker in the text and to what extent the speaker is needed as the reference
to be able to interpret the meaning of the sentence (Pit 2003). A distinction is made between speaker sub-
jectivity and character subjectivity (which for the remainder of this paper will be called actor subjectiv-
ity). Actor subjectivity refers to the representation of a character’s thoughts, emotions, utterances, and
perceptions and Speaker subjectivity refers to the representation of the thoughts, emotions, utterances, and
perceptions of the speaker. The following examples illustrate the different types.
Ex. (3)
a) He is a liar
Negative attitude of Speaker/Writer towards ‘he’
b) Bush is bad for the economy …
6
Negative attitude of Speaker/Writer towards Bush
Sentences (3a) and (3b) are illustrations of speaker subjectivity. Both express a negative attitude of the
speaker towards the participants, i.e. ‘he’ and Bush, respectively.
c) His hatred for religion ….
Negative Attitude of ‘his’ (he) towards ‘religion’
d) Bush is angry over Obama's leeking of private conversation …
Negative attitude of Bush towards Obama
e) T.F. upset her daughter by going …
Negative Attitude of ‘her daughter’ towards ‘T.F’
f) He urges her to turn off the phone …
Negative attitude of ‘he’ towards ‘her’and Positive attitude of ‘he’ towards ‘turning off her phone’ Negative at-
titude of ‘her’ towards the ‘urging’ action
g) They are enthusiastic about the service …
Positive attitude of ´they´ towards ´the service´
h) He is boasting about his daughter
Positive attitude of ‘he’ towards ‘his daughter’ and Negative attitude of the Speaker/Writer towards ‘he’.
Sentences (3c) to (3g) are illustrations of actor subjectivity. In all cases, the attitude of one or more actors
is expressed. The speaker is present – being the reporter of the participants’ attitudes – but he is neutral.
The actors may have different syntactic functions in the sentence, e.g. subject (cf. 3d, 3f, 3g ), object (cf.
3e, 3f) , or possessive (3c ), depending on the argument structure of the sentiment word. Sentence (3h) is
an illustration of both actor and speaker subjectivity combined in one word, i.e. boast.
In sentences (3a) to (3h) the identification of the kind of subjectivity and the possible inferences about
attitude holder and attitude target, are part of the semantics of the sentiment-bearing words. For example,
if we know that enthusiastic is an adjective which triggers actor subjectivity then we know that the atti-
7
tude holder is realized either by the subject of the sentence (e.g. they are enthusiastic …) or by the modi-
fied noun (e.g. enthusiastic supporters...). We also know that the PP-complement with about realizes the
attitude target. Likewise, if we know that bad is an emotion adjective which triggers speaker subjectivity
then we know that the attitude holder is the speaker or writer of the sentence and that the attitude target is
either realized by the subject (e.g. Bush is bad) or by the modified noun (e.g. bad children..). Thus, the
subjectivity type in combination with the argument structure, can be used to identify the attitude holder
and the attitude target.
This kind of information is important for subjectivity analysis. According to Banea et al. (2010) subjectiv-
ity analysis is “the task of identifying when a private state is being expressed and identifying attributes of
the private state. Attributes of private states include who is expressing the private state, the type(s) of
attitude being expressed, about whom or what the private state is being expressed, the polarity of the pri-
vate state, (i.e., whether it is positive or negative), and so on”.
In the following section we explain how we want to incorporate these aspects in a subjectivity lexicon.
4 Design of the Lexicon Model
The proposed model is built as an extension of an already existing lexical database for Dutch, i.e. Cor-
netto (Vossen et al. 2008). Cornetto combines two resources with different semantic organizations: the
Dutch Wordnet (DWN) which has, like the Princeton Wordnet, a synset organization and the Dutch Ref-
erence Lexicon (RBN) which is organized in form-meaning composites or lexical units. The description
of the lexical units includes definitions, usage constraints, selectional restrictions, syntactic behaviors and
illustrative contexts. DWN and RBN are linked to each other by the constraint that each synonym in a
synset is linked also a lexical unit in RBN. As argued in Maks and Vossen (2010), all synonyms (lexical
units) in a synset have the same denotation and the same overall polarity (positive, negative or neutral).
8
Lexical units within the synsets can however differ with in the gradation of the polarity or the frame reali-
zation of the subjectivity. The latter subjectivity information is therefore modeled as an extra layer for
each lexical unit representing subjectivity at word sense level.
In the next sessions we will further describe the lexical model for the subjectivity relations at word sense
level. The model consists of a categorization into semantic classes (section 4.1) and complementation
patterns (section 4.2 ).
4.1 Semantic Classes
4.1.1 Verbs
For the identification of relevant semantic classes we adopt – and broaden – the definition of subjective
language by Wiebe et al. (2006). Subjective expressions are defined as words and phrases that are used to
express private states like opinions, emotions, evaluations, and speculations.
Three main types are distinguished:
• Type I :
Direct reference to private states (e.g. his alarm grew, he was boiling with anger). We include in this cate-
gory emotion verbs (like feel, love and hate) and cognitive verbs (like defend, dare, realize etc.) . Ac-
cordintype coincides with an actor subjectivity category. Type I members have actor subjectivity.
• Type II:
Reference to speech or writing events that express private states (e.g. he condemns the president, they
attack the speaker). According to our schema, this category includes all speech and writing events and the
annotation schema points out if they are neutral (say, ask) or bear polarity (condemn, praise). Type II
members have actor subjectivity.
• Type III:
9
The third category contains 'expressive subjective elements' which express indirectly the private state of
the speaker/writer (e.g. superb, that doctor is a quack). Verbs that fall in this category are often also a
member of one of the other categories. For example, boast (cf. ex. 1) is both a Type II (i.e. speech act
verb) verb and a Type III verb as it expresses the negative attitude of the speaker/writer towards the
speech event. By considering this category as combinational, we can make a distinction between
Speaker/Writer subjectivity and Actor subjectivity.
Type Semantic Category Description Example
I (+III) EXPERIENCER Verbs that denote emotions. Included are both experiencer
subject and experiencer object verbs.
hate, love, enjoy, enter-
tain, frighten, upset,
frustrate, cry
I(+III) ATTITUDE A cognitive action performed by one of the actors, in
general the structural subject of the verb. The category is
relevant as these cognitive actions may imply attitudes
between actors.
defend, think, dare,
ignore, avoid, feign,
pretend, patronize,
devote, dedicate
II(+III) JUDGMENT A judgment (mostly positive or negative) that someone
may have towards something or somebody. The verbs
directly refer to the thinking or speech act of judgment.
praise, admire, rebuke,
criticize, scold, re-
proach, value, rate,
estimate
II(+III) COMM-S A speech act that denotes the transfer of a spoken or writ-
ten message from the perspective of the sender or speaker
(S) of the message. The sender or speaker is the structural
subject of the verb.
speak, say, write, grum-
ble, stammer, talk,
email, cable, chitchat,
nag, inform
II(+III) COMM-R A speech act that denotes the transfer of a spoken or writ-
ten message from the perspective of the receiver(R) of the
message. The receiver is the structural subject of the verb
read, hear, observe,
record, watch, compre-
hend
IV(+III) ACTION A physical action performed by one of the actors, in gen-
eral the structural subject of the verb. The category is
relevant as in some cases actors express an attitude by
performing this action.
run, ride, disappear, hit,
strike, stagger, stumble,
beat
IV(+III) PROCESS_STATE This is a broad and underspecified category of state and
process verbs (non-action verbs) and may be considered as
a rest category as it includes all verbs which are not in-
cluded in other categories.
grow, be, disturb, driz-
zle, muzzle,
Table (1) Semantic categorization of verbs
10
• Type IV (actor subjectivity)
In addition, we specify a fourth category. Consider the following examples:
Ex. (4) the teacher used to beat the students
Ex. (5) C.A is arrested for public intoxication by the police
Both beat and arrest are observable actions and therefore usually considered 'objective'. However, in
many contexts these verbs refer to the private state of one of the actors. In ex. (4) the teacher and the stu-
dents will have bad feelings towards each other and also in ex. (5) C.A. will have negative feelings about
the situation. Type IV thus make reference to a private state that is the source or the consequence of an
event (action, state or process) while the event is explicitly mentioned.
Verb senses which are categorized as Type I, II or III are always considered as subjective; verb senses
categorized as Type IV are only subjective if one of the annotation categories (see below for more details)
has a non-zero value; otherwise they are considered as objective.
The actor subjectivity types (type I, I and IV) tend to correlate with global semantic categories. We there-
fore include the semantic category to support more consistent manual annotation and automatic acquisi-
tion of the subjectivity of words. Table (1) presents the different semantic categories with examples for
each. The first column lists the potential subjectivity classes that can apply.
4.1.2 Nouns
Nouns are in the Cornetto Database categorized in a semantic classification which consists of the follow-
ing classes: Human (+animate), Nonhuman (+animate), Institution, Artifact (+concrete), Sub-
stance(+concrete), Concrother (+concrete: concrete entities other than artifact or substance), Time (-
animate), Place (-animate), Measure (-animate), Institution, and Abstract (-animate) .
We added 10 more subclasses to Abstract for their relevance to the subjectivity lexicon: Social Object,
Mental Object, Quality, Attitude, Behavior, Situation, Emotion, Physical State, Social State and Event.
Finally, we merged the classes Substance and Concrother into one class labeled Physical Object. Table
11
(2) presents the resulting categories with examples for each.
Name Description Examples
ARTIFACT Something made or given shape by man table, house, bicycle, sack
PHYSICAL
OBJECT
A tangible and visible entity (other than artifact) arm, sand, coffee, trash
INSTITUTION Institution or group of persons school, advisory board
NONHUMAN Reference to an animal cat, mongrel
HUMAN Reference to a person girl, nurse, monster
PHYSICAL
STATE
The physical condition of a person healthiness, virginity
SOCIAL STATE The social situation of a person or group of persons wealth, unemployment
SOCIAL OBJECT A product of groups or of their social behavior punk, law, lifestyle, religion
EMOTION Nouns that denote emotions and emotional responses joy, feeling, smile
MENTAL
OBJECT
A cognitive product of an individual reaction, idea, recommendation
ATTITUDE A mental state to act in certain ways supervision, resistance, falling out
BEHAVIOR The way a person behaves arrogance, friendliness
SITUATION Description of a state-of-affairs catastrophe, disaster
QUALITY A property of persons, objects or other entities size, intelligence, beauty
TIME A period of time hour, day, school time
PLACE Location Belgium, suburb
MEASURE
How much there is of how many there are of
something
degree, meter, yard
EVENT
Something that happens at a given time or place (and
not categorized in any of the other classes)
growth, meeting, orthodontics
Table (2) Semantic categorization of nouns
4.2 Attitude
In our model, subjectivity is defined in terms of attitude holders carrying attitude towards each other or
towards the events in the text (cf. section 3). For example, experiencers hold attitudes towards targets and
communicators express a judgment about an evaluee. We developed an annotation schema (see Table 2
12
below) which enables us to relate the attitude holders and the orientation of the attitude (positive, negative
or neutral). In the case of nouns and verbs, the syntactic valencies of the verb or noun are associated with
the respective attitude holders.
To be able to attribute the attitudes to the relevant actors we identify for each form-meaning unit the se-
mantic category, the semantic-syntactic the arguments, the associated semantic roles and some coarse
grained selection restrictions.
4.2.1 Actor’s Attitude
• Verbs
We formulated simple argument structure for the verbs specifying the number of arguments, some coarse
grained selectional restrictions, and the use of prepositions. For example:
SMB (A1) give (A0) SMT (A2) to SMB (A3)
A1, A2 and A3 are the arguments of the verbs which are lexicalized by the structural subject (A1), the
direct object (A2 or A3) and indirect/prepositional object (A2 or A3). A2 and A3 both can be syntacti-
cally realized as an NP, a PP, that-clause or infinitive clause. Each actor is associated with coarse-grained
selection restrictions: SMB (somebody; +human), SMT (something; -human) or SMB/SMT (some-
body/something; +/– human). If the syntactic realization consists of a PP-phrase the preposition is in-
cluded in the argument structure (cf. to in the argument structure of give).
Attitude (positive, negative and neutral) can be attributed to arguments if they represent +human entities
and are targeted towards the other categories either +animate or -animate entities. Annotation categories
are labeled as:
• A1A2 which means that the attitude is held by A1 and is targeted towards A2;
• A1A3 which means that the attitude is held by A1 and is targeted towards A3
• A1A0, A2A0 or A3A0, which means that the attitude is held by A1, A2 or A3, respectively and is
targeted towards A0 which is the event or situation itself.
13
The following examples illustrate how the annotations refer to the complementation patterns. Other ex-
amples can be found in table (5) below.
• Verdedigen (to defend: argue or speak in defense of)
Transliteration: He (A1) defends (A0) his decision (A2) against critique (A3)
Complementation Pattern: SMB(A1) defend (A0) SMT/SMB (A2) against SMT/SMB (A3)
Annotations: A1A2 positive; A1A3 negative; A1A0 NONE; A2A0 NONE; A3A0 NONE
• Verliezen (to lose: miss from one's possessions)
Transliteration: He (A1) loses (A0) his sunglasses (A2) like crazy
Complementation Pattern: SMB (A1) lose (A0) SMT (A2)
Annotations: A1A2 NONE; A1A3 not relevant ; A1A0 negative; A2A0 NONE;
A3A0 not relevant
• Nouns
Nouns can denote concrete objects or substances (so-called first-order entities) or they can denote events,
states, properties and relations (so-called higher-order entities). Complementation patterns of first-order
entities are usually simple, whereas patterns of higher-order entities are similar to those of verbs, but - in
our model - do not have more than two arguments. For example:
SMB’s (A1) complaint (A0) against SMT/SMB (A2)
A1 is the logical subject in the case of an event noun (like complaint ) and A2 is always realized through
a prepositioanl phrase. In the case of other semantic types, A1 may have another function, for example
‘someone having’ (A1’s arrogance, A1’s friend).
Attitude (positive, negative and neutral) can be attributed to arguments if they represent +human entities
and are targeted towards the other arguments representing either +animate or -animate entities. Annota-
tion categories are labeled as
- A1A2/0 which means that the attitude is held by A1 and is targeted towards A2 or – if there is no
A2 – towards A0;
14
- A2A0/1 which means that the attitude is held by A2 and is targeted towards A0 or A1; applied
only when different from A1A2/0.
The following examples illustrate how the annotations refer to the complementation patterns. Other ex-
amples can be found in table (4) below.
• verdriet (sadness : emotions experienced when not in a state of well-being )
Transliteration: The Dalai Lama (A1) expresses his (A1) sadness (A0) over the recent earth
quake in New Zealand (A2)
Complementation Pattern: SMB’s(A1) sadness (A0) over SMT/SMB (A2)
Annotations: A1A0/2 negative ; A2A0/1 none
• vriend (friend : a person you know well and regard with affection and trust)
Transliteration: he was my (A1) friend (A0)
Complementation Pattern: SMB’s(A1) vriend
Annotations: A1A0/2 - positive ; A2A0/1 n.r.
• tabel (table: a set of data arranged in rows and columns)
Transliteration: as can be seen from the following table (A0)
Complementation Pattern: no complementation pattern
Annotations: no annotations
• Adjectives
With respect to adjectives no complementation patterns are specified. If the adjective is labeled with
actor subjectivity and used attributively (cf. ex. 6) the noun of the adjective-noun combination is consid-
ered the attitude holder; if the adjective is labeled with actor subjectivity and used predicatively (cf. ex. 7,
and cf. section 3) the subject of the sentence is considered the attitude holder. Actor Subjectivity of ad-
jectives includes two types: the Experiencer who inactively undergoes an emotion (cf. ex. 6) and the
Agent who actively takes a stance or attitude (cf. ex. 7). In the annotation schema these two types are
both labeled as actor subjectivity (A1).
15
ex. (6) …excited fans welcome Obama…
ex. (7) McCain was critical of Iraq war from the beginning ….
Other examples can be found in table (2) below.
4.2.2 Speaker/Writer’s attitude (SW)
The attitude of the speaker is conveyed in a way that differs considerably from that of the actors. There is
no reference to the attitude holder (unless the text is written in first person like, for example, I am against
the ban on English shirt during the World Cup). The Speaker/Writer (SW) expresses his attitude by
choosing words with affective connotation (cf. ex. 8, 9 and 10) or by conveying his subjective interpreta-
tion of what happens (cf. ex. 11 and 12).
ex. 8: He gobbles down three hamburgers
ex. 9: She is a quack
ex. 10: That quack couldn’t help her
ex. 11: awesome here!
ex. 12: B. S. misleads district A voters
In ex. (8), the SW describes the eating behavior of the ‘he’ and he also expresses his negative attitude
towards this behavior by choosing the negative connotation word gobble. Since the behaviour is carried
out by the A1 (the structural subject), the judgment also applies to the A1. In ex.(9), a noun that denotes
people is used as a predicate of the subject and likewise expresses the judgment of the SW of the subject.
In the case of ex. (10), quack is the subject of the sentence and therefore SW’s judgment applies directly
to the denotation of subject. In ex. (11) the speaker expresses his positive attitude about his current situa-
tion by just describing it. Finally in ex. (12), the SW expresses his negative attitude towards the behavior
of the subject of the sentence by pointing to the negative implications of the event.
16
We use a separate property (SW) to indicate the explicit judgment of the Speaker/Writer, as can be seen
from tables 3, 4 and 5 below.
4.2.3 Everybody’s Attitude (ALL)
Some concepts are considered negative by a vast majority of people and therefore express a more general
attitude shared by most people. For example, to die will be considered negative by everybody, i.e. observ-
ers, speakers and listeners. These concepts are ‘objective’ in the sense that they refer to measurable and
observable concepts. They do not express sentiment but they may evoke it.
Examples are:
ex. (13) …this suit is water-resistant ..
ex. (14) … thousands of casualties after an earthquake in …
In general, people are positive about water-resistant suits, and negative about earthquakes and casualties.
In the annotation schema Everybody’s attitude is indicated by the feature ALL.
4.3 Example of the Full Annotation Schema
The annotation model for adjectives, nouns and verbs is illustrated by the table (3), (4) and (5), respec-
tively.
Adjectives
FORM SUMMARY A1 SW ALL
gelukkig (happy) experiencing pleasure or joy +4 - -
kritisch (critical) inclined to criticize -4 0 -
mooi (beautiful) aesthetically pleasing 0 4 -
slecht (bad) having undesirable or negative qualities 0 -5 -
waterbestendig (water-
resistant) hindering the penetration of water 0 0 +2
kreupel (crippled) disabled in the feet or legs 0 0 -3
burgerlijk (civil, civilian) applying to citizens 0 0 0
17
Explanation:
Form the annotated item (A0)
Summary short description of the meaning of the annotated item
A1 A1 has a positive or negative attitude
SW SW has a positive or negative attitude
ALL there is a general positive or negative attitude towards the state-of-affairs described by the annotated item
Table (3) Annotation Schema Adjectives
Nouns
FORM SUMMARY Complementation Pattern A1A0/2 A2A0/1 SW ALL
oplawaai (wallop) a powerful stroke with the fist
or a weapon
oplawaai (A0) van SMB(A1) aan
SMB/SMT (A2) -3 0 -2 0
arrogantie (arrogance)
overbearing pride evidenced
by a superior manner toward
inferiors
SMB’s (A1) arrogantie (A0) 0 0 -4 0
gezwam (folderol, rub-
bish) nonsensical talk or writing SMB’s (A1) gezwam (A0) 0 0 -3 0
storing (disturbance) activity that is an intrusion or
interruption - 0 0 0 -2
obsessie (obsession) an unhealthy preoccupation
with something or someone
SMB’s (A1) obsessie (A0) voor SMB/SMT
(A2) (SMB’s obsession for SMB/SMT) +5 0 -3 0
schoenmaker (shoe-
maker)
a person who makes or repairs
shoes - 0 0 0 0
dader (wrongdoer) a person who transgresses
moral or civil law
dader (A0) van SMT (A1) op/tegen
SMB/SMT (A2) 0 -3 0 0
vriend (friend) SMB’s (A1) friend (A0) +3 0 0 0
Explanation:
Form the annotated item (A0)
Summary short description of the meaning of the annotated item
A1A0/2 A1 has a positive or negative attitude either towards A0 (if there is no A2) or towards A2
A2A0/1 A2 has a positive or negative attitude towards A0 or A1 (only if different from A1A0/2)
SW SW has a positive or negative attitude towards A0 and/or A1
ALL there is a general positive or negative attitude towards the state-of-affairs described by the annotated item
Table (4) Annotation Schema Nouns
18
Verbs
FORM SUMMARY SEMTYPE COMPLEMENTATION A1A2 A1A3 A1A0 A2A0 A3A0 SW ALL
vreten
(devour, gobble)
eat immoderately
and hurriedly ACTION SMT (A2) 2 0 0 0 0 -4 0
afpakken
(take away)
take without the
owner’s consent ACTION SMT(A2) van SMB (A3) 0 0 0 0 -3 0 0
verliezen (lose) lose: fail to keep
or to maintain PROCESS SMT (A2) 0 0 -3 0 0 0 0
dwingen (force)
urge a person to
an action ATTITUDE SMB (A2) tot SMT (A3) -3 2 0 0 0 0 0
opscheppen (boast)
to speak with
exaggeration and
excessive pride
COMM-S over SMB/SMT (A2) 3 0 0 0 0 -4 0
helpen (help)
give help or
assistance ; be of
service
ACTION SMB(A2) met SMT (A3) 2 1 0 0 0 0 0
bekritiseren(criticize) express criticism
of COMM-S SMB (A2) -3 0 0 0 0 0 0
zwartmaken (slander)
charge falsely or
with malicious
intent
COMM-S SMB (A2) -3 0 0 0 0 -4 0
verwaarlozen (neglect) fail to attend to ATTITUDE SMB (A2) -3 0 0 0 0 -4 0
afleggen
(lay out)
prepare a dead
body ACTION SMB (A2) 0 0 0 0 0 0 -1
Explanation:
Form the annotated item (A0)
Summary short description of the meaning of the annotated item
A1A2 A1 has a positive or negative attitude towards A2
A1A3 A1 has a positive or negative attitude towards A3
A1A0 A1 has a positive or negative attitude towards A0
A2A0 A2 has a positive or negative attitude towards A0
A3A0 A3 has a positive or negative attitude towards A0
SW SW has a positive or negative attitude towards A0 and/or A1
ALL there is a general positive or negative attitude towards the state-of-affairs described by the annotated item
Table (5) Annotation Schema Verbs
19
5 Inter-annotator Agreement Study
To explore our hypothesis that different attitudes associated with the different attitude holders can be
modeled in an operational lexicon and to explore how far we can stretch the description of subtle subjec-
tivity relations, we performed an inter-annotator agreement study to assess the reliability of the annotation
schema.
We are aware of the fact that it is a rather complex annotation schema and that high agreement rates are
not likely to be achieved. The main goal of the annotation task is to determine to what extent this kind of
subjectivity information can be reliably identified, which parts of the annotation schema are more difficult
than others and perhaps need to be redefined. This information is especially valuable to evaluate the
automatic acquisition of the (parts of) the information specified by the annotation schema.
Annotation is performed by two linguists (i.e. both authors of this paper). We did a first annotation task
for training and discussed the problems before the gold standard annotation task was carried out. The
annotation is based upon the full description of the lexical units including glosses and illustrative exam-
ples.
5.1 Composition of the Sample
We aim at a sample that is representative for the whole lexicon and relevant for subjectivity identification
and annotation. As the sample also serves as a benchmark for evaluation and comparison of the auto-
matically annotated sentiment word lists and must help to improve the methods which derive these word
lists, we specified the following requirements:
• Frequency: most recently used goldstandards for English, like General Inquirer (Stone, 1966) and
Micro-Wnop (Cerini, 2007) seem to be unbalanced with regard to frequency which may lead to a non
representative high degree of subjectivity-ambiguous words (Su et al., 2008). Moreover, as the number of
infrequent or rare words is relatively high in subjective text (Wiebe et al., 2004), we think it is important
to include high frequency as well as low frequency words in the gold standard.
20
• Polysemy is mentioned by Andreevskaia et al. (2006) as one of the major origins of errors of their
sentiment tagging system. Especially distinguishing sentiment-laden meanings of a given polysemous
word from neutral meanings, seriously hampers the performance of their system. We therefore include
high and low polysemous words in our sample.
• Synset Size: in our own experience (Maks and Vossen (2010)) sentiment-bearing words tend to
cluster in large synsets, which should be internally consistent with respect to polarity. To be able to test
for this hypothesis, we included also some members of large synsets in the test set.
We aim at an equal distribution of test items of nouns (N), verbs (V) and adjectives (A) across these three
dimensions and across three values (high, medium and low). Thus, each item focuses on a single phe-
nomenon which distinguishes it from all other test items and the performance of systems can be evaluated
for isolated phenomena. Table (1) shows the results of this selection procedure: the test items are distrib-
uted in such a way that each dimension-value combination includes a representative number of word
senses. Word frequency is based upon two different Dutch corpora. Polysemy numbers are based on
Cornetto data: monosemous items are categorised as ‘low’; items with with 2 to 4 senses are categorised
as ‘mid’ and items with more than 4 senses are considered highly polysemous, Synset size is also based
upon Cornetto data: synsets with one synonym are considered to have ‘low’ synset-size, 2 to 5 synonyms
as ‘mid’ and synsets with more than 5 synonyms are considered ‘high’ (cf. table 6).
frequency polysemy synset-size
A N V A N V A N V
high 179 251 241 129 183 192 202 104 154
mid 164 113 178 239 236 199 256 150 241
low 266 231 162 241 176 190 151 341 186
totals 609 595 581 609 595 581 609 595 581
Table 6: Distribution of gold standard items
21
5.2 Inter-annotator Agreement Results
5.2.1 Overall inter-annotator agreement
Overall percent agreement for all annotation categories in the model is 79% (perc.) with a Cohen kappa
of 0.73 for adjectives (ADJ), 74% with a Cohen kappa of 0.62 for nouns and 66% with a Cohen kappa
of 0.62 for verbs (cf. table 7).
perc. kappa number of items
ADJ 79% 0.73k 609
NOUNS 74% 0.63k 595
VERBS 66% 0.62k 574
Table 7: Overall Results
5.2.2 Agreement with respect to Semantic Class
Table (8) gives the inter-annotator percentage agreement for each semantic class separately. ‘Number’
gives the number of test items for that particular semantic class. It can be seen that agreement differs
considerably across the different classes with low scores for the noun class ‘quality’ and the verb class
‘state-process’ and good scores for the other classes.
NOUNS Percentage Number VERBS Percentage Number
All classes 74% 595 All classes 66% 581
physical obj 79% 107 action 66% 302
human 80% 103 attitude 68% 71
artefact 86% 74 comm-r 81% 16
event 68% 73 comm-s 77% 57
mental obj 84% 50 experiencer 83% 28
behavior 69% 43 judgment 84% 25
emotion 70% 27 state_process 55% 82
situation 80% 20
22
attitude 82% 17
institution 81% 16
social obj 80% 15
nonhuman 84% 13
quality 30% 10
Table 8: Inter-annotator agreement for semantic classes
5.2.3 Inter-annotator Agreement with respect to Attitude Holder Categories
Table (9) gives the inter-annotator percentage agreement (perc.) and kappa agreement (kappa) for each
separate attitude holder category. Column 3 (with sentiment) gives the average number of items that have
been tagged as positive or negative, for the given category. This number may be seen as an indication of
the relevance of the annotation category for subjectivity analysis.
Adjectives
perc. kappa with sentiment
A1 97% 0.81k 46
SW 83% 0.75k 385
ALL 86% 0.67k 35
Nouns
A1A0/1 94% 0.81k 117
A2A1/0 98% 0.40k 11
SW 90% 0.76k 163
ALL 89% 0.32k 48
Verbs
A1A2 89% 0.73k 143
A1A3 98% 0.73k 15
A1A0 93% 0.41k 37
A2A0 94% 0.56k 42
A3A0 98% 0.54k 10
SW 91% 0.76k 137
ALL 87% 0.37k 64
Table 9: Inter-annotator agreement for attitude holder categories
23
Categories SW, A1, A1A0/1, A1A2 and A1A3 are reliably identifiable with kappa scores between 0.73
and 0.81. The smaller actor subjectivity categories A2A1/0, A1A0, A2A0, and A3A0 and the category
ALL are not reliably identifiable with kappa scores lower than 0.60.
5.2.4 Single Polarity
One single polarity value for each item is derived by collapsing all attitude holder polarity values into one
single value. If an item is tagged with different polarity values, SW subjectivity is considered as the most
important one, and precedes over the other categories. As can be seen from table (10), observed agree-
ment is 86%, 87% and 83% with kappa agreement of 0.80, 0.80 and 0.74 for adjectives, nouns and verbs,
respectively. Separate polarity computation (positive, negative, neutral and posneg) – with one polarity
value of interest and the other values combined into one non-relevant category - shows that all polarity
values, except for posneg, are reliably identifiable.
A N V
perc. kappa perc. kappa perc. kappa
single polarity 86% 0.80k 87% 0.80k 83% 0.74k
negative 95% 0.90k 93% 0.84k 91% 0.83k
positive 91% 0.81k 92% 0.75k 88% 0.69k
neutral 92% 0.76k 90% 0.79k 86% 0.71k
posneg 94% 0.38k
Table 10: agreement rates for polarity categories
5.3 Disagreement Analysis
The results differ considerably depending on the parts-of-speech and they decline as the annotation
schema becomes more complex (cf. table 7). Overall kappa scores for adjectives (0.73) are quite good
and the scores for verbs and nouns are reasonable considering the complexity of their schemas. Moreover,
we can see in table (8) that most semantic classes that are important for subjectivity analysis, also have
24
rather high percentage agreement scores. In general, relevant semantic classes seem to perform better than
less relevant classes, like Action and Process-state.
Likewise, results differ depending on the type of attitude holder (cf. table 9). SW subjectivity and ‘main’
actor (A1) subjectivity are reliably identifiable. When the attitude is attributed to a second or third actor
(category A2A1/0 for nouns and categories A1A0, A2A0 and A3A0 for verbs) the annotations are less
reliable. Confusion is found also between between verb annotations A2A0 and A1A2. For example, with
respect to misleiden (mislead), annotators agree about a negative attitude from A1 towards A2, but one
annotator marks additionally a negative attitude on behalf of A2 (A2A0: negative) whereas the other does
not. Similar cases are found with nouns. However, since the low scoring categories are rather small, we
think that the overall results for attitude-holdership are quite useful and reliable.
The category ALL seems not to be defined well. Examples of disagreements of this kind are ontwateren
(drain), droogte (dryness/waterlesness) , breuk (fraction, break)). Annotators do not agree about
whether some general positive or negative feelings are associated to them or not because the nega-
tive/positive implication depends on the context. Even though there may agreement on the most common
context, the fact that it can vary leads to differences in the annotation.
Single polarity performs quite good across all word classes (cf. table 10). It shows that a complicated and
layered annotation does not hamper overall agreement and may also produce lexicons for applications that
use single polarity only. The different performance for negative polarity on the one hand and positive and
neutral on the other, are possibly due to the linguistically unmarkedness of the neutral and positive forms
(e.g. honest, understand), whereas the category ‘negative’ includes many items that are morphologically
recognizable by a negativity marker (e.g. misunderstand, unreliable, dishonest). In fact, Kim and Hovy
(2004) reported that inter-annotator agreement of their annotations increases from 76% to 89% when
positive and neutral categories are merged.
A closer look at the data shows that disagreement also occurs when collocational information may lead
one annotator to see subjectivity in a sense and the other not. For example, houden (keep - conform one’s
action or practice to) associated with collocations like to keep appointments and to keep one’s promises is
25
considered positive (A1A2) by one annotator and neutral by the other. This seems to apply in particular
to all frequent light verbs with little semantic content like make, do and take.
Summarizing, we conclude that overall agreement is good, especially with regard to most semantic cate-
gories relevant for subjectivity analysis and with respect to the most important attitude holder categories.
When defining an operational model the small and low scoring categories will be collapsed into one un-
derspecified attitude holder category.
5.4 Different Lexicon dimensions
In this section we explore the correlations between the human annotations and the different lexicon di-
mensions: frequency, polysemy and synset-size. Table 11 gives inter-annotator percentage agreement
rates for the different word classes nouns (N), verbs (V) and adjectives (A) with regard to their classifica-
tion into the low, mid and high values of the different dimensions. For example, from column 2 (fre-
quency-NVA) can be seen that, with respect to all three word classes together, inter-annotator agreement
increases as the frequency of the item decreases (from 73% for items with high frequency to 75% for
items with low frequency). However, this is not a continuous increase as mid-frequent items have an
agreement score of 77%. The scores imply that highly frequent items are harder to identify with regard to
the annotations of the current study than less frequent items. Likewise, from column 6 (polysemy-NVA)
can be seen that agreement decreases when polysemy increases, and from column 10 (synset-size-NVA)
can be seen that agreement increases when synset-size increases.
These upward and downward trends (cf. row 6 – column 2, 6 and 10) of all items regardless of their part-
of-speech, are in line with our expectations. Agreement decreases as polysemy increases, i.e. the usually
more or less related meanings of polysemous words may be hard to separate from each other and lead to
ambiguous annotations. Polysemy and frequency are often related as frequent items are often polyse-
mous. Likewise, we see a negative correlation between agreement rates and frequency: the more frequent
the items are, the harder they are to annotate in a reliable manner.
26
frequency polysemy synset-size
ANV A N V ANV A N V ANV A N V
high 73% 72% 82% 64% 73% 78% 80% 61% 77% 85% 77% 67%
mid 77% 84% 77% 70% 76% 79% 80% 66% 73% 76% 78% 65%
low 75% 80% 75% 64% 76% 80% 75% 70% 74% 75% 77% 66%
trend +2% +8% -7% 0% +3% +2% -5% +9% -3% -10% 0% -1%
Table 11: Correlation of annotation results and lexicon dimensions
With regard to large synset membership we expect that agreement increases as synset-size increases:
items that are members of large synsets are easier to annotate than items that are not. This is explained by
the observation (Maks and Vossen (2010)) that subjective items tend to cluster in large synsets. These
‘really’ subjective items are characterised by a lack of denotation – for which they are easily grouped
together - and a high degree of connotational polarity – for which they are easy to annotate.
Whereas adjectives and verbs follow the expected trends, the results with regard to nouns differ consid-
erably from our expectations. Frequency and polysemy scores for nouns show downward trends where
upward ones are expected. Moreover, as in all cases the mid-frequent items unexpectedly perform better
than the low-frequent items, there seems to be no obvious correlation between lexicon dimension and
human annotation scores.
6 Conclusions
In this paper we presented a lexicon model for the description of verbs to be used in applications like
deeper sentiment analysis and opinion mining, describing the detailed and subtle subjectivity relations
that exist between the different participants of a verb, noun or adjective. The relations can be labeled with
subjectivity information concerning the identity of the attitude holder, the orientation (positive vs. nega-
tive) of the attitude and its target. Special attention is paid to the role of the speaker/writer of the event
whose perspective is expressed and whose views on what is happening are conveyed in the text.
27
We measured the reliability of the annotation. The results show that speaker subjectivity and, to a certain
extent, actor subjectivity can be reliably identified. As the unreliable categories have a relatively small
number of members, we think that the annotation schema is sufficiently validated. In an operational
model of the lexicon the small low scoring categories, including ALL, will be merged into one category.
An additional outcome to our study is that we created a gold standard of approximately 600 items for each
part-of-speech. In the future we will use this gold standard to test methods for the automatic detection of
subjectivity and polarity properties of word senses in order to build a rich subjectivity lexicon for Dutch.
7 Acknowledgments
This research has been carried out within the project From Text To Political Positions (http: //www2.let.vu.nl/oz/cltl/t2pp/). It is funded by the VUA Interfaculty Research Institute CAMeRA.
8 References
Andreevskaia, A. and S. Bergler (2006) Mining WordNet for Fuzzy Sentiment:Sentiment Tag Extraction from-
WordNet Glosses. In: EACL-2006, Trento, Italy.
Chen, L. (2005) Transitivity in Media Texts: negative verbal process sub-functions and narrator bias. In Interna-
tional Review of Applied Linguistics in Teaching, (IRAL-vol. 43) Mouton De Gruyter, The Hague, The Nether-
lands.
Cerini, S., Compagnoni, V., Demontis, A., Formentelli, M., and Gandini, G. (2007). Language resources and lin-
guistic theory: Typology, second language acquisition, English linguistics (Forthcoming), chapter Micro-WNOp:
A gold standard for the evaluation of automatically compiled lexical resources for opinion mining. Milano, Italy.
Choi Y. and C. Cardie (2008). Learning with Composi
tional Semantics as Structural Inference for subsentential Sentiment Analysis. Proceedings of Recent Advances
in Natural Language Processing (RANLP), Hawaii.
Duan W., Bin Gu, and A. Whinston (2008). An empirical investigation of panel data. In Decision Support Systems,
Volume 45-2008. Elsevier B.V.
Esuli, Andrea and Fabrizio Sebastiani. (2006). SentiWordNet: A Publicly Available Lexical Resource for Opinion
Mining. In: Proceedings of LREC-2006, Genova, Italy.
Hatzivassiloglou, V., McKeown, K.B. (1997) Predicting the semantic orientation of adjectives. In Proceedings of
ACL-97, Madrid, Spain.
Jia, L., Yu, C.T., Meng, W. (2009) The effect of negation on sentiment analysis and retrieval effectiveness. In
CIKM-2009, China.
28
Kaiquan X., Stephen Shaoy Liao, Jiexun Li and Yukia Song (2011) . Do online reviews matter? - An empirical
investigation of panel data. In Decision Support Systems Volume 51-2011.
Kamps, J., R. J. Mokken, M. Marx, and M. de Rijke (2004). Using WordNet to measure semantic orientation of
adjectives. In Proceedings LREC-2004, Paris.
Kim, S. and E. Hovy (2004) Determining the sentiment of opinions. In Proceedings of COLING, Geneva, Swtizer-
land.
Kim, S. and E. Hovy (2006) Extracting Opinions Expressed in Online News Media Text with Opinion Holders and
Topics. In: Proceedings of the Workshop on Sentiment and Subjectivity in Text (SST-06). Sydney, Australia.
Maks, I.and P. Vossen (2010) Modeling Attitude, Polarity and Subjectivity in Wordnet. In Proceedings of Fifth
Global Wordnet Conference, Mumbai, India.
Mathieu, Y. Y. (2005). A Computational Semantic Lexicon of French Verbs of Emotion. In: Computing Attitude
and Affect in Text: Theory and Applications J. Shanahan, Yan Qu, J.Wiebe (Eds.). Springer, Dordrecht, The
Netherlands.
Mathieu,Y.Y. and C. Felbaum (2010). Verbs of emotion in French and English. In: Proceedings of GWC-2010,
Mumbai, India.
Moilanen K. and S. Pulman. (2007). Sentiment Composition. In Proceedings of Recent Advances in Natural Lan-
guage Processing (RANLP), Bulgaria.
Ruppenhofer, J. , M. Ellsworth, M. Petruck, C. Johnson, and J. Scheffzcyk (2010) Framenet II: Theory and Practice
(e-book) http://framenet.icsi. berkeley.edu/ book/book.pdf.
C. Strapparava and A. Valitutti (2004). WordNet-Affect: an affective extension of WordNet. In Proceedings LREC
2004, Lisbon, Portugal
Pit. M. (2003). How to express yourself with a causal connective. Dissertation, University Utrecht. Rodopi, Amster-
dam.
Somasundaran S. and Janyce Wiebe, (2010), Recognizing Stances in Ideological On-line Debates. , In Proceedings
of the NAACL HLT 2010 Workshop on Computational Approaches to Analysis and Generation of Emotion in
Text, Los Angeles, CA. Association for Computational Linguistics, 2010
Su, F.and K. Markert (2008). Eliciting Subjectivity and Polarity Judgements on Word Senses. In Proceedings of
Coling-2008, Manchester, UK.
Valitutti, A. and C. Strapparava (2010). Interfacing Wordnet-Affect withj OCC model of emotions. In Proceedings
of EMOTION-2010, Valletta, Malta.
Wiebe J., Theresa Wilson , Rebecca Bruce , Matthew Bell , and Melanie Martin (2004). Learning subjective lan-
guage. Computational Linguistics 30 (3).
Wiebe, Janyce and Rada Micalcea.(2006) . Word Sense and Subjectivity. In Proceedings of ACL’06, Sydney, Aus-
tralia.
.