SentiStrength: Sentiment Strength Detection in MySpace and Twitter
Mike Thelwall
Statistical Cybermetrics Research Group
University of Wolverhampton, UK
Virtual Knowledge Studio (VKS)
Information Studies
SentiStrength Objective
1. Detect positive and negative sentiment strength in short informal text
   1. Develop workarounds for the lack of standard grammar and spelling
   2. Harness emotion expression forms unique to MySpace or CMC (computer-mediated communication), e.g., :-) or haaappppyyy!!!
   3. Classify each text simultaneously for positive (1 to 5) AND negative (1 to 5) sentiment
2. Apply to MySpace comments and social issues
SentiStrength Algorithm - Core
A list of 890 positive and negative sentiment terms, each with a strength from 1 to 5
  Negative, e.g.: ache = -2, dislike = -3, hate = -4, excruciating = -5
  Positive, e.g.: encourage = 2, coolest = 3, lover = 4
The sentiment strength of a sentence is that of its strongest term; for a text with multiple sentences, it is that of the strongest sentence
Examples (scores given as positive, negative)
Text                              Matched terms                 Score
My legs ache.                     ache = -2                     1, -2
You are the coolest.              coolest = 3                   3, -1
I hate Paul but encourage him.    hate = -4, encourage = 2      2, -4
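The core lookup can be sketched in a few lines of Python. This is a minimal illustration, not the SentiStrength implementation: the lexicon holds only the seven example terms from the slide, the function names are hypothetical, and the tokenisation and sentence splitting are simplifying assumptions.

```python
import re

# Only the example terms shown on the slide; the real list has 890 entries.
LEXICON = {
    "ache": -2, "dislike": -3, "hate": -4, "excruciating": -5,
    "encourage": 2, "coolest": 3, "lover": 4,
}

def sentence_strength(sentence, lexicon=LEXICON):
    """Return (positive, negative) strength of one sentence.

    Each scale defaults to its weakest value (1 or -1) and is replaced
    by the strongest matching term of that polarity.
    """
    words = re.findall(r"[a-z']+", sentence.lower())  # assumed tokeniser
    scores = [lexicon.get(w, 0) for w in words]
    positive = max([1] + [s for s in scores if s > 0])
    negative = min([-1] + [s for s in scores if s < 0])
    return positive, negative

def text_strength(text, lexicon=LEXICON):
    """Return the strongest (positive, negative) pair over all sentences."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    results = [sentence_strength(s, lexicon) for s in sentences]
    positive = max([1] + [p for p, _ in results])
    negative = min([-1] + [n for _, n in results])
    return positive, negative

for example in ("My legs ache.", "You are the coolest.",
                "I hate Paul but encourage him."):
    print(example, text_strength(example))
```

With the slide's term strengths, the three example sentences come out as (1, -2), (3, -1) and (2, -4).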
Term Strength Optimisation
Term strengths (e.g., ache = -2) are initially fixed by a human coder
Term strengths are then optimised on a training set, with 10-fold cross-validation
  Adjust term strengths to give the best training set results, then evaluate on the test set
  E.g., training set text “My legs ache” with coder sentiment = 1, -3 => adjust the strength of “ache” from -2 to -3
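One way to picture the adjustment step is a simple hill climb over term strengths. The sketch below is an assumption about how such a search could look, not the published procedure: it moves one term at a time by one point, keeps a change only if training accuracy improves, and leaves out the 10-fold cross-validation. The names classify, accuracy and optimise are hypothetical.

```python
def classify(text, lexicon):
    """Return (positive, negative) as the strongest term on each scale."""
    words = [w.strip(".,!?").lower() for w in text.split()]
    scores = [lexicon.get(w, 0) for w in words]
    return max([1] + scores), min([-1] + scores)

def accuracy(lexicon, training):
    """Fraction of texts whose predicted pair equals the coder's pair."""
    hits = sum(classify(text, lexicon) == gold for text, gold in training)
    return hits / len(training)

def optimise(lexicon, training):
    """Greedy search: nudge each term by +/-1 while accuracy improves."""
    lexicon = dict(lexicon)  # work on a copy
    improved = True
    while improved:
        improved = False
        for term, strength in list(lexicon.items()):
            sign = 1 if strength > 0 else -1
            best = accuracy(lexicon, training)
            for magnitude in (abs(strength) - 1, abs(strength) + 1):
                if not 1 <= magnitude <= 5:
                    continue  # stay on the 1 to 5 scale
                lexicon[term] = sign * magnitude
                if accuracy(lexicon, training) > best:
                    improved = True
                    break  # keep the better strength
                lexicon[term] = strength  # no gain: revert
    return lexicon

training = [("My legs ache", (1, -3))]  # coder judged the text as 1, -3
print(optimise({"ache": -2}, training))
```

On the slide's example, the only change that makes the prediction match the coder is moving "ache" from -2 to -3, so that is the value the search keeps.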
SentiStrength Algorithm - Extra
Spelling correction for repeated letters: Helllllo -> Hello (emphasis: llll)
Tagging approach used (see next slide)
Extra heuristics
  Emphasis acts to enhance + or - emotion
  Emotion words are ignored in questions
  Take the strongest positive or negative expression in the whole comment
  Booster words (e.g., very, some)
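The repeated-letter correction can be illustrated with a small Python sketch. It is an assumption about one workable approach, not the tool's actual code: every run of three or more identical characters is tried as a single or a double letter, and the first spelling found in a word list wins. VOCAB is a hypothetical three-word stand-in for a real dictionary, and correct_repeats is an illustrative name.

```python
import itertools
import re

VOCAB = {"hello", "hi", "happy"}  # hypothetical stand-in for a dictionary

def correct_repeats(word, vocab=VOCAB):
    """Return (corrected_word, emphasised).

    emphasised is True when the word contained a run of 3+ identical
    characters, which the slide treats as a sign of emphasis.
    """
    # Split the word into runs of identical characters, e.g. H|e|lllll|o.
    runs = [m.group(0) for m in re.finditer(r"(.)\1*", word)]
    emphasised = any(len(run) >= 3 for run in runs)
    # A long run may stand for one letter or two; short runs stay as-is.
    options = [[run[0], run[0] * 2] if len(run) >= 3 else [run]
               for run in runs]
    for parts in itertools.product(*options):
        candidate = "".join(parts)
        if candidate.lower() in vocab:
            return candidate, emphasised
    # No dictionary match: collapse each long run to a single character.
    fallback = "".join(run[0] if len(run) >= 3 else run for run in runs)
    return fallback, emphasised

for word in ("Helllllo", "HIIIIII", "haaappppyyy", "MATE"):
    print(word, correct_repeats(word))
```

Trying both the single and the double letter is what lets "Helllllo" resolve to "Hello" rather than "Helo"; the returned flag is the hook a caller could use to boost the strength of an emphasised word.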
Tagging
Input: HIIIIII MY MATE!!!!!!!!
Tagged: <w equiv="HI" em="IIIII">HIIIIII</w><w>MY</w><w>MATE</w><p equiv="!" em="!!!!!!!">!!!!!!!!</p>
Equivalent text: HI MY MATE!
Scores shown on the slide: 2 3
mate = 2
Overall: 3, -1
Experiments
Development data = 2600 MySpace comments coded by 1 coder
Test data = 1041 MySpace comments coded by 3 independent coders
Comparison against a range of standard machine learning algorithms
Inter-coder agreement
Comparison       +ve agreement   -ve agreement
Coder 1 vs. 2    51.0%           67.3%
Coder 1 vs. 3    55.7%           76.3%
Coder 2 vs. 3    61.4%           68.2%
Krippendorff’s inter-coder weighted alpha = 0.5743 for positive and 0.5634 for negative sentiment
Only moderate agreement between coders, but it is a hard 5-category task
Machine learning methods +ve
Machine learning methods -ve
Results: +ve sentiment strength
Algorithm                    Opt. feat.  Accuracy  Acc. +/- 1 class  Corr.  Mean % abs. error
SentiStrength                -           60.6%     96.9%             .599   22.0%
Simple logistic regression   700         58.5%     96.1%             .557   23.2%
SVM (SMO)                    800         57.6%     95.4%             .538   24.4%
J48 classification tree      700         55.2%     95.9%             .548   24.7%
JRip rule-based classifier   700         54.3%     96.4%             .476   28.2%
SVM regression (SMO)         100         54.1%     97.3%             .469   28.2%
AdaBoost                     100         53.3%     97.5%             .464   28.5%
Decision table               200         53.3%     96.7%             .431   28.2%
Multilayer Perceptron        100         50.0%     94.1%             .422   30.2%
Naïve Bayes                  100         49.1%     91.4%             .567   27.5%
Baseline                     -           47.3%     94.0%             -      31.2%
Random                       -           19.8%     56.9%             .016   82.5%
Results: -ve sentiment strength
Algorithm                    Opt. feat.  Accuracy  Acc. +/- 1 class  Corr.  Mean % abs. error
SVM (SMO)                    100         73.5%     92.7%             .421   16.5%
SVM regression (SMO)         300         73.2%     91.9%             .363   17.6%
Simple logistic regression   800         72.9%     92.2%             .364   17.3%
SentiStrength                -           72.8%     95.1%             .564   18.3%
Decision table               100         72.7%     92.1%             .346   17.0%
JRip rule-based classifier   500         72.2%     91.5%             .309   17.3%
J48 classification tree      400         71.1%     91.6%             .235   18.8%
Multilayer Perceptron        100         70.1%     92.5%             .346   20.0%
AdaBoost                     100         69.9%     90.6%             -      16.8%
Baseline                     -           69.9%     90.6%             -      16.8%
Naïve Bayes                  200         68.0%     89.8%             .311   27.3%
Random                       -           20.5%     46.0%             .010   157.7%
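For readers who want to reproduce the four measures reported in the results tables, here is a hedged Python sketch. The slides do not define the measures, so the definitions are assumptions: strengths are compared as magnitudes from 1 to 5, correlation is Pearson's r, and mean % absolute error is taken as the average of |predicted - actual| / actual. The function name evaluate is hypothetical.

```python
from math import sqrt

def evaluate(predicted, actual):
    """Compare predicted and coder strengths (magnitudes 1 to 5)."""
    n = len(actual)
    pairs = list(zip(predicted, actual))
    exact = sum(p == a for p, a in pairs) / n
    within_one = sum(abs(p - a) <= 1 for p, a in pairs) / n
    # Assumed definition: error relative to the coder's strength.
    mean_pct_abs_error = 100 * sum(abs(p - a) / a for p, a in pairs) / n
    # Pearson correlation, undefined when either series is constant.
    mean_p = sum(predicted) / n
    mean_a = sum(actual) / n
    cov = sum((p - mean_p) * (a - mean_a) for p, a in pairs)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    var_a = sum((a - mean_a) ** 2 for a in actual)
    corr = cov / sqrt(var_p * var_a) if var_p and var_a else None
    return {
        "accuracy": exact,
        "accuracy_within_one": within_one,
        "correlation": corr,
        "mean_pct_abs_error": mean_pct_abs_error,
    }

print(evaluate([1, 2, 3, 5], [1, 2, 4, 2]))
```

Under these definitions a classifier that always predicts the same class has no defined correlation, which would be consistent with the "-" entries in the Corr. column for the baseline rows.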
SentiStrength Components
Type                                                        %
Consecutive +ve words not used as boosters                  61.2
Emoticons ignored                                           61.2
Negating words not used to switch (e.g., not happy)         61.0
SentiStrength standard configuration                        60.9
Booster words ignored (e.g., very)                          60.7
Automatic spelling correction disabled                      60.6
Exclamation marks not given a strength of 2                 60.6
Extra multiple letters not used as boosters                 60.4
Neutral words with emphasis not counted as +ve              60.1
SentiStrength with all the above changes                    57.5
Example differences/errors
THINK 4 THE ADD Computer (1,-1), Human (2,-1)
0MG 0MG 0MG 0MG 0MG 0MG 0MG 0MG!!!!!!!!!!!!!!!!!!!!N33N3R!!!!!!!!!!!!!!!! Computer (2,-1), Human (5,-1)
Selected variations tested
Modification (for positive sentiment)                                      Accuracy  +/- 1 class  Corr.   Mean abs. % err.
Negating words not used to switch following sentiment (e.g., not happy)    60.87%    97.50%       .6206   21.28%
SentiStrength standard algorithm                                           60.64%    96.90%       .5986   21.96%
Exclamation marks not given a strength of 2                                60.51%    96.62%       .6035   21.47%
Automatic spelling correction disabled                                     60.39%    96.88%       .5961   22.05%
Extra multiple letters not used as emotion boosters                        60.21%    96.81%       .5952   22.16%
Neutral words with emphasis not counted as positive emotion                60.13%    96.79%       .5966   21.90%
SentiStrength with no extras                                               57.44%    96.07%       .6073   21.91%
Application - Evidence of emotion homophily in MySpace
Automatic analysis of sentiment in 2 million comments exchanged between MySpace friends
Correlation of 0.227 for +ve emotion strength and 0.254 for -ve
People tend to use similar but not identical levels of emotion to their friends in messages
CYBEREMOTIONS = data gathering + complex systems methods + ICT outputs
Collective Emotions in Cyberspace
Sentistrength
Application – sentiment in Twitter events
Analysis of a corpus of 1 month of English Twitter posts
Automatic detection of spikes (events)
Sentiment strength classification of all posts
Assessment of whether sentiment strength increases during important events
Result: negative sentiment normally increases, positive sentiment might tend to increase
Automatically-identified Twitter spikes
Chile
Hawaii
#oscars
Tiger Woods
Conclusion
Automatic classification of emotion on a 5 point positive and negative scale seems possible for MySpace…
And other similar short computer text messages?
Hard to get accuracy much over 60%?
Next = analyse emotion in online debates
Publication
Thelwall, M., Buckley, K., Paltoglou, G., Cai, D., & Kappas, A. (in press). Sentiment strength detection in short informal text. Journal of the American Society for Information Science and Technology.
Thelwall, M., Wilkinson, D., & Uppal, S. (2010). Data mining emotion in social network communication: Gender differences in MySpace. Journal of the American Society for Information Science and Technology, 61(1), 190-199.