Towards Cross-Language Sentiment Analysis through Universal Star Ratings

KMO 2012

Malissa BalErasmus University Rotterdam

malissa.bal@xs4all.nl

Flavius FrasincarErasmus University Rotterdam

frasincar@ese.eur.nl

Daniella BalErasmus University Rotterdam

daniella.bal@xs4all.nl

Alexander HogenboomErasmus University Rotterdam

hogenboom@ese.eur.nl

July 12, 2012

Introduction (1)

• The Web offers an overwhelming amount of multi-lingual textual data, containing traces of sentiment

• Information monitoring tools for tracking sentiment are crucial for today’s businesses

KMO 2012

Introduction (2)

• Language-specific sentiment analysis typically produces sentiment scores

• Sentiment scores are incomparable across languages as they originate from language-specific phenomena

• Star ratings are universal classifications of people’s intended sentiment and thus enable comparison

• As star ratings are not always available, we need a mapping from language-specific scores of sentiment conveyed by words to universal star ratings

KMO 2012

Cross-Language Sentiment Analysis (1)

• Sentiment analysis is typically focused on determining the polarity of natural language text

• Applications exist in summarizing reviews, determining a general mood (consumer confidence, politics)

• State-of-the-art approaches classify polarity of natural language text by analyzing vector representations using, e.g., machine learning techniques

• Alternative approaches are lexicon-based, which renders them robust across domains and texts and enables linguistic analysis at a deeper level

• Challenge: other languages require other approaches

KMO 2012

Cross-Language Sentiment Analysis (2)

• Typical solutions:– Machine translation into reference language– Lexicon creation for new languages– Construction of new sentiment analysis frameworks

• Most approaches yield incomparable sentiment scores• Sentiment-carrying text is an artifact of the use of

language rather than a culture-free analytical tool• Capturing the relation between the use of language

and universal star ratings reflecting intended sentiment is crucial for cross-language sentiment analysis

KMO 2012

From Sentiment Scores to Star Ratings (1)

• Assumptions:– Star ratings are universal classifications of intended sentiment– Classes are defined on an ordinal scale– Higher language-specific sentiment scores are associated

with higher star ratings

• We model the relation between sentiment scores and star ratings as a monotonically increasing step function

• We thus construct language-specific sentiment maps, where subsets of texts with (largely) similar star ratings are separated by boundaries

KMO 2012

From Sentiment Scores to Star Ratings (2)

• The challenge is to find optimal set of boundaries, minimizing the total number of misclassifications

• Boundaries must be non-overlapping and ordered• Non-trivial optimization can be performed by means of:

– Greedy algorithms– Machine learning– Brute force

KMO 2012

Evaluation (1)

• Exploration of how sentiment scores can be converted into universal star ratings and how such mappings differ across languages (10-fold cross-validated)

• Data set of short movie reviews in Dutch (1,759) and English (46,315), rated by their authors, with normally distributed ratings over five star classes

• Sentiment analysis by means of an existing lexicon-based multi-lingual framework, yielding language-specific sentiment scores in the interval [-1.5, 1.5]

• Paired observations of sentiment scores and star ratings can be used to construct sentiment maps by means of a brute force approach (step size 0.1)

KMO 2012

Evaluation (2)

KMO 2012

Evaluation (3)

• Sentiment map for Dutch:

• Sentiment map for English:

• Dutch sentiment map yields a star rating classification accuracy for Dutch documents of about 20%

• Star rating classification accuracy for English documents equals about 40% when using the English sentiment map

KMO 2012

Conclusions

• We propose to compare the sentiment conveyed by words in unstructured text across languages through universal star ratings for intended sentiment

• The way in which words in natural language text reveal people's intended sentiment differs across languages

• Language-specific sentiment scores can separate universal classes of intended sentiment from one another only to a limited extent

KMO 2012

Future Work

• Allow for a non-linear relation between sentiment scores and star ratings

• Drop the monotonicity constraint• Assess other proxies for star ratings, e.g., emoticons

or rhetorical structure

KMO 2012

Questions?

Alexander HogenboomErasmus School of EconomicsErasmus University RotterdamP.O. Box 1738, NL-3000 DRRotterdam, the Netherlands

hogenboom@ese.eur.nl

KMO 2012

Towards Cross-Language Sentiment Analysis through Universal Star Ratings

Documents

Negative Sentiment (or "Sentiment Analysis is Sh*te")

Universal Joints - Springfix Linkages · 1 Technical Information Universal Joints Torque Ratings for Plain Bearing Universal Joints 12 10 8 6 5 4 3 1 0,6 0,5 0,4 0,8 0,3 0,2 0,04

UNIVERSAL STUDENT RATINGS OF INSTRUCTION (USRI) Course Evaluation Information Session Oct 2014 Jean Gomes, Office of Institutional Analysis

Sentiment Analysis Introduction to - unipi.itdidawiki.di.unipi.it/.../mds/txa/introduction_to_sentiment_analysis.pdf · Sentiment Analysis is not a dataset. Sentiment Analysis is

Tree Communication Models for Sentiment Analysis · sentiment analysis over Stanford Sentiment Treebank, which allows the sentiment signals over hierarchical phrase structures to

Harnessing Ratings and Aspect-Sentiment to Estimate ...Introduction Overview Detection and intensity of contradiction Experimental evaluation Conclusion Overview Contradiction and

Sentiment Analysis in Social Streamsarantxa.ii.uam.es/~cantador/doc/2015/epps-socialstreams15.pdf · reviews and ratings in blogs and recommender systems, ... emotions, and tastes

L'amor a l'antiga Roma - ddd.uab.cat · 3 Introducció L’amor és un sentiment universal que moltes cultures han plasmat en textos o altres formes d’art. Un sentiment que ha provocat

BUSINESS » INDUSTRY - careratings.com price... · This sentiment is backed by CARE Ratings and Anand Rathi Shares and Stock Brokers Ltd. in their respective reports. Last week, in

A Joint Model of Text and Aspect Ratings for Sentiment Summarization Ivan Titov (University of Illinois) Ryan McDonald (Google Inc.) ACL 2008

Unpaired Sentiment-to-Sentiment Translation: A Cycled … · 2018-10-10 · Unpaired Sentiment -to Sentiment Translation: A Cycled Reinforcement Learning Approach 15 07 2018 4 of

apr 2014maps sentiment 0 o O O o o 00 0 0 00 sentiment opage maps sentiment opage maps maps sentiment sentiment ©page maps sentiment opage maps maps i sent sentiment

Introduction to Sentiment Analysis - ETH Z · sentiment ! Sentiment analysis is also known as opinion mining L Sanders 3 What is Sentiment Analysis Sentiment analysis is the operation

Fitch Ratings: Definitions of Ratings

Sentiment Analysis & Opinion Mining€¦ · Sentiment Analysis Sentiment Classification System Experimente Perspektiven * Abbildung dem Sinn nach entnommen aus Heyer (2006: 5). Sentiment

Fitch Ratings FI Ratings Methodology

ML for IR: Sentiment Analysis and Multi-label Categorization · dataset positive negative IMDB 25,000 reviews with ratings 7-10 25,000 reviews with ratings 1-4 Amazon Baby Product

Towards Cross-Language Sentiment Analysis through Universal Star Ratings KMO 2012 Malissa Bal Erasmus University Rotterdam malissa.bal@xs4all.nl Flavius

Geo-spatial Multimedia Sentiment Analysis in Disasters · sentiment analysis (a.k.a. sentiment classification). Sentiment analysis has been widely used and applied in various studies

Sentiment Analysis and Opinion Miningliub/FBS/Sentiment...Sentiment Analysis and Opinion Mining 7 CHAPTER 1 Sentiment Analysis: A Fascinating Problem Sentiment analysis, also called