22
Twitter Sentiment Analysis Twitter Sentiment Analysis Presented by: Loitongbam Gyanendro Singh

Twitter Sentiment Analysis - Indian Institute of ... · Twitter Sentiment Analysis Text length Topic relevance Noisy text Data sparsity Negation Stopwords Tokenization Multilingual

  • Upload
    others

  • View
    15

  • Download
    0

Embed Size (px)

Citation preview

Page 1: Twitter Sentiment Analysis - Indian Institute of ... · Twitter Sentiment Analysis Text length Topic relevance Noisy text Data sparsity Negation Stopwords Tokenization Multilingual

Twitter Sentiment AnalysisTwitter Sentiment Analysis

Presented by:Loitongbam Gyanendro Singh

Page 2: Twitter Sentiment Analysis - Indian Institute of ... · Twitter Sentiment Analysis Text length Topic relevance Noisy text Data sparsity Negation Stopwords Tokenization Multilingual

What is “Sentiment analysis”?● Study that aims to identify the orientation of opinions in a text

Page 3: Twitter Sentiment Analysis - Indian Institute of ... · Twitter Sentiment Analysis Text length Topic relevance Noisy text Data sparsity Negation Stopwords Tokenization Multilingual

Source ofSentimentInformation

Source of image: http://jameskaskade.com/?p=2336

Page 4: Twitter Sentiment Analysis - Indian Institute of ... · Twitter Sentiment Analysis Text length Topic relevance Noisy text Data sparsity Negation Stopwords Tokenization Multilingual

Advent of various social media platforms➔ Given netizen the liberty to openly express their views and opinions➔ Large volume of data to get these information➔ Knowing “what people think”➔ Studies of SA deals:

◆ Product and services reviews,◆ Celebrities,◆ Government policies,◆ Event,◆ and many more…

Why Sentiment Analysis?

OM: Study the subjectivity of opinionSA: Study the sentiment of opinion

Page 5: Twitter Sentiment Analysis - Indian Institute of ... · Twitter Sentiment Analysis Text length Topic relevance Noisy text Data sparsity Negation Stopwords Tokenization Multilingual

➔ An opinion is quintuple: (Bing Liu, 2012)◆ (ei,aij,sijkl,hk,tl)

➔ Example:◆ The picture quality of my new Nikon V3 camera is great◆ (Nikon V3, picture quality, positive, User, Time)

➔ Where can we find opinionated text?◆ Blogs◆ Microblogs◆ Consumer forum/sites, etc.

Opinion

Page 6: Twitter Sentiment Analysis - Indian Institute of ... · Twitter Sentiment Analysis Text length Topic relevance Noisy text Data sparsity Negation Stopwords Tokenization Multilingual

➔ Microblogs contains a large amount of opinionated text➔ There are many microblogging platforms available

◆ Twitter◆ Tumbler◆ FourSquare◆ Google+◆ LinkedIn

➔ Twitter provides an easy way to access and download published posts

Microblogs

Page 7: Twitter Sentiment Analysis - Indian Institute of ... · Twitter Sentiment Analysis Text length Topic relevance Noisy text Data sparsity Negation Stopwords Tokenization Multilingual

➔ Microblogs contains a large amount of opinionated text➔ There are many microblogging platforms available

◆ Twitter◆ Tumbler◆ FourSquare◆ Google+◆ LinkedIn

➔ Twitter provides an easy way to access and download published posts

Microblogs

Page 8: Twitter Sentiment Analysis - Indian Institute of ... · Twitter Sentiment Analysis Text length Topic relevance Noisy text Data sparsity Negation Stopwords Tokenization Multilingual

➔ Majority of TSA studies deals on building sentiment classifierTwitter Sentiment Analysis

Page 9: Twitter Sentiment Analysis - Indian Institute of ... · Twitter Sentiment Analysis Text length Topic relevance Noisy text Data sparsity Negation Stopwords Tokenization Multilingual

➔ Text length➔ Topic relevance➔ Noisy text➔ Data sparsity➔ Negation➔ Stopwords➔ Tokenization➔ Multilingual content➔ Multimodal content

TSA challenges

Page 10: Twitter Sentiment Analysis - Indian Institute of ... · Twitter Sentiment Analysis Text length Topic relevance Noisy text Data sparsity Negation Stopwords Tokenization Multilingual

➔ Semantic➔ Syntactic➔ Stylistic➔ Twitter specific features

Features Opinion words, Sentiment words,Semantic concepts,Negation, etc.

Page 11: Twitter Sentiment Analysis - Indian Institute of ... · Twitter Sentiment Analysis Text length Topic relevance Noisy text Data sparsity Negation Stopwords Tokenization Multilingual

➔ Semantic➔ Syntactic➔ Stylistic➔ Twitter specific features

Features Unigrams,Bigrams,N-grams,Terms’ frequencies,POS,Dependency tree, etc

Page 12: Twitter Sentiment Analysis - Indian Institute of ... · Twitter Sentiment Analysis Text length Topic relevance Noisy text Data sparsity Negation Stopwords Tokenization Multilingual

➔ Semantic➔ Syntactic➔ Stylistic➔ Twitter specific features

FeaturesEmoticons,Intensifiers,Abbreviations,Slang terms,Punctuation marks, etc.

Page 13: Twitter Sentiment Analysis - Indian Institute of ... · Twitter Sentiment Analysis Text length Topic relevance Noisy text Data sparsity Negation Stopwords Tokenization Multilingual

➔ Tweet➔ User➔ Mention➔ Replies➔ Follower➔ Retweet➔ Hashtag➔ Privacy

Twitter Specific Features

Page 14: Twitter Sentiment Analysis - Indian Institute of ... · Twitter Sentiment Analysis Text length Topic relevance Noisy text Data sparsity Negation Stopwords Tokenization Multilingual

➔ Manual selection

➔ Statistical analysis

➔ Dimensionality reduction

➔ Representation learning

Features Selection

Page 15: Twitter Sentiment Analysis - Indian Institute of ... · Twitter Sentiment Analysis Text length Topic relevance Noisy text Data sparsity Negation Stopwords Tokenization Multilingual

Statistical Approach● Entropy,

○ H(X) = -∑i C∈ [P(xi) * log(P(xi))]

● Strength of Association via Pointwise Mutual Information, ○ PMI(x,S) = log(P(x,S)/{P(x)*P(S)})○ SOA(x,S) = PMI(x,S) - PMI( x,S)⅂

Page 16: Twitter Sentiment Analysis - Indian Institute of ... · Twitter Sentiment Analysis Text length Topic relevance Noisy text Data sparsity Negation Stopwords Tokenization Multilingual

Latent Representation Methods● Eigen Value Decomposition (EVD)

● Singular Value Decomposition (SVD)

● Word Embedding via Word2Vec, etc.

Page 17: Twitter Sentiment Analysis - Indian Institute of ... · Twitter Sentiment Analysis Text length Topic relevance Noisy text Data sparsity Negation Stopwords Tokenization Multilingual

➔ Machine Learning

➔ Lexicon-based

➔ Hybrid-based

Classification Approach

Page 18: Twitter Sentiment Analysis - Indian Institute of ... · Twitter Sentiment Analysis Text length Topic relevance Noisy text Data sparsity Negation Stopwords Tokenization Multilingual

DNN Classification approach

Convolution Neural Networkd = embedding dimensionm = window sizes = max lengthn = no. of filters

Paper Title: Twitter Sentiment Analysis with Deep Convolutional Neural Networks - Aliaksei Severyn and Alessandro Moschitti

Page 19: Twitter Sentiment Analysis - Indian Institute of ... · Twitter Sentiment Analysis Text length Topic relevance Noisy text Data sparsity Negation Stopwords Tokenization Multilingual

➔ Accuracy

➔ Precision

➔ Recall

➔ F-score

Evaluation Metrics

Page 20: Twitter Sentiment Analysis - Indian Institute of ... · Twitter Sentiment Analysis Text length Topic relevance Noisy text Data sparsity Negation Stopwords Tokenization Multilingual

Related fields● Twitter-based Opinion Retrieval

● Tracking Sentiment over Time

● Irony Detection on Tweets

● Emotion Detection on Tweets

● Tweet Sentiment Quantification

Page 21: Twitter Sentiment Analysis - Indian Institute of ... · Twitter Sentiment Analysis Text length Topic relevance Noisy text Data sparsity Negation Stopwords Tokenization Multilingual

References● Like It or Not: A Survey of Twitter Sentiment Analysis Methods

(Authors: Anastasia Giachanou, Fabio Crestani)

● Sentiment Analysis and Opinion Mining(Author: Bing Liu)

● Google

● Twitter

Page 22: Twitter Sentiment Analysis - Indian Institute of ... · Twitter Sentiment Analysis Text length Topic relevance Noisy text Data sparsity Negation Stopwords Tokenization Multilingual

Thank you