1 CSC 594 Topics in AI – Text Mining and Analytics Fall 2015/16 10. Sentiment Analysis


Sentiment Analysis

• Sentiment analysis extracts and identifies the polarity of sentiments expressed in text.

• It has recently been widely applied to reviews, opinion pieces, and social media text.

• Conducting sentiment analysis poses many challenges, e.g.:

1. Judging sentiment (its existence and its degree/granularity) is not clear-cut.

2. Sentiments depend on domain and context (e.g. “addictive” can be praise in one domain and criticism in another).

3. Sentences contain negations (“not”, “no”, “__n’t”, etc.).

4. Sentences contain comparatives (“A is better than B, but still has problems”).

5. User-written text contains spelling errors, irregular typography (e.g. emoticons), and ungrammatical sentences.

6. Words and expressions that imply sentiment can be subtle (sentiment lexicon).

7. Multiple sentiments can be expressed in one sentence or document.

8. Sarcasm is possible.
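Challenge 3 (negation) is often handled at the feature level. The sketch below illustrates one common heuristic, which is not taken from these slides: every token after a negation word is prefixed with `NOT_` until the next punctuation mark, so that “like” and “NOT_like” become different features. The function name `mark_negation` and the word list `NEGATIONS` are assumptions made for illustration.

```python
import re

# Hypothetical, deliberately small list of negation words.
NEGATIONS = {"not", "no", "never", "cannot"}
CLAUSE_END = {".", ",", ";", ":", "!", "?"}


def mark_negation(text):
    """Prefix tokens that follow a negation with NOT_ until the clause ends."""
    # Normalize curly apostrophes so that "didn’t" and "didn't" match alike.
    text = text.lower().replace("\u2019", "'")
    tokens = re.findall(r"[a-z]+(?:'[a-z]+)?|[.,;:!?]", text)
    marked, negated = [], False
    for tok in tokens:
        if tok in CLAUSE_END:
            negated = False          # punctuation closes the negation scope
            marked.append(tok)
        elif tok in NEGATIONS or tok.endswith("n't"):
            negated = True           # open a negation scope
            marked.append(tok)
        else:
            marked.append("NOT_" + tok if negated else tok)
    return marked


print(mark_negation("I didn't like the plot, but the acting was good."))
```

Ending the scope at punctuation is a crude approximation of the true scope of a negation; it does nothing for comparatives or sarcasm.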


Sentiment Analysis Tasks (1)

Supervised:

• Classify documents into sentiment categories (positive, negative, neutral, etc.).

• Goals/End Products:

– Predictive models for sentiment categorization

– “Important/relevant features” that determine the sentiments: look at the features that are weighted most heavily in the resulting model.

• Text Pre-processing:

– Standard pre-processing: stemming/lemmatizing, removing stop words

– Part-of-speech tagging: often focus on adjectives and nouns

– Term weighting

– N-grams or noun groups/phrases: a unigram is too small a unit

• Common techniques (in machine learning):

– Typical classification algorithms, such as SVM, Decision Tree, KNN

– Naïve Bayes (as with general text classification)
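The supervised setup above can be sketched with a small Naïve Bayes classifier over unigram and bigram features. This is a minimal illustration using only the Python standard library, not the course's implementation: the class `NaiveBayes`, the helper `features`, the add-one smoothing, and the four training sentences are all assumptions made for the example. The `weight` method shows one way to inspect which features the model weights most heavily.

```python
import math
from collections import Counter, defaultdict


def features(text):
    """Return lowercased unigrams plus adjacent-word bigrams."""
    words = text.lower().split()
    return words + [a + "_" + b for a, b in zip(words, words[1:])]


class NaiveBayes:
    """Multinomial Naive Bayes with add-one (Laplace) smoothing."""

    def fit(self, docs, labels):
        self.class_counts = Counter(labels)
        self.feat_counts = defaultdict(Counter)
        for doc, label in zip(docs, labels):
            self.feat_counts[label].update(features(doc))
        self.vocab = {f for counts in self.feat_counts.values() for f in counts}
        return self

    def log_prob(self, feat, label):
        """Smoothed log P(feat | label)."""
        counts = self.feat_counts[label]
        return math.log((counts[feat] + 1) / (sum(counts.values()) + len(self.vocab)))

    def predict(self, doc):
        total = sum(self.class_counts.values())
        scores = {}
        for label, n in self.class_counts.items():
            # Log prior plus the log likelihood of every feature seen in training.
            scores[label] = math.log(n / total) + sum(
                self.log_prob(f, label) for f in features(doc) if f in self.vocab
            )
        return max(scores, key=scores.get)

    def weight(self, feat, label, other):
        """Log-odds of feat for label versus other; higher favors label."""
        return self.log_prob(feat, label) - self.log_prob(feat, other)


# Toy training data, invented for illustration only.
train_docs = [
    "great graphics and fun gameplay",
    "awesome story , great value",
    "boring levels and bad controls",
    "bad graphics , not fun",
]
train_labels = ["pos", "pos", "neg", "neg"]

model = NaiveBayes().fit(train_docs, train_labels)
print(model.predict("great fun"))
print(model.weight("great", "pos", "neg"))
```

Features unseen in training are skipped at prediction time, which is a simplification; a real pipeline would also apply the pre-processing steps listed above (stop-word removal, stemming, term weighting).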


Sentiment Analysis Tasks (2)

Unsupervised:

• The typical goal is to mine opinions about features/aspects.

– Example: product features (e.g. “awesome graphics”)

– Features/aspects are often pre-defined (for specific domains).

– Sometimes (pre-defined) sentiment lexicons are also used.

– However, features or a sentiment lexicon can also be identified automatically.

• Text Pre-processing:

– Standard pre-processing, POS tagging, and possibly n-grams (or noun groups) are applied.

– Processing is done at the sentence level, to get a narrower context.

– Deeper NLP is often applied to extract more precise and accurate results.

• Common techniques:

– Word association/collocations: PMI, likelihood

– Clustering: to obtain the general topics of the opinions in a corpus
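As a sketch of the word-association technique, pointwise mutual information can be estimated from sentence-level co-occurrence: PMI(x, y) = log2( P(x, y) / (P(x) P(y)) ), where each probability is the fraction of sentences containing the word or the pair. The function name `pmi_table` and the four example sentences are assumptions made for illustration.

```python
import math
from collections import Counter
from itertools import combinations


def pmi_table(sentences):
    """Return {(word_a, word_b): PMI} for word pairs that share a sentence."""
    n = len(sentences)
    word_df = Counter()   # number of sentences containing each word
    pair_df = Counter()   # number of sentences containing each word pair
    for sentence in sentences:
        words = sorted(set(sentence.lower().split()))
        word_df.update(words)
        pair_df.update(combinations(words, 2))  # pairs are alphabetically ordered
    return {
        (a, b): math.log2((count / n) / ((word_df[a] / n) * (word_df[b] / n)))
        for (a, b), count in pair_df.items()
    }


# Toy sentences, invented for illustration only.
sentences = [
    "awesome graphics",
    "awesome graphics overall",
    "terrible battery",
    "awesome battery",
]
pmi = pmi_table(sentences)
print(pmi[("awesome", "graphics")])
print(pmi[("awesome", "battery")])
```

In this toy corpus “awesome” is more strongly associated with “graphics” than with “battery”. PMI tends to overrate rare pairs, so a minimum co-occurrence count is usually advisable on real data.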


• Sentiment Lexicon for English (around 6800 words) – from (Hu and Liu, KDD-2004), https://www.cs.uic.edu/~liub/FBS/sentiment-analysis.html
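A lexicon of this kind can be used for simple polarity scoring by counting positive and negative words. The sketch below is illustrative only: the two small word sets are hypothetical stand-ins, not entries taken from the Hu and Liu lists, and the names `lexicon_score` and `polarity` are assumptions.

```python
import re

# Hypothetical stand-in word sets; a real run would load the full lexicon.
POSITIVE = {"good", "great", "awesome", "love"}
NEGATIVE = {"bad", "terrible", "boring", "hate"}


def lexicon_score(text):
    """Count of positive words minus count of negative words."""
    words = re.findall(r"[a-z']+", text.lower())
    pos = sum(w in POSITIVE for w in words)
    neg = sum(w in NEGATIVE for w in words)
    return pos - neg


def polarity(text):
    """Map the score to a coarse sentiment category."""
    score = lexicon_score(text)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


print(polarity("Awesome graphics, great story"))
print(polarity("boring and bad"))
```

Plain counting ignores negation, comparatives, and sarcasm, so it serves as a baseline rather than a complete method.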
