TWEETSENSE: RECOMMENDING HASHTAGS FOR ORPHANED TWEETS BY EXPLOITING SOCIAL SIGNALS IN TWITTER
Manikandan VijayakumarArizona State UniversitySchool of Computing, Informatics, and Decision Systems EngineeringMaster’s Thesis Defense – July 7th, 2014
Orphaned Tweets
Source: Twitter
Overview
Twitter
• Twitter is a micro-blogging platform where users can be social, informational, or both
• Twitter is, in essence, also a Web search engine, a real-time news medium, and a way to connect with friends
Image Source: Google
5
Why do people use Twitter?
According to research, people use Twitter for:
• Breaking news
• Content discovery
• Information sharing
• News reporting
• Daily chatter
• Conversations
Source: Deutsche Bank Markets
6
According to Cowen & Co. predictions and reports:
• Twitter had 241 million monthly active users at the end of 2013
• Twitter will reach only 270 million monthly active users by the end of 2014
• Twitter will be overtaken by Instagram, at 288 million monthly active users
Users are not happy with Twitter.
But..
7
Twitter Noise
8
Noise in Twitter: Missing hashtags
9
Noise in Twitter: Users may use incorrect hashtags
10
Noise in Twitter: Users may use too many hashtags
11
Possible Solutions
Importance of using hashtags
• Hashtags provide context or metadata for arcane tweets
• Hashtags organize the information in tweets for retrieval
• They help find the latest trends
• They help reach a larger audience
The missing-hashtag problem: hashtags are supposed to help
12
Importance of Context in Tweet
13
Orphaned Tweets vs. Non-Orphaned Tweets
14
Problem Solved? Not all users use hashtags with their tweets.
• Eva Zangerle et al., 300 million tweets, 2013: 87% without hashtag, 13% with hashtag
• TweetSense dataset, 8 million tweets, 2014: 76% without hashtag, 24% with hashtag
But the problem still exists.
15
Existing Methods
Existing systems address this problem by recommending hashtags based on:
• Collaborative filtering [Kywe et al., SocInfo, Springer 2012]
• Optimization-based graph methods [Feng et al., KDD 2012]
• Neighborhood [Meshary et al., CNS 2013]
• Temporality [Chen et al., VLDB 2013]
• Crowd wisdom [Fang et al., WWW 2013]
• Topic models [Godin et al., WWW 2013]
"On the impact of text similarity functions on hashtag recommendations in microblogging environments", Eva Zangerle, Wolfgang Gassler, Günther Specht. Social Network Analysis and Mining, Springer, December 2013, Volume 3, Issue 4, pp. 889-898
16
Objective
How can we solve the problem of finding missing hashtags for orphaned tweets, providing more accurate suggestions for Twitter users, by using:
• The user's tweet history
• The social graph
• Influential friends
• Temporal information
17
Impact
• Aggregate tweets from users who don't use hashtags, for opinion mining
• Identify context and named-entity problems
• Sentiment evaluation on topics
• Reduce noise in Twitter
• Increase active online users and social engagement
18
Outline
(Chapter 3) Modeling the Problem
(Chapter 4) Ranking Methods
(Chapter 5) Binary Classification
(Chapter 6) Experimental Setup
(Chapter 7) Evaluation
(Chapter 8) Conclusions
Modeling the Problem
Problem Statement: the Hashtag Rectification Problem
What is the probability P(h | T, V) of a hashtag h given tweet T of user V?
An orphaned tweet of user V goes into the system, which recommends hashtags.
21
22
TweetSense
23
Architecture
The user supplies a username and a query tweet; the system retrieves the user's candidate hashtags from their timeline in the Twitter dataset, and a ranking model, trained by a learning algorithm on training data (analogous to the crawler/indexer/learning pipeline of a search engine), returns the top K hashtags (#hashtag 1, #hashtag 2, ..., #hashtag K).
Source: http://en.wikipedia.org/wiki/File:MLR-search-engine-example.png
24
Hypothesis
When a user uses a hashtag:
• she might reuse a hashtag she created before, present in her user timeline
• she may also reuse hashtags she sees in her home timeline (created by the friends she follows)
• she is more likely to reuse hashtags from her most influential friends
• she is more likely to reuse hashtags that are temporally close
A Generative Model for Tweet Hashtags
25
To build a statistical model, we need to model P(tweet-hashtag | tweet social features, tweet content features).
Rather than building a generative model, I go with a discriminative model:
• A discriminative model avoids characterizing the correlations between the tweet features
• It gives the freedom to develop a rich class of social features
I learn the discriminative model using logistic regression.
A Discriminative Model over a Generative Model
26
Retrieving the Candidate Tweet Set
The candidate tweet set for user U is drawn from the user's timeline within the global Twitter data.
Two inputs to my system: the orphaned tweet and the user who posted it.
Feature Selection – Tweet Content Related
Tweet content related features:
• Tweet text
• Temporal information
• Popularity
Feature Selection – User Related
User related features:
• @mentions
• Favorites
• Co-occurrence of hashtags
• Mutual friends
• Mutual followers
• Follower-followee relation
Features are selected based on my hypothesis that users reuse hashtags from their own timeline, from their most influential friends, and that are temporally close.
Ranking Methods
31
Ranking Methods
32
List of Feature Scores
Each feature maps to a score:
• Tweet text → Similarity Score
• Temporal information → Recency Score
• Popularity → Social Trend Score
• @mentions → Attention Score
• Favorites → Favorite Score
• Mutual friends → Mutual Friend Score
• Mutual followers → Mutual Follower Score
• Co-occurrence of hashtags → Common Hashtags Score
• Follower-followee relation → Reciprocal Score
33
Similarity Score
Cosine similarity is the most appropriate similarity measure among the alternatives (Zangerle et al.).
The cosine similarity is computed between query tweet Qi and candidate tweet Tj.
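For illustration, a minimal sketch of cosine similarity between two tweets, assuming plain bag-of-words term-frequency vectors over whitespace tokens (the thesis may use a different tokenization or weighting):

```python
from collections import Counter
import math

def cosine_similarity(query: str, candidate: str) -> float:
    """Cosine similarity between two tweets as term-frequency vectors."""
    q = Counter(query.lower().split())
    c = Counter(candidate.lower().split())
    dot = sum(q[t] * c[t] for t in q)            # only shared terms contribute
    nq = math.sqrt(sum(v * v for v in q.values()))
    nc = math.sqrt(sum(v * v for v in c.values()))
    return dot / (nq * nc) if nq and nc else 0.0
```

Identical tweets score 1.0, tweets with no shared tokens score 0.0.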
34
Recency Score
An exponential decay function computes the recency score of a hashtag, with k = 3 set for a window of 75 hours (qt = input query tweet, Ct = candidate tweet).
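The decay function itself was shown as an image on the slide; a minimal sketch, assuming the score is exp(-k * Δt / window) over the time gap in hours between query and candidate tweet (this exact functional form is an assumption, only k = 3 and the 75-hour window are given):

```python
import math

def recency_score(qt_hours: float, ct_hours: float,
                  k: float = 3.0, window: float = 75.0) -> float:
    """Exponential decay over the time gap between query and candidate tweet.

    Hypothetical reconstruction: the slide only states k = 3 and a
    75-hour window, not the precise formula.
    """
    dt = abs(qt_hours - ct_hours)
    return math.exp(-k * dt / window)
```

A candidate posted at the same time as the query scores 1.0; the score decays smoothly as the gap grows.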
35
Social Trend Score
• Measures the popularity of a hashtag h within the candidate hashtag set H
• Computed with a "one person, one vote" approach: the total count of each frequently used hashtag in Hj is computed
• Max normalization is applied
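A minimal sketch of the "one person, one vote" counting with max normalization described above (the input representation as (user, hashtag) pairs is an assumption for illustration):

```python
from collections import defaultdict

def social_trend_scores(tweets):
    """tweets: iterable of (user_id, hashtag) pairs.

    "One person, one vote": each user contributes at most one count
    per hashtag, then counts are max-normalized to [0, 1].
    """
    voters = defaultdict(set)
    for user, tag in tweets:
        voters[tag].add(user)              # a set deduplicates repeat votes
    counts = {tag: len(users) for tag, users in voters.items()}
    m = max(counts.values(), default=1)
    return {tag: c / m for tag, c in counts.items()}
```

A user tweeting the same hashtag twice still counts as one vote for it.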
36
Attention Score & Favorites Score
• Capture the social signals between users
• Rank users based on recent conversation and favorite activity
• Determine which users are more likely to share topics of common interest
37
Attention Score & Favorites Score – Equation
38
Mutual Friend Score & Mutual Followers Score
• Give the similarity between users
• Mutual friends: people who are friends with both you and the person whose timeline you're viewing
• Mutual followers: people who follow both you and the person whose timeline you're viewing
• Both scores are computed using the well-known Jaccard coefficient
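The Jaccard coefficient used for both scores (and reused for the Common Hashtags Score below) is the ratio of the intersection to the union of the two sets:

```python
def jaccard(set_a, set_b) -> float:
    """Jaccard coefficient: |A ∩ B| / |A ∪ B|."""
    a, b = set(set_a), set(set_b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0
```

For the Mutual Friend Score the sets are the two users' friend lists; for the Mutual Followers Score, their follower lists.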
39
Common Hashtags Score
Ranks users based on the co-occurrence of hashtags in their timelines, again using the Jaccard coefficient.
40
Reciprocal Score
Twitter's follow relation is asymmetric; this score differentiates friends from mere topics of interest such as news channels, celebrities, etc.
41
How to combine the scores?
• Combine all the feature scores into one final score to recommend hashtags
• Model this as a classification problem to learn the weights
• While each hashtag could be treated as its own class, modeling the problem as multi-class classification is challenging because the class labels number in the thousands
• So I model this as a binary classification problem
42
Binary Classification
Problem Setup
• Training dataset: tweet and hashtag pairs <Ti, Hj> — tweets with known hashtags
• Test dataset: tweets without hashtags <Ti, ?> — existing hashtags are removed from the tweets to provide ground truth
Training Dataset
The training dataset is a feature matrix containing the feature scores of all <CTi, CHj> (candidate tweet, candidate hashtag) pairs belonging to each <Ti, Hj> pair.
The class label is 1 if CHj = Hj, and 0 otherwise. Multiple hashtag occurrences are handled as single instances:
<CT1 - CH1, CH2, CH3> = <CT1, CH1>, <CT1, CH2>, <CT1, CH3>

Pair      Similarity  Recency  SocialTrend  Attention  Favorite  MutualFriend  MutualFollowers  CommonHashtag  ReciprocalRank  ClassLabel
CT1,CH1   0.095       0.0      0.00015      0.00162    0.0805    0.11345       0.0022           0.0117         1               1
CT2,CH2   0.0         0.00061  0.520        0.0236     0.0024    0.00153       0.097            0.0031         0.5             0
47
Imbalanced Training Dataset
• Occurrences of the ground-truth hashtag Hj among the candidate tweets of <Ti, Hj> are very few in number
• This yields a high number of negative samples: with multiple occurrences, my training dataset has a class distribution of 95% negative samples and 5% positive samples
• Learning the model on an imbalanced dataset causes low precision
48
SMOTE Oversampling
• Possible solutions are undersampling and oversampling. SMOTE (Synthetic Minority Oversampling Technique) resamples to a balanced dataset of 50% positive and 50% negative samples
• SMOTE oversamples by creating synthetic examples rather than oversampling with replacement
• It takes each minority-class sample and introduces synthetic examples along the line segments joining any/all of the k minority-class nearest neighbors
• This approach effectively forces the decision region of the minority class to become more general
"SMOTE: Synthetic Minority Over-sampling Technique" (2002), Nitesh V. Chawla, Kevin W. Bowyer, Lawrence O. Hall, W. Philip Kegelmeyer. Journal of Artificial Intelligence Research
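A minimal NumPy sketch of the SMOTE idea just described, not the exact implementation used in the thesis: each synthetic point is interpolated at a random position along the segment between a minority sample and one of its k nearest minority neighbors.

```python
import numpy as np

def smote(X, k=5, n_new=None, rng=None):
    """Generate n_new synthetic minority samples from minority matrix X.

    Minimal SMOTE sketch: pick a minority sample, pick one of its k
    nearest minority neighbors, and interpolate between them.
    """
    rng = np.random.default_rng(rng)
    X = np.asarray(X, dtype=float)
    n = len(X)
    if n_new is None:
        n_new = n
    # pairwise Euclidean distances within the minority class
    d = np.linalg.norm(X[:, None] - X[None, :], axis=2)
    np.fill_diagonal(d, np.inf)              # a sample is not its own neighbor
    nbrs = np.argsort(d, axis=1)[:, :k]
    out = []
    for _ in range(n_new):
        i = rng.integers(n)
        j = nbrs[i, rng.integers(min(k, n - 1))]
        gap = rng.random()                   # position along the segment
        out.append(X[i] + gap * (X[j] - X[i]))
    return np.vstack(out)
```

Because new points lie on segments between real minority samples, they stay inside the minority region and generalize its decision boundary.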
49
Learning – Logistic Regression
For each <Tweet(Ti), Hashtag(Hj)> pair, the feature matrix over its <Candidate Tweet, Candidate Hashtag> pairs carries class labels: positive samples (1) and negative samples (0). Logistic regression learns one weight λ1, ..., λ9 per feature score.
I use a logistic regression model over a generative model such as a naive Bayes classifier or a Bayes network because my features have a lot of correlation (shown in the evaluation).
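For illustration, a minimal sketch of learning the weights λ1...λ9 with logistic regression over the feature-score matrix; plain batch gradient descent on the log-loss stands in here for whatever solver the thesis actually used:

```python
import numpy as np

def fit_logistic(X, y, lr=0.1, epochs=2000):
    """Learn weights w (the λs) and bias b by batch gradient descent."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    w = np.zeros(X.shape[1])
    b = 0.0
    for _ in range(epochs):
        p = 1.0 / (1.0 + np.exp(-(X @ w + b)))   # sigmoid probabilities
        g = p - y                                 # gradient of log-loss w.r.t. logits
        w -= lr * (X.T @ g) / len(y)
        b -= lr * g.mean()
    return w, b

def predict_proba(X, w, b):
    """Probability that each (candidate tweet, candidate hashtag) pair is class 1."""
    return 1.0 / (1.0 + np.exp(-(np.asarray(X, dtype=float) @ w + b)))
```

On the real data, X has one row per candidate pair and nine columns (the feature scores), after SMOTE balancing.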
50
Test Dataset
My test dataset is represented in the same format as the training dataset, as a feature matrix, with the class labels unknown (removed).

Pair      Similarity  Recency  SocialTrend  Attention  Favorite  MutualFriend  MutualFollowers  CommonHashtag  ReciprocalRank  ClassLabel
CT1,CH1   0.034       0.7      0.0135       0.0621     0.0205    0.11345       0.22             0.611          1               ?
CT2,CH2   0.0         0.613    0.215        0.316      0.0224    0.0523        0.057            0.0301         0.5             ?
51
Classification
• If the predicted probability is greater than 0.5, the model labels the hashtag 1, and 0 otherwise
• Hashtags labeled 1 are likely to be suitable hashtags
• I rank the top K recommended hashtags by their predicted probabilities
The query tweet <Qi, ?> and its <Candidate Tweet, Candidate Hashtag> feature matrix are run through the logistic regression model to obtain the class labels.
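The thresholding and ranking step above reduces to a few lines; here `probs` is a hypothetical mapping from each candidate hashtag to its predicted class-1 probability:

```python
def recommend_hashtags(probs, k=10, threshold=0.5):
    """Keep hashtags the model labels 1 and return the top K by probability.

    probs: dict mapping hashtag -> predicted probability of class 1.
    """
    ranked = sorted(probs.items(), key=lambda kv: kv[1], reverse=True)
    return [tag for tag, p in ranked if p > threshold][:k]
```

Hashtags at or below the 0.5 threshold are dropped entirely rather than ranked low.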
52
Implementation – System Example 1

TweetSense (Top 10):
#KUWTK 0.989970778, #tfiosmovie 0.985176542, #CatchingFire 0.981380129, #ANTM 0.968851541, #GoTSeason4 0.946418848, #Jofferyisdead 0.944493746, #TFIOS 0.941791929, #Lunch 0.940883835, #MockingjayPart1trailer 0.9344869, #JoffreysWedding 0.934201161

Baseline-SimGlobal (Top 10):
#KUWTK 0.824264068712, #ANTM 0.583979541687, #Glee 0.453373612475, #NowPlaying 0.439078783215, #Scandal 0.435994273991, #XFactor 0.425513196481, #Spotify 0.42500253688, #LALivin 0.424264068712, #PansBack 0.424264068712, #ornah 0.424264068712

Baseline-SimTime (Top 10):
#Scandal 0.82326311013, #ornah 0.819013620132, #LALivin 0.816627941101, #KUWTK 0.814775850946, #Glee 0.778570381907, #SURFBOARD 0.746003141257, #latergram 0.745075687756, #Spotify 0.744375215512, #NowPlaying 0.744375215512, #EFCvAFC 0.730686523119

Baseline-SimRecCount (Top 10):
#Scandal 0.428809523257, #KUWTK 0.428809523257, #LALivin 0.426536795985, #PansBack 0.426536795985, #ornah 0.426536795985, #Glee 0.381746046493, #goodcompany 0.348682888787, #SURFBOARD 0.348682888787, #JLSQuiz 0.348682888787, #HungryAfricans 0.348682888787
53
Implementation – System Example 2

TweetSense (Top 5):
#Eurovision 0.998892319, #EurovisionSongContest2014 0.997934085, #garybarlo 0.989491417, #UKIP 0.988958194, #parents 0.98511502

Baseline-SimGlobal (Top 5):
#photogeeks 0.6, #FSTVLfeed 0.476912544, #FestivalFriday 0.424264069, #barkerscreeklife 0.420229873, #IPv6 0.4

Baseline-SimTime (Top 5):
#photogeeks 0.907490888, #FSTVLfeed 0.823842681, #FestivalFriday 0.82085025, #Pub49 0.745300825, #monumentvalleygame 0.738922

Baseline-SimRecCount (Top 5):
#photogeeks 0.600706714, #FSTVLfeed 0.429211065, #FestivalFriday 0.424970782, #Pub49 0.353477299, #sma2013 0.348530303
54
Implementation – System Example 3

TweetSense (Top 5):
#boxing 0.996480078, #GoldenBoyLive 0.9336961478, #USC 0.913498443, #AngelOsuna 0.911312201, #paparazzi 0.90625792

Baseline-SimGlobal (Top 5):
#BoxeoBoricua 0.346937709, #ListoParaHacerHistoria 0.2889, #CaneloAngulo 0.272852636, #6pm 0.261133502, #Vallarta 0.252135503

Baseline-SimTime (Top 5):
#TU 0.517962946, #regardless 0.489156945, #legggoo 0.476362923, #Shoutout 0.464033604, #TeamH 0.44947086

Baseline-SimRecCount (Top 5):
#BoxeoBoricua 0.34687581, #ListoParaHacerHistoria 0.2893, #CaneloAngulo 0.27221214, #6pm 0.42458613, #sonorasRest 0.42458613
55
Experimental Setup
Dataset
I randomly picked 63 users from a partial random distribution by navigating through the trending hashtags in Twitter.

Characteristics of the Dataset:
Characteristic                               Value       Percentage
Total number of users                        63          N/A
Total tweets crawled                         7,945,253   100%
Tweets with hashtags                         1,883,086   23.70%
Tweets without hashtags                      6,062,167   76.30%
Tweets with exactly one hashtag              1,322,237   16.64%
Tweets with more than one hashtag            560,849     7.06%
Total number of tweets with user @mentions   716,738     9.02%
Total number of favorite tweets              4,658,659   58.63%
Total number of tweets with retweets         1,375,194   17.31%
58
Evaluation Method
• Randomly pick tweets with only one hashtag, to avoid getting credit for recommending generic hashtags
• Deliberately remove the hashtag (and its retweets) for evaluation
• Pass the tweet as input to my system, TweetSense, and get the recommended hashtag list
• Check whether the ground-truth hashtag is in the recommended list
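The steps above amount to a hit rate over held-out hashtags; a minimal sketch of the Precision @ K measure as used in the evaluation (the (ground truth, ranked recommendations) input format is an assumption for illustration):

```python
def precision_at_k(results, k=5):
    """Fraction of test tweets whose held-out hashtag is in the top K.

    results: list of (ground_truth_hashtag, ranked_recommendation_list).
    """
    if not results:
        return 0.0
    hits = sum(1 for truth, recs in results if truth in recs[:k])
    return hits / len(results)
```

With 1,599 test tweets and 720 hits at K = 5, this gives TweetSense's reported 45%.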
59
Results
60
Evaluation
61
External Evaluation with Baseline for all 3 Ranking Methods
Test users: 45 users & 1,599 tweet samples

Precision @ N (percentage of sample tweets for which the hashtags are recommended correctly):

N             5     10    15    20
TweetSense    45%   53%   56%   59%
SimTime       30%   34%   38%   42%
SimGlobal     26%   33%   37%   40%
SimRecCount   24%   29%   32%   35%

Total number of sample tweets: 1,599. Tweets for which hashtags were recommended correctly at Precision @ K=5: TweetSense 720 | SimTime 487 | SimGlobal 422 | SimRecCount 384

TweetSense vs. Baseline
62
Ranking Quality – TweetSense
63
Odds Ratio – Feature Comparison (with all features)

Similarity Score        0.0942
Recency Score           0.0022
Social Trend Score      0.0017
Attention Score         0
Favorite Score          0.2837
Mutual Friends Score    13538.6542
Mutual Followers Score  0.0923
Common Hashtags Score   0
Reciprocal Score        0.7144
64
Odds Ratio – Feature Comparison (without Mutual Friend Score)

Similarity Score        0.1123
Recency Score           0.0024
Social Trend Score      0.0017
Attention Score         0
Favorite Score          0.24
Mutual Followers Score  3.115
Common Hashtags Score   0
Reciprocal Score        0.7717
65
Odds Ratio – Feature Comparison (without Mutual Friend, Mutual Followers, Reciprocal Scores)

Similarity Score        0.1134
Recency Score           0.0026
Social Trend Score      0.0016
Attention Score         0
Favorite Score          0.2112
Common Hashtags Score   0
66
Odds Ratio – Feature Comparison (only Mutual Friend Score)

Mutual Friends Score    0.2081
67
Precision @ N – Only Mutual Friend Feature Score

Feature score comparison on Precision @ N (percentage of sample tweets for which the hashtags are recommended correctly):

N                      5     10    15    20
TweetSense             45%   53%   56%   59%
SimTime                30%   34%   38%   42%
SimGlobal              26%   33%   37%   40%
SimRecCount            24%   29%   32%   35%
OnlyMutualFriendScore  2%    5%    8%    11%

Total number of sample tweets: 1,599. Tweets for which hashtags were recommended correctly at Precision @ K=5: TweetSense 720 | SimTime 487 | SimGlobal 422 | SimRecCount 384 | OnlyMutualFriendRank 37

TweetSense vs. Baseline vs. only Mutual Friend Score
68
Conclusion
Summary
• Proposed a system called TweetSense, which finds additional context for an orphaned tweet by recommending hashtags
• Proposed a better approach to choosing the candidate tweet set by looking at the user's social graph
• Exploited social signals along with the user's tweet history to recommend personalized hashtags
• Performed internal and external evaluations of the system
• Showed that the system performs better than the current state-of-the-art systems
71
Future Work
• Rectify incorrect or irrelevant hashtags for tweets by identifying and/or adding the right hashtags
• "Named hashtag recognition": aggregate processing of tweets for sentiment and opinion mining
• Use topic models to recommend hashtags based on topic distributions
• Develop an incremental-learning version and deploy it as an online application