Predictive Semantic Social Media Analysis
David A. Ostrowski
System Analytics and Environmental Sciences
Research and Advanced Engineering
Ford Motor Company
Social media
• Influential
• Sample of the web
– News driven
• CRM
– Real-time
– Less biased
• Unique opportunities for analytics
Opportunities
• Old Model
– Reactionary
  • Damage control
  • Inquiries
  • Confirm positive reaction
• New Model
– Preemptive
  • Focused engagement
    – Promotions
    – Events
    – Media
  • Anticipatory
Social Dimensions
• Describes affiliations across a network
• Values / Community
• Reinforced by relationships
• Utilize to predict purchase behavior
Relational Learning
• ‘Birds of a Feather’
• Leverage each local network for semantic understanding
• Relational Learning => Social dimensions
Framework Overview
• Relational learning
– Strengthen representation
– Support knowledge
• Unsupervised classification
– Generation of dimensions
• Supervised classification
– Dimensions => behavior
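The unsupervised step can be sketched as follows. This is a minimal illustration, not the implementation behind the slides: it assumes each account is already a numeric feature vector, uses plain Lloyd's k-means, and treats the cluster index obtained at each k as one derived "dimension" feature. The names `kmeans` and `social_dimensions` are hypothetical.

```python
import random

def kmeans(points, k, iters=50, seed=0):
    """Plain Lloyd's k-means; returns one cluster index per point."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)  # requires len(points) >= k
    labels = [0] * len(points)
    for _ in range(iters):
        # Assignment step: nearest center by squared Euclidean distance.
        labels = [
            min(range(k),
                key=lambda c: sum((x - y) ** 2 for x, y in zip(p, centers[c])))
            for p in points
        ]
        # Update step: move each center to the mean of its members.
        for c in range(k):
            members = [p for p, l in zip(points, labels) if l == c]
            if members:
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels

def social_dimensions(points, ks=(3, 4, 5, 6)):
    """One derived feature per k: the cluster index of each account (assumption)."""
    per_k = [kmeans(points, k) for k in ks]
    return [list(row) for row in zip(*per_k)]
```

Each row returned by `social_dimensions` could then be passed, together with a behavior label, to a supervised classifier.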
[Figure: Facebook identifiers linked to movies, television shows, associations, schools, political affiliations, issue positions, values, buying habits, and religious views]
Framework Overview
[Figure: framework diagram with boxes labeled local network, taxonomy labels, RN classification, features, k-means cluster, social dimension, higher-level features, supervised classification, behaviors]
Case Study One
• 4,000 Facebook identifiers
• Associations to two vehicle lines
• Question:
– What can we extract to distinguish between these two purchase behaviors?
Relational Learning Step
• Extracted data from FB
• Consolidated interests
• Applied the RN algorithm
• Guided by taxonomy
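A relational neighbor (RN) classifier infers a node's missing label from the labels of its neighbors. The sketch below is a simplified weighted-vote variant with iterative propagation. The graph layout, the function name `rn_classify`, and the uniform starting prior are assumptions for illustration; the slides do not specify which RN variant or taxonomy mapping was used.

```python
def rn_classify(neighbors, known, classes, iters=10):
    """Weighted-vote relational neighbor sketch.

    neighbors: dict node -> list of adjacent nodes (every neighbor is also a key)
    known:     dict node -> class label for the labeled nodes
    Returns dict node -> predicted label; known labels are kept fixed.
    """
    uniform = {c: 1.0 / len(classes) for c in classes}
    # Labeled nodes start at certainty, unlabeled nodes at a uniform prior.
    probs = {
        n: ({c: float(c == known[n]) for c in classes} if n in known else dict(uniform))
        for n in neighbors
    }
    for _ in range(iters):
        updated = {}
        for n, adj in neighbors.items():
            if n in known or not adj:
                updated[n] = probs[n]
                continue
            # Average the neighbors' current class distributions.
            updated[n] = {c: sum(probs[m][c] for m in adj) / len(adj) for c in classes}
        probs = updated
    return {n: max(classes, key=lambda c: probs[n][c]) for n in neighbors}

# Toy usage: two small friend groups, one labeled account in each.
graph = {
    "u1": ["u2", "u3"], "u2": ["u1", "u3"], "u3": ["u1", "u2"],
    "v1": ["v2", "v3"], "v2": ["v1", "v3"], "v3": ["v1", "v2"],
}
predicted = rn_classify(graph, {"u1": "sports", "v1": "music"}, ["sports", "music"])
```

In this toy graph the unlabeled accounts take the label of the one labeled account in their own group, which is the 'birds of a feather' assumption in miniature.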
[Figure: Facebook Accounts. Accuracy (0 to 100) versus missing labels (normalized, 45 to 90) for RN, Bayes, and k-Means]
Preliminary cluster statistics
cluster       1     2     3     4     5     6
veh1 k=3     46    39    13
veh2 k=3     21    42    36
veh1 k=4     44    16    12    26
veh2 k=4     14    27    24    32
veh1 k=5     21     8     1   0.3    45
veh2 k=5     35    22    12    15    14
veh1 k=6      7    43     6    13     9    19
veh2 k=6     20    14    16     8     9    35
normalized differences between vehicle lines
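One plausible way to produce statistics of this kind is to compute, for each vehicle line, the percentage of its accounts that land in each cluster. This reading of the table is an assumption, since the slides do not state how the values were normalized, and `cluster_shares` is a hypothetical helper.

```python
from collections import Counter

def cluster_shares(labels, vehicle_lines, k):
    """Percent of each vehicle line's accounts that fall in each cluster."""
    counts = {}
    for label, line in zip(labels, vehicle_lines):
        counts.setdefault(line, Counter())[label] += 1
    return {
        line: [100.0 * c[j] / sum(c.values()) for j in range(k)]
        for line, c in counts.items()
    }

# Toy usage with made-up cluster assignments (not data from the study).
shares = cluster_shares(
    labels=[0, 0, 1, 2, 1, 1, 2, 2],
    vehicle_lines=["veh1"] * 4 + ["veh2"] * 4,
    k=3,
)
```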
Extracted social dimensions
• Applied feature sets to k-means (k = 3 to 6)
• Each classification attempts to characterize the relationship between vehicle line and a social dimension (value / interest ...)
• All classifications are considered for behavioral training
• Also considered community detection
– By maximizing modularity using the leading eigenvectors of the modularity matrix
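The community detection option can be illustrated with a two-way split by the sign of the leading eigenvector of the modularity matrix B = A - k kᵀ / (2m), where A is the adjacency matrix, k the degree vector, and m the number of edges. The sketch below is a simplification under stated assumptions: an undirected, unweighted graph with at least one edge, a single bisection rather than repeated subdivision, and a shifted power iteration chosen to keep the example dependency-free. `leading_eigenvector_split` is a hypothetical name.

```python
def leading_eigenvector_split(adj, iters=500):
    """Bisect a graph by the sign of the leading eigenvector of B = A - k k^T / (2m)."""
    n = len(adj)
    deg = [sum(row) for row in adj]
    two_m = sum(deg)
    B = [[adj[i][j] - deg[i] * deg[j] / two_m for j in range(n)] for i in range(n)]
    # Shift the spectrum so every eigenvalue is non-negative; power iteration
    # then converges to the eigenvector of the most positive eigenvalue of B.
    shift = max(sum(abs(x) for x in row) for row in B)
    v = [1.0 + 0.1 * i for i in range(n)]  # arbitrary non-uniform start vector
    for _ in range(iters):
        w = [sum(B[i][j] * v[j] for j in range(n)) + shift * v[i] for i in range(n)]
        norm = sum(x * x for x in w) ** 0.5
        v = [x / norm for x in w]
    return [0 if x >= 0 else 1 for x in v]

# Toy usage: two triangles (nodes 0-2 and 3-5) joined by the edge 2-3.
edges = [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]
A = [[0] * 6 for _ in range(6)]
for i, j in edges:
    A[i][j] = A[j][i] = 1
groups = leading_eigenvector_split(A)
```

On the toy graph the two triangles end up in different groups; which group is numbered 0 depends on the start vector.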
Applied Supervised Classification for Behavior Prediction
• Applied the sets to three machine learning algorithms
• Simple Bayes: precision .7, recall .69
• Weightily Averaged One-Dependence Estimators (WAODE): precision .69, recall .70
• J48: precision .69, recall .70
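For reference, precision and recall for one class can be computed from predictions as sketched below. The slides do not say whether the reported figures are per class or averaged across classes, so this shows only the basic per-class definition; `precision_recall` and the toy labels are illustrative.

```python
def precision_recall(y_true, y_pred, positive):
    """Precision and recall for a single class treated as the positive label."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall

# Toy usage with made-up labels (not data from the study).
p, r = precision_recall(
    ["veh1", "veh1", "veh1", "veh2", "veh2"],
    ["veh1", "veh1", "veh2", "veh1", "veh2"],
    positive="veh1",
)
```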
Case Study 2
• 20,000 Facebook IDs across four vehicle lines
• Relational modeling
– Similar performance to the first case study
• Social dimensions generated for k = 3 to 7
– Not as much separation after k = 6 clustering
• Precision, recall
– Simple Bayes: .469, .483
– WAODE: .591, .588
– J48: .534, .536
Next Steps
• Institutionalization
– Extract / define exactly what our dimensions are explaining in our data sets
• Relate to specific association
– Values
– Community
Q/A
See me for friends and neighbors discount…. [email protected]