Social Media Intelligence
Text, Network Mining and Predictive Analytics Combined
Phil Winters
Tobias Kötter
www.knime.com
Social Media Analysis: Water, Water Everywhere, and Not a Drop to Drink
Approaches and Challenges:
Cloud-based Approach: No Access to Data
In-House Dashboard: No Analytics
In-House Text Mining: Sentiment but no relevance
In-House Network Mining: Relevance but no Sentiment
Case Study: Major European Telco
Very rich new data sources about customers!
Combine:
– Text mining
– Network analysis
– Classic predictive analytics
  • Modeling, clustering, time series, etc.
Combining with internal data makes the text "relevant":
– Include product names/categories
– Exclude staff members
– Include number of web hits per page
– Include existing marketing positioning
– Include major campaign information
Social Media Intelligence:
Major European Telco
Our Goal in Social Media Analysis
Text Mining for Sentiment
Drill Down on special cases
Network Mining for Relevance
Analytics for Prediction
Case Study Example: Slashdot Data
“News for Nerds, Stuff that Matters”
Basic Facts:
• 24532 users
• 491 threads with 15 – 843 responses from 12 – 507 users
• 113505 posts (text mining on posts)
• 60 main topics
Text Mining
[Workflow figure; stage labels: Remove anonymous users, group by PostID; Words Tagging (positive words, negative words, MPQA Corpus); BoW; Standard Named Entity Filter; Word Frequency; User Bins; Word cloud for selected users]
Slashdot – Text Mining
List of negative and positive words (MPQA Opinion Corpus)
Tag positive and negative words
Count words in posts
Aggregate over users
Classify each user as negative or positive
Most positive user: dada21 (2838 positive / 1725 negative words)
Most negative user: pNutz (43 positive / 109 negative words)
16016 positive users
7107 negative users
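
The steps above (tag, count, aggregate, classify) can be sketched in a few lines of Python. This is only an illustration, not the KNIME workflow from the slides: the tiny word lists stand in for the MPQA lexicon, the sample posts are invented, and scoring a user as positive words minus negative words is an assumption that is consistent with the dada21 and pNutz figures but is not stated in the deck.

```python
import re
from collections import defaultdict

# Hypothetical mini-lexicon standing in for the MPQA positive/negative word lists.
POSITIVE = {"good", "great", "love", "useful"}
NEGATIVE = {"bad", "broken", "hate", "useless"}


def count_sentiment_words(text):
    """Return (positive, negative) word counts for one post."""
    words = re.findall(r"[a-z']+", text.lower())
    pos = sum(1 for w in words if w in POSITIVE)
    neg = sum(1 for w in words if w in NEGATIVE)
    return pos, neg


def aggregate_by_user(posts):
    """Sum word counts per user; posts is a list of (user, text) pairs.

    Posts whose user is None (anonymous) are dropped.
    """
    totals = defaultdict(lambda: [0, 0])
    for user, text in posts:
        if user is None:
            continue
        pos, neg = count_sentiment_words(text)
        totals[user][0] += pos
        totals[user][1] += neg
    return dict(totals)


def label_user(pos, neg):
    """Assumed rule: attitude = positive minus negative word count."""
    score = pos - neg
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


# Invented sample posts, for illustration only.
posts = [
    ("alice", "Great article, I love this useful tool"),
    ("bob", "Broken and useless. I hate it"),
    ("alice", "The second part was bad"),
    (None, "great great great"),
]

totals = aggregate_by_user(posts)
labels = {user: label_user(pos, neg) for user, (pos, neg) in totals.items()}
```

As the deck notes later, the result depends heavily on the corpus, so swapping in a different lexicon can change the labels.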
Which topics do positive users have in common?
– Government
– People
– Law/s
– Money
– Market
– Parties
Slashdot – Text Mining
Most negative post:
Installation
Network Mining Feature: labs.knime.org
Documentation: http://tech.knime.org/network-mining
Data Structure
Supported networks:
– (un)directed
– (un)weighted
– hypergraph
– k-partite
Nodes
planned
Internal based on Jung 2.0.1
Cytoscape
visone
Gephi
Visualization
Network Creation
[Figure: example network with nodes User1 to User6]
Network Mining the Slashdot Data
Topic Graphs
[Figures: topic graphs; labeled topics: NASA, Sci-Fi]
Hubs & Authorities
• Hubs = Followers
• Authorities = Leaders
Workflow steps:
– Filter anonymous users and create the network
– Compute a centrality index to define hub weight and authority weight
– Output: users with hub and authority weights and other features
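
Hub and authority weights are commonly computed with the HITS power iteration, and the sketch below shows that idea in plain Python. It is an assumption that the centrality index in the workflow corresponds to HITS, and the edge direction (from the user who replies to the user being replied to) and the sample data are also assumptions made for illustration.

```python
import math


def build_edges(replies):
    """Turn (replier, original_author) pairs into directed edges.

    Anonymous users are represented as None here and are filtered out.
    """
    return [(src, dst) for src, dst in replies
            if src is not None and dst is not None]


def _normalize(scores):
    """Scale scores to unit Euclidean length (no-op if all are zero)."""
    norm = math.sqrt(sum(v * v for v in scores.values())) or 1.0
    return {node: v / norm for node, v in scores.items()}


def hits(edges, iterations=50):
    """Return (hub, authority) score dicts via HITS power iteration."""
    nodes = {node for edge in edges for node in edge}
    hub = {node: 1.0 for node in nodes}
    auth = {node: 1.0 for node in nodes}
    for _ in range(iterations):
        # Authority score: sum of hub scores of users pointing at this user.
        auth = {node: 0.0 for node in nodes}
        for src, dst in edges:
            auth[dst] += hub[src]
        auth = _normalize(auth)
        # Hub score: sum of authority scores of users this user points at.
        hub = {node: 0.0 for node in nodes}
        for src, dst in edges:
            hub[src] += auth[dst]
        hub = _normalize(hub)
    return hub, auth


# Invented reply pairs: u2, u3 and u4 reply to u1; u4 also replies to u5.
replies = [("u2", "u1"), ("u3", "u1"), ("u4", "u1"), ("u4", "u5"), (None, "u1")]
edges = build_edges(replies)
hub, auth = hits(edges)

top_authority = max(auth, key=auth.get)  # the "leader" in this toy network
top_hub = max(hub, key=hub.get)          # the most active "follower"
```

In this toy network u1 receives the most replies and ends up with the highest authority weight, while u4, who replies to two different users, gets the highest hub weight.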
[Figure: Hubs & Authorities plot; labeled users: dada21, Doc Ruby, Carl Bialik, pNutz, Tube Steak]
Combining Text and Network Mining
Network Analysis: Hub and Authority Score per User
Text Analysis: Attitude Level per User
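
Combining the two analyses amounts to joining the per-user outputs on the user name. A minimal sketch follows, assuming both results are available as dictionaries keyed by user. The function name, the default of zero for users missing from one source, and the sample values are all hypothetical.

```python
def combine_user_features(hub, authority, attitude):
    """Join network scores and text-mining attitude into one row per user.

    Users missing from one input get a default of 0 (an assumption).
    """
    users = sorted(set(hub) | set(authority) | set(attitude))
    return {
        user: {
            "hub": hub.get(user, 0.0),
            "authority": authority.get(user, 0.0),
            "attitude": attitude.get(user, 0),
        }
        for user in users
    }


# Made-up scores, for illustration only.
hub = {"u1": 0.1, "u2": 0.7}
authority = {"u1": 0.9, "u2": 0.0}
attitude = {"u1": 12, "u3": -4}

table = combine_user_features(hub, authority, attitude)
```

Each row of the resulting table can then serve as the feature vector for a user in later clustering or prediction steps.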
Hubs, Authorities & Attitudes
[Figure; labeled users: Carl Bialik (from the WSJ), dada21, Doc Ruby, WebHosting Guy, pNutz, Tube Steak, Catbeller]
What we have found ...
- The positive leaders
- The neutral leaders
- The negative leaders
- The inactive users
What identifies each group?
How do I identify a new user?
How do I handle each user?
The k-Means Clusters
[Figure: k-means clusters labeled Superfans, Fans, Neutral users, Negative users]
The Operational Workflow
[Workflow figure; stages: Pre-processing, Cluster Extraction, Assignment of new data]
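
The cluster extraction and assignment stages can be illustrated with a small k-means sketch in plain Python. This is not the KNIME workflow itself: the two-feature points (authority weight, attitude), the choice of k=2, and the deterministic initialisation from the first k points are simplifications assumed for the example. Features are assumed to be already scaled to comparable ranges during pre-processing.

```python
import math


def nearest(point, centroids):
    """Index of the centroid closest to point (Euclidean distance)."""
    return min(range(len(centroids)),
               key=lambda i: math.dist(point, centroids[i]))


def kmeans(points, k, iterations=100):
    """Lloyd's algorithm; returns the list of k centroids."""
    # Simplified deterministic start for this sketch: the first k points.
    centroids = [tuple(p) for p in points[:k]]
    for _ in range(iterations):
        # Assignment step: attach every point to its nearest centroid.
        clusters = [[] for _ in range(k)]
        for p in points:
            clusters[nearest(p, centroids)].append(p)
        # Update step: move each centroid to the mean of its members.
        new_centroids = []
        for i, members in enumerate(clusters):
            if members:
                mean = tuple(sum(dim) / len(members) for dim in zip(*members))
                new_centroids.append(mean)
            else:
                new_centroids.append(centroids[i])  # keep an empty cluster's centroid
        if new_centroids == centroids:
            break  # converged
        centroids = new_centroids
    return centroids


# Invented, pre-scaled (authority, attitude) features for four known users.
known_users = [(0.9, 0.8), (0.1, -0.7), (0.8, 0.9), (0.2, -0.8)]
centroids = kmeans(known_users, k=2)

# Assignment of new data: a new user goes to the closest existing centroid.
new_user_cluster = nearest((0.7, 0.6), centroids)
```

Keeping the extracted centroids fixed and only running the assignment step for new users is what lets a model like this answer the earlier question of how to identify a new user.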
Lessons Learned
– Data manipulation is the key; the decision science flows from that
– Sentiment analysis is all about the corpus!
Options Available: From fee-paying to open source!
– Capturing the data
– Sentiment analysis
– Network analysis
NOTE: Examples, workflows (i.e., the complete programs), and white papers are available for download at www.knime.com
Copyright © 2013 by KNIME.com AG All Rights Reserved - Confidential
Mark Your Calendars:
KNIME’s 7th User Group Meeting
19–20 February 2014, Zurich, Switzerland
www.KNIME.com