Upload
emanuele-della-valle
View
107
Download
1
Tags:
Embed Size (px)
DESCRIPTION
Bottari is a LarKC application http://www.larkc.eu/. It offers a real-time personalized recommendation service for restaurants in Insa-dong(Seoul) listening to the reputation of the restaurants on social media. Social media anlytics is powered by LarKC inductive and deductive stream reasoning solution. Learn more at http://larkc.cefriel.it/lbsma/bottari/ .
Citation preview
BOTTARI: Location based Social Media Analysis with Semantic
WebEmanuele Della Valle
Joint work with:CEFRIEL: Irene Celino, Daniele Dell’Aglio, Marco Balduini
SALTLUX: Tony Lee, Seonho Kim SIEMENS: Volker Tresp, Yi Huang
Watch this first :-)
226.10.2011 - SW Challenge 2011, ISWC 2011, Bonn, Germany
http://www.youtube.com/watch?v=c1FmZUz5BOo
• An augmented reality application for personalized recommendation of restaurants in Seoul
What have you seen?
326.10.2011 - SW Challenge 2011, ISWC 2011, Bonn, Germany
• Yes and no!
• Same use case, more “democratic”
• We do “reality mining” by listening to the social media
Yet another ?
426.10.2011 - SW Challenge 2011, ISWC 2011, Bonn, Germany
Architecture
526.10.2011 - SW Challenge 2011, ISWC 2011, Bonn, Germany
out
Query Rewriter
Query Evaluator
RDF2Matrix Plug-in
Streaming Linked Data
Server
SOR Invoker
SOR geo-spatial
KB
Social Media Crawler and Sentiment
Miner
HTTP
PU
LL
: Qu
ery
Initi
ate
d
PUSH: Data Initiated
SPARQL
androjena
Sentiment Mining
626.10.2011 - SW Challenge 2011, ISWC 2011, Bonn, Germany
Micropost message
MorphologicallyAnalyzable?
Rule based Analysis
Auto generated rules
Auto generated rules
Learneddocuments
SVMs
Syllable KernelSyllable Kernel
Sentiment of the tweet
Yes No
• Precision tests:– Auto-generated
rules ≈ 70%
– Manually-coded rules ≈ 90%
– Syllable kernel ≈ 50~60%
• Our target > 85%
SOR - Geo-Spatial KB
726.10.2011 - SW Challenge 2011, ISWC 2011, Bonn, Germany
C-SPARQL and Streaming Linked Data Server
826.10.2011 - SW Challenge 2011, ISWC 2011, Bonn, Germany
• A machine learning framework for inductive materialization
– Detects interesting data patterns– Predics RDF-triples
• i.e., which restaurant a user will tweet positively about
• Caractheristics – Capability to deal with sparse, high-dimensional
and incomplete data– Multivariate latent space based approach– Modularized approach for easily integrating contextual
information
SUNS (Statistical Unit Node Sets)
926.10.2011 - SW Challenge 2011, ISWC 2011, Bonn, Germany
SELECT DISTINCT ?poi ?name ?lat ?long ?numPos ?prob WHERE { ?poi a ns:NamedPlace ; ns:name ?name ; geo:lat ?lat ; geo:long ?long . FILTER (f:within_distance(37.5, 126.9, ?lat, ?long, 200)) FILTER (f:dest_point_viewing(37.5, 126.9, ?lat, ?long, 90, 200)) { :someUser sioc:creator_of ?tweet . ?tweet twd:talksAboutPositively ?poi . WITH PROBABILITY ?prob ENSURE PROBABILITY [0.5..1) } ?poi twd:numberOfPositiveTweets ?numPos . } ORDER BY DESC(?numPos), ?prob, f:distance(37.5, 126.9, ?lat, ?long)LIMIT 10
Query Processing
1026.10.2011 - SW Challenge 2011, ISWC 2011, Bonn, Germany
GEO-SPATIAL
PROBABILISTIC
STREAMING
LarKC At Work
1126.10.2011 - SW Challenge 2011, ISWC 2011, Bonn, Germany
out
Query Rewriter
Query Evaluator
RDF2Matrix Plug-in
Streaming Linked Data
Server
SOR Invoker
SOR geo-spatial
KB
Social Media Crawler and Sentiment
Miner
HTTP
PU
LL
: Qu
ery
Initi
ate
d
PUSH: Data Initiated
SPARQL
androjena
Probabilistic part of the query to get personalized
recommendations (the “for me” button in BOTTARI)
Geo-Spatial part of the query
to get POIs closer to user
location
Streaming part of the query to get trends in users' sentiment
(the “emerging” button in BOTTARI)
Input user query is split
Results of the different
computations are joined
Evaluation - Efficacy
1226.10.2011 - SW Challenge 2011, ISWC 2011, Bonn, Germany
5 10 15 20 25 30
0,7
random
knnItem
emerging (C-SPARQL)
for me (SUNS)
SUNS + C-SPARQL
0,6
0,5
0,4
0,3
0,2
0,1
Evaluation - Efficiency
1326.10.2011 - SW Challenge 2011, ISWC 2011, Bonn, Germany
Hardware: 2.66 GHz Intel Core 2 Duo with 8 GB RAM
Evaluation – Scalability
1426.10.2011 - SW Challenge 2011, ISWC 2011, Bonn, Germany
Number of concurrent users
Que
ry L
aten
cy (
sec)
• End-user application
• Attractive and functional interface
• Real-world dynamic data
• Fully based on Semantic Web technologogies– RDF as common data format between heterogenous
components– SPARQL as query language
• Rigorously evaluated– Effective– High throughput for handling dynamic data– Scalable in number of concurrent users
• Commercial Potential
Conclusions
1526.10.2011 - SW Challenge 2011, ISWC 2011, Bonn, Germany
Emanuele Della ValleJoint work with:
CEFRIEL: Irene Celino, Daniele Dell’Aglio, Marco Balduini SALTLUX: Tony Lee, Seonho Kim SIEMENS: Volker Tresp, Yi Huang
Any question?