24
Delft University of Technology Leveraging User Modeling on the Social Web with Linked Data #ICWE2012 Berlin, Germany, July 27 th 2012 Fabian Abel, Claudia Hauff, Geert-Jan Houben, Ke Tao Web Information Systems, TU Delft, the Netherlands

Leveraging User Modeling on the Social Web with Linked Data

  • Upload
    ke-tao

  • View
    420

  • Download
    0

Embed Size (px)

DESCRIPTION

Talk give by Ke Tao at #ICWE2012 titled Leveraging User Modeling on the Social Web with Linked Data

Citation preview

Page 1: Leveraging User Modeling on the Social Web with Linked Data

Delft University of Technology

Leveraging User Modeling on the Social Web with Linked Data

#ICWE2012 Berlin, Germany, July 27th 2012

Fabian Abel, Claudia Hauff, Geert-Jan Houben, Ke Tao

Web Information Systems, TU Delft, the Netherlands

Page 2: Leveraging User Modeling on the Social Web with Linked Data

2 Leveraging User Modeling on the Social Web with Linked Data

Personalized Recommendations

Personalized Search Adaptive Systems

What we do: Science and Engineering for the Personal Web

Social Web

Analysis and User Modeling

user/usage data

Semantic Enrichment, Linkage and Alignment

domains: news social media cultural heritage public data e-learning

recommending points of interest

Page 3: Leveraging User Modeling on the Social Web with Linked Data

3 Leveraging User Modeling on the Social Web with Linked Data

Kyle

Hometown: South Park, Colorado

Page 4: Leveraging User Modeling on the Social Web with Linked Data

4 Leveraging User Modeling on the Social Web with Linked Data

Kyle recently uploaded photos to Flickr

During his trip to the Netherlands, he uploaded pictures to Flickr.

tags: delft, vermeer geo: The Hague

tags: girl with pearl earring geo: The Hague

Page 5: Leveraging User Modeling on the Social Web with Linked Data

5 Leveraging User Modeling on the Social Web with Linked Data

Kyle tweets about his upcoming trip

Looking forward to visit Paris next week!

Page 6: Leveraging User Modeling on the Social Web with Linked Data

6 Leveraging User Modeling on the Social Web with Linked Data

Interests of Kyle?

tags: delft, vermeer geo: The Hague

tags: girl with pearl earring geo: The Hague

Looking forward to visit Paris next week!

• Given Kyle’s Flickr and Twitter activities, can we infer Kyle’s interests?

• Knowing that Kyle will visit Paris, France, can we recommend him places that might be interesting for him?

Page 7: Leveraging User Modeling on the Social Web with Linked Data

7 Leveraging User Modeling on the Social Web with Linked Data

Challenges • How to create a meaningful profile that supports the given

application? à how to bridge between the Social Web chatter of a user and the candidate items of a recommender system?

User  Profile  concept      weight  

?  Applica0on  

that  demands  user    interest  profile  regarding  

                 -­‐concepts  

c1  c5  

c6  

c9  

Social Web d5  

c1   c3  

c2  d1   d2   d3   d4  

d4   d3  

Candidate Items Recommendations

Page 8: Leveraging User Modeling on the Social Web with Linked Data

8 Leveraging User Modeling on the Social Web with Linked Data

Challenge of Recommending Points of Interests (POIs) to Kyle

Johannes Vermeer

dbpedia:Louvre Looking forward to visit Paris next week!

dbpedia:Paris

The lacemaker

The astronomer

Page 9: Leveraging User Modeling on the Social Web with Linked Data

9 Leveraging User Modeling on the Social Web with Linked Data

c1  

c4  

c5  

c6  

LOD-based User Modeling

weigh0ng  strategies  

Applica0on  that  demands  user    

interest  profile  regarding                    -­‐concepts  

c2  

c3  cx  

cy  

c9  

User  Profile  concept      weight  

0.4  

0.1  

0.2  

c1  

c2  

c3  …  …  

concepts  that  can  be  extracted  from  the  user  data    

user  data  

Social  Web  

background  knowledge    (graph  structures)  

Linked  Data  

Page 10: Leveraging User Modeling on the Social Web with Linked Data

10 Leveraging User Modeling on the Social Web with Linked Data

User Modeling Building Blocks

Page 11: Leveraging User Modeling on the Social Web with Linked Data

11 Leveraging User Modeling on the Social Web with Linked Data

tags: girl with pearl earring geo: The Hague

dbpedia:Girl_with_pearl_earring

A  

Artifact

B  

The lacemaker

C  

The astronomer

…  

rdf:type

Johannes Vermeer foaf:maker

foaf:maker

Strategies for exploiting the RDF-based background knowledge graph

Direct Mention Indirect Mention @RDF_statement Indirect Mention

@RDF_graph

dbpedia:The_Hague

dbpedia:Louvre dbpprop:location

Page 12: Leveraging User Modeling on the Social Web with Linked Data

12 Leveraging User Modeling on the Social Web with Linked Data

Weighting Scheme

Weighting scheme: count the number of occurrences of a given graph pattern.

User  Profile  concept      weight  

?  

?  

?  

c1  

c2  

c3  …  …  

User  Profile  concept      weight  

374  

152  

73  

c1  

c2  

c3  …  …  

Page 13: Leveraging User Modeling on the Social Web with Linked Data

13 Leveraging User Modeling on the Social Web with Linked Data

Source of User Data

• Twitter • Flickr

Page 14: Leveraging User Modeling on the Social Web with Linked Data

14 Leveraging User Modeling on the Social Web with Linked Data

Mining the Geographic Origins of User Data

• Tweets or Flickr images posted by a GPS-enabled device;

• Images geo-tagged manually on Flickr world map

• Otherwise : exploit the title and the tags the users assign to their images

Page 15: Leveraging User Modeling on the Social Web with Linked Data

15 Leveraging User Modeling on the Social Web with Linked Data

Evaluation

Page 16: Leveraging User Modeling on the Social Web with Linked Data

16 Leveraging User Modeling on the Social Web with Linked Data

Research Questions

1.  How does the source of user data influence the quality in deducing user preferences for POIs?

2.  How does the consideration of background knowledge from the Linked Open Data Cloud impact the quality of the user modeling?

3.  What (combination of) user modeling strategies allows for the best quality?

Page 17: Leveraging User Modeling on the Social Web with Linked Data

17 Leveraging User Modeling on the Social Web with Linked Data

Dataset

users 394

tweets 2,489,088 location 11% pictures 833,441 location 70.6%(within 10km)

duration 11 months

Page 18: Leveraging User Modeling on the Social Web with Linked Data

18 Leveraging User Modeling on the Social Web with Linked Data

Dataset Characteristics: Profile Sizes

0% 20% 40% 60% 80% 100%user percentiles

0

0.001

0.01

0.1

1

10

100ra

tio b

etw

een

Flic

kr-b

ased

pro

file

size

a

nd T

witt

er-b

ased

pro

file

size size of Flickr-based profile is greater than

size of Twitter-based profile

size of Twitter-based profile is greater than size of Flickr-based profile Flickr-based

profile is empty

81.4% of the users, Twitter-based profiles are

bigger than Flickr-based profiles

Page 19: Leveraging User Modeling on the Social Web with Linked Data

19 Leveraging User Modeling on the Social Web with Linked Data

Experimental Setup • Task:

= Recommending POIs = Predicting POIs which a user will visit

• Ground truth: •  split data into training data (= first nine months) and test data (= last

two months) •  POIs that the user visited in the last two months are considered as

relevant

• Metrics: •  Precision@k, Recall@k and F-Measure@k: precision, recall and f-

measure within the top k of the ranking of recommended items

Page 20: Leveraging User Modeling on the Social Web with Linked Data

20 Leveraging User Modeling on the Social Web with Linked Data

Results: Impact of User Data Source Selection

!"!#$"!#%"!#&"!#'"!#("!#)"!#*"!#+"

,-./01" 23.451" ,-./01"6"23.451"

!"#$%&%'()*+#$,--*,(.

*/01

#,&2"#*

78$!"

98$!"

,8$!"

Combination of Twitter and Flickr user data

allows for best performance

Twitter seems to be more valuable for the

given application

Page 21: Leveraging User Modeling on the Social Web with Linked Data

21 Leveraging User Modeling on the Social Web with Linked Data

Results: Impact of Strategies for Exploiting RDF-based Background Knowledge

!"!#$"!#%"!#&"!#'"!#("!#)"!#*"!#+"!#,"

-./012"304564"

.4-./012"304564"7"

.4-./012"304564"77"

!"#$%&%'()*+#$,--*,(.

*/01

#,&2"#*

89$!"

:9$!"

;9$!"

The more background information the better

the user modeling performance

Page 22: Leveraging User Modeling on the Social Web with Linked Data

22 Leveraging User Modeling on the Social Web with Linked Data

Results: Combining different Strategies Leveraging User Modeling on the Social Web with Linked Data 13

Table 2. Overview on the di�erent strategies for integrating background knowledge.Twitter and Flickr are exploited as user data source.

strategy P@10 R@10 F@10 P@20 R@20 F@20

core strategies:

direct mentions 0.715 0.260 0.412 0.580 0.179 0.298

indirect mentions I 0.699 0.268 0.426 0.566 0.185 0.308

indirect mentions II 0.820 0.312 0.475 0.727 0.436 0.569

combined strategies:

direct & indirect mentions I 0.733 0.216 0.360 0.608 0.287 0.416

direct & indirect mentions II 0.836 0.333 0.4975 0.747 0.466 0.596

indirect mentions I + II 0.830 0.325 0.489 0.739 0.456 0.587

direct & indirect mentions I + II 0.839 0.337 0.502 0.751 0.473 0.603

which does not exploit RDF statements from the Linked Open Data Cloud,the F@20 performance is more than doubled. The consideration of backgroundknowledge obtained from the Linked Open Data Cloud therefore clearly improvesthe performance of user modeling on the Social Web which answers our mainresearch question.

Furthermore, we can answer the specific research questions raised at thebeginning of this section as follows. For the task of recommending points ofinterests, it turns out that (1) the aggregation of Twitter and Flickr user dataallows for the best user modeling performance and that (2) the user modelingquality increases the more background information we consider from the LinkedOpen Data Cloud. Finally, (3) the best performance is achieved by combiningthe di�erent graph patterns for acquiring background information and inferringuser preferences.

6 ConclusionsIn this paper, we proposed a framework for leveraging user modeling on the So-cial Web with information from the Linked Open Data Cloud. Our frameworkmonitors user activities on Social Web platforms such as Twitter and Flickr, in-fers the semantic meaning of user activities and provides strategies for gatheringbackground information from the Web of Data to generate semantically mean-ingful user profiles that support a given application. We showcase and evaluateour framework in the context of a geospatial recommender system where the corechallenge relies in deducing user preferences for points of interests. Therefore, wealso present a method that allows for the semantic enrichment of Flickr picturesby (i) estimating the geographical location where a picture was taken and (ii)exploiting GeoNames in order to identify related DBpedia concepts.

Our evaluation proves the e�ectiveness of our user modeling framework.Based on a large Twitter and Flickr dataset of more than 2.4 million tweets

Leveraging User Modeling on the Social Web with Linked Data 13

Table 2. Overview on the di�erent strategies for integrating background knowledge.Twitter and Flickr are exploited as user data source.

strategy P@10 R@10 F@10 P@20 R@20 F@20

core strategies:

direct mentions 0.715 0.260 0.412 0.580 0.179 0.298

indirect mentions I 0.699 0.268 0.426 0.566 0.185 0.308

indirect mentions II 0.820 0.312 0.475 0.727 0.436 0.569

combined strategies:

direct & indirect mentions I 0.733 0.216 0.360 0.608 0.287 0.416

direct & indirect mentions II 0.836 0.333 0.4975 0.747 0.466 0.596

indirect mentions I + II 0.830 0.325 0.489 0.739 0.456 0.587

direct & indirect mentions I + II 0.839 0.337 0.502 0.751 0.473 0.603

which does not exploit RDF statements from the Linked Open Data Cloud,the F@20 performance is more than doubled. The consideration of backgroundknowledge obtained from the Linked Open Data Cloud therefore clearly improvesthe performance of user modeling on the Social Web which answers our mainresearch question.

Furthermore, we can answer the specific research questions raised at thebeginning of this section as follows. For the task of recommending points ofinterests, it turns out that (1) the aggregation of Twitter and Flickr user dataallows for the best user modeling performance and that (2) the user modelingquality increases the more background information we consider from the LinkedOpen Data Cloud. Finally, (3) the best performance is achieved by combiningthe di�erent graph patterns for acquiring background information and inferringuser preferences.

6 ConclusionsIn this paper, we proposed a framework for leveraging user modeling on the So-cial Web with information from the Linked Open Data Cloud. Our frameworkmonitors user activities on Social Web platforms such as Twitter and Flickr, in-fers the semantic meaning of user activities and provides strategies for gatheringbackground information from the Web of Data to generate semantically mean-ingful user profiles that support a given application. We showcase and evaluateour framework in the context of a geospatial recommender system where the corechallenge relies in deducing user preferences for points of interests. Therefore, wealso present a method that allows for the semantic enrichment of Flickr picturesby (i) estimating the geographical location where a picture was taken and (ii)exploiting GeoNames in order to identify related DBpedia concepts.

Our evaluation proves the e�ectiveness of our user modeling framework.Based on a large Twitter and Flickr dataset of more than 2.4 million tweets

Leveraging User Modeling on the Social Web with Linked Data 13

Table 2. Overview on the di�erent strategies for integrating background knowledge.Twitter and Flickr are exploited as user data source.

strategy P@10 R@10 F@10 P@20 R@20 F@20

core strategies:

direct mentions 0.715 0.260 0.412 0.580 0.179 0.298

indirect mentions I 0.699 0.268 0.426 0.566 0.185 0.308

indirect mentions II 0.820 0.312 0.475 0.727 0.436 0.569

combined strategies:

direct & indirect mentions I 0.733 0.216 0.360 0.608 0.287 0.416

direct & indirect mentions II 0.836 0.333 0.4975 0.747 0.466 0.596

indirect mentions I + II 0.830 0.325 0.489 0.739 0.456 0.587

direct & indirect mentions I + II 0.839 0.337 0.502 0.751 0.473 0.603

which does not exploit RDF statements from the Linked Open Data Cloud,the F@20 performance is more than doubled. The consideration of backgroundknowledge obtained from the Linked Open Data Cloud therefore clearly improvesthe performance of user modeling on the Social Web which answers our mainresearch question.

Furthermore, we can answer the specific research questions raised at thebeginning of this section as follows. For the task of recommending points ofinterests, it turns out that (1) the aggregation of Twitter and Flickr user dataallows for the best user modeling performance and that (2) the user modelingquality increases the more background information we consider from the LinkedOpen Data Cloud. Finally, (3) the best performance is achieved by combiningthe di�erent graph patterns for acquiring background information and inferringuser preferences.

6 ConclusionsIn this paper, we proposed a framework for leveraging user modeling on the So-cial Web with information from the Linked Open Data Cloud. Our frameworkmonitors user activities on Social Web platforms such as Twitter and Flickr, in-fers the semantic meaning of user activities and provides strategies for gatheringbackground information from the Web of Data to generate semantically mean-ingful user profiles that support a given application. We showcase and evaluateour framework in the context of a geospatial recommender system where the corechallenge relies in deducing user preferences for points of interests. Therefore, wealso present a method that allows for the semantic enrichment of Flickr picturesby (i) estimating the geographical location where a picture was taken and (ii)exploiting GeoNames in order to identify related DBpedia concepts.

Our evaluation proves the e�ectiveness of our user modeling framework.Based on a large Twitter and Flickr dataset of more than 2.4 million tweets

Combining all background exploitation strategies improves the user modeling performance clearly

Page 23: Leveraging User Modeling on the Social Web with Linked Data

23 Leveraging User Modeling on the Social Web with Linked Data

Conclusions

What we did: • LOD-based User Modeling on the Social Web • Different strategies for exploiting RDF-based background

knowledge Findings: 1.  Combination of different user data sources (Flickr & Twitter) is

beneficial for the user modeling performance 2.  User modeling quality increases the more background

knowledge one considers 3.  Combination of strategies achieves the best performance

Future work: • Investigate weighting schemes that weight the different RDF

graph patterns for acquiring background knowledge differently

Page 24: Leveraging User Modeling on the Social Web with Linked Data

24 Leveraging User Modeling on the Social Web with Linked Data

Thank you!

Email: [email protected] Twitter: @wisdelft @taubau http://persweb.org