Thanh Vu, Computing and Communications Department, The Open University

Dynamic User Profiling for Search Personalisation

Page 1: Dynamic User Profiling for Search Personalisation

Thanh Vu
Computing and Communications Department
The Open University

Dynamic User Profiling for Search Personalisation

Page 2: Dynamic User Profiling for Search Personalisation

Classical Search Systems

AOL and AltaVista return search results based on the user's input query, regardless of the user's search preferences.

Different users who submit the same input query get the same result list.

Queries are usually short and ambiguous, e.g., "Michael Jordan" or "Java".

Different users have different information needs for the same input query.

Page 3: Dynamic User Profiling for Search Personalisation

Search Personalisation

Return search results based on:
• The input query
• The user's search interests

Different users who submit the same input query will probably get different search result lists.

Even an individual user will get different search results at different search times (e.g., Open US).

Page 4: Dynamic User Profiling for Search Personalisation

Part I: Dynamic group formation

Page 5: Dynamic User Profiling for Search Personalisation

The performance of search personalisation depends on the richness of a user profile.

J. Teevan, M. R. Morris, and S. Bush. Discovering and using groups to improve personalized search. In WSDM’2009

Page 6: Dynamic User Profiling for Search Personalisation

Topic-based user profiles

Use a human-generated ontology (ODP, dmoz.org) to extract topics from all clicked/relevant documents of a specific user to build her profile.

1. R. W. White, et al., Enhancing Personalized Search by Mining and Modeling Task Behavior. In WWW’2013
2. P. N. Bennett, et al., Modeling the impact of short- and long-term behavior on search personalization. In SIGIR’2012

Page 7: Dynamic User Profiling for Search Personalisation

Challenges for Human Generated Ontology

• New topics that are not covered in the ontology will possibly emerge over time
• Classifying each document into the correct categories, and maintaining that classification, requires expensive human effort

Page 8: Dynamic User Profiling for Search Personalisation

Enriching a user profile

Use information from the group of users who share common interests.

R. W. White, W. Chu, A. Hassan, X. He, Y. Song, and H. Wang. Enhancing personalized search by mining and modeling task behavior. WWW '13, pages 1411-1420, Switzerland, 2013. ACM.

Page 9: Dynamic User Profiling for Search Personalisation

Challenges for grouping methods

• Groups are constructed statically using predetermined criteria such as common clicked documents
• Users in a group may have different interests on different topics w.r.t. the input query

Z. Dou, R. Song, and J.-R. Wen. A large-scale evaluation and analysis of personalized search strategies. WWW '07, pages 581-590, NY, USA, 2007. ACM.

Page 10: Dynamic User Profiling for Search Personalisation

Research Question

How can we enrich user profiles with dynamic group formation?

1. How can we dynamically group users who share common interests?

2. How can we enrich user profiles with group information?

3. Can enriched user profiles help to improve search performance?

Page 11: Dynamic User Profiling for Search Personalisation

Dynamic group formation

The groups should be dynamically constructed in response to the user's input query.

Page 12: Dynamic User Profiling for Search Personalisation

Applying Latent Dirichlet Allocation

Page 13: Dynamic User Profiling for Search Personalisation

Constructing a user profile

Average the topic distributions of the user's relevant documents over topics.
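
A minimal sketch of this averaging step, assuming each relevant document is already represented as a topic distribution (for example, inferred by LDA). The function name `build_profile` and the toy three-topic numbers are illustrative and do not come from the slides.

```python
def build_profile(doc_topic_dists):
    """Average per-document topic distributions into a single user profile.

    doc_topic_dists: list of equal-length lists, one topic distribution
    per relevant document (each assumed to sum to 1).
    """
    n_docs = len(doc_topic_dists)
    n_topics = len(doc_topic_dists[0])
    return [
        sum(doc[z] for doc in doc_topic_dists) / n_docs
        for z in range(n_topics)
    ]


# Toy example: two relevant documents over three hypothetical topics.
relevant_docs = [
    [0.6, 0.3, 0.1],
    [0.2, 0.5, 0.3],
]
profile = build_profile(relevant_docs)
```

Because every input distribution sums to 1, the averaged profile is itself a distribution over topics.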

Page 14: Dynamic User Profiling for Search Personalisation

Query-dependent user grouping

• Construct shared user profiles
• Use the input query as an indicator for grouping users

Page 15: Dynamic User Profiling for Search Personalisation

Constructing a shared user profile

Page 16: Dynamic User Profiling for Search Personalisation

Query-dependent user grouping

P(q|z) =

Page 17: Dynamic User Profiling for Search Personalisation

Query-dependent user grouping

The 2-nearest users

0.45, 0.35, 0.20

Page 18: Dynamic User Profiling for Search Personalisation

Enriching a user profile

Average the profiles of all users in the group over topics.
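
The grouping and enrichment steps might be sketched as follows. This is only an illustration under stated assumptions: the per-user query affinity scores, the choice to include the current user's own profile in the average, and the names `nearest_users` and `enrich_profile` are all hypothetical, since the slides do not give the exact formulas.

```python
def nearest_users(query_affinity, k):
    """Return the ids of the k users with the highest affinity to the query.

    query_affinity: dict mapping user id -> score (hypothetical measure of
    how strongly the user is associated with the input query).
    """
    ranked = sorted(query_affinity, key=query_affinity.get, reverse=True)
    return ranked[:k]


def enrich_profile(user_profile, group_profiles):
    """Average the user's topic profile with those of the group members.

    Assumption: the current user's own profile is part of the average.
    """
    members = [user_profile] + list(group_profiles)
    n_topics = len(user_profile)
    return [sum(p[z] for p in members) / len(members) for z in range(n_topics)]


# Toy data: three candidate users and their topic profiles.
affinity = {"u1": 0.45, "u2": 0.35, "u3": 0.20}
profiles = {
    "u1": [0.6, 0.3, 0.1],
    "u2": [0.3, 0.6, 0.1],
    "u3": [0.1, 0.1, 0.8],
}
group = nearest_users(affinity, k=2)
enriched = enrich_profile([0.9, 0.0, 0.1], [profiles[u] for u in group])
```

Because the group is recomputed from the query-specific scores, a different query can yield a different group for the same user, which is the point of dynamic grouping.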

Page 19: Dynamic User Profiling for Search Personalisation

Re-ranking search results

For each input query q:

1. Download the top n ranked search results from the search engine.
2. Compute a personalised score p(d|u) for each web page d given the current user u.
3. Combine the personalised score p(d|u) and the original rank r(q,d) to get a final score f(d|u,q).
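
A minimal sketch of the re-ranking loop. The slide does not preserve how p(d|u) and r(q,d) are combined, so the division used here (personalised score divided by original rank) is an assumption chosen only for illustration; the function name `rerank` is likewise hypothetical.

```python
def rerank(results, personalised_score):
    """Re-rank results by a final score f(d|u,q).

    results: list of document ids in the original engine order (rank 1 first).
    personalised_score: dict mapping document id -> p(d|u).
    """
    scored = []
    for rank, doc in enumerate(results, start=1):
        # ASSUMPTION: f(d|u,q) = p(d|u) / r(q,d); the real combination
        # is not recoverable from the slide.
        final = personalised_score[doc] / rank
        scored.append((final, doc))
    # Sort by final score, highest first; ties keep the original order.
    scored.sort(key=lambda pair: -pair[0])
    return [doc for _, doc in scored]


original = ["a", "b", "c"]
p_d_given_u = {"a": 0.1, "b": 0.6, "c": 0.15}
reranked = rerank(original, p_d_given_u)
```

With these toy scores, document "b" moves to the top because its personalised score outweighs its lower original rank.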

Page 20: Dynamic User Profiling for Search Personalisation

Re-ranking search results

Query: MU

Page 21: Dynamic User Profiling for Search Personalisation

Dataset

• Query logs from the Bing search engine for 15 days, from 1st to 15th July 2012, covering 106 anonymous users
• A relevant document is a click with a dwell time of at least 30 seconds or the last click in a session (SAT click)
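
The SAT-click rule can be expressed directly in code. The sketch below assumes the clicks of one session are available in chronological order; the `Click` record and the function name `sat_clicks` are illustrative, not part of the original log format.

```python
from dataclasses import dataclass

SAT_DWELL_SECONDS = 30  # dwell-time threshold stated on the slide


@dataclass
class Click:
    url: str
    dwell_seconds: float


def sat_clicks(session_clicks):
    """Return the URLs of SAT clicks in one session.

    A click is a SAT click if its dwell time is at least 30 seconds
    or if it is the last click in the session.
    """
    sat = []
    last_index = len(session_clicks) - 1
    for i, click in enumerate(session_clicks):
        if click.dwell_seconds >= SAT_DWELL_SECONDS or i == last_index:
            sat.append(click.url)
    return sat


session = [Click("a", 5), Click("b", 42), Click("c", 3)]
relevant = sat_clicks(session)
```

Here "b" qualifies through its dwell time and "c" qualifies as the last click of the session.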

Page 22: Dynamic User Profiling for Search Personalisation

Evaluation metrics

• Inverse Average Rank (IAR)
• Personalisation Gain (P-Gain)
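
The slide names the two metrics without giving formulas. The sketch below uses definitions that are common in the personalised-search literature and should be read as an assumption: IAR as the inverse of the average rank of relevant documents, and P-Gain as (#better - #worse) / (#better + #worse), counting per-query rank changes against the baseline.

```python
def inverse_average_rank(relevant_ranks):
    """IAR: 1 / (average rank of the relevant documents). Higher is better."""
    return len(relevant_ranks) / sum(relevant_ranks)


def p_gain(baseline_ranks, personalised_ranks):
    """P-Gain: (#better - #worse) / (#better + #worse).

    Both arguments list, per query, the rank of the relevant document
    (a smaller rank is better) before and after personalisation.
    """
    pairs = list(zip(baseline_ranks, personalised_ranks))
    better = sum(1 for b, p in pairs if p < b)
    worse = sum(1 for b, p in pairs if p > b)
    if better + worse == 0:
        return 0.0
    return (better - worse) / (better + worse)


iar = inverse_average_rank([1, 2, 3])
gain = p_gain([3, 2, 5, 4], [1, 2, 6, 1])
```

In the toy data, two queries improve, one gets worse and one is unchanged, giving a gain of 1/3.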

Page 23: Dynamic User Profiling for Search Personalisation

Baseline and Personalisation Strategies

• Baseline: The original ranked results from Bing
• S_Profile: Use only the current user profile
• S_Group: Enrich the profile with a static group
• D_Group: Enrich the profile with a dynamic group

Page 24: Dynamic User Profiling for Search Personalisation

Overall Performance

Page 25: Dynamic User Profiling for Search Personalisation

Part II: Temporal User Profiles

Page 26: Dynamic User Profiling for Search Personalisation

Challenges for Time-awareness

• Previous methods use all the clicked/relevant documents of a user to build her search profile
• The documents are treated equally, without considering temporal features (i.e., the time at which documents were clicked and viewed)
• The resulting profile is too broad
• It cannot fully express the current interest of the user

1. T. T. Vu, et al., Improving search personalisation with dynamic group formation. In SIGIR’2014
2. K. Raman, et al., Toward whole-session relevance: Exploring intrinsic diversity in web search. In SIGIR’2013

Page 27: Dynamic User Profiling for Search Personalisation

Research Question

How can we build user profiles with time-awareness?

1. How can we build temporal user profiles?
2. Can the time-aware profiles help improve search performance?

Page 28: Dynamic User Profiling for Search Personalisation

Building temporal user profiles (1)

Non-temporal method

Clicked documents, each with its distribution over topics:

• 4th: Football 0.51, Law 0.33, Health 0.11, OS 0.05
• 3rd: Football 0.55, Law 0.27, OS 0.10, Health 0.08
• 2nd: Law 0.41, OS 0.37, Health 0.12, Football 0.10
• 1st: OS 0.65, Law 0.21, Football 0.10, Health 0.04

The topic-based user profile (means over topics):

Football 0.32, Law 0.30, OS 0.29, Health 0.09

Page 29: Dynamic User Profiling for Search Personalisation

Building temporal user profiles (2)

Our method

Clicked documents, with decay weights:

• 1st (weight 0.9^0): Football 0.51, Law 0.33, Health 0.11, OS 0.05

The temporal topic user profile:

Football 0.51, Law 0.33, Health 0.11, OS 0.05

Page 30: Dynamic User Profiling for Search Personalisation

Building temporal user profiles (2)

Clicked documents, with decay weights:

• 2nd (weight 0.9^1): Football 0.51, Law 0.33, Health 0.11, OS 0.05
• 1st (weight 0.9^0): Football 0.55, Law 0.27, OS 0.10, Health 0.08

The temporal topic user profile:

Football 0.53, Law 0.30, Health 0.09, OS 0.08

Page 31: Dynamic User Profiling for Search Personalisation

Building temporal user profiles (2)

Clicked documents, with decay weights:

• 3rd (weight 0.9^2): Football 0.51, Law 0.33, Health 0.11, OS 0.05
• 2nd (weight 0.9^1): Football 0.55, Law 0.27, OS 0.10, Health 0.08
• 1st (weight 0.9^0): Law 0.41, OS 0.37, Health 0.12, Football 0.10

The temporal topic user profile:

Football 0.37, Law 0.34, OS 0.19, Health 0.10

Page 32: Dynamic User Profiling for Search Personalisation

Building temporal user profiles (2)

Clicked documents, with decay weights:

• 4th (weight 0.9^3): Football 0.51, Law 0.33, Health 0.11, OS 0.05
• 3rd (weight 0.9^2): Football 0.55, Law 0.27, OS 0.10, Health 0.08
• 2nd (weight 0.9^1): Law 0.41, OS 0.37, Health 0.12, Football 0.10
• 1st (weight 0.9^0): OS 0.65, Law 0.21, Football 0.10, Health 0.04

Temporal topic profile:

OS 0.32, Law 0.30, Football 0.29, Health 0.09

Non-temporal topic profile:

Football 0.32, Law 0.30, OS 0.29, Health 0.09

Page 33: Dynamic User Profiling for Search Personalisation

Building temporal user profiles (3)

• D_u = {d_1, d_2, …, d_n} is the set of relevant documents of the user u
• The user profile of u is a distribution over the topics Z (extracted by LDA)
• t_{d_i} = n indicates that d_i is the nth most recent relevant/clicked document of u
• α is the decay parameter; K is the normalisation factor
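
A minimal sketch of the decay-weighted profile, assuming it takes the form p(z|u) = (1/K) * sum over d_i in D_u of α^(t_{d_i} - 1) * p(z|d_i), with K equal to the sum of the weights. This form is inferred from the worked example on the preceding slides (weights of 0.9 raised to the powers 0 to 3) rather than quoted from the slide; the function name `temporal_profile` is illustrative.

```python
def temporal_profile(docs_most_recent_first, alpha=0.9):
    """Decay-weighted average of document topic distributions.

    docs_most_recent_first: list of dicts (topic -> probability), where
    index 0 is the most recent relevant document (t = 1).
    alpha: decay parameter; older documents get weight alpha ** (t - 1).
    """
    weights = [alpha ** t for t in range(len(docs_most_recent_first))]
    k = sum(weights)  # normalisation factor K
    topics = docs_most_recent_first[0].keys()
    return {
        z: sum(w * d[z] for w, d in zip(weights, docs_most_recent_first)) / k
        for z in topics
    }


# The four clicked documents from the worked example, most recent first.
docs = [
    {"OS": 0.65, "Law": 0.21, "Football": 0.10, "Health": 0.04},
    {"Law": 0.41, "OS": 0.37, "Health": 0.12, "Football": 0.10},
    {"Football": 0.55, "Law": 0.27, "OS": 0.10, "Health": 0.08},
    {"Football": 0.51, "Law": 0.33, "Health": 0.11, "OS": 0.05},
]
profile = temporal_profile(docs, alpha=0.9)
```

Rounded to two decimals, this gives OS 0.32, Law 0.30, Football 0.29 and Health 0.09, so the most recently clicked topic (OS) ranks first, whereas the plain average over the same documents puts Football first.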

Page 34: Dynamic User Profiling for Search Personalisation

Building temporal user profiles (4)

• Long-term user profile: uses relevant documents extracted from the user's whole search history
• Daily user profile: uses relevant documents extracted from the search history of the user in the current search day
• Session user profile: uses relevant documents extracted from the search history of the user in the current search session

Page 35: Dynamic User Profiling for Search Personalisation

Re-ranking search results (1)

Returned documents in their original rank, each with its distribution over topics:

• Health 0.51, Law 0.33, Football 0.11, OS 0.05
• Football 0.55, Law 0.27, Health 0.13, OS 0.05
• Football 0.41, OS 0.37, Health 0.12, Law 0.10

The user profile (p):

Football 0.47, Law 0.24, OS 0.16, Health 0.12

After re-ranking: the same three documents, reordered by how well they match the user profile.

Page 36: Dynamic User Profiling for Search Personalisation

Re-ranking search results (2)

Personalised scores: use the Jensen-Shannon divergence D_JS[d || p] between each returned document d and the user profile p.

Returned documents (d):

• Health 0.51, Law 0.33, Football 0.11, OS 0.05
• Football 0.55, Law 0.27, Health 0.13, OS 0.05
• Football 0.41, OS 0.37, Health 0.12, Law 0.10

The user profile (p):

Football 0.47, Law 0.24, OS 0.16, Health 0.12
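
The Jensen-Shannon divergence between a document and the profile can be computed as below. This is a sketch using the standard definition with base-2 logarithms; treating a smaller divergence as a better match, and re-normalising the profile (whose rounded values sum to 0.99), are assumptions made for the example.

```python
import math


def kl_divergence(p, q):
    """Kullback-Leibler divergence KL(p || q) in bits; terms with p_i = 0 are skipped."""
    return sum(pi * math.log2(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def js_divergence(p, q):
    """Jensen-Shannon divergence: average KL of p and q to their midpoint."""
    m = [(pi + qi) / 2 for pi, qi in zip(p, q)]
    return 0.5 * kl_divergence(p, m) + 0.5 * kl_divergence(q, m)


def normalise(dist):
    total = sum(dist)
    return [x / total for x in dist]


# Topic order: Football, Law, OS, Health (values taken from the slide).
profile = normalise([0.47, 0.24, 0.16, 0.12])
documents = [
    [0.11, 0.33, 0.05, 0.51],  # Health-heavy document
    [0.55, 0.27, 0.05, 0.13],  # Football/Law document
    [0.41, 0.10, 0.37, 0.12],  # Football/OS document
]
# Indices of the documents, closest to the profile first.
order = sorted(range(len(documents)),
               key=lambda i: js_divergence(documents[i], profile))
```

Under these assumptions the Football/Law document is closest to the profile and the Health-heavy document is furthest from it.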

Page 37: Dynamic User Profiling for Search Personalisation

Re-ranking search results (3)

Re-ranking algorithm: LambdaMART [1]

Re-ranking features:

Feature          Description
Personalised features
LongTermScore    Personalised score between document and long-term profile
DailyScore       Personalised score between document and daily profile
SessionScore     Personalised score between document and session profile
Non-personalised features
DocRank          Rank of document on the original returned list
QuerySim         Cosine similarity score between current and previous queries
QueryNo          Total number of queries that have been submitted in the current search session (including the current query)

1. C. J. Burges, et al., Learning to rank with non-smooth cost functions. In NIPS’2007.

Page 38: Dynamic User Profiling for Search Personalisation

Evaluation

Dataset

• The query logs of 1166 anonymous users over four weeks, from 1st to 28th July 2012
• A log entry consists of an anonymous user identifier, a query, the top-10 returned URLs, and the clicked documents along with the user's dwell time
• The content of all the URLs is downloaded for learning topics
• A search session is demarcated by 30 minutes of user inactivity
• A relevant document is a click with a dwell time of at least 30 seconds or the last click in a session (SAT click)
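
Session demarcation by inactivity can be sketched as follows, assuming one user's query timestamps are sorted chronologically. Treating a gap of exactly 30 minutes as the start of a new session is an assumption, and the function name `split_sessions` is illustrative.

```python
from datetime import datetime, timedelta

SESSION_GAP = timedelta(minutes=30)  # inactivity threshold from the slide


def split_sessions(timestamps):
    """Group chronologically sorted timestamps into search sessions.

    A new session starts whenever the gap since the previous event is
    30 minutes or more.
    """
    sessions = []
    for ts in timestamps:
        if sessions and ts - sessions[-1][-1] < SESSION_GAP:
            sessions[-1].append(ts)
        else:
            sessions.append([ts])
    return sessions


events = [
    datetime(2012, 7, 1, 10, 0),
    datetime(2012, 7, 1, 10, 10),
    datetime(2012, 7, 1, 10, 45),  # 35 minutes after the previous event
    datetime(2012, 7, 1, 11, 0),
]
sessions = split_sessions(events)
```

The 35-minute gap splits the four events into two sessions of two events each.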

Page 39: Dynamic User Profiling for Search Personalisation

Evaluation methodology

• Assign a positive (relevant) label to a returned URL if:
  • it is a SAT click in the current query, or
  • it is a SAT click in one of the other repeated queries in the same search session
• Assign negative (irrelevant) labels to the rest of the URLs

Page 40: Dynamic User Profiling for Search Personalisation

Personalisation Methods and Baselines

Personalisation methods:
• LON uses only LongTermScore from the long-term profile
• DAI uses only DailyScore from the daily profile
• SES uses only SessionScore from the session profile
• ALL uses all personalised scores from the three profiles

Baselines:
• Default is the default ranking returned by the search engine
• Static uses the LongTermScore from the long-term profile without time-awareness (i.e., without the decay function)

Page 41: Dynamic User Profiling for Search Personalisation

Results

Evaluation metrics:
• Mean Average Precision (MAP)
• Precision (P@k)
• Mean Reciprocal Rank (MRR)
• Normalized Discounted Cumulative Gain (nDCG@k)

For each evaluation metric, a higher value indicates a better ranking.
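
For reference, the per-query forms of these metrics can be sketched with binary relevance labels listed in ranked order. MAP and MRR are then the means of the per-query average precision and reciprocal rank over all queries. The function names are illustrative, and the nDCG variant shown (gain equal to the label, log base 2 discount) is one common choice rather than necessarily the one used in the evaluation.

```python
import math


def precision_at_k(labels, k):
    """Fraction of the top-k results that are relevant."""
    return sum(labels[:k]) / k


def average_precision(labels):
    """Mean of precision values at the rank of each relevant result."""
    hits, total = 0, 0.0
    for i, rel in enumerate(labels, start=1):
        if rel:
            hits += 1
            total += hits / i
    return total / hits if hits else 0.0


def reciprocal_rank(labels):
    """1 / rank of the first relevant result, or 0 if there is none."""
    for i, rel in enumerate(labels, start=1):
        if rel:
            return 1.0 / i
    return 0.0


def ndcg_at_k(labels, k):
    """DCG of the ranking divided by the DCG of the ideal ordering."""
    def dcg(ls):
        return sum(rel / math.log2(i + 1) for i, rel in enumerate(ls[:k], start=1))
    ideal = dcg(sorted(labels, reverse=True))
    return dcg(labels) / ideal if ideal else 0.0


ranked_labels = [0, 1, 0, 1]  # 1 = relevant (e.g. a SAT click), 0 = not
p_at_2 = precision_at_k(ranked_labels, 2)
ap = average_precision(ranked_labels)
rr = reciprocal_rank(ranked_labels)
ndcg = ndcg_at_k(ranked_labels, 4)
```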

Page 42: Dynamic User Profiling for Search Personalisation

Overall Performance

• All the improvements over the baselines are significant (paired t-test, p < 0.001)

Page 43: Dynamic User Profiling for Search Personalisation

Overall Performance

Page 44: Dynamic User Profiling for Search Personalisation

Overall Performance

Page 45: Dynamic User Profiling for Search Personalisation

Overall Performance

Page 46: Dynamic User Profiling for Search Personalisation

Overall Performance

Page 47: Dynamic User Profiling for Search Personalisation

Takeaways

Dynamic grouping:
• Grouping improves search performance
• Dynamic grouping outperforms static grouping

Temporal profiles:
• The three temporal profiles help to improve search performance over the default ranking and the use of a non-temporal profile
• Using all features (ALL) achieves the highest performance
• The short-term profile achieves better performance than the longer-term profiles

Page 48: Dynamic User Profiling for Search Personalisation

Thank you! Any questions?

Page 49: Dynamic User Profiling for Search Personalisation

Dataset (2)

Page 50: Dynamic User Profiling for Search Personalisation

Example of query logs

Page 51: Dynamic User Profiling for Search Personalisation

Click Entropies

• P(d|q) is the percentage of the clicks on document d among all the clicks for q
• A smaller query click entropy value indicates more agreement between users on clicking a small number of web pages
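
A minimal sketch of query click entropy, assuming the usual definition: the sum over clicked documents d of -P(d|q) * log2 P(d|q). The slide's own formula did not survive, so this definition and the function name `click_entropy` are assumptions.

```python
import math
from collections import Counter


def click_entropy(clicked_docs):
    """Click entropy of one query, in bits.

    clicked_docs: list with one document id per click recorded for the query.
    """
    counts = Counter(clicked_docs)
    total = sum(counts.values())
    entropy = 0.0
    for count in counts.values():
        p = count / total  # P(d|q)
        entropy -= p * math.log2(p)
    return entropy


focused = click_entropy(["a", "a", "a", "a"])   # all users click the same page
scattered = click_entropy(["a", "b", "c", "d"])  # clicks spread over four pages
```

When every click lands on the same page the entropy is 0; when clicks are spread evenly over four pages it is 2 bits, reflecting less agreement between users.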

Page 52: Dynamic User Profiling for Search Personalisation

Click entropies

Page 53: Dynamic User Profiling for Search Personalisation

Query Positions in Search Session

• Aim to study whether the position of a query has any effect on the performance of the temporal latent topic profiles
• Label the queries by their positions during the search

Page 54: Dynamic User Profiling for Search Personalisation

Clicked documents, each with its distribution over topics:

• Football 0.51, Law 0.33, Health 0.11, OS 0.05
• Football 0.55, Law 0.27, Health 0.13, OS 0.05
• Law 0.41, OS 0.37, Health 0.12, Football 0.10
• OS 0.65, Law 0.15, Football 0.11, Health 0.09

The topic-based user profile (means over topics):

Football 0.32, Law 0.29, OS 0.28, Health 0.11

Page 55: Dynamic User Profiling for Search Personalisation

Re-ranking search results (1)

Query: MU

Page 56: Dynamic User Profiling for Search Personalisation

Pre-processing

• Remove the queries whose positive label set is empty from the dataset
• Discard the domain-related queries (e.g., Facebook, Youtube)

Page 57: Dynamic User Profiling for Search Personalisation

Overall Performance
