Determining Relevance Rankings from Search Click Logs

Dr. Carson Kai-Sang Leung

Inderjeet Singh (Database and Data Mining Lab)

04/11/2023, Comp 7220

Road Map

• Introduction
• Problem
• Solution Methodology
• Evaluation

What are search click logs?

How big are these logs?
◦ 10+ terabytes of entries each day
◦ Composed of billions of distinct (query, URL) pairs

What are their uses?

• Mining user behaviour/preferences
• Predicting document relevance
• Re-ranking the search results
• Comparing different ranking functions (train/test)
• Optimizing ad performance
• Query suggestions

Introduction

• Documents/results are presented in order of relevance to the query
• Many ranking factors are considered when ranking these results
• Ranking factors depend on the query, the document and the query-document pair
• Improving ranking based on user preferences (likes/dislikes)
• Personalized search + Social search
• Recency (temporal) ranking

[David Green; blog]

User clicks as votes (implicit feedback)

# of clicks received

[CIKM'09 Tutorial]
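Treating clicks as votes can be sketched as a simple count per (query, URL) pair. The log entries and URLs below are made up for illustration, and real click logs carry many more fields. Raw counts also ignore where on the page a result was shown, which is one reason richer click models are needed.

```python
from collections import Counter

# Hypothetical log entries: (query, clicked_url).
click_log = [
    ("ddr3 memory", "a.example/ddr3"),
    ("ddr3 memory", "b.example/ram"),
    ("ddr3 memory", "a.example/ddr3"),
    ("gmail", "mail.example"),
]

def rank_by_votes(log, query):
    """Order the URLs clicked for `query` by number of clicks received."""
    votes = Counter(url for q, url in log if q == query)
    return [url for url, _ in votes.most_common()]

ranking = rank_by_votes(click_log, "ddr3 memory")
```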

Problem

Trust factor: users prefer certain URLs over others, e.g., wikipedia.com, stackoverflow.com, Yahoo Answers, about.com

What is missing (in previous models)?

• Modelling the trust factor
• Clicks on sponsored results
• Related queries/searches (sidebars)
• Realistic and flexible assumptions on user behaviour

Different intents for search

1. Informational query – “DDR3 memory”, “SATA 3 hard drives”, “American history”

2. Navigational query – “gmail”, “digg”, “CIBC”, “CIBC credit cards”

How does user behaviour modelling work?

[Flowchart: for each result in turn, the user passes through three Yes/No decisions, "Snippet examined?", "Snippet attractive?" and "Enough utility?", until the session reaches "End".]
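The decision sequence above (examine the snippet, find it attractive, get enough utility) can be sketched as a small session simulator. This is only an illustration under stated assumptions, not the model from the talk: the branch taken on each "No" is assumed (an unexamined snippet ends the session, an unattractive one is skipped), and `p_examine` and the per-result probabilities are hypothetical values.

```python
import random

def simulate_session(results, rng, p_examine=0.9):
    """Return the positions clicked in one simulated session.

    `results` holds one (attractiveness, utility) probability pair
    per ranked result; all values are made up for illustration.
    """
    clicks = []
    for position, (attractive, utility) in enumerate(results):
        if rng.random() >= p_examine:   # snippet not examined: assume the user leaves
            break
        if rng.random() >= attractive:  # snippet not attractive: move to the next result
            continue
        clicks.append(position)         # attractive snippet is clicked
        if rng.random() < utility:      # enough utility: the session ends
            break
    return clicks

# Probabilities of 0.0 and 1.0 make this particular run deterministic.
session = simulate_session(
    [(1.0, 0.0), (0.0, 1.0), (1.0, 1.0)], random.Random(0), p_examine=1.0
)
```

Fitting such probabilities to observed sessions, rather than simulating from them, is what a click model does in practice.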

Solution Methodology

• Realistic and flexible assumptions on user behaviour (session modelling)
• Consider trust bias (trust factor)
• Order the results for a particular query by the relevance scores predicted by the model
• Compare this order to the editorial ranking
• Is it a good model? Yes, if the orderings agree to a considerable extent
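One way to quantify how far the model's ordering agrees with the editorial ranking is the fraction of document pairs that both rank the same way. This is a sketch of one possible agreement measure, not necessarily the one used in this work; the document ids, scores and grades are hypothetical.

```python
from itertools import combinations

def pairwise_agreement(model_scores, editorial_grades):
    """Fraction of document pairs that both orderings rank the same way.

    Pairs tied in the editorial grades are skipped.
    """
    agree = total = 0
    for a, b in combinations(model_scores, 2):
        grade_diff = editorial_grades[a] - editorial_grades[b]
        if grade_diff == 0:
            continue
        total += 1
        score_diff = model_scores[a] - model_scores[b]
        if score_diff * grade_diff > 0:  # same sign: the pair is ordered identically
            agree += 1
    return agree / total if total else 0.0

model_scores = {"d1": 0.9, "d2": 0.4, "d3": 0.6}   # predicted relevance (made up)
editorial_grades = {"d1": 3, "d2": 2, "d3": 1}     # editorial labels (made up)
agreement = pairwise_agreement(model_scores, editorial_grades)
```

Here two of the three pairs agree; only the pair (d2, d3) is ordered differently by the model.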

What next?

• Test the ranking function on different classes of queries for metric gains
• If there are metric gains over the baseline ranking function, model insights can be used as a feature in the ranking function
• Derive a retrieval/ranking function
• Deploy this model as a feature/factor for predicting relevance in a learning-to-rank algorithm

Evaluation

Metrics
• Discounted Cumulative Gain (DCG)
• Normalized DCG (NDCG)
• Precision
• Recall

Two types of data
1. Search click logs (from real or meta search engines)
2. Benchmark dataset: LEarning TO Rank (LETOR) for information retrieval
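As a reminder of how the first two metrics are computed, here is a minimal sketch of DCG and NDCG over a list of graded relevance labels given in ranked order. It assumes the common (2^grade - 1) / log2(rank + 1) formulation; other gain and discount choices exist, and the slides do not say which variant is used.

```python
import math

def dcg(grades, k=None):
    """DCG of relevance grades listed in ranked order, optionally cut off at k."""
    top = grades[:k] if k is not None else grades
    return sum(
        (2 ** g - 1) / math.log2(rank + 1)
        for rank, g in enumerate(top, start=1)
    )

def ndcg(grades, k=None):
    """DCG divided by the DCG of the ideal (descending-grade) ordering."""
    ideal = dcg(sorted(grades, reverse=True), k)
    return dcg(grades, k) / ideal if ideal > 0 else 0.0

perfect = ndcg([3, 2, 1, 0])   # already in ideal order
reversed_ = ndcg([0, 1, 2, 3])  # worst ordering of the same grades
```

NDCG is 1.0 for the ideal ordering and falls below 1.0 as relevant documents move down the list.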

[Chapelle and Zhang, 2009]

[Guo et al., 2009]

References

David Green. Blog. http://davidgreen.com/comparative-value-of-google-search-rankings (accessed 20 April 2011).

Fan Guo and Chao Liu. Statistical Models for Web Search Click Log Analysis. Tutorial, CIKM, 2009.

Fan Guo, Chao Liu, and Yi Min Wang. Efficient multiple-click models in web search. In Proceedings of the Second Web Search and Data Mining (WSDM) Conference, Barcelona, Spain, pages 124-131. ACM, 9-11 February 2009.

Olivier Chapelle and Ye Zhang. A dynamic Bayesian network click model for web search and ranking. In Proceedings of the 18th International Conference on World Wide Web (WWW), Madrid, Spain, pages 1-10. ACM, 20-24 April 2009.

[Tmcnet.com Blog]
