Determining Relevance Rankings
from Search Click Logs
Dr. Carson Kai-Sang Leung
Inderjeet Singh (Database and Data Mining Lab)
04/11/2023Comp 7220 2
Road Map
Introduction
Problem
Solution Methodology
Evaluation
What are search click logs?
How big are these logs?
◦ 10+ terabytes of entries each day
◦ Composed of billions of distinct (query, URL) pairs
What are their uses?
◦ Mining user behaviour/preferences
◦ Predicting document relevance
◦ Re-ranking the search results
◦ Comparing different ranking functions (train/test)
◦ Optimizing ad performance
◦ Query suggestions
Introduction
Documents/results presented in order of relevance to the query
Many ranking factors considered when ranking these results
Ranking factors depend on query, document and query-document pair
Improving ranking based on user preferences (likes/dislikes)
Personalized search + social search
Recency (temporal) ranking
[David Green; blog]
User clicks as Votes (Intrinsic feedback)
# of clicks received
[CIKM'09 Tutorial]
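The "clicks as votes" idea can be sketched in a few lines: each click on a URL for a query counts as one vote, and URLs are ordered by votes received. The log records and the function names `count_votes` and `rank_by_votes` below are invented for illustration and are not taken from the talk or the tutorial.

```python
from collections import Counter

# Hypothetical click-log records: (query, clicked_url). Invented for illustration.
click_log = [
    ("ddr3 memory", "example.com/a"),
    ("ddr3 memory", "example.com/b"),
    ("ddr3 memory", "example.com/a"),
    ("gmail", "mail.example.com"),
]

def count_votes(log):
    """Count clicks per (query, url) pair, treating each click as one vote."""
    return Counter(log)

def rank_by_votes(log, query):
    """Return (url, votes) for `query`, most-clicked first, ties broken by URL."""
    votes = count_votes(log)
    urls = [(url, n) for (q, url), n in votes.items() if q == query]
    return sorted(urls, key=lambda pair: (-pair[1], pair[0]))
```

Raw counts like these ignore position and trust bias, which is what the click models discussed later try to correct for.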
Problem
Trust factor: users prefer certain URLs over others, e.g., wikipedia.com, stackoverflow.com, Yahoo Answers, about.com
What is missing (in previous models)?
◦ Modelling the trust factor
◦ Clicks on sponsored results
◦ Related queries/searches (sidebars)
◦ Realistic and flexible assumptions on user behaviour
Different intents for search
1. Informational query – “DDR3 memory”, “SATA 3 hard drives”, “American history”
2. Navigational query – “gmail”, “digg”, “CIBC”, “CIBC credit cards”
How does user behaviour modelling work?
[Figure: flowchart of a search session. For successive results the user passes through the decision nodes "Snippet Examine?", "Snippet Attractive?" and "Enough Utility?" (Yes/No branches); the session terminates at "End".]
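The examine / attractive / utility decisions above can be simulated with a small sketch. It is a simplified, cascade-style session model in the spirit of the click models cited in the references, not the model proposed in this talk. The parameter names (`attractiveness`, `utility`, `perseverance`) and the assumption that the user scans strictly top to bottom are illustrative choices.

```python
import random

def simulate_session(attractiveness, utility, perseverance=0.9, rng=None):
    """Simulate one top-to-bottom scan of a result list.

    attractiveness[i]: probability the snippet at rank i is clicked once examined.
    utility[i]: probability the clicked result satisfies the user.
    perseverance: probability of moving on to the next result when unsatisfied.
    Returns the list of clicked ranks (0-based).
    """
    rng = rng or random.Random()
    clicks = []
    for rank, (a, u) in enumerate(zip(attractiveness, utility)):
        # The user examines this snippet and clicks only if it looks attractive.
        if rng.random() < a:
            clicks.append(rank)
            # Enough utility: the user is satisfied and the session ends.
            if rng.random() < u:
                break
        # An unsatisfied user may abandon the session instead of reading on.
        if rng.random() >= perseverance:
            break
    return clicks
```

Running many simulated sessions with known parameters gives synthetic click logs, which is one way to check that a fitting procedure recovers the relevance values it should.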
Solution Methodology
Realistic and flexible assumptions on user behaviour (session modelling)
Consider trust bias (trust factor)
Order the results for a particular query by the relevance scores predicted by the model
Compare this order to the editorial ranking
Is it a good model? Yes, if the two orderings agree to a considerable extent
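One simple way to quantify how far the model's ordering agrees with the editorial ranking is the fraction of document pairs that both order the same way. This is a sketch of one possible agreement measure, not necessarily the one used in this work; the function name `pairwise_agreement` and the dictionary inputs are assumptions made for the example.

```python
from itertools import combinations

def pairwise_agreement(model_scores, editorial_grades):
    """Fraction of document pairs ordered the same way by both rankings.

    Both arguments map a document id to a number (higher means more relevant).
    Pairs tied in the editorial grades are skipped as uninformative.
    """
    agree = total = 0
    for d1, d2 in combinations(sorted(editorial_grades), 2):
        e = editorial_grades[d1] - editorial_grades[d2]
        if e == 0:
            continue
        m = model_scores[d1] - model_scores[d2]
        total += 1
        # Same sign means both rankings prefer the same document of the pair.
        if m * e > 0:
            agree += 1
    return agree / total if total else 0.0
```

For example, with model scores `{"a": 0.9, "b": 0.5, "c": 0.1}` and editorial grades `{"a": 3, "b": 1, "c": 2}`, two of the three pairs agree, so the function returns 2/3.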
What next?
Test the ranking function on different classes of queries for metric gains
If the metrics improve over the baseline ranking function, the model's insights can be used as a feature in the ranking function
Deriving a retrieval/ranking function
Deploy this model as a feature/factor for predicting relevance in a learning-to-rank algorithm
Evaluation
Metrics
• Discounted Cumulative Gain (DCG)
• Normalized DCG (NDCG)
• Precision
• Recall
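As a concrete reference for the first two metrics, here is a minimal sketch of DCG and NDCG. It assumes the common formulation with gain 2^g - 1 and a log2(rank + 1) discount; other variants exist, and the slides do not say which one is used.

```python
import math

def dcg(grades, k=None):
    """DCG of relevance grades given in ranked order (best-ranked first)."""
    grades = grades[:k] if k is not None else grades
    # enumerate is 0-based, so rank = i + 1 and the discount is log2(rank + 1).
    return sum((2 ** g - 1) / math.log2(i + 2) for i, g in enumerate(grades))

def ndcg(grades, k=None):
    """DCG divided by the DCG of the ideal (descending-grade) ordering."""
    ideal = dcg(sorted(grades, reverse=True), k)
    return dcg(grades, k) / ideal if ideal > 0 else 0.0
```

`grades` would typically be the editorial relevance grades of the documents in the order the ranking function returned them, so a perfect ordering gives an NDCG of 1.0.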
Two types of data
1. Search click logs (from real or meta search engines)
2. Benchmarking dataset LEarning TO Rank (LETOR) for information retrieval
[Chapelle and Zhang, 2009]
[Guo et al., 2009]
References
David Green. Blog. http://davidgreen.com/comparative-value-of-google-search-rankings (accessed 20 April 2011)
Fan Guo and Chao Liu. Statistical Models for Web Search Click Log Analysis. Tutorial, 2009
Fan Guo, Chao Liu, and Yi Min Wang. Efficient multiple-click models in web search. In Proceedings of the Second Web Search and Data Mining (WSDM) Conference, Barcelona, Spain, pages 124-131. ACM, 9-11 February 2009
Olivier Chapelle and Ye Zhang. A dynamic Bayesian network click model for web search and ranking. In Proceedings of the 18th International Conference on World Wide Web (WWW), Madrid, Spain, pages 1-10. ACM, 20-24 April 2009
[Tmcnet.com Blog]