How search logs can help improve future searches
Arjen P. de Vries
[email protected]

Twente ir-course 20-10-2010



Guest lecture for the MSc Information Retrieval course, University of Twente, October 20th, 2010.


Page 1: Twente ir-course 20-10-2010

How search logs can help improve future searches

Arjen P. de Vries

[email protected]

Page 2: Twente ir-course 20-10-2010

[Diagram: users and content, with content metadata, content indexing, interactions among users, and users' interaction with content.]

Page 3: Twente ir-course 20-10-2010

[Diagram: user and content, with content metadata, content indexing, and the user's interaction with content.]

Page 4: Twente ir-course 20-10-2010

(C) 2008, The New York Times Company

Anchor text: "continue reading"

Page 5: Twente ir-course 20-10-2010

Not much text to get you here...

A fan's Hyves page: Kyteman's HipHop Orchestra: www.kyteman.com

Ticket sales, Luxor theatre: May 22nd - Kyteman's Hiphop Orchestra - www.kyteman.com

Kluun.nl: Kyteman's site

Blog Rockin' Beats: The 21-year-old Kyteman (trumpet player, composer and producer Colin Benders) has worked for 3 years on his debut: the Hermit Sessions.

Jazzenzo: ... a performance by the popular Kyteman's Hiphop Orkest

Page 6: Twente ir-course 20-10-2010

'Co-creation' in social media:

The consumer becomes a co-creator; 'data consumption' leaves traces

In essence: many new sources to play the role of anchor text!

Page 7: Twente ir-course 20-10-2010

Tweets about blip.tv, e.g. http://blip.tv/file/2168377:

  "Amazing"
  "Watching 'World's most realistic 3D city models?'"
  "Google Earth/Maps killer"
  "Ludvig Emgard shows how maps/satellite pics on web is done (learn Google and MS!)"
  ... and ~120 more tweets

Page 8: Twente ir-course 20-10-2010
Page 9: Twente ir-course 20-10-2010

[Diagram: an information-need representation linked by clicks to several result representations, analogous to web pages linked by anchor text; the clicks act as relevance feedback / anchor text.]

Every search request is metadata!

That metadata is useful as an expanded content representation, to capture more diverse views on the same content and reduce the vocabulary difference between creators of content, indexers, and users; as a means to adapt retrieval systems to the user context; and even as training data for machine learning of multimedia 'detectors'!

Page 10: Twente ir-course 20-10-2010

Types of feedback

Explicit user feedback:
  Images/videos marked as relevant/non-relevant
  Selected keywords that are added to the query
  Selected concepts that are added to the query

Implicit user feedback:
  Clicking on retrieved images/videos (click-through data)
  Bookmarking or sharing an image/video
  Downloading/buying an image/video

Page 11: Twente ir-course 20-10-2010

Who interacts with the data?

Interactive relevance feedback:
  The current user in the current search
Personalisation:
  The current user in logged past searches
Context adaptation:
  Users similar to the current user in logged past searches
Collective knowledge:
  All users in logged past searches

Page 12: Twente ir-course 20-10-2010

Applications exploiting feedback

Given a query, rank all images/videos based on past users' feedback

Given an image/video, rank all images/videos based on past users' feedback (a sketch of this case follows below)
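One way to read the second application: images can be ranked by how similar their click histories are. A minimal sketch, assuming the logs have been inverted into a map from image id to the set of queries for which that image was clicked; Jaccard overlap is an assumed similarity, the slide does not prescribe a measure.

    def rank_similar_images(image_id, image_queries):
        # image_queries: image id -> set of queries for which the image was clicked
        # (the inverse view of the click log; an assumed representation).
        q = image_queries[image_id]
        scores = {
            other: len(q & qs) / len(q | qs)      # Jaccard overlap of click histories
            for other, qs in image_queries.items()
            if other != image_id and q & qs
        }
        return sorted(scores.items(), key=lambda item: item[1], reverse=True)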

Page 13: Twente ir-course 20-10-2010

Applications exploiting feedback

Interactive relevance feedback:
  Modify the query and re-rank, based on the current user's explicit feedback (and the current ranking)

Blind relevance feedback:
  Modify the query and re-rank, based on feedback by past users and the current ranking (sketched below)
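A minimal sketch of the blind-feedback idea under simple assumptions: the initial ranking is a list of (image, score) pairs, past feedback is a query -> image -> click-count map, and the two sources of evidence are combined with a linear weight alpha. The combination scheme and the alpha value are assumptions, not taken from the slides.

    def blind_feedback_rerank(query, ranking, click_graph, alpha=0.7):
        # ranking: list of (image_id, score) from the current system.
        # click_graph: query -> {image_id: click count}, built from the search logs.
        # alpha: assumed weight of the original score vs. the click evidence.
        clicks = click_graph.get(query, {})
        total = sum(clicks.values()) or 1
        reranked = [
            (img, alpha * score + (1 - alpha) * clicks.get(img, 0) / total)
            for img, score in ranking
        ]
        return sorted(reranked, key=lambda pair: pair[1], reverse=True)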

Page 14: Twente ir-course 20-10-2010

Applications exploiting feedback

Query suggestion:
  Recommend keywords/concepts to support users in interactive query modification (refinement or expansion)

Page 15: Twente ir-course 20-10-2010
Page 16: Twente ir-course 20-10-2010

'Police Sting': Sting performs with The Police

'Elton Diana': Sting attends Versace memorial service

'Led Zeppelin': Sting performs at Led Zeppelin concert

Page 17: Twente ir-course 20-10-2010

Exploiting User Logs (FP6 Vitalas T4.2)

Aim:
  Understand the information-searching process of professional users of a picture portal

Method:
  Building, in collaboration with Belga, an increasingly large dataset that contains the log of Belga's users' search interactions
  Processing, analysing, and investigating the use of this collective knowledge stored in search logs in a variety of tasks

Page 18: Twente ir-course 20-10-2010

Search logs

Search logs in Vitalas:
  Searches performed by users through Belga's web interface from 22/06/2007 to 12/10/2007 (101 days)
  402,388 tuples <date, time, userid, action>:
    "SEARCH_PICTURES" (138,275) | "SHOW_PHOTO" (192,168) | "DOWNLOAD_PICTURE" (38,070) | "BROWSE_GALLERIES" (8,878) | "SHOW_GALLERY" (24,352) | "CONNECT_IMAGE_FORUM" (645)
  17,861 unique ('lightly normalised') queries
  96,420 clicked images

For comparison, web image search (Craswell and Szummer, 2007): the pruned click graph has 1.1 million edges, 505,000 URLs and 202,000 queries
(a sketch of turning such log tuples into a query-image click graph follows below)
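A minimal sketch of how such log tuples could be turned into a query-image click graph. The tab-separated five-field schema (date, time, userid, action, argument) and the heuristic of attributing a SHOW_PHOTO to the user's most recent SEARCH_PICTURES query are assumptions, not the actual Belga log format or session logic.

    from collections import defaultdict

    def build_click_graph(log_lines):
        # Assumed line format: date \t time \t userid \t action \t argument,
        # where the argument is the query string for SEARCH_PICTURES
        # and the image id for SHOW_PHOTO.
        clicks = defaultdict(lambda: defaultdict(int))  # query -> image -> click count
        last_query = {}                                 # userid -> most recent query
        for line in log_lines:
            date, time, userid, action, arg = line.rstrip("\n").split("\t")
            if action == "SEARCH_PICTURES":
                last_query[userid] = arg.strip().lower()   # 'light normalisation'
            elif action == "SHOW_PHOTO" and userid in last_query:
                clicks[last_query[userid]][arg] += 1       # attribute click to last query
        return clicks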

Page 19: Twente ir-course 20-10-2010

Search Logs Analysis

Page 20: Twente ir-course 20-10-2010

[Slide shows images only, labelled 'Clijsters' and 'Henin']

Page 21: Twente ir-course 20-10-2010

What could we learn?

Goals:
  What do users search for?

User context:
  How do professionals search image archives, compared to the average user?

Query modifications:
  How do users reformulate their queries within a search session?

Page 22: Twente ir-course 20-10-2010
Page 23: Twente ir-course 20-10-2010
Page 24: Twente ir-course 20-10-2010
Page 25: Twente ir-course 20-10-2010
Page 26: Twente ir-course 20-10-2010
Page 27: Twente ir-course 20-10-2010

Professionals search longer

Page 28: Twente ir-course 20-10-2010

Semantic analysis

Most studies investigate search logs at the syntactic (term-based) level

Our idea: map the term occurrences into linked open data (LOD)

Page 29: Twente ir-course 20-10-2010

Semantic Log Analysis

Method:
  Map queries into the linked data cloud, find 'abstract' patterns, and re-use those for query suggestion, e.g.:
    A and B play-soccer-in-team X
    A is-spouse-of B
  (a sketch of such pattern generalisation follows below)

Advantages:
  Reduces the sparseness of the raw search log data
  Provides higher-level insights into the data
  Right mix of statistics and semantics?
  Overcomes the query drift problem of thesaurus-based query expansion
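A minimal sketch of how a query reformulation could be generalised into such an abstract pattern. The entity-type and relation lookup tables are assumed inputs (in practice they would come from entity linking against a linked-data source such as DBpedia); the function name and signature are illustrative, not part of the Vitalas system.

    def modification_pattern(query_a, query_b, entity_types, relations):
        # entity_types: entity string -> semantic type, e.g. {"clijsters": "TennisPlayer"}
        # relations: (entity, entity) -> relation label from the linked data cloud,
        #            e.g. {("a", "b"): "is-spouse-of"}
        # Both lookup tables are assumed, illustrative inputs.
        a, b = query_a.strip().lower(), query_b.strip().lower()
        rel = relations.get((a, b)) or relations.get((b, a)) or "unknown-relation"
        return (entity_types.get(a, "UnknownType"), rel, entity_types.get(b, "UnknownType"))

Given suitable tables, a reformulation from one person to their spouse would generalise to a pattern like ("Person", "is-spouse-of", "Person"), which can then be re-used for query suggestion.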

Page 30: Twente ir-course 20-10-2010

Assign Query Types

Page 31: Twente ir-course 20-10-2010
Page 32: Twente ir-course 20-10-2010
Page 33: Twente ir-course 20-10-2010
Page 34: Twente ir-course 20-10-2010
Page 35: Twente ir-course 20-10-2010

Detect High-level Relations…

Page 36: Twente ir-course 20-10-2010

… transformed into modification patterns

Page 37: Twente ir-course 20-10-2010
Page 38: Twente ir-course 20-10-2010
Page 39: Twente ir-course 20-10-2010

Implications

Guide the selection of ontologies/lexicons/etc. most suited to your user population

Distinguish between successful and unsuccessful queries when making search suggestions

Improve session boundary detection

Page 40: Twente ir-course 20-10-2010

Finally... a 'wild idea'

Image data is seldom annotated adequately, i.e., adequately to support search

Automatic image annotation or 'concept detection':
  Supervised machine learning
  Requires labelled samples as training data; a laborious and expensive task

Page 41: Twente ir-course 20-10-2010

FP6 Vitalas IP

Phase 1 - collect training data:
  Select ~500 concepts with the collection owner
  Manually select ~1000 positive and negative examples for each concept

Page 42: Twente ir-course 20-10-2010

How to obtain training data?

Can we use click-through data instead of manually labelled samples?

Advantages:
  Large quantities, no user intervention, collective assessments

Disadvantages:
  Noisy & sparse
  Queries are not based on strict visual criteria

Page 43: Twente ir-course 20-10-2010

Automatic Image Annotation

Research questions:
  How to annotate images with concepts using click-through data?
  How reliable are click-through-data-based annotations?
  What is the effectiveness of these annotations as training samples for concept classifiers?

Page 44: Twente ir-course 20-10-2010

Manual annotations

          annotations per concept   positive samples   negative samples
MEAN      1020.02                   89.44              930.57
MEDIAN    998                       30                 970
STDEV     164.64                    132.84             186.21

Page 45: Twente ir-course 20-10-2010

Manual vs. search logs based

Page 46: Twente ir-course 20-10-2010

1. How to annotate? (1/4)

Use the queries for which images were clicked

Challenges:
  Inherent noise: a gap between queries/captions and concepts
    Queries describe the content + context of the images to be retrieved
    Clicked images were retrieved using their captions: content + context
    Concept-based annotations are based on visual, content-only criteria
  Sparsity: the annotations only cover the part of the collection previously accessed
  Mismatch between terms in concept descriptions and queries

Page 47: Twente ir-course 20-10-2010

How to annotate? (2/4)

Basic 'global' method (sketched below):
  Given the keywords of a query Q
  Find the query Q' in the search logs that is most textually similar to Q
  Find the images I clicked for Q'
  Find the queries Q'' for which these images have been clicked
  Rank the queries Q'' based on the number of images clicked for them
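A minimal sketch of the 'global' method above, assuming the search log is available as a map from query to the set of images clicked for it; plain word overlap stands in for "most textually similar", which the slide leaves open.

    from collections import Counter

    def global_annotation(concept, click_graph):
        # click_graph: query -> set of image ids clicked for that query
        # (an assumed representation of the search log).
        q_terms = set(concept.lower().split())
        # 1. Logged query Q' most textually similar to the concept keywords
        #    (word overlap is an assumption).
        best = max(click_graph, key=lambda q: len(q_terms & set(q.split())))
        # 2. Images I clicked for Q'.
        images = click_graph[best]
        # 3./4. Queries Q'' for which these images were also clicked,
        #       ranked by the number of shared clicked images.
        counts = Counter()
        for q, clicked in click_graph.items():
            if q != best and images & clicked:
                counts[q] = len(images & clicked)
        return counts.most_common()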

Page 48: Twente ir-course 20-10-2010

How to annotate? (3/4)

Exact: images clicked for queries exactly matching the concept name
  Example: 'traffic' -> 'traffic jam', 'E40', 'vacances', 'transport'

Search log-based image representations:
  Images are represented by all queries for which they have been clicked
  Retrieval based on language models (smoothing, stemming)
  Example: 'traffic' -> 'infrabel', 'deutsche bahn', 'traffic lights'

Random walks over the click graph (sketched below):
  Example: 'hurricane' -> 'dean', 'mexico', 'dean haiti', 'dean mexico'
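A simplified sketch of a random walk over the bipartite click graph in the spirit of Craswell and Szummer (2007): probability mass starts at one query, alternates between query and image nodes a few times, and related queries are ranked by the mass they receive. Self-transition probabilities and the backward/forward variants of the original model are omitted, and the edge-list representation is an assumption.

    from collections import defaultdict

    def random_walk(start_query, edges, steps=3):
        # edges: list of (query, image_id, click_count) tuples.
        # Assumes start_query occurs at least once in the edges.
        q2i, i2q = defaultdict(dict), defaultdict(dict)
        for q, img, c in edges:
            q2i[q][img] = q2i[q].get(img, 0) + c
            i2q[img][q] = i2q[img].get(q, 0) + c

        def normalise(weights):
            total = sum(weights.values())
            return {k: v / total for k, v in weights.items()}

        prob = {start_query: 1.0}
        for _ in range(steps):
            # query -> image step
            img_prob = defaultdict(float)
            for q, p in prob.items():
                for img, w in normalise(q2i[q]).items():
                    img_prob[img] += p * w
            # image -> query step
            prob = defaultdict(float)
            for img, p in img_prob.items():
                for q, w in normalise(i2q[img]).items():
                    prob[q] += p * w
        return sorted(prob.items(), key=lambda item: item[1], reverse=True)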

Page 49: Twente ir-course 20-10-2010

How to annotate? (4/4)

Local method (sketched below):
  Given the keywords of a query Q and its top-ranked images
  Find the queries Q'' for which these images have been clicked
  Rank the queries Q'' based on the number of images clicked for them
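The 'local' method differs from the global one only in its starting point: instead of matching a logged query, it starts from the top-ranked images of the current result list. A minimal sketch under the same assumed click-graph representation (query -> set of clicked image ids):

    from collections import Counter

    def local_annotation(top_images, click_graph):
        # top_images: image ids at the top of the current ranking.
        # click_graph: query -> set of clicked image ids (assumed representation).
        top = set(top_images)
        counts = Counter()
        for q, clicked in click_graph.items():
            shared = len(top & clicked)
            if shared:
                counts[q] = shared          # rank Q'' by shared clicked images
        return counts.most_common()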

Page 50: Twente ir-course 20-10-2010

2. Reliability

Compare the agreement of click-through-based annotations to manual ones, examining the 111 VITALAS concepts with at least 10 images (for at least one of the methods) in the overlap of clicked and manually annotated images (one possible agreement computation is sketched below)

Levels of agreement vary greatly across concepts
20% of concepts per method reach an agreement of at least 0.8

What type of concepts can be reliably annotated using click-through data?
  The defined categories (activities, animals, events, graphics, people, image_theme, objects, setting/scene/site) are not informative

Possible future research on types of concepts:
  Named entities?
  Specific vs. broad?
  Low-level features?
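A small sketch of one possible agreement computation between click-based and manual labels for a single concept. Plain proportional agreement over the overlapping images is an assumption; the slides do not state which agreement measure the study used.

    def annotation_agreement(click_labels, manual_labels):
        # click_labels / manual_labels: image id -> True/False for one concept.
        overlap = set(click_labels) & set(manual_labels)
        if len(overlap) < 10:      # slide: only concepts with at least 10 overlapping images
            return None
        same = sum(click_labels[i] == manual_labels[i] for i in overlap)
        return same / len(overlap)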

Page 51: Twente ir-course 20-10-2010

3. Effectiveness (1/3)

Train the classifiers for each of 25 concepts
  Positive samples: images selected by each method
  Negative samples: selected by randomly sampling the 100k set, excluding those already selected as positive samples

Low-level visual features FW:
  Texture description: integrated Weibull distribution extracted from overlapping image regions

Low-level textual features FT:
  A vocabulary of the most frequent terms in captions is built for each concept
  Each image caption is compared against each concept vocabulary
  A frequency histogram is built for each concept

SVM classifiers with RBF kernel (and cross validation); a training sketch follows below
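A minimal training sketch for one concept detector, using scikit-learn's RBF-kernel SVM with cross-validated grid search. It assumes the feature extraction (Weibull texture features FW or caption-term histograms FT) has already produced numeric arrays; the parameter grid is illustrative, the slides only say "SVM classifiers with RBF kernel (and cross validation)".

    import numpy as np
    from sklearn.svm import SVC
    from sklearn.model_selection import GridSearchCV

    def train_concept_classifier(pos_features, neg_features):
        # pos_features / neg_features: 2-D arrays, one feature vector per image.
        X = np.vstack([pos_features, neg_features])
        y = np.array([1] * len(pos_features) + [0] * len(neg_features))
        grid = GridSearchCV(
            SVC(kernel="rbf", probability=True),
            param_grid={"C": [0.1, 1, 10, 100],        # assumed grid values
                        "gamma": ["scale", 0.01, 0.001]},
            cv=5,                                       # 5-fold cross validation
        )
        grid.fit(X, y)
        return grid.best_estimator_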

Page 52: Twente ir-course 20-10-2010

3. Effectiveness (2/3)

Experiment 1 (visual features):
  Training: search-log based annotations
  Test set for each concept: manual annotations (~1000 images)
  Feasibility study: in most cases, AP is considerably higher than the prior

Page 53: Twente ir-course 20-10-2010

3. Effectiveness (3/3)

Experiments 2, 3, 4 (visual or textual features):
  Experiment 2 training: search-log based annotations
  Experiment 3 training: manual + search-log based annotations
  Experiment 4 training: manual annotations
  Common test set: 56,605 images (subset of the 100,000 collection)
  The contribution of search-log based annotations to training is positive, particularly in combination with manual annotations

Page 54: Twente ir-course 20-10-2010

Example: Soccer

Manually annotated positive samples vs. search-log based annotated positive samples

Test set results

View results at: http://olympus.ee.auth.gr/~diou/searchlogs/

Page 55: Twente ir-course 20-10-2010

Paris

Page 56: Twente ir-course 20-10-2010

or... Paris

Page 57: Twente ir-course 20-10-2010

Diversity from User Logs

Present different query variants' clicked images in a clustered view

Merge different query variants' clicked images in a round-robin fashion into one list (CLEF); sketched below
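A minimal sketch of the round-robin merge: take the clicked-image lists of the query variants and interleave them, keeping each image only once. The deduplication and the result-list cut-off are assumptions.

    def round_robin_merge(variant_rankings, limit=100):
        # variant_rankings: list of lists, each the clicked images for one query
        # variant, ordered by click count.
        merged, seen = [], set()
        rank = 0
        while len(merged) < limit and any(rank < len(r) for r in variant_rankings):
            for ranking in variant_rankings:
                if rank < len(ranking) and ranking[rank] not in seen:
                    merged.append(ranking[rank])
                    seen.add(ranking[rank])
            rank += 1
        return merged[:limit]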

Page 58: Twente ir-course 20-10-2010

ImageCLEF 'Olympics'

Olympic games

Olympic torch

Olympic village

Olympic rings

Olympic flag

Olympic Belgium

Olympic stadium

Other

Page 59: Twente ir-course 20-10-2010
Page 60: Twente ir-course 20-10-2010
Page 61: Twente ir-course 20-10-2010
Page 62: Twente ir-course 20-10-2010

ImageCLEF 'Tennis'

Page 63: Twente ir-course 20-10-2010
Page 64: Twente ir-course 20-10-2010
Page 65: Twente ir-course 20-10-2010
Page 66: Twente ir-course 20-10-2010

ImageCLEF Findings

Many queries (>20%) without clicked images

The corpus and the available logs originated from different time frames

Page 67: Twente ir-course 20-10-2010

ImageCLEF Findings

The best results combine text search in the metadata with image click data, for the topic title and each of the cluster titles

Using query variants derived from the logs increases recall by 50-100%; however, it also introduces topic drift and reduces early precision