TREC 2009 Review
Lanbo Zhang
7 tracks
• Web track
• Relevance Feedback track (RF)
• Entity track
• Blog track
• Legal track
• Million Query track (MQ)
• Chemical IR track
67 Participating Groups
The new dataset: ClueWeb09
• 1 billion web pages in 10 languages; half are in English
• Crawled by CMU in Jan. and Feb. 2009
• 5 TB (compressed), 25 TB (uncompressed)
• Subset B
– 50 million English pages
– Includes all Wikipedia pages
• The original dataset and the Indri index of subset B are available on our lab machines
Web Track
• Two tasks
– Adhoc Retrieval Task
– Diversity Task
• Diversity task: return a ranked list of pages that together provide complete coverage for a query, while avoiding excessive redundancy in the returned list.
Web Track
• Topic type 1: ambiguous
• Topic type 2: faceted
• Results of adhoc task
• Results of diversity task
Waterloo at Web track
• Two runs
– Top 10000 docs in the entire collection
– Top 10000 docs in the Wikipedia set
• Wikipedia docs as pseudo relevance feedback
• Machine learning methods to re-rank the top 20000 docs, and return the top 1000
• Diversity task
– A Naïve Bayes classifier designed to re-rank the top 20000 to exclude duplicates
MSRA at Web track
• Mining subtopics for a query by
– Anchor texts
– Search result clusters
– Sites of search results
• Search results diversification
– A greedy algorithm to iteratively select the next best document
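The slides do not give MSRA's exact selection criterion, so the following is a generic sketch of such a greedy diversification step: at each iteration, pick the document with the best combination of relevance and coverage of not-yet-covered subtopics. The `alpha` weight and the input format are assumptions for illustration, not MSRA's actual settings.

```python
def greedy_diversify(docs, k, alpha=0.5):
    """Greedily build a diverse ranking of k documents.

    docs: list of (doc_id, relevance, subtopics) tuples, where
    subtopics is a set of mined subtopic labels for the document.
    At each step, pick the doc maximizing a blend of its relevance
    and the number of subtopics it covers that are still uncovered.
    """
    selected, covered = [], set()
    remaining = list(docs)
    while remaining and len(selected) < k:
        def gain(d):
            _, rel, subs = d
            novel = len(subs - covered)
            return alpha * rel + (1 - alpha) * novel
        best = max(remaining, key=gain)
        remaining.remove(best)
        selected.append(best[0])
        covered |= best[2]
    return selected

docs = [("d1", 0.9, {"a"}), ("d2", 0.8, {"a"}), ("d3", 0.5, {"b"})]
# d3 outranks the more relevant d2 because it covers a new subtopic
print(greedy_diversify(docs, 2))
```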
Relevance Feedback Track
• Tasks
– Phase 1: find a set of 5 documents that are good for relevance feedback.
– Phase 2: develop an RF algorithm to do retrieval based on the relevance judgments of the 5 docs.
Results of RF track: Phase 1
Results of RF track: Phase 2
UCSC at RF track
• Phase 1: document selection
– Clustering top-ranked documents
– Transductive Experimental Design (TED)
• Phase 2: RF algorithm
– Combining different document representations
• Title, anchor, heading, document
– Incorporating term position information
• Phrase match, text window match
– Incorporating document similarities to labeled docs
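The combination described on this slide can be sketched as a weighted sum of per-representation scores plus a bonus for similarity to the labeled relevant documents. The weights and `beta` below are illustrative values, not UCSC's actual parameters.

```python
def ucsc_style_score(field_scores, weights, sims_to_labeled, beta=0.5):
    """Sketch: combine per-representation retrieval scores
    (title, anchor, heading, document) with a similarity bonus.

    field_scores: {"title": s1, "anchor": s2, ...}
    weights: matching per-field weights (hypothetical values)
    sims_to_labeled: similarities of this doc to each labeled relevant doc
    """
    base = sum(weights[f] * s for f, s in field_scores.items())
    bonus = max(sims_to_labeled) if sims_to_labeled else 0.0
    return base + beta * bonus

# a document scoring 2.0 on title and 1.0 on body text,
# with 0.8 similarity to its closest labeled relevant doc
s = ucsc_style_score({"title": 2.0, "document": 1.0},
                     {"title": 0.5, "document": 0.5},
                     [0.4, 0.8], beta=0.5)
print(s)
```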
UMass at RF track
• A supervised method to estimate the weights of expanded terms for RF
• Training collection: wt10g
• Term features given a query:
– Term frequency in FB docs and in the entire collection
– Co-occurrence with query terms
– Term proximity to query terms
– Document frequency
UMass at RF track
• Model: Boosting
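The feature extraction for candidate expansion terms can be sketched as follows, assuming documents are token lists; the function name and feature encodings are illustrative, and the boosting model that learns the term weights from these features is not shown.

```python
import math

def term_features(term, query_terms, fb_docs, coll_df, n_coll_docs):
    """Features for a candidate expansion term, following the slide:
    TF in feedback docs, co-occurrence with query terms,
    proximity to query terms, and (collection) document frequency.

    fb_docs: feedback documents as lists of tokens
    coll_df: collection document frequencies, term -> df
    """
    # term frequency across the feedback documents
    tf_fb = sum(doc.count(term) for doc in fb_docs)
    # number of feedback docs where the term co-occurs with a query term
    cooc = sum(1 for doc in fb_docs
               if term in doc and any(q in doc for q in query_terms))
    # proximity: minimum token distance to any query term occurrence
    dists = []
    for doc in fb_docs:
        pos_t = [i for i, w in enumerate(doc) if w == term]
        pos_q = [i for i, w in enumerate(doc) if w in query_terms]
        dists += [abs(i - j) for i in pos_t for j in pos_q]
    prox = min(dists) if dists else None
    # document frequency expressed as an IDF-style value
    idf = math.log(n_coll_docs / (1 + coll_df.get(term, 0)))
    return {"tf_fb": tf_fb, "cooc": cooc, "min_dist": prox, "idf": idf}
```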
Entity Track
• Task
– Given an input entity, find the related entities
– Return 100 related entities and their homepages
Results of Entity track
Purdue at Entity track
• Entity Extraction
– Hierarchical Relevance Model
– Three levels of relevance: document, passage, entity
Purdue at Entity track
• Homepage Finding for Entities
– Logistic Regression model
Blog Track
• Tasks
– Faceted Blog Distillation
– Top Stories Identification
• Collection: Blogs08
– Crawled between 01/14/2008 and 02/10/2009
– 1.3 million unique blogs
Blog Track
• Task 1: Faceted Blog Distillation
– Given a topic and a facet restriction, find the relevant blogs
– Facets
• Opinionated vs. Factual
• Personal vs. Official
• In-depth vs. Shallow
– Topic example
Blog Track
• Task 2: Top Stories Identification
– Given a date, find the hottest news headlines for that day and select the relevant and diverse blog posts for those headlines
– News headlines from the New York Times were used
– Topic example
Results of Blog track
• Faceted Blog Distillation
Results of Blog track
• Top Stories Identification
– Find the hottest news headlines
– Identify the related blog posts
BUPT at Blog track
• Faceted Blog Distillation
– Scoring function:
• The title section of a topic plus automatically selected terms from the DESC and NARR sections
• Phrase match
– Facet analysis
• Opinionated vs. Factual: a sentiment analysis model
• Personal vs. Official: the maximum frequency of an organization entity occurring in a blog (Stanford Named Entity Recognizer)
• In-depth vs. Shallow: post length
– Linear combination of the above two parts
score(b, q) = (1 / N(b)) · Σ_i score(p_i, q)

where p_i are the posts of blog b and N(b) is the number of posts.
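Under the reading that a blog's topical score averages its posts' query scores, and that this is then linearly combined with the facet score as the slide describes, a minimal sketch (the combination weight `w` is an assumed value):

```python
def blog_score(post_scores):
    """score(b, q) = (1 / N(b)) * sum_i score(p_i, q):
    a blog's topical relevance as the mean of its posts' scores."""
    if not post_scores:
        return 0.0
    return sum(post_scores) / len(post_scores)

def faceted_score(topical, facet, w=0.7):
    # linear combination of the topical score and the facet score,
    # per the slide; the weight w is a hypothetical value
    return w * topical + (1 - w) * facet

print(faceted_score(blog_score([1.0, 2.0, 3.0]), facet=0.5))
```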
Univ. of Glasgow at Blog track
• Top Stories Identification
– The model:
– Incorporating the following days
– Using Wikipedia to enrich news headline terms and keep the top 10 terms for each headline
score(h, d) = Σ_{p ∈ C_top1000(h, d)} exp(score(p, h))

where C_top1000(h, d) is the set of the top 1000 posts retrieved for headline h on day d.
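A sketch of this headline-scoring model: sum exp(score(p, h)) over the day's top-ranked posts. The input format (a list of per-post retrieval scores for the headline) is an assumption for illustration.

```python
import math

def headline_score(post_scores):
    """score(h, d) = sum of exp(score(p, h)) over the top 1000 posts
    retrieved for headline h on day d."""
    top = sorted(post_scores, reverse=True)[:1000]
    return sum(math.exp(s) for s in top)
```

The exp() makes the sum dominated by the strongest matching posts, so a headline with a few highly relevant posts outscores one with many weak matches.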
Legal Track
• Tasks
– Interactive task (Enron email collection)
• Retrieval with topic authorities involved: participants can ask topic authorities to clarify topics and to judge the relevance of sample docs
– Batch task (IIT CDIP 1.0)
• Retrieval with relevance evidence (RF)
Results of Legal track
Waterloo at Legal track
• Interactive task
– Phase 1: interactive search and judging
• To find a large and diverse set of training examples
– Phase 2: interactive learning
• To find more potentially relevant documents
• Batch task
– Run three spam filters on every document:
• An on-line logistic regression filter
• A Naïve Bayes spam filter
• An on-line version of the BM25 RF method
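A minimal on-line logistic-regression filter in the spirit of the batch-task approach can be sketched as below; the feature representation (a bag of string features per document) and the learning rate are illustrative choices, not Waterloo's actual configuration.

```python
import math

class OnlineLogisticFilter:
    """On-line logistic regression: score a document, then update the
    weights with one stochastic-gradient step per labeled example."""

    def __init__(self, lr=0.1):
        self.w = {}   # feature -> weight, updated one document at a time
        self.lr = lr

    def score(self, features):
        # estimated probability that the document is relevant
        z = sum(self.w.get(f, 0.0) for f in features)
        return 1.0 / (1.0 + math.exp(-z))

    def update(self, features, label):
        # label is 1 (relevant) or 0 (not relevant)
        g = self.lr * (label - self.score(features))
        for f in features:
            self.w[f] = self.w.get(f, 0.0) + g
```

The on-line setting matters here: the filter never revisits past documents, so it can stream over a collection far larger than memory.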
Million Query Track
• Tasks
– Adhoc retrieval for 40000 queries
– Predict query types
• Query intent: Precision-oriented vs. Recall-oriented
• Query difficulty: Hard vs. Easy
• Precision-oriented
– Navigational: Find a specific URL or web page.
– Closed: Find a short, unambiguous answer to a specific question.
– Resource: Locate a web-based resource or download.
• Recall-oriented
– Open: Answer an open-ended question, or find all available information about a topic.
– Advice: Find advice or ideas regarding a general question or problem.
– List: Find a list of results that will help satisfy an open-ended goal.
Results of Million Query track
Precision vs. Recall
Hard vs. Easy
Northeastern Univ. at MQ track
• Query-specific learning to rank
– Learn different ranking functions for queries in different classes
• Using SVM to classify queries
– Training data: MQ 2008 dataset
• Features
– Document features: document length, TF, IDF, TF*IDF, normalized TF, Robertson's TF, Robertson's IDF, BM25, Language Models (Laplace, Dirichlet, JM)
– Field features: title, heading, anchor text, and URL
– Web graph features
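The routing idea can be sketched as follows. The real system classifies queries with an SVM trained on the MQ 2008 data using the features listed above; the rule-based classifier below is only a toy stand-in so the routing step is runnable.

```python
def classify_query(query):
    """Toy stand-in for the SVM query classifier: flag queries with
    navigational-looking cue words as precision-oriented."""
    nav_cues = {"homepage", "www", "login", "download", "site"}
    return "precision" if set(query.lower().split()) & nav_cues else "recall"

def query_specific_rank(query, rankers):
    # route the query to the ranking function learned for its class
    return rankers[classify_query(query)](query)
```

Usage: `query_specific_rank(q, {"precision": rank_p, "recall": rank_r})`, where `rank_p` and `rank_r` are the two class-specific ranking functions.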
Chemical IR Track
• Tasks
– Technical Survey Task
• Retrieve documents in response to each topic given by chemical patent experts
– Prior Art Search Task
• Find relevant patents with respect to a set of 1000 existing patents
Results of Chemical track
Geneva at Chemical track
• Document Representation:
– Title, Description, Abstract, Claims, Applicants, Inventors, IPC codes, Patent references
• Exploiting Citation Networks
• Query expansion using chemical annotations
• Filtering based on IPC codes
• Re-ranking based on claims