Inforadar @ UPRM. Computing Systems Research Group: Prof. Bienvenido Vélez-Rivera – Leader; José Enseñat – Graduate student; Juan Torres – Undergraduate student. University of Puerto Rico, Mayagüez Campus. PRECISE Project. Mayagüez, October 07, 2000


Page 1

Inforadar @ UPRM

Computing Systems Research Group
Prof. Bienvenido Vélez-Rivera – Leader
José Enseñat – Graduate student
Juan Torres – Undergraduate student

University of Puerto Rico
Mayagüez Campus
PRECISE Project
Mayagüez, October 07, 2000

Page 2

Problem Statement

Page 3

Query-based Web Search

large result-set

short query

BUT:
• queries are hard to write
• sequential access to the result set is inadequate

Page 4

Proposed Solution

Page 5

Proposed Solution: Inforadar’s Interactive Query Hierarchies

[Figure callouts: seed query; selected query; result set for selected query; dynamic categories are queries]
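The idea that each dynamic category is itself a query can be sketched in a few lines. This is only an illustration under stated assumptions: the toy inverted index `INDEX`, the helper names `result_set` and `refine`, and the purely conjunctive query model are all hypothetical and are not taken from the Inforadar implementation.

```python
from typing import Dict, FrozenSet, Set

# Hypothetical toy inverted index: term -> ids of the documents containing it.
INDEX: Dict[str, Set[int]] = {
    "java": {1, 2, 3, 4, 5},
    "coffee": {1, 2},
    "island": {3},
    "programming": {4, 5},
}


def result_set(query: FrozenSet[str]) -> Set[int]:
    """D(q): documents that match every term of a conjunctive query."""
    postings = [INDEX.get(term, set()) for term in query]
    return set.intersection(*postings) if postings else set()


def refine(query: FrozenSet[str], term: str) -> FrozenSet[str]:
    """Child category q ^ t: the parent query extended with one more term."""
    return query | {term}


# The seed query sits at the root; every category below it is a refined query.
seed = frozenset({"java"})
children = {t: refine(seed, t) for t in ("coffee", "island", "programming")}
child_results = {t: result_set(q) for t, q in children.items()}
```

Because a child query only adds a term, its result set is always a subset of its parent's, which is what lets a user descend the hierarchy instead of scanning the full result list sequentially.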

Page 6

Inforadar’s Interactive Query Hierarchies

[Figure callouts: colors indicate node status; level 2 categories; icons mark documents read or in-basket]

Page 7

Theoretical Formulation

Page 8

Coverage-based Category Evaluation Metric
Goal: Avoid Redundancy and Information Loss

[Figure: panels (a) to (c); labels: seed, q, q1, q2]
(a) low information loss, high redundancy
(b) high information loss, low redundancy
(c) better

Ideal: Select categories that best approximate a partition
But: This is an NP-complete problem
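The two quantities named in the goal can be made concrete with a small sketch. The definitions used here are assumptions for illustration, not the metric from the slides: information loss is taken as the fraction of the seed result set that no category covers, and redundancy as the fraction of covered documents that appear in more than one category. The function name `loss_and_redundancy` and the document ids are hypothetical.

```python
from collections import Counter
from typing import Iterable, Set, Tuple


def loss_and_redundancy(
    seed_docs: Set[int], categories: Iterable[Set[int]]
) -> Tuple[float, float]:
    """Return (information loss, redundancy) under the assumed definitions."""
    if not seed_docs:
        return 0.0, 0.0
    # How many categories contain each document of the seed result set.
    counts = Counter(doc for cat in categories for doc in cat & seed_docs)
    covered = len(counts)
    loss = 1.0 - covered / len(seed_docs)
    repeated = sum(1 for n in counts.values() if n > 1)
    redundancy = repeated / covered if covered else 0.0
    return loss, redundancy


seed = {1, 2, 3, 4, 5, 6}
case_a = loss_and_redundancy(seed, [{1, 2, 3, 4}, {3, 4, 5, 6}])  # overlapping
case_b = loss_and_redundancy(seed, [{1, 2}, {3}])                 # sparse
case_c = loss_and_redundancy(seed, [{1, 2, 3}, {4, 5, 6}])        # a partition
```

Under these definitions a true partition of the seed result set scores zero on both measures, which matches the stated ideal.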

Page 9

CTS: A greedy approximation algorithm for category selection

Goal: Pick best term among {t1, t2, t3}

Approach: CTS picks term ti maximizing:

[Formula garbled in source: two expressions, each combining D(q ^ ti) and C]

C = set of documents covered by previously selected terms

[Figure: sets D(q), C, D(q ^ t1), D(q ^ t2), D(q ^ t3); callouts: winning category!; low coverage; high redundancy]
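A greedy selection loop in this spirit can be sketched as follows. The scoring function is an assumption chosen for illustration (documents newly covered minus documents already in C) and should not be read as the CTS formula itself. The names `score`, `pick_best` and `greedy_select`, and the example document sets, are hypothetical.

```python
from typing import Dict, List, Set


def score(docs: Set[int], covered: Set[int]) -> int:
    """Assumed score: reward new coverage, penalize overlap with C."""
    return len(docs - covered) - len(docs & covered)


def pick_best(candidates: Dict[str, Set[int]], covered: Set[int]) -> str:
    """Return the candidate term with the highest score (ties: alphabetical)."""
    return max(sorted(candidates), key=lambda t: score(candidates[t], covered))


def greedy_select(candidates: Dict[str, Set[int]], k: int) -> List[str]:
    """Pick up to k terms, one at a time, updating C after each pick."""
    remaining = dict(candidates)
    covered: Set[int] = set()
    chosen: List[str] = []
    while remaining and len(chosen) < k:
        best = pick_best(remaining, covered)
        chosen.append(best)
        covered |= remaining.pop(best)
    return chosen


# One step with C already covering documents 1 to 4.
C = {1, 2, 3, 4}
step = {
    "t1": {5, 6, 7, 8},  # all new documents
    "t2": {9},           # low coverage
    "t3": {1, 2, 3, 5},  # high redundancy with C
}
winner = pick_best(step, C)
```

Each step looks only at the current C, so the loop costs one pass over the candidates per selected term, in contrast to searching for the best overall approximation of a partition.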

Page 10

Experimental Plan

• Implement Inforadar site indexing ALL website data at UPRM

• Make Inforadar the official search engine for the UPRM web site

• Conduct usability study

• Analyze real user feedback

• Incorporate feedback into an improved design

Page 11

References

• Query Lookahead for Query-Based Document Categorization
– Ph.D. Thesis
– Massachusetts Institute of Technology
– September 1999

• Fast and Effective Query Refinement
– Bienvenido Vélez, Ron Weiss, Mark Sheldon and David K. Gifford
– ACM Conference on Research and Development in Information Retrieval (SIGIR 97)

• HyPursuit: A network search engine exploiting content-link similarity
– R. Weiss, B. Vélez, M. Sheldon, C. Namprempre, P. Szilagyi and D. K. Gifford
– ACM Conference on Hypertext (HyperText 96)