
Faceted Metadata in Image Search & Browsing

Using Words to Browse a Thousand Images

Ka-Ping Yee, Kirsten Swearingen, Kevin Li, Marti Hearst

Group for User Interface Research
UC Berkeley

CHI 2003

Research funded by:
NSF CAREER Grant IIS-9984741
IBM Faculty Fellowship


M. Hearst Faceted Metadata in Search

Outline
• How do people search and browse for images?
• Current approaches:
  – Keywords
  – Spatial similarity
• Our approach:
  – Hierarchical Faceted Metadata
  – Very careful UI design and testing
• Usability Study
• Conclusions


How do people want to search and browse images?

Ethnographic studies of people who use images intensely:
– Finding specific objects is easy
  • Find images of the Empire State Building
– Browsing is difficult
– People want to use rich descriptions.


Ethnographic Study
• Markkula & Sormunen ’00
  – Journalists and newspaper editors
  – Choosing photos from a digital archive
    • Searching for specific objects is trivial
    • Stressed a need for browsing
    • Photos need to deal with themes, places, types of objects, views
  – Had access to a powerful interface, but it had 40 entry forms and was generally hard to use; no one used it.


Markkula & Sormunen ’00


Query Study
• Armitage & Enser ’97
  – Analyzed 1,749 queries submitted to 7 image and film archives
  – Classified queries into a 3x4 facet matrix
    • Rio Carnivals: Geo Location x Kind of Event
  – Concluded that users want to search images according to combinations of topical categories.


Ethnographic Study
• Ame Elliot ’02
  – Architects
• Common activities:
  – Use images for inspiration
    • Browsing during early stages of design
  – Collage making, sketching, pinning up on walls
    • This is different from illustrating PowerPoint
• Maintain sketchbooks & shoeboxes of images
  – Young professionals have ~500, older ~5k
• No formal organization scheme
  – None of 10 architects interviewed about their image collections used indexes
• Do not like to use computers to find images


Current Approaches to Image Search
• Keyword based
  – WebSeek (Smith and Jain ’97)
  – Commercial web image search systems
  – Commercial image vendors (Corbis, Getty)
  – Museum web sites


Current Approaches to Image Search
• Using Visual “Content”
  – Extract color, texture, shape
    • QBIC (Flickner et al. ’95)
    • Blobworld (Carson et al. ’99)
    • Piction: images + text (Srihari et al. ’91 ’99)
  – Two uses:
    • Show a clustered similarity space
    • Show those images similar to a selected one
  – Usability studies:
    • Rodden et al.: a series of studies
    • Clusters don’t work; showing textual labels is promising.


Rodden et al., CHI 2001



How Best to Support Browsing?

• To support serendipity, want to view images that are related along multiple dimensions.

• But clusters are not comprehensible.

• Instead, allow users to “steer” through the multi-dimensional category space in a flexible manner.


Some Challenges

• Users don’t like new search interfaces.

• How to show lots more information without overwhelming or confusing?


Our Approach
• Integrate the search seamlessly into the information architecture.
  – Use proper HCI methodologies.
• Use faceted metadata:
  – More flexible than canned hyperlinks
  – Less complex than full search
  – Help users see where to go next and return to what happened previously


Metadata: data about data
Facets: orthogonal categories

Time/Date, Topic, GeoRegion


Hierarchical Faceted Metadata
Example: Biological Subject Headings

1. Anatomy [A]
2. Organisms [B]
3. Diseases [C]
4. Chemicals and Drugs [D]
5. Analytical, Diagnostic and Therapeutic Techniques and Equipment [E]
6. Psychiatry and Psychology [F]
7. Biological Sciences [G]
8. Physical Sciences [H]
9. Anthropology, Education, Sociology and Social Phenomena [I]
10. Technology and Food and Beverages [J]
11. Humanities [K]
12. Information Science [L]
13. Persons [M]
14. Health Care [N]
15. Geographic Locations [Z]


Hierarchical Faceted Metadata

1. Anatomy [A]
   – Body Regions [A01]
     • Abdomen [A01.047]
     • Back [A01.176]
     • Breast [A01.236]
     • Extremities [A01.378]
     • Head [A01.456]
     • Neck [A01.598]
     • ….
   – Musculoskeletal System [A02]
   – Digestive System [A03]
   – Respiratory System [A04]
   – Urogenital System [A05]
   – ……
2. [B]
3. [C]
4. [D]
5. [E]
6. [F]
7. [G]
8. Physical Sciences [H]
   – Electronics
     • Amplifiers
     • Electronics, Medical
     • Transducers
   – Astronomy
   – Nature
   – Time
   – Weights and Measures
     • Calibration
     • Metric System
     • Reference Standard
   – ….
9. [I]
10. [J]
11. [K]
12. [L]
13. [M]
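
The tree numbers on this slide encode their own ancestry, so "everything under Body Regions" reduces to a prefix test on the code. The following is a minimal Python sketch under that reading. The dictionary layout and the function names `descendants` and `ancestors` are assumptions made for illustration, not part of the system described in the talk.

```python
# Tree numbers and labels copied from the slide; the dictionary layout is an assumption.
HEADINGS = {
    "A01": "Body Regions",
    "A01.047": "Abdomen",
    "A01.176": "Back",
    "A01.236": "Breast",
    "A01.378": "Extremities",
    "A01.456": "Head",
    "A01.598": "Neck",
    "A02": "Musculoskeletal System",
    "A03": "Digestive System",
    "A04": "Respiratory System",
    "A05": "Urogenital System",
}


def descendants(code):
    """Return the labels filed strictly below `code`, in tree-number order."""
    prefix = code + "."
    return [HEADINGS[c] for c in sorted(HEADINGS) if c.startswith(prefix)]


def ancestors(code):
    """Return the labels on the path from the top of the tree to the parent of `code`."""
    parts = code.split(".")
    paths = [".".join(parts[:i]) for i in range(1, len(parts))]
    return [HEADINGS[p] for p in paths if p in HEADINGS]


print(descendants("A01"))
print(ancestors("A01.456"))
```

Storing the path in the identifier keeps "show this category and everything beneath it" a simple string comparison, which is one plausible reason hierarchical schemes use codes like these.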


The Interface Design
• Chess metaphor
  – Opening
  – Middle game
  – End game


The Interface Design
• Tightly Integrated Search
• Supports Expand as well as Refine
• Dynamically Generated Pages
  – Paths can be taken in any order
• Consistent Color Coding
• Consistent Backup and Bookmarking
• Standard HTML
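
The refine and expand behaviour listed on this slide can be sketched in a few lines of Python. This is an illustration only, not the Flamenco code: the three sample records, the facet names, and the helpers `matches`, `refine`, `expand`, and `preview` are all hypothetical. A query is treated as a set of facet=value constraints. Refining adds one, expanding drops one, and a preview counts how many items each value of a facet would leave.

```python
from collections import Counter

# Hypothetical items; each carries one value per facet (an assumption for brevity).
ITEMS = [
    {"media": "woodcut", "location": "US", "date": "1930s"},
    {"media": "woodcut", "location": "Japan", "date": "1850s"},
    {"media": "etching", "location": "US", "date": "1930s"},
]


def matches(query):
    """Items satisfying every facet=value constraint in `query`."""
    return [it for it in ITEMS if all(it[f] == v for f, v in query.items())]


def refine(query, facet, value):
    """Narrow the result set by adding a constraint (returns a new query)."""
    return {**query, facet: value}


def expand(query, facet):
    """Broaden the result set by dropping the constraint on `facet`."""
    return {f: v for f, v in query.items() if f != facet}


def preview(query, facet):
    """Count matching items per value of `facet` under the current query."""
    return Counter(it[facet] for it in matches(query))


# Constraints added in either order reach the same result set.
a = refine(refine({}, "media", "woodcut"), "location", "US")
b = refine(refine({}, "location", "US"), "media", "woodcut")
print(matches(a) == matches(b))
print(preview({"location": "US"}, "media"))
print(len(matches(expand(a, "media"))))
```

Because a query here is just a dictionary of constraints, any state can be regenerated from its constraints alone, which fits the slide's points about dynamically generated pages and bookmarking.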


What is Tricky About This?
• It is easy to do it poorly
  – Yahoo directory structure
• It is hard to be not overwhelming
  – Most users prefer simplicity unless complexity really makes a difference
• It is hard to “make it flow”
  – Can it feel like “browsing the shelves”?


Project History
• Identify Target Population
  – Architects, city planners
• Needs assessment.
  – Interviewed architects and conducted contextual inquiries.
• Lo-fi prototyping.
  – Showed paper prototype to 3 professional architects.
• Design / Study Round 1.
  – Simple interactive version. Users liked metadata idea.
• Design / Study Round 2:
  – Developed 4 different detailed versions; evaluated with 11 architects; results somewhat positive but many problems identified. Matrix emerged as a good idea.
• Metadata revision.
  – Compressed and simplified the metadata hierarchies


Project History
• Design / Study Round 3.
  – New version based on results of Round 2
  – Highly positive user response
• Identified new user population/collection
  – Students and scholars of art history
  – Fine arts images
• Study Round 4
  – Compare the metadata system to a strong, representative baseline


New Usability Study
• Participants & Collection
  – 32 Art History Students
  – ~35,000 images from SF Fine Arts Museum
• Study Design
  – Within-subjects
    • Each participant sees both interfaces
    • Balanced in terms of order and tasks
  – Participants assess each interface after use
  – Afterwards they compare them directly
• Data recorded in behavior logs, server logs, paper-surveys; one or two experienced testers at each trial.
• Used 9 point Likert scales.
• Session took about 1.5 hours; pay was $15/hour


The Baseline System
• Floogle
• Take the best of the existing keyword-based image search systems


Comparison of Common Image Search Systems

System      Collection         # Results/page   Categories?   # Familiar
Google      Web                20               No            27
AltaVista   Web                15               No            8
Corbis      Photos             9-36             No            8
Getty       Photos, Art        12-90            Yes           6
MS Office   Photos, Clip art   6-100            Yes           N/A
Thinker     Fine arts images   10               Yes           4
BASELINE    Fine arts images   40               Yes           N/A


Evaluation Quandary
• How to assess the success of browsing?
  – Timing is usually not a good indicator
  – People often spend longer when browsing is going well.
    • Not the case for directed search
  – Can look for comprehensiveness and correctness (precision and recall) …
  – … But subjective measures seem to be most important here.
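
For the directed-search case, precision and recall have standard definitions: precision is the share of retrieved images that are relevant, and recall is the share of relevant images that were retrieved. A short Python sketch follows. The image IDs are made up and the function name `precision_recall` is illustrative; none of this is study data.

```python
def precision_recall(retrieved, relevant):
    """Return (precision, recall) for two sets of item IDs."""
    hits = len(retrieved & relevant)
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


# Made-up IDs, for illustration only.
retrieved = {"img1", "img2", "img3", "img4"}
relevant = {"img2", "img4", "img5"}
p, r = precision_recall(retrieved, relevant)
print(f"precision={p:.2f} recall={r:.2f}")
```

Neither number says whether a browsing session felt productive, which is the gap the slide points to when it turns to subjective measures.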


Hypotheses
• We attempted to design tasks to test the following hypotheses:
  – Participants will experience greater search satisfaction, feel greater confidence in the results, produce higher recall, and encounter fewer dead ends using FC over Baseline
  – FC will be perceived to be more useful and flexible than Baseline
  – Participants will feel more familiar with the contents of the collection after using FC
  – Participants will use FC to create multi-faceted queries


Four Types of Tasks
– Unstructured (3): Search for images of interest
– Structured Task (11-14): Gather materials for an art history essay on a given topic, e.g.
  • Find all woodcuts created in the US
  • Choose the decade with the most
  • Select one of the artists in this period and show all of their woodcuts
  • Choose a subject depicted in these works and find another artist who treated the same subject in a different way.
– Structured Task (10): compare related images
  • Find images by artists from 2 different countries that depict conflict between groups.
– Unstructured (5): search for images of interest


Other Points
• Participants were NOT walked through the interfaces.
• The wording of Task 2 reflected the metadata; not the case for Task 3
• Within tasks, queries were not different in difficulty (t’s<1.7, p >0.05 according to post-task questions)
• Flamenco is an order of magnitude slower than Floogle on average.
  – In task 2 users were allowed 3 more minutes in FC than in Baseline.
  – Time spent in tasks 2 and 3 was significantly longer in FC (about 2 min more).


Results
• Participants felt significantly more confident they had found all relevant images using FC (Task 2: t(62)=2.18, p<.05; Task 3: t(62)=2.03, p<.05)
• Participants felt significantly more satisfied with the results (Task 2: t(62)=3.78, p<.001; Task 3: t(62)=2.03, p<.05)
• Recall scores:
  – Task 2a: In Baseline 57% of participants found all relevant results, in FC 81% found all.
  – Task 2b: In Baseline 21% found all relevant, in FC 77% found all.


Post-Interface Assessments

All significant at p<.05 except simple and overwhelming


Perceived Uses of Interfaces: What is interface useful for?

[Bar chart comparing Baseline and FC on a 0.00 to 9.00 scale across four statements: “Useful for my coursework”, “Useful for exploring an unfamiliar collection”, “Useful for finding a particular image”, “Useful for seeing relationships b/w images”. Bar labels as extracted: 6.44, 5.47, 5.91, 4.91, 7.97, 7.91, 6.64, 6.16. The legend also lists SHASTA and DENALI.]


Post-Test Comparison

                                              Baseline   FC
Which Interface Preferable For:
  Find images of roses                           15      16
  Find all works from a given period              2      30
  Find pictures by 2 artists in same media        1      29
Overall Assessment:
  More useful for your tasks                      4      28
  Easiest to use                                  8      23
  Most flexible                                   6      24
  More likely to result in dead ends             28       3
  Helped you learn more                           1      31
  Overall preference                              2      29


Facet Usage
• Facets driven largely by task content
  – Multiple facets 45% of time in structured tasks
• For unstructured tasks,
  – Artists (17%)
  – Date (15%)
  – Location (15%)
  – Others ranged from 5-12%
  – Multiple facets 19% of time
• From end game, expansion from
  – Artists (39%)
  – Media (29%)
  – Shapes (19%)


Qualitative Observations
• Baseline:
  – Simplicity, similarity to Google a plus
  – Also noted the usefulness of the category links
• FC:
  – Starting page “well-organized”, gave “ideas for what to search for”
  – Query previews were commented on explicitly by 9 participants
  – Commented on matrix prompting where to go next
    • 3 were confused about what the matrix shows
  – Generally liked the grouping and organizing
  – End game links seemed useful; 9 explicitly remarked positively on the guidance provided there.
  – Often get requests to use the system in future


Study Results Summary
• Strongly positive results for the faceted metadata interface.
• Moderate use of multiple facets.
• Strong preference over the current state of the art.
  – Chair of Architecture Dept: “It felt like I was browsing the shelves!”
  – This kind of enthusiasm is not seen in similarity-based image search interfaces.
• Hypotheses are supported.


Implementation
• All open source code
  – MySQL database
  – Python web server (Webkit)
  – Python code
  – Lucene search engine (Java)
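
To make the database layer concrete, here is a sketch of how item-to-facet assignments might be stored and counted. It uses Python's built-in `sqlite3` module as a stand-in for MySQL so that it runs without a server. The table name, column names, sample rows, and the helper `facet_counts` are invented for illustration and are not taken from the Flamenco code.

```python
import sqlite3

# In-memory SQLite database standing in for MySQL (an assumption for portability).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE item_facet (item_id INTEGER, facet TEXT, value TEXT);
    INSERT INTO item_facet VALUES
        (1, 'media', 'woodcut'), (1, 'location', 'US'),
        (2, 'media', 'woodcut'), (2, 'location', 'Japan'),
        (3, 'media', 'etching'), (3, 'location', 'US');
""")


def facet_counts(conn, facet, where_facet, where_value):
    """Count items per value of `facet` among items tagged where_facet=where_value."""
    rows = conn.execute(
        """
        SELECT f.value, COUNT(DISTINCT f.item_id)
        FROM item_facet AS f
        JOIN item_facet AS c ON c.item_id = f.item_id
        WHERE f.facet = ? AND c.facet = ? AND c.value = ?
        GROUP BY f.value
        ORDER BY f.value
        """,
        (facet, where_facet, where_value),
    )
    return rows.fetchall()


print(facet_counts(conn, "media", "location", "US"))
```

A single long table of (item, facet, value) rows is only one possible layout. It is chosen here because counting per value under a constraint becomes one self-join.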


Metadata Availability
• Many collections already have rich metadata associated with them.
• Automated methods are improving.
• This tool may be helpful for resolving metadata creation wars.


Summary
• Usability studies done on 3 collections:
  – Recipes: 13,000 items
  – Architecture Images: 40,000 items
  – Fine Arts Images: 35,000 items
• Conclusions:
  – Users like and are successful with the dynamic faceted hierarchical metadata, especially for browsing tasks
  – Very positive results, in contrast with studies on earlier iterations
  – Note: it seems you have to care about the contents of the collection to like the interface


Other Domains
• Applying this to
  – Text
    • Tobacco Documents Archives
    • Medline biomedical texts
  – Products/Catalogs
    • Don’t have a collection; would like one


Future Work

• What about information visualization?

• How to integrate with relevance feedback (more like this)?

• How to incorporate user preferences and past behavior?

• How to combine facets to reflect tasks?


Thanks to:

Andrea Sahli
Rashmi Sinha

NSF CAREER Grant IIS-9984741
IBM Faculty Fellowship

Try the Demo: flamenco.berkeley.edu