Guidelines and Principles for Developing Search and Browse Vocabularies
May 31, 2003
Rice University
Houston, TX
Amy J. Warner, PhD
[email protected]


Epicurious.com


Navigation/Taxonomy

Vehicle Brands
– Cars: MR2 Spider, Celica, Matrix, Avalon, Camry Solara, Camry, Prius, Corolla, ECHO
– SUVs/Vans: Land Cruiser, Sequoia, 4Runner, Sienna, Highlander, RAV4
– Trucks: Tundra, Tacoma

Vehicle Parts
– Vehicle Accessories: Carriers, Bicycle Carriers, Ski Carriers, Roof Racks, Splash Guards, Security Systems
– Tires
– Engines & Transmissions

Celica Brochure

Camry Brochure


Synonym Rings

Cholesterol
Blood Cholesterol
Serum Cholesterol
Good Cholesterol
Bad Cholesterol
LDL
. . .
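
A synonym ring treats all of its members as equivalent for retrieval, with no single preferred term. The following is a minimal Python sketch of that idea, assuming a hypothetical in-memory ring and a hypothetical `expand_query` helper; neither name comes from the slides, and only the terms listed on this slide are included.

```python
# Hypothetical synonym ring built from the terms listed on this slide.
CHOLESTEROL_RING = frozenset({
    "cholesterol",
    "blood cholesterol",
    "serum cholesterol",
    "good cholesterol",
    "bad cholesterol",
    "ldl",
})

RINGS = [CHOLESTEROL_RING]


def expand_query(term, rings=RINGS):
    """Return every term equivalent to `term`, or just the term itself."""
    key = term.strip().lower()
    for ring in rings:
        if key in ring:
            return set(ring)
    return {key}
```

A search for any one member, such as `expand_query("LDL")`, would then be run against every member of the ring; a term outside all rings is searched as typed.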


Medline


MeSH & UMLS


Controlled Vocabulary Defined

A subset of natural language.
A list of preferred and (sometimes) variant terms.
With semantic relationships (hierarchical and associative) (sometimes) defined.
Used to tag document attributes (describe facets).
– Topic / Subtopic
– Audience
– Language
– Form
Or can be used to create a labeling scheme for navigation.
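
Tagging document attributes with a controlled vocabulary amounts to checking each facet value against an approved list. A minimal Python sketch, assuming hypothetical facet values and a hypothetical `validate_tags` helper (the slide names the four facets but not their values):

```python
# Hypothetical controlled values for the four facets named on the slide.
FACETS = {
    "topic": {"vehicle safety", "vehicle collisions"},
    "audience": {"consumers", "dealers"},
    "language": {"english", "spanish"},
    "form": {"brochure", "manual"},
}


def validate_tags(tags, facets=FACETS):
    """Return a list of problems; an empty list means every tag is controlled."""
    problems = []
    for facet, value in tags.items():
        if facet not in facets:
            problems.append(f"unknown facet: {facet}")
        elif value.lower() not in facets[facet]:
            problems.append(f"uncontrolled value for {facet}: {value}")
    return problems
```

For example, `validate_tags({"topic": "Vehicle safety", "form": "Brochure"})` returns an empty list, while a value outside the vocabulary is reported rather than silently accepted.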


Cornerstones of Vocabulary Control

Use unambiguous labels/search terms.
Make distinctions among labels/search terms clear.
Make choices about wording and specificity of labels/search terms based on user testing and on size of collection.
Use other semantic relationships (hierarchical, associative) if necessary to organize large lists of labels/search terms.


Continuum of Vocabulary Control
Less → More

Synonym Control
• USE/Used for relationship
  Vehicle crashes USE Vehicle collisions
  Vehicle collisions UF Vehicle crashes
• Synonym Rings
  Vehicle collisions
  Vehicle crashes
  Crashes
  Collisions

Hierarchical Relationships
• Broader/Narrower Terms
  Vehicle collisions NT Truck collisions
  Truck collisions BT Vehicle collisions
• Browse Categories
  Vehicle safety
  Truck safety
  Truck collisions
  Vehicle safety
• Site Index
• Taxonomies

Associative Relationships
• Part/Whole
• Cause/Effect
• etc.
  Vehicle parts RT Vehicles
  Vehicles RT Vehicle parts
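
Each relationship on this slide appears with its reciprocal (USE/UF, BT/NT, RT/RT). A minimal Python sketch of a structure that records both directions together, assuming a hypothetical `Thesaurus` class that is not part of any standard or product named in this talk:

```python
from collections import defaultdict

# Each relationship has a reciprocal that must be recorded on the other term.
RECIPROCAL = {"USE": "UF", "UF": "USE", "BT": "NT", "NT": "BT", "RT": "RT"}


class Thesaurus:
    def __init__(self):
        # term -> relationship code -> set of target terms
        self.links = defaultdict(lambda: defaultdict(set))

    def relate(self, source, rel, target):
        """Record `source rel target` and its reciprocal in one step."""
        self.links[source][rel].add(target)
        self.links[target][RECIPROCAL[rel]].add(source)

    def preferred(self, term):
        """Follow a USE reference to the preferred term, if there is one."""
        targets = self.links[term]["USE"]
        return next(iter(targets)) if targets else term


# The example relationships from the slide.
t = Thesaurus()
t.relate("Vehicle crashes", "USE", "Vehicle collisions")
t.relate("Vehicle collisions", "NT", "Truck collisions")
t.relate("Vehicle parts", "RT", "Vehicles")
```

Recording the reciprocal inside `relate` is one way to keep the two sides of a relationship from drifting apart; a real thesaurus tool would likely add validation, for example rejecting a term that is both a USE reference and a preferred term.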


Steps in Controlled Vocabulary Construction

Group terms by subject (facet analysis).
Link synonyms and variants.
– Synonym Rings: Vehicle collisions, Vehicle crashes, Crashes, Collisions
Identify broader and narrower terms.
– Taxonomies / Hierarchies
Identify related terms.
– Thesauri


Purposes of Standard

Base choices on ‘best practice’.
Base choices on known principles.
Foster interoperability.


Current NISO Thesaurus Standard

Guidelines for the construction, format, and management of monolingual thesauri: Z39.19-1993.

Not a technical standard, but a set of guidelines.
Emphasizes search thesauri.
Emphasizes postcoordinate retrieval.
Used mainly for abstracting and indexing services.
Does not put the standard in context.


Why Revise

Not revised since 1993.
Number of downloads high, reflecting interest.
Does not take the web environment into account.
– Navigation schemes are controlled vocabularies too.
– Is out of date in terms of computing technology in general:
  • Software for managing thesauri has advanced.
  • Software for leveraging thesauri through an interface has advanced.
Currently little attention paid to user testing.


Term forms

Currently
– Emphasizes rigid rules for grammatical form.
– Emphasizes short phrases as terms.
Suggested revision
– Loosen rules on grammatical form.
– Allow for longer, more complex phrases.
Rationale
– Software can perform automatic stemming.
– Navigation schemes are more precoordinate.
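
The stemming rationale can be illustrated with a short Python sketch: if software conflates grammatical variants at search time, the vocabulary itself need not enforce one form. The `naive_stem` and `same_term` functions below are hypothetical and deliberately crude; this is not the Porter algorithm or any production stemmer.

```python
# Naive suffix stripping, for illustration only; real systems typically use
# an established stemming algorithm rather than rules like these.
def naive_stem(word):
    word = word.lower()
    if word.endswith("ies") and len(word) > 4:
        return word[:-3] + "y"
    if word.endswith(("shes", "ches", "xes", "sses")):
        return word[:-2]
    if word.endswith("s") and not word.endswith("ss"):
        return word[:-1]
    return word


def same_term(a, b):
    """Treat two phrases as the same term if their stemmed words match."""
    return [naive_stem(w) for w in a.split()] == [naive_stem(w) for w in b.split()]
```

Under these rules "Vehicle collision" and "vehicle collisions" match, so the singular/plural choice matters less to retrieval, while "Truck collisions" and "Truck crashes" still need a synonym link.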


Semantic Relationships

Current standard
– Only accounts for explicit equivalence relationships.
– Hierarchical relationship only allowed for genus-species relationship, with a few exceptions.
– Associative relationship only allowed across categories.
Proposed revision
– Provide guidelines for choosing unambiguous labels.
– Provide guidelines for loose, browse categories.
Rationale
– Labeling schemes and pick lists often do not account for explicit synonymy relationships.
– Hierarchical navigation schemes need to be less rigid.


Browse Categories


Usability Testing

Current standard
– Discusses users but does not include guidelines for testing with users.
Proposed revision
– Provide guidelines for open card sort testing of high level categories.
– Provide guidelines for closed card sorting of term groups under high level categories.
Rationale
– User testing is an important consideration for choosing terms and term relationships.
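
One common way to summarize an open card sort is to count how often participants place two cards in the same pile. A minimal Python sketch, assuming a hypothetical `co_occurrence` function and invented participant data (the slides do not prescribe an analysis method):

```python
from collections import Counter
from itertools import combinations


def co_occurrence(sorts):
    """Count how many participants placed each pair of cards in the same pile.

    `sorts` holds one entry per participant; each entry is a list of piles,
    and each pile is a list of card labels.
    """
    counts = Counter()
    for piles in sorts:
        for pile in piles:
            for pair in combinations(sorted(set(pile)), 2):
                counts[pair] += 1
    return counts


# Hypothetical results from three participants.
sorts = [
    [["Tires", "Roof Racks"], ["Camry", "Celica"]],
    [["Tires", "Roof Racks", "Camry"], ["Celica"]],
    [["Tires"], ["Roof Racks"], ["Camry", "Celica"]],
]
pairs = co_occurrence(sorts)
```

Pairs with high counts are candidates for the same high level category; pairs that are rarely grouped together suggest a category boundary.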


Display

Current standard
– Emphasizes print copies of thesauri.
– Screen display section oriented toward display of print copy.
Proposed revision
– Oriented more toward displays of vocabularies that only exist in digital format.
Rationale
– Most web vocabularies do not have print counterparts.


Interoperability

Current standard
– Does not address issues associated with interoperability.
Proposed revision
– Will address major issues and problems associated with interoperability, including multiple languages.
Rationale
– Being able to share information within and among organizations.


Construction and Maintenance

Current standard
– Emphasizes maintenance problems in print vocabularies.
– Discusses software that manages stand-alone vocabularies.
Proposed revision
– Advance standards for changing, adding, deleting terms automatically.
– Provide guidance for software that is connected to information retrieval systems.
Rationale
– Software has advanced significantly.


Process for Revising Standard

Appoint editor.
Appoint advisory group.
Draft revision.
Discuss drafts with advisory group.
Vote on final draft by NISO board.


Editor & Advisory Group

Amy Warner, lexonomy.com
Vivian Bliss, Microsoft
Carol Brent, ProQuest
John Dickert, U.S. DoD
Lynn El-Hoshy, Library of Congress
Emily Fayen, SDC liaison
Patricia Harpring, Getty
Stephen Hearn, American Library Association
Sabine Kuhn, American Chemical Society/Chemical Abstracts
Pat Kuhr, H.W. Wilson
Diane McKerlie, Design Strategy
Peter Morville, Semantic Studios
Stuart Nelson, National Library of Medicine
Diane Vizine-Goetz, OCLC
Marcia Lei Zeng, Special Libraries Association


Progress to Date

Agreement on scope of revision.
Agreement that guidelines should be placed in context.
Agreement that guidelines should be educational as well as prescribing best practice.
Agreement that guidelines should be forward looking in terms of new technologies.
Agreement to write guidelines for elements and features that all vocabularies have in common, then consider their differences.
Survey conducted to determine use of standard, other standards, software.


Other Players

Communication with editor of British Standard.
Communication and work with W3C to address issues of implementation of controlled vocabularies.


Relationship with Semantic Web and OWL

Semantic Web is an ontological framework.

Both terms in the ontology and the relationships between them are standardized using OWL (Web Ontology Language).

Both the terms and the relationships are ‘deep’ semantically.

This is a structure into which ‘shallower’ terms provided by using Z39.19 could be inserted.

This would enhance interoperability because although we would not have complete agreement on vocabularies, we would have agreement on an effective structure for exchanging them.
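
As a rough illustration of inserting thesaurus terms into such a structure, the Python sketch below renders broader-term links as Turtle text using `owl:Class` and `rdfs:subClassOf`. This mapping is an assumption made for the example, not something the slides or Z39.19 specify: thesaurus BT/NT links are often looser than strict class subsumption, which is the sense in which the terms are "shallower". The `to_turtle` function and the `http://example.org/vocab#` namespace are hypothetical.

```python
def to_turtle(broader_of, base="http://example.org/vocab#"):
    """Render narrower -> broader links as owl:Class and rdfs:subClassOf.

    Assumption: BT/NT is treated as class subsumption, which loses nuance.
    """
    def name(term):
        return term.replace(" ", "_")

    lines = [
        f"@prefix : <{base}> .",
        "@prefix owl: <http://www.w3.org/2002/07/owl#> .",
        "@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .",
        "",
    ]
    # Declare every term once, then state the hierarchy.
    terms = sorted(set(broader_of) | set(broader_of.values()))
    for term in terms:
        lines.append(f':{name(term)} a owl:Class ; rdfs:label "{term}" .')
    for narrower, broader in sorted(broader_of.items()):
        lines.append(f":{name(narrower)} rdfs:subClassOf :{name(broader)} .")
    return "\n".join(lines)


turtle = to_turtle({"Truck collisions": "Vehicle collisions"})
```

The point of the sketch is the exchange format: two organizations could disagree on the terms themselves and still share them in a common structure.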


Contact Me

Amy J. Warner

[email protected]

www.lexonomy.com