Taxonomy & Metadata / Information Architecture Consulting Amy J. Warner, Ph.D. Metadata & Taxonomies for a More Flexible Information Architecture Information Architecture Summit March 16, 2002 Amy J. Warner, Ph.D. [email protected]


Amy J. Warner, Ph.D.

Metadata & Taxonomies for a More Flexible Information Architecture

Information Architecture Summit

March 16, 2002

Amy J. Warner, Ph.D.

[email protected]


Outline

• What I’ll cover:
  – Metadata and IA.
  – Metadata schema.
  – Vocabulary development.

• Underlying themes:
  – Standards.
  – Reality.
  – Some IR (information retrieval) issues.


What is Metadata?

Metadata is structured data which describes the characteristics of a resource. It shares many similar characteristics to the cataloguing that takes place in libraries, museums and archives.

Chris Taylor, University of Queensland


Types & Functions of Metadata

• Administrative
  Definition: Metadata used in managing and administering resources
  Examples:
  – Acquisition information
  – Rights and reproduction tracking
  – Documentation of legal access requirements
  – Location information
  – Version control

• Descriptive
  Definition: Metadata used to describe or identify information resources
  Examples:
  – Cataloging records
  – Specialized indexes
  – Hyperlinked relationships between resources
  – Annotations by users

• Preservation
  Definition: Metadata related to the preservation of information resources
  Examples:
  – Documentation of actions taken to preserve physical and digital versions of resources (e.g., data refreshing and migration)

• Technical
  Definition: Metadata related to how a system functions or metadata behaves
  Examples:
  – Digitization information (e.g., formats, compression ratios, scaling routines)
  – Authentication and security data (e.g., encryptions, passwords)

• Use
  Definition: Metadata related to the level and type of use of information resources
  Examples:
  – Use and user tracking
  – Content re-use and multi-versioning information

Source: Introduction to Metadata, Getty Information Institute


Confusing Terminology

• Controlled vocabularies

– Subject Headings: traditionally employed in libraries to tag (index) the topics of books and other library materials

– Thesauri: traditionally employed in abstracting & indexing services to tag (index) the topics of journal articles and other scholarly material in a given subject area (e.g. medicine, engineering)

– Taxonomies: the classification of different organisms into mutually exclusive categories based on phylum species


Levels of Control

[Figure: a scale running from Simple to Complex]

(Vocabularies) Synonym Rings, Authority Files, Classification Schemes, Thesauri

(Relationships) Equivalence, Hierarchical, Associative

Taxonomies


Metadata & IA

• Content: Identify patterns in content.

• Users: Determine how target audience(s) search for and use information.

• Business Context: Determine how stakeholders want to organize & present their information.


IA ‘Generations’

• ‘Brochureware’

• Pages served from database

• Metadata-driven website

CMS


Metadata in Metadata-Driven Websites

Metadata Records

Author      Title   DocType       Audience    URL
J. Jones    xxxx    White Paper   Employees   http://...

Content

http://….


Two Parts to Generating a Metadata Schema

• Decisions about indexable parameters (attributes, aspects) of documents; this corresponds to fields in the database records.

• Decisions about the elements (terms, descriptors, subject headings, tags) that these fields contain.


Two Possibilities

• Content already exists
  – Identify content that exists--content inventory.

• Most or all content does not exist
  – Use ‘wish lists’ to identify desired content.

• To do content inventory, need to go to those who are going to develop, own, maintain content.


Content Analysis

• Look for patterns, similarities:
  – logical--themes, sensitivity, specialization.
  – physical--formats, dynamic vs. static (dated vs. rarely updated).

• Look for relationships--note connections between content (parent-child, sibling, dependencies).

• Begin to create groupings.


Generating a Metadata Table

• The beginning of a metadata-driven website.

• Determine the major indexable parameters or attributes for each major document type in your sample.

• Determine what major types of rules or general guidelines your indexing system will follow for each attribute.

• Create an X-by-Y table.

• Put indexable attributes on the X axis and the rules on the Y axis.

• Fill in the decisions you make about each rule application in the individual cells of the table.


Metadata Table

Attribute   Required   Repeatable   Auto/Manual   Whole doc/Concepts   CV
Author      Yes        Yes          Manual        Whole Doc.           No
Title       Yes        No           Manual        Whole Doc.           No
DocType     No         Yes          Manual        Whole Doc.           DocTypesList
Subject     Yes        Yes          Semi-Auto     Concepts             SubjectsVocabulary
Audience    No         No           Manual        Whole Document       AudienceList
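A minimal Python sketch of how the metadata table could be encoded and used to check records before they enter the system. Only the attribute names and the Required, Repeatable, and CV values come from the table; the `FieldRule` class, the `validate` function, and the contents of the three vocabularies are hypothetical and chosen for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class FieldRule:
    required: bool
    repeatable: bool
    controlled_vocabulary: Optional[str]  # name of the CV list, or None


# Rules copied from the metadata table. The Auto/Manual and
# Whole doc/Concepts columns are left out to keep the sketch short.
SCHEMA = {
    "Author": FieldRule(True, True, None),
    "Title": FieldRule(True, False, None),
    "DocType": FieldRule(False, True, "DocTypesList"),
    "Subject": FieldRule(True, True, "SubjectsVocabulary"),
    "Audience": FieldRule(False, False, "AudienceList"),
}

# Hypothetical vocabulary contents, for illustration only.
VOCABULARIES = {
    "DocTypesList": {"White Paper", "Memo"},
    "SubjectsVocabulary": {"Metadata", "Taxonomies"},
    "AudienceList": {"Employees", "Customers"},
}


def validate(record: dict) -> list:
    """Return rule violations for one record whose values are lists."""
    errors = []
    for name, rule in SCHEMA.items():
        values = record.get(name, [])
        if rule.required and not values:
            errors.append(f"{name}: required")
        if not rule.repeatable and len(values) > 1:
            errors.append(f"{name}: not repeatable")
        if rule.controlled_vocabulary:
            allowed = VOCABULARIES[rule.controlled_vocabulary]
            for value in values:
                if value not in allowed:
                    errors.append(
                        f"{name}: '{value}' not in {rule.controlled_vocabulary}"
                    )
    return errors
```

Keeping the rules in one table-like structure means a change to the schema (say, making Audience repeatable) is a one-line edit rather than a change to the validation logic.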


User and Stakeholder Involvement

• When organizing content, start with the content, generate the metadata, and then evaluate with users and stakeholders.

• When organizing entities (i.e. products, projects) where content is not the major focus, start with stakeholders and users to determine metadata.


Identify Terms

• Published Reference Materials
  – Thesauri, classification schemes, encyclopedias, dictionaries, glossaries, indexes.

• Content
  – Representative sample of web site / intranet.

• Users
  – Search log analysis, surveys, interviews.

• Experts
  – Authors, subject experts.


Organize Terms

• Define preferred terms.
• Link synonyms and variants.

Synonym Rings

• Group preferred terms by subject.
• Identify broader and narrower terms.

Taxonomies / Hierarchies

• Identify related terms.

Thesauri


Variant Terms

Variant terms provide the user with entry points into the vocabulary.

Synonyms (same meaning):
  cats USE felines
  helicopters USE whirlybirds

Lexical Variants (different word forms):
  paediatrics USE pediatrics
  BK USE Burger King

Quasi-Synonyms (treated as equivalent):
  generic posting: beagle USE dog
  antonyms/continuum: wetness USE dryness
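As a rough illustration of how variant terms act as entry points, the USE pairs from this slide can be held in a lookup that maps whatever the user types to the preferred term. This is a sketch only: the `preferred_term` function and the case-insensitive matching are assumptions, not something the slide prescribes.

```python
# Variant -> preferred term, copied from the USE examples on the slide.
# Keys are lower-cased so that lookups can ignore case.
USE = {
    "cats": "felines",
    "helicopters": "whirlybirds",
    "paediatrics": "pediatrics",
    "bk": "Burger King",
    "beagle": "dog",
    "wetness": "dryness",
}


def preferred_term(query: str) -> str:
    """Map a user's term to its preferred term; unknown terms pass through."""
    cleaned = query.strip()
    return USE.get(cleaned.lower(), cleaned)
```

A search front end could call `preferred_term` on each query term before matching against the indexed vocabulary, so that a user typing a variant still reaches documents tagged with the preferred term.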


Term Specificity

Assuming a good entry vocabulary, increased term specificity allows for improved precision without hurting recall (but costs grow fast).

Vocabulary A        Vocabulary B
United States       United States
                      California
                        San Diego
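One way to read the claim about specificity, sketched in Python under stated assumptions: documents are tagged with the most specific term in Vocabulary B, and a search on a broader term is expanded to include its narrower terms. The document IDs and the `expand` and `search` functions are hypothetical.

```python
# Vocabulary B from the slide: each term points to its narrower terms.
NARROWER = {
    "United States": ["California"],
    "California": ["San Diego"],
    "San Diego": [],
}

# Hypothetical documents, each tagged with its most specific term.
DOCS = {
    "doc1": "United States",
    "doc2": "California",
    "doc3": "San Diego",
}


def expand(term):
    """Return the term plus all of its narrower terms, at any depth."""
    found = [term]
    for child in NARROWER.get(term, []):
        found.extend(expand(child))
    return found


def search(term):
    """Return IDs of documents tagged with the term or any narrower term."""
    wanted = set(expand(term))
    return sorted(doc for doc, tag in DOCS.items() if tag in wanted)
```

Under these assumptions a search on "San Diego" returns only the San Diego document (better precision), while a search on "United States" still returns all three (recall is not hurt). With Vocabulary A, every document would carry the single tag "United States" and the narrower query could not be expressed at all.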


Compound Terms

Article Title: “Software for Information Architects”

One Term (High Precision):    Information Architecture Software
Two Terms:                    Information Architecture / Software
Three Terms (High Recall):    Architecture / Information / Software


Facets

Facets of a Topic:
  Things (entities), Concepts, Processes, People, Organizations, Occupations, etc.

Facets of Documents:
  Topic, Audience, Intellectual Level, Form, Type, Language, Date, etc.

Aspects of Documents to Index

Controlled Vocabular(ies)


Facet Analysis

• Facets come from content inventory, intuition, and users.

• Break domain into logical categories or chunks based on how documents need to be managed (both for system and for search).


Polyhierarchy

• Strict Hierarchies
  – Each term appears in only one place in the hierarchy.
  – Essential for placement of physical objects.

• Polyhierarchies
  – Terms cross-listed in multiple categories.
  – Accepts complex nature of reality.


Polyhierarchy

• Compound terms needed to manage 6 million documents in Medline.

• High level of pre-coordination forces polyhierarchy.

• Terms may have more than one BT.

Diseases
  Virus Diseases
    Viral Pneumonia
  Respiratory Tract Diseases
    Viral Pneumonia

Medical Subject Headings (MeSH)
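A small Python sketch of a polyhierarchy, assuming the diagram shows Viral Pneumonia with two broader terms (Virus Diseases and Respiratory Tract Diseases), both under Diseases. The term names come from the slide; the `BROADER` mapping and both functions are hypothetical illustrations, not MeSH's actual data model.

```python
# Term -> list of broader terms (BT), following the slide's diagram.
BROADER = {
    "Viral Pneumonia": ["Virus Diseases", "Respiratory Tract Diseases"],
    "Virus Diseases": ["Diseases"],
    "Respiratory Tract Diseases": ["Diseases"],
    "Diseases": [],
}


def ancestors(term):
    """Collect every broader term reachable from `term`, without duplicates."""
    seen = set()
    stack = list(BROADER.get(term, []))
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(BROADER.get(current, []))
    return seen


def is_strict_hierarchy(broader):
    """A strict hierarchy gives every term at most one broader term."""
    return all(len(bts) <= 1 for bts in broader.values())
```

Storing a list of broader terms per term, rather than a single parent, is what allows a term to be cross-listed; the `seen` set keeps shared ancestors such as Diseases from being counted twice.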


Facets, Coordination, Specificity

Entities: Apples, Pears, Peaches
Processes: Canning, Freezing, Drying
Forms: Canned, Frozen, Fresh

Partial List of Potential Combinations

Apples, Pears, Peaches
Canning, Freezing, Drying
Canned, Frozen, Fresh
Canning of Apples, Canning of Pears, Canning of Peaches
Freezing of Apples, Freezing of Pears, Freezing of Peaches
Drying of Apples, Drying of Pears, Drying of Peaches
Canned Apples, Canned Pears, Canned Peaches
Frozen Apples, Frozen Pears, Frozen Peaches
Fresh Apples, Fresh Pears, Fresh Peaches
Freezing of Canned Apples, Canning of Dried Pears, Drying of Fresh Peaches
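To make the growth in combinations concrete, the three facets can be multiplied out in a few lines of Python. The facet values are the slide's; the variable names and the exact phrasing patterns ("X of Y", "X of Z Y") are assumptions made for this sketch.

```python
from itertools import product

ENTITIES = ["Apples", "Pears", "Peaches"]
PROCESSES = ["Canning", "Freezing", "Drying"]
FORMS = ["Canned", "Frozen", "Fresh"]

# Post-coordination: index with single-facet terms, combine at search time.
post_coordinated = ENTITIES + PROCESSES + FORMS

# Pre-coordination: every combination becomes its own vocabulary term.
process_entity = [f"{p} of {e}" for p, e in product(PROCESSES, ENTITIES)]
form_entity = [f"{f} {e}" for f, e in product(FORMS, ENTITIES)]
process_form_entity = [
    f"{p} of {f} {e}" for p, f, e in product(PROCESSES, FORMS, ENTITIES)
]

print(len(post_coordinated))     # single-facet terms
print(len(process_entity))       # two-facet compounds (process + entity)
print(len(form_entity))          # two-facet compounds (form + entity)
print(len(process_form_entity))  # three-facet compounds
```

Nine single-facet terms are enough to describe any of these documents when terms are combined at search time, whereas the fully pre-coordinated vocabulary needs a separate entry for each combination, and each added facet value multiplies that count.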


Semantic Relationships

• Equivalence:
  – Use/Used For (USE/UF)
  – Leads from variants to preferred
    e.g., prams: USE baby carriages

A = B


Semantic Relationships

• Hierarchical:
  – Broader Term/Narrower Term (BT/NT)

• Types
  – Generic (class/species, inheritance)
    Vertebrata NT Amphibia
  – Whole-Part (associative unless exclusive)
    Ear NT Vestibular Apparatus
  – Instance (proper name)
    Seas NT Mediterranean Sea

AB


Semantic Relationships

• Associative:
  – Related Term (RT, See Also)
  – Non-hierarchical and non-equivalent
  – Relation should be “strongly implied”
    e.g., hammers RT nails

A B
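The three relationship types (equivalence, hierarchical, associative) can be sketched as one small data structure that keeps reciprocal links in sync. The `Thesaurus` class and its method names are hypothetical, not a standard API; the example terms are the ones used on these slides.

```python
from collections import defaultdict


class Thesaurus:
    """Tiny thesaurus that records each relationship in both directions."""

    def __init__(self):
        self.use = {}                     # variant -> preferred term (USE)
        self.used_for = defaultdict(set)  # preferred -> variants (UF)
        self.broader = defaultdict(set)   # term -> broader terms (BT)
        self.narrower = defaultdict(set)  # term -> narrower terms (NT)
        self.related = defaultdict(set)   # term -> related terms (RT)

    def add_equivalence(self, variant, preferred):
        self.use[variant] = preferred
        self.used_for[preferred].add(variant)

    def add_hierarchy(self, broader, narrower):
        self.narrower[broader].add(narrower)
        self.broader[narrower].add(broader)

    def add_association(self, a, b):
        # RT is symmetric, so it is stored under both terms.
        self.related[a].add(b)
        self.related[b].add(a)


t = Thesaurus()
t.add_equivalence("prams", "baby carriages")
t.add_hierarchy("Vertebrata", "Amphibia")
t.add_hierarchy("Seas", "Mediterranean Sea")
t.add_association("hammers", "nails")
```

Writing both directions at insert time means a lookup from either end (for example, from "Amphibia" up to "Vertebrata") never needs to scan the whole vocabulary.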


Associative Relationships

• Field of Study and Object of Study:
  – Forestry RT Forests

• Process and its Agent:
  – Temperature Control RT Thermostat

• Concepts and their Properties:
  – Poisons RT Toxicity

• Action and Product of Action:
  – Weaving RT Cloth

• Concepts Linked by Causal Dependence:
  – Bereavement RT Death


Leveraging the Thesaurus

• User Interface:
  – Generate browsable indexes (site-wide, sub-site, specialized authority lists).
  – Enable Field-Specific Searching (filters, zones, sorting).
  – Support personalization (map profile to vocabulary).

• Behind the Scenes:
  – Enable efficient content management.
  – Support decentralized tagging.


Uses of Metadata-Driven Website

• Routing

• Search

• Navigation


Routing

Document Stream → Metadata Filter → Document Subset

• Document Stream: from individual contributors or syndication service.
• Metadata Filter: profile or filter.


Generalizations about Routing

• Can be ‘push’ or ‘pull’.

• Can be driven by various metadata elements (e.g., audience, topic, etc.).

• May have both internal and external metadata schemes to consider; mapping may be an important issue.
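The routing idea (a document stream passed through a metadata filter to yield a subset) might look like the following Python sketch, which also shows one simple form of mapping an external scheme onto an internal one. The sample records, the `EXTERNAL_TO_INTERNAL` mapping, and the `normalize` and `route` functions are all hypothetical.

```python
# Hypothetical incoming documents with their metadata records.
STREAM = [
    {"Title": "Benefits update", "Audience": "Employees", "Subject": "HR"},
    {"Title": "Product launch", "Audience": "Customers", "Subject": "Marketing"},
    # A syndicated item that uses an external scheme's values.
    {"Title": "Wire story", "Audience": "Staff", "Subject": "Personnel"},
]

# Hypothetical mapping from external values to internal vocabulary values.
EXTERNAL_TO_INTERNAL = {
    "Audience": {"Staff": "Employees"},
    "Subject": {"Personnel": "HR"},
}


def normalize(record):
    """Translate external metadata values into internal vocabulary values."""
    return {
        field: EXTERNAL_TO_INTERNAL.get(field, {}).get(value, value)
        for field, value in record.items()
    }


def route(stream, profile):
    """Yield titles of documents whose metadata matches every profile entry."""
    for record in stream:
        internal = normalize(record)
        if all(internal.get(f) == v for f, v in profile.items()):
            yield internal["Title"]
```

For example, `list(route(STREAM, {"Audience": "Employees"}))` picks up both the internal item and the syndicated one, because the external value "Staff" is mapped to "Employees" before the profile is applied. The same function serves 'push' (run on arrival) or 'pull' (run on request).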


Searching

User Query → Databases → Document Subset

Metadata Records

http://….


Epicurious.com


Epicurious, First Facet

Browse > Picnics


Epicurious.com Facets

Main Ingredients: Beans, Beef, Berries, Cheese, Chocolate, Citrus, Dairy, Eggs, Fish, Fruits, Garlic, Ginger, Grains, Greens, Herbs, Lamb, Mushrooms, Mustard, Nuts, Olives, Onions, Pasta, Peppers, Pork, Potatoes, Poultry, Rice, Shellfish, Tomatoes, Vegetables

Cuisine: African, American, Asian, Caribbean, Eastern European, French, Greek, Indian, Italian, Jewish, Mediterranean, Mexican, Middle Eastern, Scandinavian, Spanish

Preparation Method: Advance, Bake, Broil, Fry, Grill, Marinade, Microwave, No Cook, Poach, Quick, Roast, Sauté, Slow Cook, Steam, Stir Fry

Season/Occasion: Christmas, Easter, Fall, Fourth of July, Hanukkah, New Years, Picnics, Spring, Summer, Superbowl, Thanksgiving, Valentine's Day, Winter

Course/Dish: Appetizers, Bread, Breakfast, Brunch, Condiments, Cookies, Desserts, Hors D'oeuvres, Main Dish, Salads, Sandwiches, Sauces, Side Dish, Snacks, Soup, Vegetables


Epicurious, Second Facet

Browse > Picnics > Poultry
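The two-step browse path (Picnics, then Poultry) amounts to narrowing a result set by one facet value at a time. The Python sketch below illustrates that idea only; it is not a description of how Epicurious is implemented, and the recipe records and the `browse` function are hypothetical. The facet names and values are taken from the facet lists in this deck.

```python
# Hypothetical recipes tagged with values from two of the facets.
RECIPES = [
    {
        "name": "Fried chicken",
        "Season/Occasion": {"Picnics", "Summer"},
        "Main Ingredients": {"Poultry"},
    },
    {
        "name": "Potato salad",
        "Season/Occasion": {"Picnics"},
        "Main Ingredients": {"Potatoes", "Eggs"},
    },
    {
        "name": "Roast turkey",
        "Season/Occasion": {"Thanksgiving"},
        "Main Ingredients": {"Poultry"},
    },
]


def browse(recipes, selections):
    """Keep recipes that carry every selected (facet, value) pair."""
    return [
        recipe["name"]
        for recipe in recipes
        if all(value in recipe.get(facet, set()) for facet, value in selections)
    ]


first = browse(RECIPES, [("Season/Occasion", "Picnics")])
second = browse(
    RECIPES,
    [("Season/Occasion", "Picnics"), ("Main Ingredients", "Poultry")],
)
```

Because each facet is an independent field, the user could just as well have started with Poultry and then chosen Picnics; the order of selections does not change the final subset.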


Integration of Search and Browse




Amazon.com Advanced Search


Generalizations about Search & Navigation

• The relationship between the metadata and search engine capabilities is crucial.

• Controlled vocabulary and keyword searching are often both enabled.

• Navigation and search are often both provided as complements to each other.


Contact: Amy J. Warner, [email protected]

Questions??