Document Ontologies in Library and Information Science: An Introduction and Critical Analysis

Allyson Carlyle
iSchool, University of Washington, Seattle, WA, USA
[email protected]
http://purl.oclc.org/net/acarlyle
Knowledge Technologies Conference 2002, Seattle, WA, USA

Overview

Where I’m coming from : knowledge and document organization tasks in LIS (Library and Information Science)

Factors affecting the organization of knowledge and documents

Ontologies & Ontological assumptions in LIS

  IFLA ontology: Physical/Abstract status of documents

  Hirons & Graham ontology: Temporal status of documents

Where I’m coming from: organizing tasks in LIS

Creating document representations, e.g., cataloging records;

Arranging documents, e.g., in Dewey number order on a library bookshelf;

Creating organizational standards (Dewey) and techniques (alphabetical ordering) to use in representing and arranging documents;

Creating organizational standards and techniques to provide pathways (via titles, author names, taxonomies, classifications, etc.) that guide people to documents and organized knowledge.

Factors affecting the organization of knowledge and documents

People: individuals and groups (e.g., social, cultural, occupational orientations);

Systems: retrieval, display / organization, interface;

Knowledge / documents: delivery mode/format, subject content, disciplinary aspect, artifactual importance;

Administration & environment: costs, other constraints.

Ontology?

In philosophy: “the branch of metaphysics that deals with the nature of being”

In computer-related communities: “a specification of a conceptualization” (Tom Gruber); or “a set of vocabulary definitions that expresses a community’s consensus knowledge about a domain. This knowledge is meant to be stable over time, and reused to solve multiple problems.” (Peter Weinstein)

Ontological assumptions in LIS

Documents have a simultaneous existence as both physical and abstract entities; this is being referred to in the library cataloging community as “content vs. carrier”

Content vs. Carrier

The physical/abstract dichotomy presents the following problem – if we are creating document representations, what should we represent? The carrier? The content? Both?

Whatever decision is made, the physical/abstract dichotomy may result in complications for people when they are searching, navigating, and trying to determine relevance in systems.

IFLA ontology of the physical/abstract status of documents

The International Federation of Library Associations (IFLA) charged a study group to identify “functional requirements for bibliographic records” (in other words, an explanation of an optimum model for creating document representations).

Functional Requirements for Bibliographic Records is available at: http://www.ifla.org/VII/s13/frbr/frbr.pdf

IFLA ontology of the physical/abstract status of documents

Proposes that documents are single physical entities representing multiple abstract entities, each with its own distinct, and sometimes contradictory, attributes:

  work (an intellectual or artistic creation)

  expression (a realization of a work in alpha-numeric, musical, image, etc. form)

  manifestation (a physical embodiment of an expression of a work)

  item (a single exemplar of a manifestation)

Alternative definitions for IFLA entities

work: a set of items embodying a distinct intellectual or artistic content

expression: a set of items embodying a realization of a work

manifestation: a set of “identical” items; items sharing many intellectual and physical attributes

item: a single item
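
The set-based definitions above can be sketched as a grouping operation over item records. This is only an illustration: the record fields, the key labels, and the `group_items` helper are hypothetical, and it assumes someone has already decided which work, expression, and manifestation each item belongs to.

```python
from collections import defaultdict
from typing import Dict, Iterable, NamedTuple, Set


class ItemRecord(NamedTuple):
    """One physical item, tagged with hypothetical grouping keys."""
    item_id: str
    work_key: str
    expression_key: str
    manifestation_key: str


def group_items(records: Iterable[ItemRecord], level: str) -> Dict[str, Set[str]]:
    """Return the sets of item ids that share the same key at the given level."""
    groups: Dict[str, Set[str]] = defaultdict(set)
    for record in records:
        groups[getattr(record, level)].add(record.item_id)
    return dict(groups)


records = [
    ItemRecord("copy-1", "mare-au-diable", "french-text", "french-edition-a"),
    ItemRecord("copy-2", "mare-au-diable", "french-text", "french-edition-a"),
    ItemRecord("copy-3", "mare-au-diable", "english-translation", "english-edition-b"),
]

works = group_items(records, "work_key")
expressions = group_items(records, "expression_key")
manifestations = group_items(records, "manifestation_key")

print(sorted(works["mare-au-diable"]))        # → ['copy-1', 'copy-2', 'copy-3']
print(sorted(expressions["french-text"]))     # → ['copy-1', 'copy-2']
```

Under this reading, a work is simply the largest set and a manifestation the smallest set above the single item; no abstract entity needs to exist apart from the items themselves.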

Items

Item attributes: condition, access restrictions on item, history (provenance), marks or inscriptions present

Manifestations

Manifestation attributes: edition designation (3rd edition), publisher/distributor, date of publication/distribution, physical medium, access restrictions on manifestation, file characteristics (electronic document)

What is a manifestation in the web environment? What you see on the screen or what is stored in a file on a server?

If a manifestation is defined as what you see on a screen, how useful is it to describe web page “manifestations”?

Expressions

Expression attributes: expression title (The Haunted Pool, The Devil’s Pool, La Mare au Diable), expression creator (e.g., translator), type of score (musical notation), projection or scale (cartographic expression), etc.

Do all expressions have unique attributes?

Dune vs. Dune: some interpretations would make manifestation attributes into expression attributes

Works

Work attributes: creator, work title (La Mare au Diable), date of creation / date of publication or appearance, key (for a musical work), coordinates (for a cartographic work), etc.

Problem: What is a work? When does a version (expression) of a work become different enough to become its own “distinct intellectual or artistic creation”?

Charles Dickens’ A Christmas Carol vs. Scrooged – the same work? different works that are related?

Solutions to the Physical/Abstract Multiple Entity Problem

IFLA ontology is one approach; others, both simpler and more complex, are possible – see Indecs Framework, a variation of the IFLA ontology for “intellectual property” e-commerce (http://www.indecs.org/).

Standardized approaches or ontologies are possible that:

  recognize multiple abstract entities embodied in a single physical item;

  represent each entity using a particular set of attributes, clearly distinguished;

  display relationships among items to users in an unambiguous and consistent manner

Hirons & Graham ontology: temporal status of documents

Some documents, such as magazines, annual reports, and websites, may be seen as distinct works that accumulate or change as time passes. Hirons and Graham identify these as “ongoing entities.” How do we best create representations for ongoing entities?

Hirons & Graham ontology: temporal status of documents

With their ontology, Hirons and Graham clarify the nature of ongoing entities to improve library cataloging rules.

However, their ontology may also be used to improve identification of metadata in web documents.

Hirons & Graham ontology

Strengths of the Hirons & Graham ontology

Recognizes both similarities and differences between documents such as serials that are “successive with discrete parts” and those that are “integrating”, such as websites

Recognizes the fundamental nature of “integrating” documents: that they are not made up of parts, but are wholes that are updated or changed.
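
The distinction between “successive” and “integrating” documents can be sketched as two different update behaviors. The class names, fields, and sample values below are hypothetical and only illustrate the contrast; they are not part of the Hirons & Graham model itself.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class SuccessiveEntity:
    """Ongoing entity issued in discrete parts that accumulate (e.g., a magazine)."""
    title: str
    parts: List[str] = field(default_factory=list)

    def issue(self, part: str) -> None:
        # A new part is added; earlier parts remain unchanged.
        self.parts.append(part)


@dataclass
class IntegratingEntity:
    """Ongoing entity that is a whole, updated in place (e.g., a website)."""
    title: str
    content: str = ""
    revision: int = 0

    def update(self, new_content: str) -> None:
        # The whole is changed; the previous state is not kept as a part.
        self.content = new_content
        self.revision += 1


magazine = SuccessiveEntity(title="Example Magazine")
magazine.issue("vol. 1, no. 1")
magazine.issue("vol. 1, no. 2")

website = IntegratingEntity(title="Example Website")
website.update("first version of the home page")
website.update("second version of the home page")

print(len(magazine.parts))   # → 2
print(website.content)       # → second version of the home page
```

Both classes model documents that change over time, which is the similarity the ontology recognizes; the difference is whether change adds parts or replaces the whole.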

Complications

How do we maintain attribute values for ongoing entities? See Carl Lagoze et al. for a possible solution, using “event aware” metadata: http://www.cs.cornell.edu/lagoze/papers/ev.pdf
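
One way to picture the idea of attribute values that change over time is to record each change as a dated event and look up the value in force on a given date. The sketch below is a simplified illustration under that assumption; it is not the model proposed by Lagoze et al., and the class names, method names, and sample titles are all hypothetical.

```python
from dataclasses import dataclass
from datetime import date
from typing import List, Optional


@dataclass(frozen=True)
class ChangeEvent:
    """A dated change to one attribute of an ongoing entity."""
    when: date
    attribute: str
    new_value: str


class OngoingEntityRecord:
    """Keeps attribute history as a list of events instead of single values."""

    def __init__(self) -> None:
        self._events: List[ChangeEvent] = []

    def record(self, when: date, attribute: str, new_value: str) -> None:
        self._events.append(ChangeEvent(when, attribute, new_value))
        # Keep events in chronological order so lookups can scan forward.
        self._events.sort(key=lambda event: event.when)

    def value_at(self, attribute: str, when: date) -> Optional[str]:
        """Return the value in force on the given date, or None if not yet set."""
        value: Optional[str] = None
        for event in self._events:
            if event.attribute == attribute and event.when <= when:
                value = event.new_value
        return value


serial = OngoingEntityRecord()
serial.record(date(1990, 1, 1), "title", "Old Title")
serial.record(date(2000, 1, 1), "title", "New Title")

print(serial.value_at("title", date(1995, 6, 1)))   # → Old Title
print(serial.value_at("title", date(2001, 1, 1)))   # → New Title
print(serial.value_at("title", date(1980, 1, 1)))   # → None
```

Storing events rather than a single current value means a representation can answer both “what is the title now?” and “what was the title when this issue appeared?”.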

Complications

Can the Hirons & Graham ontology and the IFLA ontology be successfully integrated? How can we talk about an integrating work or expression? What attributes are associated with them?

For example, are “serials”, such as magazines or e-journals, really “works”? If they are works (Time Magazine) composed of other works (Time articles), what are the implications for representation?

References

IFLA Study Group on the Functional Requirements for Bibliographic Records. Functional Requirements for Bibliographic Records: Final Report. UBCIM Publications, New Series, vol. 19. München: K.G. Saur, 1998. http://www.ifla.org/VII/s13/frbr/frbr.pdf

The Indecs (INteroperability of Data in E-Commerce Systems) Framework. At: http://www.indecs.org/ [Used the IFLA ontology as an initial framework.]

Jean Hirons and Crystal Graham. “Issues Related to Seriality.” From: The Principles and Future of AACR. Jean Weihs, ed. Ottawa: Canadian Library Association, 1998. [Written for the library cataloging community, so parts of it may be difficult to understand.]

Carl Lagoze, Jane Hunter, and Dan Brickley. “An Event Aware Model for Metadata Interoperability.” At: http://www.cs.cornell.edu/lagoze/papers/ev.pdf

“Ontology” References

Tom Gruber. (2001) “What is an Ontology?” At: http://www-ksl.stanford.edu/kst/what-is-an-ontology.html.

Peter Weinstein. “Ontology-Based Metadata: Transforming the MARC Legacy”, from Digital Libraries 98, Pittsburgh, PA, USA: pp. 254-263.