IS 257 – Fall 2009 2009.01.21 - SLIDE 1 Organization of Information in Collections: Introduction University of California, Berkeley School of Information IS 245: Organization of Information In Collections

Lecture Contents

• Course Introduction

• Organization of Information

• Metadata

• Dublin Core

• Controlled Vocabularies

• Discussion

Course contents

• Metadata and Metadata Schemas

• Bibliographic Description

• Access Points and Vocabulary Control

• Topical/Subject Description

• Thesauri

• Ontologies

• Other Metadata/Description/Organization topics

COURSE OUTLINE

• Among the topics that will be covered during the semester are a number of “traditional” library-related topics:

• BIBLIOGRAPHIC DESCRIPTION
  – Introduction to the use of standards and codes for description of bibliographic materials, including the International Standard Bibliographic Description and the Anglo-American Cataloging Rules.

COURSE OUTLINE

• ACCESS
  – 1. Access by names: issues and problems, including name authority control
  – 2. Access by subject
    • a. Types of access: descriptors; index terms, including types of indexes (e.g. KWIC, KWOC); subject headings; relational indexes (e.g. PRECIS)
    • b. Vocabulary control: role of thesauri and their use (e.g. Library of Congress Subject Headings; Medical Subject Headings; the Art and Architecture Thesaurus)

COURSE OUTLINE

• ACCESS (cont.)
    • c. Classification schemes and their uses: shelf arrangement; organization of printed lists; thesaurus hierarchies
    • d. Subject authority control
  – 3. Access by other attributes
    • a. Physical attributes of documents: title, text
    • b. Other attributes: language, uniform title

COURSE OUTLINE

• ACCESS (cont.)
  – 4. Use of multiple access points: e.g. subject and date
  – 5. Evaluation of different access points within systems (e.g. purposes served by access through classification scheme and alphabetical subject terms within a catalog or index)

COURSE OUTLINE

• Metadata and Metadata Schemas
  – MARC
  – MODS
  – METS
  – EAD
  – EAC
  – Dublin Core
  – OWL
  – RDF
  – FRBR (which isn’t really a schema, but a model)

Course Requirements

• Assignments and exercises (30%)

• Final Paper/Project (60%)
  – Can be a traditional research paper on an organizational topic or a project such as construction of a Thesaurus or Ontology for a particular topical area.
  – Could be part of a MIMS final project

• Class Participation – including class reports (10%)

Organization of Information

• Is there a basic human need to put things into some sort of order?
  – Much of natural language concerns categories of things rather than individual things
  – Why do we organize things and information?
    • Why do spoons go in THAT drawer in the kitchen and not in a can in the garage?
    • Why do your favorite books go on one shelf and not-so-favorite on another?

Why Organize Information?

• The main reason
  – So that you can find things more effectively

• I.e., effective retrieval is predicated on some sort of organization applied to information resources

• Historically there have been many institutions and tools devoted to information organization
  – Libraries
  – Museums
  – Archives
  – Indexes and catalogs, dictionaries, phone books, etc.

Why Organize Information?

• A question of scale
  – Using your own ad hoc set of categories and methods to organize your own collection of books or CDs seems to work fine…
  – What if your collection grew to
    • 10 times the size? How would you organize it?
    • 100 times?
    • 1,000 times?
    • 100,000 times?
    • What if it wasn’t physical objects, but electronic?

What is Information Organization?

• Identifying the existence of all types of information-bearing entities as they are made available

• Identifying the works contained within those information-bearing entities or as parts of them

• Systematically pulling together these information-bearing entities into collections in libraries, archives, museums, Internet communications files and other such depositories

From Hagler via Taylor, Chap. 1

What is Information Organization?

• Producing lists of these information-bearing entities prepared according to standard rules for citation

• Providing name, title, subject and other useful access to these information-bearing entities

• Providing the means of locating each information-bearing entity or a copy of it

Key Issues in This Course

• How to describe information resources or information-bearing objects so that they may be effectively used by those who need to use them
  – Organizing

• How to find the appropriate information resources or information-bearing objects for someone’s (or your own) needs
  – Retrieving

Key Issues

[Figure: diagram of the information life cycle. Stage labels: Creation; Utilization; Searching; Active; Semi-Active; Inactive; Retention/Mining; Disposition; Discard. Activity labels: Using/Creating; Authoring/Modifying; Organizing/Indexing; Storing/Retrieval; Distribution/Networking; Accessing/Filtering.]

Organizing/Indexing

• Collecting and integrating information

• Affects data, information and metadata

• “Metadata” describes data and information
  – More on this shortly

• Organizing information
  – Types of organization?

• Indexing

Accessing/Filtering

• Using the organization created in the O/I stage to
  – Select desired (or relevant) information
  – Locate that information
  – Retrieve the information from its storage location (often via a network)

Structure of an IR System

[Figure: Information Storage and Retrieval System. Inputs: interest profiles & queries on one side, documents & data on the other. Queries are formulated in terms of descriptors and kept in Store 1 (profiles/search requests; storage of profiles). Documents receive descriptive and subject indexing and are kept in Store 2 (document representations; storage of documents). Both sides follow the "rules of the game" = rules for subject indexing + thesaurus (which consists of lead-in vocabulary and indexing language). Comparison/matching across the storage line yields potentially relevant documents.]
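
The structure in this diagram can be sketched in a few lines of Python. This is only an illustrative sketch, not the system from the slide: the lead-in vocabulary entries, the document ids, and the function names (`to_descriptors`, `index_documents`, `match`) are all hypothetical. It shows the one idea the figure relies on, namely that documents and queries are both translated into the same indexing language before they are compared.

```python
# Hypothetical lead-in vocabulary: maps entry terms to preferred descriptors.
LEAD_IN = {
    "cataloguing": "cataloging",
    "classifying": "classification",
    "thesaurus": "thesauri",
}


def to_descriptors(terms):
    """Translate free terms into descriptors of the indexing language."""
    return {LEAD_IN.get(term.lower(), term.lower()) for term in terms}


def index_documents(docs):
    """Build Store 2: document id -> set of descriptors (document representations)."""
    return {doc_id: to_descriptors(terms) for doc_id, terms in docs.items()}


def match(query_terms, store):
    """Comparison/matching: ids of documents sharing a descriptor with the query."""
    query = to_descriptors(query_terms)
    return sorted(doc_id for doc_id, descs in store.items() if query & descs)


# Made-up documents, each described by a few subject terms.
docs = {
    "d1": ["Cataloguing", "Classification"],
    "d2": ["Thesauri"],
}
store = index_documents(docs)
print(match(["cataloging"], store))
print(match(["thesaurus"], store))
```

Because both sides pass through `to_descriptors`, a query for "thesaurus" can still reach a document indexed under "Thesauri"; that is the role the lead-in vocabulary plays in the figure.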

Metadata

• Metadata is
  – “Data about Data” (database systems)
  – Information about Information

• First used (to the best we can discover) in 1978 (meta-data)

• Used for databases (as “Meta-Data Base”)
  – “a data base which itself contains the structural and semantic data of other data bases”
    » Thomas R. Cousins & Wayne D. Dominick, “The Management of Data Bases of Data Bases,” ASIS Proceedings, 1978.

Metadata

• Structures and languages for the description of information resources and their elements (components or features)

• “Metadata is information on the organization of the data, the various data domains, and the relationship between them” (Baeza-Yates p. 142)

Metadata

• Often two main types of metadata are distinguished
  – Descriptive metadata
    • Describes the information/data object and its properties
    • May use a variety of descriptive formats and rules
  – Topical metadata
    • Describes the topic or “aboutness” of an information/data object
    • May include a variety of vocabularies for describing subjects, topics, categories, etc.

Types of Metadata

• Element names

• Element description

• Element representation

• Element coding

• Element semantics

• Element classification

Metadata Systems and Standards

• Naming and ID systems
• Bibliographic description
  – Texts
• Music
• Images and objects
• Numeric data
• Geospatial data
• Collections
• Video and motion pictures

The Same Item in Different Metadata Systems

• ISBD

• RFC 1807

• TEI Header

• MARC Record

• Dublin Core (a bit later)

ISBD Punctuation

• Title Proper (GMD) = Parallel title : other title info / First statement of responsibility ; others. -- Edition information. -- Material. -- Place of Publication : Publisher Name, Date. -- Material designation and extent ; Dimensions of item. -- (Title of Series / Statement of responsibility). -- Notes. -- Standard numbers: terms of availability (qualifications).
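
As a rough illustration of how this punctuation pattern assembles a description, here is a minimal Python sketch. It covers only a few of the areas above (title and responsibility, edition, publication, series), and the function and parameter names are hypothetical rather than part of ISBD.

```python
def _close(area):
    """End an area with a full stop unless it already has one."""
    return area if area.endswith(".") else area + "."


def isbd_description(title, responsibility, edition=None, edition_resp=None,
                     place=None, publisher=None, date=None, series=None):
    """Join a few bibliographic areas with ISBD-style prescribed punctuation."""
    # Title proper / first statement of responsibility.
    areas = [_close(f"{title} / {responsibility}")]
    # Edition statement, optionally with its own statement of responsibility.
    if edition:
        area = f"{edition} / {edition_resp}" if edition_resp else edition
        areas.append(_close(area))
    # Place of publication : publisher, date.
    if place and publisher and date:
        areas.append(_close(f"{place} : {publisher}, {date}"))
    # Series title in parentheses.
    if series:
        areas.append(f"({series}).")
    # Areas are separated by space, dash, dash, space.
    return " -- ".join(areas)


description = isbd_description(
    title="Introduction to cataloging and classification",
    responsibility="Bohdan S. Wynar",
    edition="8th ed.",
    edition_resp="Arlene G. Taylor",
    place="Englewood, Colo.",
    publisher="Libraries Unlimited",
    date="1992",
    series="Library science text series",
)
print(description)
```

The point of the sketch is that the punctuation itself identifies each element (" / " before a statement of responsibility, " : " before a publisher, " -- " between areas), so a reader can recognize the parts of a record without field labels.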

Bibliographic Record

• Introduction to cataloging and classification / Bohdan S. Wynar. -- 8th ed. / Arlene G. Taylor. -- Englewood, Colo. : Libraries Unlimited, 1992. -- (Library science text series).

RFC 1807

• BIB-VERSION:: CS-TR-v2.1
• ID:: UCB//123456
• ENTRY:: September 9, 1997
• TYPE:: BOOK
• TITLE:: Introduction to cataloging and classification
• AUTHOR:: Wynar, Bohdan S.
• AUTHOR:: Taylor, Arlene G.
• DATE:: 1992
• PAGES:: 633
• COPYRIGHT:: Libraries Unlimited, 1992
• SERIES:: Library Science Text Series
• END:: UCB//123456
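
Because each field in this format is a `FIELD:: value` line, a record like the one above is easy to read mechanically. The following is a minimal sketch, assuming one field per line and ignoring continuation lines; `parse_rfc1807` is a hypothetical helper name, not something defined by RFC 1807.

```python
from collections import defaultdict


def parse_rfc1807(text):
    """Parse 'FIELD:: value' lines into a dict of field name -> list of values."""
    record = defaultdict(list)
    for line in text.splitlines():
        if "::" not in line:
            continue  # blank or continuation lines are skipped in this sketch
        field, value = line.split("::", 1)
        record[field.strip().upper()].append(value.strip())
    return dict(record)


sample = """\
BIB-VERSION:: CS-TR-v2.1
ID:: UCB//123456
ENTRY:: September 9, 1997
TYPE:: BOOK
TITLE:: Introduction to cataloging and classification
AUTHOR:: Wynar, Bohdan S.
AUTHOR:: Taylor, Arlene G.
DATE:: 1992
PAGES:: 633
COPYRIGHT:: Libraries Unlimited, 1992
SERIES:: Library Science Text Series
END:: UCB//123456
"""

record = parse_rfc1807(sample)
print(record["AUTHOR"])
```

Values are kept in lists because fields such as AUTHOR can repeat within one record.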

Minimal TEI Header

<teiHeader>
  <fileDesc>
    <titleStmt>
      <title>Introduction to cataloging and classification</title>
      <respStmt>
        <name>Bohdan S. Wynar</name>
        <resp>8th edition by</resp>
        <name>Arlene G. Taylor</name>
      </respStmt>
    </titleStmt>
    <publicationStmt>
      <distributor>Libraries Unlimited</distributor>
    </publicationStmt>
    <sourceDesc>
      <bibl>Introduction to cataloging and classification / Bohdan S. Wynar. -- 8th ed. / Arlene G. Taylor. -- Englewood, Colo. : Libraries Unlimited, 1992.</bibl>
    </sourceDesc>
  </fileDesc>
</teiHeader>
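
Since a TEI header is XML, its fields can be pulled out with any XML parser. The sketch below uses Python's standard `xml.etree.ElementTree` on a well-formed, shortened copy of the header (the `sourceDesc` is omitted for brevity). It assumes no XML namespace; headers in real TEI files usually declare one, so the paths would need adjusting.

```python
import xml.etree.ElementTree as ET

# A well-formed, abbreviated copy of the header from the slide.
TEI_HEADER = """<teiHeader>
  <fileDesc>
    <titleStmt>
      <title>Introduction to cataloging and classification</title>
      <respStmt>
        <name>Bohdan S. Wynar</name>
        <resp>8th edition by</resp>
        <name>Arlene G. Taylor</name>
      </respStmt>
    </titleStmt>
    <publicationStmt>
      <distributor>Libraries Unlimited</distributor>
    </publicationStmt>
  </fileDesc>
</teiHeader>"""

header = ET.fromstring(TEI_HEADER)

# Element paths are relative to the <teiHeader> root.
title = header.findtext("fileDesc/titleStmt/title")
names = [name.text for name in header.iter("name")]
distributor = header.findtext("fileDesc/publicationStmt/distributor")

print(title)
print(names)
print(distributor)
```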

MARC Record (Display)

• ID:DCLC9124851-B RTYP:c ST:p FRN: MS:c EL: AD:06-20-91
• CC:9110 BLT:am DCF:a CSC: MOD: SNR: ATC: UD:04-11-92
• CP:cou L:eng INT: GPC: BIO: FIC:0 CON:b
• PC:s PD:1992/ REP: CPI:0 FSI:0 ILC:a II:1
• MMD: OR: POL: DM: RR: COL: EML: GEN: BSE:
• 010 9124851
• 020 0872878112 (cloth)
• 020 0872879674 (paper)
• 040 DLC$cDLC$dDLC
• 050 00 Z693$b.W94 1991
• 082 00 025.3$220
• 100 1 Wynar, Bohdan S.
• 245 10 Introduction to cataloging and classification /$cBohdan S. Wynar.
• 250 8th ed. /$bArlene G. Taylor.
• 260 Englewood, Colo. :$bLibraries Unlimited,$c1992.
• 300 xvii, 633 p. :$bill. ;$c24 cm.
• 440 0 Library science text series
• 504 Includes bibliographical references (p. 591-599) and index.
• 650 0 Cataloging.
• 650 0 Subject cataloging.
• 650 0 Classification$xBooks.
• 630 00 Anglo-American cataloguing rules.
• 700 10 Taylor, Arlene G.,$d1941-
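
In this display, `$` followed by a single character marks the start of a subfield, and the text before the first `$` is the implicit subfield `a`. The sketch below splits the data portion of a field on that convention. It is an assumption-laden illustration: `split_subfields` is a hypothetical helper, it expects the tag and indicators to have been removed already, and it handles only this `$` display form (MARC records in exchange format use a control character as the subfield delimiter instead).

```python
import re


def split_subfields(data, first_code="a"):
    """Split MARC display data such as 'X :$bY,$cZ.' into (code, value) pairs."""
    # With a capturing group, re.split returns:
    # [text before first '$', code, value, code, value, ...]
    parts = re.split(r"\$(\w)", data)
    subfields = []
    if parts[0]:
        subfields.append((first_code, parts[0].strip()))
    for code, value in zip(parts[1::2], parts[2::2]):
        subfields.append((code, value.strip()))
    return subfields


# Data portions of the 260 and 082 fields from the record above.
publication = split_subfields("Englewood, Colo. :$bLibraries Unlimited,$c1992.")
dewey = split_subfields("025.3$220")
print(publication)
print(dewey)
```

The 082 example shows why the subfield code is taken as exactly one character: `$220` is subfield `2` with the value `20`, not a subfield named `220`.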

Dublin Core

• Simple metadata for describing internet resources

• For “Document-Like Objects”

• 15 Elements (in base DC)

Dublin Core (original version)

• TITLE: Introduction to cataloging and classification
• CREATOR: Taylor, Arlene G.
• OTHER CONTRIBUTOR: Wynar, Bohdan S.
• DATE: 1992
• FORMAT: BOOK
• LANGUAGE: ENG
• PAGES: 633
• PUBLISHER: Libraries Unlimited
• SUBJECT: Cataloging.
• SUBJECT: subject cataloging.
• SUBJECT: Classification -- Books
• DESCRIPTION: Textbook on cataloging and classification
• RESOURCE TYPE: text.monograph
• RESOURCE IDENTIFIER: (ISBN) 0872879674

Dublin Core (XML)

<Title>Introduction to cataloging and classification</Title>
<Creator>Taylor, Arlene G.</Creator>
<Contributor>Wynar, Bohdan S.</Contributor>
<Date>1992</Date>
<Format>BOOK</Format>
<Language>ENG</Language>
<Format>633 pages</Format>
<Publisher>Libraries Unlimited</Publisher>
<Subject>Cataloging.</Subject>
<Subject>subject cataloging.</Subject>
<Subject>Classification -- Books.</Subject>
<Description>Textbook on cataloging and classification</Description>
<Type>text.monograph</Type>
<Identifier>(ISBN) 0872879674</Identifier>
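
A record like this can be generated from simple (element, value) pairs. The sketch below uses Python's standard `xml.etree.ElementTree`; the wrapping `<record>` element and the function name `to_dc_xml` are assumptions made for illustration, since the slide shows the elements without any enclosing root or namespace.

```python
import xml.etree.ElementTree as ET

# (element name, value) pairs; repeated names such as Subject are allowed.
pairs = [
    ("Title", "Introduction to cataloging and classification"),
    ("Creator", "Taylor, Arlene G."),
    ("Contributor", "Wynar, Bohdan S."),
    ("Date", "1992"),
    ("Publisher", "Libraries Unlimited"),
    ("Subject", "Cataloging."),
    ("Subject", "subject cataloging."),
    ("Subject", "Classification -- Books."),
    ("Identifier", "(ISBN) 0872879674"),
]


def to_dc_xml(pairs, root_tag="record"):
    """Wrap the pairs in a hypothetical root element, one child per pair."""
    root = ET.Element(root_tag)
    for name, value in pairs:
        ET.SubElement(root, name).text = value
    return root


root = to_dc_xml(pairs)
xml_text = ET.tostring(root, encoding="unicode")
print(xml_text)
```

A list of pairs is used rather than a dictionary so that repeatable elements keep all their values and their original order.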

Dublin Core Elements

• Title

• Creator

• Subject

• Description

• Publisher

• Contributor

• Date

• Type

• Format

• Identifier

• Source

• Language

• Relation

• Coverage

• Rights
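
One practical use of a fixed element list is checking a record's labels against it. The sketch below compares the labels from the "original version" record shown earlier with the fifteen names above; `unknown_labels` is a hypothetical helper and the comparison is a simple case-insensitive match.

```python
# The fifteen element names listed on the slide, lowercased for comparison.
DC_ELEMENTS = {
    "title", "creator", "subject", "description", "publisher",
    "contributor", "date", "type", "format", "identifier",
    "source", "language", "relation", "coverage", "rights",
}


def unknown_labels(labels):
    """Return, sorted, the labels that are not among the fifteen element names."""
    return sorted({label for label in labels if label.lower() not in DC_ELEMENTS})


# Labels copied from the "original version" record.
original_labels = [
    "TITLE", "CREATOR", "OTHER CONTRIBUTOR", "DATE", "FORMAT", "LANGUAGE",
    "PAGES", "PUBLISHER", "SUBJECT", "SUBJECT", "SUBJECT", "DESCRIPTION",
    "RESOURCE TYPE", "RESOURCE IDENTIFIER",
]

flagged = unknown_labels(original_labels)
print(flagged)
# prints ['OTHER CONTRIBUTOR', 'PAGES', 'RESOURCE IDENTIFIER', 'RESOURCE TYPE']
```

The flagged labels line up with the changes visible in the XML version of the record: the contributor, type, and identifier labels are shortened to Contributor, Type, and Identifier, and the page count moves into a Format element.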

Mega-Metadata Standards

• METS - Metadata Encoding and Transmission Standard (http://www.loc.gov/standards/mets)
  – Developed by the Digital Library Federation as an implementation strategy for preservation metadata
  – “XML document format for encoding metadata necessary for both management of digital library objects within a repository and exchange of such objects between repositories (or between repositories and their users)”
  – Provides a flexible mechanism for encoding descriptive, administrative, and structural metadata for a digital library object, and for expressing the complex links between these various forms of metadata

Metadata Resources

• Check the Links section from the class home page

• Best site is the “Digital Library: Metadata Resources” page from IFLA at http://www.ifla.org/II/metadata.htm

• For another good source of information on metadata standards see http://www.chin.gc.ca/English/Standards

Controlled Vocabularies

• Next time…