Introduction to Metadata Metadata Working Group Forum September 21, 2007 Presented by Metadata Services Department Presenters: Glen Wiley, Nancy Solla, Greg Nehler


Introduction to Metadata

Metadata Working Group Forum

September 21, 2007

Presented by Metadata Services Department

Presenters: Glen Wiley, Nancy Solla, Greg Nehler

PART I: Overview

Bring order to information

Bring order to information

Dr. Frank Chardonnay (2005)

Metadata

• Type: White

• Price: $14.99

• Quantity: 750ml

• Analysis: Alcohol 13.3%; Acidity .56g/100 ml; pH 3.40; Sugar 0.4%

• Description: The floral and fruity personality of this wine with mineral and toasty elements is in harmony with this style of Chardonnay.

• Use: Goes great with seafood, a creamy Alfredo sauce, roasted chicken or turkey.

Bring order to information

Categorizes

Contextualizes

Summarizes

Gives local meaning

Definition of Metadata

Most common: “Data about data” (too vague to be meaningful)

Definition by ALA CC:DA: “Metadata are structured, encoded data that describe characteristics of information-bearing entities to aid in the identification, discovery, assessment, and management of the described entities.”

• ALCTS Committee on Cataloging Task Force on Metadata Summary Report (June 1999)

Definition of metadata

“…machine understandable information about web resources or other things” -- Tim Berners-Lee, Director of World Wide Web Consortium

“structured data about resources that can be used to help support a wide range of operations” – Michael Day, UKOLN

“Metadata” comes from the computer science field

Emerged from the database research community in the late 1960s and early 1970s

Digital or non-digital; Human or machine readable

Why Metadata?

The function of organizing & managing information

–for discovery & retrieval

–to enable data interchange or sharing

–resource enrichment

–resource management, including preservation

Why Metadata?

What other functions can be supported?

•Verification of authenticity

•Intellectual property rights management

•Content-rating

•Authentication and authorization

•Personalization and localization of services

Metadata Discovery

Where can metadata be found?

•Within a resource

•Directly linked to the resource

•Detached from resource

Metadata can be found…Within a resource

• Title page and table of contents (books)

• META tags in document headers (Web pages)

• ID3 metadata (MP3)

• "File properties" (office documents)

• EXIF data (images)
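As an illustration of metadata carried within a resource, the sketch below reads META tags from a Web page header using Python's standard `html.parser` module. The sample page, its tag names, and the helper `extract_meta` are invented for illustration and are not taken from any real site.

```python
from html.parser import HTMLParser


class MetaTagParser(HTMLParser):
    """Collect name/content pairs from <meta> tags in an HTML document."""

    def __init__(self):
        super().__init__()
        self.metadata = {}

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attributes = dict(attrs)
        name = attributes.get("name")
        content = attributes.get("content")
        if name and content is not None:
            # A name may repeat, so keep every value in a list.
            self.metadata.setdefault(name, []).append(content)


# Hypothetical page, invented for illustration only.
SAMPLE_PAGE = """
<html>
  <head>
    <title>View of Ithaca Gorge</title>
    <meta name="DC.title" content="View of Ithaca Gorge">
    <meta name="DC.type" content="Image">
    <meta name="keywords" content="gorge">
    <meta name="keywords" content="Ithaca">
  </head>
  <body></body>
</html>
"""


def extract_meta(html_text):
    """Return {meta name: [content, ...]} for one HTML document."""
    parser = MetaTagParser()
    parser.feed(html_text)
    return parser.metadata


page_metadata = extract_meta(SAMPLE_PAGE)
```

Other embedded formats (ID3, EXIF, office file properties) need format-specific readers; this sketch covers only the Web page case.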

Directly linked to the resource

•Using the Link rel="meta" elements (Web pages)

<link rel="meta" href="index.php.rdf" />

Links web page to other metadata formats: Dublin Core, RDF, IEEE LOM, etc

[Diagram: Web Page linked to an RDF Document]

Metadata can be found…Detached from resource

Independently managed in a separate database; can be linked by identifiers

•This is the most common approach

Metadata can be found…

[Diagram: a Book, a Web Resource, a CD-ROM, and an Archival Object, each with a separate Metadata Record]

Metadata for a Manuscript

Dublin Core metadata:

identifier: http://idserver.utk.edu/?id=200600000001212

publisher: University of Tennessee Libraries

format: image/jpeg

format: manuscript

title: Letter, John Shrady in Knoxville, Tenn., to Jeannie Lockhart

description: In this letter, dated December 25, 1863 from Knoxville, Tenn., to "My Own Darling," John Shrady, a regimental surgeon, describes his journey to Knoxville, Tenn via Chattanooga. Additionally, he discusses the probability that he will get an appointment in a hospital, pointing out that the "facilities" are not generally in good shape.

subject: Knoxville (Tenn.) -- History -- Civil War, 1861-1865.

creator: Shrady, John

relation: Finding Aid: 1436; John Shrady Letters; Special Collections Library, The University of Tennessee

rights: For rights relating to this resource, visit http://idserver.utk.edu/?id=200600000001198
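The record above can be serialized as XML. The following is a minimal sketch using Python's standard `xml.etree.ElementTree`; it assumes the Dublin Core 1.1 element namespace, and the wrapper element name `record` and the function `to_dc_xml` are hypothetical, chosen only for illustration. Only some of the statements are included.

```python
import xml.etree.ElementTree as ET

DC_NS = "http://purl.org/dc/elements/1.1/"
ET.register_namespace("dc", DC_NS)

# (element, value) pairs; an element may repeat, as "format" does in the record.
MANUSCRIPT = [
    ("identifier", "http://idserver.utk.edu/?id=200600000001212"),
    ("publisher", "University of Tennessee Libraries"),
    ("format", "image/jpeg"),
    ("format", "manuscript"),
    ("title", "Letter, John Shrady in Knoxville, Tenn., to Jeannie Lockhart"),
    ("creator", "Shrady, John"),
    ("subject", "Knoxville (Tenn.) -- History -- Civil War, 1861-1865."),
]


def to_dc_xml(statements):
    """Build one XML element per (element, value) pair under a wrapper."""
    record = ET.Element("record")  # wrapper name is hypothetical
    for element, value in statements:
        child = ET.SubElement(record, f"{{{DC_NS}}}{element}")
        child.text = value
    return record


record = to_dc_xml(MANUSCRIPT)
xml_text = ET.tostring(record, encoding="unicode")
```

Keeping the statements as a list of pairs rather than a dictionary preserves repeated elements such as the two `format` values.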


Functions of metadata

Popular Categorizations

Descriptive

Administrative

Structural

Other typologies of metadata

• Asset/Use/Subject/Relation

• Integration/Semantic

Cornell University Library & Metadata

• EAD archival finding aids with RMC & Kheel Center

• VIVO’s semantic metadata

• Publisher-supplied e-journal metadata

• MARC records in Voyager

• Project Euclid subscription-level and issue-level metadata

• VRA Core metadata with Visual Resources Collections

• TEI Lite conversion scheme with Hearth Project

• Etc.

Descriptive Metadata

Descriptive of the intellectual content

Discovery, identification, selection, collocation, acquisition

Sample elements:

unique identifiers (PURL, Handle)

physical attributes (media, dimensions, condition)

bibliographic attributes (title, author/creator, language, keywords)

Descriptive Metadata

Sample implementations:

• Dublin Core

• MARC

• HTML Meta Tags

• EAD (Encoded Archival Description)

• TEI (Text Encoding Initiative) Header

• METS (Metadata Encoding and Transmission Standard)

• MODS (Metadata Object Description Schema) – MARC-21-based

Descriptive Metadata: LibeCast RSS feed

Administrative Metadata

• Technical (file size, resolution, format)

• Digital Rights Management (authentication, access)

• Preservation (provenance)

Sample elements:

• Light source

• Owner

• Copyright date

• Copying and distribution limitations

• License information

• Preservation activities

• Scanner type and model

• Image resolution

• Bit depth

• Color space

• File format

• Compression

Administrative Metadata

Sample implementations:

• MOA2, Administrative Metadata Elements

• National Library of Australia, Preservation Metadata for Digital Collections

• Preservation Metadata: Implementation Strategies (PREMIS)

• International Press Telecommunications Council (IPTC) Core

Administrative Metadata

From http://depot.northwestern.edu/~mcdough/l/ala07nrmig/metadata-inpractice-northwestern-distro.pdf

Structural Metadata

Defining components of information, like a “binder” for information objects

Sample elements:

• Title page

• Table of contents

• Chapters or parts

• Errata

• Index

• Sub-object relationship (e.g., photograph from a diary)

• Movement markings or section letters (scores)

• Track listings (audio recordings)

Structural Metadata

Sample implementations:

• Encoded Archival Description (EAD)

• MOA2, Structural Metadata Elements

• Metadata Encoding and Transmission Standard (METS)

Structural Metadata:

Semantic metadata

Definition:

Metadata that describe contextually relevant or domain-specific information about content (in the right context) based on a domain specific metadata model (e.g., industry-specific or enterprise specific) or ontology is known as semantic metadata.

Semantic metadata annotates or enhances information

Semantic metadata

Examples:

• Cornell’s VIVO

• Semantic metadata in the business domain could be: company name, ticker symbol, industry, sector, executives, etc.

• Semantic metadata in the intelligence domain could be: terrorist name, event, location, organization, etc.

Metadata that offer greater depth and more insight about the information fall under the semantic metadata category.

Semantic metadata

BEA Systems and PeopleSoft all engage in the "competes with" relationship with Oracle

Image from http://www.semagix.com/documents/SEII.pdf

Metadata building blocks

The basic unit of metadata is a statement.

A statement consists of a property (aka element) and a value.

A property is a resource that has a name and describes a specific aspect, characteristic, attribute or relation of a resource. Since a property is a resource, a property can have properties, but most of the time we are only really interested in the name.

Metadata statements describe resources.

Resources are anything that can be uniquely identified.

A resource may be part of a web page or even a whole collection of pages.

• From DC 2006, Manzanillo, Colima, 3 Oct 2006 Kurth, Basic DC Semantics, slide 7; http://dublincore.org/resources/training/dc-2006/Tutorial1.pdf

Metadata building blocks

A specific resource together with a named property plus a value of that property for that resource is a statement

From DC 2006, Manzanillo, Colima, 3 Oct 2006, Kurth, Basic DC Semantics, slide 8; http://dublincore.org/resources/training/dc-2006/Tutorial1.pdf

Metadata building blocks

What are the properties and values in these metadata statements?

Example 1:

245 00 $a Mann Library Chats in the Stacks $h [electronic resource]

Example 2:

<title>View of Ithaca Gorge</title>

<type>Image</type>

Metadata building blocks

A specific resource together with a named property plus a value of that property for that resource is a statement
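The resource, property, value structure can be sketched as a small data type. This is one possible reading of the two examples, not an official data model: the resource identifiers (`urn:example:...`) are hypothetical placeholders, and treating MARC field 245 subfields `$a` and `$h` as two separate properties is an assumption made for illustration.

```python
from typing import NamedTuple


class Statement(NamedTuple):
    resource: str  # identifier of the thing being described
    prop: str      # name of the property (element)
    value: str     # value of that property for that resource


# Resource identifiers below are hypothetical placeholders.
statements = [
    Statement("urn:example:image-1", "title", "View of Ithaca Gorge"),
    Statement("urn:example:image-1", "type", "Image"),
    Statement("urn:example:chats", "245 $a", "Mann Library Chats in the Stacks"),
    Statement("urn:example:chats", "245 $h", "[electronic resource]"),
]


def describe(resource, all_statements):
    """Return {property: value} for every statement about one resource."""
    return {s.prop: s.value for s in all_statements if s.resource == resource}


image_description = describe("urn:example:image-1", statements)
```

A plain dictionary per resource would also work when no property repeats; a list of statements keeps the three parts explicit.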

Metadata Scheme

Definitions:

“a set of metadata elements and rules for their uses that has been defined for a particular purpose” --(Caplan, 2003)

A set of metadata elements and the rules for using them.

“A collection of metadata elements gathered to support a function, or a series of functions (e.g., resource discovery, administration, use, etc.), for an information object.” –(Greenberg, 2005)

Metadata Schemas & Initiatives

• CDWA (Categories for the Description of Works of Art)

• Dublin Core Metadata Initiative (DCMI)

• EAD (Encoded Archival Description)

• FGDC Content Standard for Digital Geospatial Metadata (CSDGM)

• LOM (Learning Object Metadata)

• METS (Metadata Encoding and Transmission Standard)

• MIX (Metadata for Images in XML Schema)

• MODS (Metadata Object Description Schema)

• TEI (Text Encoding Initiative)

• VRA (Visual Resources Association) Core Categories

• etc.

Related to metadata schemas

Namespace is a unique place to contextualize elements and to avoid element name conflicts

• Dublin Core Metadata Element Set, Version 1.1 [http://purl.org/dc/elements/1.1/]

• Getty Art & Architecture Thesaurus [http://www.getty.edu/research/conducting_research/vocabularies/aat]

Syntax is the rules for encoding the elements or technical implementation

• XML, SGML, MARC

Content Rules define selection and representation of the elements

• Cataloging rules like AACR2

Semantics is the basic meaning of the metadata elements

• Definition of author
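To see how a namespace avoids element name conflicts, consider two elements that are both called `title` but come from different schemes. The sketch below uses Python's standard `xml.etree.ElementTree`; the Dublin Core namespace URI is the one listed above, while the second namespace URI, the `cu` prefix binding, and the sample values are hypothetical.

```python
import xml.etree.ElementTree as ET

DC = "http://purl.org/dc/elements/1.1/"
LOCAL = "http://example.org/local-terms/"  # hypothetical local namespace

# Two elements share the local name "title" but live in different namespaces.
SAMPLE = f"""
<record xmlns:dc="{DC}" xmlns:cu="{LOCAL}">
  <dc:title>View of Ithaca Gorge</dc:title>
  <cu:title>Gorge photo, box 3</cu:title>
</record>
""".strip()

root = ET.fromstring(SAMPLE)

# ElementTree addresses each element as {namespace}localname.
dc_title = root.find(f"{{{DC}}}title").text
local_title = root.find(f"{{{LOCAL}}}title").text
```

The prefix (`dc`, `cu`) is only shorthand within one document; the namespace URI is what identifies the element.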

Application Profile

Definition: An application profile is an assemblage of metadata elements selected from one or more metadata schemas and combined in a compound schema.

• SOURCE: Duval, E., et al. Metadata Principles and Practicalities, D-Lib Magazine, April 2002, http://www.dlib.org/dlib/april02/weibel/04weibel.html

[Diagram: elements drawn from Schema 1 and Schema 2 combined in a Metadata Record]

Application Profile

Subsets of metadata elements implemented by a particular group

• METS profile for primary textual resources

• ETD-MS, Dublin Core for ETDs

• Particular Library’s Application Profile for Digital Collections

Application Profile

KMODDL Application Profile http://kmoddl.library.cornell.edu/aboutmeta2.php

Specifies the elements, refinements and encoding schemes used by The Kinematic Models for Design Digital Library (KMODDL) for its metadata records

Application Profile

Identifying desired metadata elements for the collection

• What are the desired elements?

• Is there an explanation and description of the element?

• Do you have an example?

• Is the implementation mandatory or optional? Repeatable?

• What common or core data is needed?

• What data do your various user groups need?

• What established data standards (e.g., MARC, EAD, CDWA) might fit the information needs of your institution?

• What data do you intend to “deliver” to your various end-user groups?

• Relationship and dependency specification

Application Profile

Decision for value spaces: content and value specifications, vocabularies

What is the element’s name and how do you define its value?

ELEMENT NAMES:

• Agent – vra.agent

• Title – vra.title

• Language – dc.language

• Collection Type – cu.collectiontype

VALUE CONTROL

Agent

-- Yes, name authority: Local & LC Name Authority

-- Yes, by rule: Personal: Last, M. First; Organization: Bigger unit, smaller unit

-- No
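One way to make an application profile operational is to record, for each element, whether it is mandatory and whether it is repeatable, and then check records against those rules. The sketch below assumes this approach. The element names come from the slide, but the mandatory and repeatable settings, the sample records, and the `validate` function are invented for illustration and do not describe any actual Cornell profile.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ElementRule:
    name: str
    mandatory: bool
    repeatable: bool


# Element names are from the slide; the rule settings are invented.
PROFILE = [
    ElementRule("vra.agent", mandatory=False, repeatable=True),
    ElementRule("vra.title", mandatory=True, repeatable=False),
    ElementRule("dc.language", mandatory=False, repeatable=True),
    ElementRule("cu.collectiontype", mandatory=True, repeatable=False),
]


def validate(record, profile):
    """Check a record, given as (element name, value) pairs, against a profile.

    Returns a list of problem descriptions; an empty list means the record
    satisfies every rule.
    """
    problems = []
    allowed = {rule.name for rule in profile}
    counts = {}
    for name, _value in record:
        counts[name] = counts.get(name, 0) + 1

    for name in counts:
        if name not in allowed:
            problems.append(f"{name}: not in profile")
    for rule in profile:
        found = counts.get(rule.name, 0)
        if rule.mandatory and found == 0:
            problems.append(f"{rule.name}: mandatory element missing")
        if not rule.repeatable and found > 1:
            problems.append(f"{rule.name}: not repeatable, found {found}")
    return problems


# Hypothetical records for illustration.
good_record = [("vra.title", "Qoricancha exterior wall"), ("cu.collectiontype", "Images")]
bad_record = [("vra.title", "A"), ("vra.title", "B"), ("dc.creator", "X")]
```

Value control (name authority, formatting rules) would need a further check per element and is left out of this sketch.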

Interoperability

Facilitating interoperability

Using defined metadata schemes, shared transfer protocols, and crosswalks between schemes, resources across the network can be searched more seamlessly.

• Cross-system search, e.g., using Z39.50 protocol

• Metadata harvesting, e.g., OAI protocol

• Source: NISO. (2004) Understanding Metadata. Bethesda, MD: NISO Press, pp.1-2.
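Metadata harvesting with the OAI protocol (OAI-PMH) works through plain HTTP requests whose query string names a verb and a metadata format. The sketch below only builds such a request URL and does not contact any server. The base URL is a hypothetical placeholder and the function name is invented; the `verb`, `metadataPrefix`, and `set` arguments follow the OAI-PMH 2.0 specification.

```python
from urllib.parse import urlencode


def list_records_url(base_url, metadata_prefix="oai_dc", set_spec=None):
    """Build an OAI-PMH ListRecords request URL for a repository."""
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    if set_spec:
        # Optional: restrict harvesting to one set defined by the repository.
        params["set"] = set_spec
    return f"{base_url}?{urlencode(params)}"


# Hypothetical repository address, for illustration only.
harvest_url = list_records_url("https://repository.example.org/oai")
```

The `oai_dc` prefix requests unqualified Dublin Core, the format every OAI-PMH repository is required to support.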

Crosswalking Metadata

Definition of crosswalk:

Technical & semantic mapping of elements from one metadata framework to another metadata framework

"a set of transformations applied to the content of elements in a source metadata standard that results in the storage of appropriately modified content in the analogous elements of a target metadata standard."

• Source: NISO White Paper, October 1998

Crosswalking Metadata

Example from http://www.niso.org/standards/resources/UnderstandingMetadata.pdf, page 12

Crosswalking Metadata

655_7 |a Photographs |2 aat

GOES TO

<genreform>Photographs</genreform>

GOES TO

<dc.type>Image</dc.type>
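The chain above can be sketched as two small lookup steps. Note that the last step changes the value as well as the element name (Photographs becomes Image), so a crosswalk may need a value mapping in addition to an element mapping. The tables below contain only what the slide shows; the function names are hypothetical, and a real crosswalk would cover many more elements and values.

```python
from xml.sax.saxutils import escape

# Element-level mappings between schemes, following the slide's example.
MARC_TO_EAD = {"655 $a": "genreform"}
EAD_TO_DC = {"genreform": "dc.type"}

# Value-level mapping; only the Photographs -> Image pair is from the slide.
GENRE_TO_DC_TYPE = {"Photographs": "Image"}


def marc_to_ead(field, value):
    """Map a MARC field/subfield to its EAD element; the value is unchanged."""
    return MARC_TO_EAD[field], value


def ead_to_dc(element, value):
    """Map an EAD element to Dublin Core, translating the value if known."""
    return EAD_TO_DC[element], GENRE_TO_DC_TYPE.get(value, value)


def as_xml(element, value):
    """Render one element/value pair as a simple XML element."""
    return f"<{element}>{escape(value)}</{element}>"


ead_pair = marc_to_ead("655 $a", "Photographs")
dc_pair = ead_to_dc(*ead_pair)
```

The MARC source also records the vocabulary (`$2 aat`); this sketch drops it, which illustrates how detail can be lost when crosswalking to a simpler scheme.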

Crosswalking Metadata

[Diagram: MODS, DC, PREMIS]

Creation and tools

Categories of Creation Tools

• Templates

• Mark-up tools

• Extraction tools

• Conversion tools

Creation and tools

Software Specific Template

Fill in the individual values for each metadata element

Creation and tools

Metadata Mark-up Tools

Create metadata using an XML editor

Creation and tools

Metadata Extraction Tools

Web page URL or file location goes here

Creation and tools

Examples of Metadata Creation Tools

• Dublin Core tools: http://dublincore.org/tools

• National Library of New Zealand’s Preservation Metadata Extraction Tool: http://meta-extractor.sourceforge.net/

• TEI Software: http://www.tei-c.org/Software/index.html

• Customized Templates for EAD-Encoded Finding Aids: http://www.cdlib.org/inside/projects/oac/toolkit/templates/

• EAD Tools & Helper Files: http://www.archivists.org/saagroups/ead/tools.html

Creation and tools

Examples of Metadata Creation Tools

• FGDC Metadata Tools: http://metadata.nbii.gov/portal/server.pt?open=512&objID=255&&PageID=338&mode=2&in_hi_userid=2&cached=true

• Metadata Software Tools: http://ukoln.bath.ac.uk/metadata/software-tools/

• OAI-Specific Tools: http://www.openarchives.org/tools/tools.html

• RDF Editors and Tools: http://planetrdf.com/guide/#sec-tools

Questions?

Visit our new Metadata Services Department web site

PART II: Examples

Project Euclid example

Journal issue

[Diagram: MARC fields paired with Dublin Core elements for a Project Euclid record]

• marc:022 (ISSN) / dc:identifier [unique string]

• marc:008/35-37 “EN” / dc:language “EN”

• marc:245 / dc:title

• marc:100 / dc:creator

• marc:520 / dc:description

• marc:653 / dc:subject [uncontrolled vocabulary]

• marc:650 / dc:subject [encoding: msc2000]

• dc:format “text/pdf”

Billie Jean Isbell Andean Collection example

Digital image

Above: “Scan ID #00467” from the Billie Jean Isbell Andean Collection

Exercise 2: Metadata for Image Files

Exercise 2: Mapping for ISBELL Image Records

Exercise 2: .xml for Scan ID #ISB_00467

Title: Qoricancha exterior wall

Date: 1981

Subjects: Archaeological sites

Subject (local): Inka astronomy; Temple of the Sun; Inka observations of the zenith passage of the sun

Culture: Quechua

Locality: Coricancha Temple Site; Qoricancha Temple Site; Qorikancha Temple Site; Temple of the Sun Site

Region: Peru

Province: Cuzco

Description: The exterior wall of the Temple of the Sun in Cuzco.

Citation: Billie Jean Isbell Andean Collection, Images from the Andes: Collection Highlights: Zenith

Citation: Isbell, Billie Jean. “Culture Confronts Nature in the Dialectical World of the Tropics.” In Ethno-Astronomy and Archaeo-Astronomy in the American Tropics, edited by A. F. Aveni and G. Urton. Annals of the New York Academy of Sciences, vol. 385 (1982): 353-363. URL: http://hdl.handle.net/1813/2193

Image Identifier: ISB_00467

Production: digital imaging, Digital Consulting and Production Services, Cornell University Library (Ithaca, NY, USA)

Production: photographers, Isbell, Billie Jean

Image Copyright: This digital collection and its contents are owned and operated by the Cornell University Library. Digital reproductions are provided for private study, scholarship, and research use only and may not be downloaded for use in electronic or print publications (including websites), exhibitions or broadcasts, without permission. For more information, see: Cornell University Library Copyright Statement.

Collection: Billie Jean Isbell Andean Collection

Exercise 2: Final Output in Luna Web Interface

Questions?