Introduction to Metadata
Metadata Working Group Forum
September 21, 2007
Presented by Metadata Services Department
Presenters: Glen Wiley, Nancy Solla, Greg Nehler
Bring order to information
Dr. Frank Chardonnay (2005)
Metadata
• Type: White
• Price: $14.99
• Quantity: 750ml
• Analysis: Alcohol 13.3%; Acidity .56g/100 ml; pH 3.40; Sugar 0.4%
• Description: The floral and fruity personality of this wine with mineral and toasty elements is in harmony with this style of Chardonnay.
• Use: Goes great with seafood, a creamy Alfredo sauce, roasted chicken or turkey.
Definition of Metadata
Most common: “Data about data” (too vague to be meaningful)
Definition by ALA CC:DA: “Metadata are structured, encoded data that describe characteristics of information-bearing entities to aid in the identification, discovery, assessment, and management of the described entities.”
• ALCTS Committee on Cataloging Task Force on Metadata Summary Report (June 1999)
Definition of Metadata
“…machine understandable information about web resources or other things” -- Tim Berners-Lee, Director of World Wide Web Consortium
“structured data about resources that can be used to help support a wide range of operations” – Michael Day, UKOLN
“Metadata” comes from the computer science field
Emerged from the database research community in the late 1960s and early 1970s
Digital or non-digital; Human or machine readable
Why Metadata?
The function of organizing & managing information
–for discovery & retrieval
–to enable data interchange or sharing
–resource enrichment
–resource management, including preservation
Why Metadata?
What other functions can be supported?
•Verification of authenticity
•Intellectual property rights management
•Content-rating
•Authentication and authorization
•Personalization and localization of services
Metadata Discovery
Where can metadata be found?
•Within a resource
•Directly linked to the resource
•Detached from resource
Metadata can be found… Within a resource
• Title page and table of contents (books)
• META tags in document headers (Web pages)
• ID3 metadata (MP3)
• "file properties" (office documents)
• EXIF data (images)
Directly linked to the resource
•Using the Link rel="meta" elements (Web pages)
<link rel="meta" href="index.php.rdf" />
Links web page to other metadata formats: Dublin Core, RDF, IEEE LOM, etc
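As a rough illustration of how a program might discover metadata that is linked this way, the sketch below scans a page for link elements whose rel value includes "meta" and collects their href targets. The sample page and the names MetaLinkFinder and find_meta_links are assumptions made for this example; only the link element itself comes from the slide.

```python
from html.parser import HTMLParser


class MetaLinkFinder(HTMLParser):
    """Collect href values of <link rel="meta"> elements."""

    def __init__(self):
        super().__init__()
        self.meta_links = []

    def handle_starttag(self, tag, attrs):
        if tag != "link":
            return
        attributes = dict(attrs)
        # rel may hold several space-separated tokens, so split before testing.
        rel_tokens = (attributes.get("rel") or "").lower().split()
        if "meta" in rel_tokens and attributes.get("href"):
            self.meta_links.append(attributes["href"])


def find_meta_links(html_text):
    """Return the href of every <link rel="meta"> found in html_text."""
    finder = MetaLinkFinder()
    finder.feed(html_text)
    return finder.meta_links


# Hypothetical page; only the rel="meta" link is taken from the slide.
SAMPLE_PAGE = """<html><head>
<title>Example page</title>
<link rel="stylesheet" href="style.css" />
<link rel="meta" href="index.php.rdf" />
</head><body></body></html>"""

print(find_meta_links(SAMPLE_PAGE))
```

The href is returned as written in the page; resolving it against the page URL and fetching the RDF document would be separate steps.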
[Diagram: a Web page linked to an RDF document]
Metadata can be found… Detached from the resource
Independently managed in a separate database; can be linked by identifiers
•This is the most common approach
Metadata can be found…
[Diagram: metadata records held separately from the resources they describe (book, Web resource, CD-ROM, archival object)]
Metadata for a Manuscript
Dublin Core metadata:
• identifier: http://idserver.utk.edu/?id=200600000001212
• publisher: University of Tennessee Libraries
• format: image/jpeg
• format: manuscript
• title: Letter, John Shrady in Knoxville, Tenn., to Jeannie Lockhart
• description: In this letter, dated December 25, 1863 from Knoxville, Tenn., to "My Own Darling," John Shrady, a regimental surgeon, describes his journey to Knoxville, Tenn via Chattanooga. Additionally, he discusses the probability that he will get an appointment in a hospital, pointing out that the "facilities" are not generally in good shape.
• subject: Knoxville (Tenn.) -- History -- Civil War, 1861-1865.
• creator: Shrady, John
• relation: Finding Aid: 1436; John Shrady Letters; Special Collections Library, The University of Tennessee
• rights: For rights relating to this resource, visit http://idserver.utk.edu/?id=200600000001198
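One possible way to encode part of the record above as XML is sketched here with Python's standard library. The wrapper element name "record" and the function name to_dc_xml are assumptions for illustration; the namespace URI is the Dublin Core Metadata Element Set 1.1 namespace, and only a subset of the properties is included to keep the sketch short.

```python
import xml.etree.ElementTree as ET

DC_NS = "http://purl.org/dc/elements/1.1/"
ET.register_namespace("dc", DC_NS)

# (property, value) pairs copied from the manuscript record; a property may repeat.
statements = [
    ("identifier", "http://idserver.utk.edu/?id=200600000001212"),
    ("publisher", "University of Tennessee Libraries"),
    ("format", "image/jpeg"),
    ("format", "manuscript"),
    ("title", "Letter, John Shrady in Knoxville, Tenn., to Jeannie Lockhart"),
    ("creator", "Shrady, John"),
    ("subject", "Knoxville (Tenn.) -- History -- Civil War, 1861-1865."),
]


def to_dc_xml(pairs):
    """Serialize (property, value) pairs as namespaced Dublin Core elements."""
    # "record" is a hypothetical wrapper element, not part of Dublin Core.
    root = ET.Element("record")
    for name, value in pairs:
        child = ET.SubElement(root, f"{{{DC_NS}}}{name}")
        child.text = value
    return ET.tostring(root, encoding="unicode")


xml_text = to_dc_xml(statements)
print(xml_text)
```

Keeping the statements as a list of pairs, rather than a dictionary, lets repeated properties such as format appear more than once.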
Functions of metadata
Popular Categorizations
Descriptive
Administrative
Structural
Other typologies of metadata
• Asset/Use/Subject/Relation
• Integration/Semantic
Cornell University Library & Metadata
• EAD archival finding aids with RMC & Kheel Center
• VIVO’s semantic metadata
• Publisher-supplied e-journal metadata
• MARC records in Voyager
• Project Euclid subscription-level and issue-level metadata
• VRA Core metadata with Visual Resources Collections
• TEI Lite conversion scheme with Hearth Project
• Etc.
Descriptive Metadata
• Descriptive of the intellectual content
• Discovery, identification, selection, collocation, acquisition
Sample elements:
unique identifiers (PURL, Handle)
physical attributes (media, dimensions, condition)
bibliographic attributes (title, author/creator, language, keywords)
Descriptive Metadata
Sample implementations:
• Dublin Core
• MARC
• HTML Meta Tags
_____
• EAD (Encoded Archival Description)
• TEI (Text Encoding Initiative) Header
• METS (Metadata Encoding and Transmission Standard)
• MODS (Metadata Object Description Schema), MARC-21-based
Administrative Metadata
• Technical (file size, resolution, format)
• Digital Rights Management (authentication, access)
• Preservation (provenance)
Sample elements:
• Light source
• Owner
• Copyright date
• Copying and distribution limitations
• License information
• Preservation activities
• Scanner type and model
• Image resolution
• Bit depth
• Color space
• File format
• Compression
Administrative Metadata
Sample implementations:
• MOA2, Administrative Metadata Elements
• National Library of Australia, Preservation Metadata for Digital Collections
• Preservation Metadata: Implementation Strategies (PREMIS)
• International Press Telecommunications Council (IPTC) Core
Administrative Metadata
From http://depot.northwestern.edu/~mcdough/l/ala07nrmig/metadata-inpractice-northwestern-distro.pdf
Structural Metadata
Defining components of information, like a “binder” for information objects
Sample elements:
• Title page
• Table of contents
• Chapters or parts
• Errata
• Index
• Sub-object relationship (e.g., photograph from a diary)
• Movement markings or section letters (scores)
• Track listings (audio recordings)
Structural Metadata
Sample implementations:
• Encoded Archival Description (EAD)
• MOA2, Structural Metadata Elements
• Metadata Encoding and Transmission Standard (METS)
Semantic metadata
Definition:
Semantic metadata describe contextually relevant or domain-specific information about content, based on a domain-specific metadata model (e.g., industry-specific or enterprise-specific) or ontology.
Semantic metadata annotates or enhances information
Semantic metadata
Examples:
• Cornell’s VIVO
• Semantic metadata in the business domain could be: company name, ticker symbol, industry, sector, executives, etc.
• Semantic metadata in the intelligence domain could be: terrorist name, event, location, organization, etc.
Metadata that offer greater depth and more insight about the information fall under the semantic metadata category.
Semantic metadata
BEA Systems and PeopleSoft all engage in the "competes with" relationship with Oracle
Image from http://www.semagix.com/documents/SEII.pdf
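A relationship such as the one in the slide can be held as subject, predicate, object triples. The sketch below is only an illustration of that idea: the helper name subjects_of is an assumption, and the two triples restate what the slide says rather than any actual data set.

```python
# Each triple is (subject, predicate, object), restating the slide's example.
triples = [
    ("BEA Systems", "competes with", "Oracle"),
    ("PeopleSoft", "competes with", "Oracle"),
]


def subjects_of(all_triples, predicate, obj):
    """Return every subject linked to obj by the given predicate."""
    return [s for s, p, o in all_triples if p == predicate and o == obj]


print(subjects_of(triples, "competes with", "Oracle"))
```

Because the relationship is explicit data rather than free text, a query like "who competes with Oracle?" becomes a simple filter.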
Metadata building blocks
The basic unit of metadata is a statement.
• A statement consists of a property (also called an element) and a value.
• A property is a resource that has a name and describes a specific aspect, characteristic, attribute or relation of a resource.
• Since a property is a resource, a property can have properties, but most of the time we are only really interested in the name.
• Metadata statements describe resources.
• Resources are anything that can be uniquely identified.
• A resource may be part of a web page or even a whole collection of pages.
• From DC 2006, Manzanillo, Colima, 3 Oct 2006 Kurth, Basic DC Semantics, slide 7; http://dublincore.org/resources/training/dc-2006/Tutorial1.pdf
Metadata building blocks
A specific resource together with a named property plus a value of that property for that resource is a statement
From DC 2006, Manzanillo, Colima, 3 Oct 2006, Kurth, Basic DC Semantics, slide 8; http://dublincore.org/resources/training/dc-2006/Tutorial1.pdf
Metadata building blocks
What are the properties and values in these metadata statements?
Example 1:
245 00 $a Mann Library Chats in the Stacks $h [electronic resource]
Example 2:
<title>View of Ithaca Gorge</title>
<type>Image</type>
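One way to work through the two examples is to pull each one apart into property and value pairs. The sketch below does this for the MARC field and the XML elements shown; the names Statement, statements_from_marc and statements_from_xml are assumptions for illustration, and the parsing is deliberately simple (it is not a MARC parser).

```python
import re
import xml.etree.ElementTree as ET
from typing import NamedTuple


class Statement(NamedTuple):
    prop: str   # the property (element) name
    value: str  # the value of that property


def statements_from_marc(line):
    """Split one MARC field written as 'TAG IND $a ... $h ...' into statements."""
    tag = line.split()[0]
    # Each subfield starts with '$' followed by a one-character code.
    subfields = re.findall(r"\$(\w)\s*([^$]*)", line)
    return [Statement(f"{tag}${code}", text.strip()) for code, text in subfields]


def statements_from_xml(fragment):
    """Treat each element name as a property and its text as the value."""
    # A temporary wrapper makes the fragment a single well-formed document.
    root = ET.fromstring(f"<record>{fragment}</record>")
    return [Statement(child.tag, child.text or "") for child in root]


marc_line = "245 00 $a Mann Library Chats in the Stacks $h [electronic resource]"
xml_fragment = "<title>View of Ithaca Gorge</title><type>Image</type>"

for statement in statements_from_marc(marc_line) + statements_from_xml(xml_fragment):
    print(statement.prop, "=", statement.value)
```

In the MARC example the property is the combination of tag and subfield code (245 $a is the title, 245 $h the medium); in the XML example the property is the element name.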
Metadata Scheme
Definitions:
“a set of metadata elements and rules for their uses that has been defined for a particular purpose” --(Caplan, 2003)
A set of metadata elements and the rules for using them.
“A collection of metadata elements gathered to support a function, or a series of functions (e.g., resource discovery, administration, use, etc.), for an information object.” –(Greenberg, 2005)
Metadata Schemas & Initiatives
• CDWA (Categories for the Description of Works of Art)
• Dublin Core Metadata Initiative (DCMI)
• EAD (Encoded Archival Description)
• FGDC Content Standard for Digital Geospatial Metadata (CSDGM)
• LOM (Learning Object Metadata)
• METS (Metadata Encoding and Transmission Standard)
• MIX (Metadata for Images in XML Schema)
• MODS (Metadata Object Description Schema)
• TEI (Text Encoding Initiative)
• VRA (Visual Resources Association) Core Categories
• etc.
Related to metadata schemas
Namespace is a unique place to contextualize elements and to avoid element name conflicts
• Dublin Core Metadata Element Set, Version 1.1 [http://purl.org/dc/elements/1.1/]
• Getty Art & Architecture Thesaurus [http://www.getty.edu/research/conducting_research/vocabularies/aat]
Syntax is the rules for encoding the elements or technical implementation
• XML, SGML, MARC
Content Rules define selection and representation of the elements
• Cataloging rules like AACR2
Semantics is the basic meaning of the metadata elements
• Definition of author
Application Profile
Definition: An application profile is an assemblage of metadata elements selected from one or more metadata schemas and combined in a compound schema.
• SOURCE: Duval, E., et al. Metadata Principles and Practicalities, D-Lib Magazine, April 2002, http://www.dlib.org/dlib/april02/weibel/04weibel.html
[Diagram: elements drawn from Schema 1 and Schema 2 are combined into metadata records]
Application Profile
Subsets of metadata elements implemented by a particular group
• METS profile for primary textual resources
• ETD-MS, Dublin Core for ETDs
• Particular Library’s Application Profile for Digital Collections
Application Profile
KMODDL Application Profile http://kmoddl.library.cornell.edu/aboutmeta2.php
Specifies the elements, refinements and encoding schemes used by The Kinematic Models for Design Digital Library (KMODDL) for its metadata records
Application Profile
Identifying desired metadata elements for the collection
• What are the desired elements?
• Is there an explanation and description of the element?
• Do you have an example?
• Is the implementation mandatory or optional? Repeatable?
• What common or core data is needed?
• What data do your various user groups need?
• What established data standards (e.g., MARC, EAD, CDWA) might fit the information needs of your institution?
• What data do you intend to “deliver” to your various end-user groups?
• Relationship and dependency specification
Application Profile
Decision for value spaces: content and value specifications, vocabularies
What is the element’s name and how do you define its value?
ELEMENT NAMES:
• Agent – vra.agent
• Title – vra.title
• Language – dc.language
• Collection Type – cu.collectiontype
VALUE CONTROL
Agent
• Yes, name authority: Local & LC Name Authority
• Yes, by rule: Personal: Last, M. First; Organization: Bigger unit, smaller unit
• No
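The questions about mandatory and repeatable elements can be captured in a small machine-readable profile and checked automatically. The sketch below uses the four element names from the slide; the mandatory and repeatable settings, the sample records, and the names PROFILE and validate are all assumptions made for illustration, not the actual rules of any Cornell profile.

```python
# Hypothetical profile: the element names come from the slide, the rules do not.
PROFILE = {
    "vra.agent": {"mandatory": True, "repeatable": True},
    "vra.title": {"mandatory": True, "repeatable": False},
    "dc.language": {"mandatory": False, "repeatable": True},
    "cu.collectiontype": {"mandatory": False, "repeatable": False},
}


def validate(record, profile=PROFILE):
    """Return a list of problems; record is a list of (element, value) pairs."""
    problems = []
    counts = {}
    for element, _value in record:
        if element not in profile:
            problems.append(f"unknown element: {element}")
            continue
        counts[element] = counts.get(element, 0) + 1
    for element, rules in profile.items():
        seen = counts.get(element, 0)
        if rules["mandatory"] and seen == 0:
            problems.append(f"missing mandatory element: {element}")
        if not rules["repeatable"] and seen > 1:
            problems.append(f"element not repeatable: {element}")
    return problems


# Made-up records for demonstration only.
good_record = [("vra.agent", "Last, M. First"), ("vra.title", "Sample title")]
bad_record = [("vra.title", "One"), ("vra.title", "Two"), ("dc.rights", "n/a")]

print(validate(good_record))
print(validate(bad_record))
```

Value control (name authority, rule-based formatting) would be a further check on each value and is not covered by this sketch.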
Interoperability
Facilitating interoperability
Using defined metadata schemes, shared transfer protocols, and crosswalks between schemes, resources across the network can be searched more seamlessly.
• Cross-system search, e.g., using Z39.50 protocol
• Metadata harvesting, e.g., OAI protocol
• Source: NISO. (2004) Understanding Metadata. Bethesda, MD: NISO Press, pp. 1-2.
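For metadata harvesting, an OAI-PMH request is an ordinary HTTP request whose query string names a verb and a metadata format. The sketch below only builds such a request URL and does not contact any server; the base URL is a placeholder and the function name oai_list_records_url is an assumption for this example.

```python
from urllib.parse import urlencode


def oai_list_records_url(base_url, metadata_prefix="oai_dc", set_spec=None):
    """Build an OAI-PMH ListRecords request URL for the given repository."""
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    if set_spec:
        # Optional: restrict the harvest to one set defined by the repository.
        params["set"] = set_spec
    return f"{base_url}?{urlencode(params)}"


# Placeholder repository address, not a real service.
print(oai_list_records_url("https://repository.example.org/oai"))
```

The oai_dc prefix requests unqualified Dublin Core, the one format every OAI-PMH repository is required to support, which is what makes it a common denominator for cross-repository harvesting.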
Crosswalking Metadata
Definition of crosswalk:
Technical & semantic mapping of elements from one metadata framework to another metadata framework
"a set of transformations applied to the content of elements in a source metadata standard that results in the storage of appropriately modified content in the analogous elements of a target metadata standard."
• Source: NISO White Paper, October 1998
Crosswalking Metadata
Example from http://www.niso.org/standards/resources/UnderstandingMetadata.pdf, page 12
Crosswalking Metadata
655_7 |a Photographs |2 aat
GOES TO
<genreform>Photographs</genreform>
GOES TO
<dc.type>Image</dc.type>
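The chain above combines two kinds of mapping: a structural one (which field or element the content moves to) and a value one (Photographs becomes Image). A minimal sketch of both steps follows; the function names and the lookup table are assumptions for illustration, and only the Photographs to Image pair comes from the slide.

```python
import re
from xml.sax.saxutils import escape

# Value mapping from genre terms to Dublin Core type values.
# Only "Photographs" -> "Image" comes from the example above.
GENRE_TO_DC_TYPE = {"Photographs": "Image"}


def marc655_to_ead(field):
    """Map subfield |a of a MARC 655 field to an EAD <genreform> element."""
    match = re.search(r"\|a\s*([^|]+)", field)
    if match is None:
        raise ValueError("no |a subfield found")
    return f"<genreform>{escape(match.group(1).strip())}</genreform>"


def ead_to_dc(genreform_xml):
    """Map an EAD <genreform> element to a <dc.type> element."""
    match = re.fullmatch(r"<genreform>(.*)</genreform>", genreform_xml)
    if match is None:
        raise ValueError("not a <genreform> element")
    term = match.group(1)
    # Fall back to the original term when no value mapping is known.
    return f"<dc.type>{GENRE_TO_DC_TYPE.get(term, term)}</dc.type>"


ead = marc655_to_ead("655_7 |a Photographs |2 aat")
print(ead)
print(ead_to_dc(ead))
```

Note that the source vocabulary marker (|2 aat) is dropped in the first step, a typical example of the information loss a crosswalk can introduce.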
Creation and tools
Categories of Creation Tools
• Templates
• Mark-up tools
• Extraction tools
• Conversion tools
Creation and tools
Software Specific Template
Fill in the individual values for each metadata element
Creation and tools
Examples of Metadata Creation Tools
• Dublin Core tools: http://dublincore.org/tools
• National Library of New Zealand’s Preservation Metadata Extraction Tool: http://meta-extractor.sourceforge.net/
• TEI Software: http://www.tei-c.org/Software/index.html
• Customized Templates for EAD-Encoded Finding Aids: http://www.cdlib.org/inside/projects/oac/toolkit/templates/
• EAD Tools & Helper Files: http://www.archivists.org/saagroups/ead/tools.html
Creation and tools
Examples of Metadata Creation Tools
• FGDC Metadata Tools: http://metadata.nbii.gov/portal/server.pt?open=512&objID=255&&PageID=338&mode=2&in_hi_userid=2&cached=true
• Metadata Software Tools: http://ukoln.bath.ac.uk/metadata/software-tools/
• OAI-Specific Tools: http://www.openarchives.org/tools/tools.html
• RDF Editors and Tools: http://planetrdf.com/guide/#sec-tools
[Diagram: crosswalk from MARC fields to Dublin Core elements]
• marc:008/35-37 “EN” GOES TO dc:language “EN”
• marc:100 GOES TO dc:creator
• marc:245 GOES TO dc:title
• marc:520 GOES TO dc:description
• marc:653 (repeated) GOES TO dc:subject [uncontrolled vocabulary]
• marc:650 (repeated) GOES TO dc:subject [encoding: msc2000]
• dc:identifier [unique string]
• dc:format “text/pdf”
Above: “Scan ID #00467” from the Billie Jean Isbell Andean Collection
Exercise 2: Metadata for Image Files
Title: Qoricancha exterior wall
Date: 1981
Subjects: Archaeological sites
Subject (local): Inka astronomy; Temple of the Sun; Inka observations of the zenith passage of the sun
Culture: Quechua
Locality: Coricancha Temple Site; Qoricancha Temple Site; Qorikancha Temple Site; Temple of the Sun Site
Region: Peru
Province: Cuzco
Description: The exterior wall of the Temple of the Sun in Cuzco.
Citation: Billie Jean Isbell Andean Collection, Images from the Andes: Collection Highlights: Zenith
Citation: Isbell, Billie Jean. “Culture Confronts Nature in the Dialectical World of the Tropics.” In Ethno-Astronomy and Archaeo-Astronomy in the American Tropics, edited by A. F. Aveni and G. Urton. Annals of the New York Academy of Sciences, vol. 385 (1982): 353-363. URL: http://hdl.handle.net/1813/2193
Image Identifier: ISB_00467
Production: digital imaging, Digital Consulting and Production Services, Cornell University Library (Ithaca, NY, USA)
Production: photographers, Isbell, Billie Jean
Image Copyright: This digital collection and its contents are owned and operated by the Cornell University Library. Digital reproductions are provided for private study, scholarship, and research use only and may not be downloaded for use in electronic or print publications (including websites), exhibitions or broadcasts, without permission. For more information, see: Cornell University Library Copyright Statement.
Collection: Billie Jean Isbell Andean Collection
Exercise 2: Final Output in Luna Web Interface