Introduction to Metadata Mary Manning-Texas A&M University Karen Sigler-Texas State University


TDL Course Description

• This introductory course provides students with an understanding of descriptive metadata through hands-on experience in creating descriptive metadata records for digital objects.

• Among the topics covered in the course are:
– Overview of descriptive metadata
– Outline of community-specific metadata standards
– Nuts and bolts of Dublin Core
– Hands-on creation of Dublin Core records for images, audio, text, and ETDs

Agenda (Housekeeping)

• 9:00-9:30 Introductions
• 9:30-10:15 Overview of Descriptive Metadata
• 10:15-10:30 Break
• 10:30-11:15 Community Specific Metadata
• 11:15-11:45 Dublin Core
• 11:45-12:00 Metadata & Traditional Cataloging
• 12:00-1:30 Lunch (on your own)
• 1:30-3:00 Hands on exercises
• 3:00-3:15 Break
• 3:15-4:00 Review and questions

Introductions

• Name
• Institution
• Job Title/Duties
• Why you took this workshop
• Your level of metadata understanding
• Your goals for this workshop
• Does your institution use DSpace or CONTENTdm (CDM)? If not, how do you describe and display digital resources?

Overview of Descriptive Metadata

Definition of Metadata

• The Oxford English Dictionary defines it as: “a set of data that describes and gives information about other data.”

• The first definition appeared in 1968 (44 years ago): “There are categories of information about each data set as a unit in a data set of data sets, which must be handled as a special metadata set.”

Definition cont’d

• Additional entries in 1970, 1977, 1987 and 1998, but the 1987 definition still has relevance today: “The challenge is to accumulate data . . . from diverse sources, convert it to machine-readable form with a harmonized array of metadata descriptors and present the resulting database(s) to the user.”

• From NISO 2004: “Metadata is structured information that describes, explains, locates, or otherwise makes it easier to retrieve, use or manage an information resource. Metadata is often called data about data or information about information”.

Another way of looking at it:

Cataloging = Creating Metadata

[Diagram: MARC (Library Catalog) alongside XML (Digital Library)]

Basic Types of Metadata

• Administrative metadata: metadata used in managing and administering collections and information resources

Examples:
– acquisition information
– rights and reproduction tracking
– documentation of legal access requirements
– selection criteria for digitization
– location information

Administrative Metadata (cont’d)

• Information about dates of digitization and technical specifics, along with identifiers and digital file names.

• May be important locally to include in descriptive metadata, but not useful in an aggregated context.

• In an aggregated context it adds noise to the metadata and increases the number and ambiguity of data values.

• Better to hide this information from public view

Structural Metadata

• Used primarily to facilitate navigation and presentation of electronic content, for example, multiple views of the same object, such as front, back and side views of a sculpture captured in separate digital image files

• Or a single book scanned as multiple image files, where structural metadata allows the user to jump to different parts of the book

• http://contentdm.adelphi.edu/cdm4/document.php?CISOROOT=/oracles&CISOPTR=22075&REC=1

Descriptive Metadata

• Provides intellectual access to the contents of a digital collection

• Two basic functions: identification and retrieval

• Supports both the searching and browsing methods of information retrieval

• Used for describing and identifying information resources. These could include title, creator, description, language, geographical place names, and so on

• Must be entered in a consistent form, usually taken from a controlled vocabulary, to ensure consistent information retrieval
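
To make the point about consistent form concrete, here is a minimal Python sketch of checking an entered subject term against a controlled vocabulary. The three-term vocabulary and the function name `normalize_subject` are assumptions made for illustration only; a real project would load an authority list such as LCSH.

```python
# Tiny stand-in vocabulary (an assumption for illustration, not a real authority file).
CONTROLLED_SUBJECTS = {
    "Attention",
    "Distraction (Psychology)",
    "Galvanic skin response",
}


def normalize_subject(raw, vocabulary=CONTROLLED_SUBJECTS):
    """Return the authorized form of a term, or None if it is not in the vocabulary."""
    # Index the vocabulary by a case-insensitive key.
    lookup = {term.casefold(): term for term in vocabulary}
    # Collapse runs of whitespace, drop a trailing period, ignore case.
    key = " ".join(raw.split()).rstrip(".").casefold()
    return lookup.get(key)


print(normalize_subject("  attention. "))     # authorized form
print(normalize_subject("skin conductance"))  # not in the vocabulary
```

Terms that come back as `None` would be candidates for review by a cataloger rather than being silently accepted as free-text keywords.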

Preservation Metadata

• Information needed for the long-term preservation of the digital object and migration to other digital formats as software and hardware change over time

• For example: type of scanner used, original scanning resolution, image editing specifications

• Sometimes put under the umbrella of administrative metadata. Used to record information on the administration of the data in order to assure the continuity and authenticity of the digital object.

• Tracks the condition of the physical or digital forms and any actions taken to preserve them

Break

Community Specific Metadata

Metadata Strategies

• Variety of metadata schemas, content standards, and controlled vocabularies

• Must choose the best for your project based on a variety of factors

• Subjective in nature

Metadata Creation

– What are you describing?
– What are the characteristics of the collection you are describing?
– Who are your users?
– Who will create metadata?
– How are you going to use your metadata?
– Will you be providing and/or harvesting your metadata in the future?
– How does your system work with certain metadata?

Context

• Metadata is first and foremost created for use in its local context, but you need to think about uses of your metadata in a federated environment

• Once local metadata is shared by your institution with outside repositories, it becomes exposed to automatic harvesting and aggregation

• This is why it is important to map local metadata elements or fields to Dublin Core elements
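
A mapping of this kind is often called a crosswalk. The following Python sketch shows one possible shape for it. The local field names ("Photographer", "Caption", and so on), the sample record, and the function name `crosswalk` are all hypothetical; only the Dublin Core element names on the right-hand side come from the standard.

```python
# Hypothetical local field names mapped to Simple Dublin Core elements.
LOCAL_TO_DC = {
    "Photographer": "creator",
    "Caption": "description",
    "Date Taken": "date",
    "Place": "coverage",
}


def crosswalk(local_record, mapping=LOCAL_TO_DC):
    """Return (dc_record, unmapped_fields) for one local record."""
    dc_record = {}
    unmapped = []
    for field, value in local_record.items():
        target = mapping.get(field)
        if target is None:
            # Local-only fields (for example administrative notes) are not shared.
            unmapped.append(field)
        else:
            dc_record.setdefault(target, []).append(value)
    return dc_record, unmapped


# Made-up sample record, for illustration only.
local = {
    "Photographer": "Doe, Jane",
    "Caption": "Front view of a sculpture",
    "Scan Operator": "Student worker 3",
}
shared, local_only = crosswalk(local)
print(shared)
print(local_only)
```

Keeping the unmapped fields in a separate list makes it easy to review which local information stays out of the harvested record.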

*Graphic modified from Steven Miller’s “Metadata Resources” https://pantherfile.uwm.edu/mll/www/resource.html

Metadata Standards

Content Standards
Rules, guidelines, best practices for element content
– AACR2
– CCO (Cataloging Cultural Objects)
– RDA

Structure Standards
Metadata schemas, elements
– MARC
– Dublin Core
– MODS

Encoding Standards
For machine readability, communication, and exchange
– MARC
– XML

Presentation Standards
For display to users
– OPAC
– XSLT/CSS

Value Standards
Controlled vocabularies for the values of elements
– LCSH
– TGN
– AAT

Community Specific Metadata

• MODS (Metadata Object Description Schema): Education
• VRA Core (Visual Resources Association): Arts
• EAD (Encoded Archival Description): Finding Aid
• CDWA (Categories for the Description of Works of Art)
• DDI (Data Documentation Initiative): Social, Behavioral Sciences
• TDL ETD MODS Application Profile (Texas Digital Library): Electronic Theses and Dissertations MODS Application http://www.tdl.org/wp-content/uploads/2009/04/tdl-descriptive-metadata-guidelines-for-etd-v1.pdf

Designing a Metadata Scheme

• Examine context, content, users
• Determine functional requirements
• Select and develop an element set
• Establish element and database specifications
• Establish controlled vocabulary and encoding schemes
• Develop content guidelines
• Document the scheme

Documenting a metadata scheme (or a rose by any other name….)

• A critical aspect of metadata design

• Common names for metadata scheme documentation include metadata/user guidelines, usage guide, best practice guide, data dictionaries, and application profiles

• These documents range in formality and specificity, from a simple table to a document of a hundred or so pages

Community Specific Metadata Exercise (If We Have Enough Time…)

• Which community(ies) uses this standard?
• What type(s) of digital objects are being described with this standard?
• What is the encoding standard used (XML, MARC, HTML)?
• Are there any content standards used with this standard?
• Are there any value standards used with this standard?

Dublin Core

Background on Dublin Core

• Created in 1995 in Dublin, Ohio

• Composed of a small, simple set of elements for describing resources

• Critical for OAI-PMH harvesting of digital objects (Open Archives Initiative Protocol for Metadata Harvesting)

• Intended to be used for a wide variety of digital objects across different platforms
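
To show where Dublin Core enters an OAI-PMH harvest, here is a small Python sketch that builds a `ListRecords` request asking for records in the `oai_dc` (Simple Dublin Core) format. The `verb`, `metadataPrefix` and `set` parameters are part of the OAI-PMH protocol; the base URL, the set name "etd", and the function name are placeholders, not real endpoints.

```python
from urllib.parse import urlencode


def list_records_url(base_url, metadata_prefix="oai_dc", set_spec=None):
    """Build an OAI-PMH ListRecords request URL (nothing is sent over the network)."""
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    if set_spec is not None:
        # Optional: limit the harvest to one set (collection) of the repository.
        params["set"] = set_spec
    return f"{base_url}?{urlencode(params)}"


# Placeholder repository address, for illustration only.
url = list_records_url("https://repository.example.edu/oai/request")
print(url)
```

A harvester would fetch this URL and receive XML in which each record is described with Dublin Core elements.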

Dublin Core (cont’d)

• Has since been generally adopted by libraries and other cultural heritage institutions to describe digital collections

• CONTENTdm and DSpace are both commonly used platforms for digital content; both use Dublin Core elements for description

• Although Dublin Core started out as a simple set of elements, it became obvious that a more complex structure was necessary for meaningful description, leading to Qualified Dublin Core

Dublin Core: Original 15 Elements(Simple Dublin Core)

1. Title
2. Creator
3. Subject
4. Description
5. Publisher
6. Contributor
7. Date
8. Type
9. Format
10. Identifier
11. Source
12. Language
13. Relation
14. Coverage
15. Rights
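
As an illustration of what a Simple Dublin Core record can look like in XML, the Python sketch below builds one with the standard library. The namespace URI is the DCMI elements namespace; the `record` wrapper element and the function name `build_dc_record` are assumptions for this example, and the sample values are taken from the thesis example used elsewhere in these slides.

```python
import xml.etree.ElementTree as ET

DC_NS = "http://purl.org/dc/elements/1.1/"
ET.register_namespace("dc", DC_NS)

SIMPLE_DC_ELEMENTS = (
    "title", "creator", "subject", "description", "publisher",
    "contributor", "date", "type", "format", "identifier",
    "source", "language", "relation", "coverage", "rights",
)


def build_dc_record(fields):
    """Return a <record> element with one dc:* child per value.

    `fields` maps an element name to a list of values, because every
    Dublin Core element is repeatable.
    """
    record = ET.Element("record")  # wrapper name is an assumption
    for name, values in fields.items():
        if name not in SIMPLE_DC_ELEMENTS:
            raise ValueError(f"not a Simple Dublin Core element: {name}")
        for value in values:
            child = ET.SubElement(record, f"{{{DC_NS}}}{name}")
            child.text = value
    return record


example = build_dc_record({
    "title": ["Automatic and controlled processes regulating "
              "attentional distraction by emotional stimuli"],
    "creator": ["Galindo, Juliette"],
    "subject": ["Attention", "Galvanic skin response"],
    "date": ["2010-08-23"],
})
print(ET.tostring(example, encoding="unicode"))
```

All fifteen elements are optional and repeatable, which is why the sketch accepts any subset and a list of values for each.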

Qualified Dublin Core: Fields with Element Refinement(s)

• Title: Alternative
• Description: Table Of Contents, Abstract
• Date: Created, Valid, Available, Issued, Modified
• Format: Extent, Medium
• Relation: Is Version Of, Has Version, Is Replaced By, Replaces, Is Required By, Requires, Is Part Of, Has Part, Is Referenced By, References, Is Format Of, Has Format
• Coverage: Temporal, Spatial
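
The refinements listed above can be written down as a lookup table and used to check field names. The Python sketch below assumes the dotted `dc.element.refinement` naming that appears in the examples later in these slides (for instance `dc.date.issued`); the function name `split_qualified` is made up for illustration, and the sketch checks only the refinement, not the element itself.

```python
# Element refinements from the slide, spelled as DCMI term names.
REFINEMENTS = {
    "title": {"alternative"},
    "description": {"tableOfContents", "abstract"},
    "date": {"created", "valid", "available", "issued", "modified"},
    "format": {"extent", "medium"},
    "relation": {
        "isVersionOf", "hasVersion", "isReplacedBy", "replaces",
        "isRequiredBy", "requires", "isPartOf", "hasPart",
        "isReferencedBy", "references", "isFormatOf", "hasFormat",
    },
    "coverage": {"temporal", "spatial"},
}


def split_qualified(name):
    """Split 'dc.date.issued' into ('date', 'issued').

    The refinement is None for an unqualified name such as 'dc.title'.
    """
    parts = name.split(".")
    if len(parts) not in (2, 3) or parts[0] != "dc":
        raise ValueError(f"expected dc.element or dc.element.refinement: {name}")
    element = parts[1]
    refinement = parts[2] if len(parts) == 3 else None
    if refinement is not None and refinement not in REFINEMENTS.get(element, set()):
        raise ValueError(f"{refinement!r} is not a refinement of {element!r}")
    return element, refinement


print(split_qualified("dc.date.issued"))
print(split_qualified("dc.title"))
```

Dropping the refinement and keeping only the element is one way to present a qualified record to a system that understands only the original 15 elements.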

Qualified Dublin Core: Fields with Element Encoding Scheme(s)

• Identifier: URI
• Date: DCMI Period, W3C-DTF
• Language: ISO 639-2 and 639-3, RFC 1766 and 4646
• Subject: LCSH, MeSH, DDC, LCC, UDC

Qualified Dublin Core: Fields with Element Encoding Scheme(s) cont’d

• Coverage.Spatial: DCMI Point, ISO 3166, DCMI Box, TGN
• Coverage.Temporal: DCMI Period, W3C-DTF
• Relation: URI
• Source: URI
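
An encoding scheme tells a machine how to read a value. As one example, W3C-DTF writes dates as YYYY, YYYY-MM, or YYYY-MM-DD. The Python sketch below checks those three date-only forms; W3C-DTF also allows times and time zones, which this simplified check deliberately leaves out. The function name is an assumption for illustration.

```python
import re
from datetime import date

# Date-only W3C-DTF granularities: YYYY, YYYY-MM, YYYY-MM-DD.
_W3CDTF_DATE = re.compile(r"(\d{4})(?:-(\d{2})(?:-(\d{2}))?)?")


def is_w3cdtf_date(value):
    """Return True if value is a date-only W3C-DTF string naming a real calendar date."""
    match = _W3CDTF_DATE.fullmatch(value)
    if match is None:
        return False
    year, month, day = match.groups()
    try:
        # Missing month or day defaults to 1 just for the calendar check.
        date(int(year), int(month or 1), int(day or 1))
    except ValueError:
        return False
    return True


for candidate in ("2010-08-23", "2010-08", "2010", "08/23/2010", "2010-02-30"):
    print(candidate, is_w3cdtf_date(candidate))
```

A check like this can run at data entry time, so that dates such as 08/23/2010 are caught before they reach an aggregator.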

Metadata and Traditional Cataloging

Making the Connection Between Cataloging and Dublin Core

• Moving from MARC to Dublin Core is painless
• Same principle
– Accurate description
– Access
– Machine readable across different systems/platforms

Mapping from MARC record to Dublin Core

MARC

• Title
245 10 Automatic and controlled processes regulating attentional distraction by emotional stimuli|h[electronic resource] /|cby Juliette Galindo.

• Author
100 1 Galindo, Juliette.

• Subject
650 0 Distraction (Psychology)
650 0 Attention.
650 0 Galvanic skin response.

Dublin Core

• Title
dc.Title Automatic and Controlled Processes regulating attentional distraction by emotional stimuli

• Author
dc.Creator Galindo, Juliette

• Subject
dc.Subject Emotion regulation, attentional distraction, skin conductance response (SCR), negativity bias.

• Subject
dc.subject.lcsh
Galvanic skin response
Attention
Distraction (Psychology)

*When moving from MARC to Dublin Core, Library of Congress Subject headings need to be designated with the suffix .lcsh to differentiate them from keywords.
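
The mapping on this slide can be sketched in a few lines of Python. The tag-to-field table covers only the three MARC tags shown here, and the helper names are made up for illustration. Taking the text before the first "|" as the main subfield and trimming trailing punctuation are simplifying assumptions; a production crosswalk would handle indicators, more tags, and abbreviations that end in a period.

```python
# Only the three tags from the slide; 650 headings get the .lcsh suffix.
MARC_TO_DC = {
    "245": "dc.title",
    "100": "dc.creator",
    "650": "dc.subject.lcsh",
}


def main_subfield(field_text):
    """Return the text before the first '|', without trailing ISBD punctuation."""
    first = field_text.split("|", 1)[0]
    return first.rstrip(" /.,:;")


def marc_to_dc(fields):
    """Convert (tag, text) pairs into a dict of Dublin Core field -> list of values."""
    record = {}
    for tag, text in fields:
        dc_name = MARC_TO_DC.get(tag)
        if dc_name is None:
            continue  # tags outside the table are skipped in this sketch
        record.setdefault(dc_name, []).append(main_subfield(text))
    return record


marc_fields = [
    ("245", "Automatic and controlled processes regulating attentional "
            "distraction by emotional stimuli|h[electronic resource] /"
            "|cby Juliette Galindo."),
    ("100", "Galindo, Juliette."),
    ("650", "Distraction (Psychology)"),
    ("650", "Attention."),
    ("650", "Galvanic skin response."),
]
dc_record = marc_to_dc(marc_fields)
for name, values in dc_record.items():
    print(name, values)
```

Author-supplied keywords, such as those in dc.Subject above, have no MARC 650 source and would be added to the record separately.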

MARC to Dublin Core

MARC

• For granularity of description, MARC uses multiple subfields for various entries.

• Texas State University--San Marcos.|bDept. of Psychology |vTheses|y2010.

Dublin Core

• In a Dublin Core record, the same information is put into multiple fields.

• dc.type.genre thesis
• thesis.degree.grantor Texas State University
• thesis.degree.department Psychology
• dc.date.issued 2010-08-23
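
To see the subfield granularity in the MARC example, the following Python sketch splits the field text on the "|" delimiter used in this display. Treating the undelimited first part as subfield a is an assumption about the display format, and the function name is hypothetical.

```python
def parse_subfields(field_text, first_code="a"):
    """Split 'text|bmore|vmore' into a list of (subfield code, value) pairs."""
    chunks = field_text.split("|")
    # The first chunk carries no visible code in this display; assume subfield a.
    pairs = [(first_code, chunks[0].strip())]
    for chunk in chunks[1:]:
        if chunk:
            pairs.append((chunk[0], chunk[1:].strip()))
    return pairs


field = "Texas State University--San Marcos.|bDept. of Psychology |vTheses|y2010."
for code, value in parse_subfields(field):
    print(code, value)
```

Each subfield is a candidate for its own Dublin Core field, although values usually need editing on the way: the slide's record shortens "Dept. of Psychology" to "Psychology" and gives a full date rather than the year alone.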

Lunch

Metadata Exercises—IMAGE

http://goo.gl/mfL8G

Metadata Exercises—TEXT

• Baylor Lariat, Vol. 18, Issue 1, 09/21/1916 (student newspaper), Pages 1-4 - http://contentdm.baylor.edu/cdm4/document.php?CISOROOT=/24lariat&CISOPTR=5179&REC=1
• Book of Poetry - http://contentdm.baylor.edu/cdm4/document.php?CISOROOT=/09ablwpc&CISOPTR=20624&REC=2
• Letter - http://digital.lib.uh.edu/cdm4/document.php?CISOROOT=/p15195coll9&CISOPTR=179&REC=1

Contact Us

Mary Manning
Assistant University Archivist
Texas A&M University
[email protected]

Karen Sigler
Special Collections Cataloging Librarian
Texas State University – San Marcos
[email protected]