METS: Metadata Encoding & Transmission Standard

November 22, 2003. DASER Conference. Copyright MIT, 2003

Part One: Problem definition

Digital (Library) Objects

• Reformatted to digital
    • scanned photographs, books and journals
    • digitized audio/video files
• “Born digital”
    • TEI-encoded texts
    • digital images, audio, video files
    • GIS, statistical datasets
    • interactive content

Digital (Library) Objects

• Simple Objects
    – single files, e.g.
        • visual TIFF images
        • MP3 files
        • TEI-encoded text
    – objects stand alone
        • no relationships to other objects

Digital (Library) Objects

• Complex Objects
    – multiple related files, e.g.
        • page images from books or articles
        • multiple channels in digital audio files
        • related sound and text files (multimedia)
        • statistical dataset and codebook
    – objects cannot stand alone
        • multiple files required to interpret the object
        • requires structural metadata to model

Structural metadata

• Maps physical files (digital assets) to logical items (complex digital objects)
• Examples
    – Scanned print material
        • complex publication structures (e.g. journal runs)
        • ordered relationship between digital page images
    – A/V material
        • multiple resolutions of an image
        • multiple channels of an audio file

Structural metadata

• Examples, continued
    – Multimedia presentations
        • relationship between images, text, sound, video, etc. (time-based or other)
    – Web sites
        • linkages between web pages
        • sitemaps
    – Databases
        • table models and ER diagrams

Digital (Library) Objects

• Also have other (non-structural) metadata
    – descriptive
        • MARC, DC, FGDC, VRA core, other ontologies
    – administrative
        • rights, provenance
    – technical
        • format details, OAIS “representation information”
• Standards exist or are emerging for these

Part Two: Introduction to METS

METS Scope

• Supports
    – Structural metadata
        • complex reformatted or born digital objects
    – Metadata wrapper framework
        • descriptive, administrative, structural, etc.
        • structural required
        • others use namespaces to reference “extension schemas”

Brief History

• 1997-2001 Making Of America II project
    – Funded by DLF and NEH
    – Included Berkeley, Cornell, NYPL, Penn State, Stanford, U of Michigan
    – Designed for scanned archival collections
    – SGML DTD included pre-defined descriptive, administrative, structural metadata
• February 2001 DLF workshop on structural metadata produced METS framework

METS metadata “buckets”

• METS Header (optional)
• Descriptive metadata (optional)
• Administrative metadata (optional)
• File Inventory (optional)
• Structure map (required)
• Behavioral metadata (optional)
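Because only the structure map is required, the smallest useful METS document is a root element holding a single structural map. The sketch below builds one with Python's standard library and checks for the required section. The namespace URI and the element names (`mets`, `structMap`, `div`) follow the published METS schema as I understand it; the label and the helper names are hypothetical.

```python
import xml.etree.ElementTree as ET

# Namespace URI assumed from the published METS schema.
METS_NS = "http://www.loc.gov/METS/"
ET.register_namespace("mets", METS_NS)


def q(tag: str) -> str:
    """Return the namespace-qualified form of a METS element name."""
    return f"{{{METS_NS}}}{tag}"


def minimal_mets(label: str) -> ET.Element:
    """Build a METS root that holds only the required structural map."""
    root = ET.Element(q("mets"))
    struct_map = ET.SubElement(root, q("structMap"))
    ET.SubElement(struct_map, q("div"), {"LABEL": label})
    return root


def missing_required(root: ET.Element) -> list:
    """List required sections that are absent (only structMap is required)."""
    return [name for name in ("structMap",) if root.find(q(name)) is None]


doc = minimal_mets("Hypothetical object")  # label is illustrative only
print(ET.tostring(doc, encoding="unicode"))
print(missing_required(doc))
```

The other five buckets can be added to the same root as needed without changing this check.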

METS metadata

• XML “extension schemas”
    – descriptive metadata
        • Dublin Core, MARC, FGDC, VRA, etc.
        • Berkeley’s GDM schema (from MOA2)
    – administrative/technical metadata
        • NISO image technical metadata
        • LC schemas for A/V technical metadata
        • Rights metadata (e.g. PRISM, XrML, etc.)
        • Provenance metadata

METS Descriptive Metadata Section

[Diagram: Descriptive Metadata, containing Metadata Reference and Metadata Wrapper]

Metadata Reference (mdRef): A link to external descriptive metadata. The type of link (URN/Handle/etc.) is included as an attribute, as is the metadata type.

Metadata Wrapper (mdWrap): Included descriptive metadata, as either binary data (Base64 encoded) or arbitrary XML using namespace mechanism. The metadata type is specified as an attribute.
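The two options can be sketched side by side: one descriptive section wraps a Dublin Core title inline, the other only points to an external record. This is a minimal illustration, not a complete record. The `xmlData` child, the `MDTYPE`/`LOCTYPE` values and the XLink namespace URI are taken from the published METS schema as I understand it and should be checked against the schema version in use; the IDs, title and URN are hypothetical.

```python
import xml.etree.ElementTree as ET

METS = "http://www.loc.gov/METS/"
DC = "http://purl.org/dc/elements/1.1/"
XLINK = "http://www.w3.org/1999/xlink"  # assumed; early METS versions may differ
for prefix, uri in (("mets", METS), ("dc", DC), ("xlink", XLINK)):
    ET.register_namespace(prefix, uri)


def wrapped_dmd(section_id: str, title: str) -> ET.Element:
    """Descriptive section that embeds Dublin Core XML via mdWrap."""
    dmd = ET.Element(f"{{{METS}}}dmdSec", {"ID": section_id})
    wrap = ET.SubElement(dmd, f"{{{METS}}}mdWrap", {"MDTYPE": "DC"})
    xml_data = ET.SubElement(wrap, f"{{{METS}}}xmlData")
    ET.SubElement(xml_data, f"{{{DC}}}title").text = title
    return dmd


def referenced_dmd(section_id: str, href: str) -> ET.Element:
    """Descriptive section that links to an external record via mdRef."""
    dmd = ET.Element(f"{{{METS}}}dmdSec", {"ID": section_id})
    ET.SubElement(
        dmd,
        f"{{{METS}}}mdRef",
        {"LOCTYPE": "URN", "MDTYPE": "MARC", f"{{{XLINK}}}href": href},
    )
    return dmd


wrapped = wrapped_dmd("dmd1", "Hypothetical title")
referenced = referenced_dmd("dmd2", "urn:x-example:record42")  # made-up URN
print(ET.tostring(wrapped, encoding="unicode"))
print(ET.tostring(referenced, encoding="unicode"))
```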

METS Administrative Metadata Section

[Diagram: Administrative Metadata, containing Technical Metadata, IP Rights Metadata, Source Metadata and Preservation Metadata]

Technical Metadata (techMD): technical metadata regarding content files

IP Rights Metadata (rightsMD): rights metadata regarding content files or primary source material

Source Metadata (sourceMD): provenance information for content files.

Preservation Metadata (preservationMD; named digiprovMD in the published METS schema): metadata to assist in preservation of digital content

All sections use generic metadata reference and wrapper subelements.
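Since every administrative subsection uses the same generic wrapper, one helper can build any of them. The sketch below assumes the section names of the published METS schema, where the preservation section is called `digiprovMD` rather than `preservationMD`. The `<note>` payload, the IDs and the `admin_section` helper are hypothetical and stand in for a real extension schema.

```python
import xml.etree.ElementTree as ET

METS = "http://www.loc.gov/METS/"
ET.register_namespace("mets", METS)

# Section names as in the published schema (assumption: digiprovMD
# corresponds to the "preservation metadata" section on the slide).
SECTIONS = ("techMD", "rightsMD", "sourceMD", "digiprovMD")


def admin_section(kind: str, section_id: str, note: str) -> ET.Element:
    """Build one administrative subsection wrapping a hypothetical <note>."""
    if kind not in SECTIONS:
        raise ValueError(f"unknown administrative section: {kind}")
    section = ET.Element(f"{{{METS}}}{kind}", {"ID": section_id})
    wrap = ET.SubElement(section, f"{{{METS}}}mdWrap", {"MDTYPE": "OTHER"})
    xml_data = ET.SubElement(wrap, f"{{{METS}}}xmlData")
    ET.SubElement(xml_data, "note").text = note  # placeholder payload
    return section


amd = ET.Element(f"{{{METS}}}amdSec")
amd.append(admin_section("techMD", "tech1", "600 dpi, 24-bit colour"))
amd.append(admin_section("rightsMD", "rights1", "Public domain"))
print(ET.tostring(amd, encoding="unicode"))
```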

METS File Inventory

[Diagram: File Inventory (File Group), containing nested File Groups, each holding File elements, etc.]

File Group (fileGrp): provides mechanism for hierarchically subdividing physical files, for example by type

File (file): provides a pointer to an external file (FLocat) or includes file content internally (FContent) in Base64 encoding
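A small sketch of how nested file groups might be read back: it parses a hypothetical file inventory and lists each file with its group path and location. The `fileSec` root name and the `xlink:href` attribute on `FLocat` are assumptions based on the published METS schema; the group IDs and example.org URLs are made up.

```python
import xml.etree.ElementTree as ET

NS = {"m": "http://www.loc.gov/METS/", "xl": "http://www.w3.org/1999/xlink"}

# Hypothetical inventory: masters and thumbnails grouped under "images".
FILE_SEC = """
<fileSec xmlns="http://www.loc.gov/METS/"
         xmlns:xlink="http://www.w3.org/1999/xlink">
  <fileGrp ID="images">
    <fileGrp ID="masters">
      <file ID="f1" MIMETYPE="image/tiff">
        <FLocat LOCTYPE="URL" xlink:href="http://example.org/page1.tif"/>
      </file>
    </fileGrp>
    <fileGrp ID="thumbnails">
      <file ID="f2" MIMETYPE="image/jpeg">
        <FLocat LOCTYPE="URL" xlink:href="http://example.org/page1.jpg"/>
      </file>
    </fileGrp>
  </fileGrp>
</fileSec>
"""


def inventory(group, path=()):
    """Yield (group path, file ID, location) for every file under a fileGrp."""
    here = path + (group.get("ID"),)
    for f in group.findall("m:file", NS):
        loc = f.find("m:FLocat", NS)
        yield here, f.get("ID"), loc.get(f"{{{NS['xl']}}}href")
    for sub in group.findall("m:fileGrp", NS):
        yield from inventory(sub, here)


root = ET.fromstring(FILE_SEC)
rows = [row for g in root.findall("m:fileGrp", NS) for row in inventory(g)]
for row in rows:
    print(row)
```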

METS Structural Map

[Diagram: Structural Map, containing nested Divisions; each Division may hold a METS Pointer and a File Pointer, etc.]

The Structural Map provides a tree structure describing the original document. Each division (div) element is a node in that tree, and can identify content files associated with that division by a METS Pointer (mptr) or a File Pointer (fptr).

METS Pointer and File Pointer

METS Pointer (mptr): xlink to another METS file containing the content for the associated div. Useful for breaking up large objects (e.g., a journal run) into a series of smaller METS documents.

File Pointer (fptr): Identifies one or more entries in the File Inventory section containing the content for the associated div element. Can also limit the link from a div element to a portion of a content file (e.g., a segment of an audio or video file, a subarea of an image or video file, etc.).
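To make the tree concrete, the sketch below walks a hypothetical structural map and reports, for each div, its depth, label, file pointers and METS pointers. Volume 1 is delegated to a separate METS document through `mptr`, while Volume 2 lists its pages directly through `fptr`. The `FILEID` and `xlink:href` attribute names are assumptions based on the published METS schema; the labels and URL are invented.

```python
import xml.etree.ElementTree as ET

NS = {"m": "http://www.loc.gov/METS/"}
XLINK_HREF = "{http://www.w3.org/1999/xlink}href"

# Hypothetical journal run: one volume delegated via mptr, one inline.
STRUCT_MAP = """
<structMap xmlns="http://www.loc.gov/METS/"
           xmlns:xlink="http://www.w3.org/1999/xlink">
  <div TYPE="journal" LABEL="Hypothetical Journal">
    <div TYPE="volume" LABEL="Volume 1">
      <mptr LOCTYPE="URL" xlink:href="http://example.org/vol1-mets.xml"/>
    </div>
    <div TYPE="volume" LABEL="Volume 2">
      <div TYPE="page" LABEL="Page 1"><fptr FILEID="f1"/></div>
      <div TYPE="page" LABEL="Page 2"><fptr FILEID="f2"/></div>
    </div>
  </div>
</structMap>
"""


def outline(div, depth=0):
    """Yield (depth, label, file IDs, METS links) for a div and its children."""
    files = [p.get("FILEID") for p in div.findall("m:fptr", NS)]
    links = [p.get(XLINK_HREF) for p in div.findall("m:mptr", NS)]
    yield depth, div.get("LABEL"), files, links
    for child in div.findall("m:div", NS):
        yield from outline(child, depth + 1)


root = ET.fromstring(STRUCT_MAP)
nodes = list(outline(root.find("m:div", NS)))
for depth, label, files, links in nodes:
    print("  " * depth + label, files, links)
```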

METS File Pointer Mechanisms

[Diagram: File Pointer, containing Parallel Files and Sequential Files, each holding Area elements]

File Pointer (fptr): Can identify a single file in File Inventory using ID/IDREF linking

Parallel/Sequential (par/seq): Allows a div to be associated with several content files that should be played/displayed in parallel (video with separate audio track file) or sequentially.

Area (area): identifies a point, linear segment, or 2D area within content file that corresponds with associated div element.
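One way to read par and seq is as a playback plan: a par group becomes one step in which all files are presented together, and a seq group becomes one step per file. The sketch below applies that reading to a hypothetical div. `FILEID` is the attribute name I believe the published schema uses (the slides abbreviate it as FILE); the file IDs and the `playback_steps` helper are invented.

```python
import xml.etree.ElementTree as ET

NS = {"m": "http://www.loc.gov/METS/"}

# Hypothetical div: a video with a separate audio track, then two pages.
FPTRS = """
<div xmlns="http://www.loc.gov/METS/" LABEL="Hypothetical interview">
  <fptr>
    <par>
      <area FILEID="video1"/>
      <area FILEID="audio1"/>
    </par>
  </fptr>
  <fptr>
    <seq>
      <area FILEID="page1"/>
      <area FILEID="page2"/>
    </seq>
  </fptr>
</div>
"""


def playback_steps(fptr):
    """Return a list of steps; each step lists file IDs presented together."""
    steps = []
    for group in fptr:
        ids = [a.get("FILEID") for a in group.findall("m:area", NS)]
        if group.tag.endswith("}par"):
            steps.append(ids)  # one simultaneous step
        elif group.tag.endswith("}seq"):
            steps.extend([i] for i in ids)  # one step per file, in order
    return steps


div = ET.fromstring(FPTRS)
plans = [playback_steps(f) for f in div.findall("m:fptr", NS)]
print(plans)
```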

METS Area Element Attributes

FILE: ID for File element in File Inventory
SHAPE: As in HTML Area element
COORDS: As in HTML Area element
BEGIN: A start point within a file for defining a segment
END: An end point within a file for defining a segment
BETYPE: Begin/End type: IDREF, Byte Offset, or SMPTE time code
EXTENT: Length/Duration of Segment
EXTYPE: Extent Type: Bytes, or SMPTE

Structure Example

<file ID="f1" MIMETYPE="audio/x-wav" SEQ="1">
    <FLocat LOCTYPE="URN">urn:x-nyu:violet42</FLocat>
</file>

<div N="5" LABEL="Question 5">
    <fptr>
        <seq>
            <area FILE="f1" BEGIN="00:23:17:00" END="00:23:38:00" BETYPE="SMPTE"></area>
        </seq>
    </fptr>
</div>
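As a rough illustration of how an application might use this example, the sketch below parses a cleaned-up copy of it, resolves the area's FILE reference to the file's location, and converts the SMPTE begin and end codes into a segment length. The `<example>` wrapper element, the helper names and the 30 frames-per-second default are assumptions of this sketch; the attribute names follow the slide rather than any particular schema version.

```python
import xml.etree.ElementTree as ET

# Cleaned-up copy of the slide's example, wrapped in a made-up root
# element so that it parses as a single document.
EXAMPLE = """
<example>
  <file ID="f1" MIMETYPE="audio/x-wav" SEQ="1">
    <FLocat LOCTYPE="URN">urn:x-nyu:violet42</FLocat>
  </file>
  <div N="5" LABEL="Question 5">
    <fptr><seq>
      <area FILE="f1" BEGIN="00:23:17:00" END="00:23:38:00" BETYPE="SMPTE"/>
    </seq></fptr>
  </div>
</example>
"""


def smpte_to_seconds(code: str, fps: int = 30) -> float:
    """Convert HH:MM:SS:FF to seconds (fps is an assumed frame rate)."""
    hours, minutes, seconds, frames = (int(part) for part in code.split(":"))
    return hours * 3600 + minutes * 60 + seconds + frames / fps


def segments(root):
    """Yield (div label, file location, length in seconds) per SMPTE area."""
    files = {f.get("ID"): f.findtext("FLocat").strip() for f in root.iter("file")}
    for div in root.iter("div"):
        for area in div.iter("area"):
            if area.get("BETYPE") != "SMPTE":
                continue
            begin = smpte_to_seconds(area.get("BEGIN"))
            end = smpte_to_seconds(area.get("END"))
            yield div.get("LABEL"), files[area.get("FILE")], end - begin


result = list(segments(ET.fromstring(EXAMPLE)))
print(result)
```

Here the segment for "Question 5" runs from 00:23:17 to 00:23:38, so its length is 21 seconds whatever frame rate is assumed, because both frame fields are zero.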

Related standards: SMIL (W3C), MPEG-7 (ISO)

• Created for multimedia structural encoding
• SMIL has “time-based” orientation
    – for playing multimedia presentations
• Very complex
• May eventually be incorporated

Related standards: RDF (W3C)

• Also metadata wrapper framework
• Structural metadata could be supported, but doesn’t specify how…
• Opaque to use
• No element semantics provided
• element names deliberately meaningless
• Originally designed for descriptive metadata

Related standards: OAIS framework

METS and OAIS framework

• Submission Information Package (SIP)
    • METS as transfer syntax
• Dissemination Information Package (DIP)
    • METS as transfer syntax
    • METS as input to display applications
• Archival Information Package (AIP)
    • METS stored internally in an archive

Library Applications

• Digital Object transfer syntax
    – between systems
        • enables interoperability
    – between institutions
        • enables collection sharing
    – implements OAIS SIP/DIP/AIP

Library Applications

• Input to Digital Object delivery systems (aka “disseminators”)
    – Simple bit-streaming
    – XSL stylesheet
    – Custom program for complex digital object display

Part Three: METS Summary

METS summary

• Descriptive/technical/administrative metadata
    – not defined internally
    – points to external standard schemas
        • Dublin Core, MARC, MPEG-7, etc.
        • AES audio metadata
    – set of “best practice” schemas being identified

METS summary

• Structural metadata
    – defined internally and required
    – SMIL-lite
        • simple support for multimedia, audio/visual
        • SMIL may replace eventually

METS summary

• Current users include
    • UC Berkeley (archival collections)
    • Harvard (scanned print publications, e-journals)
    • Library of Congress (audio/visual collections)
    • British Library
    • RLG and OCLC
    • EU METAe project (historic newspapers)
    • Michigan State (oral history collections)
    • Univ of Virginia (FEDORA digital objects)
    • more daily...

METS summary

• Tools under development for
    – metadata capture
    – transformation
    – transfer
    – dissemination/display
• Profiles necessary for interoperation
    – Which extension schemas used?
    – How structure maps are organized…
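The two profile questions can be pictured as a simple conformance check. The sketch below is not a real METS profile format: the `PROFILE` dictionary, its keys and the `profile_violations` helper are all hypothetical, and stand in for whatever two institutions agree on about metadata types and the top-level organization of structure maps.

```python
import xml.etree.ElementTree as ET

NS = {"m": "http://www.loc.gov/METS/"}

# Hypothetical agreement between partners (not a real profile format).
PROFILE = {
    "allowed_mdtypes": {"DC", "MARC"},
    "root_div_types": {"book", "journal"},
}

SAMPLE = """
<mets xmlns="http://www.loc.gov/METS/">
  <dmdSec ID="d1"><mdWrap MDTYPE="DC"><xmlData/></mdWrap></dmdSec>
  <dmdSec ID="d2"><mdWrap MDTYPE="FGDC"><xmlData/></mdWrap></dmdSec>
  <structMap><div TYPE="map"/></structMap>
</mets>
"""


def profile_violations(root, profile=PROFILE):
    """Return human-readable problems found against the given profile."""
    problems = []
    for node in root.iter():
        mdtype = node.get("MDTYPE")
        if mdtype is not None and mdtype not in profile["allowed_mdtypes"]:
            problems.append(f"unexpected MDTYPE {mdtype}")
    for div in root.findall("m:structMap/m:div", NS):
        if div.get("TYPE") not in profile["root_div_types"]:
            problems.append(f"unexpected root div TYPE {div.get('TYPE')}")
    return problems


problems = profile_violations(ET.fromstring(SAMPLE))
print(problems)
```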

METS summary

• Current status
    – version 1.3 available from LC
    – editorial board in place
    – LC standards office for maintenance agency
    – DLF and RLG underwriting
        • RLG will host editorial board, offer documentation and training, develop tools
    – Several extension schemas available
    – Opening Day in October 2004

METS summary

• METS is not all things to all people…
    – Designed for local institutional application support
        • Solving an immediate local problem
        • Common to many institutions
        • Flexible framework supports many institutional situations
    – Profiling necessary to interoperate
        • For OAIS packages
        • For shared tools
        • For other kinds of interoperation (e.g. cross repository search)