IMPLEMENTATION ISSUES
How PREMIS can be used
For systems in development
• as a basis for metadata definition
For existing repositories
• as a checklist for evaluation
"It seems that often people say they aren't ready to implement PREMIS yet, but they don't seem to realise they are already collecting some of the same information that PREMIS describes. The metadata is the same because it is often common sense that it is needed in a repository system. PREMIS can be useful to point out a few extra areas they perhaps hadn't thought of yet." (Deborah Woodyard-Robinson)
Implementation issues: models
Reconciling data models
• The PREMIS data model is for convenience of aggregation
• Many decisions are arbitrary, e.g. is an anomaly discovered during validation a property of the object or an outcome of the validation event?
• Other data models are equally valid, e.g. NLNZ has Process, Object, File, Metadata
• However, PREMIS encourages consistent application of preservation metadata across different categories of objects (representation, file, bitstream)
Implementation in relational databases
• The PREMIS data model is not an entity-relationship model
Implementation issues: obtaining values
How to create or obtain metadata values
• Most can be populated by program, but tools would help
  • JHOVE, NLNZ Metadata Extraction Tool
  • Tool page under development
• Need registries for format and environment information
  • PRONOM, GDFR
What values to use for controlled vocabularies
• PREMIS does not have a "scheme" element, but probably ought to
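As a rough illustration of values that "can be populated by program", the sketch below computes a message digest and a byte size for a file, the kind of values that would feed the PREMIS fixity and size semantic units. The function name `fixity_and_size` and the `demo` helper are hypothetical, and this is not how JHOVE or the NLNZ tool work internally; it only uses the Python standard library.

```python
import hashlib
import os
import tempfile


def fixity_and_size(path, algorithm="sha1", chunk_size=65536):
    """Return (hex digest, size in bytes) for the file at `path`."""
    digest = hashlib.new(algorithm)
    with open(path, "rb") as handle:
        # Read in chunks so large files do not have to fit in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest(), os.path.getsize(path)


def demo():
    """Run the helper on a small throwaway file (illustration only)."""
    with tempfile.NamedTemporaryFile(delete=False) as tmp:
        tmp.write(b"hello")
        name = tmp.name
    try:
        return fixity_and_size(name)
    finally:
        os.remove(name)
```

Format identification and environment information are harder to derive this way, which is where the registries mentioned above come in.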
Implementation issues: conformance
Conformance is defined in the PREMIS Final Report
• if you use the name, use the definition
• local metadata can supplement but not modify PREMIS
• can define more stringent repeatability and obligation, but not more liberal
Meaning of mandatory
• you have to know it, and you have to be able to supply it if exporting for exchange
• you don't have to record it in the repository
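The "more stringent but not more liberal" rule can be sketched as a small check. This is only an illustration under the assumption that obligation and repeatability are each reduced to a boolean; `UnitRule` and `is_no_more_liberal` are hypothetical names, not part of PREMIS.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class UnitRule:
    mandatory: bool   # obligation: True = mandatory, False = optional
    repeatable: bool  # repeatability: True = repeatable


def is_no_more_liberal(premis: UnitRule, local: UnitRule) -> bool:
    """True if the local rule is the same as, or stricter than, the PREMIS rule."""
    # A mandatory unit may not be relaxed to optional.
    if premis.mandatory and not local.mandatory:
        return False
    # A non-repeatable unit may not be made repeatable.
    if not premis.repeatable and local.repeatable:
        return False
    return True
```

For example, a repository may make an optional, repeatable unit mandatory and non-repeatable locally, but not the reverse.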
Implementation issues: need for additional metadata
Preservation metadata not considered core
• core = all objects, all preservation strategies
• example of non-core = installation requirements
More detailed information on Rights and Agents
Metadata describing the Intellectual Entity
Format-specific technical metadata
Business rules of the repository
Information about the metadata itself (e.g. who obtained or recorded a value, when it was last changed)
XML issues
PREMIS XML schemas
One schema for each PREMIS entity in the data model
• Allows the user to choose which parts of PREMIS to use
PREMIS container schema
• References the schema for each entity type
• Provides a container if it is desirable to keep some or all PREMIS metadata together
• If used, the container requires at least an object, which in turn requires objectIdentifier and objectCategory
• Individual schemas may be used alone or with the container
Semantic units in PREMIS schemas
• XML is faithful to the data dictionary
• Only those units mandatory for all categories of objects are mandatory in the object schema
PREMIS Schemas
Container schema
Object schema
Event schema
Agent schema
Rights schema
Proposed schema changes for new version
Define an abstract object type to allow for better validation of object category (representation, file, bitstream)
Define main elements globally to allow for reuse
Implement an extensibility mechanism to provide for further structure when needed
Implement a mechanism to use controlled vocabularies
Adjust schemas to support changes in version 2 of the data dictionary
Implementing PREMIS using XML in METS
METS introduction
METS records the (possibly hierarchical) structure of digital objects, the names and locations of the files that comprise those objects, and the associated metadata
A METS document may be a unit of storage (e.g. OAIS AIP) or a transmission format (e.g. OAIS SIP or DIP)
METS is extensible and modular
METS uses extension "wrappers" or "sockets" where elements from other schemas can be plugged in
METS uses the XML Schema facility for combining vocabularies from different Namespaces
The METS Editorial Board has endorsed PREMIS as an extension schema
Many institutions are trying to use PREMIS within the METS context
The structure of a METS file
METS
• dmdSec: descriptive metadata
• amdSec: administrative metadata
• behaviorSec: behaviour metadata
• structMap: structural map
• fileSec: file inventory
Inserting technical metadata in a METS Document
<mets>
  <amdSec>
    <techMD>
      <mdWrap>
        <xmlData>
          <!-- insert data from different namespace here -->
        </xmlData>
      </mdWrap>
    </techMD>
  </amdSec>
  <fileSec/>
  <structMap/>
</mets>
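A minimal sketch of filling that socket programmatically, using Python's standard `xml.etree.ElementTree`. The helper `wrap_in_techmd` and the `demo` function are hypothetical, and the PREMIS namespace URI shown is an assumption that should be checked against the schema version actually in use; the METS namespace is the published one.

```python
import xml.etree.ElementTree as ET

METS_NS = "http://www.loc.gov/METS/"
# Assumed namespace URI; replace with the one for your PREMIS schema version.
PREMIS_NS = "http://www.loc.gov/standards/premis/v1"

ET.register_namespace("mets", METS_NS)
ET.register_namespace("premis", PREMIS_NS)


def wrap_in_techmd(payload, section_id, mdtype="PREMIS"):
    """Build mets/amdSec/techMD/mdWrap/xmlData and plug `payload` into it."""
    mets = ET.Element(f"{{{METS_NS}}}mets")
    amd = ET.SubElement(mets, f"{{{METS_NS}}}amdSec")
    tech = ET.SubElement(amd, f"{{{METS_NS}}}techMD", ID=section_id)
    wrap = ET.SubElement(tech, f"{{{METS_NS}}}mdWrap", MDTYPE=mdtype)
    xml_data = ET.SubElement(wrap, f"{{{METS_NS}}}xmlData")
    # The element from the other namespace goes inside xmlData.
    xml_data.append(payload)
    return mets


def demo():
    """Wrap a tiny, incomplete PREMIS object (illustration only)."""
    obj = ET.Element(f"{{{PREMIS_NS}}}object")
    chars = ET.SubElement(obj, f"{{{PREMIS_NS}}}objectCharacteristics")
    size = ET.SubElement(chars, f"{{{PREMIS_NS}}}size")
    size.text = "184302"
    return ET.tostring(wrap_in_techmd(obj, "TMD1PREMIS"), encoding="unicode")
```

The resulting document is well-formed but not schema-valid: a real PREMIS object also needs its mandatory units, and a real METS document needs a structMap.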
Linking in METS Documents (XML ID/IDREF links)
[Diagram: ID/IDREF links among DescMD (mods, relatedItem), AdminMD (techMD, sourceMD, digiprovMD, rightsMD), fileGrp (file) and StructMap (div, fptr)]
METS extension schemas
"Wrappers" or "sockets" where elements from other schemas can be plugged in
Provides extensibility
Uses the XML Schema facility for combining vocabularies from different Namespaces
Endorsed extension schemas
• Descriptive: MODS, DC, MARCXML
• Technical metadata: MIX (image), textMD (text)
• Preservation related: PREMIS
Issues in using PREMIS with METS
Which METS sections to use and how many
Whether to record elements redundantly in PREMIS that are defined explicitly in the METS schema
How to record elements that are also part of a format-specific technical metadata schema (e.g. MIX)
Recording structural relationships
How to deal with locally controlled vocabularies
Whether to use the PREMIS container
PREMIS and METS sections
Flexibility of METS requires implementation decisions
You can't put all PREMIS metadata directly under amdSec
What sections to use for PREMIS metadata?
• Alternative 1
  • Object in techMD
  • Event in digiprovMD
  • Rights in rightsMD
  • Agent with event or rights
• Alternative 2
  • Everything in digiprovMD
• Alternative 3
  • Everything in techMD
How many administrative MD sections to use?
Experimentation will result in best practices
<fileSec>
  <fileGrp>
    <file ID="FID1" SIZE="184302" ADMID="TMD1PREMIS TMD1MIX DP1EVENT DP1AGENT"
          CHECKSUM="4638bc65c5b9715557d09ad373eefd147382ecbf" CHECKSUMTYPE="SHA-1">
      <FLocat LOCTYPE="OTHER" xlink:href="BXF22.JPG"/>
    </file>
  </fileGrp>
</fileSec>
<techMD ID="TMD1PREMIS">
  <mdWrap MDTYPE="PREMIS">
    <xmlData>
      <premis:object>
        <objectCharacteristics>
          <fixity>
            <messageDigestAlgorithm>SHA-1</messageDigestAlgorithm>
            <messageDigest>4638bc65c5b9715557d09ad373eefd147382ecbf</messageDigest>
            <messageDigestOriginator>EchoDep</messageDigestOriginator>
          </fixity>
          <size>184302</size>
        </objectCharacteristics>
      </premis:object>
    </xmlData>
  </mdWrap>
</techMD>
Elements defined in both METS and PREMIS
METS CHECKSUM, CHECKSUMTYPE
• attribute of <file>
• not repeatable
PREMIS fixity
• also includes messageDigestOriginator
• allows multiples
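If the checksum is recorded in both places, the two copies can drift apart. The sketch below cross-checks the METS CHECKSUM attribute against the PREMIS fixity values reached through ADMID. `checksum_mismatches` and `SAMPLE` are hypothetical names; the sample qualifies every PREMIS element with a prefix and uses an assumed PREMIS namespace URI, so adjust both to match real records.

```python
import xml.etree.ElementTree as ET

METS = "{http://www.loc.gov/METS/}"
PREMIS = "{http://www.loc.gov/standards/premis/v1}"  # assumed namespace URI

SAMPLE = """<mets xmlns="http://www.loc.gov/METS/"
      xmlns:premis="http://www.loc.gov/standards/premis/v1">
  <amdSec>
    <techMD ID="TMD1PREMIS">
      <mdWrap MDTYPE="PREMIS"><xmlData>
        <premis:object>
          <premis:objectCharacteristics>
            <premis:fixity>
              <premis:messageDigestAlgorithm>SHA-1</premis:messageDigestAlgorithm>
              <premis:messageDigest>4638bc65c5b9715557d09ad373eefd147382ecbf</premis:messageDigest>
            </premis:fixity>
          </premis:objectCharacteristics>
        </premis:object>
      </xmlData></mdWrap>
    </techMD>
  </amdSec>
  <fileSec><fileGrp>
    <file ID="FID1" ADMID="TMD1PREMIS"
          CHECKSUM="4638bc65c5b9715557d09ad373eefd147382ecbf" CHECKSUMTYPE="SHA-1"/>
  </fileGrp></fileSec>
</mets>"""


def checksum_mismatches(mets_xml):
    """Return IDs of <file> elements whose CHECKSUM disagrees with PREMIS fixity."""
    root = ET.fromstring(mets_xml)
    sections = {el.get("ID"): el for el in root.iter() if el.get("ID") is not None}
    problems = []
    for file_el in root.iter(METS + "file"):
        declared = (file_el.get("CHECKSUMTYPE"), file_el.get("CHECKSUM"))
        if declared[1] is None:
            continue  # nothing recorded on the METS side
        premis_digests = set()
        for ref in file_el.get("ADMID", "").split():
            section = sections.get(ref)
            if section is None:
                continue
            # PREMIS allows several fixity blocks, so collect them all.
            for fixity in section.iter(PREMIS + "fixity"):
                algo = fixity.findtext(PREMIS + "messageDigestAlgorithm", "").strip()
                value = fixity.findtext(PREMIS + "messageDigest", "").strip()
                premis_digests.add((algo, value))
        if premis_digests and declared not in premis_digests:
            problems.append(file_el.get("ID"))
    return problems
```

Because PREMIS fixity is repeatable and the METS attribute is not, the check only asks that the METS value match one of the PREMIS digests.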
<fileSec>
  <fileGrp>
    <file ID="FID1" ADMID="TMD1PREMIS DP1EVENT DP1AGENT" MIMETYPE="image/jpeg">
      <FLocat LOCTYPE="OTHER" xlink:href="BXF22.JPG"/>
    </file>
  </fileGrp>
</fileSec>
<techMD ID="TMD1PREMIS">
  <mdWrap MDTYPE="PREMIS">
    <xmlData>
      <premis:object>
        <objectCharacteristics>
          <format>
            <formatDesignation>
              <formatName>image/jpeg</formatName>
              <formatVersion>1.02</formatVersion>
            </formatDesignation>
          </format>
        </objectCharacteristics>
      </premis:object>
    </xmlData>
  </mdWrap>
</techMD>
Elements defined both in METS and PREMIS
METS MIMETYPE
• attribute of <file>
• optional
PREMIS <format>
• more granular: includes name and version (although name may be MIMETYPE)
• mandatory
<fileSec>
  <fileGrp>
    <file ID="FID1" ADMID="TMD1PREMIS TMD1MIX DP1EVENT DP1AGENT">
    </file>
  </fileGrp>
</fileSec>
<techMD ID="TMD1PREMIS">
  <linkingEventIdentifier>
    <linkingEventIdentifierType>ECHODEP Hub Event</linkingEventIdentifierType>
    <linkingEventIdentifierValue>echo12345</linkingEventIdentifierValue>
  </linkingEventIdentifier>
</techMD>
<digiprovMD ID="DP1EVENT">
  <premis:event>
    <eventIdentifier>
      <eventIdentifierType>ECHODEP Hub Event</eventIdentifierType>
      <eventIdentifierValue>echo12345</eventIdentifierValue>
    </eventIdentifier>
    <eventType>ingestion</eventType>
    <eventDateTime>2006-05-02T15:12:53</eventDateTime>
  </premis:event>
</digiprovMD>
Elements defined both in METS and PREMIS
METS ID/IDREF: used to associate metadata in different sections and for different files
PREMIS identifiers: explicit linking between entity types
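To make the METS side of this linking concrete, the sketch below resolves the ADMID references of a file to the administrative sections they point at. `resolve_admid` and `SAMPLE` are hypothetical; the sample sections are left empty for brevity, which a real METS document would not allow.

```python
import xml.etree.ElementTree as ET

METS = "{http://www.loc.gov/METS/}"

SAMPLE = """<mets xmlns="http://www.loc.gov/METS/">
  <amdSec>
    <techMD ID="TMD1PREMIS"/>
    <techMD ID="TMD1MIX"/>
    <digiprovMD ID="DP1EVENT"/>
    <digiprovMD ID="DP1AGENT"/>
  </amdSec>
  <fileSec><fileGrp>
    <file ID="FID1" ADMID="TMD1PREMIS TMD1MIX DP1EVENT DP1AGENT"/>
  </fileGrp></fileSec>
</mets>"""


def resolve_admid(mets_xml, file_id):
    """Return (IDREF, section element name) pairs for one file's ADMID list."""
    root = ET.fromstring(mets_xml)
    # XML IDs are unique within a document, so a flat index is enough.
    by_id = {el.get("ID"): el for el in root.iter() if el.get("ID") is not None}
    file_el = by_id[file_id]
    resolved = []
    for ref in file_el.get("ADMID", "").split():
        target = by_id.get(ref)
        # Report the bare element name, or None for a dangling reference.
        kind = target.tag.replace(METS, "") if target is not None else None
        resolved.append((ref, kind))
    return resolved
```

PREMIS identifiers such as linkingEventIdentifier would need a second lookup keyed on identifier type and value, since they are not XML IDs.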
<structMap TYPE="physical">
  <div ORDER="1" TYPE="text">
    <fptr FILEID="FID9"/>
    <div ORDER="1" TYPE="page" LABEL=" Page [1]">
      <fptr FILEID="FID1"/>
    </div>
    <div ORDER="2" TYPE="page" LABEL=" Page [2]">
      <fptr FILEID="FID2"/>
    </div>
  </div>
</structMap>
<relationship>
  <relationshipType>structural</relationshipType>
  <relationshipSubType>is sibling of</relationshipSubType>
  <relatedObjectIdentification>
    <relatedObjectIdentifierType>UCB</relatedObjectIdentifierType>
    <relatedObjectIdentifierValue>FID2</relatedObjectIdentifierValue>
    <relatedObjectSequence>1</relatedObjectSequence>
  </relatedObjectIdentification>
</relationship>
Elements defined both in METS and PREMIS
METS structMap
• details structural relationships and is the heart of the METS document
• hierarchical, so may be more expressive than PREMIS semantic units
• links the elements of the structure to content files and metadata
PREMIS <relationship>
• details all kinds of relationships, including structural
• data dictionary says that implementations may record by other means
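One way to see the overlap: sibling relationships of the kind PREMIS would record explicitly can be derived from the structMap hierarchy. The sketch below does this for files under divs that share a parent div. `sibling_pairs` and `SAMPLE` are hypothetical, and treating "same parent div" as "is sibling of" is an assumption made for the illustration.

```python
import xml.etree.ElementTree as ET

METS = "{http://www.loc.gov/METS/}"

SAMPLE = """<structMap xmlns="http://www.loc.gov/METS/" TYPE="physical">
  <div ORDER="1" TYPE="text">
    <fptr FILEID="FID9"/>
    <div ORDER="1" TYPE="page" LABEL="Page [1]"><fptr FILEID="FID1"/></div>
    <div ORDER="2" TYPE="page" LABEL="Page [2]"><fptr FILEID="FID2"/></div>
  </div>
</structMap>"""


def sibling_pairs(struct_map_xml):
    """Return (file, file) pairs whose divs share the same parent div."""
    root = ET.fromstring(struct_map_xml)
    pairs = []
    for parent in root.iter(METS + "div"):
        # Collect the files pointed to by the direct child divs of this parent.
        children = []
        for child in parent.findall(METS + "div"):
            children.extend(f.get("FILEID") for f in child.findall(METS + "fptr"))
        # Every unordered pair of those files counts as siblings.
        for i, left in enumerate(children):
            for right in children[i + 1:]:
                pairs.append((left, right))
    return pairs
```

Deriving the relationships this way is one reading of the data dictionary's allowance that implementations may record them by other means.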
Should semantic units be recorded redundantly?
Various options are possible when there is overlap between PREMIS and METS, or PREMIS and other technical metadata schemas
• Record only in METS
• Record only in PREMIS
• Record in both
Are there advantages in using PREMIS semantic units?
Is it important to keep PREMIS metadata together as a unit?
There may be an advantage for reuse and maintenance purposes
How to record elements from 2 different technical metadata schemas
Format-specific metadata may be included in addition to PREMIS general technical metadata
Use multiple techMD sections and specify the source in the MDTYPE attribute and/or namespace declaration
• e.g. MDTYPE="NISOIMG" or "PREMIS"
• Give the MIX schema declaration in the METS document
MIX was recently revised to correspond with the revision of the Z39.87 technical metadata for digital still images standard; names were harmonized with the corresponding PREMIS semantic units
For digital still images, best practice may be to use PREMIS for the general semantic units defined in PREMIS and MIX for format-specific units, without redundancy
Examples of PREMIS in XML
PREMIS in METS
• Portrait of Louis Armstrong (Library of Congress)
• Peoria County, Illinois aerial photograph (ECHO Depository, UIUC Grainger Engineering Library)
MATHARC implementation
httppigpenlibuchicagoedu8888pigpenuploads13asset_descr_mets_premis_02v2xml
MPEG-21 Digital Item Declaration (DID)
ISO/IEC 21000-2: Digital Item Declaration
A promising alternative to represent Digital Objects
Starting to get supported by some repositories, e.g. aDORe, DSpace, Fedora
A flexible and expressive model that easily represents compound objects (recursive "item")
Attach well-formed XML from persistent namespaces as metadata
Abstract Model for MPEG-21 DID
[Diagram: container, item, component and resource levels, with descriptor/statement constructs attached]
• resource: datastream
• component: binding of descriptor/statements to datastreams
• item: represents a Digital Item, aka Digital Object, aka asset; descriptor/statement constructs convey information about the Digital Item
• container: grouping of items and descriptor/statement constructs pertaining to the container
Mapping
[Diagram: PREMIS mapped onto the DID model. Labels: DID, DIDInfo, object1 to object4, resource, premis:premis, premis:object]
Callout: All rights, events and agents go here. The top-level object goes here. Other objects may be duplicated here or linked here.
Partial Implementation in DID
[Diagram: partial PREMIS implementation in the DID model. Labels: DID, DIDInfo, object1 to object4, resource, premis:premis, premis:format, premis:significantProperties, premis:creatingApplication]
When metadata are not sufficient to form the top-level PREMIS elements, partial implementation may be done if PREMIS elements are globally defined.
Example of PREMIS in MPEG DID
PREMIS in MPEG DID
• aDORe example (LANL)
Summary: container formats
A container format is needed to package together all forms of metadata (of which PREMIS is one) and digital content
Use of a container is compatible with, and an implementation of, the OAIS information package concept
Co-existence with other types of metadata requires best practices for both approaches; redundancy seems to be preferred
Changes to the next version of the PREMIS XML schemas will facilitate a phased approach to full PREMIS implementation
Development of registries (informal or formal) for controlled vocabularies will benefit implementation
Tools are being developed to facilitate implementation
Summary: METS vs MPEG-21 DID
METS and MPEG DID are similar types of container formats, in that both are expressed in XML, both represent the structure of digital objects, and both include metadata
MPEG DID doesn't have the segmentation into metadata sections that METS does, so this implementation decision need not be made in DID
METS is open source and developed by open discussion, mainly in the cultural heritage community
MPEG DID is an ISO standard and has industry support, but is often implemented in a proprietary way, and standards development is closed
It would be possible to transform a METS container to an MPEG DID and vice versa; development of stylesheets will enable transformations
Implementers' panel
What types of objects are you preserving?
Has your institution implemented a preservation repository?
What preservation metadata are you recording?
How are you recording it, e.g. database, METS/XML, other?
Do you plan to exchange preservation metadata with other repositories?
Are you planning to use, or already using, PREMIS?
Which semantic units are most useful?
Which semantic units are least useful?
What difficulties have you had applying PREMIS units?
Implementation issues models
Reconciling data modelsbull PREMIS data model is for convenience of aggregationbull Many arbitrary decisions eg is an anomaly discovered
during validation a property of the object or an outcome of the validation event
bull Other data models equally valid eg NLNZ has Process Object File Metadata
bull However PREMIS encourages consistent application of preservation metadata across different categories of objects (representation file bitstream)
Implementation in relational databasesbull PREMIS data model is not entity-relationship model
Implementation issues obtaining values
How to create or obtain metadata valuesbull Most can be populated by program but tools would help
bull JHOVE NLNZ Metadata Extraction Toolbull Tool page under development
bull Need registries for format and environment information bull Pronom GDFR
What values to use for controlled vocabulariesbull PREMIS does not have ldquoschemerdquo element but probably
ought to
Implementation issues conformance
Conformance is defined in PREMIS Final Reportbull if you use the name use the definitionbull local metadata can supplement but not modify PREMISbull can define more stringent repeatability and obligation
but not more liberal
Meaning of mandatory bull you have to know it and you have to be able to supply it
if exporting for exchangebull you donrsquot have to record it in repository
Implementation issues need for additional metadata preservation metadata not considered core
bull core = all objects all preservation strategiesbull example of non-core = installation requirements
more detailed information on Rights and Agents
metadata describing Intellectual Entity
format-specific technical metadata
business rules of the repository
information about the metadata itself (eg who obtained or recorded a value when last changed)
XML issues
PREMIS XML schemas
One schema for each PREMIS entity in data modelbull Allows user to choose which parts of PREMIS to use
PREMIS container schemabull References schema for each entity typebull Provides a container if it is desirable to keep some or all
PREMIS metadata together bull If using container requires at least an object which in
turn requires objectIdentifier and objectCategorybull Individual schemas may used alone or with container
Semantic units in PREMIS schemasbull XML is faithful to data dictionarybull Only those units mandatory for all categories of objects
are mandatory in object schema
PREMIS Schemas
Container schema
Object schema
Event schema
Agent schema
Rights schema
Proposed schema changes for new version
Define an abstract object type to allow for better validation of object category (representation, file, bitstream)
Define main elements globally to allow for reuse
Implement an extensibility mechanism to provide for further structure when needed
Implement a mechanism to use controlled vocabularies
Adjust schemas to support changes in version 2 of the data dictionary
Implementing PREMIS using XML in METS
METS introduction
METS records the (possibly hierarchical) structure of digital objects, the names and locations of the files that comprise those objects, and the associated metadata
A METS document may be a unit of storage (e.g. OAIS AIP) or a transmission format (e.g. OAIS SIP or DIP)
METS is extensible and modular
METS uses extension “wrappers” or “sockets” where elements from other schemas can be plugged in
METS uses the XML Schema facility for combining vocabularies from different Namespaces
The METS Editorial Board has endorsed PREMIS as an extension schema
Many institutions are trying to use PREMIS within the METS context
The structure of a METS file
METS
• dmdSec: descriptive metadata
• amdSec: administrative metadata
• behaviorSec: behaviour metadata
• structMap: structural map
• fileSec: file inventory
Inserting technical metadata in a METS Document
  <mets>
    <amdSec>
      <techMD>
        <mdWrap>
          <xmlData>
            <!-- insert data from different namespace here -->
          </xmlData>
        </mdWrap>
      </techMD>
    </amdSec>
    <fileSec/>
    <structMap/>
  </mets>
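
The wrapping pattern above can be produced programmatically. This is a minimal sketch using Python's standard library; the function name, the section ID, and the `urn:example:tech` payload namespace are hypothetical and stand in for whatever extension schema is being plugged in.

```python
import xml.etree.ElementTree as ET

METS_NS = "http://www.loc.gov/METS/"
ET.register_namespace("mets", METS_NS)
NS = {"m": METS_NS}


def wrap_in_techmd(payload: ET.Element, section_id: str, mdtype: str) -> ET.Element:
    """Build mets > amdSec > techMD > mdWrap > xmlData around a foreign element."""
    mets = ET.Element(f"{{{METS_NS}}}mets")
    amd = ET.SubElement(mets, f"{{{METS_NS}}}amdSec")
    tech = ET.SubElement(amd, f"{{{METS_NS}}}techMD", ID=section_id)
    wrap = ET.SubElement(tech, f"{{{METS_NS}}}mdWrap", MDTYPE=mdtype)
    xml_data = ET.SubElement(wrap, f"{{{METS_NS}}}xmlData")
    # The payload keeps its own namespace inside the METS "socket".
    xml_data.append(payload)
    return mets


# Hypothetical payload from a made-up namespace, for illustration only.
payload = ET.Element("{urn:example:tech}note")
payload.text = "data from a different namespace"
doc = wrap_in_techmd(payload, "TMD1", "OTHER")
```

The payload is appended unchanged, so the same function serves for PREMIS, MIX or any other extension schema.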
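Linking in METS Documents (XML ID/IDREF links)
[Diagram: ID/IDREF links between the following parts of a METS document]
DescMD: mods, relatedItem, relatedItem
AdminMD: techMD, sourceMD, digiprovMD, rightsMD
fileGrp: file, file
StructMap: div, div > fptr, div > fptr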
METS extension schemas
“Wrappers” or “sockets” where elements from other schemas can be plugged in
Provides extensibility
Uses the XML Schema facility for combining vocabularies from different Namespaces
Endorsed extension schemas
• Descriptive: MODS, DC, MARCXML
• Technical metadata: MIX (image), textMD (text)
• Preservation related: PREMIS
Issues in using PREMIS with METS
Which METS sections to use and how many
Whether to record elements redundantly in PREMIS that are defined explicitly in the METS schema
How to record elements that are also part of a format-specific technical metadata schema (e.g. MIX)
Recording structural relationships
How to deal with locally controlled vocabularies
Whether to use the PREMIS container
PREMIS and METS sections
Flexibility of METS requires implementation decisions
You can’t put all PREMIS metadata directly under amdSec
What sections to use for PREMIS metadata
• Alternative 1
  • Object in techMD
  • Event in digiprovMD
  • Rights in rightsMD
  • Agent with event or rights
• Alternative 2
  • Everything in digiprovMD
• Alternative 3
  • Everything in techMD
How many administrative MD sections to use
Experimentation will result in best practices
Elements defined in both METS and PREMIS
  <fileSec>
    <fileGrp>
      <file ID="FID1" SIZE="184302" ADMID="TMD1PREMIS TMD1MIX DP1EVENT DP1AGENT"
            CHECKSUM="4638bc65c5b9715557d09ad373eefd147382ecbf" CHECKSUMTYPE="SHA-1">
        <FLocat LOCTYPE="OTHER" xlink:href="BXF22.JPG"/>
      </file>
    </fileGrp>
  </fileSec>
  <techMD ID="TMD1PREMIS">
    <mdWrap MDTYPE="PREMIS">
      <xmlData>
        <premis:object>
          <objectCharacteristics>
            <fixity>
              <messageDigestAlgorithm>SHA-1</messageDigestAlgorithm>
              <messageDigest>4638bc65c5b9715557d09ad373eefd147382ecbf</messageDigest>
              <messageDigestOriginator>EchoDep</messageDigestOriginator>
            </fixity>
            <size>184302</size>
          </objectCharacteristics>
        </premis:object>
      </xmlData>
    </mdWrap>
  </techMD>
METS CHECKSUM, CHECKSUMTYPE
• attribute of <file>
• not repeatable
PREMIS fixity
• also includes messageDigestOriginator
• allows multiples
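
When the checksum is recorded in both places, the two copies can drift apart. The sketch below shows one way to flag that. It is an illustration under assumptions: the PREMIS namespace URI, the fully qualified child elements, the cut-down sample document and the function name are all hypothetical choices made for the example.

```python
import xml.etree.ElementTree as ET

NS = {
    "m": "http://www.loc.gov/METS/",
    "p": "http://www.loc.gov/standards/premis/v1",  # assumed namespace URI
}

SAMPLE = """
<mets xmlns="http://www.loc.gov/METS/"
      xmlns:premis="http://www.loc.gov/standards/premis/v1">
  <amdSec>
    <techMD ID="TMD1PREMIS">
      <mdWrap MDTYPE="PREMIS"><xmlData>
        <premis:object><premis:objectCharacteristics><premis:fixity>
          <premis:messageDigestAlgorithm>SHA-1</premis:messageDigestAlgorithm>
          <premis:messageDigest>4638bc65c5b9715557d09ad373eefd147382ecbf</premis:messageDigest>
        </premis:fixity></premis:objectCharacteristics></premis:object>
      </xmlData></mdWrap>
    </techMD>
  </amdSec>
  <fileSec><fileGrp>
    <file ID="FID1" ADMID="TMD1PREMIS"
          CHECKSUM="4638bc65c5b9715557d09ad373eefd147382ecbf" CHECKSUMTYPE="SHA-1"/>
  </fileGrp></fileSec>
</mets>
"""


def checksum_mismatches(root: ET.Element) -> list:
    """Return IDs of files whose METS CHECKSUM matches none of their PREMIS fixity entries."""
    by_id = {el.get("ID"): el for el in root.iter() if el.get("ID") is not None}
    mismatched = []
    for file_el in root.findall(".//m:file", NS):
        digest = (file_el.get("CHECKSUM") or "").lower()
        if not digest:
            continue  # nothing declared in METS, nothing to compare
        declared = (file_el.get("CHECKSUMTYPE"), digest)
        recorded = set()
        for ref in file_el.get("ADMID", "").split():
            section = by_id.get(ref)
            if section is None:
                continue
            # PREMIS fixity is repeatable, so collect every entry.
            for fixity in section.findall(".//p:fixity", NS):
                alg = fixity.findtext("p:messageDigestAlgorithm", default="", namespaces=NS).strip()
                dig = fixity.findtext("p:messageDigest", default="", namespaces=NS).strip().lower()
                recorded.add((alg, dig))
        if recorded and declared not in recorded:
            mismatched.append(file_el.get("ID"))
    return mismatched


consistent = checksum_mismatches(ET.fromstring(SAMPLE))
tampered = checksum_mismatches(ET.fromstring(SAMPLE.replace('CHECKSUM="4638', 'CHECKSUM="0000')))
```

Because PREMIS fixity allows multiples while the METS attribute does not, the check treats the METS value as valid if it matches any one of the PREMIS entries.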
Elements defined both in METS and PREMIS
  <fileSec>
    <fileGrp>
      <file ID="FID1" ADMID="TMD1PREMIS DP1EVENT DP1AGENT" MIMETYPE="image/jpeg">
        <FLocat LOCTYPE="OTHER" xlink:href="BXF22.JPG"/>
      </file>
    </fileGrp>
  </fileSec>
  <techMD ID="TMD1PREMIS">
    <mdWrap MDTYPE="PREMIS">
      <xmlData>
        <premis:object>
          <objectCharacteristics>
            <format>
              <formatDesignation>
                <formatName>image/jpeg</formatName>
                <formatVersion>1.02</formatVersion>
              </formatDesignation>
            </format>
          </objectCharacteristics>
        </premis:object>
      </xmlData>
    </mdWrap>
  </techMD>
METS MIMETYPE
• attribute of <file>
• optional
PREMIS <format>
• more granular: includes name and version (although name may be MIMETYPE)
• mandatory
Elements defined both in METS and PREMIS
  <fileSec>
    <fileGrp>
      <file ID="FID1" ADMID="TMD1PREMIS TMD1MIX DP1EVENT DP1AGENT"/>
    </fileGrp>
  </fileSec>
  <techMD ID="TMD1PREMIS">
    <linkingEventIdentifier>
      <linkingEventIdentifierType>ECHODEP Hub Event</linkingEventIdentifierType>
      <linkingEventIdentifierValue>echo12345</linkingEventIdentifierValue>
    </linkingEventIdentifier>
  </techMD>
  <digiprovMD ID="DP1EVENT">
    <premis:event>
      <eventIdentifier>
        <eventIdentifierType>ECHODEP Hub Event</eventIdentifierType>
        <eventIdentifierValue>echo12345</eventIdentifierValue>
      </eventIdentifier>
      <eventType>ingestion</eventType>
      <eventDateTime>2006-05-02T15:12:53</eventDateTime>
    </premis:event>
  </digiprovMD>
METS ID/IDREF: used to associate metadata in different sections and for different files
PREMIS identifiers: explicit linking between entity types
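
The METS side of this linking is plain ID/IDREF resolution, which can be sketched in a few lines. The sample document is a cut-down, not schema-valid stand-in for the slide's example, and the function name is hypothetical. DP1AGENT is deliberately left without a target to show an unresolved reference.

```python
import xml.etree.ElementTree as ET

NS = {"m": "http://www.loc.gov/METS/"}

SAMPLE = """
<mets xmlns="http://www.loc.gov/METS/">
  <amdSec>
    <techMD ID="TMD1PREMIS"/>
    <digiprovMD ID="DP1EVENT"/>
  </amdSec>
  <fileSec><fileGrp>
    <file ID="FID1" ADMID="TMD1PREMIS DP1EVENT DP1AGENT"/>
  </fileGrp></fileSec>
</mets>
"""


def resolve_admid(root: ET.Element, file_id: str) -> dict:
    """Map each IDREF in a file's ADMID to the local name of its target element.

    References with no matching ID in the document map to None.
    """
    by_id = {el.get("ID"): el for el in root.iter() if el.get("ID") is not None}
    for file_el in root.findall(".//m:file", NS):
        if file_el.get("ID") == file_id:
            resolved = {}
            for ref in file_el.get("ADMID", "").split():
                target = by_id.get(ref)
                # Strip the "{namespace}" part of the tag to get the local name.
                resolved[ref] = None if target is None else target.tag.split("}")[-1]
            return resolved
    raise KeyError(file_id)


links = resolve_admid(ET.fromstring(SAMPLE), "FID1")
```

PREMIS identifier links (for example linkingEventIdentifierValue matching eventIdentifierValue) would need a second, value-based lookup, since they are ordinary element content rather than XML IDREFs.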
Elements defined both in METS and PREMIS
  <structMap TYPE="physical">
    <div ORDER="1" TYPE="text">
      <fptr FILEID="FID9"/>
      <div ORDER="1" TYPE="page" LABEL=" Page [1]">
        <fptr FILEID="FID1"/>
      </div>
      <div ORDER="2" TYPE="page" LABEL=" Page [2]">
        <fptr FILEID="FID2"/>
      </div>
    </div>
  </structMap>
  <relationship>
    <relationshipType>structural</relationshipType>
    <relationshipSubType>is sibling of</relationshipSubType>
    <relatedObjectIdentification>
      <relatedObjectIdentifierType>UCB</relatedObjectIdentifierType>
      <relatedObjectIdentifierValue>FID2</relatedObjectIdentifierValue>
      <relatedObjectSequence>1</relatedObjectSequence>
    </relatedObjectIdentification>
  </relationship>
METS structMap
• details structural relationships and is the heart of the METS document
• hierarchical, so may be more expressive than PREMIS semantic units
• links the elements of the structure to content files and metadata
PREMIS <relationship>
• details all kinds of relationships, including structural
• data dictionary says that implementations may record by other means
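
If structural relationships are recorded only in the structMap, the sibling relationships that PREMIS would state explicitly can still be derived from it. The sketch below is one hypothetical way to do that; the function name is invented, and the sample mirrors the structMap shown above with the METS namespace declared.

```python
import xml.etree.ElementTree as ET

METS_NS = "http://www.loc.gov/METS/"
NS = {"m": METS_NS}

SAMPLE = """
<structMap xmlns="http://www.loc.gov/METS/" TYPE="physical">
  <div ORDER="1" TYPE="text">
    <fptr FILEID="FID9"/>
    <div ORDER="1" TYPE="page" LABEL="Page [1]"><fptr FILEID="FID1"/></div>
    <div ORDER="2" TYPE="page" LABEL="Page [2]"><fptr FILEID="FID2"/></div>
  </div>
</structMap>
"""


def sibling_file_groups(struct_map: ET.Element) -> list:
    """For each div, list the files of its child divs in ORDER; keep groups of two or more."""
    groups = []
    for parent in struct_map.iter(f"{{{METS_NS}}}div"):
        children = sorted(parent.findall("m:div", NS), key=lambda d: int(d.get("ORDER", "0")))
        files = [fp.get("FILEID") for child in children for fp in child.findall("m:fptr", NS)]
        if len(files) > 1:
            groups.append(files)
    return groups


siblings = sibling_file_groups(ET.fromstring(SAMPLE))
```

Each group corresponds to files that a PREMIS relationship with relationshipSubType "is sibling of" would relate, with the position in the group standing in for relatedObjectSequence.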
Should semantic units be recorded redundantly?
Various options are possible when there is overlap between PREMIS and METS, or PREMIS and other technical metadata schemas
• Record only in METS
• Record only in PREMIS
• Record in both
Are there advantages in using PREMIS semantic units?
Is it important to keep PREMIS metadata together as a unit?
There may be an advantage for reuse and maintenance purposes
How to record elements from 2 different technical metadata schemas
Format-specific metadata may be included in addition to PREMIS general technical metadata
Use multiple techMD sections and specify the source in the MDTYPE attribute and/or namespace declaration
• e.g. MDTYPE=“NISOIMG” or “PREMIS”
• Give MIX schema declaration in METS document
MIX was recently revised to correspond with the revision of the Z39.87 technical metadata for digital still images standard; names harmonized with corresponding PREMIS semantic units
For digital still images, best practice may be to use PREMIS for general semantic units defined in PREMIS and MIX for format-specific units, without redundancy
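
With several techMD sections attached to one file, a consumer needs to tell them apart by MDTYPE. A minimal sketch of that lookup follows; the sample document is a simplified stand-in with empty xmlData, and the function name is hypothetical.

```python
import xml.etree.ElementTree as ET

NS = {"m": "http://www.loc.gov/METS/"}

SAMPLE = """
<mets xmlns="http://www.loc.gov/METS/">
  <amdSec>
    <techMD ID="TMD1PREMIS"><mdWrap MDTYPE="PREMIS"><xmlData/></mdWrap></techMD>
    <techMD ID="TMD1MIX"><mdWrap MDTYPE="NISOIMG"><xmlData/></mdWrap></techMD>
  </amdSec>
  <fileSec><fileGrp>
    <file ID="FID1" ADMID="TMD1PREMIS TMD1MIX"/>
  </fileGrp></fileSec>
</mets>
"""


def techmd_by_type(root: ET.Element, file_id: str) -> dict:
    """Group the techMD section IDs referenced by a file under their MDTYPE value."""
    tech = {t.get("ID"): t for t in root.findall(".//m:techMD", NS)}
    file_el = next(f for f in root.findall(".//m:file", NS) if f.get("ID") == file_id)
    grouped = {}
    for ref in file_el.get("ADMID", "").split():
        section = tech.get(ref)
        if section is None:
            continue  # the reference points at something other than techMD
        wrap = section.find("m:mdWrap", NS)
        mdtype = wrap.get("MDTYPE") if wrap is not None else None
        grouped.setdefault(mdtype, []).append(ref)
    return grouped


sections = techmd_by_type(ET.fromstring(SAMPLE), "FID1")
```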
Examples of PREMIS in XML
PREMIS in METS
• Portrait of Louis Armstrong (Library of Congress)
• Peoria County, Illinois aerial photograph (ECHO Depository, UIUC Grainger Engineering Library)
MATHARC implementation
httppigpenlibuchicagoedu8888pigpenuploads13asset_descr_mets_premis_02v2xml
MPEG-21 Digital Item Declaration (DID)
ISO/IEC 21000-2: Digital Item Declaration
A promising alternative to represent Digital Objects
Starting to get supported by some repositories, e.g. aDORe, DSpace, Fedora
A flexible and expressive model that easily represents compound objects (recursive “item”)
Attach well-formed XML from persistent namespaces as metadata
Abstract Model for MPEG-21 DID
[Diagram: a container holding items, components and resources, with descriptor/statement constructs attached]
resource: datastream
component: binding of descriptor/statements to datastreams
item: represents a Digital Item, aka Digital Object, aka asset
descriptor/statement constructs: convey information about the Digital Item
container: grouping of items and descriptor/statement constructs pertaining to the container
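
The abstract model can be pictured as a small set of nested types. The sketch below is an informal rendering of the definitions above, not the DIDL schema: class and field names are hypothetical, and the example tree and its file names are made up.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Descriptor:
    statement: str  # information conveyed about the enclosing construct


@dataclass
class Resource:
    ref: str  # location of the datastream


@dataclass
class Component:
    # Binds descriptor/statements to one or more datastreams.
    resources: List[Resource] = field(default_factory=list)
    descriptors: List[Descriptor] = field(default_factory=list)


@dataclass
class Item:
    # Items are recursive, which is how compound objects are represented.
    items: List["Item"] = field(default_factory=list)
    components: List[Component] = field(default_factory=list)
    descriptors: List[Descriptor] = field(default_factory=list)


@dataclass
class Container:
    items: List[Item] = field(default_factory=list)
    descriptors: List[Descriptor] = field(default_factory=list)


def count_resources(item: Item) -> int:
    """Count datastreams in an item, including those in nested items."""
    own = sum(len(c.resources) for c in item.components)
    return own + sum(count_resources(child) for child in item.items)


# Made-up example: one item with a nested item and three datastreams in total.
inner = Item(components=[Component(resources=[Resource("page2.jpg"), Resource("page2.txt")])])
outer = Item(
    items=[inner],
    components=[Component(resources=[Resource("page1.jpg")])],
    descriptors=[Descriptor("metadata about the whole item")],
)
did = Container(items=[outer])
total = count_resources(outer)
```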
Mapping
[Diagram: PREMIS mapped onto the DID model; a premis element under DIDInfo at the DID level, object1 to object4 each carrying a premis:object, and three resources]
DID level: all rights, events and agents go here. The top-level object goes here. Other objects may be duplicated here or linked here.
Partial Implementation in DID
[Diagram: DID with a premis element under DIDInfo; object1 to object4 carrying individual PREMIS elements (premis:format, premis:significantProperties, premis:creatingApplication); three resources]
When metadata are not sufficient to form the top-level PREMIS elements, partial implementation may be done if PREMIS elements are globally defined
Example of PREMIS in MPEG DID
PREMIS in MPEG DID
• aDORe example (LANL)
Summary: container formats
A container format is needed to package together all forms of metadata (of which PREMIS is one) and digital content
Use of a container is compatible with, and an implementation of, the OAIS information package concept
Co-existence with other types of metadata requires best practices for both approaches; redundancy seems to be preferred
Changes to the next version of the PREMIS XML schemas will facilitate a phased approach to full PREMIS implementation
Development of registries (informal or formal) for controlled vocabularies will benefit implementation
Tools are being developed to facilitate implementation
Summary: METS vs MPEG-21 DID
METS and MPEG DID are similar types of container formats, in that both are expressed in XML, both represent the structure of digital objects, and both include metadata
MPEG DID doesn’t have the segmentation into metadata sections that METS does, so this implementation decision need not be made in DID
METS is open source and developed by open discussion, mainly within the cultural heritage community
MPEG DID is an ISO standard and has industry support, but is often implemented in a proprietary way, and standards development is closed
It would be possible to transform a METS container to an MPEG DID and vice versa; development of stylesheets will enable transformations
Implementers’ panel
What types of objects are you preserving?
Has your institution implemented a preservation repository?
What preservation metadata are you recording?
How are you recording it, e.g. database, METS/XML, other?
Do you plan to exchange preservation metadata with other repositories?
Are you planning to use, or already using, PREMIS?
Which semantic units are most useful?
Which semantic units are least useful?
What difficulties have you had applying PREMIS units?
Implementation issues conformance
Conformance is defined in PREMIS Final Reportbull if you use the name use the definitionbull local metadata can supplement but not modify PREMISbull can define more stringent repeatability and obligation
but not more liberal
Meaning of mandatory bull you have to know it and you have to be able to supply it
if exporting for exchangebull you donrsquot have to record it in repository
Implementation issues need for additional metadata preservation metadata not considered core
bull core = all objects all preservation strategiesbull example of non-core = installation requirements
more detailed information on Rights and Agents
metadata describing Intellectual Entity
format-specific technical metadata
business rules of the repository
information about the metadata itself (eg who obtained or recorded a value when last changed)
XML issues
PREMIS XML schemas
One schema for each PREMIS entity in data modelbull Allows user to choose which parts of PREMIS to use
PREMIS container schemabull References schema for each entity typebull Provides a container if it is desirable to keep some or all
PREMIS metadata together bull If using container requires at least an object which in
turn requires objectIdentifier and objectCategorybull Individual schemas may used alone or with container
Semantic units in PREMIS schemasbull XML is faithful to data dictionarybull Only those units mandatory for all categories of objects
are mandatory in object schema
PREMIS Schemas
Container schema
Object schema
Event schema
Agent schema
Rights schema
Proposed schema changes for new version
Define an abstract object type to allow for better validation of object category (representation file bitstream)
Define main elements globally to allow for reuse Implement an extensibility mechanism to provide for
further structure when needed Implement a mechanism to use controlled vocabularies Adjust schemas to support changes in version 2 of data
dictionary
Implementing PREMIS using XML in METS
METS introduction
METS records the (possibly hierarchical) structure of digital objects the names and locations of the files that comprise those objects and the associated metadata
A METS document may be a unit of storage (eg OAIS AIP) or a transmission format (eg OAIS SIP or DIP)
METS is extensible and modular METS uses extension ldquowrappersrdquo or ldquosocketsrdquo where
elements from other schemas can be plugged in METS uses the XML Schema facility for combining
vocabularies from different Namespaces The METS Editorial Board has endorsed PREMIS as an
extension schema Many institutions trying to use PREMIS within the METS
context
The structure of a METS file
METS
dmdSec
amdSec
behaviorSec
structMap
fileSec file inventory
descriptive metadata
administrative metadata
behaviour metadata
structural map
Inserting technical metadata in a METS Document
ltmetsgt ltamdSecgt lttechMDgt ltmdWrapgt ltxmlDatagt
lt-- insert data from different namespace here --gt ltxmlDatagt ltmdWrapgt lttechMDgt ltamdSecgt ltfileSec gt ltstructMap gt ltmetsgt
Linking in METS Documents(XML IDIDREF links)
DescMDmods
relatedItemrelatedItem
AdminMDtechMDsourceMDdigiprovMDrightsMD
fileGrpfilefile
StructMapdiv div fptr
div fptr
Linking in METS Documents(XML IDIDREF links)
DescMDmods
relatedItemrelatedItem
AdminMDtechMDsourceMDdigiprovMDrightsMD
fileGrpfilefile
StructMapdiv div fptr
div fptr
Linking in METS Documents(XML IDIDREF links)
DescMDmods
relatedItemrelatedItem
AdminMDtechMDsourceMDdigiprovMDrightsMD
fileGrpfilefile
StructMapdiv div fptr
div fptr
Linking in METS Documents(XML IDIDREF links)
DescMDmods
relatedItemrelatedItem
AdminMDtechMDsourceMDdigiprovMDrightsMD
fileGrpfilefile
StructMapdiv div fptr
div fptr
Linking in METS Documents(XML IDIDREF links)
DescMDmods
relatedItemrelatedItem
AdminMDtechMDsourceMDdigiprovMDrightsMD
fileGrpfilefile
StructMapdiv div fptr
div fptr
METS extension schemas
ldquowrappersrdquo or ldquosocketsrdquo where elements from other schemas can be plugged in
Provides extensibility Uses the XML Schema facility for combining vocabularies from
different Namespaces Endorsed extension schemas
bull Descriptive MODS DC MARCXMLbull Technical metadata MIX (image) textMD (text)bull Preservation related PREMIS
Issues in using PREMIS with METS
Which METS sections to use and how many Whether to record elements redundantly in PREMIS that are
defined explicitly in the METS schema How to record elements that are also part of a format
specific technical metadata schema (eg MIX) Recording structural relationships How to deal with locally controlled vocabularies Whether to use the PREMIS container
PREMIS and METS sections
Flexibility of METS requires implementation decisions You canrsquot put all PREMIS metadata directly under amdSec What sections to use for PREMIS metadata
bull Alternative 1bull Object in techMDbull Event in digiProvMDbull Rights in rightsMDbull Agent with event or rights
bull Alternative 2bull Everything in digiProvMD
bull Alternative 3bull Everything in techMD
How many administrative MD sections to use Experimentation will result in best practices
ltfileSecgtltfileGrpgtltfile ID=FID1 SIZE=184302 ADMID=TMD1PREMIS TMD1MIX DP1EVENT
DP1AGENTldquo CHECKSUM=4638bc65c5b9715557d09ad373eefd147382ecbf CHECKSUMTYPE=SHA-1gt
ltFLocat LOCTYPE=OTHER xlinkhref=BXF22JPG gtltfilegtltfileGrpgtltfileSecgtlttechMD ID=TMD1PREMISgt ltmdWrap MDTYPE=PREMISgt ltxmlDatagt
ltpremisobject gt ltobjectCharacteristicsgt ltfixitygt ltmessageDigestAlgorithmgtSHA-1 ltmessageDigestAlgorithmgt ltmessageDigestgt4638bc65c5b9715557d09ad373eefd147382ecbf
ltmessageDigestgt ltmessageDigestOriginatorgtEchoDepmessageDigestOriginatorgt ltfixitygt ltsizegt184302ltsizegt ltobjectCharacteristicsgt
Elements defined in both METS and PREMISbull METS Checksum Checksumtype
bull attribute of ltfilegtbull not repeatable
PREMIS fixitybull also includes messageDigestOriginatorbull allows multiples
ltfileSecgtltfileGrpgtltfile ID=FID1 ADMID=TMD1PREMIS DP1EVENT DP1AGENTldquo
MIMETYPE=imagejpeg ltFLocat LOCTYPE=OTHER xlinkhref=BXF22JPGgtltfilegtltfileGrpgtltfileSecgt
lttechMD ID=TMD1PREMISldquo ltmdWrap MDTYPE=PREMISgt ltxmlDatagt ltpremisobjectgt ltobjectCharacteristicsgt ltformatgt ltformatDesignationgt ltformatNamegtimagejpegltformatNamegt ltformatVersiongt102 ltformatVersiongt ltformatDesignationgtltformatgt ltobjectCharacteristicsgtElements defined both in METS and PREMISbull METS MIMETYPE
bull attribute of ltfilegtbull optional
PREMIS ltformatgt bull more granular includes name and version (although name may be MIMETYPE)bull mandatory
<fileSec> <fileGrp>
  <file ID="FID1" ADMID="TMD1PREMIS TMD1MIX DP1EVENT DP1AGENT">
<techMD ID="TMD1PREMIS">
  <linkingEventIdentifier>
    <linkingEventIdentifierType>ECHODEP Hub Event</linkingEventIdentifierType>
    <linkingEventIdentifierValue>echo12345</linkingEventIdentifierValue>
  </linkingEventIdentifier>
<digiprovMD ID="DP1EVENT">
  <premis:event>
    <eventIdentifier>
      <eventIdentifierType>ECHODEP Hub Event</eventIdentifierType>
      <eventIdentifierValue>echo12345</eventIdentifierValue>
    </eventIdentifier>
    <eventType>ingestion</eventType>
    <eventDateTime>2006-05-02T15:12:53</eventDateTime>
  </premis:event>
Elements defined both in METS and PREMIS
METS ID/IDREF: used to associate metadata in different sections and for different files
PREMIS identifiers: explicit linking between entity types
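The two linking mechanisms can be followed side by side. The sketch below is illustrative only: it resolves the METS `ADMID` references of a file through XML IDs, and separately matches a PREMIS `linkingEventIdentifier` to the event carrying the same identifier type and value. The PREMIS namespace URI, the helper names and the sample document are assumptions; the sample omits the `TMD1MIX` and `DP1AGENT` sections, so those references do not resolve.

```python
import xml.etree.ElementTree as ET

NS = {
    "mets": "http://www.loc.gov/METS/",
    # Assumption: PREMIS version 1 namespace.
    "premis": "http://www.loc.gov/standards/premis/v1",
}

# Hypothetical document built from the slide's example values.
DOC = """<mets:mets xmlns:mets="http://www.loc.gov/METS/"
    xmlns:premis="http://www.loc.gov/standards/premis/v1">
  <mets:amdSec>
    <mets:techMD ID="TMD1PREMIS"><mets:mdWrap MDTYPE="PREMIS"><mets:xmlData>
      <premis:object>
        <premis:linkingEventIdentifier>
          <premis:linkingEventIdentifierType>ECHODEP Hub Event</premis:linkingEventIdentifierType>
          <premis:linkingEventIdentifierValue>echo12345</premis:linkingEventIdentifierValue>
        </premis:linkingEventIdentifier>
      </premis:object>
    </mets:xmlData></mets:mdWrap></mets:techMD>
    <mets:digiprovMD ID="DP1EVENT"><mets:mdWrap MDTYPE="PREMIS"><mets:xmlData>
      <premis:event>
        <premis:eventIdentifier>
          <premis:eventIdentifierType>ECHODEP Hub Event</premis:eventIdentifierType>
          <premis:eventIdentifierValue>echo12345</premis:eventIdentifierValue>
        </premis:eventIdentifier>
        <premis:eventType>ingestion</premis:eventType>
        <premis:eventDateTime>2006-05-02T15:12:53</premis:eventDateTime>
      </premis:event>
    </mets:xmlData></mets:mdWrap></mets:digiprovMD>
  </mets:amdSec>
  <mets:fileSec><mets:fileGrp>
    <mets:file ID="FID1" ADMID="TMD1PREMIS TMD1MIX DP1EVENT DP1AGENT"/>
  </mets:fileGrp></mets:fileSec>
</mets:mets>"""


def text(el, path):
    """Stripped text of the first match for path, or an empty string."""
    return (el.findtext(path, default="", namespaces=NS) or "").strip()


def mets_links(root, file_id):
    """METS mechanism: ADMID values that resolve to an element with that ID."""
    ids = {el.get("ID"): el for el in root.iter() if el.get("ID")}
    return [ref for ref in ids[file_id].get("ADMID", "").split() if ref in ids]


def premis_event_links(root):
    """PREMIS mechanism: match linkingEventIdentifier to eventIdentifier."""
    events = {}
    for ev in root.iterfind(".//premis:event", NS):
        key = (text(ev, "premis:eventIdentifier/premis:eventIdentifierType"),
               text(ev, "premis:eventIdentifier/premis:eventIdentifierValue"))
        events[key] = text(ev, "premis:eventType")
    linked = []
    for link in root.iterfind(".//premis:linkingEventIdentifier", NS):
        key = (text(link, "premis:linkingEventIdentifierType"),
               text(link, "premis:linkingEventIdentifierValue"))
        # None marks a linking identifier with no matching event.
        linked.append((key[1], events.get(key)))
    return linked


root = ET.fromstring(DOC)
```

The METS links only work inside one document, whereas the PREMIS identifiers survive if the object and event records are stored or exchanged separately.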
<structMap TYPE="physical">
  <div ORDER="1" TYPE="text">
    <fptr FILEID="FID9"/>
    <div ORDER="1" TYPE="page" LABEL=" Page [1]"> <fptr FILEID="FID1"/> </div>
    <div ORDER="2" TYPE="page" LABEL=" Page [2]"> <fptr FILEID="FID2"/> </div>
  </div>
<relationship>
  <relationshipType>structural</relationshipType>
  <relationshipSubType>is sibling of</relationshipSubType>
  <relatedObjectIdentification>
    <relatedObjectIdentifierType>UCB</relatedObjectIdentifierType>
    <relatedObjectIdentifierValue>FID2</relatedObjectIdentifierValue>
    <relatedObjectSequence>1</relatedObjectSequence>
Elements defined both in METS and PREMIS
METS structMap
• details structural relationships and is the heart of the METS document
• hierarchical, so may be more expressive than PREMIS semantic units
• links the elements of the structure to content files and metadata
PREMIS <relationship>
• details all kinds of relationships, including structural
• data dictionary says that implementations may record by other means
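If structure is recorded only in the structMap, sibling relationships of the PREMIS kind can still be derived from it when needed. The following sketch shows one way to do that under simple assumptions: every page is a child `div` with a single `fptr`, and the output is a list of plain dictionaries whose keys borrow PREMIS semantic unit names. The function name `sibling_relationships` is hypothetical.

```python
import xml.etree.ElementTree as ET

METS = "{http://www.loc.gov/METS/}"

# structMap taken from the slide, with the METS namespace declared.
STRUCT_MAP = """<mets:structMap xmlns:mets="http://www.loc.gov/METS/" TYPE="physical">
  <mets:div ORDER="1" TYPE="text">
    <mets:fptr FILEID="FID9"/>
    <mets:div ORDER="1" TYPE="page" LABEL="Page [1]"><mets:fptr FILEID="FID1"/></mets:div>
    <mets:div ORDER="2" TYPE="page" LABEL="Page [2]"><mets:fptr FILEID="FID2"/></mets:div>
  </mets:div>
</mets:structMap>"""


def sibling_relationships(struct_map_xml):
    """Derive 'is sibling of' relationships between files whose divs
    share the same parent div in the structMap."""
    root = ET.fromstring(struct_map_xml)
    rels = []
    for parent in root.iter(METS + "div"):
        # Direct child divs only, in their declared ORDER.
        children = sorted(parent.findall(METS + "div"),
                          key=lambda d: int(d.get("ORDER", "0")))
        files = [fp.get("FILEID") for d in children for fp in d.findall(METS + "fptr")]
        for subject in files:
            for other in files:
                if other != subject:
                    rels.append({
                        "object": subject,
                        "relationshipType": "structural",
                        "relationshipSubType": "is sibling of",
                        "relatedObjectIdentifierValue": other,
                    })
    return rels


RELATIONSHIPS = sibling_relationships(STRUCT_MAP)
```

The reverse direction loses information: flat sibling statements do not carry the hierarchy that the nested `div` elements express.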
Should semantic units be recorded redundantly?
Various options are possible when there is overlap between PREMIS and METS, or PREMIS and other technical metadata schemas
• Record only in METS
• Record only in PREMIS
• Record in both
Are there advantages in using PREMIS semantic units? Is it important to keep PREMIS metadata together as a unit?
There may be an advantage for reuse and maintenance purposes
How to record elements from 2 different technical metadata schemas?
Format-specific metadata may be included in addition to PREMIS general technical metadata
Use multiple techMD sections and specify the source in the MDTYPE attribute and/or namespace declaration
• e.g. MDTYPE="NISOIMG" or "PREMIS"
• Give MIX schema declaration in METS document
MIX was recently revised to correspond with the revision of the Z39.87 technical metadata for digital still images standard; names harmonized with corresponding PREMIS semantic units
For digital still images, best practice may be to use PREMIS for general semantic units defined in PREMIS and MIX for format-specific units, without redundancy
Examples of PREMIS in XML
PREMIS in METS
• Portrait of Louis Armstrong (Library of Congress)
• Peoria County, Illinois aerial photograph (ECHO Depository, UIUC Grainger Engineering Library)
MATHARC implementation
httppigpenlibuchicagoedu8888pigpenuploads13asset_descr_mets_premis_02v2xml
MPEG-21 Digital Item Declaration (DID)
ISO/IEC 21000-2: Digital Item Declaration
A promising alternative to represent Digital Objects
Starting to get supported by some repositories, e.g. aDORe, DSpace, Fedora
A flexible and expressive model that easily represents compound objects (recursive "item")
Attach well-formed XML from persistent namespaces as metadata
Abstract Model for MPEG-21 DID
[Figure: abstract model of a DID. Labels: container, item, component, resource, descriptor/statement.]
resource: datastream
component: binding of descriptor/statements to datastreams
item: represents a Digital Item, aka Digital Object, aka asset; descriptor/statement constructs convey information about the Digital Item
container: grouping of items and descriptor/statement constructs pertaining to the container
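The nesting in the abstract model can be pictured as a small set of recursive data types. This is only a sketch of the four terms defined above, using hypothetical Python class and field names; it is not the MPEG-21 schema itself, and the example values are invented for illustration.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Descriptor:
    """A descriptor/statement construct: metadata about its parent."""
    statement: str


@dataclass
class Resource:
    """A datastream, identified here by a reference string."""
    ref: str


@dataclass
class Component:
    """Binds descriptor/statements to one or more datastreams."""
    resources: List[Resource] = field(default_factory=list)
    descriptors: List[Descriptor] = field(default_factory=list)


@dataclass
class Item:
    """A Digital Item; items may nest, which models compound objects."""
    descriptors: List[Descriptor] = field(default_factory=list)
    components: List[Component] = field(default_factory=list)
    items: List["Item"] = field(default_factory=list)


@dataclass
class Container:
    """Groups items and descriptor/statements about the container."""
    descriptors: List[Descriptor] = field(default_factory=list)
    items: List[Item] = field(default_factory=list)


def resource_refs(item):
    """Collect datastream references from an item and its sub-items."""
    refs = [r.ref for c in item.components for r in c.resources]
    for child in item.items:
        refs.extend(resource_refs(child))
    return refs


# Hypothetical two-page object; each page carries its own metadata statement.
page1 = Item(descriptors=[Descriptor("metadata for page 1")],
             components=[Component(resources=[Resource("FID1")])])
page2 = Item(descriptors=[Descriptor("metadata for page 2")],
             components=[Component(resources=[Resource("FID2")])])
book = Item(descriptors=[Descriptor("metadata for the whole object")],
            items=[page1, page2])
package = Container(items=[book])
```

Because `Item` contains a list of `Item`, a compound object of any depth fits the same structure, which is the "recursive item" property the slides point to.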
Mapping
[Figure: mapping of PREMIS onto the DID model. Labels: DID, DIDInfo, premis, premis:object, object1 to object4, resource.]
Callout on the figure: All rights, events and agents go here. The top level object goes here. Other objects may be duplicated here or linked here.
Partial Implementation in DID
[Figure: partial implementation of PREMIS in a DID. Labels: DID, DIDInfo, premis, premis:format, premis:significantProperties, premis:creatingApplication, object1 to object4, resource.]
Callout on the figure: When metadata are not sufficient to form the top level PREMIS elements, partial implementation may be done if PREMIS elements are globally defined.
Example of PREMIS in MPEG DID
PREMIS in MPEG DID
• aDORe example (LANL)
Summary: container formats
A container format is needed to package together all forms of metadata (of which PREMIS is one) and digital content
Use of a container is compatible with, and an implementation of, the OAIS information package concept
Co-existence with other types of metadata requires best practices for both approaches; redundancy seems to be preferred
Changes to the next version of the PREMIS XML schemas will facilitate a phased approach to full PREMIS implementation
Development of registries (informal or formal) for controlled vocabularies will benefit implementation
Tools are being developed to facilitate implementation
Summary: METS vs MPEG-21 DID
METS and MPEG DID are similar types of container formats, in that both are expressed in XML, both represent the structure of digital objects, and both include metadata
MPEG DID doesn't have the segmentation into metadata sections that METS does, so this implementation decision need not be made in DID
METS is open source and developed through open discussion, mainly in the cultural heritage community
MPEG DID is an ISO standard and has industry support, but is often implemented in a proprietary way, and standards development is closed
It would be possible to transform a METS container to an MPEG DID and vice versa; development of stylesheets will enable transformations
Implementers' panel
What types of objects are you preserving?
Has your institution implemented a preservation repository?
What preservation metadata are you recording?
How are you recording it (e.g. database, METS/XML, other)?
Do you plan to exchange preservation metadata with other repositories?
Are you planning to use PREMIS, or already using it?
Which semantic units are most useful?
Which semantic units are least useful?
What difficulties have you had applying PREMIS units?
Implementation issues: need for additional metadata
Preservation metadata not considered core
• core = all objects, all preservation strategies
• example of non-core = installation requirements
More detailed information on Rights and Agents
Metadata describing Intellectual Entity
Format-specific technical metadata
Business rules of the repository
Information about the metadata itself (e.g. who obtained or recorded a value, when last changed)
XML issues
PREMIS XML schemas
One schema for each PREMIS entity in data model
• Allows user to choose which parts of PREMIS to use
PREMIS container schema
• References schema for each entity type
• Provides a container if it is desirable to keep some or all PREMIS metadata together
• If using container, requires at least an object, which in turn requires objectIdentifier and objectCategory
• Individual schemas may be used alone or with container
Semantic units in PREMIS schemas
• XML is faithful to data dictionary
• Only those units mandatory for all categories of objects are mandatory in object schema
PREMIS Schemas
Container schema
Object schema
Event schema
Agent schema
Rights schema
Proposed schema changes for new version
Define an abstract object type to allow for better validation of object category (representation, file, bitstream)
Define main elements globally to allow for reuse
Implement an extensibility mechanism to provide for further structure when needed
Implement a mechanism to use controlled vocabularies
Adjust schemas to support changes in version 2 of data dictionary
Implementing PREMIS using XML in METS
METS introduction
METS records the (possibly hierarchical) structure of digital objects, the names and locations of the files that comprise those objects, and the associated metadata
A METS document may be a unit of storage (e.g. OAIS AIP) or a transmission format (e.g. OAIS SIP or DIP)
METS is extensible and modular
METS uses extension "wrappers" or "sockets" where elements from other schemas can be plugged in
METS uses the XML Schema facility for combining vocabularies from different Namespaces
The METS Editorial Board has endorsed PREMIS as an extension schema
Many institutions are trying to use PREMIS within the METS context
The structure of a METS file
METS
• dmdSec: descriptive metadata
• amdSec: administrative metadata
• behaviorSec: behaviour metadata
• structMap: structural map
• fileSec: file inventory
Inserting technical metadata in a METS Document
<mets>
  <amdSec>
    <techMD>
      <mdWrap>
        <xmlData>
          <!-- insert data from different namespace here -->
        </xmlData>
      </mdWrap>
    </techMD>
  </amdSec>
  <fileSec/>
  <structMap/>
</mets>
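The same skeleton can be produced programmatically. The sketch below builds it with Python's standard `xml.etree.ElementTree` and drops a PREMIS object into the `xmlData` socket. The PREMIS namespace URI is an assumption (version 1), and the helper names `q` and `wrap_in_techmd` are hypothetical; the result is a bare skeleton and would need further content before it validates against the METS schema.

```python
import xml.etree.ElementTree as ET

METS_NS = "http://www.loc.gov/METS/"
# Assumption: PREMIS version 1 namespace; change to match the schema in use.
PREMIS_NS = "http://www.loc.gov/standards/premis/v1"

# Prefixes used when the tree is serialized.
ET.register_namespace("mets", METS_NS)
ET.register_namespace("premis", PREMIS_NS)


def q(ns, tag):
    """Qualified tag name in ElementTree's {namespace}tag notation."""
    return "{%s}%s" % (ns, tag)


def wrap_in_techmd(payload, section_id, mdtype):
    """Build the skeleton from the slide and plug payload into xmlData."""
    mets = ET.Element(q(METS_NS, "mets"))
    amd = ET.SubElement(mets, q(METS_NS, "amdSec"))
    tech = ET.SubElement(amd, q(METS_NS, "techMD"), {"ID": section_id})
    wrap = ET.SubElement(tech, q(METS_NS, "mdWrap"), {"MDTYPE": mdtype})
    xml_data = ET.SubElement(wrap, q(METS_NS, "xmlData"))
    xml_data.append(payload)  # data from the other namespace goes here
    ET.SubElement(mets, q(METS_NS, "fileSec"))
    ET.SubElement(mets, q(METS_NS, "structMap"))
    return mets


# A minimal PREMIS payload using the size value from the earlier example.
premis_object = ET.Element(q(PREMIS_NS, "object"))
chars = ET.SubElement(premis_object, q(PREMIS_NS, "objectCharacteristics"))
ET.SubElement(chars, q(PREMIS_NS, "size")).text = "184302"

document = wrap_in_techmd(premis_object, "TMD1PREMIS", "PREMIS")
serialized = ET.tostring(document, encoding="unicode")
```

The `ID` on `techMD` is what a `<file>` element's `ADMID` attribute would point to, so the wrapper and the linking mechanism are usually set up together.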
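Linking in METS Documents (XML ID/IDREF links)
[Figure: ID/IDREF links between the sections of a METS document.]
• DescMD: mods, relatedItem, relatedItem
• AdminMD: techMD, sourceMD, digiprovMD, rightsMD
• fileGrp: file, file
• StructMap: div containing div (fptr) and div (fptr)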
METS extension schemas
"Wrappers" or "sockets" where elements from other schemas can be plugged in
Provides extensibility
Uses the XML Schema facility for combining vocabularies from different Namespaces
Endorsed extension schemas
• Descriptive: MODS, DC, MARCXML
• Technical metadata: MIX (image), textMD (text)
• Preservation related: PREMIS
Issues in using PREMIS with METS
Which METS sections to use, and how many
Whether to record elements redundantly in PREMIS that are defined explicitly in the METS schema
How to record elements that are also part of a format-specific technical metadata schema (e.g. MIX)
Recording structural relationships
How to deal with locally controlled vocabularies
Whether to use the PREMIS container
PREMIS and METS sections
Flexibility of METS requires implementation decisions
You can't put all PREMIS metadata directly under amdSec
What sections to use for PREMIS metadata?
• Alternative 1
  • Object in techMD
  • Event in digiprovMD
  • Rights in rightsMD
  • Agent with event or rights
• Alternative 2
  • Everything in digiprovMD
• Alternative 3
  • Everything in techMD
How many administrative MD sections to use?
Experimentation will result in best practices
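The three alternatives amount to a lookup from PREMIS entity type to METS section. A small sketch of that lookup follows; the table name `PLACEMENT`, the function `section_for` and its `agent_with` parameter are hypothetical, and the section names follow the METS element spelling `digiprovMD`.

```python
# Section chosen for each PREMIS entity under the three alternatives.
PLACEMENT = {
    1: {"object": "techMD", "event": "digiprovMD", "rights": "rightsMD"},
    2: {"object": "digiprovMD", "event": "digiprovMD",
        "rights": "digiprovMD", "agent": "digiprovMD"},
    3: {"object": "techMD", "event": "techMD",
        "rights": "techMD", "agent": "techMD"},
}


def section_for(entity, alternative, agent_with="event"):
    """Return the METS section for a PREMIS entity type.

    Under Alternative 1 an agent has no section of its own; it travels
    with the event or the rights statement it relates to, which the
    caller chooses through agent_with.
    """
    table = PLACEMENT[alternative]
    if entity == "agent" and alternative == 1:
        if agent_with not in ("event", "rights"):
            raise ValueError("agent_with must be 'event' or 'rights'")
        return table[agent_with]
    return table[entity]
```

Keeping the choice in one table makes it easy to switch alternatives while experimenting, which is how the slides expect best practice to emerge.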
XML issues
PREMIS XML schemas
One schema for each PREMIS entity in data modelbull Allows user to choose which parts of PREMIS to use
PREMIS container schemabull References schema for each entity typebull Provides a container if it is desirable to keep some or all
PREMIS metadata together bull If using container requires at least an object which in
turn requires objectIdentifier and objectCategorybull Individual schemas may used alone or with container
Semantic units in PREMIS schemasbull XML is faithful to data dictionarybull Only those units mandatory for all categories of objects
are mandatory in object schema
PREMIS Schemas
Container schema
Object schema
Event schema
Agent schema
Rights schema
Proposed schema changes for new version
Define an abstract object type to allow for better validation of object category (representation file bitstream)
Define main elements globally to allow for reuse Implement an extensibility mechanism to provide for
further structure when needed Implement a mechanism to use controlled vocabularies Adjust schemas to support changes in version 2 of data
dictionary
Implementing PREMIS using XML in METS
METS introduction
METS records the (possibly hierarchical) structure of digital objects the names and locations of the files that comprise those objects and the associated metadata
A METS document may be a unit of storage (eg OAIS AIP) or a transmission format (eg OAIS SIP or DIP)
METS is extensible and modular METS uses extension ldquowrappersrdquo or ldquosocketsrdquo where
elements from other schemas can be plugged in METS uses the XML Schema facility for combining
vocabularies from different Namespaces The METS Editorial Board has endorsed PREMIS as an
extension schema Many institutions trying to use PREMIS within the METS
context
The structure of a METS file
METS
dmdSec
amdSec
behaviorSec
structMap
fileSec file inventory
descriptive metadata
administrative metadata
behaviour metadata
structural map
Inserting technical metadata in a METS Document
ltmetsgt ltamdSecgt lttechMDgt ltmdWrapgt ltxmlDatagt
lt-- insert data from different namespace here --gt ltxmlDatagt ltmdWrapgt lttechMDgt ltamdSecgt ltfileSec gt ltstructMap gt ltmetsgt
Linking in METS Documents(XML IDIDREF links)
DescMDmods
relatedItemrelatedItem
AdminMDtechMDsourceMDdigiprovMDrightsMD
fileGrpfilefile
StructMapdiv div fptr
div fptr
Linking in METS Documents(XML IDIDREF links)
DescMDmods
relatedItemrelatedItem
AdminMDtechMDsourceMDdigiprovMDrightsMD
fileGrpfilefile
StructMapdiv div fptr
div fptr
Linking in METS Documents(XML IDIDREF links)
DescMDmods
relatedItemrelatedItem
AdminMDtechMDsourceMDdigiprovMDrightsMD
fileGrpfilefile
StructMapdiv div fptr
div fptr
Linking in METS Documents(XML IDIDREF links)
DescMDmods
relatedItemrelatedItem
AdminMDtechMDsourceMDdigiprovMDrightsMD
fileGrpfilefile
StructMapdiv div fptr
div fptr
Linking in METS Documents(XML IDIDREF links)
DescMDmods
relatedItemrelatedItem
AdminMDtechMDsourceMDdigiprovMDrightsMD
fileGrpfilefile
StructMapdiv div fptr
div fptr
METS extension schemas
ldquowrappersrdquo or ldquosocketsrdquo where elements from other schemas can be plugged in
Provides extensibility Uses the XML Schema facility for combining vocabularies from
different Namespaces Endorsed extension schemas
bull Descriptive MODS DC MARCXMLbull Technical metadata MIX (image) textMD (text)bull Preservation related PREMIS
Issues in using PREMIS with METS
Which METS sections to use and how many Whether to record elements redundantly in PREMIS that are
defined explicitly in the METS schema How to record elements that are also part of a format
specific technical metadata schema (eg MIX) Recording structural relationships How to deal with locally controlled vocabularies Whether to use the PREMIS container
PREMIS and METS sections
Flexibility of METS requires implementation decisions You canrsquot put all PREMIS metadata directly under amdSec What sections to use for PREMIS metadata
bull Alternative 1bull Object in techMDbull Event in digiProvMDbull Rights in rightsMDbull Agent with event or rights
bull Alternative 2bull Everything in digiProvMD
bull Alternative 3bull Everything in techMD
How many administrative MD sections to use Experimentation will result in best practices
ltfileSecgtltfileGrpgtltfile ID=FID1 SIZE=184302 ADMID=TMD1PREMIS TMD1MIX DP1EVENT
DP1AGENTldquo CHECKSUM=4638bc65c5b9715557d09ad373eefd147382ecbf CHECKSUMTYPE=SHA-1gt
ltFLocat LOCTYPE=OTHER xlinkhref=BXF22JPG gtltfilegtltfileGrpgtltfileSecgtlttechMD ID=TMD1PREMISgt ltmdWrap MDTYPE=PREMISgt ltxmlDatagt
ltpremisobject gt ltobjectCharacteristicsgt ltfixitygt ltmessageDigestAlgorithmgtSHA-1 ltmessageDigestAlgorithmgt ltmessageDigestgt4638bc65c5b9715557d09ad373eefd147382ecbf
ltmessageDigestgt ltmessageDigestOriginatorgtEchoDepmessageDigestOriginatorgt ltfixitygt ltsizegt184302ltsizegt ltobjectCharacteristicsgt
Elements defined in both METS and PREMISbull METS Checksum Checksumtype
bull attribute of ltfilegtbull not repeatable
PREMIS fixitybull also includes messageDigestOriginatorbull allows multiples
ltfileSecgtltfileGrpgtltfile ID=FID1 ADMID=TMD1PREMIS DP1EVENT DP1AGENTldquo
MIMETYPE=imagejpeg ltFLocat LOCTYPE=OTHER xlinkhref=BXF22JPGgtltfilegtltfileGrpgtltfileSecgt
lttechMD ID=TMD1PREMISldquo ltmdWrap MDTYPE=PREMISgt ltxmlDatagt ltpremisobjectgt ltobjectCharacteristicsgt ltformatgt ltformatDesignationgt ltformatNamegtimagejpegltformatNamegt ltformatVersiongt102 ltformatVersiongt ltformatDesignationgtltformatgt ltobjectCharacteristicsgtElements defined both in METS and PREMISbull METS MIMETYPE
bull attribute of ltfilegtbull optional
PREMIS ltformatgt bull more granular includes name and version (although name may be MIMETYPE)bull mandatory
ltfileSecgt ltfileGrpgt ltfile ID=FID1 ADMID=TMD1PREMIS TMD1MIX DP1EVENT DP1AGENTgtlttechMD ID=TMD1PREMISgt ltlinkingEventIdentifiergt ltlinkingEventIdentifierTypegtECHODEP Hub Event ltlinkingEventIdentifierTypegt ltlinkingEventIdentifierValuegtecho12345ltlinkingEventIdentifierValuegt ltlinkingEventIdentifiergtltdigiprovMD ID=DP1EVENTgt ltpremiseventgt lteventIdentifiergt lteventIdentifierTypegtECHODEP Hub EventlteventIdentifierTypegt lteventIdentifierValuegtecho12345 lteventIdentifierValuegt lteventIdentifiergt lteventTypegtingestionlteventTypegt lteventDateTimegt2006-05-02T151253 lteventDateTimegtlteventgt
Elements defined both in METS and PREMIS METS IDIdref used to associate metadata in different sections and for different
files PREMIS identifiers explicit linking between entity types
ltstructMap TYPE=ldquophysicalrdquogt ltdiv ORDER=1 TYPE=textgt ltfptr FILEID=FID9gt ltdiv ORDER=1 TYPE=page LABEL= Page [1]gt ltfptr FILEID=FID1gtltmetsdivgt ltdiv ORDER=2 TYPE=page LABEL= Page [2]gt ltfptr FILEID=FID2gtltmetsdivgt ltdivgt
ltrelationshipgt ltrelationshipTypegtstructuralltrelationshipTypegt ltrelationshipSubTypegtis sibling of ltrelationshipSubTypegt ltrelatedObjectIdentificationgt ltrelatedObjectIdentifierTypegtUCBltrelatedObjectIdentifierTypegt ltrelatedObjectIdentifierValuegtFID2ltrelatedObjectIdentifierValuegt ltrelatedObjectSequencegt1ltrelatedObjectSequencegt
Elements defined both in METS and PREMIS METS structMap
bull details structural relationships and is the heart of the METS documentbull hierarchical so may be more expressive than PREMIS semantic unitsbull links the elements of the structure to content files and metadata
PREMIS ltrelationshipgt bull details all kinds of relationships including structuralbull data dictionary says that implementations may record by other means
Should semantic units be recorded redundantly
Various options are possible when there is overlap between PREMIS and METS or PREMIS and other technical metadata schemasbull Record only in METSbull Record only in PREMISbull Record in both
Are there advantages in using PREMIS semantic units Is it important to keep PREMIS metadata together as a unit
There may be an advantage for reuse and maintenance purposes
How to record elements from 2 different technical metadata schemas
Format specific metadata may be included in addition to PREMIS general technical metadata
Use multiple techMD sections and specify source in MDType attribute andor namespace declarationbull eg MDTYPE=ldquoNISOIMGrdquo or ldquoPREMISrdquobull Give MIX schema declaration in METS document
MIX was recently revised to correspond with the revision of the Z3987 technical metadata for digital still images standard names harmonized with corresponding PREMIS semantic units
For digital still images best practice may be to use PREMIS for general semantic units defined in PREMIS and MIX for format specific units without redundancy
Examples of PREMIS in XML
PREMIS in METSbull Portrait of Louis Armstrong (Library of Congress)bull Peoria County Illinois aerial photograph (ECHO
Depository UIUC Grainger Engineering Library) MATHARC implementation
httppigpenlibuchicagoedu8888pigpenuploads13asset_descr_mets_premis_02v2xml
MPEG-21 Digital Item Declaration (DID)
ISOIEC 21000-2 Digital Item Declaration A promising alternative to represent Digital Objects Starting to get supported by some repositories eg
aDORe DSpace Fedora A flexible and expressive model that easily represents
compound objects (recursive ldquoitemrdquo) Attach well-formed XML from persistent namespaces as
metadata
Abstract Model for MPEG-21 DID
resource resource resource
component component
descriptorstatement
descriptorstatement
descriptorstatement
descriptorstatement
item
item
container
resource datastream
component binding of descriptorstatements to datastreams
item represents a Digital Item aka Digital Object aka asset Descriptorstatement constructs convey information about the Digital Item
container grouping of items and descriptorstatement constructs pertaining to the container
Mapping
resource resource resource
object3 object4
premis object
premisobject
premispremis
DIDInfo
object2
object1
DIDAll rights events and agents go here The top level object goes here Other
objects may be duplicated here or linked here
premis object
Partial Implementation in DID
resource resource resource
object3 object4
premis format
premissignificantProperties
premispremis
DIDInfo
object2
object1
DIDWhen metadata are not sufficient to form
the top level PREMIS elements partial implementation may be done if PREMIS
elements are globally defined
premis creatingApplication
Example of PREMIS in MPEG DID
PREMIS in MPEG DIDbull aDORe example (LANL)
Summary container formats
A container format is needed to package together all forms of metadata (of which PREMIS is one) and digital content
Use of a container is compatible with and an implementation of the OAIS information package concept
Co-existence with other types of metadata requires best practices for both approaches redundancy seems to be preferred
Changes to the next version of the PREMIS XML schemas will facilitate a phased approach to full PREMIS implementation
Development of registries (informal or formal) for controlled vocabularies will benefit implementation
Tools are being developed to facilitate implementation
Summary METS vs MPEG 21 DID
METS and MPEG DID are similar types of container formats in that both are expressed in XML both represent the structure of digital objects and both include metadata
MPEG DID doesnrsquot have the segmentation in metadata sections that METS does so this implementation decision need not be made in DID
METS is open source and developed by open discussion mainly cultural heritage community
MPEG DID is an ISO standard and has industry support but is often implemented in a proprietary way and standards development is closed
It would be possible to transform a METS container to a MPEG DID and vice versa development of stylesheets will enable transformations
Implementersrsquo panel
What types of objects are you preserving Has your institution implemented a preservation repository What preservation metadata are you recording How are you recording it eg database METSXML other Do you plan to exchange preservation metadata with other
repositories Are you planning to or already using PREMIS Which semantic units are most useful Which semantic units are least useful What difficulties have you had applying PREMIS units
PREMIS XML schemas
One schema for each PREMIS entity in data modelbull Allows user to choose which parts of PREMIS to use
PREMIS container schemabull References schema for each entity typebull Provides a container if it is desirable to keep some or all
PREMIS metadata together bull If using container requires at least an object which in
turn requires objectIdentifier and objectCategorybull Individual schemas may used alone or with container
Semantic units in PREMIS schemasbull XML is faithful to data dictionarybull Only those units mandatory for all categories of objects
are mandatory in object schema
PREMIS Schemas
Container schema
Object schema
Event schema
Agent schema
Rights schema
Proposed schema changes for new version
Define an abstract object type to allow for better validation of object category (representation file bitstream)
Define main elements globally to allow for reuse Implement an extensibility mechanism to provide for
further structure when needed Implement a mechanism to use controlled vocabularies Adjust schemas to support changes in version 2 of data
dictionary
Implementing PREMIS using XML in METS
METS introduction
METS records the (possibly hierarchical) structure of digital objects, the names and locations of the files that comprise those objects, and the associated metadata
A METS document may be a unit of storage (e.g. OAIS AIP) or a transmission format (e.g. OAIS SIP or DIP)
METS is extensible and modular
METS uses extension “wrappers” or “sockets” where elements from other schemas can be plugged in
METS uses the XML Schema facility for combining vocabularies from different namespaces
The METS Editorial Board has endorsed PREMIS as an extension schema
Many institutions are trying to use PREMIS within the METS context
The structure of a METS file
METS
  dmdSec: descriptive metadata
  amdSec: administrative metadata
  fileSec: file inventory
  structMap: structural map
  behaviorSec: behaviour metadata
Inserting technical metadata in a METS Document
<mets>
  <amdSec>
    <techMD>
      <mdWrap>
        <xmlData>
          <!-- insert data from different namespace here -->
        </xmlData>
      </mdWrap>
    </techMD>
  </amdSec>
  <fileSec/>
  <structMap/>
</mets>
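The wrapper nesting on this slide can also be produced programmatically. This is a sketch using Python's standard library; the payload element, its `urn:example:tech` namespace, the `TMD1` identifier and the helper names are hypothetical stand-ins for real PREMIS or MIX content.

```python
import xml.etree.ElementTree as ET

METS_NS = "http://www.loc.gov/METS/"
ET.register_namespace("mets", METS_NS)


def mets_tag(name):
    """Qualify a local element name with the METS namespace."""
    return f"{{{METS_NS}}}{name}"


def wrap_in_tech_md(payload, md_id, md_type):
    """Build mets/amdSec/techMD/mdWrap/xmlData around a foreign-namespace payload."""
    mets = ET.Element(mets_tag("mets"))
    amd_sec = ET.SubElement(mets, mets_tag("amdSec"))
    tech_md = ET.SubElement(amd_sec, mets_tag("techMD"), {"ID": md_id})
    md_wrap = ET.SubElement(tech_md, mets_tag("mdWrap"), {"MDTYPE": md_type})
    xml_data = ET.SubElement(md_wrap, mets_tag("xmlData"))
    xml_data.append(payload)
    # Empty placeholders, matching the skeleton shown on the slide.
    ET.SubElement(mets, mets_tag("fileSec"))
    ET.SubElement(mets, mets_tag("structMap"))
    return mets


# Hypothetical payload in a made-up namespace, standing in for PREMIS or MIX.
payload = ET.Element("{urn:example:tech}characteristics")
payload.text = "placeholder"
document = wrap_in_tech_md(payload, md_id="TMD1", md_type="OTHER")
print(ET.tostring(document, encoding="unicode"))
```

The result is a skeleton only; a complete METS document needs populated fileSec and structMap sections before it will validate.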
Linking in METS Documents (XML ID/IDREF links)
[Diagram: ID/IDREF links among DescMD (mods, relatedItem), AdminMD (techMD, sourceMD, digiprovMD, rightsMD), fileGrp (file) and StructMap (div, fptr)]
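ID/IDREF linking means that each token in a file's ADMID attribute should match the ID of some metadata section in the same document. A small sketch of that lookup follows; the sample is deliberately stripped down (the techMD and digiprovMD elements are left empty) and omits DP1AGENT so that a dangling reference shows up. The function name `resolve_admid` is an assumption.

```python
import xml.etree.ElementTree as ET

METS_NS = "http://www.loc.gov/METS/"

# Stripped-down sample: real techMD/digiprovMD sections would wrap or reference metadata.
SAMPLE = f"""
<mets xmlns="{METS_NS}">
  <amdSec>
    <techMD ID="TMD1PREMIS"/>
    <digiprovMD ID="DP1EVENT"/>
  </amdSec>
  <fileSec>
    <fileGrp>
      <file ID="FID1" ADMID="TMD1PREMIS DP1EVENT DP1AGENT"/>
    </fileGrp>
  </fileSec>
</mets>
"""


def resolve_admid(xml_text):
    """Map each file ID to {ADMID token: True if some element carries that ID}."""
    root = ET.fromstring(xml_text)
    ids_present = {element.get("ID") for element in root.iter() if element.get("ID")}
    resolved = {}
    for file_element in root.iterfind(".//m:file", {"m": METS_NS}):
        tokens = file_element.get("ADMID", "").split()
        resolved[file_element.get("ID")] = {token: token in ids_present for token in tokens}
    return resolved


print(resolve_admid(SAMPLE))
```

Here DP1AGENT resolves to False because no section with that ID exists in the sample, which is the kind of broken link an ingest check might want to report.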
METS extension schemas
“Wrappers” or “sockets” where elements from other schemas can be plugged in
Provides extensibility
Uses the XML Schema facility for combining vocabularies from different namespaces
Endorsed extension schemas
  • Descriptive: MODS, DC, MARCXML
  • Technical metadata: MIX (image), textMD (text)
  • Preservation related: PREMIS
Issues in using PREMIS with METS
Which METS sections to use and how many
Whether to record elements redundantly in PREMIS that are defined explicitly in the METS schema
How to record elements that are also part of a format-specific technical metadata schema (e.g. MIX)
Recording structural relationships
How to deal with locally controlled vocabularies
Whether to use the PREMIS container
PREMIS and METS sections
Flexibility of METS requires implementation decisions
You can’t put all PREMIS metadata directly under amdSec
What sections to use for PREMIS metadata
  • Alternative 1
      • Object in techMD
      • Event in digiprovMD
      • Rights in rightsMD
      • Agent with event or rights
  • Alternative 2
      • Everything in digiprovMD
  • Alternative 3
      • Everything in techMD
How many administrative MD sections to use
Experimentation will result in best practices
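The three alternatives amount to a lookup from PREMIS entity to METS administrative section. The table below is a sketch of that decision as data; the names `ALTERNATIVES` and `section_for`, and the `agent_used_by` parameter, are illustrative assumptions rather than part of either standard.

```python
# Entity-to-section tables for the three alternatives listed on the slide.
ALTERNATIVES = {
    1: {"object": "techMD", "event": "digiprovMD", "rights": "rightsMD"},
    2: {entity: "digiprovMD" for entity in ("object", "event", "rights", "agent")},
    3: {entity: "techMD" for entity in ("object", "event", "rights", "agent")},
}


def section_for(entity, alternative, agent_used_by="event"):
    """Return the METS section for a PREMIS entity under the chosen alternative.

    Under alternative 1 an agent travels with the event or rights statement
    that refers to it, so the caller says which one via agent_used_by.
    """
    table = ALTERNATIVES[alternative]
    if alternative == 1 and entity == "agent":
        return table[agent_used_by]
    return table[entity]


print(section_for("agent", 1, agent_used_by="rights"))  # prints rightsMD
```

Keeping the choice in one table makes it easier to switch alternatives later, which matters while best practice is still being worked out.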
Elements defined in both METS and PREMIS
<fileSec>
  <fileGrp>
    <file ID="FID1" SIZE="184302" ADMID="TMD1PREMIS TMD1MIX DP1EVENT DP1AGENT"
          CHECKSUM="4638bc65c5b9715557d09ad373eefd147382ecbf" CHECKSUMTYPE="SHA-1">
      <FLocat LOCTYPE="OTHER" xlink:href="BXF22.JPG"/>
    </file>
  </fileGrp>
</fileSec>
<techMD ID="TMD1PREMIS">
  <mdWrap MDTYPE="PREMIS">
    <xmlData>
      <premis:object>
        <objectCharacteristics>
          <fixity>
            <messageDigestAlgorithm>SHA-1</messageDigestAlgorithm>
            <messageDigest>4638bc65c5b9715557d09ad373eefd147382ecbf</messageDigest>
            <messageDigestOriginator>EchoDep</messageDigestOriginator>
          </fixity>
          <size>184302</size>
        </objectCharacteristics>
  • METS CHECKSUM, CHECKSUMTYPE
      • attribute of <file>
      • not repeatable
  • PREMIS fixity
      • also includes messageDigestOriginator
      • allows multiples
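When the digest is recorded both as a METS CHECKSUM attribute and as one or more PREMIS fixity blocks, the copies can drift apart. The sketch below recomputes the digest from the file bytes and compares it with every recorded value; the function names and the report layout are assumptions, and the algorithm labels are normalised (for example "SHA-1" to "sha1") to the names Python's hashlib accepts.

```python
import hashlib


def digest(data, algorithm):
    """Hex digest of data, accepting labels such as 'SHA-1' or 'SHA-256'."""
    name = algorithm.lower().replace("-", "")
    return hashlib.new(name, data).hexdigest()


def fixity_report(data, mets_checksum, mets_checksum_type, premis_fixity):
    """Compare recomputed digests with the METS attribute and each PREMIS fixity pair.

    premis_fixity is a list of (messageDigestAlgorithm, messageDigest) pairs,
    since PREMIS allows fixity to repeat while METS CHECKSUM does not.
    """
    report = {"METS": digest(data, mets_checksum_type) == mets_checksum.lower()}
    for algorithm, message_digest in premis_fixity:
        report[f"PREMIS {algorithm}"] = digest(data, algorithm) == message_digest.lower()
    return report


content = b"abc"  # stand-in for the bytes of a content file
print(fixity_report(
    content,
    "a9993e364706816aba3e25717850c26c9cd0d89d",
    "SHA-1",
    [("SHA-1", "a9993e364706816aba3e25717850c26c9cd0d89d")],
))
```

Recomputing from the bytes, rather than comparing the two recorded strings with each other, also catches the case where both copies agree but the file has changed.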
Elements defined both in METS and PREMIS
<fileSec>
  <fileGrp>
    <file ID="FID1" ADMID="TMD1PREMIS DP1EVENT DP1AGENT" MIMETYPE="image/jpeg">
      <FLocat LOCTYPE="OTHER" xlink:href="BXF22.JPG"/>
    </file>
  </fileGrp>
</fileSec>
<techMD ID="TMD1PREMIS">
  <mdWrap MDTYPE="PREMIS">
    <xmlData>
      <premis:object>
        <objectCharacteristics>
          <format>
            <formatDesignation>
              <formatName>image/jpeg</formatName>
              <formatVersion>1.02</formatVersion>
            </formatDesignation>
          </format>
        </objectCharacteristics>
  • METS MIMETYPE
      • attribute of <file>
      • optional
  • PREMIS <format>
      • more granular: includes name and version (although name may be the MIME type)
      • mandatory
Elements defined both in METS and PREMIS
<fileSec>
  <fileGrp>
    <file ID="FID1" ADMID="TMD1PREMIS TMD1MIX DP1EVENT DP1AGENT">
<techMD ID="TMD1PREMIS">
  <linkingEventIdentifier>
    <linkingEventIdentifierType>ECHODEP Hub Event</linkingEventIdentifierType>
    <linkingEventIdentifierValue>echo12345</linkingEventIdentifierValue>
  </linkingEventIdentifier>
<digiprovMD ID="DP1EVENT">
  <premis:event>
    <eventIdentifier>
      <eventIdentifierType>ECHODEP Hub Event</eventIdentifierType>
      <eventIdentifierValue>echo12345</eventIdentifierValue>
    </eventIdentifier>
    <eventType>ingestion</eventType>
    <eventDateTime>2006-05-02T15:12:53</eventDateTime>
  </premis:event>
  • METS ID/IDREF: used to associate metadata in different sections and for different files
  • PREMIS identifiers: explicit linking between entity types
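PREMIS links entities by repeating an identifier (type plus value) rather than by XML ID/IDREF, so nothing in the XML parser enforces that a linkingEventIdentifier points at a real event. A sketch of that cross-check follows; the sample omits namespaces for brevity, and the wrapping `premis` root and the function name `unresolved_event_links` are assumptions.

```python
import xml.etree.ElementTree as ET

# Namespace-free sample for brevity; element names follow the slide's excerpt.
SAMPLE = """
<premis>
  <object>
    <linkingEventIdentifier>
      <linkingEventIdentifierType>ECHODEP Hub Event</linkingEventIdentifierType>
      <linkingEventIdentifierValue>echo12345</linkingEventIdentifierValue>
    </linkingEventIdentifier>
  </object>
  <event>
    <eventIdentifier>
      <eventIdentifierType>ECHODEP Hub Event</eventIdentifierType>
      <eventIdentifierValue>echo12345</eventIdentifierValue>
    </eventIdentifier>
    <eventType>ingestion</eventType>
  </event>
</premis>
"""


def unresolved_event_links(xml_text):
    """Return (type, value) pairs that objects link to but no event declares."""
    root = ET.fromstring(xml_text)
    declared = {
        (
            identifier.findtext("eventIdentifierType", "").strip(),
            identifier.findtext("eventIdentifierValue", "").strip(),
        )
        for identifier in root.iter("eventIdentifier")
    }
    missing = []
    for link in root.iter("linkingEventIdentifier"):
        key = (
            link.findtext("linkingEventIdentifierType", "").strip(),
            link.findtext("linkingEventIdentifierValue", "").strip(),
        )
        if key not in declared:
            missing.append(key)
    return missing


print(unresolved_event_links(SAMPLE))  # prints []
```

Whitespace is stripped before comparison because hand-edited records often carry stray spaces inside identifier elements.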
Elements defined both in METS and PREMIS
<structMap TYPE="physical">
  <div ORDER="1" TYPE="text">
    <fptr FILEID="FID9"/>
    <div ORDER="1" TYPE="page" LABEL=" Page [1]">
      <fptr FILEID="FID1"/>
    </div>
    <div ORDER="2" TYPE="page" LABEL=" Page [2]">
      <fptr FILEID="FID2"/>
    </div>
  </div>

<relationship>
  <relationshipType>structural</relationshipType>
  <relationshipSubType>is sibling of</relationshipSubType>
  <relatedObjectIdentification>
    <relatedObjectIdentifierType>UCB</relatedObjectIdentifierType>
    <relatedObjectIdentifierValue>FID2</relatedObjectIdentifierValue>
    <relatedObjectSequence>1</relatedObjectSequence>
  • METS structMap
      • details structural relationships and is the heart of the METS document
      • hierarchical, so may be more expressive than PREMIS semantic units
      • links the elements of the structure to content files and metadata
  • PREMIS <relationship>
      • details all kinds of relationships, including structural
      • data dictionary says that implementations may record by other means
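Because the structMap already encodes the hierarchy, "is sibling of" relationships of the kind PREMIS records can be derived from it instead of being stored twice. The sketch below treats files attached to child divs of the same parent div as siblings; that reading of "sibling", and the function name `sibling_files`, are assumptions for illustration.

```python
import xml.etree.ElementTree as ET

METS_NS = "http://www.loc.gov/METS/"

SAMPLE = f"""
<structMap xmlns="{METS_NS}" TYPE="physical">
  <div ORDER="1" TYPE="text">
    <fptr FILEID="FID9"/>
    <div ORDER="1" TYPE="page"><fptr FILEID="FID1"/></div>
    <div ORDER="2" TYPE="page"><fptr FILEID="FID2"/></div>
  </div>
</structMap>
"""


def sibling_files(xml_text):
    """Map each file ID to the file IDs attached to sibling divs under the same parent."""
    ns = {"m": METS_NS}
    root = ET.fromstring(xml_text)
    siblings = {}
    for parent in root.iter(f"{{{METS_NS}}}div"):
        # Files referenced by the direct child divs of this parent.
        group = [
            fptr.get("FILEID")
            for child in parent.findall("m:div", ns)
            for fptr in child.findall("m:fptr", ns)
        ]
        for file_id in group:
            siblings.setdefault(file_id, []).extend(other for other in group if other != file_id)
    return siblings


print(sibling_files(SAMPLE))
```

FID9 does not appear in the result because it is attached to the parent div itself rather than to one of the page-level children.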
Should semantic units be recorded redundantly?
Various options are possible when there is overlap between PREMIS and METS, or PREMIS and other technical metadata schemas
  • Record only in METS
  • Record only in PREMIS
  • Record in both
Are there advantages in using PREMIS semantic units?
Is it important to keep PREMIS metadata together as a unit?
There may be an advantage for reuse and maintenance purposes
How to record elements from 2 different technical metadata schemas
Format-specific metadata may be included in addition to PREMIS general technical metadata
Use multiple techMD sections and specify the source in the MDTYPE attribute and/or namespace declaration
  • e.g. MDTYPE=“NISOIMG” or “PREMIS”
  • Give MIX schema declaration in METS document
MIX was recently revised to correspond with the revision of the Z39.87 technical metadata for digital still images standard; names were harmonized with corresponding PREMIS semantic units
For digital still images, best practice may be to use PREMIS for general semantic units defined in PREMIS and MIX for format-specific units, without redundancy
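With one techMD section per schema, a consumer can pick out the PREMIS and MIX sections by reading the MDTYPE attribute on each mdWrap. A minimal sketch, assuming the section IDs used in this deck's examples and a hypothetical function name `tech_md_ids_by_type`:

```python
import xml.etree.ElementTree as ET

METS_NS = "http://www.loc.gov/METS/"

# The xmlData elements are left empty here; real sections would carry PREMIS and MIX content.
SAMPLE = f"""
<amdSec xmlns="{METS_NS}">
  <techMD ID="TMD1PREMIS">
    <mdWrap MDTYPE="PREMIS"><xmlData/></mdWrap>
  </techMD>
  <techMD ID="TMD1MIX">
    <mdWrap MDTYPE="NISOIMG"><xmlData/></mdWrap>
  </techMD>
</amdSec>
"""


def tech_md_ids_by_type(xml_text):
    """Group techMD section IDs by the MDTYPE declared on their mdWrap."""
    ns = {"m": METS_NS}
    root = ET.fromstring(xml_text)
    grouped = {}
    for tech_md in root.iterfind(".//m:techMD", ns):
        wrap = tech_md.find("m:mdWrap", ns)
        md_type = wrap.get("MDTYPE") if wrap is not None else None
        grouped.setdefault(md_type, []).append(tech_md.get("ID"))
    return grouped


print(tech_md_ids_by_type(SAMPLE))
```

Sections that use mdRef instead of mdWrap end up under the key None in this sketch, since only mdWrap is inspected.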
Examples of PREMIS in XML
PREMIS in METS
  • Portrait of Louis Armstrong (Library of Congress)
  • Peoria County, Illinois aerial photograph (ECHO Depository, UIUC Grainger Engineering Library)
MATHARC implementation
httppigpenlibuchicagoedu8888pigpenuploads13asset_descr_mets_premis_02v2xml
MPEG-21 Digital Item Declaration (DID)
ISO/IEC 21000-2: Digital Item Declaration
A promising alternative to represent Digital Objects
Starting to get supported by some repositories, e.g. aDORe, DSpace, Fedora
A flexible and expressive model that easily represents compound objects (recursive “item”)
Attach well-formed XML from persistent namespaces as metadata
Abstract Model for MPEG-21 DID
[Diagram: nested boxes labelled container, item, item, component and resource, with descriptor/statement constructs attached]
resource: datastream
component: binding of descriptor/statements to datastreams
item: represents a Digital Item, aka Digital Object, aka asset; descriptor/statement constructs convey information about the Digital Item
container: grouping of items and descriptor/statement constructs pertaining to the container
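The abstract model above maps naturally onto a small set of nested record types, with the recursive item carrying the compound-object structure. The classes below are a simplified sketch: descriptor/statement constructs are reduced to plain strings, and the field names, sample values and the `iter_resources` helper are assumptions, not part of ISO/IEC 21000-2.

```python
from dataclasses import dataclass, field
from typing import Iterator, List


@dataclass
class Resource:
    ref: str  # location of the datastream


@dataclass
class Component:
    # Binds descriptor/statements to one or more datastreams.
    resources: List[Resource] = field(default_factory=list)
    descriptors: List[str] = field(default_factory=list)


@dataclass
class Item:
    # A Digital Item; items may nest to represent compound objects.
    descriptors: List[str] = field(default_factory=list)
    components: List[Component] = field(default_factory=list)
    items: List["Item"] = field(default_factory=list)


@dataclass
class Container:
    descriptors: List[str] = field(default_factory=list)
    items: List[Item] = field(default_factory=list)


def iter_resources(item: Item) -> Iterator[Resource]:
    """Yield every resource in an item, descending through nested items."""
    for component in item.components:
        yield from component.resources
    for child in item.items:
        yield from iter_resources(child)


page1 = Item(components=[Component(resources=[Resource("page1.jpg")])])
page2 = Item(components=[Component(resources=[Resource("page2.jpg")])])
book = Item(descriptors=["top-level object metadata"], items=[page1, page2])
container = Container(items=[book])
print([resource.ref for resource in iter_resources(book)])
```

In a PREMIS mapping, the strings in `descriptors` would be replaced by the XML metadata attached at each level.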
Mapping
[Diagram labels: DID, DIDInfo, premis, premis:object, object1, object2, object3, object4, resource]
Annotation on the diagram: all rights, events and agents go here. The top level object goes here. Other objects may be duplicated here or linked here.
Partial Implementation in DID
[Diagram labels: DID, DIDInfo, premis, premis:format, premis:significantProperties, premis:creatingApplication, object1, object2, object3, object4, resource]
When metadata are not sufficient to form the top level PREMIS elements, partial implementation may be done if PREMIS elements are globally defined
Example of PREMIS in MPEG DID
PREMIS in MPEG DID
  • aDORe example (LANL)
Summary: container formats
A container format is needed to package together all forms of metadata (of which PREMIS is one) and digital content
Use of a container is compatible with, and an implementation of, the OAIS information package concept
Co-existence with other types of metadata requires best practices for both approaches; redundancy seems to be preferred
Changes to the next version of the PREMIS XML schemas will facilitate a phased approach to full PREMIS implementation
Development of registries (informal or formal) for controlled vocabularies will benefit implementation
Tools are being developed to facilitate implementation
Summary: METS vs MPEG-21 DID
METS and MPEG DID are similar types of container formats in that both are expressed in XML, both represent the structure of digital objects, and both include metadata
MPEG DID doesn’t have the segmentation into metadata sections that METS does, so this implementation decision need not be made in DID
METS is open source and developed by open discussion, mainly within the cultural heritage community
MPEG DID is an ISO standard and has industry support, but is often implemented in a proprietary way, and standards development is closed
It would be possible to transform a METS container to an MPEG DID and vice versa; development of stylesheets will enable transformations
Implementers’ panel
What types of objects are you preserving?
Has your institution implemented a preservation repository?
What preservation metadata are you recording?
How are you recording it, e.g. database, METS/XML, other?
Do you plan to exchange preservation metadata with other repositories?
Are you planning to use, or already using, PREMIS?
Which semantic units are most useful?
Which semantic units are least useful?
What difficulties have you had applying PREMIS units?
Proposed schema changes for new version
Define an abstract object type to allow for better validation of object category (representation file bitstream)
Define main elements globally to allow for reuse Implement an extensibility mechanism to provide for
further structure when needed Implement a mechanism to use controlled vocabularies Adjust schemas to support changes in version 2 of data
dictionary
Implementing PREMIS using XML in METS
METS introduction
METS records the (possibly hierarchical) structure of digital objects the names and locations of the files that comprise those objects and the associated metadata
A METS document may be a unit of storage (eg OAIS AIP) or a transmission format (eg OAIS SIP or DIP)
METS is extensible and modular METS uses extension ldquowrappersrdquo or ldquosocketsrdquo where
elements from other schemas can be plugged in METS uses the XML Schema facility for combining
vocabularies from different Namespaces The METS Editorial Board has endorsed PREMIS as an
extension schema Many institutions trying to use PREMIS within the METS
context
The structure of a METS file
METS
dmdSec
amdSec
behaviorSec
structMap
fileSec file inventory
descriptive metadata
administrative metadata
behaviour metadata
structural map
Inserting technical metadata in a METS Document
ltmetsgt ltamdSecgt lttechMDgt ltmdWrapgt ltxmlDatagt
lt-- insert data from different namespace here --gt ltxmlDatagt ltmdWrapgt lttechMDgt ltamdSecgt ltfileSec gt ltstructMap gt ltmetsgt
Linking in METS Documents(XML IDIDREF links)
DescMDmods
relatedItemrelatedItem
AdminMDtechMDsourceMDdigiprovMDrightsMD
fileGrpfilefile
StructMapdiv div fptr
div fptr
Linking in METS Documents(XML IDIDREF links)
DescMDmods
relatedItemrelatedItem
AdminMDtechMDsourceMDdigiprovMDrightsMD
fileGrpfilefile
StructMapdiv div fptr
div fptr
Linking in METS Documents(XML IDIDREF links)
DescMDmods
relatedItemrelatedItem
AdminMDtechMDsourceMDdigiprovMDrightsMD
fileGrpfilefile
StructMapdiv div fptr
div fptr
Linking in METS Documents(XML IDIDREF links)
DescMDmods
relatedItemrelatedItem
AdminMDtechMDsourceMDdigiprovMDrightsMD
fileGrpfilefile
StructMapdiv div fptr
div fptr
Linking in METS Documents(XML IDIDREF links)
DescMDmods
relatedItemrelatedItem
AdminMDtechMDsourceMDdigiprovMDrightsMD
fileGrpfilefile
StructMapdiv div fptr
div fptr
METS extension schemas
ldquowrappersrdquo or ldquosocketsrdquo where elements from other schemas can be plugged in
Provides extensibility Uses the XML Schema facility for combining vocabularies from
different Namespaces Endorsed extension schemas
bull Descriptive MODS DC MARCXMLbull Technical metadata MIX (image) textMD (text)bull Preservation related PREMIS
Issues in using PREMIS with METS
Which METS sections to use and how many Whether to record elements redundantly in PREMIS that are
defined explicitly in the METS schema How to record elements that are also part of a format
specific technical metadata schema (eg MIX) Recording structural relationships How to deal with locally controlled vocabularies Whether to use the PREMIS container
PREMIS and METS sections
Flexibility of METS requires implementation decisions You canrsquot put all PREMIS metadata directly under amdSec What sections to use for PREMIS metadata
bull Alternative 1bull Object in techMDbull Event in digiProvMDbull Rights in rightsMDbull Agent with event or rights
bull Alternative 2bull Everything in digiProvMD
bull Alternative 3bull Everything in techMD
How many administrative MD sections to use Experimentation will result in best practices
ltfileSecgtltfileGrpgtltfile ID=FID1 SIZE=184302 ADMID=TMD1PREMIS TMD1MIX DP1EVENT
DP1AGENTldquo CHECKSUM=4638bc65c5b9715557d09ad373eefd147382ecbf CHECKSUMTYPE=SHA-1gt
ltFLocat LOCTYPE=OTHER xlinkhref=BXF22JPG gtltfilegtltfileGrpgtltfileSecgtlttechMD ID=TMD1PREMISgt ltmdWrap MDTYPE=PREMISgt ltxmlDatagt
ltpremisobject gt ltobjectCharacteristicsgt ltfixitygt ltmessageDigestAlgorithmgtSHA-1 ltmessageDigestAlgorithmgt ltmessageDigestgt4638bc65c5b9715557d09ad373eefd147382ecbf
ltmessageDigestgt ltmessageDigestOriginatorgtEchoDepmessageDigestOriginatorgt ltfixitygt ltsizegt184302ltsizegt ltobjectCharacteristicsgt
Elements defined in both METS and PREMISbull METS Checksum Checksumtype
bull attribute of ltfilegtbull not repeatable
PREMIS fixitybull also includes messageDigestOriginatorbull allows multiples
ltfileSecgtltfileGrpgtltfile ID=FID1 ADMID=TMD1PREMIS DP1EVENT DP1AGENTldquo
MIMETYPE=imagejpeg ltFLocat LOCTYPE=OTHER xlinkhref=BXF22JPGgtltfilegtltfileGrpgtltfileSecgt
lttechMD ID=TMD1PREMISldquo ltmdWrap MDTYPE=PREMISgt ltxmlDatagt ltpremisobjectgt ltobjectCharacteristicsgt ltformatgt ltformatDesignationgt ltformatNamegtimagejpegltformatNamegt ltformatVersiongt102 ltformatVersiongt ltformatDesignationgtltformatgt ltobjectCharacteristicsgtElements defined both in METS and PREMISbull METS MIMETYPE
bull attribute of ltfilegtbull optional
PREMIS ltformatgt bull more granular includes name and version (although name may be MIMETYPE)bull mandatory
ltfileSecgt ltfileGrpgt ltfile ID=FID1 ADMID=TMD1PREMIS TMD1MIX DP1EVENT DP1AGENTgtlttechMD ID=TMD1PREMISgt ltlinkingEventIdentifiergt ltlinkingEventIdentifierTypegtECHODEP Hub Event ltlinkingEventIdentifierTypegt ltlinkingEventIdentifierValuegtecho12345ltlinkingEventIdentifierValuegt ltlinkingEventIdentifiergtltdigiprovMD ID=DP1EVENTgt ltpremiseventgt lteventIdentifiergt lteventIdentifierTypegtECHODEP Hub EventlteventIdentifierTypegt lteventIdentifierValuegtecho12345 lteventIdentifierValuegt lteventIdentifiergt lteventTypegtingestionlteventTypegt lteventDateTimegt2006-05-02T151253 lteventDateTimegtlteventgt
Elements defined both in METS and PREMIS METS IDIdref used to associate metadata in different sections and for different
files PREMIS identifiers explicit linking between entity types
ltstructMap TYPE=ldquophysicalrdquogt ltdiv ORDER=1 TYPE=textgt ltfptr FILEID=FID9gt ltdiv ORDER=1 TYPE=page LABEL= Page [1]gt ltfptr FILEID=FID1gtltmetsdivgt ltdiv ORDER=2 TYPE=page LABEL= Page [2]gt ltfptr FILEID=FID2gtltmetsdivgt ltdivgt
ltrelationshipgt ltrelationshipTypegtstructuralltrelationshipTypegt ltrelationshipSubTypegtis sibling of ltrelationshipSubTypegt ltrelatedObjectIdentificationgt ltrelatedObjectIdentifierTypegtUCBltrelatedObjectIdentifierTypegt ltrelatedObjectIdentifierValuegtFID2ltrelatedObjectIdentifierValuegt ltrelatedObjectSequencegt1ltrelatedObjectSequencegt
Elements defined both in METS and PREMIS METS structMap
bull details structural relationships and is the heart of the METS documentbull hierarchical so may be more expressive than PREMIS semantic unitsbull links the elements of the structure to content files and metadata
PREMIS ltrelationshipgt bull details all kinds of relationships including structuralbull data dictionary says that implementations may record by other means
Should semantic units be recorded redundantly
Various options are possible when there is overlap between PREMIS and METS or PREMIS and other technical metadata schemasbull Record only in METSbull Record only in PREMISbull Record in both
Are there advantages in using PREMIS semantic units Is it important to keep PREMIS metadata together as a unit
There may be an advantage for reuse and maintenance purposes
How to record elements from 2 different technical metadata schemas
Format specific metadata may be included in addition to PREMIS general technical metadata
Use multiple techMD sections and specify source in MDType attribute andor namespace declarationbull eg MDTYPE=ldquoNISOIMGrdquo or ldquoPREMISrdquobull Give MIX schema declaration in METS document
MIX was recently revised to correspond with the revision of the Z3987 technical metadata for digital still images standard names harmonized with corresponding PREMIS semantic units
For digital still images best practice may be to use PREMIS for general semantic units defined in PREMIS and MIX for format specific units without redundancy
Examples of PREMIS in XML
PREMIS in METSbull Portrait of Louis Armstrong (Library of Congress)bull Peoria County Illinois aerial photograph (ECHO
Depository UIUC Grainger Engineering Library) MATHARC implementation
httppigpenlibuchicagoedu8888pigpenuploads13asset_descr_mets_premis_02v2xml
MPEG-21 Digital Item Declaration (DID)
ISOIEC 21000-2 Digital Item Declaration A promising alternative to represent Digital Objects Starting to get supported by some repositories eg
aDORe DSpace Fedora A flexible and expressive model that easily represents
compound objects (recursive ldquoitemrdquo) Attach well-formed XML from persistent namespaces as
metadata
Abstract Model for MPEG-21 DID
resource resource resource
component component
descriptorstatement
descriptorstatement
descriptorstatement
descriptorstatement
item
item
container
resource datastream
component binding of descriptorstatements to datastreams
item represents a Digital Item aka Digital Object aka asset Descriptorstatement constructs convey information about the Digital Item
container grouping of items and descriptorstatement constructs pertaining to the container
Mapping
resource resource resource
object3 object4
premis object
premisobject
premispremis
DIDInfo
object2
object1
DIDAll rights events and agents go here The top level object goes here Other
objects may be duplicated here or linked here
premis object
Partial Implementation in DID
resource resource resource
object3 object4
premis format
premissignificantProperties
premispremis
DIDInfo
object2
object1
DIDWhen metadata are not sufficient to form
the top level PREMIS elements partial implementation may be done if PREMIS
elements are globally defined
premis creatingApplication
Example of PREMIS in MPEG DID
PREMIS in MPEG DID
• aDORe example (LANL)
Summary: container formats
• A container format is needed to package together all forms of metadata (of which PREMIS is one) and digital content
• Use of a container is compatible with, and an implementation of, the OAIS information package concept
• Co-existence with other types of metadata requires best practices; for both approaches, redundancy seems to be preferred
• Changes to the next version of the PREMIS XML schemas will facilitate a phased approach to full PREMIS implementation
• Development of registries (informal or formal) for controlled vocabularies will benefit implementation
• Tools are being developed to facilitate implementation
Summary: METS vs MPEG-21 DID
• METS and MPEG DID are similar types of container formats: both are expressed in XML, both represent the structure of digital objects, and both include metadata
• MPEG DID doesn't have the segmentation into metadata sections that METS does, so this implementation decision need not be made in DID
• METS is open source and developed by open discussion, mainly within the cultural heritage community
• MPEG DID is an ISO standard and has industry support, but is often implemented in a proprietary way, and standards development is closed
• It would be possible to transform a METS container to an MPEG DID and vice versa; development of stylesheets will enable transformations
Implementers' panel
• What types of objects are you preserving?
• Has your institution implemented a preservation repository?
• What preservation metadata are you recording?
• How are you recording it (e.g. database, METS/XML, other)?
• Do you plan to exchange preservation metadata with other repositories?
• Are you planning to use PREMIS, or already using it?
• Which semantic units are most useful?
• Which semantic units are least useful?
• What difficulties have you had applying PREMIS units?
Implementing PREMIS using XML in METS
METS introduction
• METS records the (possibly hierarchical) structure of digital objects, the names and locations of the files that comprise those objects, and the associated metadata
• A METS document may be a unit of storage (e.g. OAIS AIP) or a transmission format (e.g. OAIS SIP or DIP)
• METS is extensible and modular
• METS uses extension “wrappers” or “sockets” where elements from other schemas can be plugged in
• METS uses the XML Schema facility for combining vocabularies from different namespaces
• The METS Editorial Board has endorsed PREMIS as an extension schema
• Many institutions are trying to use PREMIS within the METS context
The structure of a METS file
METS
• dmdSec: descriptive metadata
• amdSec: administrative metadata
• fileSec: file inventory
• behaviorSec: behaviour metadata
• structMap: structural map
Inserting technical metadata in a METS Document
    <mets>
      <amdSec>
        <techMD>
          <mdWrap>
            <xmlData>
              <!-- insert data from different namespace here -->
            </xmlData>
          </mdWrap>
        </techMD>
      </amdSec>
      <fileSec/>
      <structMap/>
    </mets>
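As a rough sketch of how that skeleton might be produced programmatically, the following uses Python's standard `xml.etree.ElementTree`. The METS namespace is the published one; the PREMIS namespace URI, the helper names, and the sample `size` value are assumptions made for illustration and should be checked against the schema version actually in use.

```python
import xml.etree.ElementTree as ET

METS = "http://www.loc.gov/METS/"
PREMIS = "http://www.loc.gov/standards/premis/v1"  # assumed PREMIS namespace

ET.register_namespace("mets", METS)
ET.register_namespace("premis", PREMIS)


def q(ns, tag):
    """Build a namespace-qualified tag name in ElementTree's {ns}tag form."""
    return f"{{{ns}}}{tag}"


def wrap_tech_md(payload, md_id, md_type):
    """Wrap a foreign-namespace element in techMD/mdWrap/xmlData."""
    tech_md = ET.Element(q(METS, "techMD"), {"ID": md_id})
    md_wrap = ET.SubElement(tech_md, q(METS, "mdWrap"), {"MDTYPE": md_type})
    xml_data = ET.SubElement(md_wrap, q(METS, "xmlData"))
    xml_data.append(payload)
    return tech_md


def build_mets(payload):
    """Assemble the minimal skeleton: amdSec, fileSec, structMap."""
    mets = ET.Element(q(METS, "mets"))
    amd_sec = ET.SubElement(mets, q(METS, "amdSec"))
    amd_sec.append(wrap_tech_md(payload, "TMD1PREMIS", "PREMIS"))
    ET.SubElement(mets, q(METS, "fileSec"))
    ET.SubElement(mets, q(METS, "structMap"))
    return mets


# A deliberately tiny PREMIS object, just enough to show the wrapping.
premis_object = ET.Element(q(PREMIS, "object"))
characteristics = ET.SubElement(premis_object, q(PREMIS, "objectCharacteristics"))
ET.SubElement(characteristics, q(PREMIS, "size")).text = "184302"

mets_doc = build_mets(premis_object)
print(ET.tostring(mets_doc, encoding="unicode"))
```

The result is a skeleton only; a schema-valid METS document needs more (for example content in fileSec and a div in structMap).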
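Linking in METS Documents (XML ID/IDREF links)
[Diagram: ID/IDREF links among the sections of a METS document]
• DescMD: mods, relatedItem, relatedItem
• AdminMD: techMD, sourceMD, digiprovMD, rightsMD
• fileGrp: file, file
• StructMap: div containing two child divs, each with an fptr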
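A minimal sketch of how these ID/IDREF links could be followed in code, assuming a toy METS document. Only the `ADMID`, `DMDID` and `FILEID` attributes are checked, the helper names are hypothetical, and the `DP1AGENT` reference is left unresolved on purpose to show how a dangling link surfaces.

```python
import xml.etree.ElementTree as ET

METS_XML = """<mets xmlns="http://www.loc.gov/METS/">
  <amdSec>
    <techMD ID="TMD1PREMIS"/>
    <techMD ID="TMD1MIX"/>
    <digiprovMD ID="DP1EVENT"/>
  </amdSec>
  <fileSec><fileGrp>
    <file ID="FID1" ADMID="TMD1PREMIS TMD1MIX DP1EVENT DP1AGENT"/>
  </fileGrp></fileSec>
  <structMap><div><fptr FILEID="FID1"/></div></structMap>
</mets>"""


def local(tag):
    """Strip the {namespace} prefix from an ElementTree tag."""
    return tag.rsplit("}", 1)[-1]


def resolve_links(root):
    """Map each referring element to its targets; collect dangling IDREFs."""
    by_id = {el.get("ID"): el for el in root.iter() if el.get("ID")}
    resolved, dangling = {}, []
    for el in root.iter():
        for attr in ("ADMID", "DMDID", "FILEID"):
            # ADMID and DMDID are IDREFS: a whitespace-separated list.
            for ref in (el.get(attr) or "").split():
                target = by_id.get(ref)
                if target is None:
                    dangling.append((local(el.tag), attr, ref))
                else:
                    key = el.get("ID") or local(el.tag)
                    resolved.setdefault(key, []).append((ref, local(target.tag)))
    return resolved, dangling


resolved, dangling = resolve_links(ET.fromstring(METS_XML))
print(resolved)
print(dangling)
```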
METS extension schemas
• “Wrappers” or “sockets” where elements from other schemas can be plugged in
• Provides extensibility
• Uses the XML Schema facility for combining vocabularies from different namespaces
• Endorsed extension schemas
  • Descriptive: MODS, DC, MARCXML
  • Technical metadata: MIX (image), textMD (text)
  • Preservation related: PREMIS
Issues in using PREMIS with METS
• Which METS sections to use, and how many
• Whether to record elements redundantly in PREMIS that are defined explicitly in the METS schema
• How to record elements that are also part of a format-specific technical metadata schema (e.g. MIX)
• Recording structural relationships
• How to deal with locally controlled vocabularies
• Whether to use the PREMIS container
PREMIS and METS sections
• Flexibility of METS requires implementation decisions
• You can't put all PREMIS metadata directly under amdSec
• What sections to use for PREMIS metadata?
  • Alternative 1
    • Object in techMD
    • Event in digiprovMD
    • Rights in rightsMD
    • Agent with event or rights
  • Alternative 2
    • Everything in digiprovMD
  • Alternative 3
    • Everything in techMD
• How many administrative MD sections to use?
• Experimentation will result in best practices
Elements defined in both METS and PREMIS
    <fileSec>
      <fileGrp>
        <file ID="FID1" SIZE="184302" ADMID="TMD1PREMIS TMD1MIX DP1EVENT DP1AGENT"
              CHECKSUM="4638bc65c5b9715557d09ad373eefd147382ecbf" CHECKSUMTYPE="SHA-1">
          <FLocat LOCTYPE="OTHER" xlink:href="BXF22.JPG"/>
        </file>
      </fileGrp>
    </fileSec>
    <techMD ID="TMD1PREMIS">
      <mdWrap MDTYPE="PREMIS">
        <xmlData>
          <premis:object>
            <objectCharacteristics>
              <fixity>
                <messageDigestAlgorithm>SHA-1</messageDigestAlgorithm>
                <messageDigest>4638bc65c5b9715557d09ad373eefd147382ecbf</messageDigest>
                <messageDigestOriginator>EchoDep</messageDigestOriginator>
              </fixity>
              <size>184302</size>
            </objectCharacteristics>
• METS CHECKSUM, CHECKSUMTYPE
  • attributes of <file>
  • not repeatable
• PREMIS fixity
  • also includes messageDigestOriginator
  • allows multiples
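If the checksum is recorded in both places, the two copies can drift apart. One way to guard against that is sketched below, assuming the METS attributes and the PREMIS fixity blocks have already been read into plain Python values; the function names and the stand-in byte string are hypothetical.

```python
import hashlib


def digest(data, algorithm):
    """Hex digest for names such as "SHA-1" or "SHA-256" (mapped to hashlib names)."""
    return hashlib.new(algorithm.replace("-", "").lower(), data).hexdigest()


def check_fixity(data, mets_checksum, mets_checksum_type, premis_fixities):
    """Return a list of problems; an empty list means everything agrees."""
    problems = []
    if digest(data, mets_checksum_type) != mets_checksum.lower():
        problems.append(f"METS {mets_checksum_type} does not match content")
    for fx in premis_fixities:
        algo = fx["messageDigestAlgorithm"]
        value = fx["messageDigest"].lower()
        if digest(data, algo) != value:
            problems.append(f"PREMIS {algo} does not match content")
        # METS holds a single checksum, so compare only the matching algorithm.
        if algo == mets_checksum_type and value != mets_checksum.lower():
            problems.append(f"METS and PREMIS disagree on {algo}")
    return problems


content = b"stand-in bytes for the image file"
sha1 = hashlib.sha1(content).hexdigest()
sha256 = hashlib.sha256(content).hexdigest()

# PREMIS fixity is repeatable, so two digests can sit side by side.
fixities = [
    {"messageDigestAlgorithm": "SHA-1", "messageDigest": sha1,
     "messageDigestOriginator": "EchoDep"},
    {"messageDigestAlgorithm": "SHA-256", "messageDigest": sha256,
     "messageDigestOriginator": "EchoDep"},
]

print(check_fixity(content, sha1, "SHA-1", fixities))
print(check_fixity(content + b"!", sha1, "SHA-1", fixities))
```

The second call simulates a changed file: every recorded digest fails against the content, while the METS and PREMIS values still agree with each other.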
Elements defined both in METS and PREMIS
    <fileSec>
      <fileGrp>
        <file ID="FID1" ADMID="TMD1PREMIS DP1EVENT DP1AGENT" MIMETYPE="image/jpeg">
          <FLocat LOCTYPE="OTHER" xlink:href="BXF22.JPG"/>
        </file>
      </fileGrp>
    </fileSec>
    <techMD ID="TMD1PREMIS">
      <mdWrap MDTYPE="PREMIS">
        <xmlData>
          <premis:object>
            <objectCharacteristics>
              <format>
                <formatDesignation>
                  <formatName>image/jpeg</formatName>
                  <formatVersion>1.02</formatVersion>
                </formatDesignation>
              </format>
            </objectCharacteristics>
• METS MIMETYPE
  • attribute of <file>
  • optional
• PREMIS <format>
  • more granular: includes name and version (although the name may be the MIME type)
  • mandatory
Elements defined both in METS and PREMIS
    <fileSec>
      <fileGrp>
        <file ID="FID1" ADMID="TMD1PREMIS TMD1MIX DP1EVENT DP1AGENT">
    <techMD ID="TMD1PREMIS">
      <linkingEventIdentifier>
        <linkingEventIdentifierType>ECHODEP Hub Event</linkingEventIdentifierType>
        <linkingEventIdentifierValue>echo12345</linkingEventIdentifierValue>
      </linkingEventIdentifier>
    <digiprovMD ID="DP1EVENT">
      <premis:event>
        <eventIdentifier>
          <eventIdentifierType>ECHODEP Hub Event</eventIdentifierType>
          <eventIdentifierValue>echo12345</eventIdentifierValue>
        </eventIdentifier>
        <eventType>ingestion</eventType>
        <eventDateTime>2006-05-02T15:12:53</eventDateTime>
      </premis:event>
• METS ID/IDREF: used to associate metadata in different sections and for different files
• PREMIS identifiers: explicit linking between entity types
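The PREMIS side of this linking works by matching identifier type and value rather than XML IDs. A small sketch of that lookup follows, assuming a simplified PREMIS document without the METS wrappers; the namespace URI and helper names are assumptions, and whitespace around values is stripped before comparison.

```python
import xml.etree.ElementTree as ET

P = "{http://www.loc.gov/standards/premis/v1}"  # assumed PREMIS namespace

SAMPLE = """<premis xmlns="http://www.loc.gov/standards/premis/v1">
  <object>
    <linkingEventIdentifier>
      <linkingEventIdentifierType>ECHODEP Hub Event</linkingEventIdentifierType>
      <linkingEventIdentifierValue>echo12345</linkingEventIdentifierValue>
    </linkingEventIdentifier>
  </object>
  <event>
    <eventIdentifier>
      <eventIdentifierType>ECHODEP Hub Event</eventIdentifierType>
      <eventIdentifierValue>echo12345</eventIdentifierValue>
    </eventIdentifier>
    <eventType>ingestion</eventType>
    <eventDateTime>2006-05-02T15:12:53</eventDateTime>
  </event>
</premis>"""


def value_of(parent, name):
    """Text of a child element, stripped; empty string if absent."""
    return (parent.findtext(P + name) or "").strip()


def index_events(root):
    """Index events by their (identifier type, identifier value) pair."""
    events = {}
    for event in root.iter(P + "event"):
        ident = event.find(P + "eventIdentifier")
        if ident is not None:
            key = (value_of(ident, "eventIdentifierType"),
                   value_of(ident, "eventIdentifierValue"))
            events[key] = event
    return events


def linked_events(obj, events):
    """Follow an object's linkingEventIdentifier entries to known events."""
    found = []
    for link in obj.iter(P + "linkingEventIdentifier"):
        key = (value_of(link, "linkingEventIdentifierType"),
               value_of(link, "linkingEventIdentifierValue"))
        if key in events:
            found.append(events[key])
    return found


root = ET.fromstring(SAMPLE)
events = index_events(root)
obj = root.find(P + "object")
event_types = [value_of(e, "eventType") for e in linked_events(obj, events)]
print(event_types)
```

Because the match is on values rather than XML IDs, it survives when object and event metadata are stored in separate documents, which METS ID/IDREF links cannot do.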
Elements defined both in METS and PREMIS
    <structMap TYPE="physical">
      <div ORDER="1" TYPE="text">
        <fptr FILEID="FID9"/>
        <div ORDER="1" TYPE="page" LABEL="Page [1]">
          <fptr FILEID="FID1"/>
        </div>
        <div ORDER="2" TYPE="page" LABEL="Page [2]">
          <fptr FILEID="FID2"/>
        </div>
      </div>

    <relationship>
      <relationshipType>structural</relationshipType>
      <relationshipSubType>is sibling of</relationshipSubType>
      <relatedObjectIdentification>
        <relatedObjectIdentifierType>UCB</relatedObjectIdentifierType>
        <relatedObjectIdentifierValue>FID2</relatedObjectIdentifierValue>
        <relatedObjectSequence>1</relatedObjectSequence>
• METS structMap
  • details structural relationships and is the heart of the METS document
  • hierarchical, so may be more expressive than PREMIS semantic units
  • links the elements of the structure to content files and metadata
• PREMIS <relationship>
  • details all kinds of relationships, including structural
  • the data dictionary says that implementations may record relationships by other means
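Since the structMap already encodes the hierarchy, PREMIS-style structural relationships could in principle be derived from it instead of being recorded twice. The sketch below is one possible reading, not a prescribed mapping: it emits an "is sibling of" record for files whose divs share a parent, and it takes the related div's ORDER as the sequence value, which is an assumption.

```python
import xml.etree.ElementTree as ET

M = "{http://www.loc.gov/METS/}"

STRUCT_MAP = """<structMap xmlns="http://www.loc.gov/METS/" TYPE="physical">
  <div ORDER="1" TYPE="text">
    <fptr FILEID="FID9"/>
    <div ORDER="1" TYPE="page" LABEL="Page [1]"><fptr FILEID="FID1"/></div>
    <div ORDER="2" TYPE="page" LABEL="Page [2]"><fptr FILEID="FID2"/></div>
  </div>
</structMap>"""


def sibling_relationships(struct_map):
    """Derive sibling records for files whose divs share a parent div."""
    relationships = []
    for parent in struct_map.iter(M + "div"):
        # (ORDER, FILEID) for each file pointed to by a direct child div.
        members = [(child.get("ORDER"), fptr.get("FILEID"))
                   for child in parent.findall(M + "div")
                   for fptr in child.findall(M + "fptr")]
        for _, subject in members:
            for order, related in members:
                if related != subject:
                    relationships.append({
                        "subject": subject,
                        "relationshipType": "structural",
                        "relationshipSubType": "is sibling of",
                        "relatedObjectIdentifierValue": related,
                        "relatedObjectSequence": order,  # assumed mapping
                    })
    return relationships


rels = sibling_relationships(ET.fromstring(STRUCT_MAP))
for rel in rels:
    print(rel["subject"], rel["relationshipSubType"],
          rel["relatedObjectIdentifierValue"])
```

Going the other way is harder: a flat list of sibling records does not by itself recover the nesting, which reflects the slide's point that the hierarchical structMap may be more expressive.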
Should semantic units be recorded redundantly?
• Various options are possible when there is overlap between PREMIS and METS, or between PREMIS and other technical metadata schemas
  • Record only in METS
  • Record only in PREMIS
  • Record in both
• Are there advantages in using PREMIS semantic units?
• Is it important to keep PREMIS metadata together as a unit?
• There may be an advantage for reuse and maintenance purposes
How to record elements from 2 different technical metadata schemas
• Format-specific metadata may be included in addition to PREMIS general technical metadata
• Use multiple techMD sections and specify the source in the MDTYPE attribute and/or namespace declaration
  • e.g. MDTYPE="NISOIMG" or "PREMIS"
  • Give the MIX schema declaration in the METS document
The structure of a METS file
METS
dmdSec
amdSec
behaviorSec
structMap
fileSec file inventory
descriptive metadata
administrative metadata
behaviour metadata
structural map
Inserting technical metadata in a METS Document
ltmetsgt ltamdSecgt lttechMDgt ltmdWrapgt ltxmlDatagt
lt-- insert data from different namespace here --gt ltxmlDatagt ltmdWrapgt lttechMDgt ltamdSecgt ltfileSec gt ltstructMap gt ltmetsgt
Linking in METS Documents(XML IDIDREF links)
DescMDmods
relatedItemrelatedItem
AdminMDtechMDsourceMDdigiprovMDrightsMD
fileGrpfilefile
StructMapdiv div fptr
div fptr
Linking in METS Documents(XML IDIDREF links)
DescMDmods
relatedItemrelatedItem
AdminMDtechMDsourceMDdigiprovMDrightsMD
fileGrpfilefile
StructMapdiv div fptr
div fptr
Linking in METS Documents(XML IDIDREF links)
DescMDmods
relatedItemrelatedItem
AdminMDtechMDsourceMDdigiprovMDrightsMD
fileGrpfilefile
StructMapdiv div fptr
div fptr
Linking in METS Documents(XML IDIDREF links)
DescMDmods
relatedItemrelatedItem
AdminMDtechMDsourceMDdigiprovMDrightsMD
fileGrpfilefile
StructMapdiv div fptr
div fptr
Linking in METS Documents(XML IDIDREF links)
DescMDmods
relatedItemrelatedItem
AdminMDtechMDsourceMDdigiprovMDrightsMD
fileGrpfilefile
StructMapdiv div fptr
div fptr
METS extension schemas
ldquowrappersrdquo or ldquosocketsrdquo where elements from other schemas can be plugged in
Provides extensibility Uses the XML Schema facility for combining vocabularies from
different Namespaces Endorsed extension schemas
bull Descriptive MODS DC MARCXMLbull Technical metadata MIX (image) textMD (text)bull Preservation related PREMIS
Issues in using PREMIS with METS
Which METS sections to use and how many Whether to record elements redundantly in PREMIS that are
defined explicitly in the METS schema How to record elements that are also part of a format
specific technical metadata schema (eg MIX) Recording structural relationships How to deal with locally controlled vocabularies Whether to use the PREMIS container
PREMIS and METS sections
Flexibility of METS requires implementation decisions You canrsquot put all PREMIS metadata directly under amdSec What sections to use for PREMIS metadata
bull Alternative 1bull Object in techMDbull Event in digiProvMDbull Rights in rightsMDbull Agent with event or rights
bull Alternative 2bull Everything in digiProvMD
bull Alternative 3bull Everything in techMD
How many administrative MD sections to use Experimentation will result in best practices
ltfileSecgtltfileGrpgtltfile ID=FID1 SIZE=184302 ADMID=TMD1PREMIS TMD1MIX DP1EVENT
DP1AGENTldquo CHECKSUM=4638bc65c5b9715557d09ad373eefd147382ecbf CHECKSUMTYPE=SHA-1gt
ltFLocat LOCTYPE=OTHER xlinkhref=BXF22JPG gtltfilegtltfileGrpgtltfileSecgtlttechMD ID=TMD1PREMISgt ltmdWrap MDTYPE=PREMISgt ltxmlDatagt
ltpremisobject gt ltobjectCharacteristicsgt ltfixitygt ltmessageDigestAlgorithmgtSHA-1 ltmessageDigestAlgorithmgt ltmessageDigestgt4638bc65c5b9715557d09ad373eefd147382ecbf
ltmessageDigestgt ltmessageDigestOriginatorgtEchoDepmessageDigestOriginatorgt ltfixitygt ltsizegt184302ltsizegt ltobjectCharacteristicsgt
Elements defined in both METS and PREMISbull METS Checksum Checksumtype
bull attribute of ltfilegtbull not repeatable
PREMIS fixitybull also includes messageDigestOriginatorbull allows multiples
ltfileSecgtltfileGrpgtltfile ID=FID1 ADMID=TMD1PREMIS DP1EVENT DP1AGENTldquo
MIMETYPE=imagejpeg ltFLocat LOCTYPE=OTHER xlinkhref=BXF22JPGgtltfilegtltfileGrpgtltfileSecgt
lttechMD ID=TMD1PREMISldquo ltmdWrap MDTYPE=PREMISgt ltxmlDatagt ltpremisobjectgt ltobjectCharacteristicsgt ltformatgt ltformatDesignationgt ltformatNamegtimagejpegltformatNamegt ltformatVersiongt102 ltformatVersiongt ltformatDesignationgtltformatgt ltobjectCharacteristicsgtElements defined both in METS and PREMISbull METS MIMETYPE
bull attribute of ltfilegtbull optional
PREMIS ltformatgt bull more granular includes name and version (although name may be MIMETYPE)bull mandatory
ltfileSecgt ltfileGrpgt ltfile ID=FID1 ADMID=TMD1PREMIS TMD1MIX DP1EVENT DP1AGENTgtlttechMD ID=TMD1PREMISgt ltlinkingEventIdentifiergt ltlinkingEventIdentifierTypegtECHODEP Hub Event ltlinkingEventIdentifierTypegt ltlinkingEventIdentifierValuegtecho12345ltlinkingEventIdentifierValuegt ltlinkingEventIdentifiergtltdigiprovMD ID=DP1EVENTgt ltpremiseventgt lteventIdentifiergt lteventIdentifierTypegtECHODEP Hub EventlteventIdentifierTypegt lteventIdentifierValuegtecho12345 lteventIdentifierValuegt lteventIdentifiergt lteventTypegtingestionlteventTypegt lteventDateTimegt2006-05-02T151253 lteventDateTimegtlteventgt
Elements defined both in METS and PREMIS METS IDIdref used to associate metadata in different sections and for different
files PREMIS identifiers explicit linking between entity types
ltstructMap TYPE=ldquophysicalrdquogt ltdiv ORDER=1 TYPE=textgt ltfptr FILEID=FID9gt ltdiv ORDER=1 TYPE=page LABEL= Page [1]gt ltfptr FILEID=FID1gtltmetsdivgt ltdiv ORDER=2 TYPE=page LABEL= Page [2]gt ltfptr FILEID=FID2gtltmetsdivgt ltdivgt
ltrelationshipgt ltrelationshipTypegtstructuralltrelationshipTypegt ltrelationshipSubTypegtis sibling of ltrelationshipSubTypegt ltrelatedObjectIdentificationgt ltrelatedObjectIdentifierTypegtUCBltrelatedObjectIdentifierTypegt ltrelatedObjectIdentifierValuegtFID2ltrelatedObjectIdentifierValuegt ltrelatedObjectSequencegt1ltrelatedObjectSequencegt
Elements defined both in METS and PREMIS METS structMap
bull details structural relationships and is the heart of the METS documentbull hierarchical so may be more expressive than PREMIS semantic unitsbull links the elements of the structure to content files and metadata
PREMIS ltrelationshipgt bull details all kinds of relationships including structuralbull data dictionary says that implementations may record by other means
Should semantic units be recorded redundantly
Various options are possible when there is overlap between PREMIS and METS or PREMIS and other technical metadata schemasbull Record only in METSbull Record only in PREMISbull Record in both
Are there advantages in using PREMIS semantic units Is it important to keep PREMIS metadata together as a unit
There may be an advantage for reuse and maintenance purposes
How to record elements from 2 different technical metadata schemas
Format specific metadata may be included in addition to PREMIS general technical metadata
Use multiple techMD sections and specify source in MDType attribute andor namespace declarationbull eg MDTYPE=ldquoNISOIMGrdquo or ldquoPREMISrdquobull Give MIX schema declaration in METS document
MIX was recently revised to correspond with the revision of the Z3987 technical metadata for digital still images standard names harmonized with corresponding PREMIS semantic units
For digital still images best practice may be to use PREMIS for general semantic units defined in PREMIS and MIX for format specific units without redundancy
Examples of PREMIS in XML
PREMIS in METSbull Portrait of Louis Armstrong (Library of Congress)bull Peoria County Illinois aerial photograph (ECHO
Depository UIUC Grainger Engineering Library) MATHARC implementation
httppigpenlibuchicagoedu8888pigpenuploads13asset_descr_mets_premis_02v2xml
MPEG-21 Digital Item Declaration (DID)
ISOIEC 21000-2 Digital Item Declaration A promising alternative to represent Digital Objects Starting to get supported by some repositories eg
aDORe DSpace Fedora A flexible and expressive model that easily represents
compound objects (recursive ldquoitemrdquo) Attach well-formed XML from persistent namespaces as
metadata
Abstract Model for MPEG-21 DID
resource resource resource
component component
descriptorstatement
descriptorstatement
descriptorstatement
descriptorstatement
item
item
container
resource datastream
component binding of descriptorstatements to datastreams
item represents a Digital Item aka Digital Object aka asset Descriptorstatement constructs convey information about the Digital Item
container grouping of items and descriptorstatement constructs pertaining to the container
Mapping
[Diagram: PREMIS mapped onto a DID. Labels: DID, DIDInfo, premis:premis, premis:object (three times), object1, object2, object3, object4, resource (three times)]
• Callout: All rights, events, and agents go here. The top-level object goes here. Other objects may be duplicated here or linked here
Partial Implementation in DID
[Diagram: partial PREMIS implementation in a DID. Labels: DID, DIDInfo, premis:premis, premis:format, premis:significantProperties, premis:creatingApplication, object1, object2, object3, object4, resource (three times)]
• Callout: When metadata are not sufficient to form the top-level PREMIS elements, partial implementation may be done if PREMIS elements are globally defined
Example of PREMIS in MPEG DID
PREMIS in MPEG DID
• aDORe example (LANL)
Summary: container formats
• A container format is needed to package together all forms of metadata (of which PREMIS is one) and digital content
• Use of a container is compatible with, and an implementation of, the OAIS information package concept
• Co-existence with other types of metadata requires best practices for both approaches; redundancy seems to be preferred
• Changes to the next version of the PREMIS XML schemas will facilitate a phased approach to full PREMIS implementation
• Development of registries (informal or formal) for controlled vocabularies will benefit implementation
• Tools are being developed to facilitate implementation
Summary: METS vs. MPEG-21 DID
• METS and MPEG DID are similar types of container formats: both are expressed in XML, both represent the structure of digital objects, and both include metadata
• MPEG DID doesn’t have the segmentation into metadata sections that METS does, so this implementation decision need not be made in DID
• METS is open source and developed through open discussion, mainly within the cultural heritage community
• MPEG DID is an ISO standard and has industry support, but it is often implemented in a proprietary way and its standards development is closed
• It would be possible to transform a METS container to an MPEG DID and vice versa; development of stylesheets will enable transformations
Implementers’ panel
• What types of objects are you preserving?
• Has your institution implemented a preservation repository?
• What preservation metadata are you recording?
• How are you recording it, e.g. database, METS/XML, other?
• Do you plan to exchange preservation metadata with other repositories?
• Are you planning to use, or already using, PREMIS?
• Which semantic units are most useful?
• Which semantic units are least useful?
• What difficulties have you had applying PREMIS units?
Inserting technical metadata in a METS Document
<mets>
  <amdSec>
    <techMD>
      <mdWrap>
        <xmlData>
          <!-- insert data from different namespace here -->
        </xmlData>
      </mdWrap>
    </techMD>
  </amdSec>
  <fileSec/>
  <structMap/>
</mets>
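A minimal Python sketch of that insertion step follows, assuming a METS skeleton that already contains an empty xmlData wrapper. The payload namespace (http://example.org/techmd), the element name imageWidth, and the helper insert_payload are hypothetical and exist only for the example.

```python
import xml.etree.ElementTree as ET

NS = {"mets": "http://www.loc.gov/METS/"}
EXAMPLE_NS = "http://example.org/techmd"  # hypothetical namespace for the payload

SKELETON = """<mets xmlns="http://www.loc.gov/METS/">
  <amdSec>
    <techMD ID="TMD1">
      <mdWrap MDTYPE="OTHER">
        <xmlData/>
      </mdWrap>
    </techMD>
  </amdSec>
  <fileSec/>
  <structMap/>
</mets>"""


def insert_payload(mets_root, tech_md_id, payload):
    """Append a foreign-namespace element to the xmlData of the named techMD section."""
    for tech_md in mets_root.iterfind(".//mets:techMD", NS):
        if tech_md.get("ID") == tech_md_id:
            xml_data = tech_md.find("mets:mdWrap/mets:xmlData", NS)
            xml_data.append(payload)
            return xml_data
    raise KeyError(tech_md_id)


root = ET.fromstring(SKELETON)
payload = ET.Element(f"{{{EXAMPLE_NS}}}imageWidth")
payload.text = "1024"
xml_data = insert_payload(root, "TMD1", payload)
print(ET.tostring(root, encoding="unicode"))
```

The METS document itself stays unchanged apart from the wrapper's contents, which is the sense in which xmlData acts as a socket for other schemas.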
Linking in METS Documents (XML ID/IDREF links)
[Diagram: ID/IDREF links among the METS sections. Labels: DescMD (mods, relatedItem), AdminMD (techMD, sourceMD, digiprovMD, rightsMD), fileGrp (file), StructMap (div, fptr)]
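To make the ID/IDREF mechanism concrete, here is a small Python sketch that follows the ADMID references of a file element to the administrative metadata sections they point at. The sample document is a stripped-down assumption (empty sections, no wrappers), and resolve_admid is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET

METS = """<mets xmlns="http://www.loc.gov/METS/">
  <amdSec>
    <techMD ID="TMD1PREMIS"/>
    <techMD ID="TMD1MIX"/>
    <digiprovMD ID="DP1EVENT"/>
    <digiprovMD ID="DP1AGENT"/>
  </amdSec>
  <fileSec>
    <fileGrp>
      <file ID="FID1" ADMID="TMD1PREMIS TMD1MIX DP1EVENT DP1AGENT"/>
    </fileGrp>
  </fileSec>
</mets>"""


def resolve_admid(root, file_id):
    """Return the elements referenced by the ADMID attribute of the given file."""
    # Index every element that carries an ID attribute.
    by_id = {el.get("ID"): el for el in root.iter() if el.get("ID")}
    file_el = by_id[file_id]
    # ADMID is a whitespace-separated list of IDREFs.
    return [by_id[ref] for ref in file_el.get("ADMID", "").split()]


root = ET.fromstring(METS)
sections = resolve_admid(root, "FID1")
# Strip the namespace part of each tag to show which section type was reached.
local_names = [el.tag.split("}")[1] for el in sections]
print(local_names)
```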
METS extension schemas
• “Wrappers” or “sockets” where elements from other schemas can be plugged in
• Provides extensibility
• Uses the XML Schema facility for combining vocabularies from different namespaces
• Endorsed extension schemas
    • Descriptive: MODS, DC, MARCXML
    • Technical metadata: MIX (image), textMD (text)
    • Preservation-related: PREMIS
Issues in using PREMIS with METS
• Which METS sections to use and how many
• Whether to record elements redundantly in PREMIS that are defined explicitly in the METS schema
• How to record elements that are also part of a format-specific technical metadata schema (e.g. MIX)
• Recording structural relationships
• How to deal with locally controlled vocabularies
• Whether to use the PREMIS container
PREMIS and METS sections
• Flexibility of METS requires implementation decisions
• You can’t put all PREMIS metadata directly under amdSec
• What sections to use for PREMIS metadata?
    • Alternative 1
        • Object in techMD
        • Event in digiprovMD
        • Rights in rightsMD
        • Agent with event or rights
    • Alternative 2
        • Everything in digiprovMD
    • Alternative 3
        • Everything in techMD
• How many administrative MD sections to use?
• Experimentation will result in best practices
<fileSec>
  <fileGrp>
    <file ID="FID1" SIZE="184302" ADMID="TMD1PREMIS TMD1MIX DP1EVENT DP1AGENT"
          CHECKSUM="4638bc65c5b9715557d09ad373eefd147382ecbf" CHECKSUMTYPE="SHA-1">
      <FLocat LOCTYPE="OTHER" xlink:href="BXF22.JPG"/>
    </file>
  </fileGrp>
</fileSec>
<techMD ID="TMD1PREMIS">
  <mdWrap MDTYPE="PREMIS">
    <xmlData>
      <premis:object>
        <objectCharacteristics>
          <fixity>
            <messageDigestAlgorithm>SHA-1</messageDigestAlgorithm>
            <messageDigest>4638bc65c5b9715557d09ad373eefd147382ecbf</messageDigest>
            <messageDigestOriginator>EchoDep</messageDigestOriginator>
          </fixity>
          <size>184302</size>
        </objectCharacteristics>
      </premis:object>
    </xmlData>
  </mdWrap>
</techMD>
Elements defined in both METS and PREMIS
• METS CHECKSUM, CHECKSUMTYPE
    • attributes of <file>
    • not repeatable
• PREMIS fixity
    • also includes messageDigestOriginator
    • allows multiples
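When the digest is recorded in both places, a repository may want to confirm that the two copies agree. The following Python sketch shows one way to do that, under simplifying assumptions: the elements are un-namespaced, the sample bytes stand in for real file content, and fixity_matches is a hypothetical helper rather than part of any METS or PREMIS toolkit.

```python
import hashlib
import xml.etree.ElementTree as ET


def sha1_hex(data: bytes) -> str:
    """Return the SHA-1 digest of the given bytes as lowercase hex."""
    return hashlib.sha1(data).hexdigest()


def fixity_matches(file_el, fixity_el) -> bool:
    """Compare the METS CHECKSUM/CHECKSUMTYPE attributes with a PREMIS fixity block."""
    algorithm = fixity_el.findtext("messageDigestAlgorithm", default="").strip()
    digest = fixity_el.findtext("messageDigest", default="").strip().lower()
    same_algorithm = file_el.get("CHECKSUMTYPE") == algorithm
    same_digest = file_el.get("CHECKSUM", "").lower() == digest
    return same_algorithm and same_digest


content = b"example content"  # stands in for the bytes of the stored file
digest = sha1_hex(content)

# Un-namespaced stand-ins for the METS <file> element and the PREMIS <fixity> block.
file_el = ET.Element("file", ID="FID1", CHECKSUM=digest, CHECKSUMTYPE="SHA-1")
fixity_el = ET.fromstring(
    "<fixity>"
    "<messageDigestAlgorithm>SHA-1</messageDigestAlgorithm>"
    f"<messageDigest>{digest}</messageDigest>"
    "</fixity>"
)

print(fixity_matches(file_el, fixity_el))
```

Because PREMIS fixity is repeatable, a fuller check would loop over every fixity block and compare only the one whose algorithm matches CHECKSUMTYPE.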
<fileSec>
  <fileGrp>
    <file ID="FID1" ADMID="TMD1PREMIS DP1EVENT DP1AGENT" MIMETYPE="image/jpeg">
      <FLocat LOCTYPE="OTHER" xlink:href="BXF22.JPG"/>
    </file>
  </fileGrp>
</fileSec>
<techMD ID="TMD1PREMIS">
  <mdWrap MDTYPE="PREMIS">
    <xmlData>
      <premis:object>
        <objectCharacteristics>
          <format>
            <formatDesignation>
              <formatName>image/jpeg</formatName>
              <formatVersion>1.02</formatVersion>
            </formatDesignation>
          </format>
        </objectCharacteristics>
      </premis:object>
    </xmlData>
  </mdWrap>
</techMD>
Elements defined both in METS and PREMIS
• METS MIMETYPE
    • attribute of <file>
    • optional
• PREMIS <format>
    • more granular: includes name and version (although the name may be the MIME type)
    • mandatory
<fileSec>
  <fileGrp>
    <file ID="FID1" ADMID="TMD1PREMIS TMD1MIX DP1EVENT DP1AGENT">
...
<techMD ID="TMD1PREMIS">
  ...
  <linkingEventIdentifier>
    <linkingEventIdentifierType>ECHODEP Hub Event</linkingEventIdentifierType>
    <linkingEventIdentifierValue>echo12345</linkingEventIdentifierValue>
  </linkingEventIdentifier>
...
<digiprovMD ID="DP1EVENT">
  ...
  <premis:event>
    <eventIdentifier>
      <eventIdentifierType>ECHODEP Hub Event</eventIdentifierType>
      <eventIdentifierValue>echo12345</eventIdentifierValue>
    </eventIdentifier>
    <eventType>ingestion</eventType>
    <eventDateTime>2006-05-02T15:12:53</eventDateTime>
  </premis:event>
Elements defined both in METS and PREMIS
• METS ID/IDREF: used to associate metadata in different sections and for different files
• PREMIS identifiers: explicit linking between entity types
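The PREMIS side of this linking can be followed without any knowledge of the METS IDs. The sketch below is illustrative only: it uses un-namespaced elements inside an invented root wrapper, and the helper names index_events and linked_events are assumptions for the example.

```python
import xml.etree.ElementTree as ET

DOC = """<root>
  <object>
    <linkingEventIdentifier>
      <linkingEventIdentifierType>ECHODEP Hub Event</linkingEventIdentifierType>
      <linkingEventIdentifierValue>echo12345</linkingEventIdentifierValue>
    </linkingEventIdentifier>
  </object>
  <event>
    <eventIdentifier>
      <eventIdentifierType>ECHODEP Hub Event</eventIdentifierType>
      <eventIdentifierValue>echo12345</eventIdentifierValue>
    </eventIdentifier>
    <eventType>ingestion</eventType>
  </event>
</root>"""


def index_events(root):
    """Map (identifier type, identifier value) to the event element that carries it."""
    events = {}
    for event in root.iter("event"):
        key = (
            event.findtext("eventIdentifier/eventIdentifierType", default="").strip(),
            event.findtext("eventIdentifier/eventIdentifierValue", default="").strip(),
        )
        events[key] = event
    return events


def linked_events(obj, events):
    """Return the events that an object points at through linkingEventIdentifier."""
    found = []
    for link in obj.iter("linkingEventIdentifier"):
        key = (
            link.findtext("linkingEventIdentifierType", default="").strip(),
            link.findtext("linkingEventIdentifierValue", default="").strip(),
        )
        if key in events:
            found.append(events[key])
    return found


root = ET.fromstring(DOC)
events = index_events(root)
matches = linked_events(root.find("object"), events)
print([m.findtext("eventType") for m in matches])
```

The match is made on the type and value pair together, since an identifier value is only meaningful within its identifier type.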
<structMap TYPE="physical">
  <div ORDER="1" TYPE="text">
    <fptr FILEID="FID9"/>
    <div ORDER="1" TYPE="page" LABEL=" Page [1]">
      <fptr FILEID="FID1"/>
    </div>
    <div ORDER="2" TYPE="page" LABEL=" Page [2]">
      <fptr FILEID="FID2"/>
    </div>
  </div>
</structMap>
<relationship>
  <relationshipType>structural</relationshipType>
  <relationshipSubType>is sibling of</relationshipSubType>
  <relatedObjectIdentification>
    <relatedObjectIdentifierType>UCB</relatedObjectIdentifierType>
    <relatedObjectIdentifierValue>FID2</relatedObjectIdentifierValue>
    <relatedObjectSequence>1</relatedObjectSequence>
  </relatedObjectIdentification>
</relationship>
Elements defined both in METS and PREMIS
• METS structMap
    • details structural relationships and is the heart of the METS document
    • hierarchical, so may be more expressive than PREMIS semantic units
    • links the elements of the structure to content files and metadata
• PREMIS <relationship>
    • details all kinds of relationships, including structural
    • the data dictionary says that implementations may record these by other means
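One consequence of this overlap is that sibling relationships need not be stored twice: they can be derived from the structMap when PREMIS output is required. The Python sketch below illustrates the idea under stated assumptions: the structMap is un-namespaced, each page div holds a single fptr, and the function name sibling_pairs is hypothetical.

```python
import xml.etree.ElementTree as ET

STRUCT_MAP = """<structMap TYPE="physical">
  <div ORDER="1" TYPE="text">
    <fptr FILEID="FID9"/>
    <div ORDER="1" TYPE="page"><fptr FILEID="FID1"/></div>
    <div ORDER="2" TYPE="page"><fptr FILEID="FID2"/></div>
  </div>
</structMap>"""


def sibling_pairs(struct_map):
    """Derive (file, "is sibling of", file) triples from divs that share a parent div."""
    pairs = []
    for parent in struct_map.iter("div"):
        # Direct child divs only, in the order given by their ORDER attribute.
        children = sorted(parent.findall("div"), key=lambda d: int(d.get("ORDER", "0")))
        file_ids = [
            child.find("fptr").get("FILEID")
            for child in children
            if child.find("fptr") is not None
        ]
        for first in file_ids:
            for second in file_ids:
                if first != second:
                    pairs.append((first, "is sibling of", second))
    return pairs


struct_map = ET.fromstring(STRUCT_MAP)
pairs = sibling_pairs(struct_map)
print(pairs)
```

Going the other way, from flat PREMIS relationships back to a hierarchy, is harder, which is one reason the structMap may be the better place to hold structure.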
Should semantic units be recorded redundantly?
• Various options are possible when there is overlap between PREMIS and METS, or between PREMIS and other technical metadata schemas
    • Record only in METS
    • Record only in PREMIS
    • Record in both
Are there advantages in using PREMIS semantic units Is it important to keep PREMIS metadata together as a unit
There may be an advantage for reuse and maintenance purposes
How to record elements from 2 different technical metadata schemas
Format specific metadata may be included in addition to PREMIS general technical metadata
Use multiple techMD sections and specify source in MDType attribute andor namespace declarationbull eg MDTYPE=ldquoNISOIMGrdquo or ldquoPREMISrdquobull Give MIX schema declaration in METS document
MIX was recently revised to correspond with the revision of the Z3987 technical metadata for digital still images standard names harmonized with corresponding PREMIS semantic units
For digital still images best practice may be to use PREMIS for general semantic units defined in PREMIS and MIX for format specific units without redundancy
Examples of PREMIS in XML
PREMIS in METSbull Portrait of Louis Armstrong (Library of Congress)bull Peoria County Illinois aerial photograph (ECHO
Depository UIUC Grainger Engineering Library) MATHARC implementation
httppigpenlibuchicagoedu8888pigpenuploads13asset_descr_mets_premis_02v2xml
MPEG-21 Digital Item Declaration (DID)
ISOIEC 21000-2 Digital Item Declaration A promising alternative to represent Digital Objects Starting to get supported by some repositories eg
aDORe DSpace Fedora A flexible and expressive model that easily represents
compound objects (recursive ldquoitemrdquo) Attach well-formed XML from persistent namespaces as
metadata
Abstract Model for MPEG-21 DID
resource resource resource
component component
descriptorstatement
descriptorstatement
descriptorstatement
descriptorstatement
item
item
container
resource datastream
component binding of descriptorstatements to datastreams
item represents a Digital Item aka Digital Object aka asset Descriptorstatement constructs convey information about the Digital Item
container grouping of items and descriptorstatement constructs pertaining to the container
Mapping
resource resource resource
object3 object4
premis object
premisobject
premispremis
DIDInfo
object2
object1
DIDAll rights events and agents go here The top level object goes here Other
objects may be duplicated here or linked here
premis object
Partial Implementation in DID
resource resource resource
object3 object4
premis format
premissignificantProperties
premispremis
DIDInfo
object2
object1
DIDWhen metadata are not sufficient to form
the top level PREMIS elements partial implementation may be done if PREMIS
elements are globally defined
premis creatingApplication
Example of PREMIS in MPEG DID
PREMIS in MPEG DIDbull aDORe example (LANL)
Summary container formats
A container format is needed to package together all forms of metadata (of which PREMIS is one) and digital content
Use of a container is compatible with and an implementation of the OAIS information package concept
Co-existence with other types of metadata requires best practices for both approaches redundancy seems to be preferred
Changes to the next version of the PREMIS XML schemas will facilitate a phased approach to full PREMIS implementation
Development of registries (informal or formal) for controlled vocabularies will benefit implementation
Tools are being developed to facilitate implementation
Summary METS vs MPEG 21 DID
METS and MPEG DID are similar types of container formats in that both are expressed in XML both represent the structure of digital objects and both include metadata
MPEG DID doesnrsquot have the segmentation in metadata sections that METS does so this implementation decision need not be made in DID
METS is open source and developed by open discussion mainly cultural heritage community
MPEG DID is an ISO standard and has industry support but is often implemented in a proprietary way and standards development is closed
It would be possible to transform a METS container to a MPEG DID and vice versa development of stylesheets will enable transformations
Implementers' panel
What types of objects are you preserving?
Has your institution implemented a preservation repository?
What preservation metadata are you recording?
How are you recording it, e.g., database, METS/XML, other?
Do you plan to exchange preservation metadata with other repositories?
Are you planning to use, or already using, PREMIS?
Which semantic units are most useful?
Which semantic units are least useful?
What difficulties have you had applying PREMIS units?
Linking in METS Documents (XML ID/IDREF links)
[Figure: ID/IDREF links among the sections of a METS document]
DescMD: mods (relatedItem, relatedItem)
AdminMD: techMD, sourceMD, digiprovMD, rightsMD
fileGrp: file, file
StructMap: div; div, fptr; div, fptr
METS extension schemas
“Wrappers” or “sockets” where elements from other schemas can be plugged in
Provides extensibility
Uses the XML Schema facility for combining vocabularies from different namespaces
Endorsed extension schemas
• Descriptive: MODS, DC, MARCXML
• Technical metadata: MIX (image), textMD (text)
• Preservation related: PREMIS
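The "socket" idea can be illustrated with a short Python sketch that plugs an element from one namespace into a METS wrapper. This is only an illustration under stated assumptions: the PREMIS namespace URI depends on the schema version in use, and the helper name `wrap_in_techmd` is hypothetical.

```python
import xml.etree.ElementTree as ET

METS_NS = "http://www.loc.gov/METS/"
PREMIS_NS = "http://www.loc.gov/premis/v3"  # assumption: use your PREMIS version

# Ask ElementTree to serialize with readable prefixes.
ET.register_namespace("mets", METS_NS)
ET.register_namespace("premis", PREMIS_NS)


def wrap_in_techmd(section_id, mdtype, payload):
    """Plug a foreign-namespace element into techMD/mdWrap/xmlData."""
    tech_md = ET.Element(f"{{{METS_NS}}}techMD", ID=section_id)
    md_wrap = ET.SubElement(tech_md, f"{{{METS_NS}}}mdWrap", MDTYPE=mdtype)
    xml_data = ET.SubElement(md_wrap, f"{{{METS_NS}}}xmlData")
    xml_data.append(payload)
    return tech_md


# A minimal PREMIS object recording only the file size.
premis_object = ET.Element(f"{{{PREMIS_NS}}}object")
characteristics = ET.SubElement(premis_object, f"{{{PREMIS_NS}}}objectCharacteristics")
ET.SubElement(characteristics, f"{{{PREMIS_NS}}}size").text = "184302"

tech_md = wrap_in_techmd("TMD1PREMIS", "PREMIS", premis_object)
print(ET.tostring(tech_md, encoding="unicode"))
```

The wrapper elements stay in the METS namespace while the payload keeps its own, which is what lets several extension schemas coexist in one document.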
Issues in using PREMIS with METS
Which METS sections to use and how many
Whether to record elements redundantly in PREMIS that are defined explicitly in the METS schema
How to record elements that are also part of a format-specific technical metadata schema (e.g., MIX)
Recording structural relationships
How to deal with locally controlled vocabularies
Whether to use the PREMIS container
PREMIS and METS sections
Flexibility of METS requires implementation decisions
You can't put all PREMIS metadata directly under amdSec
What sections to use for PREMIS metadata?
• Alternative 1
  • Object in techMD
  • Event in digiprovMD
  • Rights in rightsMD
  • Agent with event or rights
• Alternative 2
  • Everything in digiprovMD
• Alternative 3
  • Everything in techMD
How many administrative MD sections to use?
Experimentation will result in best practices
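The three alternatives above amount to a lookup from PREMIS entity to METS section. A minimal Python sketch follows; the function name and the `agent_linked_to` parameter are hypothetical, introduced only to express "Agent with event or rights".

```python
ENTITIES = ("object", "event", "rights", "agent")

# Section used for each PREMIS entity under each alternative.
ALTERNATIVES = {
    1: {"object": "techMD", "event": "digiprovMD", "rights": "rightsMD"},
    2: dict.fromkeys(ENTITIES, "digiprovMD"),
    3: dict.fromkeys(ENTITIES, "techMD"),
}


def mets_section(entity, alternative, agent_linked_to="event"):
    """Return the METS section that holds a PREMIS entity.

    Under Alternative 1 an agent has no section of its own: it is
    recorded alongside the event or rights statement it is linked to.
    """
    if alternative == 1 and entity == "agent":
        return ALTERNATIVES[1][agent_linked_to]
    return ALTERNATIVES[alternative][entity]


for alt in (1, 2, 3):
    print(alt, {entity: mets_section(entity, alt) for entity in ENTITIES})
```

Alternative 1 spreads PREMIS across sections by meaning; Alternatives 2 and 3 keep it together at the cost of stretching one section's definition.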
<fileSec>
  <fileGrp>
    <file ID="FID1" SIZE="184302" ADMID="TMD1PREMIS TMD1MIX DP1EVENT DP1AGENT"
          CHECKSUM="4638bc65c5b9715557d09ad373eefd147382ecbf" CHECKSUMTYPE="SHA-1">
      <FLocat LOCTYPE="OTHER" xlink:href="BXF22.JPG"/>
    </file>
  </fileGrp>
</fileSec>
<techMD ID="TMD1PREMIS">
  <mdWrap MDTYPE="PREMIS">
    <xmlData>
      <premis:object>
        <objectCharacteristics>
          <fixity>
            <messageDigestAlgorithm>SHA-1</messageDigestAlgorithm>
            <messageDigest>4638bc65c5b9715557d09ad373eefd147382ecbf</messageDigest>
            <messageDigestOriginator>EchoDep</messageDigestOriginator>
          </fixity>
          <size>184302</size>
        </objectCharacteristics>
      </premis:object>
    </xmlData>
  </mdWrap>
</techMD>
Elements defined in both METS and PREMIS
METS CHECKSUM, CHECKSUMTYPE
• attribute of <file>
• not repeatable
PREMIS fixity
• also includes messageDigestOriginator
• allows multiples
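When the digest is recorded in both places, the two copies can drift apart. The following Python sketch checks that the METS attributes and the PREMIS fixity block agree, and shows where such a digest comes from. The XML is a simplified, namespace-free excerpt of the example above, and the function names are hypothetical.

```python
import hashlib
import xml.etree.ElementTree as ET

METS_FILE = (
    '<file ID="FID1" SIZE="184302" '
    'CHECKSUM="4638bc65c5b9715557d09ad373eefd147382ecbf" CHECKSUMTYPE="SHA-1"/>'
)
PREMIS_FIXITY = """
<fixity>
  <messageDigestAlgorithm>SHA-1</messageDigestAlgorithm>
  <messageDigest>4638bc65c5b9715557d09ad373eefd147382ecbf</messageDigest>
  <messageDigestOriginator>EchoDep</messageDigestOriginator>
</fixity>
"""


def fixity_agrees(file_el, fixity_el):
    """True when METS CHECKSUM/CHECKSUMTYPE match the PREMIS fixity block."""
    algorithm = (fixity_el.findtext("messageDigestAlgorithm") or "").strip()
    digest = (fixity_el.findtext("messageDigest") or "").strip()
    return (
        file_el.get("CHECKSUMTYPE") == algorithm
        and file_el.get("CHECKSUM", "").lower() == digest.lower()
    )


def sha1_of(data: bytes) -> str:
    """Hex SHA-1 digest of the file content, as stored in both schemas."""
    return hashlib.sha1(data).hexdigest()


file_el = ET.fromstring(METS_FILE)
fixity_el = ET.fromstring(PREMIS_FIXITY)
print(fixity_agrees(file_el, fixity_el))
```

Because PREMIS fixity is repeatable and METS CHECKSUM is not, a repository holding several digests per file can mirror at most one of them in the `<file>` attributes.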
<fileSec>
  <fileGrp>
    <file ID="FID1" ADMID="TMD1PREMIS DP1EVENT DP1AGENT" MIMETYPE="image/jpeg">
      <FLocat LOCTYPE="OTHER" xlink:href="BXF22.JPG"/>
    </file>
  </fileGrp>
</fileSec>
<techMD ID="TMD1PREMIS">
  <mdWrap MDTYPE="PREMIS">
    <xmlData>
      <premis:object>
        <objectCharacteristics>
          <format>
            <formatDesignation>
              <formatName>image/jpeg</formatName>
              <formatVersion>1.02</formatVersion>
            </formatDesignation>
          </format>
        </objectCharacteristics>
      </premis:object>
    </xmlData>
  </mdWrap>
</techMD>
Elements defined both in METS and PREMIS
METS MIMETYPE
• attribute of <file>
• optional
PREMIS <format>
• more granular: includes name and version (although name may be MIMETYPE)
• mandatory
<fileSec>
  <fileGrp>
    <file ID="FID1" ADMID="TMD1PREMIS TMD1MIX DP1EVENT DP1AGENT"/>
  </fileGrp>
</fileSec>
<techMD ID="TMD1PREMIS">
  <!-- wrapper elements are not shown in this excerpt -->
  <linkingEventIdentifier>
    <linkingEventIdentifierType>ECHODEP Hub Event</linkingEventIdentifierType>
    <linkingEventIdentifierValue>echo12345</linkingEventIdentifierValue>
  </linkingEventIdentifier>
</techMD>
<digiprovMD ID="DP1EVENT">
  <premis:event>
    <eventIdentifier>
      <eventIdentifierType>ECHODEP Hub Event</eventIdentifierType>
      <eventIdentifierValue>echo12345</eventIdentifierValue>
    </eventIdentifier>
    <eventType>ingestion</eventType>
    <eventDateTime>2006-05-02T15:12:53</eventDateTime>
  </premis:event>
</digiprovMD>
Elements defined both in METS and PREMIS
METS ID/IDREF: used to associate metadata in different sections and for different files
PREMIS identifiers: explicit linking between entity types
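The two linking mechanisms can be followed side by side in a short Python sketch. It is illustrative only: the document is a simplified, namespace-free version of the excerpt above, with the wrapper elements left out, and the function names are hypothetical.

```python
import xml.etree.ElementTree as ET

DOC = """
<mets>
  <amdSec>
    <techMD ID="TMD1PREMIS">
      <linkingEventIdentifier>
        <linkingEventIdentifierType>ECHODEP Hub Event</linkingEventIdentifierType>
        <linkingEventIdentifierValue>echo12345</linkingEventIdentifierValue>
      </linkingEventIdentifier>
    </techMD>
    <digiprovMD ID="DP1EVENT">
      <event>
        <eventIdentifier>
          <eventIdentifierType>ECHODEP Hub Event</eventIdentifierType>
          <eventIdentifierValue>echo12345</eventIdentifierValue>
        </eventIdentifier>
        <eventType>ingestion</eventType>
      </event>
    </digiprovMD>
  </amdSec>
  <fileSec>
    <fileGrp>
      <file ID="FID1" ADMID="TMD1PREMIS DP1EVENT"/>
    </fileGrp>
  </fileSec>
</mets>
"""

root = ET.fromstring(DOC)
by_id = {el.get("ID"): el for el in root.iter() if el.get("ID")}


def admin_sections(file_el):
    """METS linking: resolve the IDREFS in ADMID to metadata sections."""
    return [by_id[ref] for ref in file_el.get("ADMID", "").split()]


def linked_events(tree):
    """PREMIS linking: match linkingEventIdentifier to eventIdentifier."""
    events = {}
    for event in tree.iter("event"):
        key = (
            event.findtext("eventIdentifier/eventIdentifierType"),
            event.findtext("eventIdentifier/eventIdentifierValue"),
        )
        events[key] = event
    matches = []
    for link in tree.iter("linkingEventIdentifier"):
        key = (
            link.findtext("linkingEventIdentifierType"),
            link.findtext("linkingEventIdentifierValue"),
        )
        if key in events:
            matches.append(events[key])
    return matches


print([section.tag for section in admin_sections(by_id["FID1"])])
print([event.findtext("eventType") for event in linked_events(root)])
```

The ID/IDREF links only work inside one XML document, whereas the PREMIS identifier pair still resolves after the metadata has been exported to a database or another container.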
<structMap TYPE="physical">
  <div ORDER="1" TYPE="text">
    <fptr FILEID="FID9"/>
    <div ORDER="1" TYPE="page" LABEL=" Page [1]">
      <fptr FILEID="FID1"/>
    </div>
    <div ORDER="2" TYPE="page" LABEL=" Page [2]">
      <fptr FILEID="FID2"/>
    </div>
  </div>
</structMap>

<relationship>
  <relationshipType>structural</relationshipType>
  <relationshipSubType>is sibling of</relationshipSubType>
  <relatedObjectIdentification>
    <relatedObjectIdentifierType>UCB</relatedObjectIdentifierType>
    <relatedObjectIdentifierValue>FID2</relatedObjectIdentifierValue>
    <relatedObjectSequence>1</relatedObjectSequence>
  </relatedObjectIdentification>
</relationship>
Elements defined both in METS and PREMIS
METS structMap
• details structural relationships and is the heart of the METS document
• hierarchical, so may be more expressive than PREMIS semantic units
• links the elements of the structure to content files and metadata
PREMIS <relationship>
• details all kinds of relationships, including structural
• data dictionary says that implementations may record by other means
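Because the structMap already encodes the hierarchy, flat PREMIS-style sibling relationships can be derived from it rather than recorded twice. The Python sketch below shows one way to do that. It assumes a namespace-free structMap in which each page div has a single fptr; the function name and the tuple layout are hypothetical.

```python
import xml.etree.ElementTree as ET

STRUCT_MAP = """
<structMap TYPE="physical">
  <div ORDER="1" TYPE="text">
    <fptr FILEID="FID9"/>
    <div ORDER="1" TYPE="page"><fptr FILEID="FID1"/></div>
    <div ORDER="2" TYPE="page"><fptr FILEID="FID2"/></div>
  </div>
</structMap>
"""


def sibling_relationships(struct_map):
    """List (object, relationshipSubType, related object) for files
    whose divs share the same parent div."""
    relationships = []
    for parent in struct_map.iter("div"):
        file_ids = []
        for child in parent.findall("div"):
            fptr = child.find("fptr")
            if fptr is not None:
                file_ids.append(fptr.get("FILEID"))
        for file_id in file_ids:
            for other_id in file_ids:
                if other_id != file_id:
                    relationships.append((file_id, "is sibling of", other_id))
    return relationships


for relationship in sibling_relationships(ET.fromstring(STRUCT_MAP)):
    print(relationship)
```

The reverse derivation is not generally possible: a set of "is sibling of" statements does not say which div is the parent, which is the sense in which the structMap is more expressive.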
Should semantic units be recorded redundantly?
Various options are possible when there is overlap between PREMIS and METS, or between PREMIS and other technical metadata schemas
• Record only in METS
• Record only in PREMIS
• Record in both
Are there advantages in using PREMIS semantic units?
Is it important to keep PREMIS metadata together as a unit?
There may be an advantage for reuse and maintenance purposes
How to record elements from two different technical metadata schemas
Format-specific metadata may be included in addition to PREMIS general technical metadata
Use multiple techMD sections and specify the source in the MDTYPE attribute and/or namespace declaration
• e.g., MDTYPE=“NISOIMG” or “PREMIS”
• Give MIX schema declaration in METS document
MIX was recently revised to correspond with the revision of the Z39.87 standard (technical metadata for digital still images); names were harmonized with the corresponding PREMIS semantic units
For digital still images, best practice may be to use PREMIS for the general semantic units defined in PREMIS and MIX for format-specific units, without redundancy
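The "without redundancy" practice can be sketched as a routing step: each extracted property goes to exactly one techMD section. In the Python sketch below, the set of properties treated as general PREMIS units and the sample image values are assumptions made for illustration, and the function name is hypothetical.

```python
# Assumption: properties treated here as general PREMIS semantic units.
PREMIS_UNITS = {
    "size",
    "formatName",
    "formatVersion",
    "messageDigest",
    "messageDigestAlgorithm",
}


def split_technical_metadata(properties):
    """Route each property to one techMD section, keyed by its MDTYPE.

    General units go to the PREMIS section; everything else is treated
    as format specific and goes to the MIX section (MDTYPE "NISOIMG").
    """
    premis, mix = {}, {}
    for name, value in properties.items():
        (premis if name in PREMIS_UNITS else mix)[name] = value
    return {"PREMIS": premis, "NISOIMG": mix}


# Hypothetical output of a metadata extraction tool for one JPEG file.
extracted = {
    "size": 184302,
    "formatName": "image/jpeg",
    "imageWidth": 1024,   # illustrative value
    "imageHeight": 768,   # illustrative value
}

sections = split_technical_metadata(extracted)
for mdtype, values in sections.items():
    print(mdtype, values)
```

Each resulting group would be serialized into its own techMD section, and the file's ADMID would reference both.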
Examples of PREMIS in XML
PREMIS in METS
• Portrait of Louis Armstrong (Library of Congress)
• Peoria County, Illinois aerial photograph (ECHO Depository, UIUC Grainger Engineering Library)
MATHARC implementation
http://pigpen.lib.uchicago.edu:8888/pigpen/uploads/13/asset_descr_mets_premis_02v2.xml
MPEG-21 Digital Item Declaration (DID)
ISO/IEC 21000-2: Digital Item Declaration
A promising alternative to represent Digital Objects
Starting to get supported by some repositories, e.g., aDORe, DSpace, Fedora
A flexible and expressive model that easily represents compound objects (recursive “item”)
Attach well-formed XML from persistent namespaces as metadata
Abstract Model for MPEG-21 DID
[Figure: a container holding nested items; items hold components, components hold resources, and descriptor/statement constructs are attached at each level.]
resource: datastream
component: binding of descriptor/statements to datastreams
item: represents a Digital Item, aka Digital Object, aka asset; descriptor/statement constructs convey information about the Digital Item
container: grouping of items and descriptor/statement constructs pertaining to the container
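The abstract model can be written down as a handful of Python dataclasses. This is a sketch of the model as described on the slide, not of the DIDL schema itself. The class and field names are assumptions, and the sample identifiers and statements are illustrative.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Descriptor:
    statement: str  # information about the construct it is attached to


@dataclass
class Resource:
    ref: str  # the datastream


@dataclass
class Component:
    # Binds descriptor/statements to one or more datastreams.
    resources: List[Resource] = field(default_factory=list)
    descriptors: List[Descriptor] = field(default_factory=list)


@dataclass
class Item:
    # A Digital Item; items may nest, which is how compound objects are built.
    descriptors: List[Descriptor] = field(default_factory=list)
    components: List[Component] = field(default_factory=list)
    items: List["Item"] = field(default_factory=list)


@dataclass
class Container:
    descriptors: List[Descriptor] = field(default_factory=list)
    items: List[Item] = field(default_factory=list)


def count_resources(item: Item) -> int:
    """Count the datastreams reachable from an item, including nested items."""
    own = sum(len(component.resources) for component in item.components)
    return own + sum(count_resources(child) for child in item.items)


page = Item(components=[Component(resources=[Resource("FID1"), Resource("FID2")])])
book = Item(
    descriptors=[Descriptor("top-level object")],
    components=[Component(resources=[Resource("FID9")])],
    items=[page],
)
shelf = Container(descriptors=[Descriptor("collection")], items=[book])
print(count_resources(book))
```

The recursive `items` field is what the slide calls the recursive "item": any PREMIS-style hierarchy of representations, files, and bitstreams can be expressed by nesting.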
METS extension schemas
ldquowrappersrdquo or ldquosocketsrdquo where elements from other schemas can be plugged in
Provides extensibility Uses the XML Schema facility for combining vocabularies from
different Namespaces Endorsed extension schemas
bull Descriptive MODS DC MARCXMLbull Technical metadata MIX (image) textMD (text)bull Preservation related PREMIS
Issues in using PREMIS with METS
Which METS sections to use and how many Whether to record elements redundantly in PREMIS that are
defined explicitly in the METS schema How to record elements that are also part of a format
specific technical metadata schema (eg MIX) Recording structural relationships How to deal with locally controlled vocabularies Whether to use the PREMIS container
PREMIS and METS sections
Flexibility of METS requires implementation decisions You canrsquot put all PREMIS metadata directly under amdSec What sections to use for PREMIS metadata
bull Alternative 1bull Object in techMDbull Event in digiProvMDbull Rights in rightsMDbull Agent with event or rights
bull Alternative 2bull Everything in digiProvMD
bull Alternative 3bull Everything in techMD
How many administrative MD sections to use Experimentation will result in best practices
ltfileSecgtltfileGrpgtltfile ID=FID1 SIZE=184302 ADMID=TMD1PREMIS TMD1MIX DP1EVENT
DP1AGENTldquo CHECKSUM=4638bc65c5b9715557d09ad373eefd147382ecbf CHECKSUMTYPE=SHA-1gt
ltFLocat LOCTYPE=OTHER xlinkhref=BXF22JPG gtltfilegtltfileGrpgtltfileSecgtlttechMD ID=TMD1PREMISgt ltmdWrap MDTYPE=PREMISgt ltxmlDatagt
ltpremisobject gt ltobjectCharacteristicsgt ltfixitygt ltmessageDigestAlgorithmgtSHA-1 ltmessageDigestAlgorithmgt ltmessageDigestgt4638bc65c5b9715557d09ad373eefd147382ecbf
ltmessageDigestgt ltmessageDigestOriginatorgtEchoDepmessageDigestOriginatorgt ltfixitygt ltsizegt184302ltsizegt ltobjectCharacteristicsgt
Elements defined in both METS and PREMISbull METS Checksum Checksumtype
bull attribute of ltfilegtbull not repeatable
PREMIS fixitybull also includes messageDigestOriginatorbull allows multiples
ltfileSecgtltfileGrpgtltfile ID=FID1 ADMID=TMD1PREMIS DP1EVENT DP1AGENTldquo
MIMETYPE=imagejpeg ltFLocat LOCTYPE=OTHER xlinkhref=BXF22JPGgtltfilegtltfileGrpgtltfileSecgt
lttechMD ID=TMD1PREMISldquo ltmdWrap MDTYPE=PREMISgt ltxmlDatagt ltpremisobjectgt ltobjectCharacteristicsgt ltformatgt ltformatDesignationgt ltformatNamegtimagejpegltformatNamegt ltformatVersiongt102 ltformatVersiongt ltformatDesignationgtltformatgt ltobjectCharacteristicsgtElements defined both in METS and PREMISbull METS MIMETYPE
bull attribute of ltfilegtbull optional
PREMIS ltformatgt bull more granular includes name and version (although name may be MIMETYPE)bull mandatory
<fileSec> <fileGrp> <file ID="FID1" ADMID="TMD1PREMIS TMD1MIX DP1EVENT DP1AGENT">
<techMD ID="TMD1PREMIS">
  <linkingEventIdentifier>
    <linkingEventIdentifierType>ECHODEP Hub Event</linkingEventIdentifierType>
    <linkingEventIdentifierValue>echo12345</linkingEventIdentifierValue>
  </linkingEventIdentifier>
<digiprovMD ID="DP1EVENT">
  <premis:event>
    <eventIdentifier>
      <eventIdentifierType>ECHODEP Hub Event</eventIdentifierType>
      <eventIdentifierValue>echo12345</eventIdentifierValue>
    </eventIdentifier>
    <eventType>ingestion</eventType>
    <eventDateTime>2006-05-02T15:12:53</eventDateTime>
  </premis:event>

Elements defined both in METS and PREMIS
METS ID/IDREF
• used to associate metadata in different sections and for different files
PREMIS identifiers
• explicit linking between entity types
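To make the METS ID/IDREF mechanism concrete, here is a minimal Python sketch that follows the ADMID references of a <file> to the metadata sections they name. The fragment is a simplified, namespace-free stand-in for the slide example, and resolve_admid is a hypothetical helper, not part of any METS toolkit.

```python
import xml.etree.ElementTree as ET

# Simplified METS fragment (no namespaces, empty metadata sections).
METS_FRAGMENT = """
<mets>
  <amdSec>
    <techMD ID="TMD1PREMIS"/>
    <techMD ID="TMD1MIX"/>
    <digiprovMD ID="DP1EVENT"/>
    <digiprovMD ID="DP1AGENT"/>
  </amdSec>
  <fileSec><fileGrp>
    <file ID="FID1" ADMID="TMD1PREMIS TMD1MIX DP1EVENT DP1AGENT"/>
  </fileGrp></fileSec>
</mets>
"""


def resolve_admid(root: ET.Element, file_id: str) -> list:
    """Return the element names of the sections referenced by a file's ADMID."""
    by_id = {el.get("ID"): el for el in root.iter() if el.get("ID")}
    file_el = by_id[file_id]
    # ADMID is a whitespace-separated list of IDREFs.
    return [by_id[ref].tag for ref in file_el.get("ADMID", "").split()]


root = ET.fromstring(METS_FRAGMENT)
sections = resolve_admid(root, "FID1")
print(sections)
```

PREMIS linking identifiers such as linkingEventIdentifierValue carry the same association inside the metadata itself, so it survives outside the METS wrapper.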
<structMap TYPE="physical">
  <div ORDER="1" TYPE="text">
    <fptr FILEID="FID9"/>
    <div ORDER="1" TYPE="page" LABEL=" Page [1]"> <fptr FILEID="FID1"/> </div>
    <div ORDER="2" TYPE="page" LABEL=" Page [2]"> <fptr FILEID="FID2"/> </div>
  </div>

<relationship>
  <relationshipType>structural</relationshipType>
  <relationshipSubType>is sibling of</relationshipSubType>
  <relatedObjectIdentification>
    <relatedObjectIdentifierType>UCB</relatedObjectIdentifierType>
    <relatedObjectIdentifierValue>FID2</relatedObjectIdentifierValue>
    <relatedObjectSequence>1</relatedObjectSequence>

Elements defined both in METS and PREMIS
METS structMap
• details structural relationships and is the heart of the METS document
• hierarchical, so may be more expressive than PREMIS semantic units
• links the elements of the structure to content files and metadata
PREMIS <relationship>
• details all kinds of relationships, including structural
• data dictionary says that implementations may record by other means
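Because the structMap already encodes the hierarchy, PREMIS-style structural relationships could in principle be derived from it rather than recorded by hand. The Python sketch below illustrates that idea on a namespace-free copy of the slide's structMap; sibling_relationships is a hypothetical helper, and treating pages under the same parent div as siblings is an assumption of this sketch.

```python
import xml.etree.ElementTree as ET

STRUCT_MAP = """
<structMap TYPE="physical">
  <div ORDER="1" TYPE="text">
    <fptr FILEID="FID9"/>
    <div ORDER="1" TYPE="page" LABEL="Page [1]"><fptr FILEID="FID1"/></div>
    <div ORDER="2" TYPE="page" LABEL="Page [2]"><fptr FILEID="FID2"/></div>
  </div>
</structMap>
"""


def sibling_relationships(struct_map: ET.Element) -> list:
    """Return (file, "is sibling of", other file) triples for divs sharing a parent."""
    triples = []
    for parent in struct_map.iter("div"):
        # Collect the file pointed to by each direct child div.
        files = [
            child.find("fptr").get("FILEID")
            for child in parent.findall("div")
            if child.find("fptr") is not None
        ]
        for file_id in files:
            for other_id in files:
                if other_id != file_id:
                    triples.append((file_id, "is sibling of", other_id))
    return triples


relationships = sibling_relationships(ET.fromstring(STRUCT_MAP))
print(relationships)
```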
Should semantic units be recorded redundantly?
Various options are possible when there is overlap between PREMIS and METS, or between PREMIS and other technical metadata schemas:
• Record only in METS
• Record only in PREMIS
• Record in both
Are there advantages in using PREMIS semantic units? Is it important to keep PREMIS metadata together as a unit?
• There may be an advantage for reuse and maintenance purposes
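If a repository chooses to record in both places, the two copies can drift apart, so a consistency check is a natural companion to that option. The following Python sketch compares the METS CHECKSUM attribute with the PREMIS messageDigest it references. It assumes a simplified, namespace-free document, and redundant_checksums_agree is a hypothetical name.

```python
import xml.etree.ElementTree as ET

DOC = """
<mets>
  <amdSec>
    <techMD ID="TMD1PREMIS"><mdWrap MDTYPE="PREMIS"><xmlData>
      <object><objectCharacteristics><fixity>
        <messageDigestAlgorithm>SHA-1</messageDigestAlgorithm>
        <messageDigest>4638bc65c5b9715557d09ad373eefd147382ecbf</messageDigest>
      </fixity></objectCharacteristics></object>
    </xmlData></mdWrap></techMD>
  </amdSec>
  <fileSec><fileGrp>
    <file ID="FID1" ADMID="TMD1PREMIS" CHECKSUMTYPE="SHA-1"
          CHECKSUM="4638bc65c5b9715557d09ad373eefd147382ecbf"/>
  </fileGrp></fileSec>
</mets>
"""


def redundant_checksums_agree(root: ET.Element) -> bool:
    """True if every METS checksum matches a PREMIS fixity block it links to."""
    sections = {el.get("ID"): el for el in root.iter("techMD")}
    for file_el in root.iter("file"):
        wanted = (file_el.get("CHECKSUMTYPE"), file_el.get("CHECKSUM"))
        recorded = set()
        for ref in file_el.get("ADMID", "").split():
            section = sections.get(ref)
            if section is None:
                continue  # reference to a non-techMD section
            for fixity in section.iter("fixity"):
                recorded.add((
                    fixity.findtext("messageDigestAlgorithm", "").strip(),
                    fixity.findtext("messageDigest", "").strip(),
                ))
        # Files with no PREMIS fixity at all are not treated as conflicts.
        if recorded and wanted not in recorded:
            return False
    return True


root = ET.fromstring(DOC)
print(redundant_checksums_agree(root))
```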
How to record elements from 2 different technical metadata schemas
Format-specific metadata may be included in addition to PREMIS general technical metadata
Use multiple techMD sections and specify the source in the MDTYPE attribute and/or a namespace declaration
• e.g. MDTYPE="NISOIMG" or "PREMIS"
• Give MIX schema declaration in METS document
MIX was recently revised to correspond with the revision of the Z39.87 standard (technical metadata for digital still images); names were harmonized with corresponding PREMIS semantic units
For digital still images, best practice may be to use PREMIS for the general semantic units defined in PREMIS and MIX for format-specific units, without redundancy
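A minimal Python sketch of the "multiple techMD sections" approach follows: one section wraps PREMIS with MDTYPE="PREMIS" and another wraps MIX with MDTYPE="NISOIMG". The tech_md helper is hypothetical, namespaces are omitted, and the payloads are empty placeholder elements rather than real PREMIS or MIX content.

```python
import xml.etree.ElementTree as ET


def tech_md(section_id: str, mdtype: str, payload: ET.Element) -> ET.Element:
    """Wrap a metadata payload in techMD/mdWrap/xmlData (hypothetical helper)."""
    section = ET.Element("techMD", {"ID": section_id})
    wrap = ET.SubElement(section, "mdWrap", {"MDTYPE": mdtype})
    ET.SubElement(wrap, "xmlData").append(payload)
    return section


amd_sec = ET.Element("amdSec")
# General technical metadata goes in PREMIS; the payload is a placeholder.
amd_sec.append(tech_md("TMD1PREMIS", "PREMIS", ET.Element("object")))
# Format-specific image metadata goes in MIX; the payload is a placeholder.
amd_sec.append(tech_md("TMD1MIX", "NISOIMG", ET.Element("mix")))

mdtypes = [section.find("mdWrap").get("MDTYPE") for section in amd_sec]
print(mdtypes)
print(ET.tostring(amd_sec, encoding="unicode"))
```

A <file> can then list both section IDs in its ADMID attribute, as in the slide examples.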
Examples of PREMIS in XML
PREMIS in METS
• Portrait of Louis Armstrong (Library of Congress)
• Peoria County, Illinois aerial photograph (ECHO Depository, UIUC Grainger Engineering Library)
MATHARC implementation
• http://pigpen.lib.uchicago.edu:8888/pigpen/uploads/13/asset_descr_mets_premis_02v2.xml
MPEG-21 Digital Item Declaration (DID)
ISO/IEC 21000-2: Digital Item Declaration
A promising alternative to represent Digital Objects
Starting to get supported by some repositories, e.g. aDORe, DSpace, Fedora
A flexible and expressive model that easily represents compound objects (recursive "item")
Attach well-formed XML from persistent namespaces as metadata
Abstract Model for MPEG-21 DID
[Figure: box diagram with boxes labelled container, item, component, and resource, plus descriptor/statement constructs]
resource: datastream
component: binding of descriptor/statements to datastreams
item: represents a Digital Item, aka Digital Object, aka asset; descriptor/statement constructs convey information about the Digital Item
container: grouping of items and descriptor/statement constructs pertaining to the container
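The abstract model above can be mirrored as a small set of Python dataclasses, which makes the recursive "item" explicit. This is an illustrative sketch only: the class and field names are assumptions chosen to match the slide's vocabulary, not an MPEG-21 API, and the sample file names are invented.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Resource:
    locator: str  # the datastream


@dataclass
class Descriptor:
    statement: str  # metadata carried by a descriptor/statement construct


@dataclass
class Component:
    # Binds descriptor/statements to datastreams.
    resources: List[Resource]
    descriptors: List[Descriptor] = field(default_factory=list)


@dataclass
class Item:
    # A Digital Item; items may nest to represent compound objects.
    descriptors: List[Descriptor] = field(default_factory=list)
    components: List[Component] = field(default_factory=list)
    items: List["Item"] = field(default_factory=list)


@dataclass
class Container:
    descriptors: List[Descriptor] = field(default_factory=list)
    items: List[Item] = field(default_factory=list)


def count_resources(item: Item) -> int:
    """Count datastreams in an item, descending into nested items."""
    own = sum(len(component.resources) for component in item.components)
    return own + sum(count_resources(child) for child in item.items)


page1 = Item(components=[Component([Resource("page1.jpg")])])
page2 = Item(components=[Component([Resource("page2.jpg"), Resource("page2.txt")])])
book = Item(descriptors=[Descriptor("premis:object goes here")], items=[page1, page2])
container = Container(items=[book])
print(count_resources(book))
```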
Mapping
[Figure: the DID box diagram with PREMIS metadata attached; labels: resource, object1 to object4, premis:object, premis:premis, DIDInfo, DID]
Callouts: All rights, events and agents go here. The top-level object goes here. Other objects may be duplicated here or linked here.
Partial Implementation in DID
[Figure: the DID box diagram with individual PREMIS elements attached; labels: resource, object1 to object4, premis:format, premis:significantProperties, premis:creatingApplication, premis:premis, DIDInfo, DID]
Callout: When metadata are not sufficient to form the top-level PREMIS elements, partial implementation may be done if PREMIS elements are globally defined.
Example of PREMIS in MPEG DID
PREMIS in MPEG DID
• aDORe example (LANL)
Summary: container formats
A container format is needed to package together all forms of metadata (of which PREMIS is one) and digital content
Use of a container is compatible with, and an implementation of, the OAIS information package concept
Co-existence with other types of metadata requires best practices; for both approaches, redundancy seems to be preferred
Changes to the next version of the PREMIS XML schemas will facilitate a phased approach to full PREMIS implementation
Development of registries (informal or formal) for controlled vocabularies will benefit implementation
Tools are being developed to facilitate implementation
Summary: METS vs MPEG-21 DID
METS and MPEG DID are similar types of container formats, in that both are expressed in XML, both represent the structure of digital objects, and both include metadata
MPEG DID doesn't have the segmentation into metadata sections that METS does, so this implementation decision need not be made in DID
METS is open source and developed by open discussion, mainly in the cultural heritage community
MPEG DID is an ISO standard and has industry support, but is often implemented in a proprietary way, and standards development is closed
It would be possible to transform a METS container to an MPEG DID and vice versa; development of stylesheets will enable transformations
Implementers' panel
What types of objects are you preserving?
Has your institution implemented a preservation repository?
What preservation metadata are you recording?
How are you recording it, e.g. database, METS/XML, other?
Do you plan to exchange preservation metadata with other repositories?
Are you planning to use, or already using, PREMIS?
Which semantic units are most useful?
Which semantic units are least useful?
What difficulties have you had applying PREMIS units?
Issues in using PREMIS with METS
Which METS sections to use and how many
Whether to record elements redundantly in PREMIS that are defined explicitly in the METS schema
How to record elements that are also part of a format-specific technical metadata schema (e.g. MIX)
Recording structural relationships
How to deal with locally controlled vocabularies
Whether to use the PREMIS container
PREMIS and METS sections
Flexibility of METS requires implementation decisions
You can't put all PREMIS metadata directly under amdSec
What sections to use for PREMIS metadata?
• Alternative 1
  • Object in techMD
  • Event in digiprovMD
  • Rights in rightsMD
  • Agent with event or rights
• Alternative 2
  • Everything in digiprovMD
• Alternative 3
  • Everything in techMD
How many administrative MD sections to use?
Experimentation will result in best practices
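The three alternatives can be written down as a lookup table, which is roughly what a METS writer would need in order to decide where each PREMIS entity goes. The Python sketch below is an illustration under assumptions: the table layout, the mets_section function, and the agent_with parameter are all hypothetical.

```python
# Hypothetical lookup table for the three alternatives listed above.
SECTION_FOR_ENTITY = {
    1: {"object": "techMD", "event": "digiprovMD", "rights": "rightsMD"},
    2: {
        "object": "digiprovMD",
        "event": "digiprovMD",
        "rights": "digiprovMD",
        "agent": "digiprovMD",
    },
    3: {"object": "techMD", "event": "techMD", "rights": "techMD", "agent": "techMD"},
}


def mets_section(entity: str, alternative: int, agent_with: str = "event") -> str:
    """Return the METS section name for a PREMIS entity under one alternative."""
    table = SECTION_FOR_ENTITY[alternative]
    if entity == "agent" and "agent" not in table:
        # Alternative 1: the agent is stored with the event or the rights it relates to.
        return table[agent_with]
    return table[entity]


for entity in ("object", "event", "rights", "agent"):
    print(entity, [mets_section(entity, alt) for alt in (1, 2, 3)])
```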
ltfileSecgtltfileGrpgtltfile ID=FID1 ADMID=TMD1PREMIS DP1EVENT DP1AGENTldquo
MIMETYPE=imagejpeg ltFLocat LOCTYPE=OTHER xlinkhref=BXF22JPGgtltfilegtltfileGrpgtltfileSecgt
lttechMD ID=TMD1PREMISldquo ltmdWrap MDTYPE=PREMISgt ltxmlDatagt ltpremisobjectgt ltobjectCharacteristicsgt ltformatgt ltformatDesignationgt ltformatNamegtimagejpegltformatNamegt ltformatVersiongt102 ltformatVersiongt ltformatDesignationgtltformatgt ltobjectCharacteristicsgtElements defined both in METS and PREMISbull METS MIMETYPE
bull attribute of ltfilegtbull optional
PREMIS ltformatgt bull more granular includes name and version (although name may be MIMETYPE)bull mandatory
Elements defined both in METS and PREMIS

    <fileSec>
      <fileGrp>
        <file ID="FID1" ADMID="TMD1PREMIS TMD1MIX DP1EVENT DP1AGENT">
          <!-- FLocat omitted -->
        </file>
      </fileGrp>
    </fileSec>

    <techMD ID="TMD1PREMIS">
      <!-- mdWrap, xmlData and premis:object wrappers omitted -->
      <linkingEventIdentifier>
        <linkingEventIdentifierType>ECHODEP Hub Event</linkingEventIdentifierType>
        <linkingEventIdentifierValue>echo12345</linkingEventIdentifierValue>
      </linkingEventIdentifier>
    </techMD>

    <digiprovMD ID="DP1EVENT">
      <!-- mdWrap and xmlData wrappers omitted -->
      <premis:event>
        <eventIdentifier>
          <eventIdentifierType>ECHODEP Hub Event</eventIdentifierType>
          <eventIdentifierValue>echo12345</eventIdentifierValue>
        </eventIdentifier>
        <eventType>ingestion</eventType>
        <eventDateTime>2006-05-02T15:12:53</eventDateTime>
      </premis:event>
    </digiprovMD>

METS ID/IDREF
• used to associate metadata in different sections and for different files

PREMIS identifiers
• explicit linking between entity types
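The two linking mechanisms can both be followed programmatically. The sketch below is illustrative only: it assumes a simplified, namespace-free fragment modelled on the example above (real METS and PREMIS documents declare namespaces and wrap PREMIS in mdWrap/xmlData), and the helper names `resolve_admid` and `linked_events` are hypothetical.

```python
import xml.etree.ElementTree as ET

# Simplified, namespace-free fragment modelled on the slide example.
# Assumption: a real document would use the METS and PREMIS namespaces.
METS_DOC = """
<mets>
  <amdSec>
    <techMD ID="TMD1PREMIS">
      <object>
        <linkingEventIdentifier>
          <linkingEventIdentifierType>ECHODEP Hub Event</linkingEventIdentifierType>
          <linkingEventIdentifierValue>echo12345</linkingEventIdentifierValue>
        </linkingEventIdentifier>
      </object>
    </techMD>
    <digiprovMD ID="DP1EVENT">
      <event>
        <eventIdentifier>
          <eventIdentifierType>ECHODEP Hub Event</eventIdentifierType>
          <eventIdentifierValue>echo12345</eventIdentifierValue>
        </eventIdentifier>
        <eventType>ingestion</eventType>
      </event>
    </digiprovMD>
  </amdSec>
  <fileSec>
    <fileGrp>
      <file ID="FID1" ADMID="TMD1PREMIS DP1EVENT"/>
    </fileGrp>
  </fileSec>
</mets>
"""


def resolve_admid(root, file_id):
    """METS-style linking: follow a file's ADMID IDREFs to metadata sections."""
    by_id = {el.get("ID"): el for el in root.iter() if el.get("ID")}
    refs = by_id[file_id].get("ADMID", "").split()
    return [by_id[ref] for ref in refs if ref in by_id]


def linked_events(root):
    """PREMIS-style linking: match linkingEventIdentifier to eventIdentifier."""
    events = {}
    for event in root.iter("event"):
        key = (event.findtext("eventIdentifier/eventIdentifierType"),
               event.findtext("eventIdentifier/eventIdentifierValue"))
        events[key] = event.findtext("eventType")
    links = []
    for link in root.iter("linkingEventIdentifier"):
        key = (link.findtext("linkingEventIdentifierType"),
               link.findtext("linkingEventIdentifierValue"))
        # Pair the identifier value with the type of the event it points to.
        links.append((key[1], events.get(key)))
    return links


root = ET.fromstring(METS_DOC)
sections = [el.tag for el in resolve_admid(root, "FID1")]
events = linked_events(root)
```

The METS route depends on XML IDs that are only meaningful inside one document, whereas the PREMIS identifiers travel with the metadata if it is exported on its own.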
Elements defined both in METS and PREMIS

    <structMap TYPE="physical">
      <div ORDER="1" TYPE="text">
        <fptr FILEID="FID9"/>
        <div ORDER="1" TYPE="page" LABEL=" Page [1]">
          <fptr FILEID="FID1"/>
        </div>
        <div ORDER="2" TYPE="page" LABEL=" Page [2]">
          <fptr FILEID="FID2"/>
        </div>
      </div>
    </structMap>

    <relationship>
      <relationshipType>structural</relationshipType>
      <relationshipSubType>is sibling of</relationshipSubType>
      <relatedObjectIdentification>
        <relatedObjectIdentifierType>UCB</relatedObjectIdentifierType>
        <relatedObjectIdentifierValue>FID2</relatedObjectIdentifierValue>
        <relatedObjectSequence>1</relatedObjectSequence>
      </relatedObjectIdentification>
    </relationship>

METS structMap
• details structural relationships and is the heart of the METS document
• hierarchical, so may be more expressive than PREMIS semantic units
• links the elements of the structure to content files and metadata

PREMIS <relationship>
• details all kinds of relationships, including structural
• data dictionary says that implementations may record by other means
Should semantic units be recorded redundantly?
Various options are possible when there is overlap between PREMIS and METS, or between PREMIS and other technical metadata schemas:
• Record only in METS
• Record only in PREMIS
• Record in both
Are there advantages in using PREMIS semantic units? Is it important to keep PREMIS metadata together as a unit?
• There may be an advantage for reuse and maintenance purposes
How to record elements from 2 different technical metadata schemas?
Format-specific metadata may be included in addition to PREMIS general technical metadata
Use multiple techMD sections and specify the source in the MDTYPE attribute and/or namespace declaration
• e.g. MDTYPE="NISOIMG" or "PREMIS"
• Give MIX schema declaration in METS document
MIX was recently revised to correspond with the revision of the Z39.87 standard (technical metadata for digital still images); names were harmonized with the corresponding PREMIS semantic units
For digital still images, best practice may be to use PREMIS for the general semantic units defined in PREMIS and MIX for format-specific units, without redundancy
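As a rough illustration of the "multiple techMD sections" approach, the sketch below builds one METS amdSec holding a PREMIS section and a MIX section, each labelled through MDTYPE. The section IDs come from the earlier example. The payloads are assumptions: a trimmed, namespace-free PREMIS object and an empty placeholder standing in for a real MIX record. `add_tech_md` is a hypothetical helper.

```python
import xml.etree.ElementTree as ET

METS_NS = "http://www.loc.gov/METS/"
ET.register_namespace("mets", METS_NS)


def add_tech_md(amd_sec, section_id, mdtype, payload):
    """Append a techMD section whose mdWrap names the source schema in MDTYPE."""
    tech = ET.SubElement(amd_sec, f"{{{METS_NS}}}techMD", ID=section_id)
    wrap = ET.SubElement(tech, f"{{{METS_NS}}}mdWrap", MDTYPE=mdtype)
    ET.SubElement(wrap, f"{{{METS_NS}}}xmlData").append(payload)
    return tech


# General technical metadata: a trimmed PREMIS object (namespace omitted here).
premis_payload = ET.fromstring(
    "<object><objectCharacteristics><format><formatDesignation>"
    "<formatName>image/jpeg</formatName>"
    "</formatDesignation></format></objectCharacteristics></object>"
)
# Format-specific metadata: empty placeholder for a real MIX record.
mix_payload = ET.Element("mix")

amd_sec = ET.Element(f"{{{METS_NS}}}amdSec")
add_tech_md(amd_sec, "TMD1PREMIS", "PREMIS", premis_payload)
add_tech_md(amd_sec, "TMD1MIX", "NISOIMG", mix_payload)

# The value a <file> element would carry in ADMID to point at both sections.
admid = " ".join(section.get("ID") for section in amd_sec)
# Which schema each section claims to contain.
mdtypes = {section.get("ID"): section[0].get("MDTYPE") for section in amd_sec}

print(ET.tostring(amd_sec, encoding="unicode"))
```

Keeping each schema in its own techMD section means a consumer can select the PREMIS or the MIX record by MDTYPE without parsing the other.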
Examples of PREMIS in XML
PREMIS in METS
• Portrait of Louis Armstrong (Library of Congress)
• Peoria County, Illinois aerial photograph (ECHO Depository, UIUC Grainger Engineering Library)
• MATHARC implementation
http://pigpen.lib.uchicago.edu:8888/pigpen/uploads/13/asset_descr_mets_premis_02v2.xml
MPEG-21 Digital Item Declaration (DID)
• ISO/IEC 21000-2: Digital Item Declaration
• A promising alternative to represent Digital Objects
• Starting to get supported by some repositories, e.g. aDORe, DSpace, Fedora
• A flexible and expressive model that easily represents compound objects (recursive "item")
• Attach well-formed XML from persistent namespaces as metadata
Abstract Model for MPEG-21 DID
[Diagram labels: container, item (twice), component (twice), descriptor/statement (four times), resource (three times)]
• resource: datastream
• component: binding of descriptor/statements to datastreams
• item: represents a Digital Item, aka Digital Object, aka asset; descriptor/statement constructs convey information about the Digital Item
• container: grouping of items and descriptor/statement constructs pertaining to the container
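The abstract model is recursive, since an item may contain further items. The sketch below expresses that nesting as Python dataclasses. The class and field names mirror the terms in the legend above and are not taken from any MPEG-21 library; `count_resources` and the sample file names are hypothetical.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Descriptor:
    # In DIDL the statement would typically carry XML, for example PREMIS.
    statement: str


@dataclass
class Resource:
    ref: str  # location of the datastream


@dataclass
class Component:
    # Binds descriptor/statements to one or more datastreams.
    resources: List[Resource]
    descriptors: List[Descriptor] = field(default_factory=list)


@dataclass
class Item:
    descriptors: List[Descriptor] = field(default_factory=list)
    components: List[Component] = field(default_factory=list)
    items: List["Item"] = field(default_factory=list)  # recursive "item"


@dataclass
class Container:
    descriptors: List[Descriptor] = field(default_factory=list)
    items: List[Item] = field(default_factory=list)


def count_resources(item: Item) -> int:
    """Count datastreams in an item, descending into nested items."""
    own = sum(len(component.resources) for component in item.components)
    return own + sum(count_resources(sub) for sub in item.items)


# A compound object: one "book" item holding two "page" items.
page1 = Item(components=[Component(resources=[Resource("page1.jpg")])])
page2 = Item(components=[Component(resources=[Resource("page2.jpg")])])
book = Item(descriptors=[Descriptor("title: example book")], items=[page1, page2])
box = Container(items=[book])

total = sum(count_resources(item) for item in box.items)
```

Because descriptors can be attached at the container, item and component levels, metadata such as PREMIS can sit next to the level of the object it describes.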
Mapping
[Diagram labels: DID, DIDInfo, premis:premis, object1 to object4, premis:object (three times), resource (three times)]
Callout: All rights, events and agents go here. The top level object goes here. Other objects may be duplicated here or linked here.
Partial Implementation in DID
[Diagram labels: DID, DIDInfo, premis:premis, object1 to object4, premis:format, premis:significantProperties, premis:creatingApplication, resource (three times)]
Callout: When metadata are not sufficient to form the top level PREMIS elements, partial implementation may be done if PREMIS elements are globally defined.
Example of PREMIS in MPEG DID
PREMIS in MPEG DID
• aDORe example (LANL)
Summary: container formats
• A container format is needed to package together all forms of metadata (of which PREMIS is one) and digital content
• Use of a container is compatible with, and an implementation of, the OAIS information package concept
• Co-existence with other types of metadata requires best practices for both approaches; redundancy seems to be preferred
• Changes to the next version of the PREMIS XML schemas will facilitate a phased approach to full PREMIS implementation
• Development of registries (informal or formal) for controlled vocabularies will benefit implementation
• Tools are being developed to facilitate implementation
Summary: METS vs. MPEG-21 DID
• METS and MPEG DID are similar types of container formats in that both are expressed in XML, both represent the structure of digital objects, and both include metadata
• MPEG DID doesn't have the segmentation into metadata sections that METS does, so this implementation decision need not be made in DID
• METS is open source and developed by open discussion, mainly within the cultural heritage community
• MPEG DID is an ISO standard and has industry support, but is often implemented in a proprietary way, and standards development is closed
• It would be possible to transform a METS container to an MPEG DID and vice versa; development of stylesheets will enable transformations
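The slides anticipate stylesheets (XSLT) for this transformation. The sketch below uses Python instead, purely to show the kind of element mapping such a stylesheet might perform. The mapping itself is an assumption made for illustration: each METS file becomes a DIDL Component holding one Resource, inside a single Item. Descriptors, the structMap and all metadata sections are ignored, and `mets_to_didl` is a hypothetical helper.

```python
import xml.etree.ElementTree as ET

METS_NS = "http://www.loc.gov/METS/"
XLINK_NS = "http://www.w3.org/1999/xlink"
DIDL_NS = "urn:mpeg:mpeg21:2002:02-DIDL-NS"

# Minimal METS input modelled on the fileSec example earlier in the slides.
METS_DOC = f"""
<mets:mets xmlns:mets="{METS_NS}" xmlns:xlink="{XLINK_NS}">
  <mets:fileSec>
    <mets:fileGrp>
      <mets:file ID="FID1" MIMETYPE="image/jpeg">
        <mets:FLocat LOCTYPE="OTHER" xlink:href="BXF22.JPG"/>
      </mets:file>
    </mets:fileGrp>
  </mets:fileSec>
</mets:mets>
"""


def mets_to_didl(mets_root):
    """Map every METS file to a DIDL Component/Resource inside one Item."""
    didl = ET.Element(f"{{{DIDL_NS}}}DIDL")
    item = ET.SubElement(didl, f"{{{DIDL_NS}}}Item")
    for file_el in mets_root.iter(f"{{{METS_NS}}}file"):
        flocat = file_el.find(f"{{{METS_NS}}}FLocat")
        if flocat is None:
            continue  # embedded content (FContent) is out of scope for this sketch
        component = ET.SubElement(item, f"{{{DIDL_NS}}}Component")
        ET.SubElement(
            component,
            f"{{{DIDL_NS}}}Resource",
            mimeType=file_el.get("MIMETYPE", ""),
            ref=flocat.get(f"{{{XLINK_NS}}}href", ""),
        )
    return didl


didl = mets_to_didl(ET.fromstring(METS_DOC))
resources = didl.findall(f".//{{{DIDL_NS}}}Resource")
```

A full transformation would also need to decide where the METS metadata sections go, since DID has no equivalent segmentation.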
Implementers' panel
• What types of objects are you preserving?
• Has your institution implemented a preservation repository?
• What preservation metadata are you recording?
• How are you recording it, e.g. database, METS/XML, other?
• Do you plan to exchange preservation metadata with other repositories?
• Are you planning to use, or already using, PREMIS?
• Which semantic units are most useful?
• Which semantic units are least useful?
• What difficulties have you had applying PREMIS units?
MPEG-21 Digital Item Declaration (DID)
ISOIEC 21000-2 Digital Item Declaration A promising alternative to represent Digital Objects Starting to get supported by some repositories eg
aDORe DSpace Fedora A flexible and expressive model that easily represents
compound objects (recursive ldquoitemrdquo) Attach well-formed XML from persistent namespaces as
metadata
Abstract Model for MPEG-21 DID
resource resource resource
component component
descriptorstatement
descriptorstatement
descriptorstatement
descriptorstatement
item
item
container
resource datastream
component binding of descriptorstatements to datastreams
item represents a Digital Item aka Digital Object aka asset Descriptorstatement constructs convey information about the Digital Item
container grouping of items and descriptorstatement constructs pertaining to the container
Mapping
resource resource resource
object3 object4
premis object
premisobject
premispremis
DIDInfo
object2
object1
DIDAll rights events and agents go here The top level object goes here Other
objects may be duplicated here or linked here
premis object
Partial Implementation in DID
resource resource resource
object3 object4
premis format
premissignificantProperties
premispremis
DIDInfo
object2
object1
DIDWhen metadata are not sufficient to form
the top level PREMIS elements partial implementation may be done if PREMIS
elements are globally defined
premis creatingApplication
Example of PREMIS in MPEG DID
PREMIS in MPEG DID
• aDORe example (LANL)
Summary: container formats
• A container format is needed to package together all forms of metadata (of which PREMIS is one) and digital content
• Use of a container is compatible with, and an implementation of, the OAIS information package concept
• Co-existence with other types of metadata requires best practices for both approaches; redundancy seems to be preferred
• Changes to the next version of the PREMIS XML schemas will facilitate a phased approach to full PREMIS implementation
• Development of registries (informal or formal) for controlled vocabularies will benefit implementation
• Tools are being developed to facilitate implementation
Summary: METS vs MPEG-21 DID
• METS and MPEG DID are similar types of container formats in that both are expressed in XML, both represent the structure of digital objects, and both include metadata
• MPEG DID doesn't have the segmentation into metadata sections that METS does, so this implementation decision need not be made in DID
• METS is open source and developed through open discussion, mainly in the cultural heritage community
• MPEG DID is an ISO standard and has industry support, but is often implemented in a proprietary way, and standards development is closed
• It would be possible to transform a METS container to an MPEG DID and vice versa; development of stylesheets will enable transformations
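To give a feel for what such a transformation involves, here is a toy sketch in one direction only. It is written in Python rather than as the stylesheet the slide has in mind, and it covers only file locations from the METS file section. The DIDL namespace is an assumption, the sample document is hypothetical and deliberately incomplete (no structMap, so it is not schema-valid METS), and the function name is made up for illustration.

```python
import xml.etree.ElementTree as ET

METS_NS = "http://www.loc.gov/METS/"
XLINK_NS = "http://www.w3.org/1999/xlink"
DIDL_NS = "urn:mpeg:mpeg21:2002:02-DIDL-NS"  # assumption
ET.register_namespace("didl", DIDL_NS)

# Hypothetical, deliberately incomplete METS fragment used only as input.
SAMPLE_METS = f"""<mets xmlns="{METS_NS}" xmlns:xlink="{XLINK_NS}">
  <fileSec>
    <fileGrp>
      <file ID="F1" MIMETYPE="image/tiff">
        <FLocat LOCTYPE="URL" xlink:href="http://example.org/page1.tif"/>
      </file>
      <file ID="F2" MIMETYPE="text/plain">
        <FLocat LOCTYPE="URL" xlink:href="http://example.org/page1.txt"/>
      </file>
    </fileGrp>
  </fileSec>
</mets>"""


def mets_files_to_did(mets_xml):
    """Turn each METS file with an FLocat into a DIDL component/resource pair."""
    mets = ET.fromstring(mets_xml)
    root = ET.Element(f"{{{DIDL_NS}}}DIDL")
    item = ET.SubElement(root, f"{{{DIDL_NS}}}Item")
    for file_el in mets.iter(f"{{{METS_NS}}}file"):
        locat = file_el.find(f"{{{METS_NS}}}FLocat")
        if locat is None:
            continue  # embedded content is out of scope for this sketch
        component = ET.SubElement(item, f"{{{DIDL_NS}}}Component")
        ET.SubElement(
            component,
            f"{{{DIDL_NS}}}Resource",
            ref=locat.get(f"{{{XLINK_NS}}}href", ""),
            mimeType=file_el.get("MIMETYPE", "application/octet-stream"),
        )
    return root


if __name__ == "__main__":
    print(ET.tostring(mets_files_to_did(SAMPLE_METS), encoding="unicode"))
```

A real transformation would also have to carry over the structural map and the metadata sections, which is where most of the mapping decisions lie.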
Implementers' panel
• What types of objects are you preserving?
• Has your institution implemented a preservation repository?
• What preservation metadata are you recording?
• How are you recording it, e.g. database, METS/XML, other?
• Do you plan to exchange preservation metadata with other repositories?
• Are you planning to use, or already using, PREMIS?
• Which semantic units are most useful?
• Which semantic units are least useful?
• What difficulties have you had applying PREMIS units?