Case History: Library of Congress Audio-Visual Prototyping Project
METS Opening Day
October 27, 2003
Carl Fleischhauer
Office of Strategic Initiatives
Library of Congress
cfle@loc.gov
The AV Project
Preservation, sense one: reformatting into digital-file form
Preservation, sense two: sustaining digital objects
Participation by the Motion Picture, Broadcasting, and Recorded Sound Division (M/B/RS) and the American Folklife Center
Reformatting Documentation
About the source: original disc or tape being reformatted
About the process: how the copy file was made, what devices/tools
About the outcome: characteristics and features of the copy file
Reference Model for an Open Archival Information System (OAIS)
[Diagram labels: Producers, Ingest, Archival Storage, Data Management, Administration, Preservation Planning, Access, Consumers]
SIPs (Submission Information Packages) will be produced by the AV preservation activity, ready to submit to LC's future digital repository.
AV Project Web Site Home Page http://lcweb.loc.gov/rr/mopic/avprot/
AV Project Extension Schema Page http://lcweb.loc.gov/rr/mopic/avprot/metsmenu2.html
AV Project Initial Data Capture System
MS-Access Database - Collation Input Screen
Top level: work
Second level: sound recordings
Third level: disc sides
Fourth level: cuts
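The four levels listed on this slide can be modeled as nested div elements in a METS structMap. The following is a minimal sketch, not the project's actual code: the TYPE values, labels, and the `add_div` and `depth` helpers are hypothetical names chosen for illustration. It shows why a recursive structure handles an indefinite number of levels where a fixed work/type/subtype scheme does not.

```python
import xml.etree.ElementTree as ET

METS_NS = "http://www.loc.gov/METS/"
ET.register_namespace("mets", METS_NS)


def add_div(parent, node):
    """Recursively append a mets:div for node and for each of its children."""
    div = ET.SubElement(
        parent,
        f"{{{METS_NS}}}div",
        {"TYPE": node["type"], "LABEL": node["label"]},
    )
    for child in node.get("children", []):
        add_div(div, child)
    return div


def depth(div):
    """Count the nested div levels at and below this element."""
    children = div.findall(f"{{{METS_NS}}}div")
    return 1 + max((depth(c) for c in children), default=0)


# Hypothetical album: work > sound recordings > disc sides > cuts.
album = {
    "type": "work",
    "label": "Phonograph album (hypothetical)",
    "children": [
        {
            "type": "soundRecordings",
            "label": "Sound recordings",
            "children": [
                {
                    "type": "discSide",
                    "label": "Side A",
                    "children": [
                        {"type": "cut", "label": "Cut 1"},
                        {"type": "cut", "label": "Cut 2"},
                    ],
                },
                {
                    "type": "discSide",
                    "label": "Side B",
                    "children": [{"type": "cut", "label": "Cut 1"}],
                },
            ],
        },
        {"type": "images", "label": "Cover and label images"},
    ],
}

struct_map = ET.Element(f"{{{METS_NS}}}structMap")
root_div = add_div(struct_map, album)
print(depth(root_div))
print(ET.tostring(struct_map, encoding="unicode"))
```

Because `add_div` calls itself for every child, adding a fifth or sixth level needs no change to the code, only deeper nesting in the input.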
Workflow Sidebar
Recorded Sound Processing Section: content selected for reformatting
1. Initial creation or copying-in of metadata
LC Recording Lab or offsite contractor: scanning activity
2. Creation of second layer of metadata
3. Return loop to processing, edit and possible addition of third layer of metadata
The AV METS System Today
OUTCOME ONE: A VIRTUAL DIGITAL OBJECT (SIP)
Logical storage structure based in a UNIX filesystem
Index of master/afc/afc1941001/sr05
master -- family of logical directories where the master files are stored (there is a parallel set of service directories)
  afc -- owner is the American Folklife Center
    afc1941001 -- group or aggregate of items, often from an actual collection
      sr05 -- item directory (at the level of the digital object, counterpart to a bib record or line in a finding aid)
        sr05am.wav -- the master file for side A of this disc
        sr05am.wav -- the master file for side B of this disc
OUTCOME ONE: VIRTUAL DIGITAL OBJECT
The fileGrp segment of a METS instance binds the object
Includes logical pathnames for files; future switch to persistent names possible.
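As a rough illustration of how a fileGrp can bind an object's files by logical pathname, here is a minimal sketch. It is an assumption-laden example rather than the project's generator: the helper names, the FILE IDs, the MIME type, the OTHERLOCTYPE value, and the second file name (sr05bm.wav) are all hypothetical. Only the path master/afc/afc1941001/sr05/sr05am.wav comes from the slide.

```python
import xml.etree.ElementTree as ET
from pathlib import PurePosixPath

METS_NS = "http://www.loc.gov/METS/"
XLINK_NS = "http://www.w3.org/1999/xlink"
ET.register_namespace("mets", METS_NS)
ET.register_namespace("xlink", XLINK_NS)


def parse_logical_path(path):
    """Split a logical pathname into the five levels named on the slide."""
    family, owner, aggregate, item, filename = PurePosixPath(path).parts
    return {
        "family": family,        # master or service
        "owner": owner,          # e.g. afc
        "aggregate": aggregate,  # e.g. afc1941001
        "item": item,            # e.g. sr05
        "filename": filename,
    }


def build_file_grp(use, logical_paths):
    """Build a mets:fileGrp whose files point at logical pathnames."""
    grp = ET.Element(f"{{{METS_NS}}}fileGrp", {"USE": use})
    for n, path in enumerate(logical_paths, start=1):
        file_el = ET.SubElement(
            grp,
            f"{{{METS_NS}}}file",
            # ID pattern and MIME type are assumptions for this sketch.
            {"ID": f"FILE{n:03d}", "MIMETYPE": "audio/x-wav"},
        )
        ET.SubElement(
            file_el,
            f"{{{METS_NS}}}FLocat",
            {
                "LOCTYPE": "OTHER",
                "OTHERLOCTYPE": "logical-path",  # hypothetical label
                f"{{{XLINK_NS}}}href": path,
            },
        )
    return grp


paths = [
    "master/afc/afc1941001/sr05/sr05am.wav",
    "master/afc/afc1941001/sr05/sr05bm.wav",  # hypothetical side B name
]
grp = build_file_grp("master", paths)
print(parse_logical_path(paths[0]))
print(ET.tostring(grp, encoding="unicode"))
```

Keeping the pathname in a single href attribute means a later switch to persistent names would touch only the FLocat values, not the rest of the METS instance.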
OUTCOME TWO: PRESENTATION OF OBJECT
Presentation in Browser
Zoom on Image in Presentation
Interim username/password access management
In the Presentation: Metadata Map for the Dedicated
sourceMD data from the Metadata Map
Extension schema content displayed as name-value pairs
Generator takes data from the database and makes METS XML
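The generator step can be pictured roughly as follows. This is a minimal sketch under stated assumptions, not the project's software: the table names (`mets_object`, `struct_div`), their columns, the OBJID value, and the functions `setup_demo_db` and `generate_mets` are hypothetical, and an in-memory SQLite database stands in for the real back end.

```python
import sqlite3
import xml.etree.ElementTree as ET

METS_NS = "http://www.loc.gov/METS/"
ET.register_namespace("mets", METS_NS)


def q(tag):
    """Return a namespace-qualified METS tag name."""
    return f"{{{METS_NS}}}{tag}"


def setup_demo_db():
    """Create a tiny in-memory database with a hypothetical schema."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE mets_object (objid TEXT PRIMARY KEY, label TEXT);
        CREATE TABLE struct_div (
            id INTEGER PRIMARY KEY, objid TEXT, parent_id INTEGER,
            type TEXT, label TEXT, seq INTEGER);
        INSERT INTO mets_object VALUES ('afc1941001_sr05', 'Demo disc');
        INSERT INTO struct_div VALUES
            (1, 'afc1941001_sr05', NULL, 'work', 'Demo disc', 1),
            (2, 'afc1941001_sr05', 1, 'discSide', 'Side A', 1),
            (3, 'afc1941001_sr05', 1, 'discSide', 'Side B', 2),
            (4, 'afc1941001_sr05', 2, 'cut', 'Cut 1', 1);
        """
    )
    return conn


def generate_mets(conn, objid):
    """Read one object's rows and return a mets:mets element."""
    row = conn.execute(
        "SELECT label FROM mets_object WHERE objid = ?", (objid,)
    ).fetchone()
    if row is None:
        raise KeyError(objid)
    mets = ET.Element(q("mets"), {"OBJID": objid, "LABEL": row[0]})
    struct_map = ET.SubElement(mets, q("structMap"))

    def add_children(parent_el, parent_id):
        # "IS ?" matches NULL for the root row and integers for the rest.
        rows = conn.execute(
            "SELECT id, type, label FROM struct_div "
            "WHERE objid = ? AND parent_id IS ? ORDER BY seq",
            (objid, parent_id),
        ).fetchall()
        for div_id, div_type, label in rows:
            div = ET.SubElement(
                parent_el, q("div"), {"TYPE": div_type, "LABEL": label}
            )
            add_children(div, div_id)

    add_children(struct_map, None)
    return mets


mets = generate_mets(setup_demo_db(), "afc1941001_sr05")
print(ET.tostring(mets, encoding="unicode"))
```

Storing each div with a pointer to its parent is one common way to hold a tree of arbitrary depth in relational tables; the recursive query walk then rebuilds the nesting when the XML is written.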
Snapshot of the database back end
Selection from the database diagram: tables for METS id, agent information, and structMap data
Selection from the database diagram: tables for extension schema data for image source, video source, and audio source
Selection from the database diagram: tables for digiProv (digitization process) information
Builder: the data-entry front end to the database
Builder: template making tool
Builder: tool to shape a structMap using indent, outdent, up, and down. May be used in both template and individual object modes.
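The four shaping operations can be sketched on a simple in-memory tree. This is only an illustration of one plausible meaning for indent, outdent, up, and down; how the Builder tool actually implements them is not documented here, and the `Node` class and function names are hypothetical.

```python
class Node:
    """One division in a structMap outline."""

    def __init__(self, label):
        self.label = label
        self.parent = None
        self.children = []

    def add(self, child):
        child.parent = self
        self.children.append(child)
        return child


def move_up(node):
    """Swap node with its preceding sibling, if any."""
    if node.parent is None:
        return
    sibs = node.parent.children
    i = sibs.index(node)
    if i > 0:
        sibs[i - 1], sibs[i] = sibs[i], sibs[i - 1]


def move_down(node):
    """Swap node with its following sibling, if any."""
    if node.parent is None:
        return
    sibs = node.parent.children
    i = sibs.index(node)
    if i < len(sibs) - 1:
        sibs[i + 1], sibs[i] = sibs[i], sibs[i + 1]


def indent(node):
    """Make node the last child of its preceding sibling."""
    if node.parent is None:
        return
    sibs = node.parent.children
    i = sibs.index(node)
    if i == 0:
        return  # no preceding sibling to nest under
    sibs.pop(i)
    sibs[i - 1].add(node)


def outdent(node):
    """Make node the sibling that follows its current parent."""
    parent = node.parent
    if parent is None or parent.parent is None:
        return  # the root and its direct children stay where they are
    grand = parent.parent
    parent.children.remove(node)
    grand.children.insert(grand.children.index(parent) + 1, node)
    node.parent = grand


def outline(node, level=0):
    """Return the tree as indented text lines."""
    lines = ["  " * level + node.label]
    for child in node.children:
        lines.extend(outline(child, level + 1))
    return lines


root = Node("work")
side_a = root.add(Node("Side A"))
cut_1 = root.add(Node("Cut 1"))
side_b = root.add(Node("Side B"))

indent(cut_1)  # Cut 1 becomes a child of Side A
print("\n".join(outline(root)))
```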
Cut wizard: a "twenty more like this one" tool
Part of MODS descriptive data for a recorded interview with a former enslaved person.
File Association Tool
Tool to append a MODS record
Two samples from the MODS entry and editing tool.
+ repeats the section
x and delete sections or subsections
Selection from the online data dictionary
Some METS objects, by title
Administration Tool Menu
Example of data entry screen
Blue terms are used to select separate data entry screens
Some Shortcomings
Cumbersome data entry: many screens, many actions
Bugs: hard to get them all fixed now that the contractor is gone
Best if users understand METS and the structMap: barrier to entry for new team members
Does not include tools for bulk compilation from pre-existing data
Distributed Data Entry
Hoped-for future
Each team enters its own data in less cumbersome local tools
Tool for descriptive data, especially copying in and out of the ILS
Tool for data about the source item and certain technical aspects, copied in and out of MAVIS
Tool for digiProv data, the engineers' form
Tool or a MAVIS extension to encode the structMap
Supporting Tools
To support the hoped-for future
Centralized tool to gather and compile the various XML data units into a METS instance
Facility to manage the METS XML documents
Fiddling our way to the future? Listen for hints in Corey Keith's talk tomorrow . . . .
That's all in this talk today. Thank you!
Greetings. This is a two-part case history. Part one is a story about the project and its development, and part two takes a look at the METS-making tool that is in place today.
This project is strongly oriented toward preservation, meaning both reformatting older physical materials and sustaining the digital result, and that accounts for the project's particular shape. Two special collections divisions have participated: M/B/RS and AFC.
We have a high interest in documentation and wanted data about the source item, i.e., the entity that was being reformatted; about the process, i.e., how the reformatting was carried out; and about the outcome, i.e., the details of the digital file that reproduces the original item. As a result, we have tried to capture quite a bit of metadata.
Regarding preservation in the other sense--sustaining content once in digital form--we want to orient ourselves to the OAIS reference model for a digital-content repository. Our project didn't plan to build a repository, but we did want to produce a digital object that was as ready to submit to one as possible. (The city talk for this is SIP or submission information package.)
We started in late 1999, and by November 2000 the project had taken enough shape for me to give a talk at the DLF Forum in Chicago. I reported that our group was taken with the MOA2 (Making of America 2) metadata model, then limited to images and texts. Jerry McDonough (NYU) and Mackenzie Smith (then Harvard, now MIT) collared me at that meeting and explained that they were thinking about expanding the MOA2 structure and also finding ways to embrace audio. Would LC wish to join that effort, they asked. I said that we surely would.
So we did join--the LC group was led by Morgan Cundiff. Dick Thaxter from the Motion Picture, Broadcasting, and Recorded Sound Division and I joined in during some of the early meetings. During 2001-2002, with the help of our contractor, User Technology Associates, the AV project was able to sketch out some extension schemas.
Some of this extension-schema work has been taken and improved by others to give us the METS-community-endorsed schemas for descriptive, image, and text metadata. Meanwhile, with a great debt to David Ackerman of Harvard University's Library Digital Initiative and the Audio Engineering Society, we cooked up and continue to use our own working schema for audio and digiProv metadata. We've also got one for video but have not yet put it to work.
We use a relational database to capture the data, which is then output as XML. Our contractor cooked up some early data-capture software in MS-Access, which -- how shall I say -- taught us enough lessons to set the stage for the construction of the software we have today. One big lesson concerned the need for considerable recursion in the structMap. Our first stab at a relational database to capture the data only gave us three levels, what we called the work, the type, and the subtype. But we found that we wanted n levels, i.e., an indefinite number of levels.
For example, the reproduction of a phonograph album includes both sound recordings and entities like images, and it easily gave us four levels. The work is the parent of the sound recording division (and some image elements). In turn, the sound recording division is the parent of disc sides, which are in turn the parents of cuts. So our three-level limit was frustrating to us.
Sidebar on workflow: As the preceding examples suggest, our work has been focused on recorded sound collections. For the M/B/RS Division, the items to be reformatted start their digital life in the Recorded Sound Processing Section, where they are prepared and cataloged (if not previously cataloged), and where some conservation work takes place. Some of our METS metadata is first inscribed here or, if it pre-exists, is copied into the data set.
Then the physical materials go to the M/B/RS Recording Laboratory or to an outside contractor for digitizing. When images are to be made, there is often a separate imaging loop in the process. Additional METS metadata is added as a result of these activities.
Then the originals and the digital reproductions come back to (or are made accessible to) the processing section, which adds to or corrects the final METS metadata. Throughout the data-entry design process, we were considering h