May 12, 2006 Spring 2006 Common Solutions Group Archival, Digital Preservation, and Records Management David Millman, Columbia University Ron Thielen, University of Chicago

Archival, Digital Preservation, and Records Management


Page 1: Archival, Digital Preservation, and Records Management

May 12, 2006 Spring 2006 Common Solutions Group

Archival, Digital Preservation, and Records Management

David Millman, Columbia University

Ron Thielen, University of Chicago

Page 2: Archival, Digital Preservation, and Records Management

Agenda

Difference between an Archive, Repository, and Records Management

The Three Reasons to Archive

The State of the Industry, Government, Higher Ed, …

Standards

Policies and Processes

Steps Toward Archival

Some Key Issues

Page 3: Archival, Digital Preservation, and Records Management

Differences between an Archive, Repository, and Records Management

Institutional Repository – A system for collecting, preserving, and disseminating scholarly content.

Archive – A collection of data that is maintained as a long-term record of a business, application, or information state. Archives are typically kept for auditing, regulatory, analysis or reference purposes rather than for application or data recovery. – SNIA

Records Management – The systematic control of records throughout their life cycle. – ARMA

Page 4: Archival, Digital Preservation, and Records Management

Reasons to Archive

Legal and Regulatory Compliance

As an Aid to Corporate Memory in Order to Improve Operational Effectiveness

To Preserve Material of Potentially Historic and Enduring Value

Page 5: Archival, Digital Preservation, and Records Management

Legal and Regulatory Issues

Some financial records need to be retained for statutory periods varying up to 10 years

Medical research needs to be retained beyond the life of the subject

Lack of a process for retaining records may be, at best, a lack of due diligence and, at worst, obstruction

It is increasingly common that courts are unwilling to accept the argument that discovery would be too difficult or expensive
  • In some cases they are fining companies that are too slow to comply with court orders
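
The retention rules on this slide can be read as a schedule keyed by record class. The following is a minimal Python sketch under stated assumptions: only the 10-year ceiling for financial records comes from the slide, and "beyond the life of the subject" is simplified here to "retain indefinitely". The class names, the 2-year correspondence period, and the `legal_hold` flag are hypothetical and are not legal guidance.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional

# Hypothetical retention schedule (illustrative assumptions, not legal advice).
# None means "retain indefinitely".
RETENTION_YEARS: dict[str, Optional[int]] = {
    "financial": 10,                # slide: statutory periods of up to 10 years
    "medical_research": None,       # simplification of "beyond the life of the subject"
    "routine_correspondence": 2,    # invented value for illustration
}


@dataclass(frozen=True)
class Record:
    record_id: str
    record_class: str
    created: date
    legal_hold: bool = False        # hypothetical flag: set while litigation is pending


def earliest_disposal(record: Record) -> Optional[date]:
    """Return the first date the record may be disposed of, or None if never."""
    years = RETENTION_YEARS[record.record_class]  # unknown class raises KeyError
    if record.legal_hold or years is None:
        return None
    try:
        return record.created.replace(year=record.created.year + years)
    except ValueError:
        # Created on Feb 29 and the target year is not a leap year: use Mar 1.
        return date(record.created.year + years, 3, 1)


def may_dispose(record: Record, today: date) -> bool:
    """True only when a disposal date exists and has been reached."""
    limit = earliest_disposal(record)
    return limit is not None and today >= limit
```

Keeping the schedule as data rather than code means a records manager can review it without reading the logic, and a legal hold overrides every period, which reflects the slide's warning about discovery obligations.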

Page 6: Archival, Digital Preservation, and Records Management

Improve Operational Effectiveness

Act as an Aid to Institutional Memory

Assist Institutional Governance by Capturing the Rationale for Decisions

Operational in our Context Extends to Scholarly Effectiveness

Page 7: Archival, Digital Preservation, and Records Management

Historic and Enduring Value

Not always possible to know a priori what will have enduring value

Will a researcher in the next century be more interested in the content of a particular web site, or in how the content was presented and how we interacted with it through the browser interface? Both.

Page 8: Archival, Digital Preservation, and Records Management

State of the IT Industry

Used to be all about compliance

Increasing awareness that there are other reasons for archival

Scan of IT Industry Organizations

Scan of IT Vendors

Scan of Government Initiatives

Scan of Higher Education Initiatives

Page 9: Archival, Digital Preservation, and Records Management

IT Industry Organizations

SNIA (Storage Networking Industry Association) Data Management Forum (DMF)
  • LTACSI (Long Term Archive and Compliance Storage Initiative)
  • 100 Year Archive Task Force
  • SDDF (Self Describing Data Format) Task Force

ARMA – Association of Records Managers and Administrators (aka RIM Professionals) – Working with the SNIA

AIIM – Association for Information and Image Management – Believes that ISO adoption of PDF/A is the way to address preservation

Page 10: Archival, Digital Preservation, and Records Management

Scan of IT Vendors

Niche (generally seem to get it)
  • Archivas, Permabit, Yosemite

800 lb Gorillas (some get it, some don’t)
  • HP, IBM, EMC, Sun (aka StorageTek)

“Archival” Vendors (generally don’t seem to get it)
  • Commvault, Zantaz, ZipLip, iLumin, …

Page 11: Archival, Digital Preservation, and Records Management

Survey of Government Authorities and Initiatives

LOC “Library of Congress”

NARA “National Archives and Records Administration”

NDIIPP “National Digital Information Infrastructure and Preservation Program”

Page 12: Archival, Digital Preservation, and Records Management

Survey of Higher Education and Library Initiatives

DSpace (an institutional repository, not an archive)

FEDORA (ditto)

Stanford LOCKSS (Lots of Copies Keep Stuff Safe)

DAITSS (Dark Archive in the Sunshine State)

NEDLIB (Networked European Deposit Library)

JORUM (repository service, U.K.)

Columbia (DSpace pilots; FEDORA in Socioeconomic Data Center Long-Term Archive)

CDAD (Chicago Digital Archive Depository)

RLG Digital Repository Certification

UCSD / SRB (Storage Resource Broker)

JHOVE (Harvard--object validation service)

Page 13: Archival, Digital Preservation, and Records Management

Standards (formal, ad-hoc, and otherwise)

OAIS “Open Archival Information System”

PREMIS “Preservation Metadata: Implementation Strategies”

METS “Metadata Encoding and Transmission Standard”

EAD “Encoded Archival Description”

MADS “Metadata Authority Description Schema”

MODS “Metadata Object Description Schema”

DOD 5015.2 “Design Criteria Standard for Electronic Records Management Software Applications”

ISO 15489 (Records Management)

and on … and on … and …

Page 14: Archival, Digital Preservation, and Records Management

Standards for Access and Interoperation

Institutional Repository service vs Archive

Scholarly/Instructional Access issues
  • Discovery
  • Interoperation/reuse
  • Citation stability

Digital Library issues
  • Content structure
  • Format migration

Page 16: Archival, Digital Preservation, and Records Management

Policy/process

Strategies
  • email: nightly incrementals (a backup strategy)
  • digital library: quarterly curator sign-off (an archival strategy)

Faculty buy-in
  • minimum metadata?
  • education

Page 17: Archival, Digital Preservation, and Records Management

Education experiment: Spectrum of Stability

[Figure: a spectrum running from scholarly research activity to library curation]

Scholarly research activity → Library curation

Active collaboration → Versioning → Citable working-paper → Publication

Multiple users w/ “collab space” functions → File system metaphor / w/ some metadata → Institutional repository / metadata → Preserved / archived / cataloged

Page 18: Archival, Digital Preservation, and Records Management

Five Steps to Archival

Backup - a backup is not an archive, but backup processes, support personnel, and infrastructure may (or may not) support parts of the archival infrastructure

Simple Bitstream Preservation - keep from losing the information; adds fixity checking and digital media asset management to backup

Records Management - adds policy-based classification and information life-cycle management

Intellectual Content Preservation - keep the format current; migrate (or emulate) formats & structures

Archival - adds bibliographic and administrative metadata
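
The fixity checking named under Simple Bitstream Preservation can be sketched as a checksum manifest: record a digest for each file at ingest, then recompute and compare on a schedule. The Python below is a minimal illustration, not a description of any system the presenters used. The choice of SHA-256, the function names, and the idea of one manifest per directory tree are all assumptions.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the hex SHA-256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root: Path) -> dict[str, str]:
    """Map each file's path (relative to root, POSIX style) to its digest."""
    return {
        p.relative_to(root).as_posix(): sha256_of(p)
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def verify_manifest(root: Path, manifest: dict[str, str]) -> dict[str, str]:
    """Return {relative path: problem} for files that are missing or altered."""
    problems: dict[str, str] = {}
    for rel, expected in manifest.items():
        target = root / rel
        if not target.is_file():
            problems[rel] = "missing"
        elif sha256_of(target) != expected:
            problems[rel] = "digest mismatch"
    return problems
```

An empty result from `verify_manifest` means every recorded file is present and unchanged. The sketch only detects loss or alteration; restoring a damaged file still depends on having another good copy, and the manifest itself must be stored somewhere the same failure cannot reach.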

Page 19: Archival, Digital Preservation, and Records Management

Sampling of Issues

Not Enough Cooperation to Build Standards Based Archival Systems

It’s not just about the data
  • Metadata is key – Where does it come from (harvest, contributor, cataloger?)
  • Context is often necessary (e.g. roles, organizational structures both formal and informal, provenance)

A Backup is not an Archive

IP & DRM

Whose Archive Is It?

Digital Media Asset Management (tape is dead, long live tape)

Balancing Collection of Everything vs. Determining Suitability of Material for Archival (Selection Criteria)

Data Classification (Metadata Driven, Policy Based Selection Processes?)

Requirements for Research Preservation and Dissemination

Fixity Checking and Repair

Disaster Recovery ?
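
For the "Fixity Checking and Repair" item, one possible approach when several copies of an object exist is to compare their digests and treat the value held by a strict majority as correct. The Python sketch below illustrates that idea only. It is not the LOCKSS protocol or any other named system's method, and the function name `plan_repairs` and the replica names are hypothetical.

```python
import hashlib
from collections import Counter


def plan_repairs(replicas: dict[str, bytes]) -> tuple[str, list[str]]:
    """Return (majority digest, names of replicas that disagree with it).

    Raises ValueError when no digest is held by a strict majority, since
    the sketch then has no basis for deciding which copy is good.
    """
    if not replicas:
        raise ValueError("no replicas supplied")
    digests = {
        name: hashlib.sha256(data).hexdigest() for name, data in replicas.items()
    }
    winner, votes = Counter(digests.values()).most_common(1)[0]
    if votes * 2 <= len(replicas):
        raise ValueError("no strict majority; manual review needed")
    damaged = sorted(name for name, d in digests.items() if d != winner)
    return winner, damaged
```

With three replicas of which one is corrupted, the function returns the digest shared by the two good copies and names the third for repair. With two replicas that disagree it raises rather than guessing, which is one reason more than two copies are useful.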