Presentation slides prepared for a lecture on digital preservation given at the University of the West of England, Bristol on the 19th February 2013.
1. Digital Preservation
UKOLN: A centre of expertise in digital information management (www.ukoln.ac.uk)
Michael Day, Digital Curation Centre, UKOLN, University of Bath
m.firstname.lastname@example.org
Information Systems and Services, UWE, Bristol, 19 February 2013

2. Presentation outline
- Digital preservation overview
- Some definitions
- Technical challenges
- Organisational challenges
- Approaches to solving the problem
- Preservation strategies
- Tools for:
  - Format characterisation
  - Preservation planning
- The OAIS model:
  - Preservation metadata
- Repository audit frameworks (TRAC, DRAMBORA)
- Institutional assessment tools (DAF, CARDIO)
- Research Data Management

3. Definitions
- Digital preservation:
  - Is mainly concerned with the sustainability of content for a given period of time (probably not forever)
  - Largely about ensuring continued access to content
  - "The series of managed activities necessary to ensure continued access to digital materials for as long as necessary" - Digital Preservation Coalition (DPC), Digital Preservation Definitions and Concepts list: http://www.dpconline.org/advice/preservationhandbook/introduction/definitions-and-concepts?q=definitions
- A combination of technical, organisational and legal challenges

4. Digital preservation basics
- An ongoing (lifecycle) approach to managing digital content based on:
  - The identification and adoption of appropriate preservation strategies for content
  - The collection and management of appropriate metadata (explicit and implicit knowledge, contexts)
  - The ongoing monitoring of technical contexts and the application of preservation planning techniques
  - Continual monitoring of the organisation (audit)
- Not about keeping everything, forever

5. A multi-faceted set of challenges
- Technical
- Strategies needed to deal with ongoing obsolescence and scale
- Organisational
- Access and reuse
- Authenticity and integrity
- Sustainability (costs)
- Legal
- Deciding what needs to be retained

6.
Technical challenges (1)
- Physical
  - Bits stored on a physical medium (or in the cloud?)
  - Focus 20 years ago was on new media types (e.g. optical storage technologies) as a panacea
  - Bit-level preservation is still important: it is the first layer in a viable preservation strategy

7. Obsolete media
[Image: exhibition at NASA White Sands Test Facility, 2009. Image courtesy of Frank Carey.]

8. Technical challenges (2)
- Hardware and software dependence
  - Most digital objects are dependent on particular configurations of hardware and software
  - Relatively short obsolescence cycles

9. Hardware and software dependence
[Image: exhibition at NASA White Sands Test Facility, 2009. Image courtesy of Frank Carey.]

10. Conceptual challenges (1)
- What is a digital object?
  - Some are analogues of traditional objects, e.g. meeting minutes, research papers
  - Others are not, e.g. Web pages, blogs, GIS, 3D models of chemical structures, research data more generally
- Complexity
- Dynamic nature
- Interactivity
- Born digital vs. product of digitisation initiatives
- Logical layer between physical storage of bits and the conceptual objects that need preservation (includes data types, formats, etc.)

11. Conceptual challenges (2)
- Need to identify and document the significant properties (or characteristics) of content:
  - Recognises that preservation is context dependent, even user specific (OAIS concept of designated community)
  - Helps with choosing an acceptable preservation strategy
- Compare the performance model developed by the National Archives of Australia (2002) - The source of a record is a fixed message that interacts with technology.
  This message provides the record's unique meaning, but by itself is meaningless to researchers since it needs to be combined with technology in order to be rendered as its creator intended. The process is the technology required to render meaning from the source
- Focus on re-use (e.g., data curation)

12. Organisational challenges (1)
- Sustainability:
  - Ultimately the sustainability of content depends upon the long-term sustainability of organisations
  - Focus on business models
  - Embedding preservation into the core task of organisations
- Organisational commitment:
  - "An institutional repository needs to be a service with continuity behind it ... Institutions need to recognise that they are making commitments for the long term" - Clifford Lynch
  - Need for policy development
- Incentives for preservation:
  - Clarity on roles and responsibilities needed
  - Who benefits? Who pays? Free riding?

13. Organisational challenges (2)
- Economic perspectives:
  - Blue Ribbon Task Force on Sustainable Digital Preservation and Access: http://brtf.sdsc.edu/
  - Final report (Feb 2010): Ensuring that valuable digital assets will be available for future use is not simply a matter of finding sufficient funds. It is about mobilizing resources - human, technical, and financial - across a spectrum of stakeholders diffuse over both space and time. But questions remain about what digital information we should preserve, who is responsible for preserving, and who will pay.
  - JISC-funded LIFE (Life Cycle Information for E-Literature) has developed a predictive costing tool: http://www.life.ac.uk/

14.
Organisational challenges (3)
- The challenge of scale:
  - The Web
  - Digitised textual content:
    - Google Books
    - DPLA / Europeana
  - The data deluge in e-Science:
    - New generations of instruments, computer simulations
    - Many terabytes generated per day, petabyte scale computing (and growing)
    - Cory Doctorow, Welcome to the petacentre. Nature, 455, pp 17-21, 4 Sep 2008

15. Organisational challenges (4)
- The need for collaboration:
  - Need for deep-infrastructure for preservation recognised as far back as 1996 by the Task Force on Archiving of Digital Information
  - Digital preservation involves the "grander problem of organizing ourselves over time and as a society ... [to manoeuvre] effectively in a digital landscape" (p. 7)
  - Building on existing networks
  - Role for national-level co-ordination: Digital Preservation Coalition (DPC), nestor (Germany), National Digital Information Infrastructure and Preservation Program (NDIIPP)

16. Organisational challenges (5)
- Learn the lessons from the past:
  - Things will go wrong
  - Do what you can to enable recovery from disaster
  - Digital technologies support replication (avoid a single point of failure)

17. Digital preservation strategies (1)
- Main approaches:
  - Technology preservation (e.g., computing museums)
  - Digital archaeology (a post hoc approach)
  - Emulation (focusing on the environment, often used where look-and-feel is important, e.g. computer games)
  - Migration (focusing on the content)
    - A mature approach: "A set of organised tasks designed to achieve the periodic transfer of digital information from one hardware and software configuration to another, or from one generation of computer technology to a subsequent one" - CPA/RLG report (1996)

18.
Digital preservation strategies (2)
- Preservation strategies are not in competition
  - Different strategies can work together; there may be value in diversification
  - Migration strategies mean difficult choices need to be made about target formats
- But the strategy chosen has implications for:
  - The technical infrastructure required (and metadata)
  - Collection management priorities
  - Rights management
    - Owning the rights to re-engineer software
  - Costs

19. Digital preservation strategies (3)
- Tools for format characterisation and validation
  - DROID - Digital Record Object Identification (based on the PRONOM registry)
    - Very important to know what types (formats) of content exist in a particular collection (e.g., institutional repository or Web archive)
    - Performs batch identification of file formats
    - http://www.nationalarchives.gov.uk/PRONOM/Default.aspx
  - JHOVE - JSTOR/Harvard Object Validation Environment
    - Used for format validation
    - http://hul.harvard.edu/jhove/

20. Digital preservation strategies (4)
- Plato preservation planning tool
  - Developed by EU Planets project
  - A decision support tool that helps users evaluate potential preservation solutions against specific requirements and build a plan for preserving a given set of objects
  - Integrates file format identification (using DROID); some migration services; XML-based generic format characterisation using XCL (eXtensible Characterisation Languages)
  - More info: http://www.ifs.tuwien.ac.at/dp/plato/intro.html
  - Integration with repositories tested by JISC KeepIt project: http://preservation.eprints.org/keepit/

21. OAIS Reference Model (ISO 14721)
[Figure: OAIS Functional Entities (Figure 4-1), http://public.ccsds.org/publications/archive/650x0m2.pdf]

22.
Preservation metadata
- Metadata and documentation are vitally important
- Relates to OAIS concepts
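
As an illustration only, the kind of information preservation metadata might capture can be sketched in a few lines of Python: a checksum for bit-level fixity and a format guess based on leading bytes. Everything named here is an assumption made for the example: the functions `identify_format`, `describe` and `fixity_ok`, the record field names, and the tiny signature table, which is a stand-in and not PRONOM data. This is not how DROID or any particular repository implements these checks.

```python
import hashlib
from datetime import datetime, timezone

# Illustrative signature table (an assumption for this sketch, not PRONOM data):
# a few well-known byte sequences that appear at the start of files.
SIGNATURES = [
    (b"%PDF-", "PDF"),
    (b"\x89PNG\r\n\x1a\n", "PNG"),
    (b"GIF87a", "GIF"),
    (b"GIF89a", "GIF"),
    (b"PK\x03\x04", "ZIP container"),
]


def identify_format(data: bytes) -> str:
    """Guess a format name from the leading bytes; 'unknown' if nothing matches."""
    for magic, name in SIGNATURES:
        if data.startswith(magic):
            return name
    return "unknown"


def describe(name: str, data: bytes) -> dict:
    """Build a minimal, hypothetical metadata record for one digital object."""
    return {
        "name": name,
        "size_bytes": len(data),
        "format": identify_format(data),
        "sha256": hashlib.sha256(data).hexdigest(),  # fixity value
        "recorded_at": datetime.now(timezone.utc).isoformat(),
    }


def fixity_ok(record: dict, data: bytes) -> bool:
    """Return True if the bytes still match the checksum stored in the record."""
    return hashlib.sha256(data).hexdigest() == record["sha256"]


if __name__ == "__main__":
    content = b"%PDF-1.4 example bytes"
    record = describe("report.pdf", content)
    print(record["format"], record["size_bytes"])
    print(fixity_ok(record, content))         # unchanged copy
    print(fixity_ok(record, content + b"!"))  # altered copy
```

Storing the checksum alongside the object is what allows a later audit, or a comparison between replicated copies, to detect silent corruption; the format field is the sort of information that a preservation planning step would draw on when choosing a strategy.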