Government Information Preservation Working Group Highlights of Digital Preservation Survey December 16th 2003 Oliver Slattery Information Access Division National Institute of Standards and Technology

Government Information Preservation Working Group

Highlights of Digital Preservation Survey

December 16th 2003

Oliver Slattery

Information Access Division

National Institute of Standards and Technology

Need for Digital Preservation:

• …crucial…critical…essential…important…

• Legally required.

• Principal role of agency/central to agency mission.

• Time frames of 30 to 100s of years.

• Archive distribution and central requirements of data assets.

• Important for department to provide secure, accessible, archival information on QC testing and other technical work.

• Continuity of operations.

• The need to stay current.

• Records are ‘permanent’.

Challenges in the next 5 years:

Obstacles

• Large/increasing volumes of data

• Multiple formats / format compatibility

• Quality/capacity of media

• Storage space

• Getting customers to use latest media

• Upgrading infrastructure/equipment – procurement (cost and time)

• Ensuring authenticity

Specific challenges/tasks

• Websites (archiving of)

• Preservation with online/on demand access

• Coordinating/integrating preservation procedures

• Migration of current archive

• Ensuring authenticity

Other concerns

• Management/record keeping

• Defining digital preservation

• Test capabilities/equipment (procurement – cost and time)

• Uniformity among suppliers of digital documents

• Same document through every phase of life cycle

Current strategy and its limitations:

• Control formats – Limit: must be done at creation.

• Use tapes to store and distribute data – Limit: tapes are expensive, will soon no longer be made, and are susceptible to errors.

• DLT, CD/DVD-ROM, PDF/TIFF – Limit: size, cost, compatibility.

• Networked computer disk drives and backup magnetic media. Systems include Access databases and a laboratory test database called Testream (SQL) – Limit: Access portion is not secure or traceable. Backup may be insufficient. No assurance of data accessibility if formats change.

• Coordinate the preservation of born-digital items – Limit: resources.

• Currently migrating from analog to digital. Still acquire in analog, but send out to customers in digital. Moving towards full digital acquisition – Limit: storage space and budget. Process is slow.

• From archive to CD/DVD for distribution. ‘Deep archive’ facilities for long-term storage – Limit: large data sets are too big for current archive media capacities.

• HD media (tapes) such as DLT and SDLT, etc.; servers/LAN; some web-based access – Limit: network throughput is small and nearing its limit. Automation is not available for HD preservation work.

Research we want to see:

Information Quality and Access

• Authentication

• Accuracy of rendering.

• Universal media.

• One size fits all.

• Safeguards to ensure authenticity and version control of archived docs

• PDF for archiving

• Universal access tool.

• Practices and procedures. Digital is easy to change but hard to detect changes!

• Standards analysis and development.

Reliability

• Media durability

• Physical testing and artificial aging of digital media to predict durability.

• Preservation media.

• Testing and evaluation of media. Important to share results.

• Large capacity, reliable archive media.

• Development of media analysis tools.

• Detect changes of error rates in media.

• Classical issues such as video archiving, microfilm preservation issues, environmental studies.

New/Alternative Technologies

• Fiber channel hard drives

• Blu-ray discs

• Solid state storage

• Universal media.

• Keeping an eye on future technology…hardware, software, formats.

• Large capacity, reliable archive media.

Procedures/Best Practices

• Methods for migration of legacy information.

• Safeguards to ensure authenticity and version control of archived docs

• Practices and procedures. Digital is easy to change but hard to detect changes!
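
One common way to address the difficulty of detecting changes noted above is to record a cryptographic checksum for each file when it enters the archive and recompute it later. The sketch below is illustrative only and is not a method described by the survey respondents; the function names and the manifest layout (a dictionary of path to checksum) are hypothetical. It uses SHA-256 from Python's standard hashlib module.

```python
import hashlib
from pathlib import Path


def sha256_of(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(paths):
    """Record a checksum for each file at archive time (hypothetical layout)."""
    return {str(p): sha256_of(p) for p in paths}


def changed_files(manifest):
    """Return paths that are missing or whose checksum no longer matches."""
    changed = []
    for name, recorded in manifest.items():
        p = Path(name)
        if not p.is_file() or sha256_of(p) != recorded:
            changed.append(name)
    return changed
```

A checksum mismatch shows that a file differs from what was recorded, but not what changed or why, so it complements version control rather than replacing it.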

Formats

• PDF for archiving

• Preservation media.

• Universal access tool.

• Preservation format.

• Format interconversion.

Types of data:

• Data files

• Microfilm

• Multimedia/web

• Imagery (Scanned, digital)

• Documents (mixed/compound, digital)

• Software

• Video

• Laboratory results (from equipment)

• Records

• Graphics/Drawings

• Support data

• Binary

• Binary – seismic

• Binary – well logs

• Text

• Audio

Bold = multiple hits

Capture and Collection

Absolute Maximum = 50

Very important = 5 points

Quite important = 4 points

Somewhat important = 3 points

Not especially important = 2 points

Not at all important = 1 point

[Bar chart: score out of 50 for each digital preservation issue (Accessing Old Data, Digitizing, Digital Conversion, Data Formats, Amount of Data).]
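
The scoring scheme above can be illustrated with a short sketch. It assumes that 10 respondents each chose one importance level per issue, which is consistent with the stated maximum of 50 and with the 10-hit maximum on the per-issue charts. The function name is hypothetical and the hit counts are made up for illustration; they are not survey results.

```python
# Points per importance level, as listed on the slide.
POINTS = {
    "Very": 5,
    "Quite": 4,
    "Somewhat": 3,
    "Not especially": 2,
    "Not at all": 1,
}


def issue_score(hits):
    """Sum hits * points over the importance levels for one issue."""
    unknown = set(hits) - set(POINTS)
    if unknown:
        raise ValueError(f"unknown importance level(s): {sorted(unknown)}")
    return sum(POINTS[level] * count for level, count in hits.items())


# Hypothetical hit counts for one issue (10 responses in total).
example_hits = {"Very": 6, "Quite": 3, "Somewhat": 1}
print(issue_score(example_hits))  # 6*5 + 3*4 + 1*3 = 45
```

Under this assumption, an issue rated "Very important" by all 10 respondents reaches the absolute maximum of 50, and one rated "Not at all important" by all 10 scores 10.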

Capture and Collection

The maximum number of hits per level of importance is 10.

The minimum number of hits per importance level is 0.

Amount of Data: Importance

[Bar chart: hits out of 10 at each importance level (Not at all, Not especially, Somewhat, Quite, Very).]

Capture and Collection

The maximum number of hits per level of importance is 10.

The minimum number of hits per importance level is 0.

Data Formats: Importance

[Bar chart: hits out of 10 at each importance level (Not at all, Not especially, Somewhat, Quite, Very).]

Capture and Collection

The maximum number of hits per level of importance is 10.

The minimum number of hits per importance level is 0.

Digital Format Conversion: Importance

[Bar chart: hits out of 10 at each importance level (Not at all, Not especially, Somewhat, Quite, Very).]

Capture and Collection

The maximum number of hits per level of importance is 10.

The minimum number of hits per importance level is 0.

Digitization of Analog Data: Importance

[Bar chart: hits out of 10 at each importance level (Not at all, Not especially, Somewhat, Quite, Very).]

Capture and Collection

The maximum number of hits per level of importance is 10.

The minimum number of hits per importance level is 0.

Accessing Old Data: Importance

[Bar chart: hits out of 10 at each importance level (Not at all, Not especially, Somewhat, Quite, Very).]

Storage Media

Absolute Maximum = 50

Very important = 5 points

Quite important = 4 points

Somewhat important = 3 points

Not especially important = 2 points

Not at all important = 1 point

[Bar chart: score out of 50 for each digital preservation issue (Performance, Media Standards, Media Compatibility, Multiple Media Types, Media Capacity, Media Longevity, Media Technology).]

Storage Media

The maximum number of hits per level of importance is 10.

The minimum number of hits per importance level is 0.

Media Technology: Importance

[Bar chart: hits out of 10 at each importance level (Not at all, Not especially, Somewhat, Quite, Very).]

Storage Media

The maximum number of hits per level of importance is 10.

The minimum number of hits per importance level is 0.

Media Longevity: Importance

[Bar chart: hits out of 10 at each importance level (Not at all, Not especially, Somewhat, Quite, Very).]

Storage Media

The maximum number of hits per level of importance is 10.

The minimum number of hits per importance level is 0.

Media Capacity: Importance

[Bar chart: hits out of 10 at each importance level (Not at all, Not especially, Somewhat, Quite, Very).]

Storage Media

The maximum number of hits per level of importance is 10.

The minimum number of hits per importance level is 0.

Multiple Media Formats: Importance

[Bar chart: hits out of 10 at each importance level (Not at all, Not especially, Somewhat, Quite, Very).]

Storage Media

The maximum number of hits per level of importance is 10.

The minimum number of hits per importance level is 0.

Media Compatibility: Importance

[Bar chart: hits out of 10 at each importance level (Not at all, Not especially, Somewhat, Quite, Very).]

Storage Media

The maximum number of hits per level of importance is 10.

The minimum number of hits per importance level is 0.

Media Standards: Importance

[Bar chart: hits out of 10 at each importance level (Not at all, Not especially, Somewhat, Quite, Very).]

Storage Media

The maximum number of hits per level of importance is 10.

The minimum number of hits per importance level is 0.

Media/System Performance: Importance

[Bar chart: hits out of 10 at each importance level (Not at all, Not especially, Somewhat, Quite, Very).]

Data and Storage Management

Absolute Maximum = 50

Very important = 5 points

Quite important = 4 points

Somewhat important = 3 points

Not especially important = 2 points

Not at all important = 1 point

[Bar chart: score out of 50 for each digital preservation issue (Software Tools, Data Management, Infrastructure, Data Format Stds., Migration Program).]

Data and Storage Management

The maximum number of hits per level of importance is 10.

The minimum number of hits per importance level is 0.

Data Migration Program: Importance

[Bar chart: hits out of 10 at each importance level (Not at all, Not especially, Somewhat, Quite, Very).]

Data and Storage Management

The maximum number of hits per level of importance is 10.

The minimum number of hits per importance level is 0.

Data Format Standards: Importance

[Bar chart: hits out of 10 at each importance level (Not at all, Not especially, Somewhat, Quite, Very).]

Data and Storage Management

The maximum number of hits per level of importance is 10.

The minimum number of hits per importance level is 0.

Infrastructure Management: Importance

[Bar chart: hits out of 10 at each importance level (Not at all, Not especially, Somewhat, Quite, Very).]

Data and Storage Management

The maximum number of hits per level of importance is 10.

The minimum number of hits per importance level is 0.

Data Management: Importance

[Bar chart: hits out of 10 at each importance level (Not at all, Not especially, Somewhat, Quite, Very).]

Data and Storage Management

The maximum number of hits per level of importance is 10.

The minimum number of hits per importance level is 0.

Software Tools: Importance

[Bar chart: hits out of 10 at each importance level (Not at all, Not especially, Somewhat, Quite, Very).]

Access and Distribution

Absolute Maximum = 50

Very important = 5 points

Quite important = 4 points

Somewhat important = 3 points

Not especially important = 2 points

Not at all important = 1 point

[Bar chart: score out of 50 for each digital preservation issue (VPN Access, Internet access, Times, Security).]

Access and Distribution

The maximum number of hits per level of importance is 10.

The minimum number of hits per importance level is 0.

Security, Multi-level Access: Importance

[Bar chart: hits out of 10 at each importance level (Not at all, Not especially, Somewhat, Quite, Very).]

Access and Distribution

The maximum number of hits per level of importance is 10.

The minimum number of hits per importance level is 0.

Access and Retrieval Times: Importance

[Bar chart: hits out of 10 at each importance level (Not at all, Not especially, Somewhat, Quite, Very).]

Access and Distribution

The maximum number of hits per level of importance is 10.

The minimum number of hits per importance level is 0.

Need to Provide Internet Access: Importance

[Bar chart: hits out of 10 at each importance level (Not at all, Not especially, Somewhat, Quite, Very).]

Access and Distribution

The maximum number of hits per level of importance is 10.

The minimum number of hits per importance level is 0.

Virtual Private Network Access: Importance

[Bar chart: hits out of 10 at each importance level (Not at all, Not especially, Somewhat, Quite, Very).]

Thanks

Thanks to all who replied.

Survey creation:

Jerry McFaul, Ollie Slattery, Victor McCrary, Fred Byers, Xiao Tang, Rich Vining.