What Agencies Should Know About PDF/A September 20, 2005 Susan J. Sullivan, CRM susan.sullivan@nara.gov


Page 1: What Agencies Should Know About PDF/A September 20, 2005 Susan J. Sullivan, CRM susan.sullivan@nara.gov

What Agencies Should Know About PDF/A

September 20, 2005

Susan J. Sullivan, CRM

susan.sullivan@nara.gov


Introduction

• Agenda
– Why long term preservation of PDF is an issue
– Discussion of PDF/A Standard and NARA’s Transfer Guidance for Permanent PDF records
– Roles of both PDF/A and NARA’s PDF Transfer Guidance in Federal recordkeeping

– Overview of PDF/A and the ISO Process

– Conclusion and Questions


Wide Use of PDF

• PDF is a ubiquitous open format for electronic documents
– Proprietary, but with publicly available specification

• The feature-rich nature of PDF can complicate preservation efforts

• All PDFs not created equal

• Much important information maintained in PDF
– Permanent archival records, in some cases


PDF Not a Suitable Archival Format

• PDF itself is not suitable as an archival format
– Some features not compatible with current archival requirements
– Not necessarily self-contained
– All PDFs are not created equal
• Long-term solution needed
– Permanent archival records, in some cases
– Administrative Office of U.S. Courts initiated idea for an ISO Standard based on PDF (PDF/A)


How NARA is Addressing PDF

• Issued PDF Transfer Guidance in March of 2003
– Allowing agencies to transfer permanent records to NARA in PDF
• Participating in PDF/A ISO Standard Development
– To influence the process
– To gain knowledge


Transfer Format versus File Format

NARA’s transfer guidance and PDF/A have a similar goal: to ensure that valuable electronic information in PDF is not lost

But different purposes:
• Transfer Format - NARA’s PDF Transfer Guidance
– Specifies NARA transfer requirements
– Applies to existing and future records in PDF
• File Format - The PDF/A ISO Standard (PDF/A)
– Specifies a subset of the PDF file format
– More format reliability, fewer “bells & whistles”
– PDF should be maintained longer as PDF (e.g., within agencies)


Scope and Usage

NARA’s PDF Transfer Guidance
• Usage: Transfer existing permanent PDF records to NARA
• Scope

– Applies to permanent records

– PDF 1.0 - 1.4

– Quality criteria, laws and regulations, transfer documentation, NARA contact information

PDF/A ISO Standard
• Usage: Programming Specification
• Scope
– Addresses one aspect of long term preservation (i.e., file format)
– Should be used as one piece of the archival puzzle
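The "PDF 1.0 - 1.4" scope of the transfer guidance can be screened for mechanically, because a PDF file declares its version in a `%PDF-x.y` header at the start of the file. The sketch below is only an illustration of that idea, not a NARA tool: the function names and the 1 KB scan window are assumptions, and it ignores the fact that a PDF 1.4 or later document catalog may declare a different version than the header.

```python
import re

# Versions named in the slide's scope for NARA's PDF Transfer Guidance.
ACCEPTED_VERSIONS = {"1.0", "1.1", "1.2", "1.3", "1.4"}

# A PDF file announces its version in a header such as "%PDF-1.4".
HEADER_RE = re.compile(rb"%PDF-(\d\.\d)")


def pdf_header_version(data: bytes):
    """Return the version string from the %PDF-x.y header, or None if absent."""
    # The header is expected at or near the start of the file, so only
    # the first 1 KB is scanned (an arbitrary window chosen for this sketch).
    match = HEADER_RE.search(data[:1024])
    return match.group(1).decode("ascii") if match else None


def in_transfer_scope(data: bytes) -> bool:
    """True if the header version falls in the PDF 1.0 - 1.4 range."""
    return pdf_header_version(data) in ACCEPTED_VERSIONS
```

A version inside the range says nothing about the other scope items (quality criteria, transfer documentation), which still need review by records staff.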


Requirements - PDF/A and NARA’s PDF Transfer Guidance

Embedded fonts
• PDF/A and NARA’s PDF Transfer Guidance both require that fonts be embedded
– NARA Guidance phases in requirements for workstation resident fonts.

Encryption
• PDF/A and NARA’s PDF Transfer Guidance both prohibit encryption
– NARA Guidance phases in requirement as long as we can open, view and print
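The encryption prohibition lends itself to a rough automated pre-check, since an encrypted PDF carries an `/Encrypt` entry in its trailer dictionary. The sketch below is a heuristic under stated assumptions, not a validator: `looks_encrypted` is a hypothetical name, and a plain byte search can raise a false alarm if the same text happens to appear in uncompressed page content. It does not examine font embedding at all.

```python
def looks_encrypted(data: bytes) -> bool:
    """Heuristic: report whether the file mentions an /Encrypt entry.

    An encrypted PDF references its encryption dictionary through an
    /Encrypt key in the trailer. Scanning the whole file keeps the sketch
    simple and tolerant of incremental updates, at the cost of possible
    false positives.
    """
    return b"/Encrypt" in data


def file_looks_encrypted(path: str) -> bool:
    """Convenience wrapper that reads the file from disk first."""
    with open(path, "rb") as handle:
        return looks_encrypted(handle.read())
```

A flagged file would then be opened in a viewer to confirm whether it can still be opened, viewed and printed, which is the condition the slide attaches to the phase-in.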


Requirements - PDF/A and NARA’s PDF Transfer Guidance

Special Features
• PDF/A restricts special features
– Embedded files, external links, JavaScript
– PDF/A promotes tagged PDF as a higher level of conformance
• NARA evaluates special features on a case-by-case basis at the time of scheduling

Metadata/Documentation
• PDF/A requires that embedded metadata be in Adobe XMP
• NARA requires transfer documentation (e.g., SF-258), and would evaluate embedded metadata at the time of scheduling
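Embedded XMP metadata is wrapped in a packet that starts with `<?xpacket begin=` and ends with `<?xpacket end=`, so it can be located with basic tools. The following is a minimal sketch, assuming the metadata stream is stored uncompressed in the file; `extract_xmp_packets` is a hypothetical helper name and the function does not validate the XML it returns.

```python
import re

# An XMP packet is delimited by <?xpacket begin=...?> and <?xpacket end=...?>
# processing instructions; the XML payload sits between them.
XPACKET_RE = re.compile(
    rb"<\?xpacket begin=.*?\?>(.*?)<\?xpacket end=.*?\?>",
    re.DOTALL,
)


def extract_xmp_packets(data: bytes):
    """Return the raw XML body of every XMP packet found in the bytes."""
    return [match.group(1).strip() for match in XPACKET_RE.finditer(data)]
```

An empty result means either that no XMP is embedded or that the metadata stream is compressed, so it is a prompt for closer inspection and not a verdict.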


Requirements - PDF/A and NARA’s PDF Transfer Guidance

Quality Requirements
• PDF/A as a file format does not address quality/creation requirements such as exact replication of source material
– Informative Annex B identifies recommended creation guidelines
– Agencies must implement these guidelines to comply with NARA’s PDF transfer guidance
• NARA’s PDF Transfer Guidance includes quality requirements regarding
– scanning quality
– lossy compression
– substitution of characters with OCR’d text


NARA’s Expectations for PDF/A

– PDF/A should address some of the PDF archival issues and enable PDF records to be maintained longer as PDF

– Standard maintained by ISO, not just vendors
– Agencies should implement PDF/A along with records management policies and procedures

• Such as:

– NARA’s PDF Transfer Guidance

– AOUSC’s document management program


The PDF/A Standard

• Multi-part ISO International Standard

– ISO 19005-1:2005, Document management – Electronic document file format for long-term preservation – Part 1: Use of PDF 1.4 (PDF/A-1)

– Part 2 (19005-2) intended to bring PDF/A into conformance with PDF 1.6

– And additional future parts, as necessary


Time Line for Part 1

• Submitted to ISO Central Secretariat for publication as International Standard
– Should be publicly available September 2005

• Throughout the process, PDF/A has been reviewed by technical experts from 15 national standards bodies


PDF/A - Approach

• PDF/A specifies:
– The subset of PDF components (from the PDF 1.4 Reference) that are either required, restricted, or prohibited, and
– How these components may be used by software

[Diagram: PDF/A shown as a subset of the PDF 1.4 Reference. PDF/A specifies required features, restricted features, and prohibited features.]


PDF/A - Requirements

• Disallows or limits features that could complicate long term preservation, and

• Maximizes:
– Device independence
    • Can be reliably and consistently rendered without regard to the hardware/software platform
– Self-contained
    • Contains all resources necessary for rendering
– Self-documenting
    • Contains its own description
– Transparency
    • Amenable to direct analysis with basic tools


PDF/A - Table of Contents

• 1 Scope
• 2 Normative References
• 3 Terms and Definitions
• 4 Notation
• 5 Conformance Levels
• 6 Technical Requirements
– 6.1 File Structure
– 6.2 Graphics
– 6.3 Fonts
– 6.4 Transparency
– 6.5 Annotations
– 6.6 Actions
– 6.7 Metadata
– 6.8 Logical Structure
– 6.9 Interactive Forms

• Informative annexes

– Annex A - PDF/A-1 Conformance Summary

– Annex B - Best Practices for PDF/A

• Bibliography


Annexes of the Draft PDF/A Standard – Informative Annexes

• Informative Annexes provide supplemental information including:
– Summary of the PDF structures and components disallowed, required, or limited
– Best Practices for PDF/A
• Guidelines for capturing or converting electronic documents to PDF/A
– To replicate the exact quality and content of source documents
– Required for compliance with NARA’s PDF Transfer Guidance


PDF/A - Overview of Requirements

• Two levels of conformance
– Level A (e.g., Tagged PDF, UNICODE Mapping)
– Level B (e.g. No Tagged PDF)
• Uniform file format (header, trailer, no encryption)
• Device-independent rendering of graphics
• Embedded fonts, character encoding
• Annotations restricted, content should be displayed by readers
• External actions restricted, no dependence on external content
• Readers not required to act on hyperlinks, but may
• XMP metadata “Adobe XML Metadata Framework”
• Forms based on appearance, not data
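A file that claims PDF/A-1 conformance records the claim in its XMP metadata through the PDF/A identification properties `pdfaid:part` and `pdfaid:conformance` (a detail from the published standard, not from these slides). The sketch below reads that claim from raw bytes. It assumes the conventional `pdfaid` prefix and uncompressed metadata, the function name is hypothetical, and it reports only what the file claims; it does not verify the conformance requirements listed above.

```python
import re

# The properties may appear as XML elements (<pdfaid:part>1</pdfaid:part>)
# or as attributes (pdfaid:part="1"); both forms are accepted here.
PART_RE = re.compile(rb"""pdfaid:part(?:>|\s*=\s*["'])\s*(\d)""")
CONF_RE = re.compile(rb"""pdfaid:conformance(?:>|\s*=\s*["'])\s*([ABab])""")


def claimed_conformance(data: bytes):
    """Return a label such as 'PDF/A-1b' for the claimed level, or None."""
    part = PART_RE.search(data)
    level = CONF_RE.search(data)
    if part is None or level is None:
        return None
    part_text = part.group(1).decode("ascii")
    level_text = level.group(1).decode("ascii").lower()
    return f"PDF/A-{part_text}{level_text}"
```

A dedicated validator is still needed to confirm that a file meets the level it claims.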


Take Away

• For permanent records in PDF, agencies need to understand that:
– PDF/A is one option for long-term preservation of electronic documents
– PDF/A, by itself, does not guarantee exact replication of source material
– Agencies must implement PDF/A in conjunction with additional requirements to meet NARA standards for transferring permanent records to NARA (i.e., NARA’s PDF Transfer Guidance)


More Information is Available

• More information on NARA’s PDF Transfer Guidance on NARA’s Web Site
– http://www.archives.gov/records-mgmt/initiatives/pdf-records.html
• More information on PDF/A on AIIM Web Site
– http://www.aiim.org/standards.asp?ID=25013
• Contact Susan Sullivan at susan.sullivan@nara.gov


Questions/Discussion