Introduction
• Agenda
– Why long-term preservation of PDF is an issue
– Discussion of the PDF/A Standard and NARA’s Transfer Guidance for permanent PDF records
– Roles of both PDF/A and NARA’s PDF Transfer Guidance in Federal recordkeeping
– Overview of PDF/A and the ISO Process
– Conclusion and Questions
Wide Use of PDF
• PDF is a ubiquitous open format for electronic documents
– Proprietary, but with a publicly available specification
• The feature-rich nature of PDF can complicate preservation efforts
• Not all PDFs are created equal
• Much important information is maintained in PDF
– Permanent archival records, in some cases
PDF Not a Suitable Archival Format
• PDF itself is not suitable as an archival format
– Some features are not compatible with current archival requirements
– Not necessarily self-contained
– Not all PDFs are created equal
• Long-term solution needed
– Permanent archival records, in some cases
– Administrative Office of the U.S. Courts initiated the idea for an ISO Standard based on PDF (PDF/A)
How NARA is Addressing PDF
• In March of 2003, NARA issued PDF Transfer Guidance
– Allowing agencies to transfer permanent records to NARA in PDF
• Participating in PDF/A ISO Standard development
– To influence the process
– To gain knowledge
Transfer Format versus File Format
NARA’s transfer guidance and PDF/A have a similar goal: to ensure that valuable electronic information in PDF is not lost
But different purposes:
• Transfer Format - NARA’s PDF Transfer Guidance
– Specifies NARA transfer requirements
– Applies to existing and future records in PDF
• File Format - The PDF/A ISO Standard (PDF/A)
– Specifies a subset of the PDF file format
– More format reliability, fewer “bells & whistles”
– Allows PDF records to be maintained longer as PDF (e.g., within agencies)
Scope and Usage
NARA’s PDF Transfer Guidance
• Usage: Transfer existing permanent PDF records to NARA
• Scope
– Applies to permanent records
– Covers PDF versions 1.0 through 1.4
– Quality criteria, laws and regulations, transfer documentation, NARA contact information
PDF/A ISO Standard
• Usage: Programming Specification
• Scope
– Addresses one aspect of long-term preservation (i.e., file format)
– Should be used as one piece of the archival puzzle
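The PDF 1.0 - 1.4 scope of NARA’s guidance can be pre-screened from the file header, which declares the version as `%PDF-x.y`. The following is a minimal sketch, not a NARA tool: the function names are hypothetical, and it reads only the header, so it ignores any later version declared elsewhere inside the file.

```python
import re

# Hypothetical helper names, chosen for illustration only.
HEADER_RE = re.compile(rb"%PDF-(\d)\.(\d)")


def pdf_header_version(data: bytes):
    """Return (major, minor) from the %PDF-x.y header, or None if absent."""
    # The header is expected near the start of the file, so only the
    # first 1024 bytes are searched.
    match = HEADER_RE.search(data[:1024])
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


def in_transfer_scope(data: bytes) -> bool:
    """True when the header version falls within PDF 1.0 through 1.4."""
    version = pdf_header_version(data)
    return version is not None and (1, 0) <= version <= (1, 4)
```

In practice the bytes would come from opening the file in binary mode, for example `open(path, "rb").read(1024)`. A passing result says nothing about quality criteria or the other transfer requirements.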
Requirements - PDF/A and NARA’s PDF Transfer Guidance
Embedded fonts
• PDF/A and NARA’s PDF Transfer Guidance both require that fonts be embedded
– NARA Guidance phases in requirements for workstation-resident fonts
Encryption
• PDF/A and NARA’s PDF Transfer Guidance both prohibit encryption
– NARA Guidance phases in the requirement, as long as we can open, view, and print
Special Features
• PDF/A restricts special features
– Embedded files, external links, JavaScript
– PDF/A promotes tagged PDF as a higher level of conformance
• NARA evaluates special features on a case-by-case basis at the time of scheduling
Metadata/Documentation
• PDF/A requires that embedded metadata be in Adobe XMP
• NARA requires transfer documentation (e.g., SF-258), and would evaluate embedded metadata at the time of scheduling
Requirements - PDF/A and NARA’s PDF Transfer Guidance
Quality Requirements
• PDF/A as a file format does not address quality/creation requirements such as exact replication of source material
– Informative Annex B identifies recommended creation guidelines
– Agencies must implement these guidelines to comply with NARA’s PDF Transfer Guidance
• NARA’s PDF Transfer Guidance includes quality requirements regarding:
– Scanning quality
– Lossy compression
– Substitution of characters with OCR’d text
NARA’s Expectations for PDF/A
– PDF/A should address some of the PDF archival issues and enable PDF records to be maintained longer as PDF
– Standard maintained by ISO, not just vendors
– Agencies should implement PDF/A along with records management policies and procedures
• Such as:
– NARA’s PDF Transfer Guidance
– AOUSC’s document management program
The PDF/A Standard
• Multi-part ISO International Standard
– ISO 19005-1:2005, Document management – Electronic document file format for long-term preservation – Part 1: Use of PDF 1.4 (PDF/A-1)
– Part 2 (19005-2) intended to bring PDF/A into conformance with PDF 1.6
– And additional future parts, as necessary
Time Line for Part 1
• Submitted to ISO Central Secretariat for publication as International Standard
– Should be publicly available September 2005
• Throughout the process, PDF/A has been reviewed by technical experts from 15 national standards bodies
PDF/A - Approach
• PDF/A specifies:
– The subset of PDF components (from the PDF 1.4 Reference) that are either required, restricted, or prohibited, and
– How these components may be used by software
[Figure: PDF/A shown as a subset of the PDF 1.4 Reference, specifying required, restricted, and prohibited features]
PDF/A - Requirements
• Disallows or limits features that could complicate long term preservation, and
• Maximizes:
– Device independence: can be reliably and consistently rendered without regard to the hardware/software platform
– Self-containment: contains all resources necessary for rendering
– Self-documentation: contains its own description
– Transparency: amenable to direct analysis with basic tools
PDF/A - Table of Contents
• 1 Scope
• 2 Normative References
• 3 Terms and Definitions
• 4 Notation
• 5 Conformance Levels
• 6 Technical Requirements
– 6.1 File Structure
– 6.2 Graphics
– 6.3 Fonts
– 6.4 Transparency
– 6.5 Annotations
– 6.6 Actions
– 6.7 Metadata
– 6.8 Logical Structure
– 6.9 Interactive Forms
• Informative annexes
– Annex A - PDF/A-1 Conformance Summary
– Annex B - Best Practices for PDF/A
• Bibliography
Annexes of the Draft PDF/A Standard – Informative Annexes
• Informative Annexes provide supplemental information including:
– Summary of the PDF structures and components disallowed, required, or limited
– Best Practices for PDF/A
• Guidelines for capturing or converting electronic documents to PDF/A
– To replicate the exact quality and content of source documents
– Required for compliance with NARA’s PDF Transfer Guidance
PDF/A - Overview of Requirements
• Two levels of conformance
– Level A (e.g., Tagged PDF, Unicode mapping)
– Level B (e.g., no Tagged PDF)
• Uniform file format (header, trailer, no encryption)
• Device-independent rendering of graphics
• Embedded fonts, character encoding
• Annotations restricted, content should be displayed by readers
• External actions restricted, no dependence on external content
• Readers not required to act on hyperlinks, but may
• XMP metadata (“Adobe XML Metadata Framework”)
• Forms based on appearance, not data
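Some of the restrictions listed above (no encryption, restricted JavaScript and embedded files) leave recognizable names in the raw bytes of a PDF. The following is a rough sketch of a pre-screening scan under stated assumptions: the marker list and function name are illustrative choices, not taken from ISO 19005-1, and a plain byte search can miss content held inside compressed streams or produce false matches. It is not a PDF/A conformance check.

```python
# Byte markers for features the overview lists as prohibited or restricted.
# The choice of markers is an assumption made for illustration.
FLAGGED_MARKERS = {
    "encryption": b"/Encrypt",         # trailer entry present in encrypted files
    "javascript": b"/JavaScript",      # JavaScript action type
    "embedded files": b"/EmbeddedFile",  # also matches /EmbeddedFiles
}


def flag_features(data: bytes) -> list:
    """Return names of flagged features whose marker appears in the raw bytes."""
    return [name for name, marker in FLAGGED_MARKERS.items() if marker in data]
```

An empty result only means none of these markers were found in the raw bytes. Files that are flagged, and files that matter, would still need review with a dedicated validation tool.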
Take Away
• For permanent records in PDF, agencies need to understand that:
– PDF/A is one option for long-term preservation of electronic documents
– PDF/A, by itself, does not guarantee exact replication of source material
– Agencies must implement PDF/A in conjunction with additional requirements to meet NARA standards for transferring permanent records to NARA (i.e., NARA’s PDF Transfer Guidance)
More Information is Available
• More information on NARA’s PDF Transfer Guidance on NARA’s Web Site
– http://www.archives.gov/records-mgmt/initiatives/pdf-records.html
• More information on PDF/A on AIIM Web Site
– http://www.aiim.org/standards.asp?ID=25013
• Contact Susan Sullivan at [email protected]
Questions/Discussion