Long-Term Preservation
Thomas Stensitzki
Page 2
Agenda
1 Long-Term preservation
2 Why should/must items be archived?
3 Which items should/must be archived?
4 How can archiving be done?
Page 3
Terms Outsourcing, Filing, Backup, Archiving
Outsourcing
- Data (e.g. of a specific period) is being exported from a source system and
converted (if required)
- Outsourced data is not available in the source system
- Outsourced data can be backed up or archived
- Importing of outsourced data might require conversion, when the target data
structure is different
Filing
- Storage of objects in a folder of the file system
- Filed objects can be backed up or archived depending on their file location
Page 4
Terms Outsourcing, Filing, Backup, Archiving
Backup
- Copy of existing objects to a storage medium to be able to restore data in the
case of data corruption or accidental deletion
- Performed periodically
- Storage media are overwritten over time, so older versions of an object
eventually cannot be restored
- Old versions of an object can be restored for a specific period only
Archiving
- Copy of a file or document to an external storage medium
- Standardized file format (tif, jpg) (if required)
- Storage for a longer period
Page 5
Terms Document management vs. Long-term preservation
Document management
- Management of documents being edited using Check-In, Check-Out and
Versioning
- Documents can be found by attribute value search or full-text search
- Attributes and document links are managed by DMS
- Documents are stored in the file system or a DMS database
Page 6
Terms Document management vs. Long-term preservation
Long-term preservation
- Auditable and unchangeable storage of completed objects for a long time
- Copy of objects (e.g. files, documents) to an external storage medium
- Files and raw data are archived in original format
- Documents are converted and archived in standardized format (black/white =
TIF, colour = JPEG or PDF/A)
- Document lookup via index
- Archived files and raw data can be provided in original format
- Archived documents can be provided using a viewer software
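The combination of unchangeable storage and lookup via index can be sketched as follows. This is a minimal illustration, not a description of any particular archiving product: the names `ArchiveEntry` and `WriteOnceArchive` are hypothetical, an in-memory dictionary stands in for the external storage medium, and SHA-256 is an assumed choice of integrity check.

```python
import hashlib
from dataclasses import dataclass


@dataclass(frozen=True)
class ArchiveEntry:
    """Index record for one archived object (hypothetical structure)."""
    doc_id: str
    sha256: str
    content: bytes


class WriteOnceArchive:
    """In-memory stand-in for an external, unchangeable storage medium."""

    def __init__(self):
        self._index = {}  # doc_id -> ArchiveEntry

    def store(self, doc_id: str, content: bytes) -> ArchiveEntry:
        # Unchangeable storage: an existing entry is never overwritten.
        if doc_id in self._index:
            raise ValueError(f"{doc_id} is already archived")
        digest = hashlib.sha256(content).hexdigest()
        entry = ArchiveEntry(doc_id, digest, content)
        self._index[doc_id] = entry
        return entry

    def retrieve(self, doc_id: str) -> bytes:
        # Lookup via index, then verify that the object is unmodified.
        entry = self._index[doc_id]
        if hashlib.sha256(entry.content).hexdigest() != entry.sha256:
            raise RuntimeError(f"{doc_id} failed its integrity check")
        return entry.content
```

Storing the checksum next to the index entry lets an auditor confirm later that the object handed out is the one that was archived.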
Page 7
Terms Long-term preservation
Digital archiving
- Database-driven, long-term, secure and unchangeable storage of digital
information objects which are reproducible at any time
Digital long-term preservation
- Storage of digital information for a period longer than 10 years
Auditable digital archiving
- Storage of digital business-related information in accordance with the
requirements of
- Handelsgesetzbuch § 239, § 257 HGB
- Abgabenordnung § 146, § 147, § 200 AO
- GoBS
- Secure and orderly storage of business-related documents with retention
periods of six to ten years
Page 8
Why Sources of documents/objects
Documents, lifecycle of documents
- Creating and editing documents: in process (e.g. DMS, SharePoint)
- Completed documents: final version of a document
- Additional editing creates new version
Other documents
- Correspondence, reports, rules, pictures, films, letters, invoices, quotations,
certificates from different sources
Workflows
- Information from workflow-based systems (with digital signatures)
- Final document can be created from related data as the final workflow step
IT systems
- Raw data is usually available in databases or files
Page 9
Why Dealing with documents/objects
Documents
- Documents in process and/or final documents are stored in DMS, SharePoint or
a disk drive (local or network share)
- Documents stored on network shares are backed up automatically
- Documents in SharePoint and emails in Outlook are deleted after the retention
period has expired
- Deleted documents on a network share cannot be restored after the backup
period has expired
- Final documents signed by hand are archived on paper and/or scanned to PDF
and stored as a file (attached to an email)
Page 10
Why Dealing with documents/objects
Other documents
- Emails are deleted from the inbox automatically after the retention period has
expired
- Reports, images, films, invoices, quotations, certificates, etc. available as files
are considered documents
- Paper documents, e.g. correspondence, letters, certificates, etc., are stored in
physical files
Page 11
Why Dealing with documents/objects
Workflow vs. documents
- Information created in workflow systems is stored with data of digital signatures
in databases
- All data of a finalized workflow is stored digitally within the database (usually),
final document can be created using a template
- Print-out is treated as a copy of the original digital document
- Digitally signed documents are treated equally to documents signed by hand
IT systems vs. raw data
- Raw data is stored in databases or files which grow over time
- Data can be outsourced or exported to reduce the storage size, but the data is
not instantly accessible for the application
- Software manufacturers must guarantee that release changes do not impact the
capability to import outsourced data
Page 12
Why Legal and regulatory requirements for archiving
Legal requirements for business documents
- Handelsgesetzbuch (HGB) § 257 regulates which business documents have
to be archived
- Legal retention period for business letters is 6 years, for other documents 10
years
- Abgabenordnung (AO, the German Fiscal Code) §§ 146, 147 describe similar
requirements under tax law
- Digital archiving of those documents must comply with the principles of proper
accounting (GoB) and with GoBS, which describes the requirements for process
documentation
- Process documentation is the proof of correct operation of the system and
describes the overall organizational and technical process of archiving
(collection, indexing, storage, retrieval, protection against loss / corruption and
reproduction of archived information)
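The retention periods named above (6 years for business letters, 10 years for other documents) can be turned into a deadline calculation. The sketch below is illustrative only: the class names and the function `retention_end` are hypothetical, and it assumes the period starts at the end of the calendar year in which the document was created. That start rule is not stated on the slide and should be confirmed with Legal.

```python
from datetime import date

# Retention periods in years, as stated in the list above.
RETENTION_YEARS = {"business_letter": 6, "other": 10}


def retention_end(created: date, doc_class: str) -> date:
    """Return the last day on which the document must still be retained.

    Assumption: the period begins at the end of the calendar year in
    which the document was created, so it always ends on 31 December.
    """
    years = RETENTION_YEARS[doc_class]
    return date(created.year + years, 12, 31)


if __name__ == "__main__":
    print(retention_end(date(2020, 3, 15), "business_letter"))
    print(retention_end(date(2020, 3, 15), "other"))
```

Keeping the periods in one table makes it easy to review them with the business units when regulations change.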
Page 13
Why Legal and regulatory requirements for archiving
- Digitally signed documents are as legally binding as conventional paper
documents
- Each country has different requirements depending on the business of the
company (e.g. Sarbanes-Oxley Act regarding internal controls)
- Subject to audits and inspections
Page 14
Why Legal and regulatory requirements for archiving
Industry-specific requirements for documentation / archiving
- Gefahrgutverordnung (GGAV), German dangerous goods regulation
- Environmental liability and product liability law
- Operational directives and regulations
- Good Practice quality guidelines and regulations
- etc.
Agree on the archiving process with internal departments (QA, Legal,
Controlling) and, where appropriate, with authorities
Page 15
What Retention policies for information life-cycle in Outlook and SharePoint
Recommendations
Outlook folder                               Retention period
Inbox                                        60 days
Other folders, Sent Items, Drafts, Outbox    2 years
Deleted items                                7 days
Calendar, Tasks                              2 years
Contacts                                     Duration of employment
Classes in SharePoint                        Retention period
Standard                                     2 years
Review                                       7 years
Long-Term                                    10 years
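The recommended Outlook retention periods can be expressed as a small policy table with an expiry check. This is a sketch under stated assumptions, not how Outlook or Exchange evaluates retention: `OUTLOOK_RETENTION` and `is_expired` are hypothetical names, a year is approximated as 365 days, and folders not listed fall back to the "Other folders" period.

```python
from datetime import date, timedelta

YEAR = timedelta(days=365)  # approximation; ignores leap years

# Retention periods per folder. None means "no fixed period":
# contacts are kept for the duration of employment.
OUTLOOK_RETENTION = {
    "Inbox": timedelta(days=60),
    "Other folders": 2 * YEAR,
    "Sent Items": 2 * YEAR,
    "Drafts": 2 * YEAR,
    "Outbox": 2 * YEAR,
    "Deleted items": timedelta(days=7),
    "Calendar": 2 * YEAR,
    "Tasks": 2 * YEAR,
    "Contacts": None,
}


def is_expired(folder: str, received: date, today: date) -> bool:
    """Return True if an item in `folder` has exceeded its retention period."""
    # Assumption: unknown folders use the "Other folders" period.
    period = OUTLOOK_RETENTION.get(folder, OUTLOOK_RETENTION["Other folders"])
    if period is None:
        return False
    return today - received > period
```

Anything that must outlive these periods has to be moved to the archive before the item expires.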
Page 16
What Which documents and data
Business units determine
- Which documents have to be archived, in what way, and for how long
(storage form, file plan, retention periods)
- Document classes (logical archive)
- Document types
- Index data
Page 17
What Requirements
Requirements for long-term preservation are specified by the
business
- Processes, workflows, interfaces
- Documents, objects, source, meta data
- Archiving period
- Regulatory aspects
- Permissions, roles, user management, responsibilities
- Purpose of archiving (e.g. display of documents in 15 years)
- Confidentiality, data integrity, sensitive data, availability
- Capacity (data volume, number of users, performance)
- etc.
Page 18
What Meta data
Meta data provides structured index and search capabilities for
archived objects
- Source of meta data (e.g. master data systems)
- Who maintains the master data?
- Should meta data be selected from predefined values or entered manually?
- Is meta data document-dependent?
- Is meta data transferred automatically from other systems?
- Is an audit trail required? (Who has changed which meta data, when, why)
Coordination of the meta data in early stages is highly recommended
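If an audit trail is required, every meta data change needs to record who changed which field, when, and why. The following is a minimal sketch of such a record; `MetaDataChange` and `ArchivedObjectMetaData` are hypothetical names, and a real archive would persist the trail in tamper-proof storage rather than in memory.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Optional


@dataclass(frozen=True)
class MetaDataChange:
    """One audit-trail entry: who changed which field, when, and why."""
    field_name: str
    old_value: Optional[str]
    new_value: str
    changed_by: str
    reason: str
    changed_at: datetime


@dataclass
class ArchivedObjectMetaData:
    values: dict = field(default_factory=dict)
    audit_trail: list = field(default_factory=list)

    def set(self, name: str, new_value: str, user: str, reason: str) -> MetaDataChange:
        # Record the previous value before overwriting it.
        change = MetaDataChange(
            field_name=name,
            old_value=self.values.get(name),
            new_value=new_value,
            changed_by=user,
            reason=reason,
            changed_at=datetime.now(timezone.utc),
        )
        self.values[name] = new_value
        self.audit_trail.append(change)
        return change
```

Entries are only ever appended, so the trail shows the complete history of each index value.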
Page 19
What Requirements
If raw data has to be archived
- Raw data is stored as is, bit-wise
- Primary goal is the ability to import raw data as 1:1 copy of the original data
- IT system generating raw data must be able to handle imported raw data even
after a long time
- Format of raw data must be coordinated
- Software manufacturers must guarantee that release changes do not impact the
capability to import outsourced data
- Meta data must be defined
- Processing of long-term preserved raw data is the responsibility of the
generating IT system, not of the archiving system
Page 20
How Technical aspects
Selection of eligible file formats
- Should the document be displayed as original incl. embedded graphics?
- Should the original document properties be reproduced (paper size, font size,
header, footer, logos, color, hand-written notes, etc.)?
- Should documents be archived in different formats but with same content (e.g.
XML and graphic)?
- Legal requirements?
- Is “loss of information” acceptable when converting into graphical
representations (jpeg)?
- Is the conversion process revision-safe?
- Is the archived document format suitable for the archiving period?
Page 21
How BSI approved formats
Graphics
- TIFF, storage of screened black-white images
- JPEG, storage of colour and gray scale images
Structure formats
- XML, can be used for long-term preservation of digital documents
Schema and layout have to be archived as well
- PDF/A, subset of PDF, standardized for long-term preservation
Format with structure and layout information and graphical objects
Documents must be validated to be PDF/A compliant
Page 22
How Storage media
Possible storage media
- Paper
- Microfilm
- Magnetic tapes, floppy disks
- Optical storage media (e.g. CD-R, CD-ROM, DVD, WORM)
- Hard drives
- etc.
All of these media types have a limited lifetime and durability. Long-term
preserved objects must be copied unchanged to new media when
technology changes in the storage media make this necessary.
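Copying an object to new media "unchanged" can be checked by comparing checksums before and after the copy. The sketch below is one possible approach, assuming file-based media and SHA-256 as the checksum; `sha256_of` and `migrate_object` are hypothetical helper names.

```python
import hashlib
import shutil
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Compute the SHA-256 checksum of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(1024 * 1024), b""):
            digest.update(chunk)
    return digest.hexdigest()


def migrate_object(source: Path, target: Path) -> str:
    """Copy one archived object to new media and verify it is bit-identical."""
    expected = sha256_of(source)
    target.parent.mkdir(parents=True, exist_ok=True)
    shutil.copyfile(source, target)
    if sha256_of(target) != expected:
        # Remove the corrupt copy so it is never mistaken for the original.
        target.unlink()
        raise IOError(f"checksum mismatch after copying {source} to {target}")
    return expected
```

The source should only be retired after the verification has succeeded, and the returned checksum can be logged as part of the process documentation.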
Page 23
How Additional topics
- Storage of sensitive data
- Restart of the archiving system after system outage in a disaster
- Integration in current IT environment
- Migration of archived objects is expensive depending on data volume
- User management
- Usage of storage media must be regulated
- Firewall based separation of archiving system
- A long-term archiving solution will be in use for a long time; supplier selection
should take this into account
Page 24
How Pros & Cons
Pros
Single storage of documents/objects
Saves storage space
Documents/objects available to
authorized persons
Documents/objects available from
every workplace
Structured search of
documents/objects
Cons
Usage of source documents must be
regulated
Personnel must be trained
(end-user, administrator)
On-going maintenance costs
Complex IT system and IT
infrastructure required
Page 25
We would be happy to help.
Do You Have
Any Questions?
http://www.sf-tools.net