Long-Time Preservation Thomas Stensitzki


Long-Time Preservation

Thomas Stensitzki


Agenda

1. Long-term preservation

2. Why should/must items be archived?

3. Which items should/must be archived?

4. How can archiving be done?


Terms: Outsourcing, Filing, Backup, Archiving

Outsourcing

- Data (e.g. of a specific period) is exported from a source system and converted (if required)

- Outsourced data is no longer available in the source system

- Outsourced data can be backed up or archived

- Importing outsourced data might require conversion if the target data structure is different

Filing

- Storage of objects in a folder of the file system

- Filed objects can be backed up or archived depending on their file location


Backup

- Copy of existing objects to a storage medium, so that data can be restored in case of data corruption or accidental deletion

- Performed periodically

- The storage medium is overwritten over time; older versions of an object can therefore no longer be restored

- Old versions of an object can be restored for a specific period only

Archiving

- Copy of a file or document to an external storage medium

- Standardized file format (TIFF, JPEG), if required

- Storage for a longer period
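The difference between backup and archiving described above can be illustrated with a minimal Python sketch. It assumes a backup that reuses a fixed number of media slots and an archive that never overwrites an entry. The class name `RotatingBackup`, the slot count and the labels are hypothetical and serve only as illustration.

```python
from collections import deque
from typing import Optional


class RotatingBackup:
    """Keeps only the most recent `slots` copies, like reused backup media."""

    def __init__(self, slots: int) -> None:
        # A deque with maxlen silently drops the oldest copy when it is full.
        self._copies = deque(maxlen=slots)

    def run(self, label: str, content: str) -> None:
        self._copies.append((label, content))

    def restore(self, label: str) -> Optional[str]:
        for stored_label, content in self._copies:
            if stored_label == label:
                return content
        return None  # the copy has already been overwritten


backup = RotatingBackup(slots=3)
archive = {}  # archive entries are written once and kept

for day, text in enumerate(["v1", "v2", "v3", "v4", "v5"], start=1):
    backup.run(f"day{day}", text)
    archive.setdefault(f"day{day}", text)

print(backup.restore("day1"))  # None: outside the backup period
print(archive["day1"])         # v1: still available in the archive
```

After five runs with three slots, only the last three versions can be restored from the backup, while the archive still holds every version.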


Terms: Document management vs. Long-term preservation

Document management

- Management of documents being edited, using check-in, check-out and versioning

- Documents can be found by attribute value search or full-text search

- Attributes and document links are managed by the DMS

- Documents are stored in the file system or a DMS database


Long-term preservation

- Auditable and unchangeable storage of completed objects for a long time

- Copy of objects (e.g. files, documents) to an external storage medium

- Files and raw data are archived in original format

- Documents are converted and archived in a standardized format (black/white = TIFF, colour = JPEG or PDF/A)

- Document lookup via index

- Archived files and raw data can be provided in original format

- Archived documents can be provided using viewer software
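Unchangeable storage can be illustrated with a small write-once sketch in Python. `WriteOnceStore` is a hypothetical in-memory stand-in, not a real archive product: it accepts each object identifier exactly once, offers no update or delete operation, and hands out copies so that callers cannot alter the stored object.

```python
class WriteOnceStore:
    """Hypothetical write-once store: objects can be added and read, never changed."""

    def __init__(self) -> None:
        self._objects = {}

    def put(self, object_id: str, data: bytes) -> None:
        if object_id in self._objects:
            # Completed objects are unchangeable; a second write is rejected.
            raise PermissionError(f"{object_id} is already archived")
        self._objects[object_id] = bytes(data)  # immutable copy

    def get(self, object_id: str) -> bytes:
        return self._objects[object_id]


store = WriteOnceStore()
store.put("invoice-0001", b"original content")

try:
    store.put("invoice-0001", b"tampered content")
except PermissionError as error:
    print("rejected:", error)

print(store.get("invoice-0001"))
```

A real system would enforce this on the storage layer (for example on WORM media) rather than in application code; the sketch only shows the intended behaviour.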


Terms: Long-term preservation

Digital archiving

- Database-driven, long-term, secure and unchangeable storage of digital information objects that can be reproduced at any time

Digital long-term preservation

- Storage of digital information for a period longer than 10 years

Auditable digital archiving

- Storage of digital business-related information in accordance with the requirements of

  - Handelsgesetzbuch (HGB, German Commercial Code) § 239, § 257

  - Abgabenordnung (AO, German Fiscal Code) § 146, § 147, § 200

  - GoBS

- Secure and orderly storage of business-related documents with retention periods of six to ten years


Why: Sources of documents/objects

Documents, lifecycle of documents

- Creating and editing documents: documents in process (e.g. in a DMS or SharePoint)

- Completed documents: final version of a document

- Additional editing creates a new version

Other documents

- Correspondence, reports, rules, pictures, films, letters, invoices, quotations, certificates from different sources

Workflows

- Information from workflow-based systems (with digital signatures)

- Final document can be created from related data as the final workflow step

IT systems

- Raw data is usually available in databases or files


Why: Dealing with documents/objects

Documents

- Documents in process and/or final documents are stored in a DMS, in SharePoint or on a disk drive (local or network share)

- Documents stored on network shares are backed up automatically

- Documents in SharePoint and emails in Outlook are deleted after the retention period has expired

- Deleted documents on a network share cannot be restored after the backup period has been exceeded

- Final documents signed by hand are archived on paper and/or scanned to PDF and stored as a file (attached to an email)


Other documents

- Emails are deleted from the inbox automatically after the retention period has expired

- Reports, images, films, invoices, quotations, certificates, etc. that are available as files are considered documents

- Documents on paper, e.g. correspondence, letters, certificates, etc., are stored in paper files


Workflow vs. documents

- Information created in workflow systems is stored in databases together with the digital signature data

- All data of a finalized workflow is (usually) stored digitally within the database; the final document can be created using a template

- A print-out is treated as a copy of the original digital document

- Digitally signed documents are treated the same as documents signed by hand

IT systems vs. raw data

- Raw data is stored in databases or files which grow over time

- Data can be outsourced or exported to reduce the storage size, but the data is then not instantly accessible to the application

- Software manufacturers must guarantee that release changes do not impact the capability to import outsourced data


Why: Legal and regulatory requirements for archiving

Legal requirements for business documents

- Handelsgesetzbuch (HGB) § 257 regulates which business documents have to be archived

- The legal retention period for business letters is 6 years, for other documents 10 years

- Abgabenordnung (AO) §§ 146, 147 describe similar requirements in tax law

- Digital archiving of those documents must comply with the principles of proper accounting (GoB) and the GoBS, which describe the requirements for process documentation

- Process documentation is the proof of correct operation of the system and describes the overall organizational and technical process of archiving (collection, indexing, storage, retrieval, protection against loss / corruption and reproduction of archived information)


- Digitally signed documents are as legally binding as conventional paper documents

- Each country has different requirements depending on the business of the company (e.g. Sarbanes-Oxley Act regarding internal controls)

- Archiving is subject to audits and inspections


Industry-specific requirements for documentation / archiving

- Gefahrengutverordnung (GGAV)

- Environmental liability and product liability law

- Operational directives and regulations

- Good Practice quality guidelines and regulations

- etc.

Agree on the archiving process with internal departments (QA, Legal, Controlling) and, where appropriate, with the authorities


What: Retention policies for information life-cycle in Outlook and SharePoint

Recommendations

Outlook                                      Retention period
Inbox                                        60 days
Other folders, Sent Items, Drafts, Outbox    2 years
Deleted Items                                7 days
Calendar, Tasks                              2 years
Contacts                                     Duration of employment

Classes in SharePoint                        Retention period
Standard                                     2 years
Review                                       7 years
Long-Term                                    10 years
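As an illustration, the recommended retention periods can be turned into expiry dates with a few lines of Python. This is only a sketch: the slides do not say when the retention clock starts, so the code assumes it starts on the creation or receipt date, and the function names are hypothetical.

```python
from datetime import date, timedelta

# Retention periods taken from the recommendations above.
OUTLOOK_RETENTION_DAYS = {"Inbox": 60, "Deleted Items": 7}
SHAREPOINT_RETENTION_YEARS = {"Standard": 2, "Review": 7, "Long-Term": 10}


def add_years(start: date, years: int) -> date:
    """Return the same calendar day `years` later (29 Feb falls back to 28 Feb)."""
    try:
        return start.replace(year=start.year + years)
    except ValueError:
        return start.replace(year=start.year + years, day=28)


def sharepoint_expiry(created: date, retention_class: str) -> date:
    # Assumption: the period starts on the creation date.
    return add_years(created, SHAREPOINT_RETENTION_YEARS[retention_class])


def outlook_expiry(received: date, folder: str) -> date:
    # Assumption: the period starts on the date the item arrived in the folder.
    return received + timedelta(days=OUTLOOK_RETENTION_DAYS[folder])


print(sharepoint_expiry(date(2015, 6, 30), "Long-Term"))  # 2025-06-30
print(outlook_expiry(date(2024, 1, 1), "Inbox"))          # 2024-03-01
```

Statutory retention periods may use a different start date, so a production rule set would have to be agreed with the legal department first.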


What: Which documents and data

Business units determine

- Which documents have to be archived, how, and for how long (storage form, file plan, retention periods)

- Document classes (logical archive)

- Document types

- Index data


What: Requirements

Requirements for long-term preservation are specified by the business

- Processes, workflows, interfaces

- Documents, objects, source, meta data

- Archiving period

- Regulatory aspects

- Permissions, roles, user management, responsibilities

- Purpose of archiving (e.g. display of documents in 15 years)

- Confidentiality, data integrity, sensitive data, availability

- Capacity (data volume, number of users, performance)

- etc.


What: Meta data

Meta data provides structured index and search capabilities for archived objects

- Source of meta data (e.g. master data systems)

- Who maintains the master data?

- Should meta data be selected from predefined values or entered manually?

- Is meta data document-dependent?

- Is meta data transferred automatically from other systems?

- Is an audit trail required? (Who changed which meta data, when, and why)

Coordinating the meta data at an early stage is highly recommended
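The audit-trail question above (who changed which meta data, when, and why) can be made concrete with a small Python sketch. The class and field names (`IndexRecord`, `AuditEntry`, `set_value`) are hypothetical; a real archive system would define its own index model.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Optional


@dataclass(frozen=True)
class AuditEntry:
    """One change to one meta data field: what, who, when and why."""
    field_name: str
    old_value: Optional[str]
    new_value: str
    changed_by: str
    reason: str
    changed_at: datetime


@dataclass
class IndexRecord:
    """Index entry of an archived object with its meta data and audit trail."""
    document_id: str
    metadata: dict = field(default_factory=dict)
    audit_trail: list = field(default_factory=list)

    def set_value(self, name: str, value: str, user: str, reason: str) -> None:
        # Record the change before applying it so the old value is preserved.
        self.audit_trail.append(
            AuditEntry(name, self.metadata.get(name), value, user, reason,
                       datetime.now(timezone.utc))
        )
        self.metadata[name] = value


record = IndexRecord("DOC-1")
record.set_value("customer", "ACME", "alice", "initial indexing")
record.set_value("customer", "ACME GmbH", "bob", "corrected legal name")

for entry in record.audit_trail:
    print(entry.changed_by, entry.field_name, entry.old_value, "->", entry.new_value)
```

Keeping the trail as an append-only list of immutable entries means earlier changes are never rewritten when a value is corrected later.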


What: Requirements

If raw data has to be archived

- Raw data is stored as is, bit for bit

- The primary goal is the ability to import raw data as a 1:1 copy of the original data

- The IT system generating the raw data must be able to handle imported raw data even after a long time

- The format of the raw data must be coordinated

- Software manufacturers must guarantee that release changes do not impact the capability to import outsourced data

- Meta data must be defined

- Processing of long-term preserved raw data is the responsibility of the generating IT system, not of the archiving system
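One common way to check that raw data really comes back as a 1:1 copy is to record a checksum when the data is archived and to compare it on export. The following Python sketch assumes SHA-256 as the checksum and uses hypothetical helper names (`archive_raw`, `export_raw`); the slides do not prescribe a particular method.

```python
import hashlib


def fingerprint(raw: bytes) -> str:
    """SHA-256 digest of the raw bytes, used as a fixity value."""
    return hashlib.sha256(raw).hexdigest()


def archive_raw(raw: bytes) -> dict:
    # Store the bytes unchanged together with their size and checksum.
    return {"payload": bytes(raw), "size": len(raw), "sha256": fingerprint(raw)}


def export_raw(entry: dict) -> bytes:
    # Refuse to hand out data that no longer matches the recorded checksum.
    if fingerprint(entry["payload"]) != entry["sha256"]:
        raise ValueError("archived raw data does not match its recorded checksum")
    return entry["payload"]


original = b"\x00\x01\x02 measurement series 42"
entry = archive_raw(original)
print(export_raw(entry) == original)  # True
```

The archive only guarantees the bytes; interpreting them remains the task of the generating IT system, as stated above.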


How: Technical aspects

Selection of eligible file formats

- Should the document be displayed like the original, incl. embedded graphics?

- Should the original document properties be reproduced (paper size, font size, header, footer, logos, color, hand-written notes, etc.)?

- Should documents be archived in different formats but with the same content (e.g. XML and graphic)?

- Legal requirements?

- Is “loss of information” acceptable when converting into graphical representations (JPEG)?

- Is the conversion process revision-safe?

- Is the archived document format suitable for the archiving period?


How: BSI-approved formats

Graphics

- TIFF, storage of screened black-and-white images

- JPEG, storage of colour and grayscale images

Structured formats

- XML, can be used for long-term preservation of digital documents

  Schema and layout have to be archived as well

- PDF/A, subset of PDF, standardized for long-term preservation

  Format with structure and layout information and graphical objects

  Documents must be validated to be PDF/A compliant


How: Storage media

Possible storage media

- Paper

- Microfilm

- Magnetic tapes, floppy disks

- Optical storage media (e.g. CD-R, CD-ROM, DVD, WORM)

- Hard drives

- etc.

The selected media types have a limited lifetime and durability. Long-term preserved objects must be copied to new media unchanged whenever technology-related changes in the storage media require it.
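Copying objects to new media "unchanged" implies that each copy is compared with its source before the old medium is retired. A minimal sketch in Python, assuming both media are mounted as ordinary directories; `migrate_object` is a hypothetical helper, and the byte-by-byte comparison uses `filecmp.cmp` with `shallow=False` from the standard library.

```python
import filecmp
import shutil
from pathlib import Path


def migrate_object(source: Path, target_dir: Path) -> Path:
    """Copy one archived file to the new medium and verify the copy byte by byte."""
    target_dir.mkdir(parents=True, exist_ok=True)
    target = target_dir / source.name
    shutil.copy2(source, target)  # copies content and file timestamps
    # shallow=False forces a comparison of the file contents, not just metadata.
    if not filecmp.cmp(source, target, shallow=False):
        target.unlink()
        raise OSError(f"copy of {source.name} differs from the original")
    return target
```

A call such as `migrate_object(Path("old_medium/report.tif"), Path("new_medium"))` (paths are placeholders) returns the path of the verified copy or raises an error and removes the faulty copy.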


How: Additional topics

- Storage of sensitive data

- Restart of the archiving system after a system outage or disaster

- Integration in current IT environment

- Migration of archived objects can be expensive, depending on the data volume

- User management

- Usage of storage media must be regulated

- Firewall-based separation of the archiving system

- A long-term archiving solution should remain in use for a long time; supplier selection should take this into account


How: Pros & Cons

Pros

- Single storage of documents/objects

- Saves storage space

- Documents/objects available to authorized persons

- Documents/objects available from every workplace

- Structured search of documents/objects

Cons

- Usage of source documents must be regulated

- Personnel must be trained (end users, administrators)

- On-going maintenance costs

- Complex IT system and IT infrastructure required


We would be happy to help.

Do you have any questions?

http://www.sf-tools.net [email protected]