Automatic Evaluation of Migration Quality in Distributed Networks of Converters. Miguel Ferreira [email protected]. Supervisors: Ana Alice Baptista, José Carlos Ramalho. ECDL 05 Doctoral Consortium, 2005-09-21.


Page 1: Automatic Evaluation of Migration Quality in Distributed Networks of Converters

Automatic Evaluation of Migration Quality in Distributed Networks of Converters

Miguel Ferreira
[email protected]

Supervisors
Ana Alice Baptista
José Carlos Ramalho

ECDL 05 Doctoral Consortium

2005-09-21

Page 2: Automatic Evaluation of Migration Quality in Distributed Networks of Converters

Contents

• Introductory concepts
• Research problems
• Proposed system
• Methodology
• Topics for discussion

Page 3: Automatic Evaluation of Migration Quality in Distributed Networks of Converters

Introductory concepts

• Digital preservation
  – The set of processes and activities that ensure the continued access to information and all kinds of cultural heritage existing in digital formats
• Digital object
  – An information object, of any type of information or any format, that is expressed in digital form
  – Text documents, digital photos, vector graphics, databases, Web pages, software

Page 4: Automatic Evaluation of Migration Quality in Distributed Networks of Converters

Strategies for digital preservation

• Emulation
  – Reproduction of the behaviour of a hardware/software platform in a different technological environment
• Encapsulation
  – Storing information about how the objects should be interpreted
• Migration
  – Periodic transfer of digital materials from one hardware/software configuration to another
• Others
  – Computer museums, viewers, Universal Virtual Computer

Page 5: Automatic Evaluation of Migration Quality in Distributed Networks of Converters

Migration

• Advantages
  – Updated formats that users can read and edit
• Disadvantages
  – Requires continuous diligence
  – Data loss
• Variants
  – Migration on request
  – Normalisation
  – Distributed migration

Page 6: Automatic Evaluation of Migration Quality in Distributed Networks of Converters

Distributed migration

• A network of remote conversion services supported by a semantic layer [Hunter et al.]
• Advantages
  – Platform independent
  – Redundancy
  – Multiple migration paths
  – Cost reduction
  – Compatible with other migration strategies
• Disadvantages
  – Bandwidth
  – Slow
• Examples
  – PANIC
  – MyMorph (NLMed)
  – TOM

[Figure: example conversion network over Format A, Format B, Format C, Format D and Format E, with the conversions A-C, B-C, C-E and A-E]
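The "multiple migration paths" advantage can be illustrated with a minimal sketch that enumerates the conversion chains available in the example network. It assumes that a label such as "Conversion A-C" means a one-way conversion from Format A to Format C. The names `CONVERSIONS`, `build_graph` and `migration_paths` are hypothetical and do not come from PANIC, MyMorph or TOM.

```python
from collections import defaultdict

# Conversion services from the slide's example network.
# Assumption: each pair is a one-way conversion (source, target).
CONVERSIONS = [("A", "C"), ("B", "C"), ("C", "E"), ("A", "E")]


def build_graph(conversions):
    """Map each source format to the formats it can be converted to."""
    graph = defaultdict(list)
    for source, target in conversions:
        graph[source].append(target)
    return graph


def migration_paths(graph, source, target, path=None):
    """Enumerate every cycle-free chain of conversions from source to target."""
    path = (path or []) + [source]
    if source == target:
        return [path]
    paths = []
    for next_format in graph.get(source, []):
        if next_format not in path:  # never revisit a format within one path
            paths.extend(migration_paths(graph, next_format, target, path))
    return paths


graph = build_graph(CONVERSIONS)
paths_a_to_e = migration_paths(graph, "A", "E")
print(paths_a_to_e)
```

Under these assumptions Format A reaches Format E either directly or through Format C, while Format D, which has no converter in the example, yields no path at all.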

Page 7: Automatic Evaluation of Migration Quality in Distributed Networks of Converters

How to choose a preservation strategy?

• Many preservation alternatives
• Lack of universal acceptance
• Distinct preservation requirements
  – Satisfaction of the designated community
  – Characteristics of the collection
  – Budget
• Framework for evaluating preservation strategies [Rauber]
  – Utility Analysis

Page 8: Automatic Evaluation of Migration Quality in Distributed Networks of Converters

Evaluation of preservation strategies

1. Definition of objective tree
2. Assignment of measurement units (e.g. millimetre, Mb, Euro)
3. Identification of preservation alternatives
4. Execution of preservation alternatives and evaluation of the outcome
5. Weighting of criteria in the objective tree
6. Calculation of partial and total values
7. Ranking of alternatives
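Steps 5 to 7 above can be sketched as a small weighted-sum calculation. This is only an illustration: the criteria, weights, alternatives and scores are invented, the objective tree is flattened to its leaves, and the measured outcomes are assumed to have been mapped already onto a common 0 to 5 scale where 5 is best.

```python
# Hypothetical objective tree flattened to its leaves: criterion -> weight.
# Assumption: the leaf weights sum to 1.
WEIGHTS = {"content preserved": 0.5, "file size": 0.2, "cost": 0.3}

# Hypothetical outcomes of executing each alternative, already mapped
# from their measurement units onto a common 0-5 scale (5 is best).
SCORES = {
    "migrate to format X": {"content preserved": 4, "file size": 3, "cost": 2},
    "migrate to format Y": {"content preserved": 3, "file size": 5, "cost": 4},
}


def partial_values(scores, weights):
    """Step 6a: weight each criterion's score."""
    return {criterion: weights[criterion] * scores[criterion] for criterion in weights}


def total_value(scores, weights):
    """Step 6b: sum the partial values into one figure per alternative."""
    return sum(partial_values(scores, weights).values())


def rank(alternatives, weights):
    """Step 7: order the alternatives from highest to lowest total value."""
    return sorted(
        alternatives,
        key=lambda name: total_value(alternatives[name], weights),
        reverse=True,
    )


ranking = rank(SCORES, WEIGHTS)
print(ranking)
```

With these invented numbers, format Y ranks first because its advantage on file size and cost outweighs the lower score for preserved content. Changing the weights in step 5 can reverse the ranking.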

Page 9: Automatic Evaluation of Migration Quality in Distributed Networks of Converters

Objective tree (example)

Page 10: Automatic Evaluation of Migration Quality in Distributed Networks of Converters

Research problems

• Automation of preservation processes
• Authenticity issues
• Cost management
• Evaluation of preservation alternatives

Page 11: Automatic Evaluation of Migration Quality in Distributed Networks of Converters

Research questions

• Is it feasible to design and implement a system that is able to automatically:
  – determine the amount of data loss that occurred in a migration and generate detailed migration reports for inclusion in the objects’ preservation metadata?
  – provide recommendations of migration paths or target formats that will best suit users’ requirements?
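One possible reading of the first question is sketched below. It assumes that the significant properties of an object can be extracted into a flat dictionary, and it measures data loss as the share of the original's properties that are missing or changed after migration. The function name, the property names and this definition of loss are all hypothetical; the slides do not specify how loss is measured.

```python
def evaluate_migration(original, migrated):
    """Compare two property dictionaries and summarise what changed.

    Assumption: data loss is the fraction of the original's properties
    that are absent from, or different in, the migrated object.
    """
    lost = sorted(key for key in original if key not in migrated)
    altered = sorted(
        key for key in original if key in migrated and migrated[key] != original[key]
    )
    preserved = len(original) - len(lost) - len(altered)
    data_loss = 1 - preserved / len(original) if original else 0.0
    return {"lost": lost, "altered": altered, "data_loss": data_loss}


# Hypothetical properties extracted from an image before and after migration.
original = {"width": 800, "height": 600, "colour depth": 24, "layers": 3}
migrated = {"width": 800, "height": 600, "colour depth": 8}

report = evaluate_migration(original, migrated)
print(report)
```

The returned dictionary plays the role of a migration report: it records which properties were lost or altered, which is the kind of detail that could be attached to an object's preservation metadata.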

Page 12: Automatic Evaluation of Migration Quality in Distributed Networks of Converters

Proposed System

[Figure: architecture diagram. Components: User, MetaConverter, Migration Network, Migration Evaluator, Migration Knowledge Base (MKB), Migration Advisor. Labelled interactions: Request Migration [Source object]; Invoke Migration [Source object]; Evaluate migration [Original object] [Migrated object] [Process metrics]; Store [Migration report] [Migration data]; Request Advice [Criteria]; Query MKB [Parameters]. Returned items: [Migrated object], [Migration report], [Migration advice].]
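The interplay between the Migration Knowledge Base and the Migration Advisor might look like the sketch below. It is a toy model and not the proposed system's actual design: the class and method names, the in-memory storage, the report fields (`fidelity`, `speed`) and the weighted-average scoring rule are all assumptions made for illustration.

```python
from statistics import mean


class MigrationKnowledgeBase:
    """In-memory stand-in for the MKB: one record per evaluated migration."""

    def __init__(self):
        self.records = []

    def store(self, path, report):
        self.records.append({"path": path, "report": report})

    def query(self, path):
        return [record["report"] for record in self.records if record["path"] == path]


class MigrationAdvisor:
    """Recommends the path whose stored reports best match the user's criteria."""

    def __init__(self, mkb):
        self.mkb = mkb

    def advise(self, candidate_paths, criteria):
        def score(path):
            reports = self.mkb.query(path)
            if not reports:
                return float("-inf")  # no evidence recorded for this path yet
            # Weighted sum of the average value observed for each criterion.
            return sum(
                weight * mean(report[criterion] for report in reports)
                for criterion, weight in criteria.items()
            )

        return max(candidate_paths, key=score)


mkb = MigrationKnowledgeBase()
mkb.store("A->E", {"fidelity": 0.7, "speed": 0.9})
mkb.store("A->C->E", {"fidelity": 0.9, "speed": 0.5})
mkb.store("A->C->E", {"fidelity": 1.0, "speed": 0.4})

advisor = MigrationAdvisor(mkb)
candidates = ["A->E", "A->C->E"]
best_for_fidelity = advisor.advise(candidates, {"fidelity": 0.8, "speed": 0.2})
best_for_speed = advisor.advise(candidates, {"fidelity": 0.2, "speed": 0.8})
print(best_for_fidelity, best_for_speed)
```

The same stored reports lead to different advice depending on the criteria: a user who weights fidelity highly is pointed to the two-step path, while a user who weights speed highly is pointed to the direct conversion.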


Page 18: Automatic Evaluation of Migration Quality in Distributed Networks of Converters

Methodology - proof of concept

The concepts

1. Automatic quantification of the data loss that occurred in a migration and generation of preservation metadata
2. Automatic recommendation of migration strategies as well as target formats

The proof (empirical validation)

1. Evaluator versus human experts
2. Advisor versus evaluation framework
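The slides do not say how the evaluator will be compared with human experts. As one possible approach, the sketch below measures how far the automatic data-loss scores lie from the mean expert rating of the same migrations. The function name, the 0 to 1 score scale, the tolerance and the sample data are all assumptions.

```python
from statistics import mean


def agreement(evaluator_scores, expert_scores, tolerance=0.1):
    """Summarise the gap between automatic scores and mean expert ratings.

    evaluator_scores: migration id -> automatic data-loss score (0 to 1)
    expert_scores:    migration id -> list of expert ratings (0 to 1)
    """
    diffs = [
        abs(evaluator_scores[migration] - mean(expert_scores[migration]))
        for migration in evaluator_scores
    ]
    return {
        "mean_abs_diff": mean(diffs),
        # Share of migrations where the evaluator is within the tolerance.
        "within_tolerance": sum(diff <= tolerance for diff in diffs) / len(diffs),
    }


# Invented scores for two migrations, each rated by two experts.
evaluator_scores = {"m1": 0.5, "m2": 0.25}
expert_scores = {"m1": [0.5, 0.5], "m2": [0.5, 0.5]}

result = agreement(evaluator_scores, expert_scores)
print(result)
```

A small mean difference and a high share within tolerance would suggest that the evaluator tracks expert judgement. A rank correlation between the two sets of scores would be another reasonable choice.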

Page 19: Automatic Evaluation of Migration Quality in Distributed Networks of Converters

Key contributions

• For individual preservers, digital archives and libraries:
  – Outsourcing and automation of digital preservation
  – Generation of preservation metadata (authenticity)
  – Ranking of migration alternatives
• For designers and programmers of converters:
  – Possibility of publishing their converters as services
• For metadata creators and users:
  – Increase adoption
  – Help to improve future versions
  – Accelerate the development of XML bindings

Page 20: Automatic Evaluation of Migration Quality in Distributed Networks of Converters

Round-up

• Service oriented architecture (SOA)
  – Automatic quantification of data loss
  – Provides recommendations on which migration paths or target formats are best suited for each user
  – Simplifies the creation of preservation metadata
  – Based on migration
• Methodology
  – Proof of concept with empirical validation
    • Evaluator versus Human experts
    • Advisor versus Evaluation framework

Page 21: Automatic Evaluation of Migration Quality in Distributed Networks of Converters

Topics for discussion

• Relevance of research
• Research methodology
• System architecture
• Format registry vocabulary
  – e.g. MIME types, TOM type descriptors, Global Digital Format Registry, PRONOM, etc.
• Preservation metadata schema
  – e.g. PREMIS data dictionary (event entity)