SIGNIFICANT ENVIRONMENT INFORMATION FOR LTDP
Fabio Corubolo, Adil Hasan – University of Liverpool
Anna Eggers, Jens Ludwig – Göttingen State University Library
Mark Hedges, Simon Waddington – King’s College London
This project has received funding from the European Union’s Seventh Framework Programme for research, technological development and demonstration under grant agreement no FP7-601138 PERICLES.
Objective and outline
• Aim: Ensure long-term usability of Digital Objects (DOs)
• Usability of a Digital Object usually requires access to parts of its environment
• Define a broad set of information (Environment information)
• Consider its significance (Significant environment information)
• Explore and test pragmatic methods to collect such information
Environment information definition
• All the entities (DOs, metadata, policies, rights, services, users, etc.) useful to correctly access, render and use the DO.
Refinement:
• The information about the set of relationships between the source DO and any related objects from its environment.
Environment for a DO
• Technical system information (OS, system architecture, etc.)
• DO metadata (descriptive, structural, technical)
• User, policy, process information (user background knowledge, …)
• Information necessary to make use of the object, including:
  • Auxiliary data (e.g. calibration data to support sensor data)
  • External documentation (e.g. specifications, related documents)
  • Implicit knowledge about what data is useful to use the DO (e.g. the user's knowledge about what is relevant and what is not in the collection)
• More…
No object is an island, entire of itself
• Digital objects are used in a rich environment
[Figure labels: Environment; Digital object; Ext. Metadata; Storage; Present; Future]
Digital object information
• Rich and varied terminology
• The scope of each term is not absolutely defined
• We are aiming to support object use: a use-centric view
• First, a broad notion: Environment information, more or less all that sits outside of the DO
Significant Environment Information (SEI)
• Use of a DO has a purpose
• The purpose gives a scope to the dependent environment information
• Weights can express the importance for a specific purpose
Definition: We define SEI as the set of relationships between a DO and its environment information, qualified with purpose and weights.
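The SEI definition can be pictured as a small data model. The following is a minimal sketch, not the project's actual schema: the class names `Relationship` and `SEI`, the field names, the file names, and the assumption that weights lie in [0, 1] are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Relationship:
    """One link between a digital object (DO) and an entity in its environment."""
    source: str    # identifier of the DO
    target: str    # identifier of the environment entity
    purpose: str   # the use that this dependency serves
    weight: float  # importance for that purpose (assumed here to lie in [0, 1])


@dataclass
class SEI:
    """Significant environment information: a set of qualified relationships."""
    relationships: list = field(default_factory=list)

    def add(self, source, target, purpose, weight):
        if not 0.0 <= weight <= 1.0:
            raise ValueError("weight must be in [0, 1]")
        self.relationships.append(Relationship(source, target, purpose, weight))

    def for_purpose(self, source, purpose, min_weight=0.0):
        """Relationships of `source` scoped to `purpose`, heaviest first."""
        hits = [r for r in self.relationships
                if r.source == source
                and r.purpose == purpose
                and r.weight >= min_weight]
        return sorted(hits, key=lambda r: r.weight, reverse=True)


# Hypothetical example data.
sei = SEI()
sei.add("sensor_data.csv", "calibration.dat", "analysis", 0.9)
sei.add("sensor_data.csv", "format_spec.pdf", "analysis", 0.4)
sei.add("sensor_data.csv", "viewer_app", "rendering", 0.8)

top = sei.for_purpose("sensor_data.csv", "analysis", min_weight=0.5)
print([r.target for r in top])
```

The same DO gets a different significant environment depending on the purpose queried, which is the point of qualifying each relationship with both a purpose and a weight.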
How to collect and measure SEI?
• Observe the use of DOs
  • in different phases of the lifecycle
  • in the environment of creation and use
• Collect dependencies for use (relationships to other DOs)
• Measure significance, e.g. based on frequency of use
• Different semantics and factors for significance weights (value, …): work in progress
• Weights will change over time
• Sheer curation: curation activities integrated in the use workflow; lightweight and transparent
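One way to read "measure significance based on frequency of use" is to turn observed accesses into relative frequencies. The sketch below assumes a hypothetical access log of (DO, dependency) pairs and a hypothetical helper `frequency_weights`; it illustrates the idea only, since the slides leave the semantics of the weights as work in progress.

```python
from collections import Counter


def frequency_weights(access_log):
    """Map each (do, dependency) pair to its share of the DO's observed uses.

    `access_log` is an iterable of (do, dependency) pairs, one per observed use.
    """
    pair_counts = Counter(access_log)
    totals = Counter()
    for (do, _dep), n in pair_counts.items():
        totals[do] += n
    return {pair: n / totals[pair[0]] for pair, n in pair_counts.items()}


# Hypothetical observations: the document was opened with these dependencies.
log = [
    ("report.docx", "font_a.ttf"),
    ("report.docx", "font_a.ttf"),
    ("report.docx", "figures.xlsx"),
    ("report.docx", "font_a.ttf"),
]
weights = frequency_weights(log)
for (do, dep), w in sorted(weights.items()):
    print(f"{do} -> {dep}: {w:.2f}")
```

Because the weights are recomputed from the log, they shift as new observations arrive, which matches the remark that weights change over time.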
Pericles Extraction Tool (PET)
• Open source* framework that builds on the SEI concepts
• Uses a sheer curation approach: right time and place
• Generic, modular, domain agnostic
• Collection by observation: monitoring changes over time
• Snapshot of the system environment
• To observe unstructured workflows
• https://github.com/pericles-project/pet
* Release due soon, approved but waiting for final stamps
PET Architecture and modules
• Available and used system resources;
• File format identification and checksums;
• Currently running processes;
• Event information (file and network) from processes;
• Graphic configuration information;
• MS Office and PDF font dependencies;
• Native commands.
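To give a feel for what two of these modules gather (system information and file checksums), here is a minimal sketch using only the Python standard library. It is not PET's code or API: the function names `sha256_of` and `snapshot`, the choice of SHA-256, and the output layout are assumptions made for illustration.

```python
import hashlib
import platform
import tempfile
from pathlib import Path


def sha256_of(path, chunk_size=65536):
    """Checksum a file in chunks so large files need not fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def snapshot(paths):
    """Record basic system information plus size and checksum of each file."""
    return {
        "system": {
            "os": platform.system(),
            "release": platform.release(),
            "architecture": platform.machine(),
        },
        "files": {
            str(p): {"size": Path(p).stat().st_size, "sha256": sha256_of(p)}
            for p in paths
        },
    }


# Demonstration on a throwaway file.
with tempfile.TemporaryDirectory() as tmp:
    sample = Path(tmp) / "sample.txt"
    sample.write_bytes(b"hello")
    snap = snapshot([sample])

print(snap["system"]["os"], snap["files"][str(sample)]["size"])
```

Taking such a snapshot repeatedly and comparing checksums is one simple way to detect that a monitored file has changed between observations.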
How to set up PET for a use scenario
• PET is installed, configured, and started on the machine where the DOs are used, and stays in monitoring mode
• The profile (modules and configuration) is use-case specific
• The user interacts normally with the DOs while PET collects SEI in the background
• The environment information, DO events and changes are collected for future use and analysis
General scenario for PET
1. Use PET to collect environment information when and where the DOs are used, based on profiles
--- We are now here ---
2. Analyse the information collected to infer new relationships (also SEI) between DOs - forming a graph structure
3. Assign weights to relationships based on the purpose and significance – weighted graph
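Steps 2 and 3 can be sketched as follows, under a strong simplifying assumption: two DOs are taken to be related when they are used within a short time window of each other, and the edge weight is the normalised count of such co-occurrences. The slides do not specify the inference method, so the heuristic, the function names `infer_graph` and `normalise`, the 60-second window, and the sample events are all hypothetical.

```python
from collections import defaultdict


def infer_graph(events, window=60.0):
    """Count how often two different DOs are used within `window` seconds.

    `events` is a list of (timestamp_seconds, do_id) tuples.
    Returns {frozenset({a, b}): co_occurrence_count}.
    """
    events = sorted(events)
    graph = defaultdict(int)
    for i, (t_i, a) in enumerate(events):
        for t_j, b in events[i + 1:]:
            if t_j - t_i > window:
                break  # events are sorted, so later ones are even further away
            if a != b:
                graph[frozenset((a, b))] += 1
    return dict(graph)


def normalise(graph):
    """Scale counts to (0, 1] so the strongest edge has weight 1.0."""
    if not graph:
        return {}
    top = max(graph.values())
    return {edge: count / top for edge, count in graph.items()}


# Hypothetical event log collected while an operator investigates an anomaly.
events = [
    (0, "handover.pdf"),
    (20, "manual.pdf"),
    (50, "telemetry.csv"),
    (300, "manual.pdf"),
    (330, "telemetry.csv"),
]
weighted = normalise(infer_graph(events))
for edge, w in sorted(weighted.items(), key=lambda kv: -kv[1]):
    print(sorted(edge), w)
```

A co-occurrence heuristic like this will also pick up unrelated activity happening at the same time, so some filtering would be needed before the weights could be trusted.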
Experiment: use case description
• Fictional scenario, based on operations for the ISS SOLAR payload
• Operator’s task: resolve anomalies
• Process: extensive search in the archived data and documents
• Issue: how to preserve implicit information, help with overload
• PET task: record SEI for a specific anomaly
  • monitor the environment, record significant events, infer the documentation useful to solve the anomaly
• SEI: what is needed to identify and debug a specific anomaly, that is, the implicit operator knowledge
Experimental results (1)
An anomaly is reported in a handover sheet
The operator proceeds with documentation search and consultation, all tracked by PET
Experimental results (2)
• Environment monitoring
• Events, extraction on occurrence of events
• Leads to dependency inference
• In future work we consider more complex issues:
  • ‘noise’ from multitasking
  • careful analysis of collected data in the next phases
Conclusions, Future work
• Define Significant Environment Information (SEI) for object reuse
• Base for dependency graphs weighted on significance and purpose
• Explain ways to obtain SEI and significance weights
• Present the PET tool to collect SEI
• Show experimental results: initial dependency collection
Future:
• Improve: filtering, dependency inference
• Work on definition and semantics for significance weights
• Use weighted dependency graphs to support appraisal
About the PERICLES project
• Promoting and enhancing reuse of information throughout the content lifecycle, taking account of evolving semantics
• Ensure availability and reuse of digital objects for the next generations
• Extensions to current preservation and lifecycle models to address the evolution of dynamic heterogeneous resources and their dependencies
• Models capturing intent and interpretative context: key to achieving “preservation by design”
Facts & Figures
• Collaborative FP7 project on digital preservation
• 12 million Euro, co-funded by the European Commission
• 11 partners: research institutions, IT development and application domain
• 6 European countries
• Feb 2013 – Feb 2017
• Project website: http://www.pericles-project.eu
Consortium
COORDINATOR:
• King’s College London – UK
ACADEMIC PARTNERS:
• Hoegskolan i Borås – University of Borås – SE
• Georg-August-Universität Göttingen – DE
• University of Liverpool – UK
• Centre for Research and Technology Hellas – GR
• University of Edinburgh – UK
NON-ACADEMIC PUBLIC SECTOR ORGANISATIONS:
• Tate – UK
• Belgian User Service and Operation Centre (B.USOC) – BE
PRIVATE SECTOR ORGANISATIONS:
• Dotsoft – GR
• Space Applications Services NV/SA (SpaceApps) – BE
• Xerox Research Centre Europe – FR