Formalizing the Design of Digital Libraries Based on UML Delos NoE, Preservation Cluster: Workshop: Persistency in Digital Libraries 13. February 2006, Oxford Internet Institute


Formalizing the Design of Digital Libraries Based on UML

Delos NoE, Preservation Cluster:

Workshop: Persistency in Digital Libraries

13. February 2006, Oxford Internet Institute

0

Talking about …

• Theoretical stage: Transforming conceptual models into a UML representation (class diagram)

• „Pragmatic“ model by Endres and Fellner

• Formally defined model „5S Framework for Digital Libraries“ by Fox, Goncalves et al.

1

The Endres/Fellner Model (EF-Model)

Goals

• Modelling the architecture of a digital library at a very high level (conceptual model)

• Modelling just those elements of a DL which are absolutely fundamental and do not change

2

Starting point: Use cases

The EF-Model is based on an essential model, which first of all considers the fundamental scenarios of the system (business processes, use cases):

3

• How can the digital library system fulfill the requirements of the essential model?

• Therefore we need to know: Which elements and concepts does the digital library have to deal with in order to handle the use cases?

4

• The fundamental unit of a digital library is data.

• All system data has to be saved.

DigitalLibraryData

saveData()

5

• According to the essential model, there are 8 kinds of data within a digital library.

• Each of these kinds of data is a specialisation of the global concept of data.

• So these data can be modelled as super-class - sub-class relationships, i.e. as generalisations.
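
The generalisation described above can be sketched in Python as a minimal illustration. The class names are those used on the slides; the `saved` flag and the `EF_KINDS` list are assumptions added only to make the sketch runnable, not part of the EF-Model.

```python
class DigitalLibraryData:
    """Global concept of data: every kind of library data can be saved."""

    def __init__(self):
        # Assumption: a simple flag stands in for real persistence.
        self.saved = False

    def saveData(self):
        # A real system would write to a database or file store here.
        self.saved = True
        return self.saved


# The eight kinds of data of the essential model, each a specialisation.
class EFUser(DigitalLibraryData): pass
class EFSupplier(DigitalLibraryData): pass
class EFDocument(DigitalLibraryData): pass
class EFRetrieval(DigitalLibraryData): pass
class EFService(DigitalLibraryData): pass
class EFOrder(DigitalLibraryData): pass
class EFDelivery(DigitalLibraryData): pass
class EFAccounting(DigitalLibraryData): pass

# Hypothetical helper list, used only to iterate over the eight kinds.
EF_KINDS = [EFUser, EFSupplier, EFDocument, EFRetrieval,
            EFService, EFOrder, EFDelivery, EFAccounting]
```

Because every sub-class inherits `saveData()`, the requirement that all system data has to be saved is stated once, in the super-class.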

7

1. Users

• Data about people who are users of the digital library are one fundamental kind of data within a digital library system.

This data represents the user. Therefore, the class to be modelled is termed „User“.

• Basic attributes are the address and the profile of the user. Additionally, users can be identified through an identification number. Operations allow these data to be created or modified.

• Users are specified through sub-classes.
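
A minimal sketch of such a user class, assuming a simple running counter for the identification number. The attribute names `address`, `profile` and `user_id` follow the description above; the `modify()` signature and the sub-class `EFStaffUser` are hypothetical illustrations, since E/F leave the concrete sub-classes to the designer.

```python
from dataclasses import dataclass, field
from itertools import count

# Assumption: identification numbers are drawn from a running counter.
_next_id = count(1)


@dataclass
class EFUser:
    address: str
    profile: str
    user_id: int = field(default_factory=lambda: next(_next_id))

    def modify(self, address=None, profile=None):
        """Update only the attributes that are passed in."""
        if address is not None:
            self.address = address
        if profile is not None:
            self.profile = profile


@dataclass
class EFStaffUser(EFUser):
    """Hypothetical specialisation; not named in the EF-Model."""
    department: str = ""
```

Creating a user corresponds to calling the constructor, e.g. `EFUser("1 Library Road", "interest: maps")`; modifying it corresponds to `modify(address=...)`.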

8

Class „EFUser“

6

9

2. Supplier

• Suppliers are the second group of entities which interact with the system. They can be real persons as well as corporations. Supplier data is encapsulated within the class „Supplier“.

• According to E/F, basic attributes are address and (sales) conditions. They are considered to be common to all suppliers.

10

Class „EFSupplier“

„EFSupplier“ can be specialised through subclasses. Which particular specialisations are chosen is up to the designer and depends on the requirements of the DL.

11

3. Documents

• Documents are the core products of a digital library.

• All data about digital documents which are deliverable (asked for by any user) are subsumed within a class „EFDocument“.

• „EFDocument“ serves as a super-class for a number of sub-classes. Again, which sub-classes are derived depends on the needs of each individual digital library.

12

Class „EFDocument“

13

4. Finding Aids

• Finding aids cover all of the descriptive metadata of a digital library; E/F focus especially on metadata that can be retrieved via, e.g., OPACs or search engines.

We therefore call this class „EFRetrieval“. The tools for retrieval are modelled as sub-classes as well.

• According to E/F, basic attributes are designation, type and (network) address; basic operations are inserting new finding aids or modifying existing ones.
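
As a rough sketch, the attributes and operations named above could look as follows. The attribute `kind` stands for the E/F attribute "type" (renamed here to avoid Python's built-in `type`); the `FindingAidRegistry` class and the sub-classes `OPAC` and `SearchEngine` are hypothetical and serve only to illustrate inserting and modifying finding aids.

```python
from dataclasses import dataclass


@dataclass
class EFRetrieval:
    designation: str
    kind: str       # "type" in the E/F description
    address: str    # network address


# Retrieval tools modelled as sub-classes (illustrative choices).
class OPAC(EFRetrieval): pass
class SearchEngine(EFRetrieval): pass


class FindingAidRegistry:
    """Hypothetical container offering the two basic operations."""

    def __init__(self):
        self._aids = {}

    def insert(self, aid):
        if aid.designation in self._aids:
            raise ValueError("finding aid already registered")
        self._aids[aid.designation] = aid

    def modify(self, designation, **changes):
        aid = self._aids[designation]
        for name, value in changes.items():
            if not hasattr(aid, name):
                raise AttributeError(name)
            setattr(aid, name, value)
        return aid
```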

Class „EFRetrieval“

14

15

5. Services

• Services are defined as all services which are supported by the digital library except the delivery of documents.

• E/F do not give more detailed statements on services.

EFService

16

6. Orders

• The E/F model also comprises business data, just as we can find them in almost every commercial company.

• Within the EF-Model, one important task of a digital library is its ability to cope with orders of users for documents or services.

• The class „EFOrder“ represents this task.

17

Class „EFOrder“

18

7. Deliveries

• Suppliers provide users with the services or documents they have ordered.

• These data concerning deliveries are therefore encapsulated within the class „EFDelivery“.

19

Class „EFDelivery“

20

8. Accountings

• All deliveries are accounted for. The related data is encapsulated in the „EFAccounting“ class. The particular units of the accounting (items) are modelled as a class that is associated with „EFAccounting“.

• Order, Delivery and Accounting are business related data.
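
The chain of business data (order, delivery, accounting with its items) might be sketched like this. The slides do not list attributes for these classes, so every field below, as well as the item class name `AccountingItem`, is an assumption chosen only for illustration.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class EFOrder:
    user_id: int
    requested: str          # the document or service that was ordered


@dataclass
class EFDelivery:
    order: EFOrder
    supplier: str


@dataclass
class AccountingItem:
    """One unit of the accounting, associated with EFAccounting."""
    delivery: EFDelivery
    amount: float


@dataclass
class EFAccounting:
    items: List[AccountingItem] = field(default_factory=list)

    def account(self, delivery, amount):
        # Every delivery is accounted for by adding one item.
        self.items.append(AccountingItem(delivery, amount))

    def total(self):
        return sum(item.amount for item in self.items)
```

A short usage example: an order leads to a delivery, and the delivery leads to an accounting item.

```python
order = EFOrder(user_id=1, requested="Digitale Bibliotheken")
delivery = EFDelivery(order, supplier="Example Press")
accounting = EFAccounting()
accounting.account(delivery, 2.5)
```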

21

Class „EFAccounting“

22

EF-Model: Summary

• The EF-Model is a high-level architecture. It provides a conceptual model of a digital library system.

• The EF-Model is also a taxonomy of data.

• It focuses on some aspects of digital libraries. Not all aspects are considered equally. The system is to a certain extent understood as an economic one.

• The model also belongs to the analysis stage of system design.

23

Complete model (red = core classes)

25

5S Model of a Digital Library

1. What is „5S“?

• „5S“ stands for: Streams, Structures, Spaces, Scenarios and Societies

• These five dimensions are considered to be crucial for every digital library

• As the main components they constitute a framework for a digital library. All of the elements in the 5S framework are formally described.

26

• Streams are defined as sequences of elements of an arbitrary type. Examples are bitstreams or streams of characters.

• Structures reflect the organisation of information. This can be on quite different levels, e.g. structure of streams, structure of a hypertext, relationships among actors, system connections.

27

• Spaces present the content of digital libraries in a usable and retrievable way. This could be the interface to a bibliographic database or a browser for accessing objects.

• Scenarios detail the behaviour of digital library services and explain the functionality of structures and spaces. An example is the act of searching for objects.

• Societies focus on the actors involved in the functionality of a digital library, e.g. users, suppliers, service staff.

28

Formal Definition of a DL

A digital library is a 4-tuple DL = (R, Cat, Serv, Soc) where

• R is a repository

• Cat = {DMC1, DMC2, ..., DMCK} is a set of metadata catalogues for all collections {C1, C2, ..., CK} in R

• Serv is a set of services

• Soc is a society.
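
As a rough illustration, the 4-tuple can be written down as a Python named tuple. The concrete container types (dictionaries and sets of strings) and the helper `has_catalogue_for_every_collection` are assumptions made for this sketch; the formal model itself does not prescribe any representation.

```python
from typing import Dict, List, NamedTuple, Set, Tuple


class DigitalLibrary(NamedTuple):
    R: Dict[str, Set[str]]                 # repository: collection -> handles
    Cat: Dict[str, Dict[str, List[str]]]   # collection -> metadata catalogue
    Serv: Set[str]                         # set of services
    Soc: Tuple[Set[str], Set[str]]         # society: (communities, relationships)


def has_catalogue_for_every_collection(dl):
    """Check that Cat holds exactly one catalogue per collection in R."""
    return set(dl.Cat) == set(dl.R)


# Illustrative instance; all names and handles are made up.
dl = DigitalLibrary(
    R={"theses": {"h1", "h2"}, "maps": {"h3"}},
    Cat={"theses": {"h1": ["title"], "h2": ["title"]},
         "maps": {"h3": ["title"]}},
    Serv={"browsing", "indexing", "searching"},
    Soc=({"users", "suppliers"}, {"orders_from"}),
)
```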


29

Formal Definition of Repository

A repository is formally defined as R = {Ci} (i = 1 to f) with assumed operations get(), store(), del(): R is a family of collections, and get(), store() and del() are fundamental functions for a repository to manipulate collections.

A collection is a set of digital objects: C={do1, do2,..., dok}.
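
A minimal sketch of a repository as a family of collections. The slide does not give the signatures of get(), store() and del(), so the parameters below are assumptions; `del` is a reserved word in Python, which is why the third operation is named `delete` here.

```python
class Repository:
    """R = {C_i}: a family of collections, each a set of digital objects."""

    def __init__(self):
        self._collections = {}

    def store(self, name, digital_objects):
        # Assumption: a collection is addressed by a name.
        self._collections[name] = set(digital_objects)

    def get(self, name):
        return self._collections[name]

    def delete(self, name):
        # Stands for del() in the formal definition.
        del self._collections[name]

    def collections(self):
        return set(self._collections)
```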

30

5S Repository

31

Formal Definition of a Digital Object

A digital object is defined as a tuple do = (h, SM, ST, StructuredStreams) where

• h is an element of H

• SM = {sm1, sm2, ..., smn} is a set of streams

• ST = {st1, st2, ..., stm} is a set of structural metadata specifications

• StructuredStreams = {stsm1, stsm2, ..., stsmp} is a set of functions, defined from the streams in the SM set and the structural metadata specifications in the ST set.

32

Formal Definition of a Digital Object

To resolve the concepts of this definition:

• H is a set of universally unique handles (labels)

• A stream is defined as a sequence of elements of an arbitrary type

• A structural metadata specification is a structure

• A structure is a tuple (G, L, F) where G is a directed graph, L is a set of label values and F is a labelling function.
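
The two definitions can be illustrated with a small sketch. It makes several simplifying assumptions: streams are plain strings, the labelling function F labels nodes only, and each structured stream is represented as a mapping from a node to a segment (stream index, start, end) rather than as a general function. None of these choices come from the formal model.

```python
from dataclasses import dataclass
from typing import Dict, List, Set, Tuple


@dataclass
class Structure:
    nodes: Set[str]                  # vertices of the directed graph G
    edges: Set[Tuple[str, str]]      # directed edges of G
    labels: Set[str]                 # L: set of label values
    labelling: Dict[str, str]        # F: node -> label (simplified)

    def is_valid(self):
        edges_ok = all(a in self.nodes and b in self.nodes
                       for a, b in self.edges)
        labels_ok = all(n in self.nodes and lab in self.labels
                        for n, lab in self.labelling.items())
        return edges_ok and labels_ok


@dataclass
class DigitalObject:
    h: str                                            # handle from H
    SM: List[str]                                     # streams
    ST: List[Structure]                               # structural metadata
    StructuredStreams: Dict[str, Tuple[int, int, int]]

    def segment(self, node):
        """Return the part of a stream that a structure node refers to."""
        index, start, end = self.StructuredStreams[node]
        return self.SM[index][start:end]


# Illustrative object: one character stream with a title and a body.
structure = Structure(
    nodes={"doc", "title", "body"},
    edges={("doc", "title"), ("doc", "body")},
    labels={"document", "heading", "paragraph"},
    labelling={"doc": "document", "title": "heading", "body": "paragraph"},
)
do = DigitalObject(
    h="hdl:example/1",
    SM=["Title. Body text."],
    ST=[structure],
    StructuredStreams={"title": (0, 0, 6), "body": (0, 7, 17)},
)
```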

33

Enlarged Repository Structure


34

Formal Definition of Catalogue

Cat = {DMC1, DMC2, ..., DMCK} where

• DMC is a set of pairs {(h, {dm1, ..., dmkh})}

• C is a collection with k handles in H

• h is an element of H

• dmi is a descriptive metadata specification.
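
One possible reading of this definition, sketched in Python: a metadata catalogue maps every handle of its collection to the descriptive metadata specifications for that object. Representing a specification as a dictionary and the helper names `describes_collection` and `metadata_for` are assumptions made for the example.

```python
from typing import Dict, List, Set

# A metadata catalogue DM_C: handle -> descriptive metadata specifications.
Catalogue = Dict[str, List[dict]]


def describes_collection(dmc: Catalogue, collection: Set[str]) -> bool:
    """True if the catalogue has exactly one entry per handle in C."""
    return set(dmc) == set(collection)


def metadata_for(dmc: Catalogue, handle: str) -> List[dict]:
    return dmc[handle]


# Illustrative data; handles and metadata values are made up.
collection = {"h1", "h2"}
dmc = {
    "h1": [{"title": "Digitale Bibliotheken"}],
    "h2": [{"title": "A formal model"}, {"language": "en"}],
}
```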

35

5S Catalogue


36

Formal Definition of Service

The formal definition of a 5S DL identifies Serv = {Serv1, ..., ServK} as a set of services, containing at least services for browsing, indexing and searching.

Furthermore, a service is defined as a set of scenarios.

A scenario again is, according to the formal definition, a sequence of related transition events.
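
The nesting of services, scenarios and transition events can be sketched as follows. The slides do not define what makes events "related", so the sketch assumes that each event starts in the state in which the previous one ended; the event triple (source state, event name, target state) and all state names are likewise assumptions.

```python
from typing import Dict, List, Tuple

Event = Tuple[str, str, str]        # (source state, event name, target state)
Scenario = List[Event]              # sequence of transition events
Service = List[Scenario]            # a service is a set of scenarios

REQUIRED_SERVICES = {"browsing", "indexing", "searching"}


def is_related_sequence(scenario: Scenario) -> bool:
    """Assumed notion of 'related': consecutive events chain by state."""
    return all(prev[2] == nxt[0] for prev, nxt in zip(scenario, scenario[1:]))


def has_minimum_services(services: Dict[str, Service]) -> bool:
    return REQUIRED_SERVICES <= set(services)


# Illustrative scenario for the act of searching for objects.
search_scenario = [
    ("idle", "submit_query", "query_received"),
    ("query_received", "match_index", "results_ready"),
    ("results_ready", "show_results", "idle"),
]
services = {
    "searching": [search_scenario],
    "browsing": [],
    "indexing": [],
}
```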

37

5S Service


38

Formal Definition of Society

A society Soc = (CM, RS) is a tuple where

• CM = {cm1, cm2, ..., cmn} is a set of conceptual communities, each community referring to a set of individuals of the same class or type

• RS = {rs1, rs2, ..., rsm} is a set of relationships

• each relationship rsj = (ej, ij) is a tuple, where ej is a Cartesian product specifying the communities involved in the relationship, and ij is an activity that describes the interactions or communications among individuals.
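
A small sketch of a society, assuming communities are sets of named individuals and an activity is simply a label. The Cartesian product ej is expanded with `itertools.product`; the helper name `participants` and all example names are hypothetical.

```python
from itertools import product

# CM: conceptual communities, each a set of individuals of the same type.
CM = {
    "users": {"alice", "bob"},
    "suppliers": {"press"},
}

# RS: relationships (e_j, i_j); e_j names the communities whose Cartesian
# product is taken, i_j is the activity among the individuals.
RS = [
    (("users", "suppliers"), "orders_from"),
]


def participants(communities, relationship):
    """Expand the Cartesian product e_j into concrete tuples of individuals."""
    involved, activity = relationship
    member_sets = [sorted(communities[name]) for name in involved]
    return [(members, activity) for members in product(*member_sets)]
```

For the example data, `participants(CM, RS[0])` pairs every user with every supplier under the activity `orders_from`.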

39

5S Society

40

What about the Spaces?

A Space is by definition “a measurable space, measure space, probability space, vector space, topological space or a metric space.” Digital libraries can use the space concepts for many representations, e.g. visualisation of documents, indexing, communication between user and system.

41

UML model of the 5S DL

42

References

• Endres, A.; Fellner, D.W.: Digitale Bibliotheken. Heidelberg: d-punkt, 2000.

• Goncalves, M.A.; Fox, E.A.; Watson, L.T.; Kipp, N.: Streams, Structures, Spaces, Scenarios, Societies (5S): A formal model for digital libraries. Technical report 03-04, Virginia Tech., 2004. Link: http://portal.acm.org/citation.cfm?id=984321.984325