
Flexible and Extensible Digital Object and Repository Architecture (FEDORA)

Sandra Payette
Cornell University

[email protected]

http://www.cs.cornell.edu/payette/presentations/fedora-gdz.ppt

Dritter Workshop der Digitalisierungszentren, October 5, 1999

Cornell Digital Library Research Group

• Computer Science Department
    Bill Arms
    Carl Lagoze
    Sandy Payette
    Naomi Dushay
    David Fielding

• Affiliates
    Anne Kenney (Cornell Library)
    Geri Gay (Human Computer Interaction)
    CNRI

CDLRG - Projects

• Prism (DLI2)

• Fedora

• Harmony (IDL)

• Dienst and NCSTRL

• Electronic Scholarly Publishing
    D-Lib
    Citation Linking (IDL)

Library of Congress

Cornell Digital Library

Digital Library Interoperability

Principles for Digital Library Architecture

• Open Architecture
    functionality partitioned into a set of well-defined services
    services accessible via well-defined protocol

• Modularization
    promotes interoperability
    scalable to different clientele (library, informal web)

• Federation
    enable aggregations into logical collections

• Distribution
    of content and services
    of administration and management

Component-Ware Digital Libraries

[Figure labels: Repository Service (DigitalObjects), Collection Service, Index Service, Name Service (Identifiers), UI Gateway Service, Query Mediator Service, UI]

FEDORA

• Digital Object Model
    container for aggregating any digital material
    disseminations of complex types
    global extensibility mechanisms
    access management

• Repository Service
    service layer for “contained” DigitalObjects
    object lifecycle management
    secure environment
    open interface

FEDORA: Goals

• Distribution - of digital content and services

• Interface Stability - for digital objects

• Interoperability - for digital objects and repositories

• Extensibility - naturally evolving type system

• Flexibility - community-driven type development

• Security - rights management and access control

• Preservation - longevity of digital objects

FEDORA History

• Kahn/Wilensky

• Warwick Framework

• Distributed Active Relationships

• Cornell FEDORA (Lagoze, Payette)

• CNRI Repository (Arms, Blanchi, Overly)

• CNRI/FEDORA - Interoperability Project

• UVA - Complex disseminators, distribution

• Project Prism (DLI2)

FEDORA DigitalObjects can be...

• Simple, familiar entities

• Complex, compound, dynamic objects

[Figure labels: DublinCore, Book, Diary, Future]

FEDORA DigitalObject Model

• Internal DataStream: MIME-typed stream of bytes

• Reference DataStream: service request upon an external source

• Dissemination
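A minimal sketch of the two DataStream kinds, to make the distinction concrete. The class names follow the slide, but the fields, the `fetch` callable standing in for the external service request, the MIME type on the reference stream, and the identifier are all assumptions for illustration, not the FEDORA reference implementation.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, Union


@dataclass
class InternalDataStream:
    """A MIME-typed stream of bytes stored inside the DigitalObject."""
    mime_type: str
    content: bytes

    def read(self) -> bytes:
        return self.content


@dataclass
class ReferenceDataStream:
    """Bytes obtained on demand by a request to an external source."""
    mime_type: str               # assumed: reference streams are typed too
    fetch: Callable[[], bytes]   # hypothetical stand-in for the service request

    def read(self) -> bytes:
        return self.fetch()


DataStream = Union[InternalDataStream, ReferenceDataStream]


@dataclass
class DigitalObject:
    """Container that aggregates DataStreams under one identifier."""
    urn: str
    datastreams: Dict[str, DataStream] = field(default_factory=dict)


obj = DigitalObject(urn="urn:example:obj-1")  # made-up identifier
obj.datastreams["DS1"] = InternalDataStream("application/MARC", b"marc bytes")
obj.datastreams["DS2"] = ReferenceDataStream(
    "application/postscript", fetch=lambda: b"%!PS external copy"
)
```

Both kinds answer `read()`, so a caller does not need to know whether the bytes live inside the object or behind an external service.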

Disseminator Type

A set of behaviors that formally describes the functionality of any global or community-specific notion of content.

Example behaviors: getSection, getArticle, getChapter, getPage, getFrame, getLength
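One way to read "a set of behaviors that formally describes functionality" is as an interface with no implementation. The sketch below models a hypothetical Book type as a Python abstract base class; the method names echo the slide's examples, while the argument and return types are assumptions.

```python
from abc import ABC, abstractmethod


class BookType(ABC):
    """Hypothetical 'Book' Disseminator Type: signatures only, no mechanism."""

    @abstractmethod
    def get_chapter(self, n: int) -> bytes: ...

    @abstractmethod
    def get_page(self, n: int) -> bytes: ...

    @abstractmethod
    def get_length(self) -> int: ...


def behaviors(disseminator_type: type) -> list:
    """List the behavior names that a Disseminator Type declares."""
    return sorted(disseminator_type.__abstractmethods__)
```

Because the type carries no mechanism, it cannot be instantiated on its own; something else has to supply the behavior.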

Disseminator

A generic component that associates a set of behaviors with a DigitalObject.

• Primitive Disseminator: generic behaviors

• Extensible Type Disseminator: extended behaviors

FEDORA DigitalObject

[Figure labels: PrimitiveDisseminator; DataStreams DS1 (application/MARC), DS2 (application/postscript), and four image/gif streams]

Client communicates with generic requests:

• ListDisseminatorTypes returns Book, DublinCore

• GetMethods(Book) returns GetChapter(n), GetPage(n), GetTOC()

• GetDissemination(Book.GetPage(1))

[Figure labels: Book Disseminator (GetChapter, GetTOC, GetPage), DublinCore Disseminator]
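The generic requests can be sketched as three methods that never change, whatever types an object carries. This is an illustration only: the request names come from the slide, but the Python signatures, the toy `BookDisseminator`, and the way `GetMethods` discovers behaviors are assumptions.

```python
class BookDisseminator:
    """Toy Book disseminator over a list of page images (assumed layout)."""

    def __init__(self, pages):
        self._pages = pages

    def GetPage(self, n):
        return self._pages[n - 1]  # pages are numbered from 1

    def GetTOC(self):
        return [f"page {i}" for i in range(1, len(self._pages) + 1)]


class DigitalObject:
    """Exposes only the generic requests; type-specific calls pass through."""

    def __init__(self, disseminators):
        self._disseminators = disseminators  # type name -> disseminator

    def ListDisseminatorTypes(self):
        return sorted(self._disseminators)

    def GetMethods(self, type_name):
        d = self._disseminators[type_name]
        return sorted(m for m in dir(d) if m.startswith("Get"))

    def GetDissemination(self, type_name, method, *args):
        return getattr(self._disseminators[type_name], method)(*args)


obj = DigitalObject({"Book": BookDisseminator([b"GIF-1", b"GIF-2"])})
first_page = obj.GetDissemination("Book", "GetPage", 1)
```

A client that knows only these three requests can discover and invoke `GetPage` without any Book-specific code.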

A Disseminator...

… references a Servlet
    TYPE DESCRIPTION = DublinCore
    SERVLET = cornell.dli2/DC-from-MARC

… to produce non-generic behaviors for the DigitalObject: GetDCField, GetDCRecord

GetMethods(DC) returns GetDCField(Title), GetDCRecord

[Figure labels: DC Disseminator; DataStreams DS1 (application/MARC), DS2 (application/postscript)]
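A rough sketch of what a DC-from-MARC mechanism might do. It assumes the MARC DataStream has already been parsed into a tag-to-value dictionary and maps only two tags (245, title statement; 100, main entry personal name); the function, class, and sample record are hypothetical, not the actual cornell.dli2 servlet.

```python
def dc_from_marc(marc_fields):
    """Hypothetical mechanism: derive Dublin Core fields from MARC tags.

    Assumes the MARC DataStream is already parsed into {tag: value}.
    """
    mapping = {"245": "Title", "100": "Creator"}
    return {dc: marc_fields[tag] for tag, dc in mapping.items() if tag in marc_fields}


class DublinCoreDisseminator:
    """Associates the DublinCore behaviors with one DataStream via a servlet."""

    def __init__(self, datastream, servlet):
        self._datastream = datastream
        self._servlet = servlet

    def GetDCRecord(self):
        return self._servlet(self._datastream)

    def GetDCField(self, name):
        return self.GetDCRecord().get(name)


# Made-up record for illustration
marc = {"245": "An Example Title", "100": "Doe, Jane", "260": "Ithaca"}
dc = DublinCoreDisseminator(marc, dc_from_marc)
```

The DigitalObject stores only MARC; the Dublin Core view is computed on request and is never stored.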

DigitalObject Interface Stability

[Figure labels: Structure, Interface, Mechanism; Disseminator Type; Servlet-1, Servlet-2, Servlet-3]

Mechanisms can be updated or replaced as technology changes ... and the interface to the Digital Object remains stable.
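The stability claim can be illustrated with a disseminator whose public behavior stays fixed while the mechanism behind it is swapped. The `replace_mechanism` method and the two servlet functions are invented for this sketch.

```python
class Disseminator:
    """Binds a stable behavior name to whichever mechanism is current."""

    def __init__(self, mechanism):
        self._mechanism = mechanism

    def replace_mechanism(self, mechanism):
        # Hypothetical maintenance operation; clients never call this.
        self._mechanism = mechanism

    def GetPage(self, n):
        return self._mechanism(n)


def servlet_1(n):  # hypothetical older mechanism
    return f"page {n} rendered by servlet-1"


def servlet_2(n):  # hypothetical replacement mechanism
    return f"page {n} rendered by servlet-2"


book = Disseminator(servlet_1)
before = book.GetPage(1)
book.replace_mechanism(servlet_2)
after = book.GetPage(1)   # same call, new mechanism
```

The client issues the same `GetPage(1)` call before and after the swap.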

DigitalObject Extensibility: Adding New Types

[Figure labels: Structure, Interface, Mechanism; Book, Photo Collection]

The same underlying data ... can be operated on in novel ways … to create new disseminations not originally conceived of for the particular digital object.
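As a sketch of adding a type after the fact, the example below attaches a second disseminator to an object whose DataStreams are left untouched. The `attach` method and the `GetThumbnailList` behavior are hypothetical names chosen for illustration.

```python
class DigitalObject:
    def __init__(self, datastreams):
        self.datastreams = datastreams   # name -> bytes
        self.disseminators = {}          # type name -> {method name: callable}

    def attach(self, type_name, behaviors):
        """Attach a new Disseminator without touching the DataStreams."""
        self.disseminators[type_name] = behaviors

    def get_dissemination(self, type_name, method, *args):
        return self.disseminators[type_name][method](self.datastreams, *args)


images = {"DS1": b"gif-a", "DS2": b"gif-b"}
obj = DigitalObject(dict(images))

# Original view: the images are pages of a book.
obj.attach("Book", {"GetPage": lambda ds, n: ds[f"DS{n}"]})

# Later view, not conceived of originally: the same images as a photo set.
obj.attach("PhotoCollection", {"GetThumbnailList": lambda ds: sorted(ds)})
```

Adding `PhotoCollection` required no change to the stored bytes or to the existing `Book` behaviors.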

Extensibility: a look under the hood

[Figure: a GetDissemination(GetDCRecord) request on a DigitalObject with application/MARC and application/postscript DataStreams returns a DublinCore Record via the DC servlet. The DC Disseminator holds Servlet = URNDC1 and the DC signature. URNDC names the DublinCore Disseminator Type, a Signature (Interface Definition) whose DC Method List is GetDCField and GetDCRecord. URNDC1 names the DublinCore Mechanism (Servlet).]

Proliferation of Disseminator Types

• We use FEDORA DigitalObjects to store Disseminator Signatures and Servlets.

• Type Registration (via name service)
    a Disseminator Type’s global identifier is … the URN of a DigitalObject containing a Signature
    a Servlet’s global identifier is … the URN of a DigitalObject containing a Servlet

Types can be globally recognizable and mechanisms can be shared.
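A small sketch of the registration idea: a disseminator stores two URNs, and a name service resolves each to the DigitalObject that holds the signature or the servlet. The `NameService` class, the dictionary layout, and the `urn:example:` identifiers are assumptions made for illustration.

```python
class NameService:
    """Toy stand-in for a URN resolver: URN -> DigitalObject."""

    def __init__(self):
        self._registry = {}

    def register(self, urn, digital_object):
        if urn in self._registry:
            raise ValueError(f"{urn} is already registered")
        self._registry[urn] = digital_object

    def resolve(self, urn):
        return self._registry[urn]


names = NameService()

# Illustrative URNs; each names a DigitalObject holding one artifact.
names.register("urn:example:DC", {"signature": ["GetDCField", "GetDCRecord"]})
names.register("urn:example:DC1", {"servlet": lambda record: dict(record)})

# A Disseminator stores only the two identifiers.
disseminator = {"type": "urn:example:DC", "servlet": "urn:example:DC1"}

signature = names.resolve(disseminator["type"])["signature"]
servlet = names.resolve(disseminator["servlet"])["servlet"]
```

Any repository that can reach the name service can recognize the type and reuse the same mechanism.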

Interoperable Digital Objects and Repositories

[Figure labels: RAP Client; Name Service (Identifiers); Repositories; Image Database System; Cornell Library Collections; Audio/Visual Archive]

Persistent Identifiers

• In FEDORA, use them for:
    Repositories
    DigitalObjects
    Disseminator Types
    Servlet Mechanisms

• Benefits:
    Ensure uniqueness
    Provide stability (location independence)
    Promote global extensibility
    Promote interoperability

[Figure labels: Identifiers, NameService]

Identifiers - A Brief Primer

IETF Uniform Resource Name (URN) Spec

• Naming Scheme
    The policies and procedures for creating and assigning URNs within a particular domain.

• Resolution System
    A system that translates URNs into their location-specific identifiers (e.g., URLs).

• Registries
    A set of global directories that provide information on which resolution systems can translate any particular URN.
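The three parts can be sketched as a lookup chain: the registry picks a resolution system by namespace, and that system maps the name to locations. The `urn:<nid>:<nss>` split follows the general URN syntax; the `Resolver` class, the registry contents, and the URL are invented for this example.

```python
def parse_urn(urn):
    """Split 'urn:<nid>:<nss>' into namespace identifier and specific string."""
    scheme, nid, nss = urn.split(":", 2)
    if scheme.lower() != "urn":
        raise ValueError("not a URN")
    return nid.lower(), nss  # the scheme and namespace id are case-insensitive


class Resolver:
    """Toy resolution system: name -> list of location-specific identifiers."""

    def __init__(self, table):
        self._table = table

    def resolve(self, nss):
        return self._table[nss]


# Toy registry: which resolution system handles which namespace.
registry = {
    "example": Resolver({"fedora/obj-1": ["http://repo.example.org/obj-1"]}),
}


def resolve(urn):
    nid, nss = parse_urn(urn)
    return registry[nid].resolve(nss)


locations = resolve("URN:Example:fedora/obj-1")
```

Returning a list leaves room for one name to resolve to more than one location.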

Identifiers - Existing Solutions

• CNRI’s Handle System
    good implementation of URN specification
    1 Handle >> one or more locations
    resolves to different data types (URL, IOR, …)

• OCLC’s PURL
    persistent URLs, not really URNs
    1 PURL >> only one location (an HTTP redirect)

• Community-specific Initiatives
    Digital Object Identifier (DOI) - publishers
        • Handle System + Rights Metadata
    PubMedID - Medline
    BibCode - astrophysics journals

FEDORA Status

• Reference Implementation
    CORBA IDL defines open interfaces for Repository Access Protocol (RAP)
    Java/CORBA repository and clients

• Collaborations
    CNRI
        • core design and interoperability
        • complex disseminations (dynamic)
    U of Virginia
        • web integration
        • complex disseminations (e.g., e-texts)

New Research

• DLI2 - Project Prism
    security (associating enforceable policies and mechanisms with DigitalObjects)
    preservation (enable long-term survival of DigitalObjects in distributed environment)

• IDL - Harmony
    aggregation and interaction of multiple, complex metadata sets in DigitalObjects
    RDF and XML

PRISM Security Policy Enforcement

• Challenges
    what is enforceable?
    distributed object environment
    interoperability and extensibility

• Monitor all operations, generic and extended

• Enforce a wide array of policies
    basic security violations
    rights management
    access control

[Figure labels: DataStreams application/MARC and text/x-acl; DC Disseminator (GetDCField, GetDCRecord)]
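"Monitor all operations" can be sketched as a single choke point that every request passes through before it runs. The ACL layout here (operation name mapped to allowed users) is a made-up stand-in for whatever a text/x-acl DataStream would contain, and the class names are hypothetical.

```python
class PolicyViolation(Exception):
    """Raised when a request is not permitted by the object's policy."""


class MonitoredObject:
    """Routes every operation through one policy check before it runs."""

    def __init__(self, operations, acl):
        self._operations = operations   # operation name -> callable
        self._acl = acl                 # operation name -> set of allowed users

    def invoke(self, user, operation, *args):
        # Deny by default: an operation with no ACL entry allows nobody.
        if user not in self._acl.get(operation, set()):
            raise PolicyViolation(f"{user} may not call {operation}")
        return self._operations[operation](*args)


obj = MonitoredObject(
    operations={"GetDCRecord": lambda: {"Title": "t"}},
    acl={"GetDCRecord": {"alice"}},   # illustrative policy
)
record = obj.invoke("alice", "GetDCRecord")
```

Generic and extended behaviors are both reached through `invoke`, so neither can bypass the check.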

PRISM: Preservation

[Figure labels: Handles, Preservation Service, Fedora Repositories]

PRISM: Preservation Policy Enforcement

Preservation Service: monitors DigitalObject state and catches unacceptable or risky transitions.

[Figure labels: preservation metadata; PreserveP; DataStreams DS1, DS2 (application/postscript); Book; Preservation Surrogate Object]
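A minimal sketch of catching risky transitions, assuming object state can be reduced to a label and policy to a set of permitted (from, to) pairs. The state names, the `PreservationMonitor` class, and the identifier are invented for illustration; the slide does not define them.

```python
class RiskyTransition(Exception):
    """Raised when a state change is not covered by the preservation policy."""


class PreservationMonitor:
    """Checks each proposed state change against a set of allowed pairs."""

    def __init__(self, allowed):
        self._allowed = allowed   # set of (from_state, to_state) pairs
        self.caught = []          # record of rejected transitions

    def check(self, urn, old_state, new_state):
        if (old_state, new_state) not in self._allowed:
            self.caught.append((urn, old_state, new_state))
            raise RiskyTransition(f"{urn}: {old_state} -> {new_state}")
        return new_state


# Hypothetical policy: objects may be migrated, but never deleted directly.
monitor = PreservationMonitor({("stored", "migrated"), ("migrated", "stored")})
state = monitor.check("urn:example:obj-1", "stored", "migrated")
```

Rejected transitions are recorded as well as refused, so there is a trail to review later.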

References

• Payette, Blanchi, Lagoze, and Overly: Interoperability for Digital Objects and Repositories: The Cornell/CNRI Experiments, D-Lib Magazine, May 1999. http://www.dlib.org/dlib/may99/payette/05payette.html

• Payette and Lagoze: Flexible and Extensible Digital Object and Repository Architecture (FEDORA), ECDL 1998. http://www.cs.cornell.edu/payette/papers/ECDL98/FEDORA.html

• Lagoze and Payette: An Infrastructure for Open-Architecture Digital Libraries. http://ncstrl.cs.cornell.edu/Dienst/UI/1.0/Display/ncstrl.cornell/TR98-1690

• Daniel, Lagoze, and Payette: A Metadata Architecture for Digital Libraries, IEEE ADL 1998. http://www.cs.cornell.edu/lagoze/papers/ADL98/dar-adl.html

• FEDORA Home Page. http://www.cs.cornell.edu/NCSTRL/CDLRG/FEDORA.html

• Payette: Persistent Identifiers on the Digital Terrain, RLG DigiNews, April 1998, Volume 2, Number 2. http://www.rlg.org/preserv/diginews/diginews22.html