Next Generation Digital Libraries: Supporting Interoperability, Semantics, and Quality Biblioteca Central Universidad Nacional del Sur Bahia Blanca, Argentina May 17-18, 2004 Edward A. Fox [email protected] http://fox.cs.vt.edu


Page 2

Acknowledgements (Selected)

• Sponsors: ACM, Adobe, AOL, IBM, Microsoft, NASA, NLM, NSF, OCLC, SUN, US Dept. of Ed.

• VT Faculty/Staff: Debra Dudley, Weiguo Fan, Gail McMillan, Manuel Perez, Naren Ramakrishnan, Layne Watson, …

• VT Students: Yuxin Chen, Shahrooz Feizabadi, Marcos Gonçalves, Nithiwat Kampanya, S.H. Kim, Bing Liu, Paul Mather, Fernando Das Neves, Unni Ravindranathan, Ryan Richardson, Rao Shen, Ricardo Torres, Wensi Xi, Baoping Zhang, Qinwei Zhu, …

Page 3

Acknowledgements (NDLTD)

• NDLTD Board of Directors, previous Steering Committee + other NDLTD committees; those running Electronic Thesis & Dissertation (ETD) initiatives in universities, regions, countries

• Helpful sponsorship by many organizations, especially Adobe (new initiative!), CONACyT, DFG, FIPSE (US Dept. Education), IBM, Microsoft, NSF (IIS-9986089, 0086227, 0080748, 0325579; DUE-0121679, 0136690, 0121741, 0333601), OCLC, SOLINET, SUN, SURA, UNESCO, VTLS, many governments (Australia, Germany, India, …), …

• Colleagues at Virginia Tech (faculty, staff, students), and collaborators at many universities

• Slides included from: Vinod Chachra, Thom Hickey, Joan Lippincott, Gail McMillan, Axel Plathe, Hussein Suleman, …

Page 4

Other Collaborators (Selected)

• Brazil: FUA, UFMG, UNICAMP

• Case Western Reserve University

• Emory, Notre Dame, Oregon State

• Germany: Univ. Oldenburg

• Mexico: UDLA (Puebla), Monterrey

• College of NJ, Hofstra, Penn State, Villanova

• University of Arizona

• University of Florida, Univ. of Illinois

• University of Virginia

• Endowment: VTLS

Page 5

UNESCO

• Cláudio Menezes [[email protected]]

• Purpose:

  • Reinforce local solutions, commitments

  • Emphasize:

    • ETD does not need many resources.

    • Open source and free software is available.

    • International cooperation can help.

    • Local training is crucial.

• => Inclusion of ETD in practices, processes

• => Schedule for ETD projects

Page 6

Part 4

Next Generation Digital Libraries: Supporting Interoperability, Semantics, and Quality

Page 7

Digital Libraries in Education

• Analytical Survey, ed. Leonid Kalinichenko

• © 2003, www.iite-unesco.org, [email protected]

• Transforming the Way to Learn

• DLs of Educational Resources & Services

• Integrated/Virtual Learning Environment

• Educational Metadata

• Current DLEs: US (NSDL, DLESE, CITIDEL, NDLTD), Europe (Scholnet, Cyclades), UK (Distributed National Electronic Resource)

Page 8

Digital Libraries in Education - 2

• Advanced Frameworks & Methodologies

• Instructional course development with learning module repositories, Learning Object reuse

• Community organization around DLEs

• Other content for science and research

• Cyberinfrastructure, data grids

• Curriculum-based interfaces (see Krowne et al.)

• Concept-based organization of learning materials and courses (CMs, ontologies)

Page 9

DLEs: Future Vision (p. 6)

• Global learning environment of the future:

• Student-centered

• Interactive and dynamic

• Enabling group work on real world problems

• Enabling students to determine their own learning routes (styles, personalization)

• Supporting lifelong learning

Page 10

DLEs: Objectives (p. 11)

• Long-range: lifelong/distance/anytime-anywhere

• Intermediate goals

  • Support for students, teachers, parents

  • Enhanced student performance

  • More students excited about science

  • More Internet-based science educational resources

    • with increased quality and comprehensiveness,

    • easy to discover and retrieve,

    • preserved and universally available

Page 11

DLEs: Guiding Principles (p. 12)

• Driven by educational and science needs

• Facilitating educational innovation

• Stable, reliable, permanent

• Accessible to all

• Leveraging prior research: DL, courseware, …

• Adaptable to new technologies

• Supporting decentralized services

• Resource integration thru tools/organization

Page 12

CITIDEL Technology Features

• Component architecture (Open Digital Library)

  • Re-use and compose re-deployable digital library components.

• Built using open standards & technologies

  • OAI: used to collect DL resources and to support DL interoperability

  • XSL and XML: interface rendering with multilingual, community-based translation of screens and content (Spanish, …)

  • Perl: component integration

  • ESSEX: search engine functionality

    • Very fast, utilizing in-memory processing

    • Includes snapshots for persistence

• Multi-scheming

  • Integrates multiple classifications / views through maps, closure

Page 13

Multi-dimensional Categorization

[Figure: resources categorized along three dimensions. Language: English, Spanish. Topic: Java, Multimedia, Algorithms. Quality: Identified by crawl, Nominated, Editor reviewed, Peer reviewed]

Page 14

PIPE: Personalization by Partial Evaluation

• Interactions at existing web sites are predefined by the site designer

• Personalization is achieved by the designer’s anticipation of users’ expectations

• PIPE allows automatic personalization of a web site without designer anticipation

  • Recognized with the 2001 New Century Technology Council Innovation award

Page 15

CITIDEL + PIPE

• Adds interaction personalization to CITIDEL

• Automatically handles multi-modal conversion to cell phone, PDA, etc.

• Can be adapted to any digital data set; requires only an XML file of the content with its hierarchy maintained.

Page 16

PIPE provides Mixed-Initiative Interaction

• Involves an extra specification window (e.g., a toolbar)

• System-initiated + user-initiated modes of interaction

Traditional browser: the user merely clicks on available hyperlinks.

PIPE window: the user can type in any information out-of-turn.
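
The out-of-turn interaction can be illustrated with a small sketch. It assumes, hypothetically, that a site is a nested dictionary of labeled choices with document lists at the leaves, and that supplying a label out of turn simplifies the hierarchy around that choice. The function name `personalize` and the sample data are illustrative only; this is not PIPE's actual implementation.

```python
def personalize(tree, term):
    """Simplify a browsing hierarchy with respect to a label supplied out of turn.

    `tree` is a nested dict (label -> subtree) with lists of documents as leaves.
    Branches that never offer `term` are pruned, and the level where `term`
    appears is collapsed, since that choice has already been made.
    """
    if not isinstance(tree, dict):
        return None  # reached a leaf without ever seeing the term
    if term in tree:
        return tree[term]  # the choice is made here: keep only what lies below it
    result = {}
    for label, child in tree.items():
        pruned = personalize(child, term)
        if pruned is not None:
            result[label] = pruned
    return result or None


# Hypothetical site hierarchy: language first, then topic.
site = {
    "English": {"Java": ["doc1"], "Algorithms": ["doc2"]},
    "Spanish": {"Java": ["doc3"], "Multimedia": ["doc4"]},
}

# A conventional browser forces the user to pick a language first.
# Typing "Java" out of turn leaves only the language choice open.
print(personalize(site, "Java"))
```

Supplying a top-level label such as "Spanish" behaves like an ordinary click, so the same function covers both the system-initiated and the user-initiated mode.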

Page 17

Features of PIPE

• Applicable to many information system technologies

  • Web sites (even third-party)

  • Digital libraries (currently working on CITIDEL integration)

  • Voice-activated systems (e.g., pizza ordering, movie information, and flight reservation services)

• PIPE is available for licensing and is ready for commercialization, through VTIP

• PIPE has been featured in IEEE Internet Computing, IEEE IT Professional, and the Appian Web Personalization Report.

Page 18

OAI, ODL, DL-in-a-box

• Open Archives Initiative

  • since 1999, www.openarchives.org

• Open Digital Libraries

  • since 2001, from www.dlib.vt.edu

  • with Hussein Suleman (now U. Cape Town)

• DL-in-a-box

  • NSDL support since 2001

  • Aimed to help new collections / services projects

  • http://dlbox.nudl.org

Page 19

Open Archives Initiative (OAI)

• Advocacy for interoperability

• Standard for transferring metadata among digital libraries

  • Protocol for Metadata Harvesting (PMH)

    • Simplicity

    • Generality

    • Extensibility

• Support for PMH => Open Archive (OA)
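
A minimal sketch of what a harvester does with the PMH: build a `ListRecords` request and pull identifiers and titles out of the response. The verb, the `metadataPrefix`, `set`, and `from` arguments, and the XML namespaces are those of OAI-PMH 2.0; the base URL, the sample response, and the function names are assumptions made for illustration, and no network call is made.

```python
from urllib.parse import urlencode
import xml.etree.ElementTree as ET

OAI_NS = "http://www.openarchives.org/OAI/2.0/"
DC_NS = "http://purl.org/dc/elements/1.1/"


def list_records_url(base_url, metadata_prefix="oai_dc", set_spec=None, from_date=None):
    """Build an OAI-PMH ListRecords request URL for a data provider."""
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    if set_spec:
        params["set"] = set_spec
    if from_date:
        params["from"] = from_date  # selective harvesting by datestamp
    return base_url + "?" + urlencode(params)


def extract_records(xml_text):
    """Return (identifier, title) pairs from a ListRecords response."""
    root = ET.fromstring(xml_text)
    records = []
    for record in root.iter(f"{{{OAI_NS}}}record"):
        identifier = record.findtext(f"{{{OAI_NS}}}header/{{{OAI_NS}}}identifier")
        title = next((e.text for e in record.iter(f"{{{DC_NS}}}title")), None)
        records.append((identifier, title))
    return records


# Hypothetical response from a hypothetical data provider.
SAMPLE_RESPONSE = """
<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record>
      <header><identifier>oai:example.org:etd-1</identifier></header>
      <metadata>
        <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
                   xmlns:dc="http://purl.org/dc/elements/1.1/">
          <dc:title>A Sample Thesis</dc:title>
        </oai_dc:dc>
      </metadata>
    </record>
  </ListRecords>
</OAI-PMH>
""".strip()

url = list_records_url("http://example.org/oai", from_date="2004-01-01")
harvested = extract_records(SAMPLE_RESPONSE)
```

A production harvester would also follow `resumptionToken` elements to page through large result sets, which this sketch omits.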

Page 20

OAI = Technical Umbrella for Practical Interoperability…

…that can be exploited by different communities

[Figure: communities under the OAI umbrella: Reference Libraries, Publishers, E-Print Archives, Museums]

Page 21

OAI – Repository Perspective

Required: Protocol

[Figure: a repository holding items labeled DO, each with one or more associated items labeled MDO]

Page 22

OAI – Black Box Perspective

[Figure: seven open archives, OA 1 through OA 7, shown as black boxes]

Page 23

Tiered Model of Interoperability

Mediator services

Metadata harvesting

Document models

Page 24

The World According to OAI

[Figure: service providers (Discovery, Current Awareness, Preservation) connected to data providers through metadata harvesting]

Page 25

[Figure: users on one side and digital objects (programs, documents, images, videos) on the other, with a question mark between them]

Page 26

[Figure: the same digital objects (programs, documents, images, videos) served by a digital library built as a monolithic and/or custom-built web-based application]

Page 27

[Figure: componentized digital library: the same digital objects (programs, documents, images, videos) handled by many components, with the components and their connections marked by question marks]

Page 28

[Figure: open digital library: the same digital objects (programs, documents, images, videos) held in Open Archives (OA), with links among them labeled PMH and XPMH]

Page 29

Open Digital Library Protocol

Extended OAI-PMH

Protocol for Metadata Harvesting

Page 30

Open Digital Library Component

Extended OPEN ARCHIVE

OPEN ARCHIVE

Page 31

Open Digital Library Deployments

• NDLTD (www.ndltd.org)

• Computer Science Teaching Center (www.cstc.org)

• Computing and Information Technology Interactive Digital Educational Library (www.citidel.org)

• Open Archives Distributed (NSF, DFG) – enhancements to PhysNet

• OCKHAM

• Open to others through DL-in-a-box

Page 32

Open Digital Library

• Network of Extended Open Archives in which each node acts as a provider of data, services, or both.

• Component = Node

• Protocol = Arc
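
The node-and-arc view can be sketched as a tiny in-memory network. This is an illustration under stated assumptions, not the ODL code: the `Node` class, its `harvest` method, and the sample records are hypothetical, and a real arc would be an (extended) OAI-PMH request rather than a method call.

```python
class Node:
    """A hypothetical ODL-style component.

    A node may hold its own records (acting as a data provider), harvest from
    upstream nodes (the arcs), and optionally filter what it passes on
    (acting as a service provider).
    """

    def __init__(self, name, records=None, upstream=(), keep=None):
        self.name = name
        self.records = dict(records or {})
        self.upstream = list(upstream)
        self.keep = keep  # optional predicate applied to each record

    def harvest(self):
        """Return this node's view of the records, keyed by identifier."""
        merged = dict(self.records)
        for node in self.upstream:
            merged.update(node.harvest())  # follow the arc to the upstream node
        if self.keep is not None:
            merged = {i: r for i, r in merged.items() if self.keep(r)}
        return merged


# Two hypothetical data providers.
provider_a = Node("A", {"oai:a:1": {"title": "Thesis on IR", "year": 2003}})
provider_b = Node("B", {
    "oai:b:1": {"title": "Thesis on HCI", "year": 1999},
    "oai:b:2": {"title": "Thesis on DLs", "year": 2004},
})

# A union component merges both; a filter component keeps recent records.
union = Node("union", upstream=[provider_a, provider_b])
recent = Node("recent", upstream=[union], keep=lambda r: r["year"] >= 2003)
```

Because every node exposes the same `harvest` interface, components can be chained in any order, which is the property the component/protocol pairing is meant to provide.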

Page 33

Open Digital Library Components

• Running now

  • XML-File (data provider from file system)

  • Search: simple or in-memory (Essex) or generalized

  • Union, browse, recent, filter

  • E-journal/review, Submit, Edit, Annotation

  • Recommender, Rating; Mirroring (see JCDL’02)

  • Working with NCSA: from DB, unstructured text

• Others in process

  • Classification/categorization

  • Registry (and other connections with web services)

Page 34

Example Open Digital Library

ETD DL for the Networked Digital Library of Theses and Dissertations (www.ndltd.org)

[Figure: ETD collections (ETD-1 through ETD-4, shown among programs, documents, images, and videos) are linked by PMH to components labeled ODLUnion, ODLSearch, ODLBrowse, ODLRecent, and Filter; these support the Search, Browse, and Recent functions of the user interface used by students and researchers]

Page 35

Open Digital Library: Extended

[Figure: extended ODL architecture, with the following labeled components]

• XML File Coll. & Data Provider 1, 2, 3

• Harvest from data providers

• DBUnion: Archive Merger Component

• DBBrowse: Browse Engine, as Metadata Browse Service Provider

• IRDB-1: Search Engine, as Metadata Search Service Provider

• What’s New Engine, as What’s New Service Provider

• OAI-PMH Data Provider

• Submit Archive

• OAIB (NCSA: from RDBMS)

• Filter

• Recommend, Rate Engine, as Recommend & Rate Service Provider

• Annotation Engine

• IRDB-2: Search Engine, as Annotation Search Service Provider

Page 36

New ODL Component: Generalized Search Platform

CS6604

Client: Patrick Fan, Wensi Xi

Group Members: Ming Luo, Rui Yang, Xiaoyan Yu

Page 37

Introduction

• Background

  • The importance of search service in a digital library

• Problems of search engines in DLRL

  • IRDB: Low search effectiveness, insufficient parsing component

  • ESSEX: Less scalability due to in-memory index

  • MARIAN: Low search efficiency

Page 38

Algorithms

• Phrase Searching Algorithms

  • Adjacency of terms

• Ranking Functions

  • Okapi (baseline)

  • GP-based ranking function
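
Phrase searching by adjacency of terms is commonly done with a positional index: a document matches when the phrase terms occur at consecutive positions. The sketch below shows that general technique with whitespace tokenization; the function names and sample documents are illustrative assumptions, not the platform's actual implementation.

```python
from collections import defaultdict


def build_positional_index(docs):
    """Map each term to {doc_id: [positions]} using simple whitespace tokens."""
    index = defaultdict(lambda: defaultdict(list))
    for doc_id, text in docs.items():
        for position, term in enumerate(text.lower().split()):
            index[term][doc_id].append(position)
    return index


def phrase_search(index, phrase):
    """Return the ids of documents in which the phrase terms are adjacent."""
    terms = phrase.lower().split()
    if not terms or any(term not in index for term in terms):
        return set()
    hits = set()
    for doc_id, starts in index[terms[0]].items():
        for start in starts:
            # The k-th following term must sit exactly k positions later.
            if all(start + k in index[term].get(doc_id, ())
                   for k, term in enumerate(terms[1:], start=1)):
                hits.add(doc_id)
                break
    return hits


docs = {
    "d1": "open digital library components",
    "d2": "digital open library",
    "d3": "the digital library",
}
index = build_positional_index(docs)
matches = phrase_search(index, "digital library")
```

Document d2 contains both terms but not next to each other, so it is excluded from `matches`.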

Page 39

Genetic Programming (GP)

• A problem-solving system designed based on principles of evolution and heredity

[Figure: training data (ranked lists with relevance feedback) are the input to ranking function discovery, whose output is a ranking function f]

Ranked lists shown in the figure:

Order | Doc. | Rele.
1 | A | 1
2 | D | 1
3 | F | 1
4 | G | 1
5 | B | 0
6 | C | 0
7 | E | 0

Order | Doc. | Rele.
1 | A | 1
2 | B | 0
3 | C | 0
4 | D | 1
5 | E | 0
6 | F | 1
7 | G | 1
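
Ranking function discovery needs a way to score how good a candidate ranking is against the relevance feedback. The slide does not say which measure is used; the sketch below assumes average precision, a common choice, and applies it to the two ranked lists on this slide (relevance values in rank order). The function and variable names are illustrative.

```python
def average_precision(relevance):
    """Average precision of a ranked list given 0/1 relevance in rank order."""
    hits = 0
    total = 0.0
    for rank, is_relevant in enumerate(relevance, start=1):
        if is_relevant:
            hits += 1
            total += hits / rank  # precision at this relevant document
    return total / hits if hits else 0.0


# Relevance columns of the two lists on the slide.
list_adfgbce = [1, 1, 1, 1, 0, 0, 0]  # order A, D, F, G, B, C, E
list_abcdefg = [1, 0, 0, 1, 0, 1, 1]  # order A, B, C, D, E, F, G

print(average_precision(list_adfgbce))
print(average_precision(list_abcdefg))
```

The list that places all four relevant documents first scores higher, so a GP search using this measure as fitness would favor ranking functions that produce that ordering.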

Page 40

An Example of GP-based RF

(log (+ (* df (log (log (* (* (/ n df) (* (* (/ n df) (* (* df_max_Col tf) (+ df_max_Col tf_avg))) (* (/ tf tf_max) (log tf_avg_Col)))) (* (/ (* (* (/ n df) (* (* df_max_Col tf) (+ df_max_Col tf_avg))) (* (/ tf tf_max) (log tf_avg_Col))) (+ (* length df) tf_avg_Col)) (log tf_avg_Col)))))) (+ (* (* df_max_Col tf) (/ (* (* (/ (/ (* tf 6.720) (/ df N)) (* df_max_Col tf)) (* (* tf N) (+ df_max_Col tf_avg))) (* (/ tf tf_max) (log tf_avg_Col))) (+ (* length df) (* (* (/ tf tf_max) (+ (* length df) (* 2.812 1))) tf_avg)))) (+ (/ df tf_avg) tf))))

Variable | Meaning | Type
tf | Query term frequency in the document | vector
tf_query | Query term frequency in the query | vector
tf_max | The maximum term frequency in a document | scalar
Length | Document length in the number of words | scalar
Length_avg | Average document length in the number of words | scalar
N | Number of documents in the collection | scalar
tf_avg | Average term frequency in the current document | scalar
tf_avg_Col | Average term frequency for all the documents in the collection | scalar
df_max_Col | Maximum document frequency for a word in the collection | scalar
df | Document frequency for the query words | vector
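
A ranking function written in this prefix notation can be evaluated with a small recursive interpreter. The sketch below handles the four operators that appear in the example (`+`, `*`, `/`, `log`). It assumes protected division and logarithm, a common GP convention that the slide does not confirm, and it treats `tf` and `df` as scalars for a single query term rather than as vectors. The short expression and the statistics are made up for illustration.

```python
import math


def tokenize(text):
    """Split a prefix expression into parentheses and symbols."""
    return text.replace("(", " ( ").replace(")", " ) ").split()


def parse(tokens):
    """Turn a token list into nested lists, numbers, and variable names."""
    token = tokens.pop(0)
    if token == "(":
        node = []
        while tokens[0] != ")":
            node.append(parse(tokens))
        tokens.pop(0)  # drop the closing ")"
        return node
    try:
        return float(token)
    except ValueError:
        return token  # a variable name such as tf or df


def safe_log(x):
    # Assumed protected logarithm: defined for zero and negative inputs.
    return math.log(abs(x)) if x != 0 else 0.0


def safe_div(a, b):
    # Assumed protected division: returns 1.0 when the divisor is zero.
    return a / b if b != 0 else 1.0


def evaluate(node, env):
    """Evaluate a parsed expression given variable values in `env`."""
    if isinstance(node, float):
        return node
    if isinstance(node, str):
        return env[node]
    op, *args = node
    values = [evaluate(arg, env) for arg in args]
    if op == "log":
        return safe_log(values[0])
    if op == "+":
        return values[0] + values[1]
    if op == "*":
        return values[0] * values[1]
    if op == "/":
        return safe_div(values[0], values[1])
    raise ValueError(f"unknown operator: {op}")


# A short tf-idf-like expression with hypothetical statistics for one term.
expr = parse(tokenize("(+ (* tf (log (/ N df))) 1)"))
score = evaluate(expr, {"tf": 3.0, "N": 100.0, "df": 10.0})
```

Scoring a document for a multi-term query would apply the expression per term and combine the results, for example by summing.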

Page 41

Parser

• Flexibility

  • TREC Style SGML/HTML

  • Configurable tagging

• Abbreviation and number detection

• Case sensitive

• Phrase parsing

Page 42

Interface – (I)

1. Receive user query

2. Send query to search engine

3. Get ranked list

4. Search database

5. Get document information

6. Return results to user

[Figure: a Servlet receives the query (1) and returns results (6); it talks to the Search Engine over a Socket (2, 3) and to the Database over JDBC (4, 5)]
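
The six steps can be sketched as a single request handler that keeps ranking and document lookup separate. This is a language-neutral illustration: the slide's servlet, socket, and JDBC layers are replaced here by a plain function, a callable, and a dictionary, and all names and sample data are hypothetical.

```python
def handle_query(query, search_engine, database):
    """Steps 1-6: take a query, rank with the engine, enrich from the database."""
    ranked_ids = search_engine(query)          # steps 2-3: send query, get ranked list
    results = []
    for doc_id in ranked_ids:                  # steps 4-5: look up document information
        info = database.get(doc_id)
        if info is not None:
            results.append({"id": doc_id, **info})
    return results                             # step 6: return results to the user


def toy_search_engine(query):
    """Stand-in for the search engine: returns document ids in rank order."""
    ranked = {"digital": ["d2", "d1"], "library": ["d1"]}
    return ranked.get(query.lower(), [])


toy_database = {
    "d1": {"title": "Open Digital Libraries"},
    "d2": {"title": "Digital Preservation"},
}

results = handle_query("Digital", toy_search_engine, toy_database)
```

Keeping the engine behind a narrow interface that returns only ranked identifiers is what lets the same engine be wrapped later by a different front end.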

Page 43

Interface – (II)

As an ODL component

1. Receive user query through ODL’s XOAI searching protocol

2. Send query to search engine

3. Get ranked list

4. Request metadata

5. Get metadata

6. Return results in a format complying with ODL’s searching protocol

[Figure: a Perl Adaptor receives the query (1) and returns results (6); it talks to the Search Engine over a Socket (2, 3) and to an OAI data provider for metadata (4, 5)]

Page 44

OCKHAM Initiative, Contact Info

• Supported by DL Federation, Mellon, NSF, …

• P2P University Network involving:

  • Emory, Notre Dame, U. Arizona, Virginia Tech, …

• PI: Martin Halbert

  • Phone 404-727-2204

  • Email: [email protected]

• OCKHAM URL: http://ockham.library.emory.edu

Page 45

The Problem

• Digital library development is complex and expensive.

• Various DL development communities (in the USA at least) are not working together well.

• Results exhibit much incompatibility, little common practice, slow progress, and no leverage on investment.

• If this continues, we are just going to languish and fester.

Page 46

Lightweight Protocols

• “Lightweight” (relatively small and simple) protocols seem to have clear advantages over “full” protocols that attempt to be comprehensive.

• The success of protocols considered lightweight is illuminating.

• Examples: TCP/IP, HTTP, LDAP, and the OAI PMH

Page 47

Reference Models

• Reference Model: a common vocabulary and description of components, services, and inter-relationships that comprise a system under consideration

• Useful as a tool to foster consensus and common understanding in a time of rapid change and/or disagreement

• Explored in CS6604 class project with 2 focus groups: librarians, education experts

Page 48

Current Focus: Peer-to-Peer (P2P) Lightweight (Protocol) Reference Models

• Builds on successful example of the OAI PMH, clearly understood minimalist concept of metadata distribution, implemented in simple protocols (e.g., ODL)

• Leads to developing simple reference models of specific subsystems, with associated simple protocols and standards

• Testing in NSDL, connecting university libraries to support teaching & learning

Page 49

OCKHAM Proposed Services

• Alerting

• Browsing

• Cataloging

• Conversion

• OAI – Z39.50

• Pathfinding

• Registry – prototype in CS6604 now

• (plus others such as from adapted ODL)

Page 50

DL Student Research: Gonçalves

• 5S as a basis for developing digital libraries

• Theory

• Syntax, Semantics; Definitions, Relationships

• Specification of requirements

• Generation of systems

• Quality

Page 51

Motivation for 5S

• DLs are not benefiting from formal theories as have other CS fields: DB, IR, PL, etc.

• DL construction: difficult, ad-hoc, lacking support for tailoring/customization

• Conceptual modeling, requirements analysis, and methodological approaches are rarely supported in DL development.

  • Lack of specific DL models, formalisms, languages

Page 53

5S Layers

Societies

Scenarios

Spaces

Structures

Streams

Page 54

5S Model: Examples, Objectives

Model | Examples | Objectives
Streams | Text; video; audio; image | Describes properties of the DL content such as encoding and language for textual material or particular forms of multimedia data
Structures | Collection; catalog; hypertext; document; metadata; organization tools | Specifies organizational aspects of the DL content
Spaces | Measure; measurable, topological, vector, probabilistic | Defines logical and presentational views of several DL components
Scenarios | Searching, browsing, recommending | Details the behavior of DL services
Societies | Service managers, learners, teachers, etc. | Defines managers, responsible for running DL services; actors, that use those services; and relationships among them

Page 55

Intra-Model Relationships: Streams

• Participant concepts: {text, image, video, audio}

• Relations:

  • $\mathit{contains} \subseteq (\mathit{video} \times \mathit{image}) \cup (\mathit{video} \times \mathit{audio})$

• Streams define the basic content types over which digital objects are built; the latter being the ultimate carriers of the information in the DL.

• However, some complex types of streams (e.g., video) may themselves be associated with simpler types of streams (e.g., images, audio).

• This relation indicates that a video contains an image as one of its frames or a specific audio recording.

Page 56

[Figure: 5S concept map. Streams (text, audio, image, video), Structures, Spaces (Measure, Measurable, Metric, Top, Pr, Vec), Scenarios, and Societies are linked by relations including describes, stores, is_version_of, extends, reuses, executes, participates_in, recipient, runs, inherits_from/includes, association, uses, employs, produces, belongs_to, contains, is_a, precedes, happens_before, redefines, and invokes. Other node labels: do, ms, mss, R, C, DMc, Ic, Se, Sc, e, SM, Ac, op]

Page 57

DL Services/Activities Taxonomy (Gonçalves)

• Infrastructure Services

  • Repository-Building

    • Creational: Acquiring, Cataloging, Crawling (focused), Describing, Digitizing, Federating, Harvesting, Purchasing, Submitting

    • Preservational: Conserving, Converting, Copying/Replicating, Emulating, Renewing, Translating (format)

  • Add Value: Annotating, Classifying, Clustering, Evaluating, Extracting, Indexing, Measuring, Publicizing, Rating, Reviewing (peer), Surveying, Translating (language)

• Information Satisfaction Services: Browsing, Collaborating, Customizing, Filtering, Providing access, Recommending, Requesting, Searching, Visualizing


Services, Definitions, Parameters

• In the table, each service is characterized by the parameters (input, output) of the initial and final events of the scenarios that compose it, together with the respective pre- and post-conditions, which are represented as rules on DL relations.

• All previous definitions and keys apply here.

• That set is complemented with the following definitions:


Services Related Definitions

• A query q is the representation of user interest or information need.

• Hyptxt is a hypertext, in which an anchor is a node.

• A log_entry is a descriptive metadata specification about an event of a scenario.

• Let $\{do_i\} = \{do_{i1}, do_{i2}, \ldots, do_{in}\}$ be a set of digital objects and $Ct = \{c_1, c_2, \ldots, c_n\}$ be a set of category labels. A classifier $class_{Ct}: \{do_i\} \to 2^{Ct}$ is a function that maps a digital object to a set of categories.

• A cluster $clu_k = \{do_{1k}, do_{2k}, \ldots, do_{nk}\}$ is a subset of a set of digital objects.
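The classifier and cluster definitions above can be illustrated with a small sketch. The category labels, the keyword rule, and all function names are hypothetical and serve only to show the shapes involved: a classifier returns a subset of the label set, and a cluster is a subset of the objects.

```python
from typing import Callable, FrozenSet, Iterable, List, Set, Tuple

# Hypothetical category labels (the set Ct); not taken from the slides.
CATEGORIES: FrozenSet[str] = frozenset({"archaeology", "retrieval", "theses"})


def keyword_classifier(digital_object: str) -> Set[str]:
    """Toy classifier: return every category label that occurs as a word in the object."""
    words = set(digital_object.lower().split())
    return {label for label in CATEGORIES if label in words}


def classify_all(
    objects: Iterable[str], classifier: Callable[[str], Set[str]]
) -> List[Tuple[str, Set[str]]]:
    """Pair each digital object with the set of categories assigned to it."""
    return [(do, classifier(do)) for do in objects]


def cluster_for(objects: Iterable[str], label: str) -> Set[str]:
    """Build one cluster: the subset of objects whose categories include `label`."""
    return {do for do in objects if label in keyword_classifier(do)}


# Digital objects are represented here simply by short text strings.
objects = ["theses on archaeology", "retrieval models", "campus map"]
pairs = classify_all(objects, keyword_classifier)
archaeology_cluster = cluster_for(objects, "archaeology")
```

An object may receive several labels or none, which is why the classifier's codomain is the power set of the labels rather than the label set itself.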


| Service | User input | Other service input | Output |
|---|---|---|---|
| Acquiring | {doi} | Ci | Cj |
| Browsing | anchor | Hyptxt | {doi} |
| Cataloging | doi, msi_k | (hi, mssi_m) | (hi, mssi_(m+k)) |
| Classifying | {doi} | classCt, Ct | {(doi, {ck_i})} |
| Clustering | {doi} | X | {cluk_i} |
| Expanding (query) | {doi} | IC_i, qi | qj |
| Indexing | Ci | X | IC_i |
| Linking | Ci | X | Hyptxtik |
| Logging | X | ei({pi}) | log_entryj |
| Rating | doi, acj | X | {(doi, acj, rk)} |
| Searching | q, Ci | IC_i | {dok} |
| Visualizing | {doi} | tfrk | spik |


[Figure: inputs and outputs of fundamental DL services. Infrastructure services shown: Authoring, Submitting, Acquiring, Describing, Cataloguing, Indexing, Linking. Information satisfaction services shown: Searching, Browsing. Other labels: Society, actor, user interests/needs, query, anchor, criteria, sortOrder, Universal Collection, Ci, DMCi, Ic, Hypertext, dok, mskj, {doi}.]


DL Services I/O Behavior

• The prior figure shows:
  • Instantiations of the “Services Definition” model
  • Inputs and outputs of example infrastructure and information satisfaction DL services

• Key:
  • CDL = collection
  • ICDL = index for collection CDL
  • {doi} = set of digital objects
  • Soc = society


[Figure: fundamental and composite DL services. Information satisfaction services shown: Searching, Browsing, Recommending, Filtering, Binding, Visualizing, Expanding query. Infrastructure (Add_Value) services shown: Rating/Reviewing (peer), Training. Other labels: Society, actor, query, anchor, criteria, sortOrder, Ck, {doi}, user model/expr, Classifier/expr, {doj}, {doR}, {doF}, bi, spV, query’, fundamental, composite.]


Defining Quality in Digital Libraries

| DL Concept | Dimensions of Quality |
|---|---|
| Digital object | Accessibility, Pertinence (*), Preservability (*), Relevance, Similarity, Significance, Timeliness (*) |
| Metadata specification | Accuracy, Completeness, Conformance |
| Collection | Completeness, Impact Factor |
| Catalog | Completeness, Consistency |
| Repository | Completeness, Consistency |
| Structures for Navigation | Navigability (*) |
| Services | Composability, Efficiency, Effectiveness, Extensibility, Reusability, Reliability |


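[Figure: life cycle diagram relating DL activities to quality dimensions. Activity labels: Authoring, Modifying, Organizing, Indexing, Storing, Archiving, Networking, Accessing, Filtering, Searching, Browsing, Recommending, Retention, Mining, Discard. Phase and state labels: Creation, Distribution, Discovery, Utilization, Usage, Active, Semi-Active, Inactive. Quality labels: Accuracy, Completeness, Conformance, Similarity, Relevance, Timeliness, Accessibility, Preservability. Other labels: Reputation, Desirability.]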


Completeness of Metadata (1)

• Degree of completeness of a metadata specification $ms_x$:

• $\text{Completeness}(ms_x) = 1 - \dfrac{\text{no. of missing attributes in } ms_x}{\text{total no. of attributes of the schema to which } ms_x \text{ conforms}}$

• Conformance follows the 5S definition.


Completeness of Metadata (2)

• Example of application: OCLC NDLTD Union Archive
  • Average completeness of all metadata specifications (records)
  • in the NDLTD Union Archive
  • administered by OCLC
  • as of February 23, 2004
  • with respect to the Dublin Core metadata standard (15 attributes)
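As a rough illustration of the completeness measure and its average over an archive, the sketch below scores records against the 15 elements of simple Dublin Core. Treating an absent key or an empty value as a missing attribute is an assumption of this sketch, and the function names and sample records are hypothetical.

```python
from typing import Iterable, Mapping, Sequence

# The 15 elements of the simple Dublin Core element set.
DUBLIN_CORE = (
    "title", "creator", "subject", "description", "publisher",
    "contributor", "date", "type", "format", "identifier",
    "source", "language", "relation", "coverage", "rights",
)


def completeness(record: Mapping[str, str], schema: Sequence[str] = DUBLIN_CORE) -> float:
    """Return 1 - (missing attributes / total attributes in the schema).

    Assumption: an attribute counts as missing when its key is absent
    or its value is empty.
    """
    missing = sum(1 for attribute in schema if not record.get(attribute))
    return 1 - missing / len(schema)


def average_completeness(
    records: Iterable[Mapping[str, str]], schema: Sequence[str] = DUBLIN_CORE
) -> float:
    """Average the per-record completeness over a collection of records."""
    scores = [completeness(record, schema) for record in records]
    if not scores:
        raise ValueError("at least one record is required")
    return sum(scores) / len(scores)


# Hypothetical records: one sparse, one with every element filled in.
sparse_record = {"title": "A thesis", "creator": "A. Student", "date": "2004"}
full_record = {element: "x" for element in DUBLIN_CORE}
```

Averaging per record, as here, weights every record equally regardless of which institution contributed it; grouping the records by contributor first would give per-institution scores instead.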


Completeness of Metadata (3)

[Figure: bar chart of average metadata completeness, on a scale from 0 to 1, per contributing institution. Horizontal axis labels: GWUD, LSU, VTETD, MIT, UBC, PHYSNET, VTINDIV, VANDERBILT, NCSU, USASK, PITT, HKU, HUMBOLT, OCLC, BGMYU, DRESDEN, VIENNA, GATECH, ETSU, USF, MUENCHEN, UTENN, CCSD, WATERLOO, NSYSU, LAVAL, UPSALLA, CALTECH, UCL, WagUniv.]


Collection Completeness (1)

• Defn: A complete DL collection $C_x$ is one that contains all the pertinent existing digital objects.

• $\text{completeness}(C_x) = |C_x| \,/\, |\text{ideal collection}|$
  • that is, the ratio between the size of $C_x$ and the size of the ideal real-world collection
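A minimal sketch of this ratio follows, with collections modeled as sets of identifiers. The identifiers are hypothetical, and the sketch applies the formula literally, so it assumes the collection holds no objects outside the ideal collection (otherwise the ratio could exceed 1).

```python
from typing import AbstractSet


def collection_completeness(collection: AbstractSet[str], ideal: AbstractSet[str]) -> float:
    """Return |collection| / |ideal collection|, as in the definition above."""
    if not ideal:
        raise ValueError("the ideal collection must not be empty")
    return len(collection) / len(ideal)


# Hypothetical identifiers standing in for digital objects.
ideal_collection = {"doc-1", "doc-2", "doc-3", "doc-4"}
my_collection = {"doc-1", "doc-3"}
ratio = collection_completeness(my_collection, ideal_collection)
```

In practice the ideal collection is unknown, so a large reference collection has to stand in for it.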


Collection Completeness (2)

• Example of use: computing collections
  • The ACM Guide is a collection of bibliographic references and abstracts of works published by ACM and other publishers.
  • The Guide can be considered a good approximation of an ideal computing collection: it contains most of the different types of computing-related literature (about 735K works).


| Collection | Degree of Completeness |
|---|---|
| ACM Guide | 1 |
| DBLP | 0.652 |
| CITIDEL (DBLP + ACM + NCSTRL + NDLTD-CS) | 0.467 |
| IEEE-DL | 0.168 |
| ACM-DL | 0.146 |


Reliability (1)

• Scope: operations of DL

• Defn: the probability that the service will not fail during a given period of time [Hansen83]

• Example of use: CITIDEL services

• Example details: using log analysis April 1


Reliability (2)

| CITIDEL service | No. of failures / no. of accesses | Reliability |
|---|---|---|
| searching | 73/14370 | 0.994 |
| browsing | 4130/153369 | 0.973 |
| requesting (getobject) | 1569/318036 | 0.995 |
| structured search | 214/752 | 0.66 |
| contributing | 0/980 | 1 |
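The figures above are consistent with estimating reliability from logs as 1 minus the failure ratio. That reading is an inference from the table, not a formula stated on the slides. The sketch below applies it to the tabulated counts; the function name is hypothetical. Printed values can differ from the table in the last digit depending on rounding, and the structured search row evaluates to about 0.715 rather than the listed 0.66, so one of those figures may have been recorded differently in the source.

```python
def reliability(failures: int, accesses: int) -> float:
    """Estimate reliability from log counts as 1 - failures / accesses."""
    if accesses <= 0:
        raise ValueError("accesses must be positive")
    if not 0 <= failures <= accesses:
        raise ValueError("failures must be between 0 and accesses")
    return 1 - failures / accesses


# (failures, accesses) pairs copied from the table above.
LOG_COUNTS = {
    "searching": (73, 14370),
    "browsing": (4130, 153369),
    "requesting (getobject)": (1569, 318036),
    "structured search": (214, 752),
    "contributing": (0, 980),
}

for service, (failures, accesses) in LOG_COUNTS.items():
    print(f"{service}: {reliability(failures, accesses):.3f}")
```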


Extensibility, Reusability (1)

• Scope: Design and Implementation of DL services

• Two main classes:
  1. Composability of services:
     • Extensibility
     • Reusability
  2. Quality aspects of models and implementations:
     • completeness, consistency, correctness, soundness


Extensibility, Reusability (2)

• $\text{Micro-Reusability}(Serv) = \dfrac{\sum_{sm_x \in SM,\ se_i \in Serv,\ sm_x \text{ runs } se_i} LOC(sm_x) \cdot reused(se_i)}{\sum_{sm \in SM} LOC(sm)}$,
  • where LOC is the number of lines of code of a service manager

• $\text{Macro-Reusability}(Serv) = \dfrac{\sum_{se_i \in Serv} reused(se_i)}{|Serv|}$, where reused is an indicator function defined as:
  • 1, if $\exists\, se_j : se_j \text{ reuses } se_i$;
  • 0, otherwise
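The two measures can be sketched as follows. To keep the example small, it assumes each service is run by exactly one service manager and that each manager runs only that service, so the manager's LOC is stored directly on the service. The `Service` class, its field names, and the sample values are all hypothetical.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class Service:
    name: str
    manager_loc: int  # lines of code of the service manager that runs this service
    reused: bool      # True if some other service reuses this one


def macro_reusability(services: Sequence[Service]) -> float:
    """Fraction of services that are reused by at least one other service."""
    if not services:
        raise ValueError("at least one service is required")
    return sum(1 for s in services if s.reused) / len(services)


def micro_reusability(services: Sequence[Service]) -> float:
    """Fraction of all service-manager LOC that belongs to reused services."""
    total_loc = sum(s.manager_loc for s in services)
    if total_loc == 0:
        raise ValueError("total LOC must be positive")
    return sum(s.manager_loc for s in services if s.reused) / total_loc


# Hypothetical services, not taken from any system described in the slides.
services = [
    Service("searching", manager_loc=300, reused=True),
    Service("browsing", manager_loc=100, reused=False),
    Service("annotating", manager_loc=100, reused=True),
]
macro = macro_reusability(services)
micro = micro_reusability(services)
```

The two numbers can diverge: macro-reusability counts services, while micro-reusability weights them by code size, so a few large reused components can yield a high micro value even when most services are not reused.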


Extensibility, Reusability (3)

• Example: ETANA-DL

• Consider:
  • Services
  • Use of existing ODL component
  • Lines of Code (LOC)
    • Reused from component
    • Added for implementation


| Service | Component based | LOC for implementing service | LOC reused from component | Total LOC |
|---|---|---|---|---|
| Searching – Back-end | Yes | - | 1650 | 1650 |
| Search Wrapping | No | 100 | - | 100 |
| Recommending | Yes | - | 700 | 700 |
| Recommend Wrapping | No | 200 | - | 200 |
| Annotating – Back-end | Yes | 50 | 600 | 600 |
| Annotate Wrapping | No | 50 | - | 50 |
| Union Catalog | Yes | - | 680 | 680 |
| User Interface Service | No | 1800 | - | 1600 |
| Browsing | No | 1390 | - | 1390 |
| Comparing (objects) | No | 650 | - | 650 |
| Marking Items | No | 550 | - | 550 |
| Items of Interest | No | 480 | - | 480 |
| Recent Searches/Discussions | No | 230 | - | 230 |
| Collections Description | No | 250 | - | 250 |
| User Management | No | 600 | - | 600 |
| Framework Code | No | 2000 | - | 2000 |
| Total | | 8280 | 3630 | 11910 |


Extensibility, Reusability (5)

• Macro-Reusability(ETANA-DL services) = 3/13 = 0.23
  • Only a few important services are componentized.

• Micro-Reusability = 3630/11910 = 0.304
  • We can reuse a very significant percentage of DL code by implementing common DL services as components.


Review of Gonçalves Achievements in Past Year

• Book Chapters

1. Fox, E. A., Gonçalves, M. A., Luo, M., Chen, Y., Krowne, A., Zhang, B., McDevitt, K., Pérez-Quiñones, M., Cassel, L. N. Harvesting: Broadening the Field of Distributed Information Retrieval. In Multimedia Distributed Information Retrieval, eds. Fabio Crestani, Mark Sanderson, and Jamie Callan, 2003.

2. Fox, E., McMillan, G., Suleman, H., Gonçalves, M., Networked Digital Library of Theses and Dissertations. Invited chapter for “Digital Libraries: Policy, Planning, and Practice”, eds. Judith Andrews and Derek Law, Ashgate Publishing, 2003

• Journal papers

1. 5S TOIS paper (April 2004 issue)

2. S. Perugini, M. A. Gonçalves, and E. A. Fox. A Connection-Centric Survey of Recommender Systems Research. Journal of Intelligent Information Systems, June 2004.

3. Zhu, Q., Gonçalves, M. A., Fox, E. A.. 5SGraph: A Domain-Specific Visual Modeling Tool for Digital Libraries. Journal of the American Society for Information Science and Technology, submitted 2003, in revision

4. Baoping Zhang, Marcos Andre Goncalves, Yuxin Chen, Edward A. Fox, and Pavel Calado, "Combining Support Vector Machines and Structural Rules for Effective Filtering of OAI-Based Repositories", submitted to Journal of Digital Libraries (Springer Verlag) Special Issue on Asian Digital Libraries, 2004


• Conference papers

1. Pável P. Calado, Marcos André Gonçalves, Edward A. Fox, Berthier Ribeiro-Neto, Alberto H. F. Laender, Altigran S. da Silva, Davi C. Reis, Pablo A. Roberto, Monique V. Vieira, and Juliano P. Lage. The Web-DL Environment for Building Digital Libraries from the Web. JCDL'2003, Third Joint ACM / IEEE-CS Joint Conference on Digital Libraries, May 27-31, 2003, Houston.

2. Marcos André Gonçalves, Ganesh Panchanathan, Unnikrishnan Ravindranathan, Aaron Krowne, Edward A. Fox, Filip Jagodzinski, and Lillian Cassel. The XML Log Standard for Digital Libraries: Analysis, Evolution, and Deployment. Proc. JCDL'2003, Third Joint ACM / IEEE-CS Joint Conference on Digital Libraries, May 27-31, 2003, Houston.

3. Qinwei Zhu, Marcos André Gonçalves, Rao Shen, Lillian Cassel, Edward A. Fox. Visual Semantic Modeling of Digital Libraries. ECDL'2003, 7th European Conference on Research and Advanced Technology for Digital Libraries, 17-22 August, 2003, Trondheim, Norway.

4. Rohit Kelapure, Marcos André Gonçalves, Edward A. Fox. Scenario-Based Generation of Digital Library Services. ECDL'2003, 7th European Conference on Research and Advanced Technology for Digital Libraries, 17-22 August, Trondheim, Norway

5. Marco Cristo, Pavel Calado, Edleno Moura, Nivio Ziviani, Berthier Ribeiro-Neto, and Marcos André Gonçalves. Combining Link-Based and Content-Based Methods for Web Document Classification. CIKM 2003, 3-8 November, New Orleans, Louisiana, USA, 2003.

6. Baoping Zhang, Marcos Andre Goncalves, and Edward A. Fox. An OAI-based Filtering Service for CITIDEL from NDLTD. ICADL 2003, 6th International Conference of Asian Digital Libraries, 8-11 December, Kuala Lumpur, Malaysia, 2003

7. U. Ravindranathan, R. Shen, M. A. Goncalves, W. Fan, E. A. Fox, and J. W. Flanagan. ETANA-DL: A Digital Library for Integrated Handling of Heterogeneous Archaeological Data. To be presented at ACM-IEEE Joint Conference on Digital Libraries (JCDL 2004), Tucson, AZ, June 7-11, 2004.


• Conference papers (continued)

8. U. Ravindranathan, R. Shen, M. A. Goncalves, W. Fan, E. A. Fox, and J. W. Flanagan. ETANA-DL: A Digital Library for Integrated Handling of Heterogeneous Archaeological Data. To be presented at ACM-IEEE Joint Conference on Digital Libraries (JCDL 2004), Tucson, AZ, June 7-11, 2004.

9. M. A. Goncalves, E. A. Fox, A. Krowne, P. Calado, A. H. F. Laender, A. S. da Silva, and B. Ribeiro-Neto. The Effectiveness of Automatically Structured Queries in Digital Libraries. To be presented at ACM-IEEE Joint Conference on Digital Libraries (JCDL 2004), Tucson, AZ, June 7-11, 2004.

10. Alberto H. F. Laender, M. A. Goncalves, Pablo A. Roberto. BDBComp: Building a Digital Library for the Brazilian Computer Science Community. To be presented at ACM-IEEE Joint Conference on Digital Libraries (JCDL 2004), Tucson, AZ, June 7-11, 2004.

11. U. Ravindranathan, R. Shen, M. A. Goncalves, W. Fan, E. A. Fox, and J. W. Flanagan. Prototyping Digital Libraries Handling Heterogeneous Data Sources - The ETANA-DL Case Study. European Conference on Digital Libraries (ECDL 2004), Bath, UK, September 12-17, 2004. (submitted)

• Other publications

1. R. da S. Torres, C. B. Medeiros, M. A. Goncalves, and E. A. Fox. An OAI-based Digital Library Framework for Biodiversity Information Systems. Department of Computer Science, Virginia Tech, Technical Report No. TR-04-01, 2004.

2. R. da S. Torres, C. B. Medeiros, M. A. Goncalves, and E. A. Fox. An OAI Compliant Content-Based Image Search Component. Demo to be presented at ACM-IEEE Joint Conference on Digital Libraries (JCDL 2004), Tucson, AZ, June 7-11, 2004.

3. R. da S. Torres, C. B. Medeiros, Renata Q. Dividino, Mauricio A. Figueiredo, M. A. Goncalves, E. A. Fox, and R. Richardson. Using Digital Library Components for Biodiversity Systems. Poster to be presented at ACM-IEEE Joint Conference on Digital Libraries (JCDL 2004), Tucson, AZ, June 7-11, 2004.

4. U. Ravindranathan, R. Shen, M. A. Goncalves, W. Fan, E. A. Fox, and J. W. Flanagan. ETANA-DL: Managing Complex Information Applications – An Archaeology Digital Library. Demo to be presented at ACM-IEEE Joint Conference on Digital Libraries (JCDL 2004), Tucson, AZ, June 7-11, 2004.

5. Qinwei Zhu, Marcos André Gonçalves, E. Fox. 5SGraph Demo: A Graphical Modeling Tool for Digital Libraries. Proc. JCDL'2003, Third Joint ACM / IEEE-CS Joint Conference on Digital Libraries, May 27-31, 2003, Houston.


Proposed Outline of Dissertation (Marcos André Gonçalves)

• Chapter 1 – Introduction and Motivation
• Chapter 2 – Background and Related Work
• Chapter 3 – Streams, Structures, Spaces, Scenarios and Societies: the 5S Formal Model for Digital Libraries
• Chapter 4 – Towards a Digital Library Theory: A Formal Digital Library Ontology based on 5S
• Chapter 5 – Applications of the 5S Model/Ontology
  • 5.1 Declarative Specification of DLs: the 5S Language
  • 5.2 Semantic Visual Modeling of DLs: the 5SGraph Tool
  • 5.3 (Semi-) Automatic Generation of Componentized DLs: The 5SGen Tool
  • 5.4 Evaluating DLs: The XML Log Standard for DLs
  • 5.5 Formally comparing Architectures: Fedora and Buckets (time permitting)
• Chapter 6 – Defining Quality in Digital Libraries
• Chapter 7 – Conclusions and Future Work
• Appendix 1 – Mathematical Preliminaries


Questions/Discussion?