
1 Vienna University of Technology (Vienna – 21 Sept 2007) “From information retrieval to digital libraries to computer science education” Edward A. Fox


Vienna University of Technology (Vienna – 21 Sept 2007)

“From information retrieval to digital libraries to computer science education”

Edward A. Fox

[email protected] http://fox.cs.vt.edu

• Dept. of Computer Science, Virginia Tech

• Blacksburg, VA 24061 USA


“From information retrieval to digital libraries to computer science education”

• ABSTRACT: Information is a fundamental human need. The field of information retrieval has helped address this need since the 1960s, with a range of models and systems. A broad view of this field leads to digital libraries, a re-definition of the concepts, systems, and human involvement in sharing information across time and space, supported by digital technologies. We can formalize and better operationalize this through the 5S framework, which addresses information with regard to Societies, Scenarios, Spaces, Structures, and Streams. This approach has supported our work with personalization and computer science syllabi, curriculum development regarding digital libraries, and ensuring that college graduates are prepared not only to live in, but also to help build our future cyberinfrastructure, i.e., for Living In the KnowlEdge Society (LIKES). This talk will summarize our related research and education innovation.


Acknowledgements (selected)

• Colleagues: Lillian Cassel, Debra Dudley, Weiguo Fan, Marcos Gonçalves, Doug Gorton, Rohit Kelapure, Neill Kipp, Aaron Krowne, Ming Luo, Uma Murthy, Manuel Perez, Ananth Raghavan, Rao Shen, Hussein Suleman, Srinivas Vemuri, Layne Watson, …

• Sponsors: ACM, AOL, CAPES, DFG, Google, IBM, IMLS, INL, Microsoft, NSF (CCF-0722259; IIS-9986089, 0080748, 0086227, 0307867, 0325579, 0535057, 0535060, 0736055; DUE-0121679, 0121741, 0136690, 0333531, 0333601, 0435059, 0532825), SUN, …


Acknowledgements - Mentors

• J.C.R. Licklider – undergrad advisor (1969-71)
– Author in 1965 of “Libraries of the Future”
– Before, at ARPA, funded start of Internet

• Michael Kessler – BS thesis advisor
– Project TIP (technical information project)
– Defined bibliographic coupling

• Gerard Salton – graduate advisor (1978-83)
– “Father of Information Retrieval”


Information Retrieval: Algorithms and Heuristics, 2nd Ed.

• By David A. Grossman & Ophir Frieder

• Kluwer Academic Publishers


Document Retrieval (Grossman & Frieder Fig. 1.1)


Vector Space Model – 2 terms (Grossman & Frieder Fig. 2.2)
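
As a rough illustration of the two-term vector space model, the sketch below ranks documents by the cosine of the angle between each document vector and the query vector. The term weights and document names are made up for illustration, and cosine similarity is one common choice of measure rather than something the figure prescribes.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length term-weight vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an all-zero vector matches nothing
    return dot / (norm_a * norm_b)

# Hypothetical weights for (term 1, term 2)
query = (1.0, 1.0)
docs = {"d1": (2.0, 0.0), "d2": (1.0, 3.0), "d3": (3.0, 3.0)}

# Rank documents from most to least similar to the query
ranking = sorted(docs, key=lambda d: cosine_similarity(query, docs[d]), reverse=True)
```

Here `d3` points in the same direction as the query, so it ranks first regardless of its length.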


Language Model (Grossman & Frieder Fig. 2.5)


Document-Term-Query Inference Network (Grossman & Frieder Fig. 2.7)


Inference Network Layers (Grossman & Frieder Fig. 2.8)


Relevance Feedback Process (Grossman & Frieder Fig. 3.1)
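
One widely used way to implement a relevance feedback step in the vector space model is Rocchio-style query reweighting. The slide does not name a method, so the sketch below is an assumption chosen for illustration; the `alpha`, `beta`, and `gamma` defaults and the sample vectors are likewise illustrative.

```python
def rocchio(query, relevant, nonrelevant, alpha=1.0, beta=0.75, gamma=0.15):
    """Return a reweighted query vector from user-judged documents."""
    def centroid(vectors):
        if not vectors:
            return [0.0] * len(query)
        return [sum(col) / len(vectors) for col in zip(*vectors)]

    rel_c = centroid(relevant)
    non_c = centroid(nonrelevant)
    # Move toward the relevant centroid and away from the non-relevant one
    new_query = [alpha * q + beta * r - gamma * n
                 for q, r, n in zip(query, rel_c, non_c)]
    return [max(0.0, w) for w in new_query]  # clip negative weights to zero

updated = rocchio([1.0, 0.0],
                  relevant=[[1.0, 1.0], [0.0, 1.0]],
                  nonrelevant=[[1.0, 0.0]])
```

The updated query gains weight on the second term because both relevant documents contain it.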


Information Life Cycle

• Authoring / Modifying

• Organizing / Indexing

• Storing / Retrieving

• Distributing / Networking

• Retention / Mining

• Accessing / Filtering

• Using / Creating


Asynchronous, Digital Library Mediated Scholarly Communication

Different time and/or place


DLs Shorten the Chain to

Author

Reader

Digital Library

Editor

Reviewer

Teacher

Learner

Librarian


DL Definitions - 1

• “A digital library is an organized and focused collection of digital objects, including text, images, video, and audio, along with methods of access and retrieval, and for selection, creation, organization, maintenance, and sharing of the collection.”

• Witten & Bainbridge – “How to Build a Digital Library” – Morgan Kaufmann 2003


DL Definitions - 2

• “Digital libraries are organizations that provide the resources, including the specialized staff, to select, structure, offer intellectual access to, interpret, distribute, preserve the integrity of, and ensure the persistence over time of collections of digital works so that they are readily and economically available for use by a defined community or set of communities”

• Waters, D.J. CLIR Issues, July/August 1998

• www.clir.org/pubs/issues/issues04.html


DL Definitions - 3

• Issues and Spectra

– Collection vs. Institution

– Content vs. System

– Access vs. Preservation

– “Free” vs. Quality

– Managed vs. Comprehensive

– Centralized vs. Distributed


DL Definitions - 4

• NOT a “digitized library”

• NOT a “deconstruction” of existing systems and institutions, moving them to an electronic box in a Library

• IS a new way to deal with knowledge
– Authoring, Self-archiving, Collecting,
– Organizing, Preserving,
– Accessing, Propagating, Re-using


Digital Library Content

Content Types

• Text Documents: Articles, Reports, Books

• Video, Audio: Speech, Music

• Geographic Information: (Aerial) Photos

• Software, Programs: Models, Simulations

• Bio Information: Genome (Human, animal, plant)

• Images and Graphics: 2D, 3D, VR, CAT


Informal 5S & DL Definitions

DLs are complex systems that

• help satisfy info needs of users (societies)

• provide info services (scenarios)

• organize info in usable ways (structures)

• present info in usable ways (spaces)

• communicate info with users (streams)


Hypotheses

• A formal theory for DLs can be built based on 5S.

• The formalization can serve as a basis for modeling and building high-quality DLs.


5S Framework

• “Streams”

- All types of (multimedia) content (as well as communications and flows over networks, or into sensors, or sense perceptions; data stream management systems)

• “Structures”

- Organizational schemes (including data structures, databases, and knowledge representations – taxonomies, ontologies)


5S Framework

• “Spaces”

- 2D and 3D interfaces, GIS data, representations of documents and queries

• “Scenarios”

- System states and events, but also can represent situations of use by human users (or machine processes, yielding services or transformations of data)

• “Societies”

- Both software “service managers” and fairly generic “actors” who could be (collaborating) human (users).


5Ss: Examples and Objectives

• Streams
– Examples: Text; video; audio; image
– Objectives: Describes properties of the DL content such as encoding and language for textual material or particular forms of multimedia data

• Structures
– Examples: Collection; catalog; hypertext; document; metadata
– Objectives: Specifies organizational aspects of the DL content

• Spaces
– Examples: Measure; measurable, topological, vector, probabilistic
– Objectives: Defines logical and presentational views of several DL components

• Scenarios
– Examples: Searching, browsing, recommending
– Objectives: Details the behavior of DL services

• Societies
– Examples: Service managers, learners, teachers, etc.
– Objectives: Defines managers, responsible for running DL services; actors, that use those services; and relationships among them


ETANA-DL

• Archaeological DL

• Integrated DL
– Heterogeneous data handling

• Applies and extends the OAI-PMH
– Open Archives Initiative Protocol for Metadata Harvesting

• Design considerations
– Componentized
– Extensible
– Portable


ETANA Societies

1. Historic and pre-historic societies (being studied)

2. Archaeologists (in academic institutes, fieldwork settings, or local and national governmental bodies)

3. Project directors

4. Technical staff (consisting of photographers, technical illustrators, and their assistants)

5. Field staff (responsible for the actual work of excavation)

6. Camp staff (e.g., camp managers, registrars, tool stewards)

7. General public (e.g., educators, learners, citizens)


ETANA Societies

• Social issues

1. Who owns the finds?

2. Where should they be preserved?

3. What nationality and ethnicity do they represent?

4. Who has publication rights?

5. What interactions took place between those at the site studied, and others? What theories are proposed by whom about this?


ETANA Scenarios

1. Life in the site in former times

2. Digital recording: the planning stage and the excavation stage

3. Planning stage: remote sensing, fieldwalking, field surveys, building surveys, consulting historical and other documentary sources, and managing the sites and monuments

4. Excavation
   1. Detailed information is recorded, including for each layer of soil, and for features such as pole holes, pits, and ditches.
   2. Data about each artifact is recorded together with information about its exact find spot.
   3. Numerous environmental and other samples are taken for laboratory analysis, and the location and purpose of each is carefully recorded.
   4. Large numbers of photographs are taken, both general views of the progress of excavation and detailed shots showing the contexts of finds.

5. Organization and storage of material

6. Analysis and hypotheses generation and testing

7. Publications, museum displays

8. Information services for the general public


ETANA Spaces

1. Geographic distribution of found artifacts

2. Temporal dimension (as inferred by archaeologists)

3. Metric or vector spaces
   1. used to support retrieval operations, and to calculate distance (and similarity)
   2. used to browse / constrain searches spatially

4. 3D models of the past, used to reconstruct and visualize archaeological ruins

5. 2D interfaces for human-computer interaction


ETANA Structures

1. Site Organization
   1. Region, site, partition, sub-partition, locus,

2. Temporal orderings (ages, periods)

3. Taxonomies
   1. for bones, seeds, building materials, …

4. Stratigraphic relationships
   1. above, beneath, coexistent
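
To make the stratigraphic "above / beneath" structure concrete, the sketch below treats it as a directed graph and derives one possible deposition order, earliest layer first. The locus names and the relation list are hypothetical, and the use of Python's `graphlib` (3.9+) is an illustrative choice, not part of ETANA-DL.

```python
from graphlib import TopologicalSorter

# Hypothetical loci; each pair (upper, lower) records that `upper` lies above `lower`.
above = [("L1", "L2"), ("L2", "L4"), ("L1", "L3"), ("L3", "L4")]

# TopologicalSorter maps each node to its predecessors. A lower layer was
# deposited earlier, so it is a predecessor of every layer directly above it.
graph = {}
for upper, lower in above:
    graph.setdefault(upper, set()).add(lower)
    graph.setdefault(lower, set())

# One valid earliest-to-latest ordering consistent with all "above" relations
deposition_order = list(TopologicalSorter(graph).static_order())
```

Layers with no recorded relation between them (here `L2` and `L3`) may appear in either order, which loosely corresponds to the "coexistent" case being undetermined by the data.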


ETANA Streams

1. successive photos and drawings of excavation sites, loci, unearthed artifacts

2. audio and video recordings of excavation activities and discussions

3. textual reports

4. 3D models used to reconstruct and visualize archaeological ruins.


5S and DL formal definitions and compositions (April 2004 TOIS)

[Figure: dependency diagram of the formal definitions; labels give definition numbers (d.N) as printed on the slide]

• Mathematical preliminaries: relation (d.1), function (d.2), sequence (d.3), tuple (d.4)*, language (d.5), graph (d.6), grammar (d.7)

• 5S: streams (d.9), structures (d.10), spaces (d.18), scenarios (d.21), societies (d.24)

• Spaces: measurable (d.12), measure (d.13), probability (d.14), vector (d.15), topological (d.16)

• Supporting constructs: event (d.10), state (d.18), services (d.22), transmission (d.23)

• DL constructs: structural metadata specification (d.25), descriptive metadata specification (d.26), structured stream (d.29), digital object (d.30), collection (d.31), metadata catalog (d.32), repository (d.33), indexing service (d.34), searching service (d.35), hypertext (d.36), browsing service (d.37), digital library (minimal) (d.38)


Fox & Gonçalves Book Outline

• Ch. 1. Introduction (Motivation, Synopsis)

• Part 1 – The “Ss”

• Part 2 – Higher DL Constructs

• Part 3 – Advanced Topics

• Appendix


Book Parts and Chapters - 1

• Ch. 1. Introduction (Motivation, Synopsis)

• Part 1 – The “Ss”

– Ch. 2: Streams

– Ch. 3: Structures

– Ch. 4: Spaces

– Ch. 5: Scenarios

– Ch. 6: Societies


Book Parts and Chapters - 2

• Part 2 – Higher DL Constructs

– Ch. 7: Collections

– Ch. 8: Catalogs

– Ch. 9: Repositories and Archives

– Ch. 10: Services

– Ch. 11: Systems

– Ch. 12: Case Studies


Book Parts and Chapters - 3

• Part 3 – Advanced Topics
– Ch. 13: Quality
– Ch. 14: Integration
– Ch. 15: How to build a digital library
– Ch. 16: Research Challenges, Future Perspectives

• Appendix
– A: Mathematical preliminaries
– B: Formal Definitions: Ss
– C: Formal Definitions: DL terms, Minimal DL
– D: Formal Definitions: Archeological DL
– E: Glossary of terms, mappings


Chapter 3: (Degree of) Structure

Chaotic Organized Structured

Web DLs DBs


Digital Objects (DOs)

• Born digital

• Digitized version of “real” object
– Is the DO version the same, better, or worse?
– Decision for ETDs: structured + rendered

• Surrogate for “real” object
– Not covered explicitly in metamodel for a minimal DL
– Crucial in metamodel for archaeology DL


Metadata: Complex to Simple

MARC ($50) Dublin Core (DC)

+thesis
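
To give a feel for the "simple" end of the metadata spectrum, here is a minimal sketch that builds a flat Dublin Core record with Python's standard library. The namespace URI is the Dublin Core element set namespace; the wrapper element name `record` and all field values are assumptions made up for illustration.

```python
import xml.etree.ElementTree as ET

DC_NS = "http://purl.org/dc/elements/1.1/"
ET.register_namespace("dc", DC_NS)  # serialize elements with the dc: prefix

def dublin_core_record(fields):
    """Build a flat XML record with one dc:* element per (name, value) pair."""
    record = ET.Element("record")  # wrapper element name is an assumption
    for name, value in fields:
        child = ET.SubElement(record, f"{{{DC_NS}}}{name}")
        child.text = value
    return record

record = dublin_core_record([
    ("title", "A Sample Thesis"),   # hypothetical values
    ("creator", "Doe, Jane"),
    ("type", "Text"),
    ("date", "2007"),
])
xml_text = ET.tostring(record, encoding="unicode")
```

Every element is optional and repeatable, which is much of what makes Dublin Core simpler to produce than a full MARC record.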


Also Important: Epub, SGML, XML

• 5S perspective: streams, structures, scenarios

• Authoring

• Rendering, presenting

• Tagging, Markup, DOM

• Semi-structured information

• Dual-publishing, eBooks

• Styles (XSL, XSLT)

• Structured queries


Chapter 4 Overview (Spaces)

• Retrieval models

– Boolean, extended Boolean

– Vector, LSI

– Probabilistic: classical, belief network, inference network, language models

• User interfaces and visualization – cont’d
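
As a small illustration of the simplest retrieval model in the list above, the sketch below answers Boolean AND and OR queries from an inverted index. The three-document collection and whitespace tokenization are simplifying assumptions.

```python
from collections import defaultdict

def build_index(docs):
    """Map each lowercase term to the set of document ids containing it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for term in text.lower().split():
            index[term].add(doc_id)
    return index

def boolean_and(index, *terms):
    """Documents containing every term."""
    postings = [index.get(t.lower(), set()) for t in terms]
    return set.intersection(*postings) if postings else set()

def boolean_or(index, *terms):
    """Documents containing at least one term."""
    return set().union(*(index.get(t.lower(), set()) for t in terms))

docs = {  # hypothetical collection
    1: "digital library services",
    2: "information retrieval models",
    3: "digital information retrieval",
}
index = build_index(docs)
both = boolean_and(index, "digital", "retrieval")
either = boolean_or(index, "library", "models")
```

Boolean retrieval returns an unranked set; the vector and probabilistic models listed above add a ranking over the matching documents.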


User interfaces and visualization

• 2D interfaces

• 3D interfaces

• GIS

• Other paradigms: trees, graphs, bubbles, coordinated views, …

• Stepping Stones and Pathways
– http://fox.cs.vt.edu/SSP/


Chapter 6 Overview (Societies)

• User communities
– Authors, editors, teachers, students, readers
– Personal(ization), group(ware), community, global
– Accessibility, universal access

• Librarians: reference, acquisition, operations

• Research community
– Associations, conferences, publications, labs, projects

• Economics
– Copyright, intellectual property rights, digital rights management, authorization, authentication, security, privacy, self-archiving (eprints)
– Publishers, catalogers, distributors, sustainability
– Open source, commercial, hybrid


Chapter 9 Archives & Repositories

• Open Archives Initiative (OAI)

• Institutional Repositories

• Persistent storage of digital objects

• Coupling of metadata with digital objects

• Use of “handles” as identifiers for digital objects

• Put, get, harvest
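
The put / get / harvest operations can be sketched with a toy in-memory repository that couples each digital object with its metadata under a handle. All class, method, and handle names here are illustrative assumptions, not the interface of any particular repository system.

```python
from dataclasses import dataclass, field

@dataclass
class Repository:
    """Toy in-memory repository keyed by handle (all names are illustrative)."""
    objects: dict = field(default_factory=dict)   # handle -> content
    metadata: dict = field(default_factory=dict)  # handle -> metadata dict

    def put(self, handle, content, metadata):
        """Store a digital object together with its metadata."""
        self.objects[handle] = content
        self.metadata[handle] = dict(metadata)

    def get(self, handle):
        """Return the stored content for a handle."""
        return self.objects[handle]

    def harvest(self, since=None):
        """Return (handle, metadata) pairs, optionally only those dated on or after `since`."""
        return [
            (h, md) for h, md in sorted(self.metadata.items())
            if since is None or md.get("date", "") >= since
        ]

repo = Repository()
repo.put("hdl:example/1", b"thesis pdf bytes", {"title": "Thesis A", "date": "2006-05-01"})
repo.put("hdl:example/2", b"report bytes", {"title": "Report B", "date": "2007-09-21"})
recent = repo.harvest(since="2007-01-01")
```

Note that `harvest` exposes only metadata, while `get` returns the object itself; dates are compared as ISO-format strings for simplicity.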


OAI - Open Archives Initiative

• Advocacy for interoperability

• Standard for transferring metadata among digital libraries
– Protocol for Metadata Harvesting (PMH)

• Simplicity

• Generality

• Extensibility

• Support for PMH => Open Archive (OA)
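
A PMH request is an ordinary HTTP request whose query string carries a `verb` plus arguments. The sketch below only builds such a URL and does not contact a server. The six verbs and the `metadataPrefix` and `from` arguments come from the OAI-PMH specification; the base URL and the helper function are hypothetical.

```python
from urllib.parse import urlencode

OAI_VERBS = {"Identify", "ListMetadataFormats", "ListSets",
             "ListIdentifiers", "ListRecords", "GetRecord"}

def oai_request_url(base_url, verb, **arguments):
    """Build an OAI-PMH request URL from a verb and protocol arguments."""
    if verb not in OAI_VERBS:
        raise ValueError(f"unknown OAI-PMH verb: {verb}")
    params = {"verb": verb}
    for name, value in arguments.items():
        # `from` is a Python keyword, so callers pass it as `from_`
        params[name.rstrip("_")] = value
    return f"{base_url}?{urlencode(params)}"

url = oai_request_url(
    "http://repository.example.org/oai",  # hypothetical base URL
    "ListRecords",
    metadataPrefix="oai_dc",
    from_="2007-01-01",
)
```

A harvester would issue this request and parse the XML response, following any resumption token the repository returns.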


OAI – Repository Perspective

Required: Protocol

[Diagram: repository contents shown as items labeled DO and MDO]


OAI – Black Box Perspective

[Diagram: seven open archives, labeled OA 1 through OA 7]


Tiered Model of Interoperability

Mediator services

Metadata harvesting

Document models


Institutional Repositories - 1

• “Institutional repositories are digital collections that capture and preserve the intellectual output of a single university or a multiple institution community of colleges and universities.”

• Crow, R. “Institutional repository checklist and resource guide”, SPARC, Washington, D.C., USA

• www.arl.org/sparc/IR/IR_Guide_v1.pdf


Institutional Repositories - 2

• “A university-based institutional repository is a set of services that a university offers to the members of its community for the management and dissemination of digital materials created by the institution and its community members. It is most essentially an organizational commitment to the stewardship of these digital materials, including long-term preservation where appropriate, as well as organization and access or distribution.”

• Lynch, C.A. In ARL Bimonthly Report 226, pp. 1-7, Feb. 2003, www.arl.org/newsltr/226/ir.html


What is a Digital Object Repository?

Also called: digital rep., digital asset rep., institutional repository

• Stores and maintains digital objects (assets)

• Provides external interface for Digital Objects
– Creation, Modification, Access

• Enforces access policies

• Provides for content type disseminations

Adapted from Slide by V. Chachra, VTLS


Goals of Institutional Repositories (by Stevan Harnad, U. Southampton)

• Self Archiving of Institutional Research
– Theses and Dissertations (VTLS NDLTD Project)
– Article preprints and post prints
– Internal documents and maps

• Management of digital collections

• Preservation of materials – decentralized approach

• Housing of teaching materials

• Electronic Publishing of journals, books, posters, maps, audio, video and other multimedia objects

Adapted from Slide by V. Chachra, VTLS


Chapter 10 Services

• Taxonomy of services

• Ontology, composition, reuse

• Evaluation

• Key services in-depth:
– Crawling, indexing
– Clustering, classifying
– Recommending, using social networks
– Logging


[Figure: taxonomy of DL services]

• Infrastructure Services
– Repository-Building, Creational: Acquiring, Cataloging, Crawling (focused), Describing, Digitizing, Federating, Harvesting, Purchasing, Submitting
– Repository-Building, Preservational: Conserving, Converting, Copying/Replicating, Emulating, Renewing, Translating (format)
– Add Value: Annotating, Classifying, Clustering, Evaluating, Extracting, Indexing, Measuring, Publicizing, Rating, Reviewing (peer), Surveying, Translating (language)

• Information Satisfaction Services: Browsing, Collaborating, Customizing, Filtering, Providing access, Recommending, Requesting, Searching, Visualizing


Ontology: Applications

• Expand definition of minimal DL by characterizing
– typical DL services
– in the context of “employs” and “produces” relationships

• Use characterization to:
– Reason about how DL services can be built from other DL components
– As well as be composed with other services through extension or reuse


[Figure: 5S-based DL ontology; node and edge labels recovered from the diagram]

• Streams: text, audio, image, video

• Structures: digital object, Collection, Catalog, Repository, Index, metadata specifications

• Spaces: Measurable, Measure, Probabilistic, Metric, Topological, Vector

• Scenarios: Service, Scenario, event, operation

• Societies: Service Manager, Actor

• Relationships: describes, stores, contains, is_a, is_version_of/cites/links_to, extends, reuses, redefines, invokes, executes, participates_in, recipient, runs, uses, association, inherits_from/includes, precedes, happens_before, employs, produces


Ontology: Applications


[Figure: service composition diagram; edges are labeled p and e (cf. the “produces” and “employs” relationships)]

• Information Satisfaction Services: Searching, Browsing, Requesting, Recommending, Filtering, Binding, Visualizing, Expanding query

• Infrastructure Services (Add_Value): Rating, Training, Indexing

• Service kinds: fundamental, composite

• Other labels: Society, actor, query, anchor, handle, Collection, {digital object}, user model, query/category, binder, space, query’, {(digital object, actor, rate)}, classifier, Index, transformer


5S and Generating DLs

• 5S Framework

• 5S definitions, services taxonomy, ontology

• 5SL (specification language)

• 5SGraph (to prepare 5SL)

• 5SGen (for DL development, incl. DSpace)

• SchemaMapper for development of union DL


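[Figure: DL development pipeline; labels recovered from the diagram]

• Phases: Requirements, Analysis, Design, Implementation, Test

• Labels: 5S, 5SL, OO Classes, Workflow, Components, DL, Evaluation, 5SGraph, 5SLGen, Formal Theory/Metamodel, DL XML Log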


Chapter 11 Systems: Architectural Issues

• Independent system vs. part of federation

• Centralized vs. distributed vs. open services

• Monolithic vs. modular vs. componentized

• Topologies: bus vs. star vs. hierarchical vs. network

• Decompositions vary
– search engine, browser, DBMS, MM support
– repository, handle server, client
– information resources + mediators, bus or agent collection + client with workspace/environment


Also Important: Agents

• 5S perspective: societies, streams, spaces, scenarios, structures

• Protocols: light-weight

• Knowledge interchange: mediators, wrappers

• Negotiation, registries

• Distributed issues

• Webbots (automatic indexing)

• Ontologies (standard upper)


Fedora™ Digital Object Architecture

• Persistent ID (PID)
– Globally unique persistent id

• Disseminators
– Public view: access methods for obtaining “disseminations” of digital object content

• System Metadata
– Internal view: metadata necessary to manage the object

• Datastreams
– Protected view: content that makes up the “basis” of the object
– EAD, TEI, DC, MARC, VRA Core, MIX, etc.
– Images, E-books, E-journals, Music, Video, etc.

The Mellon Fedora Project

Adapted from Slide by V. Chachra, VTLS
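
The four-part object described on this slide can be sketched as a small data structure. This is an illustration only and is not the Fedora API: the class, field names, PID, and dissemination methods are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict

@dataclass
class DigitalObject:
    """Illustrative object with the four parts named on the slide (not the Fedora API)."""
    pid: str                                                  # persistent identifier
    system_metadata: Dict[str, str] = field(default_factory=dict)   # internal view
    datastreams: Dict[str, bytes] = field(default_factory=dict)     # protected view
    disseminators: Dict[str, Callable] = field(default_factory=dict)  # public view

    def disseminate(self, method):
        """Run a named access method over the object's datastreams."""
        return self.disseminators[method](self)

obj = DigitalObject(
    pid="demo:1",  # hypothetical PID
    system_metadata={"state": "active"},
    datastreams={"DC": b"<dc/>", "IMAGE": b"\x89PNG..."},
)
# Hypothetical dissemination methods registered on the object
obj.disseminators["list_items"] = lambda o: sorted(o.datastreams)
obj.disseminators["get_item_size"] = lambda o: len(o.datastreams["IMAGE"])
```

Clients only ever call `disseminate`, so the stored datastreams can change format without changing the public access methods.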


Example Disseminators

• Persistent ID (PID)

• Disseminators
– Default: Get Profile, List Items, Get Item, List Methods, Get DC Record
– Simple Image: Get Thumbnail, Get Medium, Get High, Get VeryHigh

• System Metadata

• Datastreams


Fedora™ Repository

[Figure: Fedora repository architecture; labels recovered from the diagram]

• Clients: Client Application, Batch Program, Server Application, Web Browser (HTTP, SOAP)

• Web Service Exposure Layer: Manage, Access, Search, OAI Provider

• Management Subsystem: Component Mgmt, Object Mgmt, Object Validation, PID Generation

• Access Subsystem: Object Dissemination, Object Reflection

• Security Subsystem: Session Management, User Authentication, Policy Enforcement, Policy Mgmt, Policies, Users/Groups

• Storage Subsystem: Digital Objects, Datastreams, XML Files, Relational DB

• External Content Retriever: External Content Sources (HTTP, FTP)

• Remote Service, Local Service

Adapted from Slide by V. Chachra, VTLS

5SL: a DL design language

• Domain specific languages
– Address a particular class of problems by offering specific abstractions and notations for the domain at hand
– Advantages: domain-specific analysis, program management, visualization, testing, maintenance, modeling, and rapid prototyping.

• XML-based realization of 5S
– Interoperability
– Use of many sub-languages (e.g., MIME types, XML Schemas, UML notations)

5SGraph: A DL Modeling Tool

• Help users model their own instances of a digital library (DL) in the 5S language (5SL).

• A simple modeling process which enables rapid generation of digital libraries

• Features
– 5SGraph loads and displays a metamodel in a structured toolbox.
– The structured editor of 5SGraph provides a top-down visual building environment for the DL designer.
– 5SGraph produces syntactically correct 5SL files according to the visual model built by the designer.

Overview of 5SGraph

[Screenshot labels: Workspace (instance model); Structured toolbox (metamodel)]

5SGen

• Version 1
– MARIAN as the target system
– Focused on rich structures: semantic networks
– Behavior attached to nodes/links

• Version 2
– Shifted for later work to componentized (ODL) approach
– Focused on scenarios/societies
– Structures/Spaces encapsulated within components (e.g., relational tables, indexes)
– Only textual streams supported

• Version 3
– Into DSpace (practical DL)

5SLGen – Version 2: ODL, Services, Scenarios

[Diagram: the 5SLGen generation pipeline, with numbered stages.
Societies branch: 5SL-Societies Model (1), XPath/JDOM Transform (2), XMI: Class Model (3), Xmi2Java (4), Java Classes Model (5).
Scenarios branch: 5SL-Scenario Model (6), XPath/JDOM Transform (7), StateChart Model (8), Scenario Synthesis (9), Deterministic FSM (10), SMC (11), Java Finite State Machine Class Controller (12).
User interface: JSP User Interface View (13).
Other labels: DL Designer; Component Pool (ODLSearch, ODLBrowse, ..., each with a Java wrapping); edge labels superclass, import, binds; output: Generated DL Services.]
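
The scenario branch of this pipeline ends in a deterministic finite state machine that acts as the controller of a generated service. A minimal sketch of that idea follows. The slide's pipeline generates Java via SMC; Python is used here only for illustration, and the states and events of the "search" scenario are hypothetical, not taken from 5SL.

```python
class DeterministicFSM:
    """Table-driven deterministic FSM: at most one next state per (state, event)."""

    def __init__(self, start, transitions):
        self.state = start
        self.transitions = transitions  # {(state, event): next_state}

    def fire(self, event):
        key = (self.state, event)
        if key not in self.transitions:
            raise ValueError(f"event {event!r} not allowed in state {self.state!r}")
        self.state = self.transitions[key]
        return self.state


# Hypothetical "simple search" scenario; names are assumptions for illustration.
SEARCH_SCENARIO = {
    ("idle", "submit_query"): "searching",
    ("searching", "results_ready"): "showing_results",
    ("showing_results", "select_item"): "showing_item",
    ("showing_item", "back"): "showing_results",
    ("showing_results", "new_query"): "idle",
}

controller = DeterministicFSM("idle", SEARCH_SCENARIO)
trace = [controller.fire(e) for e in ("submit_query", "results_ready", "select_item")]
```

Keeping the transitions in a table rather than in code is what makes such a controller easy to generate from a declarative scenario model.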

Tools/Applications

[Diagram labels: DL Expert; 5S MetaModel; DL Designer; 5SGraph; 5SL DL Model; 5SLGen; component pool (ODLSearch, ODLBrowse, ODLRate, ODLReview, …); Tailored DL; Practitioner; Researcher; Teacher; Logging Module; XML Log]

[Diagram: 5SGraph and 5SGen applied to ETANA-DL. Labels: 5S Archaeology MetaModel; ArchDL Expert; ArchDL Designer; 5SGraph; Structure Sub-model; Scenario Sub-model; ETANA-DL Union Services Descriptions (Harvesting, Mapping, Searching, Browsing, …); VN Metadata Format; HD Metadata Format; ETANA-DL Metadata Format; Mapping Tool; Wrapper4VN; Wrapper4HD; XOAI; VN Catalog; HD Catalog; Union Catalog; Inverted Files; Index; Services DB; Browse DB; Browse Service; Search Service; Other ETANA-DL Services; Web Interface; 5SGen; Component Pool]

Ch. 12 Case Studies: CS -> CSTC

• NSF and ACM Education Committee funded a 2-year project “A Computer Science Teaching Center” - CSTC - http://www.cstc.org/

• College of NJ, U. Ill. Springfield, Virginia Tech

• Focus initially on labs, visualization, multimedia

• Multimedia part supported by a 2nd grant to Virginia Tech and The George Washington University (with curricular guidelines)

CS Teaching Center (CSTC)

• Instead of building large, expensive multimedia packages that become obsolete and are difficult to re-use, concentrate on small knowledge units.

• Learners benefit from having well-crafted modules that have been reviewed and tested.

• Use digital libraries to build a powerful base of support for learners, upon which a variety of courses, self-study tutorials & reference resources can be built.

• ACM support led to Journal of Educational Resources in Computing (JERIC): completed 2 co-EIC terms

Browsing (2)

Computing and Information Technology Interactive Digital Educational Library (CITIDEL)

• Domain: computing / information technology

• Genre: one-stop-shopping for teachers & learners: courseware (CSTC, JERIC), leading DLs (ACM, IEEE-CS, DB&LP, CiteSeer), PlanetMath.org, NCSTRL (technical reports), …

• Submission & Collection: sub/partner collections www.citidel.org

Overview of CITIDEL architecture

[Diagram layers: USER PORTALS; DIGITAL LIBRARY SERVICES; REPOSITORIES]

Distributed repository structure

[Diagram labels: Laboratories Repository; Applets Repository; Papers Repository; Syllabi Repository; . . .; OAI Data Provider; OAI Data Harvester; Union Metadata Repository; Digital Library Services]
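
The distributed structure rests on OAI harvesting: a harvester issues OAI-PMH requests to data providers and merges the returned records into a union metadata repository. A minimal sketch, assuming each repository exposes a data provider; the endpoint URLs, record identifiers, and the `merge_into_union` helper are hypothetical, and no network call is made.

```python
from urllib.parse import urlencode


def list_records_url(base_url, metadata_prefix="oai_dc", from_date=None):
    """Build an OAI-PMH ListRecords request URL for one data provider."""
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    if from_date:
        params["from"] = from_date  # selective (incremental) harvesting
    return f"{base_url}?{urlencode(params)}"


def merge_into_union(union, source_name, records):
    """Add harvested records to the union catalog, keyed by OAI identifier."""
    for identifier, metadata in records.items():
        union[identifier] = {"source": source_name, **metadata}
    return union


# Hypothetical endpoint and records, for illustration only.
url = list_records_url("http://example.org/syllabi/oai", from_date="2007-01-01")
union = {}
merge_into_union(union, "syllabi", {"oai:example.org:syl-1": {"title": "DL course syllabus"}})
merge_into_union(union, "applets", {"oai:example.org:app-7": {"title": "Sorting applet"}})
```

Keying the union catalog by identifier means that re-harvesting a record overwrites the older copy instead of duplicating it.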

Digital library architecture for local and interoperable CITIDEL services

[Diagram, three layers. PORTALS: educators, administrators, learners. SERVICES: multilingual searching, revising, annotating, filtering, browsing, administering. REPOSITORIES: annotations, filtering profiles, user profiles, union metadata. Also shown: OAI Data Harvester, OAI Data Provider, and remote and peer digital libraries (e.g., NSDL-CIS).]

CITIDEL -> NSDL

• A collection project in the National STEM (science, technology, engineering, and mathematics) education Digital Library – NSDL

• National Science Digital Library

• www.nsdl.org

• (Next slides courtesy Lee Zia, NSF)

Connects:

Users: students, educators, life-long learners

Content: structured learning materials; large real-time or archived datasets; audio, images, animations; primary sources; digital learning objects (e.g. applets); interactive (virtual, remote) laboratories; ...

Tools: search; refer; validate; integrate; create; customize; publish; share; notify; collaborate; ...

Enables:

Environments for
• Communication
• Collaboration
• Creation
• Validation
• Evaluation
• Recognition
• ...

AND

• Discovery
• Stability
• Reliability
• Reusability
• Interoperability
• Customizability
• ...
of Resources

Collections

• Discovery of content
• Classification and cataloguing
• Acquisition and/or linking; referencing
• Disciplinary-based themes define a natural body of content, but other possibilities are also encouraged
• Access to massive real-time or archived datasets
• Software tool suites for analysis, modeling, simulation, or visualization
• Reviewed commentary on learning materials and pedagogy

Services

• Help services, frequently asked questions, etc.

• Synchronous/asynchronous collaborative learning environments using shared resources

• Mechanisms for building personal annotated digital information spaces

• Reliability testing for applets or other digital learning objects

• Audio, image, and video search capability

• Metadata system translation

• Community feedback mechanisms

NSDL Information Architecture
Essentially as developed by the Technical Infrastructure Workgroup

[Diagram: a core NSDL “bus” linking three areas (User Interfaces, Usage Enhancement, Collection Building). Labels: Portals & Clients; CI Services (annotation, discussion, personalization, authentication, browsing); Core Services: information retrieval; Core Services: metadata gathering; Core Collection-Building Services (harvesting, protocols); Other NSDL Services; NSDL Collections; Special Databases; referenced items & collections.]

A Digital Library Case Study

• Domain: graduate education, research

• Genre: ETDs = electronic theses & dissertations

• Submission: ETD-db, DSpace, ProQuest, …

• Collection: local archives, regional collaborations, global union catalog

Project: Networked Digital Library of Theses & Dissertations (NDLTD) www.ndltd.org

Student Gets Committee Signatures and Submits ETD

[Diagram labels: Signed; Grad School]

What are we doing?

• Aiding universities to enhance graduate education, publishing and IPR efforts

• Helping improve the availability and content of theses and dissertations

• Educating ALL future scholars so they can publish electronically and effectively use digital libraries (i.e., are Information Literate and can be more expressive)

Why ETD? Short Answer

• For Students:
– Gain knowledge and skills for the Information Age
– Richer communication (digital information, multimedia, …)

• For Universities:
– Easy way to enter the digital library field and benefit thereby

• For the World:
– Global digital library – large, useful, many services

• General:
– Save time and money
– Increased visibility for all associated with research results

Metamodels in the 5S Framework

• Modeling archaeological information systems using the 5S theory to better understand the domain and design the system and the supported services

• Minimal DL

• Minimal ArchDL

• …

A Minimal DL in the 5S Framework

[Concept diagram with columns Streams, Structures, Spaces, Scenarios, Societies. Concepts shown: Structured Stream; hypertext; Descriptive Metadata Specification; Structural Metadata Specification; Digital Object; Collection; Metadata Catalog; Repository; services (indexing, browsing, searching); Minimal DL]
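
To make the minimal DL concrete, the sketch below wires together a repository, a metadata catalog, and the three services named on the slide (indexing, searching, browsing). It is a toy under stated assumptions, not the formal 5S definition: the `MinimalDL` class, its method names, and the sample objects are hypothetical, and streams are reduced to plain text.

```python
from collections import defaultdict


class MinimalDL:
    """Toy digital library: a repository, a metadata catalog, and three services."""

    def __init__(self):
        self.repository = {}           # object id -> text stream
        self.catalog = {}              # object id -> descriptive metadata
        self.index = defaultdict(set)  # term -> object ids (inverted index)

    def add(self, oid, text, metadata):
        self.repository[oid] = text
        self.catalog[oid] = metadata
        for term in text.lower().split():  # indexing service
            self.index[term].add(oid)

    def search(self, query):
        """Searching service: ids of objects containing every query term."""
        hits = [self.index.get(t, set()) for t in query.lower().split()]
        return sorted(set.intersection(*hits)) if hits else []

    def browse(self, field_name):
        """Browsing service: object ids grouped by one metadata field."""
        groups = defaultdict(list)
        for oid, md in sorted(self.catalog.items()):
            groups[md.get(field_name, "unknown")].append(oid)
        return dict(groups)


dl = MinimalDL()
dl.add("d1", "digital library services", {"creator": "alice"})
dl.add("d2", "digital preservation", {"creator": "alice"})
dl.add("d3", "library catalogs", {"creator": "bob"})
```

Searching runs against the index and browsing against the catalog, so the two services stay independent of how the repository stores its streams.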

A Minimal ArchDL in the 5S Framework

[Concept diagram with columns Streams, Structures, Spaces, Scenarios, Societies. Concepts shown: Structured Stream; hypertext; Descriptive Metadata specification; Arch Descriptive Metadata specification; SpaTemOrg; StraDia; ArchObj; ArchDO; ArchColl; ArchDColl; Arch Metadata catalog; ArchDR; services (indexing, browsing, searching); Minimal ArchDL]

Moving from a minimal DL towards a DL reference model (1/2)

[Diagram: from the Minimal DL to a DL reference model. Labels: Multimedia; Annotation; Knowledge management systems; Practical DL; PIM; DL quality; Domain-specific DLs]

Moving from a minimal DL towards a DL reference model (2/2)

• Content-based image retrieval services in a DL

• A superimposed-information-supported DL

• Practical DL generation

Superimposing information

• Superimposed layer: new information/structures

• Base layer: existing information from heterogeneous sources: text, images, audio/video documents

• Mark: reference to base information element
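
The three notions on this slide fit together as follows: superimposed items hold new information plus marks, and each mark resolves to an element of the untouched base layer. A minimal sketch, assuming text-only base documents and character-span addressing; the `Mark`, `Annotation`, and `resolve` names and the sample text are hypothetical. Images or audio/video would need a different addressing scheme (for example regions or time ranges).

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class Mark:
    """Reference to a base information element (here: a character span)."""
    source_id: str
    start: int
    end: int


@dataclass
class Annotation:
    """Superimposed-layer item: new information attached to one or more marks."""
    note: str
    marks: List[Mark]


def resolve(mark: Mark, base_layer: Dict[str, str]) -> str:
    """Follow a mark back to the excerpt it addresses in the base layer."""
    return base_layer[mark.source_id][mark.start:mark.end]


# Hypothetical base document; it is never modified by the superimposed layer.
base_layer = {"report.txt": "Iron Age pottery was found in stratum IV."}
mark = Mark("report.txt", 0, 16)
annotation = Annotation("Compare with finds from the neighbouring site", [mark])
excerpt = resolve(annotation.marks[0], base_layer)
```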

Preliminary SI-DL metamodel

Minimal CBIR DL

[Concept diagram with columns Stream, Structure, Space, Service, Society. Concepts shown: ImageStream; FeatureVector; StructuredFeatureVector; Image Descriptor; Composite Descriptor; ImageContent Description; ImageObject; ImageDigitalObject; ImageCollection; Image Descriptor Metadata Catalog; User InfoNeed; KNNQ; RQ; Visualization Operation; Content-based Image Searching Service]
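
A content-based image searching service compares feature vectors in a metric space. The sketch below reads KNNQ as a k-nearest-neighbor query and RQ as a range query, which is an assumption about the slide's abbreviations. The Euclidean distance, the function names, and the three-bin "histograms" are illustrative choices, not the descriptors used in the actual system.

```python
import math


def distance(a, b):
    """Euclidean distance between two feature vectors of equal length."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def knn_query(collection, query_vec, k):
    """KNNQ: ids of the k images closest to the query vector."""
    ranked = sorted(collection, key=lambda oid: distance(collection[oid], query_vec))
    return ranked[:k]


def range_query(collection, query_vec, radius):
    """RQ: ids of all images within `radius` of the query vector."""
    return sorted(oid for oid, vec in collection.items()
                  if distance(vec, query_vec) <= radius)


# Hypothetical 3-bin colour histograms standing in for extracted feature vectors.
images = {
    "img1": (0.9, 0.1, 0.0),
    "img2": (0.8, 0.2, 0.0),
    "img3": (0.0, 0.1, 0.9),
}
query = (1.0, 0.0, 0.0)
nearest = knn_query(images, query, 2)
within = range_query(images, query, 0.2)
```

A linear scan is enough for a sketch; a collection of realistic size would need a metric index to avoid comparing the query against every image.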

Summary

• 5S and Generating DLs
– 5S Framework
– 5S definitions, services taxonomy, ontology
– 5SL
– 5SGraph
– 5SGen (and DL development)
– DL development of union DL
– 5SGen into DSpace

• 5S Metamodels
– Minimal DL
– Archaeology DL
– Multimedia (CBIR) DL
– Union DL
– Practical DL, superimposed information, personal DL, …

NSF Workshop on DL Future, Chatham, MA

People

• Digital librarians

• DL system developers

• DL system administrators

• DL managers

• DL collection development staff

• DL evaluators

• DL users

Living In the KnowlEdge Society

(LIKES)

Grant: NSF 06-608, CPATH

Proposal: for VT Pathways (themed version of core curric.)

PI: Edward A. Fox

Purpose

• Graduates from colleges & universities should be prepared to live in and contribute to the Knowledge Society emerging in the 21st century.

• Computing/LIS education can be revitalized:

• if the LIKES theme spreads in programs (so graduates can help build the Knowledge Society);

• if faculty collaborate (both in education and research endeavors) with colleagues globally who are interested in LIKES.

Living In the KnowlEdge Society (LIKES): Core surrounded by enabling concepts, problem providing disciplines

[Diagram: Knowledge Society at the core. Surrounding labels: HCI, Visualization, Knowledge Management, Systems Analysis & Design, Programming, Database, Algorithms, Architecture, Net-Centricity, Intelligent Systems, Social & Ethical, Library & Information Science, Simulation, Modeling, Chemistry, Biology, Communications, Healthcare, Art, Music, Marketing, Finance, Economics, Engineering, Sociology, Psychology, Physics, Architecture, History, Political Science, Geography, English, Math]

Objectives – 1 of 3

• Enhance education in the discipline:

– New courses: Living in the Global Knowledge Society, Knowledge Management

– Enhanced courses to be more driven by the LIKES theme: Artificial Intelligence, Data Mining, Digital Libraries, Multimedia/Hypertext/Information Access, …

Objectives – 2 of 3

• Give special attention, inside the discipline and across disciplines:
• to the areas of data, information, and knowledge;
• to key concepts and methods, such as:
– representation/views
– search/discovery
– inference/decisions
– comparison/matching
– complexity/heuristics
– analysis/mining
– integration/mapping
– modeling/simulation

Objectives – 3 of 3

• Engage researchers and teachers and students in the Knowledge Society’s problems, as motivation, orientation, and to help with solutions, e.g.,
– Shifting toward digital government, including statutes, rules, regulations, and procedures;
– Handling attacks, including spam and viruses;
– Ensuring quality even with disinformation, through knowledge sourcing, provenance, and sharing of community expertise;
– Ensuring changes through education, that is cross-disciplinary, globally contextualized, based on awareness of human development, learning theory, and cognitive psychology

Potential Course Areas/Courses

• Personal Knowledge Management
– Computer Science and Information Systems, e.g., multi-media, process design and evaluation, and Human-Computer / Human-Information interaction.
– Psychology, e.g., knowledge organization principles, human cognitive processes.
– Industrial Systems Engineering, e.g., Ergonomic factors of knowledge environments.
– Ethics, e.g., ethical issues of information disclosure.

• Communication and Collaboration
– Communications, e.g., Communication using digital visualizations, using knowledge access in constructing digital messages.
– Information Systems and Computer Science, e.g., computer supported cooperative work and group support systems.
– Marketing, e.g., influence of knowledge presentation on on-line customer behavior.

• Organization
– Information Systems, e.g., service innovation and development, system design and development.
– Management Science, e.g., decision support systems concepts, capabilities, techniques, and tools.
– Management, Marketing, Accounting, and Finance, e.g., business in the information age.

• Society
– Sociology, e.g., impact of knowledge differentials across society and countries.
– Political Science, e.g., governmental collection and use of knowledge, impact of technology on elections and government.

DL Curriculum Project (NSF supporting VT, UNC-CH)

• Identify, develop and test educational DL modules, guided by

- Experts, international collaborators

- Computing Curriculum 2001

- 5S framework

- Analysis of DL course syllabi

CC2001 Information Management Areas

IM1. Information models and systems*
IM2. Database systems*
IM3. Data modeling*
IM4. Relational DBs
IM5. Database query languages
IM6. Relational DB design
IM7. Transaction processing
IM8. Distributed DBs
IM9. Physical DB design
IM10. Data mining
IM11. Information storage and retrieval
IM12. Hypertext and hypermedia
IM13. Multimedia information & systems
IM14. Digital libraries

Why Modular Design

• Flexibility, e.g., for ETD programs:
– Self-study by NDLTD trainers
– Self-study by ETD authors
– Short courses by NDLTD trainers of ETD authors
– A course based on a single module
– Course sequence (program) from multiple modules
– Plug modules into an existing course (enhancement)

• Module 1. Overview + Module 10. DL Education & Research

Modules

1. Collection Development
2. Digital objects / Composites / Packages
3. Metadata, Cataloging, Author submission
4. Architecture, Interoperability
5. Data visualization
6. Services
7. Intellectual property rights management, Privacy, Protection
8. Social issues / Future of DLs
9. Archiving and Preservation

Ascertaining Priority Topics

• We’ve manually classified and analyzed publications using 9 Modules:

Source                                  Count
Proceedings JCDL ’01 – ’05              354
Proceedings ACM DL ’96 – ’00            189
Magazine articles D-Lib ’95 – ’06       521
Session titles JCDL, ACM DL, ECDL       264

Conference papers x modules

[Bar chart: number of conference papers (axis 0 to 200) by Module ID (1 to 9). Series: JCDL 01 to JCDL 05 and ACM DL 96 to ACM DL 00.]

• Analysis Results:

- Total of 543 proceedings papers:

Most popular topics were architecture (module 4) and services (module 6)

Distribution of D-Lib Magazine Articles across Module Topics

[Bar chart: number of D-Lib articles (axis 0 to 200) by Module ID (1 to 9). Series: D-Lib 95 to D-Lib 06.]

• Analysis Results:

- Total of 521 articles:

Most popular topics were architecture (module 4), services (module 6), and social issues (module 8)

Distribution of Session Titles across Module Topics

[Bar chart: number of panel sessions (axis 0 to 35) by Module ID (1 to 9). Series: JCDL & ACM DL, ECDL, ICADL.]

• Analysis Results:

- Total of 264 session titles (JCDL, ECDL, ICADL):

Most popular topic was services (module 6), followed by architecture (module 4)

Fox & Gonçalves Book Outline

• Ch. 1. Introduction (Motivation, Synopsis)

• Part 1 – The “Ss”

– Ch. 2: Streams

– Ch. 3: Structures

– Ch. 4: Spaces

– Ch. 5: Scenarios

– Ch. 6: Societies

Textbook Outline (2)

• Part 2 – Higher DL Constructs

– Ch. 7: Collections

– Ch. 8: Catalogs

– Ch. 9: Repositories and Archives

– Ch. 10: Services

– Ch. 11: Systems

– Ch. 12: Case Studies

Textbook Outline (3)

• Part 3 – Advanced Topics
– Ch. 13: Quality
– Ch. 14: Integration
– Ch. 15: How to build a digital library
– Ch. 16: Research Challenges, Future Perspectives

• Appendix
– A: Mathematical preliminaries
– B: Formal Definitions: Ss
– C: Formal Definitions: DL terms, Minimal DL
– D: Formal Definitions: Archeological DL
– E: Glossary of terms, mappings

Pointers and Summary

• http://fox.cs.vt.edu

• http://fox.cs.vt.edu/talks

• www.dlib.vt.edu

[email protected]

• IR -> DL

• Education: CSTC, CITIDEL, NSDL, NDLTD, LIKES, DLcurric