The Technical Infrastructure of the NSDL
Dean Krafft, Cornell [email protected]
NSDL Technical Overview
Structure of the talk:
  NSDL 1.0 Overview
  The Fedora-based NSDL Data Repository (NDR) and NSDL 2.0
  Inspiring Contribution and Collaboration – Expert Voices
  Other NSDL 2.0 Services and Tools
  Q&A
What is the NSDL?
An NSF-funded $20 million/year program in Science, Technology, Engineering and Mathematics (STEM) education
A digital library describing nearly two million carefully selected online STEM resources from well over 100 collections (at http://nsdl.org)
A core integration team (Cornell, UCAR, Columbia) working with 9 “pathways” portals and over 200 NSF grantees
A large community of researchers, librarians, content providers, developers, students, and teachers
NSDL 1.0
Create a “union catalog” of Dublin Core metadata records for STEM resources
Harvest those records from collections using OAI-PMH (openarchives.org)
Store records in an Oracle DB and re-serve qualified DC through OAI-PMH
Build a search index using metadata plus full-text of available content pages
Create a web portal at nsdl.org for K-gray access to NSDL resources
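The harvesting step above can be sketched in a few lines. This is a minimal illustration, not the NSDL harvester: it builds an OAI-PMH `ListRecords` request URL and pulls Dublin Core titles and identifiers out of a response. The base URL `http://example.org/oai`, the sample record, and the function names are assumptions made for the example; the verb, the `oai_dc` metadata prefix, the resumption token, and the XML namespaces come from the OAI-PMH 2.0 protocol.

```python
from urllib.parse import urlencode
import xml.etree.ElementTree as ET

OAI_NS = "http://www.openarchives.org/OAI/2.0/"
DC_NS = "http://purl.org/dc/elements/1.1/"


def list_records_url(base_url, metadata_prefix="oai_dc", resumption_token=None):
    """Build a ListRecords request; a resumption token replaces all other arguments."""
    if resumption_token:
        params = {"verb": "ListRecords", "resumptionToken": resumption_token}
    else:
        params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    return base_url + "?" + urlencode(params)


def parse_records(xml_text):
    """Return (records, resumption_token) from a ListRecords response body."""
    root = ET.fromstring(xml_text)
    records = []
    for rec in root.iter(f"{{{OAI_NS}}}record"):
        oai_id = rec.findtext(f"{{{OAI_NS}}}header/{{{OAI_NS}}}identifier")
        titles = [e.text for e in rec.iter(f"{{{DC_NS}}}title")]
        urls = [e.text for e in rec.iter(f"{{{DC_NS}}}identifier")]
        records.append({"oai_id": oai_id, "titles": titles, "urls": urls})
    token = root.findtext(f".//{{{OAI_NS}}}resumptionToken")
    return records, token or None


# Hypothetical response from a collection's OAI-PMH endpoint (illustrative data only).
SAMPLE_RESPONSE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record>
      <header><identifier>oai:example.org:1</identifier></header>
      <metadata>
        <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
                   xmlns:dc="http://purl.org/dc/elements/1.1/">
          <dc:title>Introduction to Soft Matter</dc:title>
          <dc:identifier>http://example.org/soft-matter</dc:identifier>
        </oai_dc:dc>
      </metadata>
    </record>
    <resumptionToken>page-2</resumptionToken>
  </ListRecords>
</OAI-PMH>"""

if __name__ == "__main__":
    print(list_records_url("http://example.org/oai"))
    records, token = parse_records(SAMPLE_RESPONSE)
    print(records, token)
```

A real harvester would fetch each URL over HTTP and keep requesting with the returned resumption token until none is returned; that loop is left out here so the sketch runs without network access.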
Infrastructure overview: NSDL 1.0
[Diagram: STEM Collections on the Web; Central Metadata Repository; Search Service; Archive Service; Collection Registration System; NSDL.org Portal. Protocols: OAI-PMH, HTTP, REST, SQL]
NSDL 1.0 Lessons
Metadata Repository was quick to implement using known technologies, but:
  Limited model
  Metadata-centric orientation
  No content – only metadata
  Limited relationships – collection/item
  Limits on context, structure, and access
  Severe limits on contribution and collaboration
  One-way data flow: NSDL → Users
Rather than one portal for everyone, support communities with common interests: Pathways now provide discipline and area-specific portals
NSDL 2.0
Create an NSDL that guides not just resource discovery, but resource selection, use, organization, annotation and contribution
Supports creating “context” for resources
Presents resources in context: linked to related concepts; with user ratings; with codes and data
Supports creating a permanent archive of resources
Enables community tools for structuring, evaluation, annotation, contribution, collaboration
Provides two-way data flow: NSDL ↔ users
Goal: Create a dynamic, living library
Creating the NSDL Data Repository
Supports storing both content and metadata
Allows arbitrary relationships among resource and metadata objects: organization, annotation, citation
Accessible through web service architecture of remixable data sources and transformations
Fedora: the NDR middleware
A Flexible, Extensible Digital Object Repository Architecture (http://www.fedora.info)
Open source project with $2.2 million in Mellon funding 2002-2007
Collaboration of Cornell and Univ. of Virginia
Key funded users include:
  eSciDoc project (collaboration of the Max Planck Society and FIZ Karlsruhe)
  Public Library of Science (Topaz Foundation)
  VTLS Corp., Harris Corp., Library of Congress
  Australian Research Repositories Online to the World
  Royal Library Denmark, National Library, and DTU
What is Fedora?
An architecture, toolkit, and implementation: middleware, not a vertical application
DSpace in contrast: a vertical application with a fixed workflow targeted at users
Stores arbitrary internal and external digital objects, disseminations (transformations and combinations), relationships among objects
Entirely SOAP/REST based, disseminations are URLs
XML data store; RDBMS cache; RDF triplestore supports relationship queries
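To make "relationship queries" concrete, here is a toy in-memory triple store. It is only a sketch of the subject/predicate/object idea behind an RDF triplestore, not Fedora's implementation or API; the class, the predicate names (`metadataFor`, `memberOf`, `cites`), and the object identifiers are all hypothetical.

```python
from typing import NamedTuple


class Triple(NamedTuple):
    subject: str
    predicate: str
    obj: str


class TripleStore:
    """Toy store: a set of triples plus wildcard pattern matching."""

    def __init__(self):
        self._triples = set()

    def add(self, subject, predicate, obj):
        self._triples.add(Triple(subject, predicate, obj))

    def match(self, subject=None, predicate=None, obj=None):
        """Return sorted triples matching the pattern; None acts as a wildcard."""
        return sorted(
            t for t in self._triples
            if (subject is None or t.subject == subject)
            and (predicate is None or t.predicate == predicate)
            and (obj is None or t.obj == obj)
        )


store = TripleStore()
# Hypothetical identifiers, loosely modeled on resource/metadata/collection objects.
store.add("md:pub-1", "metadataFor", "res:pub-1")
store.add("res:pub-1", "memberOf", "coll:soft-matter")
store.add("res:data-1", "memberOf", "coll:soft-matter")
store.add("res:pub-1", "cites", "res:data-1")

# Which objects are members of the collection?
members = [t.subject for t in store.match(predicate="memberOf", obj="coll:soft-matter")]
# What does the publication cite?
cited = [t.obj for t in store.match(subject="res:pub-1", predicate="cites")]

if __name__ == "__main__":
    print(members, cited)
```

A production triplestore would index triples and expose a query language; the linear scan here is only meant to show the shape of a relationship query.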
NSDL Data Repository (NDR)
References to roughly 2 million selected STEM resources on the web
Sourced metadata statements about those resources
A REST API to allow authenticated access by Pathways and providers
Support for annotation, aggregation, and other relationships
Sample NDR Objects & Relationships
[Diagram. Objects: Publication Resource; Publication Metadata; Data Set Resource; Data Set Metadata; Code Resource; Metadata Provider (MatForge); Collection (Soft Matter Collection); Cornell CCMR; MatDL Pathway. Relationships: Cites; Metadata for; Member of; Selector for]
An Information Network Overlay
Think of the NDR as a lens for viewing science content on the net
Content can be:
  Local: stored directly in the NDR
  Remote: accessed through a URL
  Computed: derived from a database or web service
  Archived: an older version stored at SDSC
It all has a repository-based URL
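One way to picture the four content kinds behind a single repository-based URL is the small data model below. It is a sketch under stated assumptions: the class names, the `source` field, the object identifiers, and the base URL `http://ndr.example.org/objects/` are invented for illustration and do not describe the actual NDR schema or URL layout.

```python
from dataclasses import dataclass
from enum import Enum


class ContentKind(Enum):
    LOCAL = "local"        # stored directly in the repository
    REMOTE = "remote"      # accessed through a URL
    COMPUTED = "computed"  # derived from a database or web service
    ARCHIVED = "archived"  # an older stored version


@dataclass(frozen=True)
class RepositoryObject:
    object_id: str
    kind: ContentKind
    source: str  # hypothetical: where the content actually comes from


# Hypothetical base URL; the real NDR URL scheme is not given in the talk.
REPOSITORY_BASE = "http://ndr.example.org/objects/"


def repository_url(obj: RepositoryObject) -> str:
    """Every object gets the same style of URL, whatever its content kind."""
    return REPOSITORY_BASE + obj.object_id


objects = [
    RepositoryObject("obj-1", ContentKind.LOCAL, "datastream:content"),
    RepositoryObject("obj-2", ContentKind.REMOTE, "http://example.org/page.html"),
    RepositoryObject("obj-3", ContentKind.COMPUTED, "service:example-query"),
    RepositoryObject("obj-4", ContentKind.ARCHIVED, "archive:example-snapshot"),
]

if __name__ == "__main__":
    for obj in objects:
        print(obj.kind.value, repository_url(obj))
```

The point of the design is that callers only ever see `repository_url`; how the content is resolved stays an internal detail of the object.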
Network Overlay View
[Diagram layers: User View; API/UI; Repository View with Relations & Annotations; Resources on the Web]
How should we use the NDR?
The NDR provides powerful capabilities for:
  Creating context around resources
  Enabling the NSDL community to directly contribute resources and context
  Representing a web of relationships among science resources and information about those resources
How do we use it? Here’s one specific example …
Soft Matter Wiki: Planned NDR Integration
A community of approved contributors (e.g. teachers, librarians, materials scientists) is granted edit access to the Soft Matter wiki
New resources and metadata are created as wiki pages and reflected into the NDR
Relevant non-wiki-based NDR resources and metadata are displayed as read-only wiki pages, subject to comment and linking
User and project pages organize NDR resources
Will work with MatDL on integrating these capabilities into the Soft Matter Wiki
NDR Entry for Soft Matter Wiki
[Diagram. Objects: Wiki Entry; New Metadata; New Audience MD; Referenced New Resource 1; Referenced Existing Resource 2; Metadata Provider (two instances); Existing Collection; Soft Matter Wiki. Relationships: Annotates; Metadata for; Member of; inferred relationship between resources]
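The "inferred relationship between resources" in the diagram can be illustrated with a simple co-reference rule: if one wiki entry references two resources, treat those resources as related. The rule itself, the function name, and the identifiers are assumptions made for this sketch; the talk does not say how the NDR derives the relationship.

```python
from itertools import combinations


def infer_related(references):
    """Infer resource pairs that are referenced together by the same entry.

    `references` maps an entry identifier to the resources it references.
    Returns sorted (resource_a, resource_b, via_entry) tuples.
    """
    inferred = set()
    for entry, resources in references.items():
        # De-duplicate and sort so each pair appears once, in a stable order.
        for a, b in combinations(sorted(set(resources)), 2):
            inferred.add((a, b, entry))
    return sorted(inferred)


# Hypothetical identifiers for a wiki entry and the two resources it references.
references = {
    "wiki:entry-1": ["res:new-1", "res:existing-2"],
    "wiki:entry-2": ["res:existing-2"],  # a single reference yields no pair
}

related = infer_related(references)

if __name__ == "__main__":
    print(related)
```

Recording the entry that justified each inferred pair keeps the relationship traceable back to the contribution that created it.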
But an NDR-integrated wiki is just the beginning …
Expert Voices
A system using blogging technology to:
  Support STEM conversations among scientists, teachers and students
  Tie NSDL resources to real-world science news
  Create context for resources to enhance discovery, selection and use
  Enable NSDL community members to become NSDL contributors: of resources, questions, reviews, annotations, and metadata
Expert Voices ≠ LiveJournal: contributors are carefully selected; contributions are about science, the process of science, and education
Expert Voices Implementation
Open source multi-user blogging system
Published entries become NSDL resources
Owner controls publication of entries and visibility of comments
Entries can contain linked references to NSDL resources, references to URLs that should become resources, and new resource metadata
Integrated with NSDL Shibboleth-based community sign-on
MyNSDL: NDR-integrated tagging, bookmarking, and recommendation
Based on Connotea open-source folksonomic tagging/bookmarking system
Tags and bookmarking structure are reflected back into the NDR
Authorized users can “automatically” recommend new NSDL resources simply by tagging them
Gives user a personal view of NSDL resources
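The "recommend by tagging" behavior can be sketched as follows. This is a hypothetical model, not MyNSDL or Connotea code: the `TagService` class, its fields, and the rule that a tag from an authorized user on an unknown URL queues a recommendation are assumptions used to illustrate the idea.

```python
from dataclasses import dataclass, field


@dataclass
class TagService:
    known_resources: set      # URLs already in the library
    authorized_users: set     # users allowed to recommend by tagging
    bookmarks: list = field(default_factory=list)
    recommendations: list = field(default_factory=list)

    def tag(self, user, url, tags):
        """Record a bookmark; return True if it also queued a new recommendation."""
        self.bookmarks.append((user, url, tuple(sorted(tags))))
        is_new = url not in self.known_resources and url not in self.recommendations
        if user in self.authorized_users and is_new:
            self.recommendations.append(url)
            return True
        return False

    def personal_view(self, user):
        """URLs a given user has bookmarked, in the order they were tagged."""
        return [u for (who, u, _tags) in self.bookmarks if who == user]


service = TagService(
    known_resources={"http://example.org/known"},
    authorized_users={"teacher-1"},
)
first = service.tag("teacher-1", "http://example.org/new", ["polymers", "lab"])
second = service.tag("student-1", "http://example.org/other", ["physics"])
third = service.tag("teacher-1", "http://example.org/known", ["review"])

if __name__ == "__main__":
    print(first, second, third, service.recommendations)
```

Every tag is stored as a bookmark regardless of authorization, so the personal view works for all users while only authorized users affect the recommendation queue.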
Other proposed applications
iVia-based Expert-Guided crawl: Tool for Pathways and others to turn websites into resource collections (in development at UC Riverside)
Moodle Course Management System – courses integrated with NSDL resources
Electronic lab notebook – integrating lab notes with code, data sets, and reference materials within the library archival framework
…
NSDL 2.0 Ecosystem
[Diagram: STEM Collections; Search Service; Archive Service; Fedora-based NDR. Protocols: OAI-PMH, HTTP, REST, NDR API]
What does this mean for the user?
NSDL 2.0 applications situate resources in context, aiding both discovery and use
Users become contributors, adding new resources, ratings, annotations, and organizational structure – frequently as a side effect of using the library
Specialized portals, tagging, and powerful relationship queries and filtering support user-specific “views” into the library
Summary
NSDL 1.0 created a large, production digital library of STEM resources for education.
NSDL 2.0 and its tools allow scientists, mathematicians, teachers, engineers, librarians, and students to create a unique web of context, contribution, and collaboration around the high-quality STEM education resources at the core of the NSDL.
Acknowledgements
NSDL NSF Program Officers
  Lee Zia
  David McArthur
NSDL Core Integration Team
  UCAR: Kaye Howe, PI and Executive Director
  Cornell: Dean Krafft, PI
  Columbia: Kate Wittenberg, PI
Fedora Development Team
  Cornell: Sandy Payette & Carl Lagoze
  Univ. of Virginia: Thornton Staples
Questions?
Contact Information
Dean B. Krafft
Cornell Information Science
301 College Ave.
Ithaca, NY [email protected]
This work is licensed under the Creative Commons Attribution-NoDerivs 2.5 License. To view a copy of this license, visit http://creativecommons.org/licenses/by-nd/2.5/ or send a letter to Creative Commons, 543 Howard Street, 5th Floor, San Francisco, California, 94105, USA.