NSDL October 12-15, 2003 Eisenhower National Clearinghouse Slide 1
NSDL and the Open Archives Initiative
NSDL – OAI – and the Eisenhower National Clearinghouse
Presented by Stephen P [email protected]
Eisenhower National Clearinghouse
• Since 1992 ENC for Math & Science Ed. has collected and disseminated info for teaching resources for grades K-12
• ~75 people including grad students
• 7 collections
• 5 collections participate in NSDL
• >5700 resources (2003-09)
Architecture
• Windows SQL Server 2000 for all collections
• Data entry
– Web ASP for the DL collections
– PowerBuilder for ENC Online (both digital and non-digital resources)
• ENC uses
– Autonomy Server search engine (UNIX)
– Vignette Content Management Server for page rendering and caching (UNIX)
ENC Online Collection
• http://www.enc.org/
• Main (original) collection
• >25,000 total resources
• >2,300 digital resources participating in NSDL (2003-09)
• Target audience: K-12 science and math teachers, parents and children
LM Collection
• The Learning Matrix
• http://thelearningmatrix.enc.org/
• >900 born-digital resources (2003-09)
• 100% NSDL participation
• Target audience: future math and science teachers
GSDL Collection
• Gender and Science Digital Library
• http://gsdl.enc.org
• >500 born-digital resources (2003-09)
• 100% NSDL participation
• Goal: assisting educators in integrating gender-equitable instruction into the classroom
ICON Collection
• Innovative Curriculum Online Network
• http://icontechlit.enc.org/
• >1600 born-digital resources (2003-09)
• 100% NSDL participation
• Target audience: teachers and professors for K-12 and beyond
FEDRL
• Federal Education Digital Resources Library
• No web site
• >250 born-digital resources (2003-09)
• 100% NSDL participation
• Purpose: to identify federally-supported resources that are not yet part of the NSF NSDL
Repository Architecture
• Windows SQL Server 2000 for control
• Individually formatted OAI XML files
• XML files stored separately from database
• Future: possible storage of XML records in database
Repository Updates
• OAI records generated daily for
– Updated records
– New records
– Deleted records
• Each collection has its own OAI generator
• Records are moved to repository server
• Records are backed up on UNIX
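As an illustration of what a per-collection generator has to emit for each of the three cases, the sketch below builds an OAI-PMH 2.0 `<header>`. In the protocol, a deleted record is signalled by `status="deleted"` on the header; new and updated records differ only in their datestamp. The function name, the identifier, and the set name are hypothetical and are not taken from the ENC generators.

```python
import xml.etree.ElementTree as ET


def build_header(identifier, datestamp, set_spec, deleted=False):
    """Build an OAI-PMH <header> element for one record (illustrative only)."""
    header = ET.Element("header")
    if deleted:
        # OAI-PMH 2.0 marks deleted records with this attribute.
        header.set("status", "deleted")
    ET.SubElement(header, "identifier").text = identifier
    ET.SubElement(header, "datestamp").text = datestamp
    ET.SubElement(header, "setSpec").text = set_spec
    return header


# Hypothetical identifier and set name, used only for the example.
deleted_xml = ET.tostring(
    build_header("oai:example.org:123", "2003-09-30", "GSDL", deleted=True),
    encoding="unicode",
)
```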
Repository Updates (2)
• The <header> elements of the OAI files are preprocessed
• <identifier> and <datestamp> data are loaded into a temporary database table
• Master database table updated from temporary database table
• OAI files are moved to production folder
• Perl
• Stored procedures
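The load sequence above can be sketched roughly as follows. This is not the ENC implementation, which uses Perl and SQL Server stored procedures; it is a minimal Python illustration using the standard-library `sqlite3` and `xml.etree` modules. The table names `staging` and `master`, the function names, and the sample record layout are all assumptions.

```python
import sqlite3
import xml.etree.ElementTree as ET

OAI_NS = "{http://www.openarchives.org/OAI/2.0/}"

# Assumed layout of one stored OAI file: a <record> wrapping a <header>.
RECORD_TEMPLATE = (
    '<record xmlns="http://www.openarchives.org/OAI/2.0/"><header>'
    "<identifier>{identifier}</identifier><datestamp>{datestamp}</datestamp>"
    "</header></record>"
)


def read_header(record_xml):
    """Return (identifier, datestamp) from the <header> of one OAI record."""
    root = ET.fromstring(record_xml)
    header = root.find(OAI_NS + "header")
    return (
        header.findtext(OAI_NS + "identifier"),
        header.findtext(OAI_NS + "datestamp"),
    )


def load_headers(conn, records):
    """Stage header data in a temporary table, then update the master table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS master "
        "(identifier TEXT PRIMARY KEY, datestamp TEXT)"
    )
    conn.execute(
        "CREATE TEMP TABLE staging (identifier TEXT PRIMARY KEY, datestamp TEXT)"
    )
    conn.executemany(
        "INSERT OR REPLACE INTO staging VALUES (?, ?)",
        [read_header(r) for r in records],
    )
    # New identifiers are inserted; existing ones get the newer datestamp.
    conn.execute(
        "INSERT OR REPLACE INTO master SELECT identifier, datestamp FROM staging"
    )
    conn.execute("DROP TABLE staging")
    conn.commit()
```

Staging first keeps a malformed file from leaving the master table half-updated: parsing fails before the master table is touched.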
Branding
• Originally all collections used the same icon when displayed on nsdl.org at initial release 2002-12
• Needed to show proper icons to reflect partners and funding
Branding (2)
• Used <setSpec> to identify different brands
• Provided NSDL with JPEGs of each
• Pro: NSDL was able to control icon with <setSpec> value
– Not sure what grief this caused them but they did not complain
• Con: could not register all the different collections with NSDL
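On the consuming side, selecting a brand icon from a record's `<setSpec>` values could look something like the sketch below. How NSDL actually did this is not described in these slides. The set names reuse the collection abbreviations from this talk, and the file names and the `icon_for` function are hypothetical.

```python
# Hypothetical mapping from <setSpec> value to brand icon file.
BRAND_ICONS = {
    "ENC": "enc.jpg",
    "LM": "lm.jpg",
    "FEDRL": "fedrl.jpg",
    "GSDL": "gsdl.jpg",
    "ICON": "icon.jpg",
}


def icon_for(set_specs, default="default.jpg"):
    """Return the icon for the first recognised <setSpec>, else a default."""
    for spec in set_specs:
        if spec in BRAND_ICONS:
            return BRAND_ICONS[spec]
    return default
```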
NSDL Branding
• ENC
• LM
• FEDRL
• GSDL
• ICON
Harvesting
• Harvest request is validated
• Resumption token generated or validated where necessary
• ListIdentifiers = 500 per request
• ListRecords = 300 per request
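A minimal sketch of the validation step, assuming the request arrives as a dictionary of query parameters. The verbs, the required arguments, and the `badVerb` / `badArgument` error codes come from OAI-PMH 2.0. The batch sizes are the ones quoted above. The function and table names are hypothetical, and the check is simplified: it does not reject unknown extra arguments.

```python
# Records returned per response, as quoted on this slide.
BATCH_SIZE = {"ListIdentifiers": 500, "ListRecords": 300}

# Required arguments for each OAI-PMH 2.0 verb.
REQUIRED = {
    "Identify": set(),
    "ListMetadataFormats": set(),
    "ListSets": set(),
    "GetRecord": {"identifier", "metadataPrefix"},
    "ListIdentifiers": {"metadataPrefix"},
    "ListRecords": {"metadataPrefix"},
}

# Verbs that may be continued with a resumptionToken.
RESUMABLE = {"ListIdentifiers", "ListRecords", "ListSets"}


def validate(params):
    """Return an OAI-PMH error code for a bad request, or None if acceptable."""
    verb = params.get("verb")
    if verb not in REQUIRED:
        return "badVerb"
    args = set(params) - {"verb"}
    if "resumptionToken" in args:
        # resumptionToken is an exclusive argument: nothing else may accompany it.
        if verb not in RESUMABLE or args != {"resumptionToken"}:
            return "badArgument"
        return None
    if not REQUIRED[verb] <= args:
        return "badArgument"
    return None
```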
Resumption Token
• !!1601!1656696!500!5486
• Token is for you – the data provider
– Invoke as you see fit
• ! is standard field separator
– 1601 = token id
– 1656696 = resumption hash
– 500 = total records returned
– 5486 = total records to return
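For illustration, the example token can be split into its four named fields as sketched below. The slide does not explain the two leading `!` characters, so the sketch assumes they delimit two empty, unused fields. The `Token` type and the function names are hypothetical.

```python
from typing import NamedTuple


class Token(NamedTuple):
    token_id: int
    resumption_hash: int
    returned: int  # total records returned so far
    total: int     # total records to return


def parse_token(text):
    """Parse a '!'-separated token such as '!!1601!1656696!500!5486'."""
    fields = text.split("!")
    # Assumption: the first two fields are empty and carry no information.
    if len(fields) != 6 or fields[0] or fields[1]:
        raise ValueError("malformed resumption token: %r" % text)
    return Token(*(int(f) for f in fields[2:]))


def format_token(token):
    """Serialise a Token back into the '!'-separated form."""
    return "!!" + "!".join(str(value) for value in token)


example = parse_token("!!1601!1656696!500!5486")
remaining = example.total - example.returned
```

Because the token is opaque to harvesters, the data provider is free to pack whatever bookkeeping it needs into it, as long as it can parse the token back.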
Resumption Token Expiration
• Expiration bookkeeping was problematic
• Spec indicates restart will supply only the original requested set of records
– Updated OAI record could not be substituted for that selected at beginning of harvest
• Tokens do not expire until next repository update is made
– ENC does not have to monitor time
– Harvester has a chance to complete a harvest even if having local difficulties
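One way to get this behaviour without any clock-based bookkeeping is sketched below: each token keeps a snapshot of the identifiers selected at the start of the harvest, and every outstanding token is discarded when the repository is updated. This is a simplified in-memory illustration under those assumptions, not the ENC code; the class and method names are hypothetical.

```python
class TokenStore:
    """Resumption tokens that stay valid until the next repository update."""

    def __init__(self):
        self._snapshots = {}
        self._next_id = 1

    def issue(self, identifiers):
        """Remember the result set chosen at the start of a harvest."""
        token_id = self._next_id
        self._next_id += 1
        self._snapshots[token_id] = tuple(identifiers)
        return token_id

    def lookup(self, token_id):
        """Return the original result set, or None if the token has expired."""
        return self._snapshots.get(token_id)

    def repository_updated(self):
        """Expire every outstanding token; no timestamps are tracked."""
        self._snapshots.clear()
```

A `lookup` that returns `None` would map to the protocol's `badResumptionToken` error, after which the harvester restarts its request.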
Rights
• <metadata><dc:rights> tag is for the target resource
• There is no mechanism to identify ENC’s (repository owner’s) rights
• Adding rights information to <about> would not pass schema validation
• Modified every <dc:description> to include copyright statement
• Would like to see this addressed in the next version of OAI
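The workaround of folding a copyright statement into each `<dc:description>` might be scripted along these lines. This is a sketch using Python's standard `xml.etree` module. The wording of the statement, the sample metadata, and the function name are hypothetical, and the real records were presumably generated with the statement already in place rather than patched afterwards.

```python
import xml.etree.ElementTree as ET

DC_NS = "http://purl.org/dc/elements/1.1/"
ET.register_namespace("dc", DC_NS)  # keep the dc: prefix when serialising


def add_rights_statement(metadata_xml, statement):
    """Append a repository rights statement to every <dc:description>."""
    root = ET.fromstring(metadata_xml)
    for description in root.iter("{%s}description" % DC_NS):
        text = (description.text or "").rstrip()
        if statement not in text:  # avoid adding the statement twice
            description.text = (text + " " + statement).strip()
    return ET.tostring(root, encoding="unicode")


# Hypothetical sample record and statement.
SAMPLE = (
    '<metadata xmlns:dc="http://purl.org/dc/elements/1.1/">'
    "<dc:description>A lesson on fractions.</dc:description>"
    "</metadata>"
)
patched = add_rights_statement(SAMPLE, "Record copyright (hypothetical statement).")
```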
Repository Explorer
• http://oai.dlib.vt.edu/cgi-bin/Explorer/oai2.0/testoai
• Use it!
• Do not try to register with NSDL until all tests are passed
Resources – People
• Hussein Suleman
– Instrumental in the creation of the Repository Explorer
– Presented tutorial ‘Introduction to the Open Archives Initiative Protocol for Metadata Harvesting’ at JCDL 2002
• Attend it if you get the chance
Resources – People (2)
• Michael L Nelson (NASA Langley)
• Herbert Van de Sompel (Los Alamos)
• Simeon Warner (Cornell)
– Jointly presented tutorial ‘Advanced Overview of Version 2.0 of the Open Archives Initiative Protocol for Metadata Harvesting’ at JCDL 2002 and 2003
• Attend it if you get the chance
• http://ils.unc.edu/~mln/jcdl02/
Resources – Web
• Open Archives Initiative http://www.openarchives.org/
• The Open Archives Initiative Protocol for Metadata Harvesting http://www.openarchives.org/OAI/openarchivesprotocol.html
Questions?
These slides are available at http://enc.org/oai/enc-2003-oai.ppt