Y.T. a brief history of the OAI (Source: Herbert van de Sompel)

  • Slide 1

0. a brief history of the OAI
Source: Herbert van de Sompel

  • Slide 2

Open Archives Initiative Protocol for Metadata Harvesting
Yaşar Tonta, H.Ü. BBY
DOK 422: Information Networks

  • Slide 3

the OAI roots
The Open Archives Initiative has been set up to create a forum to discuss and solve matters of interoperability between preprint solutions, as a way to promote their global acceptance.
Paul Ginsparg, Rick Luce & Herbert Van de Sompel
=> Santa Fe Convention: preprint metadata harvesting
Source: Herbert van de Sompel

  • Slide 4

interest from other communities
    - Digital Library Federation meetings: research library community has many materials for which they would like to expose metadata
    - OAI San Antonio meeting: interest from librarians, publishers, others, ...
Source: Herbert van de Sompel

  • Slide 5

resulting actions: organizational
establish organizational stability for the OAI:
    - institutional backing from CNI & DLF
    - steering committee: policy guidance
    - technical committee: technical specifications
    - executive group: day to day coordination
    - workshops: public dissemination, feedback
Source: Herbert van de Sompel

  • Slide 6

resulting actions: technical
    - [09/2000] revise specifications to allow adoption beyond preprints: technical committee
    - [09/2000-01/2001] compile new specifications: editing by Carl and Herbert
    - [11/2000-01/2001] alpha-test specifications: oai-alpha group
    - [01/2001] discontinue the Santa Fe Convention
    - [01/2001] release version 1.0 of the OAI protocol
Source: Herbert van de Sompel

  • Slide 7

1. the OAI Metadata Harvesting protocol
Source: Herbert van de Sompel

  • Slide 8

The OAMH protocol is a low-barrier interoperability specification for the recurrent exchange of metadata between systems.
Source: Herbert van de Sompel

  • Slide 9

the OAMH protocol
[Diagram: the service provider (harvester) sends requests to the data provider (repository), which returns replies]
Source: Herbert van de Sompel

  • Slide 10
federated services
[Diagram labels: A&I, image, FTXT, OPAC, e-print]
Source: Herbert van de Sompel

  • Slide 11

metadata harvesting via OAMH
[Diagram labels: metadata; A&I, image, OPAC, e-print, FTXT; harvester; FTXT]
Source: Herbert van de Sompel

  • Slide 12

federated services via OAMH
[Diagram labels: metadata; A&I, image, FTXT, e-print, OPAC; Author, Title, Abstract, Identifier]
Source: Herbert van de Sompel

  • Slide 13

core concepts in OAMH
    - low-barrier interoperability
    - data-provider & service-provider model
    - metadata harvesting model
    - OAMH protocol
    - Dublin Core
    - HTTP based
    - Reply XML Schema
    - Self contained
    - shared metadata format and parallel, community-specific metadata formats
Source: Herbert van de Sompel

  • Slide 14

OAI harvesting tools
[Diagram: service provider (harvester) and data provider (repository)]
    - Datestamp
    - Identifier
    - Set
    - Records
Source: Herbert van de Sompel

  • Slide 15

OAI harvesting tools
[Diagram: service provider (harvester) and data provider (repository)]
Supporting protocol requests:
    - Identify
    - ListMetadataFormats
    - ListSets
Harvesting protocol requests:
    - ListRecords
    - ListIdentifiers
    - GetRecord
Source: Herbert van de Sompel

  • Slide 16

supporting protocol requests
[Diagram: service provider (harvester) and data provider (repository)]
Request: ListMetadataFormats
Reply:
    ListMetadataFormats / Time / Request
    REPEAT
        Format prefix
        Format XML schema
    /REPEAT
Source: Herbert van de Sompel

  • Slide 17

harvesting requests
[Diagram: service provider (harvester) and data provider (repository)]
Request: ListRecords
    * from=a
    * until=b
    * set=klm
    * metadataPrefix=dc
Reply:
    ListRecords / Time / Request
    REPEAT
        Identifier
        Datestamp
        Metadata
    /REPEAT
Source: Herbert van de Sompel

  • Slide 18

Applications of the OAMH protocol?
    - federated services [S&R, SDI, alerting, linking, ...]
    - database synchronization
    - harvesting the deep Web
    - ...
Source: Herbert van de Sompel

  • Slide 19
OAI background
    - background in the e-prints (pre-prints) community
    - need to provide search services across multiple e-prints archives
    - distributed cross-searching felt not to be appropriate
    - adopted approach based on metadata harvesting
    - OAI has been linked to a political agenda that wants to change the academic publishing model, but...
    - ...core activity is the OAI-MHP, the OAI Metadata Harvesting Protocol

  • Slide 20

  • Slide 21

  • Slide 22

  • Slide 23

[Diagram]
    - static repository 1: http://an.oai.org/ma/mini.xml
    - static repository n: http://site1.org/mini/file1
Source: Lagoze, http://eprints.rclis.org/archive/00000789/

  • Slide 24

[Diagram]
    - static repository 1: http://an.oai.org/ma/mini.xml
    - static repository n: http://site1.org/mini/file1
    - static repository gateway: http://gateway.institution.org/oai/
        - http://gateway.institution.org/oai/site1.org/mini/file1
        - http://gateway.institution.org/oai/an.oai.org/ma/mini.xml
Source: Lagoze, http://eprints.rclis.org/archive/00000789/

  • Slide 25

[Diagram]
    - static repository 1: http://an.oai.org/ma/mini.xml
    - static repository n: http://site1.org/mini/file1
    - static repository gateway: http://gateway.institution.org/oai/
        - http://gateway.institution.org/oai/site1.org/mini/file1
        - http://gateway.institution.org/oai/an.oai.org/ma/mini.xml
    - OAI-PMH harvester
    - connection labels: OAI-PMH, HTTP
Source: Lagoze, http://eprints.rclis.org/archive/00000789/

  • Slide 26

The OAI-PMH data model
Source: http://www.dlib.org/dlib/december04/vandesompel/12vandesompel.html

  • Slide 27

Content transfer between archives using the OAI-PMH
Source: http://www.dlib.org/dlib/december04/vandesompel/12vandesompel.html

  • Slide 28

OAICat

  • Slide 29

What's in a name?
    - "open" means that specs are freely available
    - may be some formal standards activity in the future
    - currently at version 2.0
    - "archive" as in e-print archive, i.e. repository of documents
    - NOT "archive" as used by the library and archival communities

  • Slide 30
OAI-MHP
    - generic protocol for sharing metadata between services
    - NOT a distributed search protocol
[Diagram: service providers send a request to repositories and receive a response]
Repositories: databases of stuff, metadata and/or full-text. May be partitioned into sets.

  • Slide 31

OAI-MHP
    - requests sent as HTTP GET
    - responses returned as XML over HTTP
    - OAI-MHP based on HTTP, XML, XML schemas, XML namespaces
    - 6 requests: Identify, ListIdentifiers, ListRecords, GetRecord, ListMetadataFormats, ListSets
    - large responses may be split using simple resumption token mechanism

  • Slide 32

Harvesting metadata
    - service provider can ask repository for:
        - all records
        - records in particular set
        - records modified in particular date span
    - metadata records returned using XML
    - support for arbitrary XML schemas
    - repositories MUST support simple DC XML record format
    - some existing support for other schemas including an XML encoding for MARC
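
The harvesting flow described in the last slides (an HTTP GET request, an XML reply, and a resumption token for large result sets) can be sketched roughly as follows. This is a minimal illustration, not a complete harvester. It assumes the OAI-PMH 2.0 XML namespace and the `oai_dc` metadata prefix, neither of which appears in the slides (slide 17 uses `metadataPrefix=dc`). The base URL in the usage note and the helper names `build_request_url`, `parse_list_records`, `http_get` and `harvest` are hypothetical.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode
from urllib.request import urlopen

# Assumption: OAI-PMH 2.0 namespace (not stated in the slides).
OAI_NS = {"oai": "http://www.openarchives.org/OAI/2.0/"}


def build_request_url(base_url, verb, arguments=None):
    """Encode a protocol request as an HTTP GET URL."""
    params = {"verb": verb}
    params.update(arguments or {})
    return base_url + "?" + urlencode(params)


def parse_list_records(xml_text):
    """Return ([(identifier, datestamp), ...], resumption token or None)."""
    root = ET.fromstring(xml_text)
    records = []
    for header in root.iterfind("oai:ListRecords/oai:record/oai:header", OAI_NS):
        identifier = header.findtext("oai:identifier", namespaces=OAI_NS)
        datestamp = header.findtext("oai:datestamp", namespaces=OAI_NS)
        records.append((identifier, datestamp))
    token = root.findtext("oai:ListRecords/oai:resumptionToken", namespaces=OAI_NS)
    # A missing or empty token element means the list is complete.
    token = token.strip() if token else ""
    return records, (token or None)


def http_get(url):
    """Fetch a URL and return the body as text (default transport)."""
    with urlopen(url) as response:
        return response.read().decode("utf-8")


def harvest(base_url, selection, fetch=http_get):
    """Yield (identifier, datestamp) pairs, following resumption tokens."""
    url = build_request_url(base_url, "ListRecords", selection)
    while True:
        records, token = parse_list_records(fetch(url))
        yield from records
        if token is None:
            break
        # A resumed request carries only the verb and the token.
        url = build_request_url(base_url, "ListRecords", {"resumptionToken": token})
```

A call might look like `harvest("http://repository.example.org/oai", {"metadataPrefix": "oai_dc", "set": "klm", "from": "2004-01-01"})`. The selection is passed as a dictionary because `from` is a reserved word in Python, and the `fetch` parameter lets the transport be swapped out, for example in tests.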