Open Archives Initiative Protocol for Metadata Harvesting

Page 1: Open Archives Initiative

Open Archives Initiative

Protocol for Metadata Harvesting

Page 2: Open Archives Initiative

Collections in isolation

Some thoughts:

A wonderful collection is of limited use if it is not well known.

Very redundant collections are often wasteful.

Page 3: Open Archives Initiative

Virtual collections

Some collections do not contain actual materials, only information about materials and links to the home site.

How do these virtual collections get the information about other collections? How do they stay up to date?

--> The Open Archives Initiative Protocol for Metadata Harvesting (OAI-PMH)

Page 4: Open Archives Initiative

OAI-PMH

A protocol -- that is just an agreement to exchange messages and interpret them according to strict rules.

Metadata -- data about the data -- information about the material in the collection

Harvesting -- gathering in the desired part of the collection for further use

Page 5: Open Archives Initiative

The protocol

See http://www.openarchives.org/OAI/openarchivesprotocol.html

Two sides: the repository and the harvester

The repository (data provider)
  Prepares the required metadata
  Responds to the harvester's queries
  Acts like a server, responding to queries when they come

The harvester (data gatherer)
  Gathers the metadata from the collections
  Organizes the harvested metadata in a way that serves its purpose
  Acts like a client, requesting service when it needs it

Page 6: Open Archives Initiative

Resource, item, record

Resource: the actual content of the collection; the point of the digital library

Item: a part of the repository that generates the metadata.

Record: metadata in a specific format, available for dissemination.
  Encoded in XML
  Unique identifier
  Datestamp
  setSpec
  Optional status
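
As a rough illustration of the record parts listed above, the sketch below reads them from a record header with Python's standard library. The sample XML, the identifier `oai:example.org:item-42`, and the function name `parse_header` are made up for this example; only the element names and the OAI-PMH 2.0 namespace come from the protocol.

```python
import xml.etree.ElementTree as ET

# Namespace used by OAI-PMH 2.0 responses.
NS = {"oai": "http://www.openarchives.org/OAI/2.0/"}

# Hypothetical record header, invented for illustration only.
SAMPLE_HEADER = """<header xmlns="http://www.openarchives.org/OAI/2.0/" status="deleted">
  <identifier>oai:example.org:item-42</identifier>
  <datestamp>2004-03-15</datestamp>
  <setSpec>physics:optics</setSpec>
</header>"""


def parse_header(xml_text):
    """Return the identifier, datestamp, setSpecs and status of a header."""
    header = ET.fromstring(xml_text)
    return {
        "identifier": header.findtext("oai:identifier", namespaces=NS),
        "datestamp": header.findtext("oai:datestamp", namespaces=NS),
        # A header may carry several setSpec elements, or none at all.
        "set_specs": [e.text for e in header.findall("oai:setSpec", NS)],
        # The status attribute is optional; None means it was absent.
        "status": header.get("status"),
    }


info = parse_header(SAMPLE_HEADER)
print(info["identifier"], info["datestamp"], info["set_specs"], info["status"])
```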

Page 7: Open Archives Initiative

Sets

Repositories may organize items into sets
  Allows selective harvesting

Each node in a set organization has
  setSpec (sets may be hierarchical; if so, the levels are separated by colons)
  setName
  setDescription
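
The colon convention for hierarchical sets can be sketched in a few lines of Python. The function name `set_path` and the sample value `physics:optics:lasers` are hypothetical; the only thing taken from the slide is that levels are separated by colons.

```python
def set_path(set_spec):
    """List a setSpec together with all of its ancestors, top level first."""
    levels = set_spec.split(":")
    # Re-join the first 1, 2, ... levels to name each enclosing set.
    return [":".join(levels[:i]) for i in range(1, len(levels) + 1)]


print(set_path("physics:optics:lasers"))
# ['physics', 'physics:optics', 'physics:optics:lasers']
```

A harvester could use such a list to decide whether a record belongs to a set it is interested in, at any level of the hierarchy.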

Page 8: Open Archives Initiative

Requests

Request embedded in an HTTP request

Valid OAI-PMH requests:
  GetRecord
  Identify
  ListIdentifiers
  ListMetadataFormats
  ListRecords
  ListSets
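
One way to picture "a request embedded in an HTTP request" is a URL whose query string carries the verb and its arguments. The sketch below builds such a URL; the base URL `http://example.org/oai` and the helper name `build_request` are assumptions made for this example, and nothing is actually sent over the network.

```python
from urllib.parse import urlencode

# The six request types named on the slide.
VERBS = {
    "GetRecord", "Identify", "ListIdentifiers",
    "ListMetadataFormats", "ListRecords", "ListSets",
}


def build_request(base_url, verb, **arguments):
    """Return a GET URL for an OAI-PMH request (no network access here)."""
    if verb not in VERBS:
        raise ValueError(f"not an OAI-PMH verb: {verb}")
    # urlencode escapes reserved characters such as ':' in identifiers.
    query = urlencode({"verb": verb, **arguments})
    return f"{base_url}?{query}"


print(build_request("http://example.org/oai", "Identify"))
# http://example.org/oai?verb=Identify
```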

Page 9: Open Archives Initiative

GetRecord

Required arguments
  identifier = unique identifier of the item whose record is requested
  metadataPrefix = prefix identifying the metadata format in which the record should be returned
    Example: oai_dc (the OAI version of the Dublin Core -- the standard 15 elements, no extensions)

Errors: badArgument, cannotDisseminateFormat, idDoesNotExist
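
A harvester has to check a GetRecord response for these error codes before using it. The following is a minimal sketch of that check; the sample response, its identifier, and the function name `error_codes` are invented for illustration.

```python
import xml.etree.ElementTree as ET

NS = {"oai": "http://www.openarchives.org/OAI/2.0/"}

# Hypothetical response to a GetRecord request for an unknown identifier.
SAMPLE_ERROR = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <responseDate>2004-03-15T12:00:00Z</responseDate>
  <request verb="GetRecord" identifier="oai:example.org:missing"
           metadataPrefix="oai_dc">http://example.org/oai</request>
  <error code="idDoesNotExist">No item with that identifier</error>
</OAI-PMH>"""


def error_codes(xml_text):
    """Return the code of every error element in a response (may be empty)."""
    root = ET.fromstring(xml_text)
    return [e.get("code") for e in root.findall("oai:error", NS)]


print(error_codes(SAMPLE_ERROR))  # ['idDoesNotExist']
```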

Page 10: Open Archives Initiative

Identify

No arguments
Requests information about the repository.

Response includes
  repositoryName
  baseURL
  protocolVersion
  earliestDatestamp
  deletedRecord (how the repository handles deletions: no, transient, persistent)
  granularity (how finely can the datestamp be specified?)
  adminEmail

Optional
  compression (what schemes are supported)
  description
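
The granularity value matters when a harvester later writes datestamps for selective harvesting. OAI-PMH 2.0 defines two granularities, day (`YYYY-MM-DD`) and second (`YYYY-MM-DDThh:mm:ssZ`), both in UTC. The sketch below formats a time accordingly; the function name `format_datestamp` is a hypothetical helper, not part of the protocol.

```python
from datetime import datetime, timezone


def format_datestamp(moment, granularity):
    """Format a timezone-aware datetime at the repository's granularity."""
    moment = moment.astimezone(timezone.utc)  # datestamps are given in UTC
    if granularity == "YYYY-MM-DD":
        return moment.strftime("%Y-%m-%d")
    if granularity == "YYYY-MM-DDThh:mm:ssZ":
        return moment.strftime("%Y-%m-%dT%H:%M:%SZ")
    raise ValueError(f"unknown granularity: {granularity}")


when = datetime(2004, 3, 15, 12, 30, 0, tzinfo=timezone.utc)
print(format_datestamp(when, "YYYY-MM-DD"))            # 2004-03-15
print(format_datestamp(when, "YYYY-MM-DDThh:mm:ssZ"))  # 2004-03-15T12:30:00Z
```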

Page 11: Open Archives Initiative

ListIdentifiers

Required argument
  metadataPrefix

Optional arguments
  from
  until
  set

Exclusive argument
  resumptionToken (flow control token for resuming an incomplete previous ListIdentifiers request)

Errors: badArgument, badResumptionToken, cannotDisseminateFormat, noRecordsMatch, noSetHierarchy
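
"Exclusive" means that resumptionToken, when present, must be the only argument besides the verb. A client-side check of that rule might look like the sketch below; the function name `list_identifiers_args` and its parameter names are assumptions for this example.

```python
def list_identifiers_args(metadata_prefix=None, from_=None, until=None,
                          set_spec=None, resumption_token=None):
    """Build the argument dictionary for a ListIdentifiers request."""
    if resumption_token is not None:
        # Exclusive argument: nothing else may accompany it.
        if any(v is not None for v in (metadata_prefix, from_, until, set_spec)):
            raise ValueError("resumptionToken cannot be combined with other arguments")
        return {"verb": "ListIdentifiers", "resumptionToken": resumption_token}
    if metadata_prefix is None:
        raise ValueError("metadataPrefix is required")
    args = {"verb": "ListIdentifiers", "metadataPrefix": metadata_prefix}
    # Optional arguments are included only when supplied.
    for name, value in (("from", from_), ("until", until), ("set", set_spec)):
        if value is not None:
            args[name] = value
    return args


print(list_identifiers_args("oai_dc", from_="2004-01-01"))
# {'verb': 'ListIdentifiers', 'metadataPrefix': 'oai_dc', 'from': '2004-01-01'}
```

A repository that receives a request breaking this rule would be expected to answer with badArgument.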

Page 12: Open Archives Initiative

ListMetadataFormats

Optional argument
  identifier (if the metadata formats are needed only for some particular item)

Errors: badArgument, idDoesNotExist, noMetadataFormats

Response includes both metadataPrefix and the associated schema
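
To make the response concrete, the sketch below maps each metadataPrefix in a ListMetadataFormats response to its schema. The response text is a hand-written sample and `parse_formats` is a hypothetical helper; the oai_dc schema URL shown is the one published by the Open Archives Initiative.

```python
import xml.etree.ElementTree as ET

NS = {"oai": "http://www.openarchives.org/OAI/2.0/"}

# Hand-written sample response listing a single format.
SAMPLE_FORMATS = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListMetadataFormats>
    <metadataFormat>
      <metadataPrefix>oai_dc</metadataPrefix>
      <schema>http://www.openarchives.org/OAI/2.0/oai_dc.xsd</schema>
      <metadataNamespace>http://www.openarchives.org/OAI/2.0/oai_dc/</metadataNamespace>
    </metadataFormat>
  </ListMetadataFormats>
</OAI-PMH>"""


def parse_formats(xml_text):
    """Return a dict mapping each metadataPrefix to its schema URL."""
    root = ET.fromstring(xml_text)
    formats = {}
    for fmt in root.iterfind(".//oai:metadataFormat", NS):
        prefix = fmt.findtext("oai:metadataPrefix", namespaces=NS)
        formats[prefix] = fmt.findtext("oai:schema", namespaces=NS)
    return formats


print(parse_formats(SAMPLE_FORMATS))
```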

Page 13: Open Archives Initiative

ListRecords

Required argument
  metadataPrefix (only records for which the specified metadataPrefix applies should be returned)

Optional arguments
  from
  until
  set

Exclusive argument
  resumptionToken
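
A harvester typically repeats ListRecords, passing back each resumptionToken it receives, until the repository returns an empty token or none at all. The sketch below shows that loop under simplifying assumptions: `fetch` is a stand-in for whatever function performs the HTTP request, and the two sample pages and the token `page2` are invented so the example runs without a network.

```python
import xml.etree.ElementTree as ET

NS = {"oai": "http://www.openarchives.org/OAI/2.0/"}

# Two invented response pages; the first ends with a resumption token.
PAGE_1 = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/"><ListRecords>
  <record><header><identifier>oai:example.org:1</identifier></header></record>
  <resumptionToken>page2</resumptionToken>
</ListRecords></OAI-PMH>"""
PAGE_2 = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/"><ListRecords>
  <record><header><identifier>oai:example.org:2</identifier></header></record>
  <resumptionToken/>
</ListRecords></OAI-PMH>"""


def fake_fetch(args):
    """Stand-in for an HTTP call: picks a page based on the token."""
    return PAGE_2 if args.get("resumptionToken") == "page2" else PAGE_1


def harvest_identifiers(fetch, metadata_prefix):
    """Yield the identifier of every record, following resumption tokens."""
    args = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    while True:
        root = ET.fromstring(fetch(args))
        for header in root.iterfind(".//oai:record/oai:header", NS):
            yield header.findtext("oai:identifier", namespaces=NS)
        token = root.findtext(".//oai:resumptionToken", namespaces=NS)
        if not token:  # missing or empty token: the list is complete
            break
        # The token is an exclusive argument, so it replaces all others.
        args = {"verb": "ListRecords", "resumptionToken": token}


print(list(harvest_identifiers(fake_fetch, "oai_dc")))
# ['oai:example.org:1', 'oai:example.org:2']
```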

Page 14: Open Archives Initiative

ListSets

Exclusive argument
  resumptionToken (used to continue a previous incomplete response to ListSets)

Errors: badArgument, badResumptionToken, noSetHierarchy
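
For completeness, a ListSets response can be reduced to a simple mapping from setSpec to setName. The sample response and the helper name `parse_sets` below are made up for illustration.

```python
import xml.etree.ElementTree as ET

NS = {"oai": "http://www.openarchives.org/OAI/2.0/"}

# Invented sample response with a two-level set hierarchy.
SAMPLE_SETS = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListSets>
    <set><setSpec>physics</setSpec><setName>Physics</setName></set>
    <set><setSpec>physics:optics</setSpec><setName>Optics</setName></set>
  </ListSets>
</OAI-PMH>"""


def parse_sets(xml_text):
    """Return a dict mapping each setSpec to its setName."""
    root = ET.fromstring(xml_text)
    return {
        s.findtext("oai:setSpec", namespaces=NS): s.findtext("oai:setName", namespaces=NS)
        for s in root.iterfind(".//oai:set", NS)
    }


print(parse_sets(SAMPLE_SETS))
# {'physics': 'Physics', 'physics:optics': 'Optics'}
```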