SRW/U for DSpace
Ralph LeVan, Research Scientist
What is SRW/U
• A Pair of HTTP-based Text Query Protocols
– SRW: Search and Retrieve Web Service
– SRU: Search and Retrieve URL Service
• An alternative to Z39.50
The Weaknesses of Classic Z39.50
• Not popular with the Web community
– Connection-based Sessions
– Binary Encoding
– Transmitted directly over TCP/IP
• Complicated
The Strengths of Classic Z39.50
• Result Sets (a.k.a. Statefulness)
• Abstraction
– Abstract Access Points (Attribute Sets)
– Abstract Record Schemas
• Explain
SRW: Search and Retrieve Web Service
• SOAP (Simple Object Access Protocol) Based
– HTTP
– XML
• Records Described in WSDL (Web Service Description Language)
• 3 Services: SearchRetrieve, Scan and Explain
SRW: The Basics
• Only one database per request
• String (not structure) based queries
• Index Sets, not Attribute Sets
• One Record Syntax (XML)
The Explain Request
• An empty request
– E.g. http://alcme.oclc.org/srw/search/SOAR
The Explain Response
• A description of the database
• A list of the supported indexes
• A list of the supported record schemas
The SearchRetrieve Request
• String CQL Query
• Integer StartRecord
• Integer MaximumRecords
• String RecordSchema
http://alcme.oclc.org/srw/search/SOAR?query=dog
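The request parameters above can be assembled into an SRU URL along the following lines. This is only a sketch: the class and method names are hypothetical, the lower-camel-case parameter names (startRecord, maximumRecords, recordSchema) follow the SRU specification rather than anything shown on the slide, and a given server may also expect operation and version parameters that the slide's example omits.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

// Hypothetical helper, not part of the SRW distribution.
final class SruSearchUrl {

    // Builds a searchRetrieve URL; recordSchema may be null to use the server default.
    static String build(String baseUrl, String cqlQuery, int startRecord,
                        int maximumRecords, String recordSchema) {
        StringBuilder url = new StringBuilder(baseUrl);
        url.append("?query=").append(encode(cqlQuery));
        url.append("&startRecord=").append(startRecord);
        url.append("&maximumRecords=").append(maximumRecords);
        if (recordSchema != null) {
            url.append("&recordSchema=").append(encode(recordSchema));
        }
        return url.toString();
    }

    // CQL queries contain '=', quotes and spaces, so they must be URL-encoded.
    static String encode(String value) {
        try {
            return URLEncoder.encode(value, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException("UTF-8 is always supported", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(build("http://alcme.oclc.org/srw/search/SOAR",
                "dc.title=\"harry potter\"", 1, 10, null));
    }
}
```

Encoding matters as soon as the query is more than a single word: an unencoded space or quotation mark would produce an invalid URL.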
The SearchRetrieve Response
• ResultSetReference
– String resultSetName
– Integer resultSetTimeToLive
• Integer numberOfRecords
• Records
• Diagnostics
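Because the response reports numberOfRecords while each request is capped by MaximumRecords, a client usually pages through a result set. A minimal sketch of the arithmetic, assuming SRU's 1-based record positions; the class and method names are hypothetical:

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper that computes the startRecord value for each page.
final class SruPaging {

    static List<Integer> pageStarts(int numberOfRecords, int maximumRecords) {
        if (maximumRecords <= 0) {
            throw new IllegalArgumentException("maximumRecords must be positive");
        }
        List<Integer> starts = new ArrayList<>();
        // Record positions start at 1, not 0.
        for (int start = 1; start <= numberOfRecords; start += maximumRecords) {
            starts.add(start);
        }
        return starts;
    }

    public static void main(String[] args) {
        System.out.println(pageStarts(25, 10)); // prints [1, 11, 21]
    }
}
```

Each value would be sent as the StartRecord of one request, ideally within the resultSetTimeToLive reported by the server.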
CQL: Common Query Language
• Loosely based on CCL Search
• Boolean & Proximity Operators
• Index Sets & Indexes
• Truncation Characters ‘*’, ‘#’ & ‘?’
• Example:
dc.title="harry potter" or bib1.isbn=123-456-78x
The Scan Request
• String CQL scanClause
• Integer maximumTerms
• Integer responsePosition
http://alcme.oclc.org/srw/search/SOAR?operation=scan&scanClause=dog&maximumTerms=3&responsePosition=3
The Scan Response
• Terms
– A term for searching
– Possibly a term for displaying
– The number of records retrieved by the term
• Diagnostics
Using SRU
• Send the URL and get the response
// needs java.io.BufferedReader, java.io.InputStreamReader, java.net.URL
BufferedReader in = new BufferedReader(
    new InputStreamReader(new URL("http://alcme.oclc.org/srw/SOAR?query=dog").openStream()));
String inputLine = null, response;
StringBuffer content = new StringBuffer();
// read the whole response, line by line
while ((inputLine = in.readLine()) != null) content.append(inputLine);
in.close();
response = content.toString();
Using SRU
• Parse the response using String methods
// 17 is the length of "<numberOfRecords>"
int i = response.indexOf("<numberOfRecords>"),
    j = response.indexOf("</numberOfRecords>"),
    count = Integer.parseInt(response.substring(i + 17, j));
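The String-based approach can be wrapped in a small method that also copes with a missing element. The following is a sketch under stated assumptions: the class name and the -1 return value are hypothetical choices, and it assumes the element appears without a namespace prefix (a response using a prefix such as srw: would not match).

```java
// Hypothetical helper illustrating String-based parsing of an SRU response.
final class NumberOfRecords {

    // Returns the record count, or -1 if the element is not found.
    static int extract(String response) {
        String open = "<numberOfRecords>";
        String close = "</numberOfRecords>";
        int i = response.indexOf(open);
        int j = response.indexOf(close);
        if (i < 0 || j < i) {
            return -1;
        }
        return Integer.parseInt(response.substring(i + open.length(), j).trim());
    }

    public static void main(String[] args) {
        String sample = "<searchRetrieveResponse><numberOfRecords>42</numberOfRecords>"
                + "</searchRetrieveResponse>";
        System.out.println(extract(sample)); // prints 42
    }
}
```

Using open.length() instead of a literal offset keeps the arithmetic correct if the tag name ever changes.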
Using SRU
• Parse the response using DOM classes
DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
DocumentBuilder builder = factory.newDocumentBuilder();
Document document = builder.parse(new InputSource(new StringReader(record)));
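Once a Document is available, individual values can be read from it with the standard DOM API. The sketch below is illustrative only: the class name is hypothetical, the sample XML is hand-written rather than captured from a server, and matching on the local name numberOfRecords in any namespace is an assumption made so that prefixed and unprefixed responses both work.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

// Hypothetical helper illustrating DOM-based parsing of an SRU response.
final class DomNumberOfRecords {

    // Returns the record count, or -1 if the element is absent.
    static int extract(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true); // needed for getElementsByTagNameNS
            DocumentBuilder builder = factory.newDocumentBuilder();
            Document document = builder.parse(new InputSource(new StringReader(xml)));
            // "*" matches the element regardless of its namespace.
            NodeList nodes = document.getElementsByTagNameNS("*", "numberOfRecords");
            if (nodes.getLength() == 0) {
                return -1;
            }
            return Integer.parseInt(nodes.item(0).getTextContent().trim());
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalArgumentException("could not parse response", e);
        }
    }

    public static void main(String[] args) {
        String sample = "<srw:searchRetrieveResponse xmlns:srw=\"http://www.loc.gov/zing/srw/\">"
                + "<srw:numberOfRecords>7</srw:numberOfRecords>"
                + "</srw:searchRetrieveResponse>";
        System.out.println(extract(sample)); // prints 7
    }
}
```

Compared with String methods, the DOM route tolerates namespace prefixes and whitespace, at the cost of parsing the whole response.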
Using SRW
• Get WSDL from server or LOC
http://alcme.oclc.org/srw/search/SOAR?wsdl
or
http://www.loc.gov/z3950/agency/zing/srw/srw-sample-service.wsdl
Using SRW
• Convert WSDL to code
java org.apache.axis.wsdl.WSDL2Java --server-side --skeletonDeploy true srw-sample-service.wsdl
Using SRW
• Write Client
// classes below are generated by WSDL2Java; also needs java.net.URL
SRWSampleServiceLocator service = new SRWSampleServiceLocator();
URL url = new URL("http://alcme.oclc.org/srw/search/SOAR");
SRWPort port = service.getSRW(url);
SearchRetrieveRequestType request = new SearchRetrieveRequestType();
request.setQuery("dog");
SearchRetrieveResponseType response = port.searchRetrieveOperation(request);
int postings = response.getNumberOfRecords();
DSpace Implementation
• Reads list of Lucene indexes from SRWDatabase.props
• Converts CQL queries to Lucene queries
• Gets Dublin Core record from database
Installation
• Get the SRW.war file from http://www.oclc.org/research/software/srw
• Start Tomcat (to unpack the .war file)
• Edit the SRWServer.props configuration file
• Copy the SRWDatabase.props file to your DSpace/config directory
• Restart Tomcat
• Try it: http://yourserver/SRW/search/DSpace
SRWServer.props
# parameters for the SRW Servlet
SRW.Home=d:/Apache Tomcat 4.1/webapps/SRW/
default.database=DSpace
resultSetIdleTime=300
db.DSpace.class=ORG.oclc.os.SRW.SRWLuceneDatabase
db.DSpace.home=d:/dspace/dspace-1.1/
db.DSpace.configuration=config/SRWDatabase.props
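The file uses ordinary key=value syntax, so its settings can be inspected with java.util.Properties. How the SRW servlet itself reads the file is not shown in these slides; the sketch below, with a hypothetical class and method names, only illustrates the db.<name>.<setting> key pattern visible above.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Properties;

// Hypothetical reader for the settings shown above; not the servlet's own code.
final class SrwServerProps {

    static Properties parse(String text) {
        Properties props = new Properties();
        try {
            props.load(new StringReader(text));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return props;
    }

    // Looks up a per-database setting such as db.DSpace.class.
    static String databaseSetting(Properties props, String database, String key) {
        return props.getProperty("db." + database + "." + key);
    }

    public static void main(String[] args) {
        Properties props = parse(
                "default.database=DSpace\n"
                + "resultSetIdleTime=300\n"
                + "db.DSpace.class=ORG.oclc.os.SRW.SRWLuceneDatabase\n"
                + "db.DSpace.home=d:/dspace/dspace-1.1/\n");
        String db = props.getProperty("default.database");
        System.out.println(databaseSetting(props, db, "class"));
    }
}
```

Note that the paths use forward slashes; in the properties format a backslash is an escape character, so Windows paths written with backslashes would need to be doubled.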
Examples
• http://alcme.oclc.org/srw/search/GSAFD?
• http://alcme.oclc.org/srw/search/SOAR?
• http://alcme.oclc.org/srw/search/NDL?
Links
• http://www.loc.gov/srw
• http://www.loc.gov/z3950/srutest.html
• http://www.oclc.org/research/software/srw
• http://staff.oclc.org/~levan/docs/SRWforDSpace.ppt
Questions & Answers