This article was downloaded by: [University of Chicago Library] On: 06 October 2014, At: 21:20 Publisher: Routledge Informa Ltd Registered in England and Wales Registered Number: 1072954 Registered office: Mortimer House, 37-41 Mortimer Street, London W1T 3JH, UK Internet Reference Services Quarterly Publication details, including instructions for authors and subscription information: http://www.tandfonline.com/loi/wirs20 Using the Google Search Appliance for Federated Searching Mary Taylor a a University of Nevada, Reno Libraries , 1664 North Virginia Street/MS 322, Reno, NV, 89434, USA Published online: 22 Oct 2008. To cite this article: Mary Taylor (2005) Using the Google Search Appliance for Federated Searching, Internet Reference Services Quarterly, 10:3-4, 45-55, DOI: 10.1300/J136v10n03_06 To link to this article: http://dx.doi.org/10.1300/J136v10n03_06 PLEASE SCROLL DOWN FOR ARTICLE Taylor & Francis makes every effort to ensure the accuracy of all the information (the “Content”) contained in the publications on our platform. However, Taylor & Francis, our agents, and our licensors make no representations or warranties whatsoever as to the accuracy, completeness, or suitability for any purpose of the Content. Any opinions and views expressed in this publication are the opinions and views of the authors, and are not the views of or endorsed by Taylor & Francis. The accuracy of the Content should not be relied upon and should be independently verified with primary sources of information. Taylor and Francis shall not be liable for any losses, actions, claims, proceedings, demands, costs, expenses, damages, and other liabilities whatsoever or howsoever caused arising directly or indirectly in connection with, in relation to or arising out of the use of the Content.

Using the Google Search Appliance for Federated Searching


This article may be used for research, teaching, and private study purposes. Any substantial or systematic reproduction, redistribution, reselling, loan, sub-licensing, systematic supply, or distribution in any form to anyone is expressly forbidden. Terms & Conditions of access and use can be found at http://www.tandfonline.com/page/terms-and-conditions


Using the Google Search Appliance for Federated Searching: A Case Study

Mary Taylor

SUMMARY. This article discusses the University of Nevada, Reno’s experiment with federated searching using version 4.1 of the Google Search Appliance (GSA). The project’s testbed included locally held CONTENTdm and geospatial data collections and a sample of records from EBSCO’s Academic Search Premier database. The latter set of records revealed the GSA’s limitations in indexing and retrieving content that is dynamically generated and that requires third-party authentication. [Article copies available for a fee from The Haworth Document Delivery Service: 1-800-HAWORTH. E-mail address: <[email protected]> Website: <http://www.HaworthPress.com> © 2005 by The Haworth Press, Inc. All rights reserved.]

KEYWORDS. Academic Search Premier, dynamically generated content, EBSCO Academic Search Premier, Google Search Appliance, XML

Mary Taylor is Metadata Services Coordinator, University of Nevada, Reno Libraries, 1664 North Virginia Street/MS 322, Reno, NV 89434 (E-mail: [email protected]).

Google® is a Registered Service Mark of Google, Inc., Mountain View, California. Libraries and Google® is an independent publication offered by The Haworth Press, Inc., Binghamton, New York, and is not affiliated with, nor has it been authorized, sponsored, endorsed, licensed, or otherwise approved by, Google, Inc.

[Haworth co-indexing entry note]: “Using the Google Search Appliance for Federated Searching: A Case Study.” Taylor, Mary. Co-published simultaneously in Internet Reference Services Quarterly (The Haworth Information Press, an imprint of The Haworth Press, Inc.) Vol. 10, No. 3/4, 2005, pp. 45-55; and: Libraries and Google® (ed: William Miller, and Rita M. Pellen) The Haworth Information Press, an imprint of The Haworth Press, Inc., 2005, pp. 45-55. Single or multiple copies of this article are available for a fee from The Haworth Document Delivery Service [1-800-HAWORTH, 9:00 a.m. - 5:00 p.m. (EST). E-mail address: [email protected]].

Available online at http://www.haworthpress.com/web/IRSQ
© 2005 by The Haworth Press, Inc. All rights reserved.

doi:10.1300/J136v10n03_06


INTRODUCTION

As the Metadata Services Coordinator for the University of Nevada, Reno Libraries, one of my areas of interest is federated searching. During the past year, I served on a committee to review and evaluate federated search products. After numerous meetings with vendors and several product trials, the committee concluded that the functionality offered by these products did not merit the price quotes. During my first days in this position, I asked what the five most important tasks for my position would be. The Dean of University Libraries, Dr. Steven D. Zink, mentioned that finding out the cost and technical requirements for implementing the Google Search Appliance (GSA) should be at the top of this list. The GSA is an off-the-shelf combination of standard 2U rack-mountable server hardware (commonly known as “The Google Box”1) and an administrative software module that can be configured to search a defined network (such as an intranet or Web site) across more than 220 file formats.

Dr. Zink, who is also the university’s Vice President of Information Technology, had made inquiries with Google several years earlier about the cost of implementing the GSA as a federated search product. At that time, the cost of the GSA was too great to justify an in-depth investigation. As part of the federated search committee, I decided to make a new inquiry for the sake of due diligence. I also wanted to learn more about how libraries could use the GSA for searching their digital collections and about the price for the non-profit and education sector. The Dean supported this idea, especially because Google offers potential customers a 60-day trial period to install and test the GSA. Our attitude was that even if the GSA was still too expensive, Google’s policy of offering a free trial made it worthwhile to at least bring in the GSA for a test. At that time, the model was version 4.1, and our greatest interest was in evaluating its ability to do federated searching of digital collections and resources.

CROSSWALKING THE GOOGLE SEARCH APPLIANCE WITH ACADEMIC SEARCH PREMIER

During the summer and fall of 2004, we held several conference calls with sales and systems engineering staff from Google’s Enterprise Search group. The main question that we posed was how the Search Appliance would work with the types of content that libraries manage. We gave them an overview of our digital collections, including the online library catalog (Innovative Millennium), GIS and map data, CONTENTdm collections, electronic journals, and third-party vendor databases. As an initial test, our colleagues at Google had a demonstration server that emulates the GSA index and retrieve content from the GIS and CONTENTdm collections. Ideally, we wanted to be able to use the GSA as a single point of access for vendor databases and electronic journals.

As a result of the conference calls, we gained a basic understanding of how the GSA works. It follows the document model, which is a representation of an item’s physical and logical structure. The GSA crawls and caches content based on URL or filename. The cache index stores this information as the item’s unique identifier in order to link it back to subsequent search results: “The Google Search Appliance crawls your content and creates a master index of documents that’s ready for instant retrieval using Google’s search technology whenever a customer or employee types in a search query.”2

During the first conference call, we gave our counterparts an overview of vendor databases like Academic Search Premier and how articles in these databases are dynamically generated and contain session-generated URLs. As an example, the Persistent URL (PURL) for an article about granting “most favored nation” status to China contains information about the port that authenticates UNR access to EBSCOhost (innopac.library.unr.edu:80) and also the identifiers for the article’s storage database (db=f5h) and item number (an=9609131521):

http://0-search.epnet.com.innopac.library.unr.edu:80/login.aspx?direct=true&db=f5h&an=9609131521

A session-generated URL for the same article is longer and, in addition to information about its storage database (db=f5h), it also includes the session identifier (sessionmgr5) and search query (2DChina++%22most++favored++nation):

http://0-web18.epnet.com.innopac.library.unr.edu/citation.asp?tb=1&_ug=sid+D7C7A13A%2D8CB9%2D443D%2D8E41%2DC47B312BF7BA%40sessionmgr5+dbs+f5h+cp+1+50E5&_us=frn+1+hd+True+hs+True+cst+0%3B1+or+Date+ss+SO+sm+KS+sl+0+dstb+KS+mh+1+ri+KAAACB1B00094089+DF6E&_uso=tg%5B0+%2D+db%5B0+%2Df5h+hd+False+clv%5B1+%2DY+clv%5B0+%2DY+op%5B0+%2D+cli%5B1+%2DRV+cli%5B0+%2DFT+st%5B0+%2DChina++%22most++favored++nation%22+mdb%5B0+%2Dimh+0817&cf=1&fn=1&rn=1
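The parts of the persistent URL described above can be pulled apart with a few lines of code. The following is a minimal sketch in Python, using only the standard library; the function name `describe_purl` is hypothetical and was not part of the test.

```python
from urllib.parse import urlsplit, parse_qs

# The persistent URL quoted in the article.
PURL = ("http://0-search.epnet.com.innopac.library.unr.edu:80/"
        "login.aspx?direct=true&db=f5h&an=9609131521")


def describe_purl(url):
    """Return the host, port, database code, and item number found in a PURL."""
    parts = urlsplit(url)
    query = parse_qs(parts.query)
    return {
        "host": parts.hostname,   # proxy-rewritten EBSCOhost address
        "port": parts.port,       # port used by the library's proxy
        "db": query["db"][0],     # storage database code
        "an": query["an"][0],     # item (accession) number
    }


info = describe_purl(PURL)
```

Because the database code and item number appear as ordinary query parameters, a crawler can treat this URL as a stable key. The session-generated URL offers no such clean separation.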


Because we wanted to determine if the GSA could function as a point of entrance to vendor databases and electronic journal collections, we needed to find a database to crosswalk with it. EBSCO’s Academic Search Premier database would be ideal because of its wide subject coverage and appeal to undergraduate and novice library users. Having the GSA as its point of access would make it an easy and attractive research tool for these users and hopefully increase its overall usage. After speaking with the staff at Google’s Enterprise Search group, we then contacted EBSCO’s Chief Information Officer, Michael Gorrell, for two conference calls. The first call was to give him an overview of the Google Search Appliance and to see if EBSCO would be willing to allow use of Academic Search Premier for a test of the GSA. The second conference call included the Systems Engineer for Google’s Enterprise Search Group, John Gregory, to discuss with Gorrell how their products would work together. EBSCO agreed to participate in a small test, in which the demo servers would index and cache content from Academic Search Premier.

The bulk of digital collections in libraries are vendor databases containing session-generated URLs, which makes it hard to answer the question of how the GSA would cache this content. If an article’s URL changes from session to session, whatever information the GSA stores in the cache would direct subsequent search results to an Error 404 page because the referring URL no longer exists. It was not clear how the GSA could link the session-generated URLs stored in the cache back to the originating articles. While explaining this potential barrier to Google, we asked if they had customers who use the GSA to search dynamically generated content. Some of their customers do use the GSA to search Customer Relationship Management systems, which have dynamically generated content. However, we did not learn the specific details about how these customers had resolved this issue. A short time after completing the test, we found a review of the latest release of the GSA that mentioned its inability to integrate with content or document management systems.3

Gregory suggested three possible solutions to how the GSA could index and cache session-generated URLs. The library’s authentication process is a proxy rewrite, so we could investigate how to rewrite a session-specific URL back to its persistent URL (PURL) before caching. Another solution would be to see if the proxy rewrite could strip out the session-generated sections of a URL, such as the search query, and then pass the remaining information to the GSA as the URL. This solution was problematic partially because the session-specific information included in the URL is located in different sections and not at the beginning or end of the URL string. More important was that the Systems Office had already invested time and effort into implementing a proxy rewrite for the library’s vendor database and electronic journal collections. Neither of these options was realistic for our institution. Gregory also suggested that EBSCOhost generate a list of all of Academic Search Premier’s persistent URLs (about 18 million) for the demo servers to crawl. Again, this suggestion was not realistic given the strain it would place both on EBSCOhost’s network and staff.
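The difficulty with stripping session-generated sections can be illustrated with a short sketch. It uses an abbreviated copy of the session URL quoted earlier (the `_us` and `_uso` parameters are omitted for brevity), and the helper `strip_params` is a hypothetical name, not something the proxy rewrite provided. The sketch shows that the `_ug` parameter carries both the session identifier and the database code, so dropping the whole parameter also discards information needed to locate the article.

```python
from urllib.parse import urlsplit, urlunsplit, parse_qsl, urlencode

# Abbreviated copy of the article's session URL (assumption: only the
# tb, _ug, cf, fn, and rn parameters are kept for this illustration).
SESSION_URL = (
    "http://0-web18.epnet.com.innopac.library.unr.edu/citation.asp?tb=1"
    "&_ug=sid+D7C7A13A%2D8CB9%2D443D%2D8E41%2DC47B312BF7BA%40sessionmgr5"
    "+dbs+f5h+cp+1+50E5&cf=1&fn=1&rn=1"
)


def strip_params(url, drop):
    """Return the URL with every query parameter named in `drop` removed."""
    parts = urlsplit(url)
    kept = [(k, v) for k, v in parse_qsl(parts.query) if k not in drop]
    return urlunsplit(parts._replace(query=urlencode(kept)))


# The decoded _ug value mixes session data with the database code.
ug_tokens = dict(parse_qsl(urlsplit(SESSION_URL).query))["_ug"].split()

# Removing _ug gets rid of the session identifier, and the database code too.
stripped = strip_params(SESSION_URL, {"_ug"})
```

A workable rewrite would therefore have to edit inside individual parameter values rather than simply trim the start or end of the string, which is consistent with the problem described above.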

Gorrell proposed a fourth alternative that eventually proved to be the most efficient process for the test: EBSCO’s OEMDirect XML Service, a database interface that employs the Simple Object Access Protocol (SOAP).4 SOAP is an XML-based format for exchanging information in a decentralized environment. The advantage of using this format instead of the traditional Z39.50 standard is that it makes it possible to search content that is in formats other than MARC, such as journal databases. Customers can access the SOAP interface through EBSCOhost or can locally host a database of XML-formatted records.

One concern that EBSCO had about the test was the potential strain that the demo servers could place on Academic Search Premier. Gorrell provided a sample set of XML records to host on our network for Google’s demo servers to crawl and index. The Systems Office loaded these records onto a server and then generated a URL. It quickly became clear that the GSA could not return meaningful search results for these records because it could not differentiate between markup tags and article content. It treated both equally as text. In order for the demo server to be able to crawl and index only the text from the article, there needed to be a way to transform the records into HTML. I asked the libraries’ Webmaster, Araby Greene, for her opinion about the best way to transform these records. She experimented with two different approaches to make the files “interpretable” to the demo servers.

The first approach was to create an index page and corresponding XSLT style sheet to generate an HTML file from the persistent URL embedded in the XML file. Gregory reviewed these files and replied that even with the style sheet to transform the persistent URLs, the demo server, unlike a Web browser, cannot interpret tags from within a file. Given this feedback, Greene returned to the question of how to transform the XML files to HTML in a manner that would work with the GSA. The solution was to write and run a script that generated an HTML file for each of the XML files. This transformation was successful, except for an error message for two files, which required manual correction. Subsequent attempts to use this script for transforming XML files have been successful, with none of the files requiring manual correction. She also determined that if the GSA could crawl the files starting from the home page, it would be more efficient to change the folder holding the sample records into a subweb on the library’s network. This approach was successful, and keyword searches about the two topics Gorrell selected for the records (the 1989 Tiananmen Square Protests and granting “Most Favored Nation” status to China) generated meaningful search results.
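The kind of script described above can be sketched as follows. This is not Greene’s script, whose language and details are not recorded here. It is a minimal Python illustration that assumes a hypothetical record layout with `title`, `plink`, and `abstract` elements; the actual EBSCO record schema is not given in this article.

```python
import html
import xml.etree.ElementTree as ET

# Hypothetical sample record; element names and the URL are assumptions.
SAMPLE = """<record>
  <title>Most favored nation status &amp; China</title>
  <plink>http://example.org/login.aspx?direct=true&amp;db=f5h&amp;an=0000000000</plink>
  <abstract>Debate over renewing trade status.</abstract>
</record>"""


def record_to_html(xml_text):
    """Turn one XML record into a plain HTML page a crawler can index."""
    rec = ET.fromstring(xml_text)
    title = rec.findtext("title", default="")
    link = rec.findtext("plink", default="")
    body = rec.findtext("abstract", default="")
    # Only the text content is carried over; the XML markup itself is dropped.
    return (
        "<html><head><title>{t}</title></head>"
        "<body><h1>{t}</h1><p>{b}</p>"
        '<p><a href="{l}">Full record</a></p></body></html>'
    ).format(
        t=html.escape(title),
        b=html.escape(body),
        l=html.escape(link, quote=True),
    )


page = record_to_html(SAMPLE)
```

Running such a function over every file in the folder, and writing each result to an `.html` file linked from an index page, would give the crawler a set of discrete documents with stable URLs.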

CHALLENGES OF IMPLEMENTING GSA VERSION 4.1

Despite the successful outcome for the test, the final decision was to delay bringing in the GSA for an onsite trial. Working with the sample records from Academic Search Premier raised both technical and financial issues. While we knew that the GSA would be able to cache and index CONTENTdm and GIS collections, doing a full test on Academic Search Premier content would have required switching to the SOAP-based interface and taking on an additional charge to our existing contract with EBSCO. It made no sense to make a significant financial investment in a database interface that we would potentially have no other use for besides the test. We had also not gained any insight into the GSA’s ability to pass through the authentication process for a third-party site, or whether it was capable of indexing and caching dynamically generated content or database records. The only option that we knew could make Academic Search Premier work with the GSA would be to locally host the XML-formatted records. Using that approach would entail additional costs to our contract with EBSCO and also the investment of time and effort to create scripts and style sheets for transforming the XML files into HTML. Greene estimated that she spent at least ten hours figuring out this process.

Most of the case studies5 on the Enterprise Search section of Google’s Web site list clients who are using the GSA for publicly accessible Web sites or intranets. Considering how Google had primarily developed and marketed the GSA for these types of digital environments, one explanation was that the GSA might only be able to search either completely inside or outside of a firewall. In fact, the GSA section of Google’s Web site includes two categories for its case studies, “Intranet Deployments”6 and “Public Web site Deployments.”7 Another factor to consider is that the content in these environments generally consists of discrete documents and stable URLs, making it easier for the GSA to create a stable unique key to store in its cache. An evaluation of the GSA version 4.1 backs up this conclusion: “[The GSA is] best used in tactical external or internal intranet installations where content need not be indexed directly from dynamic repositories.”8

Oxford University published a brief article in Ariadne about its trial of the GSA that supports this theory. Oxford also had issues with being able to index both public and restricted sections of its network.9 Its conclusion was that the most straightforward, though expensive, solution would be to implement two models, a GSA for publicly available content and another for restricted content:

. . . one for outside the firewall and one for inside. The most likely involved routing all searches through a proxy server maintained separately, which would check all accesses to see if they would work from outside the firewall, and annotating the database accordingly. It is worth noting that if all Oxford Web sites had put their restricted material on a separate Web server (e.g., oucs-oxford.ox.ac.uk) or used a naming convention (e.g., oucs.ox.ac.uk/oxonly/), it would be easy to configure the box to provide the external search as needed.10
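The separate-server and naming-convention ideas in the Oxford quotation amount to a simple URL test. The sketch below, in Python, uses the two examples from the quotation; the function name `is_restricted` and the sample page paths in the usage lines are hypothetical, and this is not how the GSA itself is configured.

```python
from urllib.parse import urlsplit

# Examples taken from the Oxford quotation above.
RESTRICTED_HOSTS = {"oucs-oxford.ox.ac.uk"}   # separate server for restricted material
RESTRICTED_PREFIXES = ("/oxonly/",)           # naming convention within a public server


def is_restricted(url):
    """Return True if a URL falls under either restriction convention."""
    parts = urlsplit(url)
    return (parts.hostname in RESTRICTED_HOSTS
            or parts.path.startswith(RESTRICTED_PREFIXES))


# Hypothetical page paths, used only to exercise the function.
public_page = is_restricted("http://oucs.ox.ac.uk/public/index.html")
oxonly_page = is_restricted("http://oucs.ox.ac.uk/oxonly/notes.html")
```

With a rule this simple, the external index could exclude restricted pages by pattern alone. Without a consistent convention, each URL has to be checked individually, which is what the proxy approach in the quotation does.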

Our original reason for looking into a test of the GSA was Google’s policy of offering a free trial period. Paying an additional charge for the SOAP interface to EBSCOhost meant that the test would no longer be free. The other option of hosting a local version of Academic Search Premier on our network would also include an additional charge to our contract with EBSCO and also require additional support from the Systems Office. The most serious financial issue for implementing the GSA is that it not only uses the document model for content indexing and caching, but also in setting the price. At present there are four different versions of the GSA. The recently released “mini” GSA starts at $3,000, can search up to 100,000 documents, and is marketed to small and medium-sized businesses.11 The model that would most likely fit within a library’s budget and collections requirements, the GB-1001, starts at around $30,000 and can search up to 500,000 documents.12

Considering the huge amounts of data stored in an average vendor database, such as the 18 million unique articles in Academic Search Premier, using the document model as the basic pricing unit would increase the GSA’s cost far above that starting price. Even the highest-end model of the GSA (the GB-8008, which costs around $450,000)13 would not be able to cache and index the full scope of Academic Search Premier.
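The scale of the mismatch can be worked out from the figures quoted above. The following Python sketch is purely illustrative arithmetic. It does not describe how Google licenses the appliance, and the idea of combining several units is an assumption made only to express the ratio.

```python
import math

ASP_ARTICLES = 18_000_000   # approximate article count cited for Academic Search Premier
GB1001_LIMIT = 500_000      # document limit quoted for the GB-1001
MINI_LIMIT = 100_000        # document limit quoted for the "mini" GSA


def coverage(limit, total=ASP_ARTICLES):
    """Fraction of the database that fits under one document limit."""
    return limit / total


def units_needed(limit, total=ASP_ARTICLES):
    """How many such limits would be needed to cover the whole database."""
    return math.ceil(total / limit)


gb1001_share = coverage(GB1001_LIMIT)      # a little under 3 percent
gb1001_units = units_needed(GB1001_LIMIT)
mini_units = units_needed(MINI_LIMIT)
```

On these numbers a single GB-1001 license would cover less than 3 percent of the database, and 36 such limits would be needed to reach 18 million documents.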


FOLLOW UP WITH NELLCO AND SEARCH APPLIANCE RELEASE 4.2

During the conference calls with Google, they mentioned that the next scheduled version of the GSA (Release 4.2) would include added functionality that could resolve its issues with dynamically generated content and authentication. Several months after the decision to delay an onsite trial of the GSA, a colleague located a summary of presentations at the International Coalition of Library Consortia’s April 2005 meeting in Boston. The New England Law Library Consortium’s (NELLCO) presentation about federated searching and the GSA, “NELLCO’s Blue Sky Thinking–a possible alternative to federated searching,”14 discussed familiar issues. Like us, they had completed an unsatisfactory review of the federated search products and consequently started looking at the GSA as an alternative. At the time of the presentation, they were planning a test of the GSA’s ability to search local collections and vendor databases.

Out of curiosity, I contacted NELLCO’s executive director, Tracy L. Thompson, who delivered the presentation, to ask about it. This contact led to several other telephone and e-mail conversations about the GSA both with Thompson and Roberta Woods of the Franklin Pierce Law Library. I shared with them our notes from the conference calls and test, and they in turn discussed the research that they had been doing for an upcoming meeting with Google and an interested vendor about testing the GSA at the Franklin Pierce Law Library. During these conversations, Thompson made the very astute observation that while Google Scholar might provide an adequate short-term solution for applying Google’s PageRank technology to searching scholarly content, it also means that collection development occurs based on whatever publishing companies and institutions decide to participate, rather than local needs and policies. At present, Google Print (the division that oversees the Google Scholar and the “Google for Libraries” initiatives) has declined to name all of the participating publishers. Furthermore, organizations such as NELLCO’s member institutions have discovered that niche publications, especially for professions such as theirs, are not yet making it into Google Scholar.

XML FEEDS AND AUTHORIZATION API IN RELEASE 4.2

Speaking with Thompson and Woods helped to further refine our technical knowledge about the GSA. Woods located two key pieces of information that we had unsuccessfully requested during our conference calls. Information about how to cache and authenticate third-party content is not found on the Search Appliance section of Google’s Web site, but instead in the Code section. According to the documentation, Release 4.2 of the Search Appliance has a “Third-Party Content Feed API” that makes it possible to handle dynamically generated content by converting search results into XML. We were able to find a case study on the Search Appliance section of Google’s Web site, about Sur La Table’s e-commerce Web site, which discusses the process of converting the search results for dynamically generated content from a third-party database into an XML feed:

Because the Google Search Appliance gives administrators access to Google results as an XML data feed, Grant was able to integrate the results easily into a Cold Fusion environment. Pointing the Google Search Appliance at an offline product database, Grant then mapped the search results to live URLs using a few simple scripts.15

Both the announcement for Release 4.2 and the accompanying documentation for its new “Database Crawler”16 feature mention that the GSA can now crawl and index content held in “standard enterprise relational [databases].”17 However, from a review of the documentation, it appears that this function works for locally held databases (i.e., on one’s own network) that do not require authentication for users outside of a firewall. Finally, the document model for pricing appears to also be in place for database content, in that “each database record is counted as a ‘document’ toward the license limit.”18

Woods also located the documentation about the Authorization Application Programming Interface (API) for Release 4.2, the “Secure Content API,” although after the answer given to us during the chat session, it is still not clear how authentication scales to the level of working with electronic journals and databases, where access is for a large and distributed user base and requires more than a single user ID and password. The latest developer’s guide to the Application Feeds Protocol, released on June 2, 2005, does describe a process similar to the one we used for our test:

To create a feed, you will convert your data to XML. You will then upload the XML to the appliance using a web form or a script. A script that creates the XML data and pushes it to the appliance is known as a custom connector. The XML data that you push to the appliance is the feed.19
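The first half of a “custom connector,” creating the XML data, can be sketched as follows. The element and attribute names (`gsafeed`, `header`, `datasource`, `feedtype`, `group`, `record`, `content`) are an assumption modeled on the Feeds Protocol and should be verified against the Developer’s Guide for the release in use. The data source name and record URL are hypothetical, and the upload step is deliberately left out.

```python
import xml.etree.ElementTree as ET


def build_feed(datasource, records):
    """Build a feed document from (url, html_text) pairs.

    Element names are assumed, not confirmed against the 4.2 documentation.
    """
    feed = ET.Element("gsafeed")
    header = ET.SubElement(feed, "header")
    ET.SubElement(header, "datasource").text = datasource
    ET.SubElement(header, "feedtype").text = "full"
    group = ET.SubElement(feed, "group")
    for url, html_text in records:
        rec = ET.SubElement(group, "record", url=url, mimetype="text/html")
        # ElementTree escapes the HTML so it travels as text inside <content>.
        ET.SubElement(rec, "content").text = html_text
    return ET.tostring(feed, encoding="unicode")


# Hypothetical data source name and record URL.
feed_xml = build_feed(
    "asp_sample",
    [("http://library.example.edu/asp/1.html",
      "<html><body>China trade status</body></html>")],
)
```

Each record carries its own URL, which matters for the document model: the URL supplied in the feed becomes the stable key that search results link back to, regardless of how the source database generates its pages.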

CONCLUSIONS

Despite the “plug and play” statements in Google’s promotional materials, a full implementation of the Google Search Appliance in the library environment entails not only purchase costs and annual licensing fees, but also additional manpower and unforeseen costs such as a SOAP database interface. At present, our institution has decided to defer an onsite trial, and serious consideration of the GSA as a federated search tool, until it becomes clearer how it can best authenticate and index/retrieve dynamically generated content from third-party databases. The short-term solution has been to enable the library’s link resolver to join Google Scholar. This decision does not mean that we have abandoned the idea of using the Search Appliance, but instead that we want to gain a better understanding of the technologies underneath it and to advocate for development that can address these needs. Our impression is that working with this type of content, especially when it consists of dynamically generated, discrete documents, is new to Google, especially given its use of the document model for pricing. It will likely take more work by Google and customer input in order to make the Search Appliance a “plug and play” solution for third-party, dynamically generated content. We continue to stay in touch with our contacts at Google and EBSCO and are following the work that NELLCO is doing with their experiment to test the Search Appliance on Franklin Pierce’s collections.

REFERENCES

1. Bryan Mjaanes. “Review: Implementing the Google Search Appliance in an Intranet environment.” Available at http://www.macosx.com/articles/review-implementing-the-google-search-appliance-in-an-intranet-enviro.html (accessed March 12, 2005).

2. “Google Enterprise Solutions: The Google Search Appliance.” Available at http://www.google.com/enterprise/gsa/ (accessed January 10, 2005).

3. Mjaanes, Ibid.

4. Endeavor Information Systems Incorporated. “Endeavor, EBSCO partner for XML gateway development providing dependable search and retrieval functionality.” (Des Plaines, IL.: Endeavor Information Systems Incorporated.), press release. Available at http://www.endinfosys.com/cgi-bin/news/viewer.cgi?ID=55 (accessed July 29, 2005).

5. Google Enterprise Solutions. “Customers.” Available at http://www.google.com/enterprise/customers.html (accessed January 10, 2005).

6. Google Enterprise Solutions. “Intranet Deployments.” Available at http://www.google.com/appliance/intranet.html (accessed January 10, 2005).

7. Google Enterprise Solutions. “Site Search.” Available at http://www.google.com/appliance/sitesearch.html (accessed January 10, 2005).

8. Ann Bednarz. “Google upgrades search appliance.” Available at http://www.networkworld.com/news/2004/0607google.html (accessed June 25, 2005).

9. Sebastian Rahtz. “Looking for a Google Box?” Available at http://www.ariadne.ac.uk/issue42/rahtz/ (accessed January 10, 2005).

10. Ibid.

11. Google Enterprise Solutions. “The Google Mini.” Available at http://www.google.com/enterprise/mini/ (accessed January 10, 2005).

12. Google Enterprise Solutions. “Google Search Appliance: Product Models.” Available at http://www.google.com/appliance/products.html (accessed January 10, 2005).

13. Google Enterprise Solutions. “Google Search Appliance: Product Models.”

14. Tracy Thompson (2005, April). “NELLCO’s Blue Sky Thinking–a possible alternative to federated searching.” Presentation at International Coalition of Library Consortia (ICOLC) Spring 2005 Meeting, Boston, MA.

15. Google Enterprise Solutions. “Google Search Appliance–Database Crawler Feature Snippet.” Available at http://code.google.com (accessed May 25, 2005).

16. Google Enterprise Solutions. “Product Features–What’s New Snapshot.” Available at http://www.google.com/enterprise/gsa/features.html (accessed June 5, 2005).

17. Google Enterprise Solutions. “Google Search Appliance–Database Crawler Feature Snippet.” Available at http://code.google.com (accessed May 25, 2005).

18. Google Enterprise Solutions. “For licensing purposes, how are ‘documents’ counted with respect to database records?” Available at http://www.google.com/support/gsa/bin/answer.py?answer=16586&topic=-1 (accessed May 25, 2005).

19. Google Code. “Google Search Appliance Feeds Protocol Developer’s Guide.” Available at http://code.google.com (accessed June 25, 2005).
