Open-source tools for enhancing full-text searching of OPACs: Use of Koha, Greenstone and Fedora

Article information:
To cite this document: K.T. Anuradha, R. Sivakaminathan, P. Arun Kumar, (2011), "Open-source tools for enhancing full-text searching of OPACs", Program, Vol. 45 Iss 2 pp. 231-239
Permanent link to this document: http://dx.doi.org/10.1108/00330331111129750
References: this document contains references to 19 other documents.




K.T. Anuradha, R. Sivakaminathan and P. Arun Kumar
National Centre for Science Information, Indian Institute of Science, Bangalore, India

Abstract

Purpose – There are many library automation packages available as open-source software, comprising two modules: a staff-client module and an online public access catalogue (OPAC). Although the OPACs of these library automation packages provide advanced features for searching and retrieving bibliographic records, none of them facilitates full-text searching. Most of the available open-source digital library software facilitates indexing and searching of full-text documents in different formats. This paper makes an effort to enable full-text search features in the widely used open-source library automation package Koha, by integrating it with two open-source digital library software packages, Greenstone Digital Library Software (GSDL) and Fedora Generic Search Service (FGSS), independently.

Design/methodology/approach – The implementation is done by making use of the Search/Retrieve via URL (SRU) feature available in Koha, GSDL and FGSS. The full-text documents are indexed both in Koha and in the digital library software (GSDL or FGSS).

Findings – Full-text searching capability in Koha is achieved by integrating either GSDL or FGSS into Koha and by passing an SRU request to GSDL or FGSS from Koha. The full-text documents are indexed both in the library automation package (Koha) and in the digital library software (GSDL, FGSS).

Originality/value – This is the first implementation enabling the full-text search feature in library automation software by integrating it with digital library software.

Keywords Information retrieval, Open systems, Systems software, Online catalogues, Library automation

Paper type Research paper

Received 30 April 2010
Revised 3 September 2010
Accepted 6 January 2011

Program: electronic library and information systems, Vol. 45 No. 2, 2011, pp. 231-239
© Emerald Group Publishing Limited, 0033-0337
DOI 10.1108/00330331111129750

This work was carried out as part of a project entitled “Enhancing Knowledge Innovation Culture of Libraries through Union Catalogues”, carried out at the National Centre for Science Information, Indian Institute of Science, Bangalore, India, funded by the International Development Research Centre, Canada.

1. Introduction
Libraries were looking forward to better technologies even before the advent of computers. The introduction of the typewriter into libraries was a revolutionary concept in the late 1800s. Later stages of modernisation witnessed the introduction of unit record equipment, the move to offline computerisation, and the use of online systems. By the 1960s, computers were being used for the production of machine-readable cataloguing (MARC) records by the Library of Congress (LOC). Between 1965 and 1968, the LOC began the MARC I project, followed quickly by MARC II. MARC was designed as a way of ‘tagging’ bibliographic records using three-digit numbers to identify fields. In 1974, the MARC II format became the basis of a standard incorporated by the National Information Standards Organization (NISO). This was a significant development because the standards created meant that a bibliographic record could be read and transferred by computer between different library systems (Nelson, 1991; Williams, 2002). This was followed by the online systems of the 1970s (Griffiths and King, 2002). The 1980s saw the advent of microcomputers and the emergence of CD-ROM technology (Segesta and Reid-Green, 2002). Finally, we come to the internet revolution of the 1990s.

Along with MARC, another important standard developed for searching and retrieving catalogue records over the network is the ANSI/NISO Z39.50 standard. First approved in 1992 and expanded in 1995, Z39.50 was ratified as the international standard ISO 23950 in 1998 and reaffirmed by NISO in 2003 (ANSI/NISO, 2003). The combination of MARC for representing catalogue records and Z39.50 for locating and obtaining them enables a wide range of bibliographic applications to be built, achieving levels of semantic coherence not possible in most other application domains (Taylor et al., 2004).

The pre-computer era of punched cards, the early development of data structures and standards for library applications (especially MARC), and many other similar early applications were perceived in terms of transforming libraries. However, the dizzying advances in both computing power and network capacity, combined with the increasing availability of networked resources, have provided a range of new activities for libraries in the present period: the internet era. Libraries continue to be early adopters and developers of technology. The new developments can be attributed to factors such as: the increasing availability of, and the concomitant demand for, networked electronic resources; increasing costs for these networked electronic resources; re-envisioned scholarly communication and publishing processes; and the advent and growth of virtual classrooms and remote students, especially in an academic environment (Graham, 2002). Lynch (2000) has identified this as the third phase of information technology as applied to libraries – “the availability of content in the electronic form”.

Open-source software (OSS) is computer software for which the source code and certain other rights normally reserved for copyright holders are provided under a software licence that meets the open-source definition or is in the public domain (see www.opensource.org/). This permits users to use, change, and improve the software, and to redistribute it in modified or unmodified forms. It is very often developed in a public, collaborative manner. OSS is the most prominent example of open-source development and is often compared to user-generated content. The term OSS originated as part of a marketing campaign for free software (Muir, 2005). An integrated library automation package (ILAP) is library automation software that integrates all the activities and routines of the library. An ILAP is in effect an enterprise resource planning system for a library, handling routine operations such as tracking orders placed for documents, checking bills paid, and recording the documents owned by the library and the patrons who have borrowed them. In other words, it is an integrated package where all the library activities such as acquisitions, cataloguing, circulation, serials control and the online public access catalogue (OPAC) are automated. There are many ILAPs available in the market that meet the needs, as well as budgets, of libraries. However, with the OSS movement catching up, a few open-source ILAPs are also available, which are comparable with any commercial ILAP. Examples include Koha, Evergreen, OPAL, PhpMyBibli, OpenBook and OpenBiblio. Among these, Koha is the first open-source library automation software and is widely used (Anuradha, 2009; Breeding, 2007; Bissels, 2008).

A digital library is a library in which collections are stored in digital formats (as opposed to print, microform, or other media) and accessible by computers. The digital content may be stored locally, or accessed remotely via computer networks. A digital library is a type of information retrieval system and the software used to enable this functionality is known as digital library software (DLS). A more detailed working definition of digital library is available at the Digital Library Federation (1998). There are many open-source DLS available, including Greenstone Digital Library Software (GSDL) (www.greenstone.org), DSpace (www.dspace.org) and Fedora Commons (www.fedora-commons.org).

2. Need for the study
Most of the open-source ILAPs are either in development or do not have fully functional modules. Of them, a few satisfy the key functional requirements of an ILAP and support essential modules like acquisition, cataloguing, circulation, serials control and OPAC. However, full-text search and retrieval features are not available in any ILAP, although they are available in most DLS. As more and more content is available in electronic form, libraries have to facilitate full-text searching in their OPAC. Hence an attempt is made here to integrate DLS into ILAP. Doing so will result in an ILAP with full-text indexing, searching and retrieving options. Also, DLS can index documents of a large size and in different formats such as PDF, .doc, .rtf and .ppt. A few DLS store documents in compressed form, thereby saving hard disk space.

3. Objective and methodology of the study
The main objective is to facilitate full-text indexing and searching in an ILAP by integrating it with a DLS. An effort is made to add full-text indexing and searching features in Koha Version 3.0.2 by integrating it with GSDL Version 2.83 and also with the Fedora Generic Search Service (FGSS), each independently.

Among open-source ILAPs, Koha is widely used, and both GSDL and Fedora Commons are popular DLS. Fedora Commons provides a full-text searching feature when combined with FGSS. All three software packages are compatible with many library standards such as SRU/W, Z39.50, and Dublin Core (DC) metadata definitions. To facilitate full-text searching in Koha, the Search/Retrieve via URL (SRU) functionality available in Koha, GSDL and FGSS is used: an SRU request is sent from Koha to the respective DLS for full-text searching.

4. Overview of software used
4.1 Koha
In 1999, when the Horowhenua Library Trust (HLT) in New Zealand was looking for a Year 2000 (Y2K) compliant replacement for their library system, Katipo Communications proposed a new system, using open-source tools, to be released under the general public licence (GPL). Koha (the Maori word for “gift” or “donation”) went live at HLT in January 2000 and was the world’s first open-source ILAP. It is distributed under the GNU GPL.


The current versions are Koha 3.0.4 (Linux platform only) and Koha 2.2.9 (for Windows and other platforms) (http://koha.org). It runs on different platforms, including Linux, Mac OS X, FreeBSD, Solaris, and Windows. Originally developed on the Linux OS, Koha is written in Perl, uses the Apache web server, and supports multiple RDBMS such as MySQL and PostgreSQL. The OPAC interface is in CSS with XHTML. It supports all major library standards such as MARC record import/export, Z39.50 and SRU/W. Koha 3.x supports the Zebra full-text search engine as a backend, in addition to MySQL/PostgreSQL. Records are stored internally in an SGML-like format and can be retrieved in MARCXML, Dublin Core, MODS, RSS, Atom, RDF-DC, SRW-DC, OAI-DC, and EndNote; and the OPAC can be used by citation tools such as Zotero. Koha’s default installation supports running Zebra, which is configured to support SRU queries on bibliographic and authority data. Zebra itself is capable of detecting whether an incoming request is Z39.50 or HTTP, and responds with SRU if the request is HTTP.

4.2 GSDL
GSDL (www.greenstone.org) is a suite of software for building, publishing and distributing digital library collections either on the internet or on CD-ROM. It is compatible with many library standards such as SRU/W, Z39.50, and MARC record import. These features of Greenstone make it an appropriate option for integrating it with a library automation package for full-text indexing and searching. It is produced by the New Zealand Digital Library Project at the University of Waikato, and developed and distributed in co-operation with UNESCO and the Human Info NGO.

4.3 Fedora Commons
Fedora (Flexible Extensible Digital Object Repository Architecture, not to be confused with the Linux distribution named Fedora) has a modular architecture built on the principle that interoperability and extensibility are best achieved by the integration of data, interfaces, and mechanisms (i.e. executable programs) as clearly defined modules. Fedora is a digital asset management (DAM) architecture, on which many types of digital library, institutional repository, digital archive, and digital library system might be built. Fedora is the underlying architecture for a digital repository, and is not a complete management, indexing, discovery, and delivery application. However, by using the FGSS, which is part of the Fedora Service Framework, indexing of Fedora FOXML records, including the text content of data streams and the results of disseminator calls, is possible. More details can be obtained from the Fedora Commons official site: http://fedora-commons.org

4.4 Full-text search with SRU
With the increasing ubiquity of XML as a canonical data-interchange meta-format, there is a perceived need to recast the MARC formats in terms of XML. The first such effort to gain widespread acceptance was the Library of Congress’s (2003) MARCXML, an XML schema designed to represent MARC21 records. In response to this perception, work began in December 2000 on recasting the powerful and expressive Z39.50 semantics in terms of mechanisms more readily understood in the contemporary information environment. The result of this effort is a family of two new protocols: SRW (the Search/Retrieve Web-service) and SRU (Taylor and Dickmeiss, 2005). SRW/U is an XML-based protocol designed to be a low-barrier-to-entry solution for searching and other information retrieval operations. It uses existing, well-tested, and easily available technologies, such as URI, XML, SOAP, REST, HTTP, and XPath (Morgan, 2004). SRW/U essentially comes in two flavours: Representational State Transfer (REST) and Simple Object Access Protocol (SOAP). A “REST-ful” web service usually encodes commands from a client to a server in the query string of a URL. Each name/value pair of the query string specifies an input parameter for the server. Once the request is received, the server parses these name/value pairs, does some processing using them as input, and returns the results as an XML stream. The shape of the query string as well as the shape of the XML stream is dictated by the protocol. By definition, the communication process between the client and the server is facilitated over an HTTP connection (Morgan, 2004). SRU/W uses the Common Query Language (CQL) as the format for submitting queries. Although CQL is a formal language for representing queries to information retrieval systems, it has been designed to be human readable and writable. It allows both simple queries and very complex and powerful ones (Anuradha and Sivakaminathan, 2009). More details are available at: www.loc.gov/standards/sru/index.html
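As a rough illustration of the REST flavour described above, the following Python sketch shows how a server might recover the name/value pairs from the query string of an incoming request URL. The host sru.example.org, the function name read_sru_request and the choice to treat only version and operation as mandatory are assumptions made for this example; they are not taken from Koha, GSDL or FGSS.

```python
from urllib.parse import urlsplit, parse_qs

# Parameters this sketch treats as mandatory (an assumption for illustration).
REQUIRED = {"version", "operation"}


def read_sru_request(url):
    """Split a REST-style request URL into its name/value pairs."""
    pairs = parse_qs(urlsplit(url).query)
    # parse_qs returns a list of values per name; keep the first of each.
    params = {name: values[0] for name, values in pairs.items()}
    missing = REQUIRED - params.keys()
    if missing:
        raise ValueError(f"missing SRU parameter(s): {sorted(missing)}")
    return params


# A hypothetical request carrying a small CQL query in the "query" parameter.
request = (
    "http://sru.example.org/catalogue?version=1.1&operation=searchRetrieve"
    "&query=dc.title%3Dlibrary+and+dc.creator%3Dwitten&maximumRecords=5"
)
params = read_sru_request(request)
print(params["operation"])
print(params["query"])
```

The decoded query value is an ordinary CQL expression combining two index searches with a boolean operator, which shows why CQL remains readable even after it has travelled inside a URL.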

5. Implementation details of Koha-GSDL integration
To implement the full-text search and retrieval feature in Koha V3.0.2, the software is downloaded and installed on Ubuntu 8.04. Koha 3.0.2, available for the Linux platform, contains four components, namely: OPAC, intranet, Daemons, and database. The OPAC is used to search and locate library holdings. There are two components in the OPAC package. The main OPAC contains all the Perl scripts and the HTML OPAC contains all the HTML templates for all the HTML pages. In the OPAC, searching can be done in different ways:

• advanced search;
• search by subject; and
• basic search.

Koha allows field-based searching by making use of field tags such as Title, Author, ISBN. The “intranet” is the “back office” and the “front desk” side of the system. There are nine components in the intranet package. The main intranet component contains all the main Perl scripts to handle the navigation, login and logout, and provides connection to the other components. The HTML intranet component contains all the HTML templates for all pages for the intranet side of the system. Daemons contain all the scripts for all the “daemons” in the system. There is currently only one daemon – the Z39.50 Daemon component – in the Daemons package. This component provides the connectivity to Z39.50 servers for querying of library material using the Z39.50 protocol. Database components make use of the MySQL database, which has 116 tables. One of the tables is “biblioitems”, which contains all the metadata information and passes this to the Zebra search engine in MARC format for indexing.

GSDL v2.83, downloaded from www.greenstone.org, was also installed. In GSDL, after indexing, nine folders are created for each collection (a set of documents forms a collection), namely: archives; building; index; etc; import; perllib; images; macros; tmp. The documents to be indexed are kept in import. The collect.cfg file in the etc folder determines the look and feel of the collection.

A few freely available e-books were downloaded and catalogued in Koha, giving all the required bibliographical details. The same e-books were indexed in GSDL. In the Koha OPAC the full-text search option is added and the e-books are searched through the SRU feature available both in Koha and GSDL. In the following paragraphs the full-text indexing and full-text searching modules are explained in detail.

The bibliographical details of full-text documents are catalogued in Koha and also indexed in GSDL for carrying out a full-text search. In the Koha cataloguing module, the URL of the full-text document is specified under the tag 856, which is a repeatable field, so multiple URLs for the same document can be given. After filling in the required cataloguing details in Koha, the record is saved. After saving the catalogue information, a unique document number is assigned by Koha for each catalogued document. This document number, along with other required metadata details and the full-text document location obtained through the catalogue form, is passed on to GSDL for carrying out the full-text indexing. This is enabled by modifying additems.tmpl in Koha. A PHP script is invoked to carry out indexing in GSDL through the command line collection building option.
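The hand-off described above can be pictured with a small data-structure sketch in Python. This is only an illustration of the information that travels from the catalogue to the digital library: the class CatalogueRecord, the function indexing_handoff, the field names and the example URLs are all hypothetical, and the actual implementation uses a modified additems.tmpl and a PHP script rather than this code.

```python
from dataclasses import dataclass, field


@dataclass
class CatalogueRecord:
    """Hypothetical stand-in for a saved catalogue record."""
    document_number: int          # unique number assigned on save
    title: str
    author: str
    # Tag 856 is repeatable, so one record may carry several URLs.
    urls_856: list = field(default_factory=list)


def indexing_handoff(record):
    """Collect what is passed to the digital library for full-text indexing."""
    if not record.urls_856:
        raise ValueError("record has no 856 URL, so there is nothing to index")
    return {
        "document_number": record.document_number,
        "metadata": {"title": record.title, "creator": record.author},
        "documents": list(record.urls_856),
    }


# Illustrative record with two full-text locations for the same document.
record = CatalogueRecord(
    document_number=42,
    title="An example e-book",
    author="A. Author",
    urls_856=[
        "http://library.example.org/ebooks/42.pdf",
        "http://mirror.example.org/ebooks/42.pdf",
    ],
)
handoff = indexing_handoff(record)
print(handoff["document_number"], len(handoff["documents"]))
```

Carrying the document number alongside the full text is what later allows a digital library search result to link back to the matching catalogue record.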

To enable full-text search in Koha and to display the GSDL results in the Koha OPAC, four files are written, two Perl scripts and their two templates:

(1) fulltextsearch.pl;
(2) fulltextsearch1.pl;
(3) fulltextsearch.tmpl; and
(4) fulltextsearch1.tmpl.

In fulltextsearch.pl the query term is obtained from the user and passed to GSDL through the SRU technique. The URL that is passed is split into four parts, with a question mark as the delimiter. The four parts are: GSDL location in the system; collection name; required query; and Do option.

The following example aims to explain this process:

Query string passed through SRU from Koha to GSDL:
First part (GSDL location): http://dharmaganja.ncsi.iisc.ernet.in/cgi-bin/library?
Second part (GSDL collection name): e=p-01000-00---off-0koha--00-1--0-10-0---0---0prompt-10---4-------0-1--11-en-50---20-about---00-3-1-00-0-0-11-1-0utfZz-8-00 koha
Third part (Query term and other search details): fqf=TX&t=0&q=software
Fourth part (DO option): perform

In Koha the Explain pragma can be configured via the koha-conf.xml file and can be retrieved by a search interface using a query such as:
http://<serverhost>:9999/biblios?version=1.1&operation=explain

Queries for results can be run using syntax such as the following:
http://<serverhost>:9999/biblios?version=1.1&operation=searchRetrieve&query=harry&startRecord=1&maximumRecords=20&recordSchema=mods

The Koha SRU link looks like:
http://<IP>:Port/cgi-bin/koha/opac-search.pl?idx=ti&q=Query
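The explain and searchRetrieve requests shown above can also be assembled programmatically. The sketch below is a minimal Python illustration, assuming an SRU server on port 9999 with a database named biblios as in the examples; the helper name koha_sru_url and the host localhost are hypothetical stand-ins for <serverhost>.

```python
from urllib.parse import urlencode


def koha_sru_url(server_host, operation, port=9999, database="biblios", **params):
    """Build an SRU 1.1 request URL of the shape used in the examples above."""
    # version and operation come first, followed by any extra parameters.
    query = {"version": "1.1", "operation": operation, **params}
    return f"http://{server_host}:{port}/{database}?{urlencode(query)}"


# "localhost" is an assumed host name used only for this sketch.
explain_url = koha_sru_url("localhost", "explain")
search_url = koha_sru_url(
    "localhost",
    "searchRetrieve",
    query="harry",
    startRecord=1,
    maximumRecords=20,
    recordSchema="mods",
)
print(explain_url)
print(search_url)
```

Using urlencode rather than string concatenation means that query terms containing spaces or punctuation are escaped correctly before they reach the server.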


The GSDL retrieved results are displayed in another window. To get back to the Koha search window, a link is given to the document in Koha from the GSDL retrieved results window.

Thus by indexing the bibliographical details in Koha and the full text in GSDL, full-text searching in Koha is enabled by sending an SRU request to GSDL. By linking the documents in GSDL to those in Koha, the user can get back to the Koha OPAC page after viewing the full-text documents in GSDL.

A prototype search system using Koha and GSDL can be viewed at: http://dharmaganja.ncsi.iisc.ernet.in:8082/

6. Implementation details of Koha-FGSS integration
By default, the Koha OPAC page does not have a full-text search option. Hence the Koha OPAC page is modified to have a full-text search option, and from this an SRU request is invoked to do a full-text search in Fedora and the results are displayed on the same screen. In the SRU request the specific full-text database on FGSS is specified. The full-text documents in FGSS are indexed by giving a link to records in the Koha database so that a user can get back to the Koha OPAC page after viewing the full text in FGSS.

6.1 Changes in the Koha OPAC page
The full-text option is added to the Koha OPAC by modifying the masthead.inc which is located in http://koha/opac/htdocs/opac-tmpl/prog/en/includes. By clicking the full-text search option, a PHP script is invoked, which divides the screen into two parts: the first part to have a query box and a submit button to submit the query and the second part to display results from FGSS.

6.2 Passing SRU request to FGSS and getting the results
The request sent to FGSS consists of a minimum of four parts:

(1) FGSS location;

(2) FGSS operation (or details);

(3) Query term; and

(4) Do operation.

The following example explains this:

Query string passed through SRU from Koha to FGSS:
First part (FGSS location): http://dharmaganja.ncsi.iisc.ernet.in:8080/fedoragsearch/rest
Second part (FGSS details, separated from the first part by ?): operation=gfindObjects
Third part (Query term and other search details, separated from the second part by &): query=java
Fourth part (DO option): perform

Once this request is sent to FGSS and the search is carried out, the retrieved results are displayed in the second part of the window. While indexing in FGSS, a link is given to the corresponding Koha records and hence after viewing the full-text documents users can get back to the Koha OPAC.
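The first three parts of the request described above can be joined as in the following Python sketch. It is a minimal illustration only: the function name fgss_request_url is hypothetical, and the fourth part (the Do option) is left out because the example does not show how it is attached to the URL.

```python
from urllib.parse import urlencode

# First part: FGSS location, taken from the example above.
FGSS_LOCATION = "http://dharmaganja.ncsi.iisc.ernet.in:8080/fedoragsearch/rest"


def fgss_request_url(query_term, location=FGSS_LOCATION):
    """Join the FGSS location, operation and query term into one request URL."""
    # Second and third parts: "?" separates them from the location,
    # "&" separates them from each other.
    details = urlencode({"operation": "gfindObjects", "query": query_term})
    return f"{location}?{details}"


url = fgss_request_url("java")
print(url)
```

Because the query term comes from a search box, encoding it with urlencode keeps multi-word or punctuated queries from breaking the request.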

A prototype search system using Koha and FGSS can be viewed at: http://dharmaganja.ncsi.iisc.ernet.in/fgs/index.html


7. Conclusion
The objective of enabling full-text searching capability in Koha was achieved by integrating either GSDL or FGSS into Koha and by passing an SRU request to GSDL or FGSS from Koha. However, from Koha Version 3.x onwards, Koha uses the Zebra search engine along with MySQL. Zebra is a very powerful search engine, which indexes and searches full-text documents. By making changes in Zebra configuration files and MARC fields, full-text searching can be carried out in Koha itself.

References

ANSI/NISO (2003), Information Retrieval Z39.50: Application Service Definition and Protocol Specification, NISO Press, Bethesda, MD.

Anuradha, K.T. (2009), “Koha: an overview”, paper presented at the NCSI-IDRC Workshop on Integrated Library Automation Packages: Basics of KOHA and NewGenLib, Indian Institute of Science, Bengalooru.

Anuradha, K.T. and Sivakaminathan, R. (2009), “Enhancing full text search capability in library automation package: a case study with Koha and Greenstone Digital Library Software”, paper presented at the International Conference on Computer Science and Information Technology, Singapore, 28-29 October, available at: http://dharmaganja.ncsi.iisc.ernet.in/documents/koha-gsdl.pdf

Bissels, G. (2008), “Implementation of an open-source library management system: experiences with Koha 3.0 at the Royal London Homoeopathic Hospital”, Program: electronic library and information systems, Vol. 42 No. 3, pp. 303-14.

Breeding, M. (2007), “An update on open-source ILS”, Computers in Libraries, Vol. 27 No. 2, pp. 27-9.

Digital Library Federation (1998), “A working definition of digital library”, available at: www.diglib.org/about/dldefinition.htm (accessed 6 January 2011).

Graham, R. (2002), “Guest editor’s conclusion: reflections on the evolution of library computing”, IEEE Annals of the History of Computing, Vol. 24 No. 3, pp. 75-8.

Griffiths, J.-M. and King, D.W. (2002), “US information retrieval system evolution and evaluation (1945-1975)”, IEEE Annals of the History of Computing, Vol. 24 No. 3, pp. 35-55.

Library of Congress (2003), “MARCXML, MARC21 XML schema”, available at: www.loc.gov/standards/marcxml/

Lynch, C.A. (2000), “From automation to transformation: forty years of libraries and information technology in higher education”, EDUCAUSE Review, Vol. 35 No. 1, pp. 60-8.

Morgan, E.L. (2004), “Introduction to search/retrieve URL service (SRU)”, Ariadne, No. 40, available at: www.ariadne.ac.uk/issue40/morgan/ (accessed 6 January 2011).

Muir, S.P. (2005), “An introduction to the open-source software issues”, Library Hi Tech, Vol. 23 No. 4, pp. 465-8.

Nelson, N.M. (Ed.) (1991), “Library technology 1970-1990: shaping the library of the future, research contributions from the 1990 Computers in Libraries Conference”, Computers in Libraries, No. 25, Meckler, Westport, CT, supplement.

Segesta, J. and Reid-Green, K. (2002), “Harley Tillitt and computerized library searching”, IEEE Annals of the History of Computing, Vol. 24 No. 3, pp. 23-34.

Taylor, M. and Dickmeiss, A. (2005), “Delivering MARC/XML records from the Library of Congress catalogue using the open protocols SRW/U and Z39.50”, paper presented at 71st IFLA, Oslo, Norway, available at: http://archive.ifla.org/IV/ifla71/papers/065e-Taylor_Dickmeiss.pdf (accessed 6 January 2011).


Taylor, M., Hammer, S., Sanders, A., Dickmeiss, A., Sanderson, R. and Lav, A. (2004), “ZOOM: the Z39.50 object-orientation model, v1.4”, available at: http://zoom.z3950.org/api/zoom-1.4.html (accessed 6 January 2011).

Williams, R.V. (2002), “The use of punched cards in US libraries and documentation centers, 1936-1965”, IEEE Annals of the History of Computing, Vol. 24 No. 2, pp. 16-33.

Further reading

Library of Congress (2005), “MARC standards”, available at: www.loc.gov/marc/ (accessed 6 January 2011).

Library of Congress (2007), “SRU: search/retrieval via url”, available at: www.loc.gov/standards/sru/simple.html (accessed 6 January 2011).

Corresponding author
K.T. Anuradha can be contacted at: [email protected]

