7/31/2019 OpenSearch Installation Guide
Scbl OpenSearch build 27012012
OpenSearch Installation Guide
For Developers
1. Cygwin Linux Shell Emulator
2. Installing Solr server in Eclipse
3. Running Nutch
Cygwin Linux Shell Emulator
To run the Nutch crawler in our local Windows environment we need Cygwin, which provides a Linux-like shell (emulator).
The following steps install Cygwin in the local environment.
Cygwin Installation
Step-1: Download the Cygwin installer from the Cygwin online repository or from the local repository (cygwin-25012012.rar)
Step-2: Run the installer and select Install from Internet, or Install from Local Directory if you already have the setup files or have downloaded them from the local repository
Step-3: Select the Root Directory for Cygwin and finish the installation
Nutch & Solr Preparation
Extract the nutch-solr-latest-cimsa-build (http://10.10.5.29:8080/alfresco/d/a/workspace/SpacesStore/3ad75fed-7497-4490-a6ba-7259ebc9af7e/nutch-solr-27012012.rar) to any folder below the Cygwin Root Directory, e.g.
C:\cygwin\home\{your_user_name} or C:\cygwin\home\{your_user_name}\nutch-solr
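Inside the Cygwin shell the Root Directory is seen as /, so a folder below it has a shorter path there. Cygwin ships the cygpath tool for this conversion (cygpath -u). The helper below is only a sketch of the mapping: the name to_cygwin_path and the root C:\cygwin are assumptions made for illustration.

```shell
# Hypothetical helper: map a Windows path below the Cygwin root to the
# path seen inside the Cygwin shell. Assumes the root is C:\cygwin.
# On a real installation, prefer: cygpath -u 'C:\cygwin\home\alice'
CYGWIN_ROOT='C:\cygwin'

to_cygwin_path() {
  # Drop the root-directory prefix (if present) ...
  rest=${1#"$CYGWIN_ROOT"}
  # ... then turn every backslash into a forward slash.
  printf '%s\n' "$rest" | tr '\\' '/'
}

to_cygwin_path 'C:\cygwin\home\alice\nutch-solr'
```

A path that does not start with the assumed root is not remapped; only its backslashes are converted.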
Installing Solr server in Eclipse
Solr is the popular, blazing fast open source enterprise search platform from the Apache Lucene project.
Its major features include powerful full-text search, hit highlighting, faceted search, dynamic clustering,
database integration, rich document (e.g., Word, PDF) handling, and geospatial search. Solr is highly
scalable, providing distributed search and index replication, and it powers the search and navigation
features of many of the world's largest internet sites.
We can easily access Solr from our development environment by adding the Solr Tomcat server
to the Eclipse server interface.
Prerequisite
Step-1: Extract apache-ant into C:\Development\tools\
Step-2: Add C:\Development\tools\apache-ant-1.8.2\bin to the PATH environment variable
See http://en.wikipedia.org/wiki/Environment_variable for more information
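A quick way to confirm that the PATH change took effect is to ask the shell whether it can find the tool. The sketch below assumes a freshly opened Cygwin (or other POSIX) shell that inherits the Windows PATH; require_tool is a hypothetical helper name, not part of Ant or Cygwin.

```shell
# Hypothetical helper: report whether a command can be found on PATH.
require_tool() {
  if command -v "$1" >/dev/null 2>&1; then
    echo "$1 found: $(command -v "$1")"
  else
    echo "$1 not found on PATH" >&2
    return 1
  fi
}

# Usage, after reopening the shell so the new PATH is picked up:
#   require_tool ant && ant -version
```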
Importing Solr source
Step-1: Extract the Solr source (http://10.10.5.29:8080/alfresco/d/d/workspace/SpacesStore/31d23f05-6724-4d99-9bfb-97df583bc754/apache-solr-3.4.0-src.tgz) to any location on your hard drive
Step-2: Open a command line tool and go to the Solr source home
Step-3: Run ant eclipse in the root of the Solr source
Step-4: Import the source folder into the Eclipse IDE
Step-5: Test-build Solr using the build.xml inside the solr subfolder (not the build.xml in the root project
folder) by running the default Solr task, usage, in the Ant window
Step-6: Start run-example to check that it runs correctly
Preparing tomcat
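Step-1: Download apache-tomcat (http://10.10.5.29:8080/alfresco/d/a/workspace/SpacesStore/2e3e08aa-0d4a-4a81-a75a-1fc73ddbb10f/apache-tomcat-6.0.35-windows-x64.zip) and extract the content into C:\Development\servers;
{your_tomcat_home} will be C:\Development\servers\apache-tomcat-6.xx.xx
Step-2: Change the port numbers in {your_tomcat_home}/conf/server.xml, adding 100 to every default port
number to avoid conflicts with other servers. Alternatively, download server.xml (http://10.10.5.29:8080/alfresco/d/a/workspace/SpacesStore/83e8d88b-743d-44cc-a8aa-aec645dc697c/server.xml) and put it into
{your_tomcat_home}/conf/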
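The port change can also be sketched as a small text filter. The four numbers below are the stock Tomcat 6 defaults (8005 shutdown, 8080 HTTP, 8009 AJP, 8443 SSL redirect); shift_tomcat_ports is a hypothetical helper name, and the one-line input is a simplified stand-in for a real server.xml entry.

```shell
# Hypothetical helper: add 100 to each default Tomcat 6 port number
# found as a quoted attribute value in the text read from stdin.
shift_tomcat_ports() {
  sed -e 's/"8005"/"8105"/g' \
      -e 's/"8080"/"8180"/g' \
      -e 's/"8009"/"8109"/g' \
      -e 's/"8443"/"8543"/g'
}

# Demo on a simplified connector line.
echo '<Connector port="8080" redirectPort="8443" />' | shift_tomcat_ports
```

To apply it to a real file, it is safer to write to a copy first, for example shift_tomcat_ports < server.xml > server.xml.new, and to review the result before replacing the original.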
Configure tomcat in Eclipse
Step-1: Go to Window > Preferences and then go to Server > Runtime Environments
Step-2: Click Add, select Apache > Apache Tomcat v6.0 and click Next
Step-3: Click Browse, select the {your_tomcat_home} folder and click Finish
Step-4: Go to Window > Show View > Servers
Step-5: Right-click in the empty space and select New > Server
Step-6: Select Apache > Tomcat v6.0 Server and click Finish
Step-7: Double-click the server, set Server Locations to Use Tomcat Installation, and save
Step-8: Test-run the server
Build and Deploy in Tomcat
Step-0: Make sure the server is not running (stop the Apache Tomcat v6.0 server within Eclipse)
Step-1: Download solr-tomcat-config (http://10.10.5.29:8080/alfresco/d/a/workspace/SpacesStore/a01ad84e-9639-473f-bef1-754f9863284b/solr-tomcat-config.rar) and put it inside {your_tomcat_home}/ so that you have
{your_tomcat_home}/solr
Step-2: Download the solr.xml Tomcat context (http://10.10.5.29:8080/alfresco/d/a/workspace/SpacesStore/a31bc582-f966-449a-8e86-8d6491bd5995/solr.xml) and put it in {your_tomcat_home}/conf/Catalina/localhost/
See: wiki.apache.org/solr/SolrTomcat
Step-3a (optional): Run ant dist from solr/build.xml within Eclipse to build a new Solr from source
Step-3b (optional): Copy {your_solr_project_location}/solr/dist/apache-solr-*.war to
{your_tomcat_home}/solr/solr.war
Step-4: Test http://localhost:8180/solr/pengkajian/admin/ and http://localhost:8180/solr/taplai/admin/
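The final test can also be done from the command line instead of a browser. This is a minimal sketch, assuming curl is installed in Cygwin and Tomcat is running on port 8180; solr_admin_url and check_core are hypothetical helper names.

```shell
# Base URL of Solr under Tomcat, as used in this guide.
SOLR_BASE='http://localhost:8180/solr'

# Build the admin URL for one core.
solr_admin_url() {
  echo "$SOLR_BASE/$1/admin/"
}

# Ask the admin page of a core for its HTTP status (needs curl).
check_core() {
  status=$(curl -s -o /dev/null --max-time 5 -w '%{http_code}' "$(solr_admin_url "$1")")
  if [ "$status" = "200" ]; then
    echo "$1: OK"
  else
    echo "$1: HTTP $status"
    return 1
  fi
}

# Print the two URLs that are checked.
for core in pengkajian taplai; do
  solr_admin_url "$core"
done

# Usage, with Tomcat started:
#   check_core pengkajian && check_core taplai
```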
Running Nutch
Apache Nutch is an open source web-search software project. Stemming from Apache Lucene, it now
builds on Apache Solr adding web-specifics, such as a crawler, a link-graph database and parsing support
handled by Apache Tika for HTML and an array of other document formats.
These are the steps for running Nutch using Cygwin.
Before running Nutch, make sure that Solr has been started correctly.
Step-1: Open another instance of Cygwin
Step-2: Go to the Nutch home directory (Nutch is located in the previously extracted nutch-solr-latest-cimsa-build). For example, if you extracted the file to C:\cygwin\home\{your-user-name}\ then you
should type: cd /home/{your-user-name}/nutch-1.3
Step-3: Go to the runtime/local directory of Nutch: cd runtime/local
Step-4: Set the environment variable for the Nutch Java home: export
NUTCH_JAVA_HOME='C:\Development\jdk\jdk1.6.0_24'
Step-5a: Run Nutch to fill the pengkajian Solr database: bin/nutch crawl urlspengkajian -solr
http://localhost:8180/solr/pengkajian/ -depth 3 -topN 5
The output will then tell you that the pengkajian database has been updated correctly.
Step-5b: Run Nutch to fill the taplai Solr database: bin/nutch crawl urlstaplai -solr
http://localhost:8180/solr/taplai/ -depth 3 -topN 5
The output will then tell you that the taplai database has been updated correctly.
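The two crawl commands differ only in the core name, so they can be generated from one template. The sketch below only prints the commands (a dry run); nutch_crawl_cmd is a hypothetical helper name, and the seed directory names are copied as written in this guide.

```shell
# Hypothetical helper: print the crawl command for one Solr core,
# using the URL, depth and topN values from the steps in this guide.
nutch_crawl_cmd() {
  echo "bin/nutch crawl urls$1 -solr http://localhost:8180/solr/$1/ -depth 3 -topN 5"
}

# Dry run: show the command for each core.
for core in pengkajian taplai; do
  nutch_crawl_cmd "$core"
done
```

To actually run a crawl from the runtime/local directory, the printed command can be passed to the shell, for example nutch_crawl_cmd pengkajian | sh.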
Appendix A: Test Running Solr in Cygwin
These are the steps for running Solr using Cygwin.
Step-1: Open the Cygwin Terminal
Step-2: Go to the Solr home directory (Solr is located in the previously extracted nutch-solr-latest-cimsa-build). For example, if you extracted the file to C:\cygwin\home\{your-user-name}\ then you
should type: cd /home/{your-user-name}/solr-3.4.0
Tip: you can list the directory and confirm that you are in the right folder by typing ls
Step-3: Go to the example directory: cd example
Step-4: Start the Jetty-based Solr server to test that it runs correctly: java -jar start.jar
The console output shows whether Solr has started correctly (note that this server opens on port 8983).
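Jetty takes a moment to start, so a check from a second Cygwin window may need a few attempts. The sketch below polls a URL until it answers. It assumes curl is installed in Cygwin; wait_for_url is a hypothetical helper name, and the admin path in the usage line is the usual location for the Solr example server, which this guide does not state explicitly.

```shell
# Hypothetical helper: poll a URL until it answers or attempts run out.
# Arguments: the URL and the maximum number of attempts.
wait_for_url() {
  url=$1
  tries=$2
  while [ "$tries" -gt 0 ]; do
    # -f makes curl return non-zero on HTTP errors; output is discarded.
    if curl -sf -o /dev/null --max-time 5 "$url"; then
      echo "up: $url"
      return 0
    fi
    tries=$((tries - 1))
    # Pause between attempts, but not after the last one.
    if [ "$tries" -gt 0 ]; then
      sleep 2
    fi
  done
  echo "no answer: $url" >&2
  return 1
}

# Usage, after java -jar start.jar has been launched in another window:
#   wait_for_url http://localhost:8983/solr/admin/ 10
```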