Hadoop Eclipse

  • 8/3/2019 Hadoop Eclipse

    Programming Map-Reduce (Hadoop) with Eclipse

    Wei-Yu Chen
    NCHC
    2008/05/27

    see more: http://trac.nchc.org.tw/cloud/

    1. Prepare

    System:

    Ubuntu 7.10
    Hadoop 0.16

    Requirements:

    Eclipse (3.2.2)

    $ apt-get install eclipse

    Java 6

    $ apt-get install sun-java6-bin sun-java6-jdk sun-java6-jre sun-java6-plugin

    We suggest removing the default Java compiler, gcj:

    $ apt-get purge java-gcj-compat

    Append the following lines to /etc/bash.bashrc to set up the Java class path:

    export JAVA_HOME=/usr/lib/jvm/java-6-sun
    export HADOOP_HOME=/home/waue/workspace/hadoop/
    export CLASSPATH=.:$JAVA_HOME/lib/dt.jar:$JAVA_HOME/lib/tools.jar

    Paths used in this guide:

    Name          Path
    Hadoop Home   /home/waue/workspace/hadoop/
    Java Home     /usr/lib/jvm/java-6-sun
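
Appending lines to a shared startup file by hand tends to leave duplicates when the setup is repeated. The sketch below shows one way to add the three export lines only when they are missing. The function names `append_line_once` and `add_java_env` are illustrative and not part of the original slides.

```shell
# append_line_once FILE LINE
# Appends LINE to FILE only if an identical line is not already there,
# so running the setup twice does not duplicate the exports.
append_line_once() {
    file=$1
    line=$2
    touch "$file" || return 1
    # -x: match the whole line, -F: fixed string (no regex), -q: quiet
    if ! grep -qxF -- "$line" "$file"; then
        printf '%s\n' "$line" >> "$file"
    fi
}

# add_java_env FILE: append the three export lines used in this guide.
add_java_env() {
    append_line_once "$1" 'export JAVA_HOME=/usr/lib/jvm/java-6-sun'
    append_line_once "$1" 'export HADOOP_HOME=/home/waue/workspace/hadoop/'
    append_line_once "$1" 'export CLASSPATH=.:$JAVA_HOME/lib/dt.jar:$JAVA_HOME/lib/tools.jar'
}
```

Run as root, `add_java_env /etc/bash.bashrc` would apply the change; open a new shell afterwards so the variables take effect.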

    2. Hadoop Setup

    1. Generate an SSH key for the user.

    $ ssh-keygen -t rsa -P ""
    $ cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
    $ ssh localhost
    $ exit

    2. Install Hadoop

    $ cd /home/waue/workspace
    $ sudo tar xzf hadoop-0.16.0.tar.gz
    $ sudo mv hadoop-0.16.0 hadoop
    $ sudo chown -R waue:waue hadoop
    $ cd hadoop
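
The `cat ... >> authorized_keys` step in the SSH setup adds the key again every time it is repeated. A minimal sketch of a repeatable version follows, assuming a single-key public key file. The function name `authorize_key` is illustrative, and the permission changes are an addition that the slides do not mention.

```shell
# authorize_key PUBKEY AUTH_FILE
# Adds the public key to AUTH_FILE unless it is already listed, then
# restricts permissions, which sshd's default StrictModes setting expects.
authorize_key() {
    pubkey=$1
    auth=$2
    [ -r "$pubkey" ] || { echo "missing public key: $pubkey" >&2; return 1; }
    mkdir -p "$(dirname "$auth")" && chmod 700 "$(dirname "$auth")"
    touch "$auth"
    # Compare the whole key line as a fixed string.
    if ! grep -qxF -- "$(cat "$pubkey")" "$auth"; then
        cat "$pubkey" >> "$auth"
    fi
    chmod 600 "$auth"
}
```

After `ssh-keygen`, calling `authorize_key ~/.ssh/id_rsa.pub ~/.ssh/authorized_keys` would replace the `cat` line; `ssh localhost` should then log in without a password prompt.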

    3. Configuration

    1. hadoop-env.sh ($HADOOP_HOME/conf/)

    Change

    # The java implementation to use. Required.
    # export JAVA_HOME=/usr/lib/j2sdk1.5-sun

    to

    # The java implementation to use. Required.
    export JAVA_HOME=/usr/lib/jvm/java-6-sun
    export HADOOP_HOME=/home/waue/workspace/hadoop
    export HADOOP_LOG_DIR=$HADOOP_HOME/logs
    export HADOOP_SLAVES=$HADOOP_HOME/conf/slaves

    2. hadoop-site.xml ($HADOOP_HOME/conf/)

    Modify the contents of conf/hadoop-site.xml as below:

    <property>
      <name>fs.default.name</name>
      <value>localhost:9000</value>
    </property>
    <property>
      <name>mapred.job.tracker</name>
      <value>localhost:9001</value>
    </property>

    <property>
      <name>mapred.map.tasks</name>
      <value>1</value>
      <description>define mapred.map tasks to be number of slave hosts</description>
    </property>
    <property>
      <name>mapred.reduce.tasks</name>
      <value>1</value>
      <description>define mapred.reduce tasks to be number of slave hosts</description>
    </property>
    <property>
      <name>dfs.replication</name>
      <value>1</value>
    </property>
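
The properties above can also be written out by a small script, which helps when the host name or ports change. The sketch below generates a complete single-node file, wrapping the properties in the `<configuration>` root element that Hadoop configuration files use. The function name `write_hadoop_site` and its optional arguments are illustrative, and the description text is omitted for brevity.

```shell
# write_hadoop_site FILE [HOST] [FS_PORT] [JOBTRACKER_PORT]
# Writes a single-node hadoop-site.xml with the five properties from this guide.
write_hadoop_site() {
    out=$1
    host=${2:-localhost}
    fs_port=${3:-9000}
    jt_port=${4:-9001}
    cat > "$out" <<EOF
<?xml version="1.0"?>
<configuration>
  <property>
    <name>fs.default.name</name>
    <value>${host}:${fs_port}</value>
  </property>
  <property>
    <name>mapred.job.tracker</name>
    <value>${host}:${jt_port}</value>
  </property>
  <property>
    <name>mapred.map.tasks</name>
    <value>1</value>
  </property>
  <property>
    <name>mapred.reduce.tasks</name>
    <value>1</value>
  </property>
  <property>
    <name>dfs.replication</name>
    <value>1</value>
  </property>
</configuration>
EOF
}
```

For the setup in this guide, `write_hadoop_site "$HADOOP_HOME/conf/hadoop-site.xml"` would produce the file with the default localhost values.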

    4. Start Up Hadoop

    $ cd $HADOOP_HOME
    $ bin/hadoop namenode -format
    08/05/23 14:52:16 INFO dfs.NameNode: STARTUP_MSG:
    /************************************************************
    STARTUP_MSG: Starting NameNode
    STARTUP_MSG: host = Dx7200/127.0.1.1
    STARTUP_MSG: args = [-format]
    STARTUP_MSG: version = 0.16.4
    STARTUP_MSG: build = http://svn.apache.org/repos/asf/hadoop/core/branches/branch-0.16 -r 652614; compiled by 'hadoopqa' on Fri May 2 00:18:12 UTC 2008
    ************************************************************/
    08/05/23 14:52:17 INFO fs.FSNamesystem: fsOwner=waue,waue,adm,dialout,cdrom,floppy,audio,dip,video,plugdev,staff,scanner,lpadmin,admin,netdev,powerdev,vboxusers
    08/05/23 14:52:17 INFO fs.FSNamesystem: supergroup=supergroup
    08/05/23 14:52:17 INFO fs.FSNamesystem: isPermissionEnabled=true
    08/05/23 14:52:17 INFO dfs.Storage: Storage directory /tmp/hadoop-waue/dfs/name has been successfully formatted.
    08/05/23 14:52:17 INFO dfs.NameNode: SHUTDOWN_MSG:
    /************************************************************
    SHUTDOWN_MSG: Shutting down NameNode at Dx7200/127.0.1.1
    ************************************************************/

    $ bin/start-all.sh
    starting namenode, logging to /home/waue/workspace/hadoop/logs/hadoop-waue-namenode-Dx7200.out
    localhost: starting datanode, logging to /home/waue/workspace/hadoop/logs/hadoop-waue-datanode-Dx7200.out
    localhost: starting secondarynamenode, logging to /home/waue/workspace/hadoop/logs/hadoop-waue-secondarynamenode-Dx7200.out
    starting jobtracker, logging to /home/waue/workspace/hadoop/logs/hadoop-waue-jobtracker-Dx7200.out
    localhost: starting tasktracker, logging to /home/waue/workspace/hadoop/logs/hadoop-waue-tasktracker-Dx7200.out
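
The start script launches five daemons, and any of them can fail quietly while the others keep running. One way to check is to compare the output of the JDK's `jps` tool against the expected daemon names. The following is a sketch only; the function name `missing_daemons` is illustrative and not from the slides.

```shell
# missing_daemons: reads `jps` output ("PID ClassName" per line) on stdin
# and prints each expected Hadoop daemon that is not listed.
# Prints nothing and returns 0 when all five are present.
missing_daemons() {
    listing=$(cat)
    rc=0
    for daemon in NameNode DataNode SecondaryNameNode JobTracker TaskTracker; do
        # Compare against the whole second field, so that
        # "SecondaryNameNode" does not count as "NameNode".
        if ! printf '%s\n' "$listing" |
            awk -v d="$daemon" '$2 == d { found = 1 } END { exit !found }'; then
            echo "$daemon"
            rc=1
        fi
    done
    return $rc
}
```

Typical use would be `jps | missing_daemons`; for any name it prints, the matching `.out` file under the logs directory is the place to look.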

    Then open http://localhost:50030/ in your browser to make sure Hadoop is running.

    PS: If your system reports errors after a restart, you can reset it with the following commands:

    $ cd $HADOOP_HOME
    $ bin/stop-all.sh
    $ rm -rf /tmp/*
    $ rm -rf logs/*

    Then repeat step 4, Start Up Hadoop.
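
The reset above clears everything under /tmp, including files that belong to other programs. A narrower variant is sketched below. It assumes Hadoop keeps its data in /tmp/hadoop-<user>, which matches the storage directory /tmp/hadoop-waue/dfs/name in the format log, and it should run only after `bin/stop-all.sh`. The function name `reset_hadoop_state` is illustrative, and `find -mindepth ... -delete` relies on GNU find as shipped with Ubuntu.

```shell
# reset_hadoop_state HADOOP_HOME [TMP_ROOT] [USER_NAME]
# Removes only this user's Hadoop data directory and the Hadoop logs,
# leaving everything else under TMP_ROOT untouched.
reset_hadoop_state() {
    home=$1
    tmp_root=${2:-/tmp}
    user_name=${3:-$(id -un)}
    [ -d "$home" ] || { echo "not a directory: $home" >&2; return 1; }
    # :? aborts instead of expanding to a dangerous path if a value is empty.
    rm -rf "${tmp_root:?}/hadoop-${user_name:?}"
    if [ -d "$home/logs" ]; then
        # Delete the log files but keep the logs directory itself.
        find "$home/logs" -mindepth 1 -delete
    fi
}
```

For example: `bin/stop-all.sh && reset_hadoop_state "$HADOOP_HOME"`, followed by the namenode format and start commands from step 4.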

    3. Eclipse Setup

    3.1 Install the IBM MapReduce Tools

    1. Download the IBM MapReduce Tools zip file and extract it to /tmp/.
    2. Make sure Eclipse is closed and ...

    $ cd /tmp/
    $ unzip mapreduce_tools.zip
    $ mv plugins/com.ibm.hipods.mapreduce* /usr/lib/eclipse/plugins/

    3. Restart Eclipse and check that the IBM MapReduce Tools plugin installed correctly:

    Eclipse
    File > New > Project, and check that a MapReduce category is listed.
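
Before restarting Eclipse, it can help to confirm that the `mv` step actually placed the plugin where Eclipse looks for it. The check below is a sketch; the function name `mapreduce_plugin_installed` is illustrative, and the default directory is the one used in the commands above.

```shell
# mapreduce_plugin_installed [PLUGINS_DIR]
# Succeeds if at least one entry matching com.ibm.hipods.mapreduce* exists
# in the Eclipse plugins directory.
mapreduce_plugin_installed() {
    plugins_dir=${1:-/usr/lib/eclipse/plugins}
    for entry in "$plugins_dir"/com.ibm.hipods.mapreduce*; do
        # If nothing matches, the glob stays literal and -e fails.
        [ -e "$entry" ] && return 0
    done
    return 1
}
```

Usage: `mapreduce_plugin_installed || echo "plugin not found"`.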

    3.2 Configure Eclipse

    Eclipse
    Window > Preferences > Java > Compiler: set the compiler compliance level to 5.0

    Some Eclipse plugins consume a lot of resources, and you may run into an out-of-memory error. We suggest starting Eclipse with the following parameters:

    $ eclipse -vmargs -Xmx512m

    4. Run on Eclipse

    4.1 Map-Reduce sample code

    Eclipse
    File > New > Project > Map-Reduce Project > Next >
      Project name: sample
      Use default location: V
      Use default Hadoop: V
    > Finish

    In the Project Explorer you will see the sample tree. Now create the sample code.

    Eclipse
    Right-click sample > New > File > File name: WordCount.java

    The sample code is here:

    http://trac.nchc.org.tw/cloud/attachment/wiki/hadoop-sample-code/WordCount.java

    Paste its contents into your newly added file WordCount.java.
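
The slides link to WordCount.java without reproducing it, so the following is not that code. It is a local shell sketch of the computation a word count job performs, under the assumption that the sample splits its input on whitespace as the usual Hadoop WordCount example does. It can serve as a quick cross-check of the job's output on a small file. The function name `wordcount_local` is illustrative.

```shell
# wordcount_local: reads text on stdin and prints "word<TAB>count" lines.
wordcount_local() {
    tr -s '[:space:]' '\n' |          # "map": emit one token per line
        grep -v '^$' |                # drop the empty line left by leading blanks
        LC_ALL=C sort |               # "shuffle": bring equal words together
        uniq -c |                     # "reduce": count each run of equal words
        awk '{ print $2 "\t" $1 }'    # reorder to word<TAB>count
}
```

For example, `wordcount_local < 132.txt | head` lists the first few words and their counts in byte order.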

    4.2 Connect to Hadoop File System

    Enable the MapReduce Servers window:

    Eclipse
    Window > Show View > Other... > MapReduce Tools > MapReduce Servers

    At the bottom of your window, you should have a "MapReduce Servers" tab. If not, repeat the step above. Switch to that tab.

    At the top right edge of the tab, you should see a little blue elephant icon.

    Eclipse
    Click the blue elephant to add a new MapReduce server location.

    Server name: any_you_want
    Hostname: localhost
    Installation directory: /home/waue/workspace/nutch/

    Username: waue

    If a password prompt appears, enter the password you use to log in to the local machine.

    The server should show up under a little elephant icon in the Project Explorer (on the left side of Eclipse).

    PS: Please make sure Hadoop is running on the local system. If it is not, refer to section 2, Hadoop Setup, for debugging; otherwise you will not be able to proceed.

    $ cd /home/waue/workspace/hadoop/
    $ wget http://www.gutenberg.org/etext/132/132.txt
    $ bin/hadoop dfs -mkdir input
    $ bin/hadoop dfs -ls
    Found 1 items
    /user/waue/input 2008-05-23 15:15 rwxr-xr-x waue supergroup
    $ bin/hadoop dfs -put 132.txt input
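
If the download fails, `dfs -put` may upload an empty file or stop with an error only after the input directory has been created. The sketch below wraps the two `dfs` commands from above and checks the local file first. The function name `stage_input` and the `HADOOP_BIN` variable are hypothetical helpers, not part of Hadoop or the slides.

```shell
# stage_input LOCAL_FILE [DFS_DIR]
# Creates DFS_DIR and uploads LOCAL_FILE into it. HADOOP_BIN is a
# hypothetical override for the launcher path, defaulting to bin/hadoop.
stage_input() {
    local_file=$1
    dfs_dir=${2:-input}
    hadoop_bin=${HADOOP_BIN:-bin/hadoop}
    # -s: the file must exist and be non-empty before anything is uploaded.
    [ -s "$local_file" ] || { echo "missing or empty file: $local_file" >&2; return 1; }
    "$hadoop_bin" dfs -mkdir "$dfs_dir" &&
        "$hadoop_bin" dfs -put "$local_file" "$dfs_dir"
}
```

From the Hadoop directory, `stage_input 132.txt` performs the same two steps as the commands above.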

    4.3 Run

    Eclipse
    sample > right-click WordCount.java > Run As... > Run on Hadoop > Choose an existing server from the list below > Finish

    A Console tab will appear beside the MapReduce Servers tab.

    While Map-Reduce is running, you can visit http://localhost:50030/ to watch Hadoop dispatching the Map-Reduce jobs.

    After it finishes, you can go to http://localhost:50060/ to see the result.

    5. Reference

    NCHC Cloud Technique Develop Group: http://trac.nchc.org.tw/cloud/
    IBM Map-Reduce: http://www.alphaworks.ibm.com/tech/mapreducetools
    Cloud9: http://www.umiacs.umd.edu/~jimmylin/cloud9/umd-hadoop-dist/cloud9-docs/howto/start.html
    Running Hadoop: http://www.michael-noll.com/wiki/Running_Hadoop_On_Ubuntu_Linux_%28Single-Node_Cluster%29

    Related Files:

    Hadoop: http://apache.ntu.edu.tw/hadoop/core/
    IBM MapReduce tool: http://www.alphaworks.ibm.com/tech/mapreducetools
    Word sample 1: The Art of War by 6th cent. B.C. Sunzi, http://www.gutenberg.org/etext/132
    Word sample 2: The Adventures of Sherlock Holmes by Sir Arthur Conan Doyle, http://www.gutenberg.org/etext/1661
