8/3/2019 Hadoop Eclipse
Programming Map-Reduce (Hadoop) with Eclipse
Wei-Yu Chen
NCHC
2008/05/27
see more: http://trac.nchc.org.tw/cloud/
1. Prepare :
System :
Ubuntu 7.10
Hadoop 0.16
Requirement :
Eclipse (3.2.2)
$ apt-get install eclipse
Java 6
$ apt-get install sun-java6-bin sun-java6-jdk sun-java6-jre sun-java6-plugin
We suggest removing the default Java compiler, gcj:
$ apt-get purge java-gcj-compat
Append the following lines to /etc/bash.bashrc to set up the Java classpath:
export JAVA_HOME=/usr/lib/jvm/java-6-sun
export HADOOP_HOME=/home/waue/workspace/hadoop/
export CLASSPATH=.:$JAVA_HOME/lib/dt.jar:$JAVA_HOME/lib/tools.jar
Paths set up above:
Name          Path
Hadoop Home   /home/waue/workspace/hadoop/
Java Home     /usr/lib/jvm/java-6-sun
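Before moving on, it can help to confirm that both directories actually exist. The following is a minimal sketch and not part of the original slides: check_dir is a hypothetical helper name, and the fallback paths are simply the ones listed above.

```shell
#!/bin/sh
# check_dir NAME PATH: report whether PATH is an existing directory.
# (check_dir is a hypothetical helper, not part of Hadoop or Eclipse.)
check_dir() {
    if [ -d "$2" ]; then
        echo "$1 ok: $2"
    else
        echo "$1 missing: $2"
        return 1
    fi
}

# Fall back to the paths used in this guide when the variables are unset.
check_dir JAVA_HOME "${JAVA_HOME:-/usr/lib/jvm/java-6-sun}" || true
check_dir HADOOP_HOME "${HADOOP_HOME:-/home/waue/workspace/hadoop/}" || true
```

A "missing" line usually means either that the package is not installed yet or that /etc/bash.bashrc has not been re-read by the current shell.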
2. Hadoop Setup
1. Generate an SSH key for the user.
$ ssh-keygen -t rsa -P ""
$ cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
$ ssh localhost
$ exit
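Running the cat command more than once appends the same key again each time. As a hedged sketch, the append step could be made repeatable like this; authorize_key is a hypothetical helper name and is not something the slides or OpenSSH provide.

```shell
#!/bin/sh
# authorize_key PUBKEY AUTH_FILE: append PUBKEY to AUTH_FILE unless the
# same line is already present. (authorize_key is a hypothetical helper.)
authorize_key() {
    pub="$1"
    auth="$2"
    mkdir -p "$(dirname "$auth")"
    touch "$auth"
    # -x: match whole lines, -F: treat the key as a fixed string, -q: quiet
    if ! grep -qxF -e "$(cat "$pub")" "$auth"; then
        cat "$pub" >> "$auth"
    fi
    # sshd may ignore an authorized_keys file that other users can write to
    chmod 600 "$auth"
}
```

With the key generated as above, the call would be: authorize_key ~/.ssh/id_rsa.pub ~/.ssh/authorized_keys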
2. Install Hadoop
$ cd /home/waue/workspace
$ sudo tar xzf hadoop-0.16.0.tar.gz
$ sudo mv hadoop-0.16.0 hadoop
$ sudo chown -R waue:waue hadoop
$ cd hadoop
3. Configuration
1. hadoop-env.sh ($HADOOP_HOME/conf/)
Change
# The java implementation to use. Required.
# export JAVA_HOME=/usr/lib/j2sdk1.5-sun
to
# The java implementation to use. Required.
export JAVA_HOME=/usr/lib/jvm/java-6-sun
export HADOOP_HOME=/home/waue/workspace/hadoop
export HADOOP_LOG_DIR=$HADOOP_HOME/logs
export HADOOP_SLAVES=$HADOOP_HOME/conf/slaves
2. hadoop-site.xml ($HADOOP_HOME/conf/)
Modify the contents of conf/hadoop-site.xml as below:
<configuration>
  <property>
    <name>fs.default.name</name>
    <value>localhost:9000</value>
  </property>
  <property>
    <name>mapred.job.tracker</name>
    <value>localhost:9001</value>
  </property>
  <property>
    <name>mapred.map.tasks</name>
    <value>1</value>
    <description>define mapred.map tasks to be number of slave hosts</description>
  </property>
  <property>
    <name>mapred.reduce.tasks</name>
    <value>1</value>
    <description>define mapred.reduce tasks to be number of slave hosts</description>
  </property>
  <property>
    <name>dfs.replication</name>
    <value>1</value>
  </property>
</configuration>
4. Start Up Hadoop
$ cd $HADOOP_HOME
$ bin/hadoop namenode -format
08/05/23 14:52:16 INFO dfs.NameNode: STARTUP_MSG:
/************************************************************
STARTUP_MSG: Starting NameNode
STARTUP_MSG: host = Dx7200/127.0.1.1
STARTUP_MSG: args = [-format]
STARTUP_MSG: version = 0.16.4
STARTUP_MSG: build = http://svn.apache.org/repos/asf/hadoop/core/branches/branch-0.16 -r 652614; compiled by 'hadoopqa' on Fri May 2 00:18:12 UTC 2008
************************************************************/
08/05/23 14:52:17 INFO fs.FSNamesystem: fsOwner=waue,waue,adm,dialout,cdrom,floppy,audio,dip,video,plugdev,staff,scanner,lpadmin,admin,netdev,powerdev,vboxusers
08/05/23 14:52:17 INFO fs.FSNamesystem: supergroup=supergroup
08/05/23 14:52:17 INFO fs.FSNamesystem: isPermissionEnabled=true
08/05/23 14:52:17 INFO dfs.Storage: Storage directory /tmp/hadoop-waue/dfs/name has been successfully formatted.
08/05/23 14:52:17 INFO dfs.NameNode: SHUTDOWN_MSG:
/************************************************************
SHUTDOWN_MSG: Shutting down NameNode at Dx7200/127.0.1.1
************************************************************/
$ bin/start-all.sh
starting namenode, logging to /home/waue/workspace/hadoop/logs/hadoop-waue-namenode-Dx7200.out
localhost: starting datanode, logging to /home/waue/workspace/hadoop/logs/hadoop-waue-datanode-Dx7200.out
localhost: starting secondarynamenode, logging to /home/waue/workspace/hadoop/logs/hadoop-waue-secondarynamenode-Dx7200.out
starting jobtracker, logging to /home/waue/workspace/hadoop/logs/hadoop-waue-jobtracker-Dx7200.out
localhost: starting tasktracker, logging to /home/waue/workspace/hadoop/logs/hadoop-waue-tasktracker-Dx7200.out
Then open http://localhost:50030/ in your browser to confirm that Hadoop is running.
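Besides the web page, the five daemons named in the start-up output can be checked from the command line. The sketch below assumes the JDK's jps tool is on the PATH and that the daemons appear under their usual class names (NameNode, DataNode, SecondaryNameNode, JobTracker, TaskTracker); missing_daemons is a hypothetical helper name, not a Hadoop command.

```shell
#!/bin/sh
# missing_daemons: read `jps` output on stdin ("<pid> <class name>" per line)
# and print each expected Hadoop daemon that does not appear in it.
# (missing_daemons is a hypothetical helper, not a Hadoop command.)
missing_daemons() {
    procs="$(cat)"
    for d in NameNode DataNode SecondaryNameNode JobTracker TaskTracker; do
        # Compare the second field exactly so that NameNode does not
        # match SecondaryNameNode.
        if ! printf '%s\n' "$procs" | awk -v d="$d" '$2 == d { found = 1 } END { exit !found }'; then
            echo "$d"
        fi
    done
}
```

Typical use would be: jps | missing_daemons. Empty output means all five daemons were found; any name that is printed points to the matching log file under $HADOOP_HOME/logs.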
P.S. If your system reports errors after a restart, the following steps reset Hadoop to a clean state.
$ cd $HADOOP_HOME
$ bin/stop-all.sh
$ rm -rf /tmp/*
$ rm -rf logs/*
Then repeat step 4, Start Up Hadoop.
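Note that rm -rf /tmp/* deletes every file under /tmp, including files that belong to other programs. The format log in step 4 shows the NameNode storing its data under /tmp/hadoop-waue, so a narrower cleanup may be enough. The sketch below assumes Hadoop keeps its state in /tmp/hadoop-<user> as in that log; clean_hadoop_state is a hypothetical helper name, and bin/stop-all.sh should still be run first.

```shell
#!/bin/sh
# clean_hadoop_state TMP_BASE USER LOG_DIR
# Remove TMP_BASE/hadoop-USER and empty LOG_DIR, leaving other files alone.
# (clean_hadoop_state is a hypothetical helper; run bin/stop-all.sh first.)
clean_hadoop_state() {
    tmp_base="$1"
    user="$2"
    log_dir="$3"
    # ${var:?} aborts instead of expanding to an empty path
    rm -rf "${tmp_base:?}/hadoop-${user:?}"
    if [ -d "$log_dir" ]; then
        rm -rf "${log_dir:?}"/*
    fi
}
```

For the setup in this guide the call would be: clean_hadoop_state /tmp waue "$HADOOP_HOME/logs"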
3. Eclipse Setup
3.1 Install IBM MapReduce Tools
1. Download the IBM MapReduce Tools zip file and extract it to /tmp/.
2. Make sure Eclipse is closed and ...
$ cd /tmp/
$ unzip mapreduce_tools.zip
$ mv plugins/com.ibm.hipods.mapreduce* /usr/lib/eclipse/plugins/
3. Restart Eclipse.
Check that the IBM MapReduce Tools plugin installed correctly:
Eclipse
File > New > Project; you should see a MapReduce category.
3.2 Configure Eclipse
Eclipse
Window > Preferences > Java > Compiler: set the compiler compliance level to 5.0
Some Eclipse plugins consume a lot of resources, and you may run into out-of-memory errors. We suggest starting Eclipse with the following parameters:
$ eclipse -vmargs -Xmx512m
4. Run on Eclipse
4.1 map-reduce sample code
Eclipse
File > New > Project > Map-Reduce Project > Next >
project name: sample
use default location: V
use default Hadoop: V
> Finish
In the Project Explorer you will see the sample tree. Now create the sample code.
Eclipse
right-click sample > New > File > file name: WordCount.java
The sample code is here:
http://trac.nchc.org.tw/cloud/attachment/wiki/hadoop-sample-code/WordCount.java
Paste its contents into the newly added file WordCount.java.
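WordCount.java itself is not reproduced in these slides. As a rough local illustration of what a word-count job computes, the shell pipeline below mimics the map, shuffle, and reduce phases without using Hadoop at all. It is only a sketch: it assumes words are separated by whitespace, and wordcount is a hypothetical function name.

```shell
#!/bin/sh
# wordcount: read text on stdin, print "<word><TAB><count>" per distinct word.
# (A local stand-in for illustration; it does not use Hadoop.)
wordcount() {
    tr -s '[:space:]' '\n' |            # "map": emit one word per line
        grep -v '^$' |                  # drop the empty line left by leading whitespace
        sort |                          # "shuffle": group identical words together
        uniq -c |                       # "reduce": count each group
        awk '{ print $2 "\t" $1 }'      # reorder to word, then count
}

printf 'the art of war the art\n' | wordcount
```

Running the same function over 132.txt (wordcount < 132.txt) gives a local result to compare against the job output, keeping in mind that the linked WordCount.java may tokenize words differently.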
4.2. Connect to Hadoop File System
Enable the MapReduce Servers window
Eclipse
Window > Show View > Other... > MapReduce Tools > MapReduce Servers
At the bottom of your window, you should have a "MapReduce Servers" tab. If not, see the second bullet above. Switch to that tab.
At the top right edge of the tab, you should see a little blue elephant icon.
Eclipse
Click the blue elephant to add a new MapReduce server location.
Server name: any_you_want
Hostname: localhost
Installation directory: /home/waue/workspace/nutch/
Username: waue
If a password prompt appears, enter the password you use to log in to the local machine.
The server should show up under a little elephant icon in the Project Explorer (on the left side of Eclipse).
P.S. Please make sure Hadoop is running on the local system. If it is not, refer to section 2, Hadoop Setup, for debugging; otherwise you cannot proceed.
$ cd /home/waue/workspace/hadoop/
$ wget http://www.gutenberg.org/etext/132/132.txt
$ bin/hadoop dfs -mkdir input
$ bin/hadoop dfs -ls
Found 1 items
/user/waue/input 2008-05-23 15:15 rwxr-xr-x waue supergroup
$ bin/hadoop dfs -put 132.txt input
4.3 Run
Eclipse
sample > right-click WordCount.java > Run As... > Run on Hadoop > choose an existing server from the list below > Finish
A console tab will appear beside the MapReduce Servers tab.
While MapReduce is running, you can visit http://localhost:50030/ to watch Hadoop dispatch the jobs.
After the job finishes, go to http://localhost:50060/ to see the result.
5. Reference
NCHC Cloud Technique Develop Group: http://trac.nchc.org.tw/cloud/
IBM Map-Reduce: http://www.alphaworks.ibm.com/tech/mapreducetools
Cloud9: http://www.umiacs.umd.edu/~jimmylin/cloud9/umd-hadoop-dist/cloud9-docs/howto/start.html
Running Hadoop: http://www.michael-noll.com/wiki/Running_Hadoop_On_Ubuntu_Linux_%28Single-Node_Cluster%29
Related Files:
Hadoop: http://apache.ntu.edu.tw/hadoop/core/
IBM MapReduce tool: http://www.alphaworks.ibm.com/tech/mapreducetools
word sample 1: The Art of War by 6th cent. B.C. Sunzi, http://www.gutenberg.org/etext/132
word sample 2: The Adventures of Sherlock Holmes by Sir Arthur Conan Doyle, http://www.gutenberg.org/etext/1661