7/21/2019 4 Hive Tutorial d03
http://slidepdf.com/reader/full/4-hive-tutorial-d03 1/115
© www.BitBootCamp.com© www.BitBootCamp.com
HIVE
Overview of Hadoop Training
Unix
Introduction to Hadoop
Hive
Working with Hive
Cross-tab queries with Hive
Recommendation Engine

Course Objectives
How Hive augments MapReduce
How to create tables and manipulate data using Hive
Advanced features of Hive

Course Chapters
Introduction to Hive
Getting Data into Hive
Manipulating Data in Hive
Partitioning and Bucketing
Advanced Hive
Introduction to Hive

Motivation for Hive
Makes it easy to write MapReduce jobs
Built for non-programmers: no Java required

MapReduce: The Challenge
MapReduce is written in Java
Requires a good understanding of:
  Java
  The MapReduce paradigm
  The Hadoop API
  The business problem at hand

Origin of Hive
Developed at Facebook
Open source project at the Apache foundation
Language based on SQL
Declarative in nature

What is Hive?
Code generator: SQL statement -> MapReduce job
  Converts the Hive SQL to Java MapReduce
  Submits the code to the cluster
  Displays the results to the user
Hive advantages:
  SQL is much easier than Java
  Writing a query is much faster than writing the equivalent MapReduce code
  A lot of people already know SQL

Hive vs. Java MR comparison
Select * from TableA join TableB on (TableA.a = TableB.a)

Hive is not a Relational Database
Relational Database Management System:
  Thousands of simultaneous clients
  Very fast response time
  Support for transactions (ACID)
  Support for Update statements
Hive is not an RDBMS:
  It will not turn the Hadoop cluster into a database
  It basically converts HiveQL to MapReduce jobs
  It will take some time to execute
  You will never do (Select * from tableA)
  (covered later: HDFS import command)

Hive vs. Relational DB

Getting Data into Hive
The Hive architecture
How to create tables in Hive
Different column types
Importing data into Hive
Multiple Hive databases
Getting Data into HIVE

How Hive Works
Data is stored in HDFS
  Folders and files
Hive layers the table definition on top of the folders/files
  Folder: the table
  Files: the contents of the table
Table DDL defines the layout of the files
  Column names and types
  Column and row separators (CSV, TSV, etc.)
  Default separator: the Control-A character
  Via SerDe, this can be changed
The Hive metastore contains all this information

Hive Metastore
The Hive metastore is stored in a set of tables
  Derby (default, single user)
  MySQL can be configured to store this metadata (multi-user)
The metastore includes the following:
  Table DDL: table name, column names, data types, etc.
  Location of data in the HDFS system
    Internal: /user/hive/warehouse
    External: any location in the HDFS system
  Row and column separators
  Storage format: used by MapReduce, governs the InputFormat and OutputFormat

Submit Hive Query to Cluster
Hive interpreter
  Converts the HiveQL code to MapReduce jobs
Job submission
  Jobs will be submitted to the cluster
  Options for the number of mappers/reducers will be sent, etc.

Launch Hive
There are three ways to launch Hive:
Hive command line interface:
  hive
  hive>
Launch HiveQL from the command line:
  hive -e "select * from tableA limit 10"
Launch HiveQL from a Script.SQL file:
  hive -f Script.SQL
Hive commands must be terminated by ";"

Data Tables in Hive
Syntax:
  Create Table t ( ColName Type, ... )
  Row Format Delimited
  Fields Terminated by {Char}
  Stored as {TextFile | SequenceFile}
It will create a sub-directory t at /user/hive/warehouse/ in HDFS
/user/hive/warehouse/ is the Hive warehouse directory

Create Table in Detail
Create Table t ( ColName Type, ... )
  Lists the name of the table
  Lists the names of the columns and their data types
Row Format Delimited
  Tells Hive that data fields are delimited by some char
Fields Terminated by {Char}
  Specifies the delimiter char (",", "\t")
  Default is the Control-A char '\001'
Stored as { TextFile | SequenceFile }
  Layout of the file
  TextFile (default): the file is a plain text file
  SequenceFile: the Hadoop binary file layout

Help on Table Definition
Simple view:
  desc t
Detailed version:
  desc extended t
This information is displayed from the Hive metadata

External Tables
If the data is stored outside of the warehouse folder (/user/hive/warehouse), the table is called an external table
Data is still stored in HDFS
Example:
  Create External Table external_table
  ( c1 string, c2 array<string>, c3 int )
  Row format delimited
  Collection items terminated by ','
  Stored as textfile
  Location '/user/menish/external_table'

Deleting Tables
Tables can be dropped with:
  Drop Table t
Internal table:
  All metadata is deleted
  All data is lost
External table:
  All metadata is lost
  External data is not deleted
  Data directory is not deleted

Altering the Table Definition
Change the table definition:
  Change table location
  Change column definitions
  Rename table/update properties
  Add and remove partitions (more on partitions later)
Examples:
  ALTER TABLE t SET LOCATION 'new_location'
  ALTER TABLE t ADD COLUMNS (col_name type, ...)
  ALTER TABLE t RENAME TO x
  ALTER TABLE t CHANGE old_name new_name new_type
  ALTER TABLE t DROP PARTITION (part_col='val')
  ALTER TABLE t ADD PARTITION (part_col='val')

Data Types in Hive
Hive data types map to Java primitive types
Standard data types:
Integers
  TINYINT - 1 byte integer
  SMALLINT - 2 byte integer
  INT - 4 byte integer
  BIGINT - 8 byte integer
Boolean type
  BOOLEAN - TRUE/FALSE
Floating point numbers
  FLOAT - single precision
  DOUBLE - double precision
String type
  STRING - sequence of characters in a specified character set

Data Types: What's Missing
No default type for date or time
  Workaround: leverage the String type
  Built-in functions to manage date or time
No binary column type (Blob, etc.)
  Cannot store documents as binary; leverage some sort of document database for this purpose
  Hive is built to manage text data
  Note: overall, files can be in binary; we just cannot mix text and binary data in one table.
Hive is a continuous build; future releases will have these
  ETA: TBD

Example of a Hive Table
Example:
  Create table ngram (word string, year int, wfreq int, bfreq int)
  Row Format Delimited
  Fields terminated by '\t'
  Stored as TextFile;
Each command in Hive needs to be terminated by ";"
Row Format Delimited: tells Hive to expect one record per line
  Lines are delimited by the newline character "\n"
Fields Terminated by ...: columns are separated by tab "\t"

Complex Data Types
The following complex data types are supported:
  Map: equivalent to a hash table (key, value pairs)
  Array: list of elements
  Struct: user-defined structure
These are used to store Java objects or JSON objects

Map Data Type
Map
  Key - value pairs
  Values are retrieved by referencing the key
Usage:
  MAP < primitiveType, anytype >
Examples:
  Map < String, String >
  Map < int, Struct >
User login and password:
  Userlogin is a Map type; userid is the key (int) and the password is the value (string)
  Create table Password (
  Userlogin MAP<int, String>, ... )
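The Map usage above can be sketched in HiveQL as follows. This is only an illustration: the table name user_password, the delimiters, and the key 42 are assumptions and do not come from the slides.

```sql
-- Hypothetical table and delimiters, chosen only for illustration.
CREATE TABLE user_password (
  userlogin MAP<INT, STRING>
)
ROW FORMAT DELIMITED
  FIELDS TERMINATED BY '\t'
  COLLECTION ITEMS TERMINATED BY ','
  MAP KEYS TERMINATED BY ':';

-- Look up the value stored under key 42 (NULL when the key is absent).
SELECT userlogin[42] FROM user_password;

-- Inspect the keys and the number of entries in each row.
SELECT map_keys(userlogin), size(userlogin) FROM user_password;
```

A map value is read with the same bracket syntax as an array element, except that the subscript is a key rather than a position.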

Array Data Type
Elements of the same type, stored as a list
Elements are accessed by index (starting at 0)
Usage: Array<anytype>
Example: Emailaddr Array<String>
Details: list of user email addresses
  ["FirstName@domain.com", "FirstName.LastName@domain.com"]
  FirstName@domain.com is accessed by calling Emailaddr[0]
Create table email ( emailaddr Array<String>, ... )

Struct Data Type
Mix of data elements
Elements are accessed by dot "." notation
Usage: Struct<name:anytype, ...>
Example: user Struct<userid:int, name:String, email:String>
Example: Employee table with details
  Create table employee (
  Empdesc Struct<HighSchool:String, College:String>,
  ... )
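As a rough sketch of the dot notation described for Struct columns, the statements below define and query a small employee table. The extra name column and the filter are assumptions added to make the example complete.

```sql
-- Hypothetical employee table; only empdesc follows the slide, name is assumed.
CREATE TABLE employee (
  name    STRING,
  empdesc STRUCT<highschool:STRING, college:STRING>
);

-- Struct fields are read with dot notation, in SELECT and in WHERE.
SELECT name, empdesc.college
FROM employee
WHERE empdesc.highschool IS NOT NULL;
```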

Delimiters for Complex Data Types
Array and Struct:
  Collection Items terminated by char
  Create table t1
  ( c1 string, c2 array<string>, c3 int )
  Row format delimited
  Fields Terminated by '\t'
  Collection items terminated by ','
  Stored as textfile;
The file will contain the following data (columns separated by tabs):
  User1 "1","2","3" 100
  User2 a,b,c 192
  User3 x,y,z 2000
Map (key-value pairs):
  Map Keys terminated by char
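Putting the three delimiter clauses together, a table with both an array and a map column might be declared like this. The table name t1_complex, the ':' key delimiter, and the sample data line are assumptions for illustration only.

```sql
-- Hypothetical table combining an array column and a map column.
CREATE TABLE t1_complex (
  c1 STRING,
  c2 ARRAY<STRING>,
  c3 MAP<STRING, INT>
)
ROW FORMAT DELIMITED
  FIELDS TERMINATED BY '\t'            -- between columns
  COLLECTION ITEMS TERMINATED BY ','   -- between array elements / map entries
  MAP KEYS TERMINATED BY ':'           -- between a map key and its value
STORED AS TEXTFILE;

-- A matching line of data would look like: User1<TAB>a,b,c<TAB>x:1,y:2
SELECT c1, c2[0], c3['x'] FROM t1_complex;
```

The same collection-item delimiter separates array elements and map entries, so the key delimiter has to be a different character.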

Loading Data in Hive
To load data into Hive, just move the data to the corresponding directory in HDFS
Multiple ways to load data:
  hadoop fs -mv /path/to/hdfsfile /user/hive/warehouse/tablename/   (moves a file that is already in HDFS)
  hdfs dfs -copyFromLocal /path/to/localfile /user/hive/warehouse/tablename/
  hive> Load data inpath '/path/to/dir/in/hdfs' into table t
  hive> Load data local inpath '/path/to/dir/local' into table t
Overwrite the files:
  Load data inpath 'path/to/dir' overwrite into table t
Fill a table from a query on the fly (t2 must already exist):
  Insert overwrite table t2 select * from t1;
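A short sketch of the load paths above, reusing the ngram table from the earlier example. The file path /tmp/ngram.tsv, the table ngram_recent, and the year filter are assumptions.

```sql
-- Copy a local file into the table's warehouse directory, replacing old files.
LOAD DATA LOCAL INPATH '/tmp/ngram.tsv' OVERWRITE INTO TABLE ngram;

-- Create an empty table with the same layout, then fill it from a query.
CREATE TABLE ngram_recent LIKE ngram;
INSERT OVERWRITE TABLE ngram_recent
SELECT * FROM ngram WHERE year >= 2000;
```

Without LOCAL, the path is read from HDFS and the files are moved, not copied, into the table directory.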

Getting Data out of Hive
Insert overwrite directory:
  Outputs query results to an HDFS dir
  The local option will bring the data to a local folder
Data is serialized as text
  Columns are separated by the Ctrl-A character \001
  Rows by the newline char
Good for getting large amounts of data
Example:
  Insert overwrite local directory 'path/to/local/file.dat' select * from t;

Things to Watch Out For
Drop table:
  All data is lost, with no way to get it back
  There is no rollback or undo
External table:
  It is just a pointer to an HDFS folder outside of the warehouse folder
  Very helpful for existing data: no need to move the data around
  Dropping is safe, as only metadata is deleted

Loading Data from Existing Databases
We all need to import data from multiple databases to Hadoop
Use Sqoop (open source)
  MySQL -> Sqoop -> Hadoop
  Via JDBC
  It will connect to any database, as long as you have the JDBC driver

Sqoop Concepts
Launches MapReduce jobs to load data
Multiple connections to the database to pull data
  Default: 4 connections
  Each connection works in parallel, so data is imported faster
  Each connection works on a different part of the data
JDBC is just like an ODBC connection
Creates the mapping file, from source to destination, based on the source metadata
  Can create Hive tables without any specific config

Sqoop Syntax
sqoop import \
  --username user \
  --password pass \
  --connect jdbc:mysql://dbserver.example.com/db \
  --hive-import \
  --fields-terminated-by '\t' \
  --table t1

Multiple Databases
The default database name is default
Show databases
  Lists the databases in the system
Create database db
  Creates a new database in the warehouse dir
Use db
Show tables
Show tables from db

Manipulating Data in Hive
Select statements
Joins
Storing results in HDFS or local
Basic functions
Sub Advanced Hive

Objectives
In this chapter you will learn how to use Hive to query data on Hadoop; all the jobs will launch MapReduce jobs in some capacity
Use select statements
Joins
Store results in HDFS
Use default functions

HiveQL
Subset of the SQL-92 standard
  Supported: Select, Join, aggregates and subqueries
  No support for Update or Delete
Hive has some extensions:
  Partitioning
  Sampling
  Complex data types (Array, Map, Struct), as we saw before
  User-defined functions: many languages are supported
  Multi-table insert: do more with one command

Basic Select Syntax
Select syntax:
  Select exp1, exp2, exp3, ... From tableA
  Exp can be a column name, a function, or a custom function
  From is required
Hive is not case sensitive
  Please use the best coding practice: camel case or upper case

Aliases in Select
Aliases are supported in Hive
No "as" keyword needed:
  Select exp1, exp2, ... From tablename t
Used if we have common names across tables:
  Select t.exp1, t.exp2 from tablename t

Filter: Limiting the Number of Records
Filter the results by using a where clause:
  Select exp1, exp2, exp3, ... From tableA
  where condition
  Condition is any Boolean expression
  Conditions can be combined with And/Or and ( )
Limit clause (same as Top):
  Select exp1, exp2, exp3, ... From tableA
  where condition
  Limit n

Order by, Sort by, etc.
There are multiple ways to sort, due to the parallel nature of Hadoop
Order By
  Overall sort across all the mappers
  A single reducer is used to sort the data
  Can be very slow
  Please use the limit clause
  On big datasets, it can blow up a single node
Sort by
  Sorting is only local to a given reducer
  Uses multiple reducers to sort the data
  The sort is local to the reducer, hence ordering is guaranteed only per reducer
  May result in a partially ordered set
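The difference between the two clauses can be sketched with the ngram table from the earlier example; the choice of columns and the limit of 10 are assumptions.

```sql
-- Total order: a single reducer sorts everything, so LIMIT keeps the result small.
SELECT word, wfreq
FROM ngram
ORDER BY wfreq DESC
LIMIT 10;

-- Per-reducer order only: each reducer sorts its own share of the rows,
-- so the combined output is not globally sorted.
SELECT word, wfreq
FROM ngram
SORT BY wfreq DESC;
```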

Distribute by
Distributes the data based on the key: rows with the same key go to the same reducer
No guarantee on clustering or sorting properties
Useful if data needs to be partitioned for a given reducer
For example: select c1, c2 from t distribute by c1
  Table 1: (1,a) (2,b) (3,c) (1,d) (3,e) (4,f)
  Reducer 1: (1,a) (3,e) (3,c) (1,d)
  Reducer 2: (2,b) (4,f)
  Final output: (1,a) (3,e) (3,c) (1,d) (2,b) (4,f)

Cluster by
Distribute by + Sort by in one command
Rows with the same key are distributed to the same reducer
Data is sorted per reducer
Example: select c1, c2 from t cluster by c1
  Table 1: (1,a) (2,b) (3,c) (1,d) (3,e) (4,f)
  Reducer 1: (1,a) (3,e) (3,c) (1,d)
  Reducer 2: (2,b) (4,f)
  Final output: (1,a) (1,d) (3,c) (3,e) (2,b) (4,f)
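The relationship between Cluster by, Distribute by and Sort by can be written out as queries on the same table t. The third query, with a different sort key, is an added illustration and is not on the slide.

```sql
-- Shorthand form: route rows by c1 and sort each reducer's rows by c1.
SELECT c1, c2 FROM t CLUSTER BY c1;

-- Equivalent long form.
SELECT c1, c2 FROM t DISTRIBUTE BY c1 SORT BY c1;

-- The long form also allows a sort key that differs from the distribution key.
SELECT c1, c2 FROM t DISTRIBUTE BY c1 SORT BY c1 ASC, c2 DESC;
```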

Group by
Group by is used to aggregate data:
  Select state, count(distinct restid)
  From rest
  Group by state
Multiple aggregations in one statement:
  Select state, count(distinct restid), count(*) cnt
  From rest
  Group by state

Joining Tables
Used to join tables
Supports:
  Inner Join
  Left Outer Join
  Right Outer Join
  Full Outer Join
Not all conditions are supported:
  A.id = b.id is supported
  A.id <> b.id is not supported

Syntax
Hive uses the fully expressed version of the syntax:
  Select col1 from t1 inner join t2 on t1.col1 = t2.col2
Shortcut versions of joins are not supported:
  Select col1 from t1, t2 where ...
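For comparison with the inner join above, here is a minimal outer-join sketch on the same placeholder tables t1 and t2; selecting t2.col2 in the output is an assumption made to show the NULL fill.

```sql
-- Rows of t1 without a match in t2 are kept; t2's columns come back as NULL.
SELECT t1.col1, t2.col2
FROM t1
LEFT OUTER JOIN t2 ON (t1.col1 = t2.col2);
```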

Joins
Inner Join
Left Outer Join
Right Outer Join
Full Outer Join
[Figure: join diagrams; unmatched sides are shown as null]

Outputting Data to HDFS
To dump data on HDFS with a given format:
  Create an external table
  Use an Insert statement to overwrite the table
Create External Table T ( col1, col2, ... )
Row Format delimited
Fields terminated by ','
Stored as TextFile
Location '/path/to/hdfs';
Insert overwrite table T select * from ...

Multi-table Insert
A select statement can run for a long time
Can we do some processing in parallel?
  From ( select col1, col2, col3 from ... ) alias
  Insert overwrite table t1 select alias.col1
  Insert overwrite table t2 select count(*)
The source is named once in the leading From; the individual selects do not repeat it
We have to define the structure of t1 and t2 beforehand
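A fuller sketch of a multi-table insert, including the target tables that must exist first. The table source_table and the column types are assumptions; only t1, t2, col1, col2 and col3 come from the slide.

```sql
-- Target tables must exist before the insert runs (types are assumed).
CREATE TABLE t1 (col1 STRING);
CREATE TABLE t2 (cnt BIGINT);

-- The source is scanned once and feeds both inserts.
FROM (SELECT col1, col2, col3 FROM source_table) src
INSERT OVERWRITE TABLE t1 SELECT src.col1
INSERT OVERWRITE TABLE t2 SELECT count(*);
```

Because the subquery is read only once, this is usually cheaper than running two separate insert statements over the same data.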

Hive Functions
Built-in functions:
  Math
  Date
  Conditional
  String
  Aggregate
There is very powerful support for user-defined functions
  Functions can be written in any language
  Used to hide the business logic
  Leverages Thrift to make this magic happen

Math Functions
Standard math functions:
  rand(), log(), sqrt(), ceil(), ...
String functions:
  length(), concat(), substring(), upper(), lower(), trim(), etc.
Date functions:
  unix_timestamp(), from_unixtime(), year(), month(), datediff(), ...
  (Use the YYYY-MM-DD format)
Aggregate functions:
  min(), max(), sum(), stddev_pop(), stddev_samp(), etc.
  Use group by to aggregate per group; without it, the aggregate covers the whole table

Creating Tables from Existing Data
Create a table from a select statement:
  Create table tablename as select col1, col2, col3 from existingtable
No need to define the metadata for tablename
Column definitions will be inherited from the existing table

Subquery
Hive supports subqueries:
  Select col1 from ( select col11 + col22 as col1 from t1 ) t2
The subquery table must be given a name
One can have as many nested queries as needed
Advice: do not overuse it, as it gets very hard to debug

Views
Syntax:
  Create View v ( col1, col2, ... )
  As select col1, col2 ...
Column names in v are optional; they will be taken from the select statement
Views are not materialized; only the definition is stored
Cannot be used to insert data, etc.
Order by and Limit clauses are supported
To delete: Drop view v
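A small end-to-end sketch of the view syntax, based on the ngram table from the earlier example. The view name recent_ngram and the year filter are assumptions.

```sql
-- Define a view; nothing is computed or stored until it is queried.
CREATE VIEW recent_ngram AS
SELECT word, year, wfreq FROM ngram WHERE year >= 2000;

-- Query the view like a table.
SELECT word, wfreq FROM recent_ngram ORDER BY wfreq DESC LIMIT 10;

-- Dropping the view removes only its definition, not the underlying data.
DROP VIEW recent_ngram;
```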

Union All
Combines data from multiple tables
The names of the columns in the tables must be identical
Only Union All is supported
Union is not supported, as it would do the dedupe
  Select col1 from table1
  Union all
  Select col12 as col1 from table2

Map-side vs. Reduce-side Join
There are two types of joins
Map-side join:
  The join happens on the mapper
  Much faster, but only works when one table is small
  Need to give a hint to Hive to make this work:
    select /*+ MAPJOIN(b) */ count(*) from A join b on (A.a1 = b.a1)
    b is the small table
  Or set hive.auto.convert.join=true
Reduce-side join:
  The join happens on the reducer
  All the data will be shipped all over
  Can be very slow

Map Join: What's Not Supported
The following are not supported:
  Union followed by a MapJoin
  Lateral View followed by a MapJoin
  Reduce Sink (Group By/Join/Sort By/Cluster By/Distribute By) followed by MapJoin
  MapJoin followed by Union
  MapJoin followed by Join
  MapJoin followed by MapJoin

Reduce-side Join
Reduce-side join:
  Will do a full outer join
  It will convert to an inner join at the reducer
  Each mapper will see a slice of the data
  Mappers will see the join key, and only those pairs are emitted
  Reducers will join the data
  It is a multi-step map-reduce job
Good:
  Parallel join, and works at scale
Bad:
  Data will be shipped all over the network
  Wasted cycles where few matches are possible

Fastest Hive Query
In order of priority (fastest first):
  Metadata only: Describe t;
  HDFS read: Select * from t limit 10;
  Map only: Select * from t where col1="value"
  Reduce: Select count(*) from t
  Multiple MapReduce: Select * from t join t2 on (t.a = t2.a) sort by t.a

Partitioning and Bucketing
Partitioning data
Bucketing data

Partitioning Data
Partitioning data is equivalent to a horizontal split of the data, based on a column value
Data is stored in sub-directories of the main table
Works like an index on the data
  Example: Where month='Jan' will only read the Jan folder
Both static and dynamic partitions are supported
[Figure: log data loaded into a Hive database, partitioned by load_date]
  /user/hive/warehouse/log/Jan
  /user/hive/warehouse/log/Feb
  /user/hive/warehouse/log/March

Static Partition Syntax
Partition by syntax:
  Create table t ( col1 int, ... )
  Partitioned by ( columnName datatype )
  Row Format Del...
Example:
  Create Table Log ( UserID int, Httpstring String )
  Partitioned by ( load_date string )
  Row Format delimited
  Fields Terminated by '\t'
Note:
  The partition column is a "virtual column"
  Its data does not exist in the incoming data; it is specified by the user
  It acts like a real column in the final table

Static Partition Example
Syntax to load data:
  Load data inpath 'path/to/table'
  Into table t
  Partition ( col=value )
Example:
  Load Data inpath '/user/menish/log'
  Into table Log
  Partition ( load_date='01-01-2014' )
File location:
  /user/hive/warehouse/log/load_date=01-01-2014
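Once a partition is loaded, it can be inspected and queried as sketched below, using the Log table and the load_date value from the example; the count query itself is an added illustration.

```sql
-- List the partitions Hive knows about for the table.
SHOW PARTITIONS log;

-- Filtering on the partition column lets Hive read only the matching
-- sub-directory instead of scanning the whole table.
SELECT count(*) FROM log WHERE load_date = '01-01-2014';
```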

Dynamic Partition
If the partition-by data already exists in the staging table:
  From Staging_table S
  Insert overwrite table t Partition ( partition_col )
  select s.col1, s.col2, s.col3, s.partition_col;
Partitions are automatically created, based on the values in the column
  New partitions will be created
  Old partitions will be overwritten
Set the following to enable dynamic partitions:
  Set hive.exec.dynamic.partition=true;
  Set hive.exec.max.created.files=100000;
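A possible end-to-end version of the dynamic partition insert, using the Log table from the static partition example. The staging table staging_log is an assumption, and so is the nonstrict mode setting, which is not on the slide but is commonly required when every partition column is dynamic.

```sql
SET hive.exec.dynamic.partition=true;
-- Assumption: needed in many Hive versions when no static partition value is given.
SET hive.exec.dynamic.partition.mode=nonstrict;

-- staging_log is a hypothetical unpartitioned table that already holds load_date.
FROM staging_log s
INSERT OVERWRITE TABLE log PARTITION (load_date)
SELECT s.userid, s.httpstring, s.load_date;
```

The partition column comes last in the select list, and each distinct value becomes its own sub-directory.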

Dynamic Partition Control
Note: if the partition column has many values, that many distinct subfolders will be created
Use the following settings to limit runaway code:
  hive.exec.max.dynamic.partitions.pernode
    Max number of dynamic partitions to be created per node
    Default: 100
  hive.exec.max.dynamic.partitions
    Total number of partitions created per HiveQL statement
    Default: 1000
  hive.exec.max.created.files
    Max number of files created by mappers and reducers
    Default: 100,000
Watch out for temp data created; it can blow up the cluster!

Sub-Partitions
Tables can contain sub-partitions:
  Year
  Month
Create table t ( col specification )
Partitioned by ( CreateYear String, CreateMonth String )
  CreateMonth will be the subdirectory under CreateYear
The dynamic partition columns must be at the end of the select list
The Alter Table command can be used to drop or add partitions

Bucketing Data
Similar to partitioning data
Data is partitioned by the hash value of the partition-by column
Effective for even distribution of data across the nodes/folders
Can be used to sample the data
  If we need a random sample to work on the data

Bucketed Table Syntax
Create table t ( col specification )
Clustered by ( col ) into n Buckets
Note:
  Since hash values are used to distribute the data, please ensure that we have an even distribution of data in the clustered-by column
Insert data into the bucketed table:
  First insert the data into the staging table
  hive> set mapred.reduce.tasks=(number-of-buckets);
  hive> set hive.enforce.bucketing=true;
  hive> Insert overwrite table final_bucket_table
        Select * from staging_table;
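The bucketing steps above might look like this in full. The column names, the bucket count of 4, and the layout of staging_table are assumptions; hive.enforce.bucketing is the setting named on the slide and may be unnecessary in newer Hive releases.

```sql
-- Hypothetical bucketed table: rows are assigned to 4 files by hash(userid).
CREATE TABLE final_bucket_table (
  userid     INT,
  httpstring STRING
)
CLUSTERED BY (userid) INTO 4 BUCKETS;

SET hive.enforce.bucketing=true;

-- staging_table is assumed to have matching columns.
INSERT OVERWRITE TABLE final_bucket_table
SELECT userid, httpstring FROM staging_table;
```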
Bucketing at High 4eve+
7/21/2019 4 Hive Tutorial d03
http://slidepdf.com/reader/full/4-hive-tutorial-d03 85/115
© www.BitBootCamp.com© www.BitBootCamp.com F
[Figure: Hive runs "Insert overwrite table finalbuckettable Select * from stagingtable"; diagram labels: three Map() tasks, three Reduce() tasks, Hashing, Bucket1, Bucket2, Bucket3]
Sampling Data
To sample the data, use the following
Select * from bucketedtable TableSample (bucket 1 out of 5 on col)
If table has 5 buckets, then it will return the data from bucket 1
If table has 20 buckets, then it will return data from buckets 1, 6, 11, 16
We do not need to bucket the table to get a sample; we can use some other algorithm
However, without bucketing, a full table scan will be needed
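The bucket arithmetic above can be checked with a short Python sketch. It only covers the case where the sample denominator divides the table's bucket count; the function name sampled_buckets is made up for illustration.

```python
def sampled_buckets(x, y, table_buckets):
    """Buckets (1-based) read by TABLESAMPLE (BUCKET x OUT OF y).

    Only handles the case where y divides the number of buckets the
    table was created with.
    """
    if table_buckets % y != 0:
        raise ValueError("this sketch needs y to divide table_buckets")
    # Every y-th bucket, starting at bucket x.
    return list(range(x, table_buckets + 1, y))

print(sampled_buckets(1, 5, 5))
print(sampled_buckets(1, 5, 20))
```

With 20 buckets the sample touches 4 of them, so only a fifth of the table is read.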
Advanced Hive
Hive Variables
Hive Command Line Interface
Thrift and Hive
Transform
User defined functions
SerDe
Hive Variables
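Variables in the script
hive> set varname=value;
hive> Select * from table where col=${hiveconf:varname};
Variables from outside of the script
hive --hiveconf varname=value
hive --hiveconf year=2014 -e \
"Load data inpath '/tmp/\${hiveconf:year}' into table log partition (year=\${hiveconf:year})"
Note: \ at the end of a line is a line continuation in Unix; the code continues on the next line. \$ keeps the shell from expanding ${...} inside double quotes.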
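Variable substitution can be pictured as a plain text replacement applied to the query before it runs. The Python sketch below illustrates that idea only; the function name substitute and the dictionary of settings are assumptions, not part of Hive.

```python
import re

def substitute(query, hiveconf):
    """Replace every ${hiveconf:name} in the query with its value."""
    def repl(match):
        # group(1) is the variable name between "hiveconf:" and "}".
        return hiveconf[match.group(1)]
    return re.sub(r"\$\{hiveconf:([^}]+)\}", repl, query)

settings = {"year": "2014"}
print(substitute("select * from log where year=${hiveconf:year}", settings))
```

Since the value is pasted in as text, string values need their own quotes in the query.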
Hive Command Line Interface (CLI)
Hive has a command line interface to run queries in interactive mode
Following are the command line options
Interactive Mode
Hive runs in interactive mode when started without the options
-e "inline queries"
-f "code file"
Comments start with --
All code lines need to end with ;
<query>; runs the query
exit; or quit; exits the interactive mode
set: lists all the variables in the Hive environment
set -v: lists all possible standard variables in the system
add file <value>: adds the list of resources to the environment
list <file|jar|archive>: searches the resources of that type
dfs: executes a dfs command from Hive
Adding Resources to Hive
Hive can add resources to the session
Any local file
All the added resources will be distributed to all the nodes of the Hadoop cluster
Add { file | jar | archive } <filePath>*
List { file | jar | archive } <filePath>*
Delete { file | jar | archive } <filePath>*
Example
hive> add file /mycode/code.py;
hive> list file;
/tmp/code.py
What is Apache Thrift?
Thrift is a glue between many languages
RPC (Remote Procedure Calls) are used to call functions from other languages
Any language which supports RPC can be used with Thrift
Developed at Facebook
Cross-language serialization with lower overhead
Allows for the use of other languages: C++, Java, Python, PHP, Ruby, C#, Perl, JavaScript, etc.
Allows use of definition files
Inner Workings of Thrift
Create a *.thrift file to declare objects and procedures
These will be used to communicate between Hive and other languages
Execute the Thrift tool
Generate the Thrift platform code for your language of choice
Create the client-server application
Thrift will create transport classes and define objects and functions
Run the Hive Server
Note: Complex, and needs a lot of hacking to make it work
Using Transform
Hive allows the user to use scripting languages to modify the data
Can use any scripting language
Syntax
hive> add file /mycode/script.pl;
hive> Insert overwrite table results
Select transform(t.*) using '/mycode/script.pl' as (col1, col2)
From (select * …) t;
Transform
Input to script: Data is received as tab-separated values
Output of script: Data is emitted as tab-separated values
Script can be built in any language, so long as the cluster has the parser/interpreter for it
Add command will move the script to all the nodes on the cluster
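A transform script is therefore just a filter over tab-separated lines. The Python sketch below shows the shape of such a script; the per-row logic (first field plus the number of fields) is purely illustrative, and the in-memory streams stand in for the standard input and output that Hive would connect.

```python
import io

def transform(in_stream, out_stream):
    """Read tab-separated rows and write tab-separated (col1, col2) rows."""
    for line in in_stream:
        fields = line.rstrip("\n").split("\t")
        # Illustrative logic only: emit the first field and the field count.
        out_stream.write("%s\t%d\n" % (fields[0], len(fields)))

# Local check with in-memory streams; a real script would call
# transform(sys.stdin, sys.stdout) instead.
out = io.StringIO()
transform(io.StringIO("a\tb\tc\nx\ty\n"), out)
print(out.getvalue(), end="")
```

Keeping the logic in a function makes the script easy to test before it is added to the session.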
Custom Map and Reduce Script
Transform allows for custom map and reduce scripts
From (
  From users
  Map users.uid, users.date
  using 'mapsideScript'
  as dt, id
  cluster by dt) mapoutput
Insert overwrite table usersreduce
  reduce mapoutput.dt, mapoutput.id
  using 'reducescript'
  as date, count;
Output type of transform
Default: the output fields will be converted to String and delimited by "\t"
Null values will be converted to the literal \N
Default can be changed by Row Format
Select Transform(..)
Using 'SomeScript'
As Col1, Col2
Select Transform(…)
Using 'SomeScript'
As (Col1 int, col2 int)
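The default output conventions can be illustrated from the reading side. The Python sketch below splits one output line on tabs, maps the literal \N to a null, and optionally applies casts, similar in spirit to declaring typed columns in the As clause. The function name parse_row and the casts parameter are assumptions for illustration.

```python
def parse_row(line, casts=None):
    """Decode one tab-delimited line emitted by a transform script."""
    fields = line.rstrip("\n").split("\t")
    # The two-character literal \N marks a NULL value.
    values = [None if f == "\\N" else f for f in fields]
    if casts is not None:
        # Apply one cast per column, leaving NULLs untouched.
        values = [v if v is None else cast(v)
                  for v, cast in zip(values, casts)]
    return values

print(parse_row("7\t\\N"))            # fields stay strings by default
print(parse_row("7\t8", [int, int]))  # typed columns
```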
Example
Data: Count the number of words in the book
Data is stored as one line per record
Mapper level: Split the words from each line
Emit the word and 1 as cnt (word, 1)
Send the data for each word to the same reducer by using cluster by
Reducer: Count the number of words
Example Python Script
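mapperScript.py
#!/usr/bin/env python
import sys
for line in sys.stdin:
    # Split each input line into words and emit "word<TAB>1"
    words = line.strip().split()
    for word in words:
        print("%s\t1" % word.lower())
reducerscript.py
#!/usr/bin/env python
import sys
# Input arrives grouped by word (cluster by), one "word<TAB>count" per line
(last_key, last_count) = (None, 0)
for line in sys.stdin:
    (key, count) = line.strip().split("\t")
    if last_key and last_key != key:
        # A new word starts: emit the total for the previous word
        print("%s\t%d" % (last_key, last_count))
        (last_key, last_count) = (key, int(count))
    else:
        last_key = key
        last_count += int(count)
# Emit the total for the final word
if last_key:
    print("%s\t%d" % (last_key, last_count))
Example Hive code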
FROM (
  FROM docs
  SELECT TRANSFORM (line)
  USING 'mapperScript.py'
  AS word, count
  CLUSTER BY word
) wc
INSERT OVERWRITE TABLE wordcount
SELECT TRANSFORM (wc.word, wc.count)
USING 'reducerscript.py'
AS word, count
;
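The map, cluster by, reduce flow of this word count can be tried locally without a cluster. The Python sketch below re-implements the mapper and reducer logic as functions (the function names are illustrative) and uses sorted() as a stand-in for the grouping that CLUSTER BY provides between the two scripts.

```python
from itertools import groupby

def mapper(lines):
    """Emit (word, 1) for every word on every line."""
    for line in lines:
        for word in line.strip().split():
            yield (word.lower(), 1)

def reducer(pairs):
    """Sum the counts per word; pairs must arrive grouped by word."""
    for key, group in groupby(pairs, key=lambda kv: kv[0]):
        yield (key, sum(count for _, count in group))

def word_count(lines):
    # sorted() stands in for the shuffle that brings equal words together.
    return list(reducer(sorted(mapper(lines))))

print(word_count(["The cat", "the hat the"]))
```

If the sort step is skipped, the same word can reach the reducer in separate runs and be counted more than once as a distinct group, which is why the Hive query clusters by word.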
User Defined Function (UDF)
User defined functions can be used the same way as standard functions
UDFs are written in Java:
Extend the UDF class
Should contain an evaluate method
Three types of UDF are supported:
Standard UDF
User Defined Table Function (UDTF): turns a single input row into multiple output rows (mapper)
User Defined Aggregate Function (UDAF): aggregates multiple values into one value (reducer)
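The three kinds differ only in how many values go in and come out. Real Hive UDFs are Java classes; the Python functions below are an analogy of the three shapes and nothing more, with names invented for illustration.

```python
def upper_udf(value):
    """Standard UDF shape: one value in, one value out."""
    return None if value is None else value.upper()

def split_udtf(row):
    """UDTF shape: one input row in, several output rows out."""
    for item in row.split(","):
        yield item

def sum_udaf(values):
    """UDAF shape: many values in, one aggregated value out."""
    return sum(values)

print(upper_udf("hive"))
print(list(split_udtf("a,b,c")))
print(sum_udaf([1, 2, 3]))
```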
Custom UDF
package com.ulfberhtlab.hive.udf;
import org.apache.hadoop.hive.ql.exec.UDF;
import org.apache.hadoop.io.Text;
public final class Upper extends UDF {
  // Returns the input text in upper case; a NULL input gives NULL
  public Text evaluate(final Text s) {
    if (s == null) { return null; }
    return new Text(s.toString().toUpperCase());
  }
}
Deploy the Jar File
Compile the code into the JAR file
hive> add jar mycode.jar;
Added mycode.jar to class path
hive> list jars;
mycode.jar
Register the function
hive> create temporary function myupper as 'com.ulfberhtlab.hive.udf.Upper';
Calling the UDF function
hive> select myupper(word), sum(freq) from ngram
group by myupper(word);
Serializer / Deserializer (SerDe)
There is no requirement for data to be in a Hive format
Data is not verified on insert or load
Data is just moved a bit
Files are simply stored
This allows for very fast movement of data
Errors are discovered when we query the data
Data may not be in a standard format: logs, unstructured data, etc.
Hive uses SerDe to control how to read and write the files
SerDe
Default: LazySimpleSerDe
Parses the data based on the delimiters into typed objects
Uses lazy creation of objects for better performance
RegexSerDe: Uses regular expressions to parse the file
User can create custom SerDe: read binary files, etc.
The process to deploy a SerDe is the same as for a UDF
Example
Combine the data from three CSV fields into two columns
Sample data { movieid, year, title }
1234, 2014, The god Father
3435, 2014, Gravity..
Create table movies (movieid string, details string)
Row format serde 'org.apache.hadoop.hive.contrib.serde2.RegexSerDe'
With serdeproperties ("input.regex" = "(\\d*),(.*)");
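What the regular expression does to a sample row can be checked with Python's re module. This sketch assumes the whole line has to match the pattern and that each capture group fills one column in order; the function name to_columns is illustrative.

```python
import re

# Same pattern as input.regex; "\\d" in the quoted property is \d.
INPUT_REGEX = re.compile(r"(\d*),(.*)")

def to_columns(line):
    """Return (movieid, details) for a line, or None if it does not match."""
    m = INPUT_REGEX.fullmatch(line)
    return None if m is None else (m.group(1), m.group(2))

print(to_columns("1234, 2014, The god Father"))
print(to_columns("3435, 2014, Gravity.."))
```

The first group stops at the first comma, so year and title end up together in details, including the space that follows the comma in the sample data.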
Appendix
RowNumber UDF
UDFRowSequence
package org.apache.hadoop.hive.contrib.udf;
import org.apache.hadoop.hive.ql.exec.Description;
import org.apache.hadoop.hive.ql.exec.UDF;
import org.apache.hadoop.hive.ql.udf.UDFType;
import org.apache.hadoop.io.LongWritable;
/** * UDFRowSequence. */
@Description(name = "row_sequence",
    value = "_FUNC_() - Returns a generated row sequence number starting from 1")
@UDFType(deterministic = false, stateful = true)
public class UDFRowSequence extends UDF {
  private LongWritable result = new LongWritable();
  public UDFRowSequence() {
    result.set(0);
  }
  public LongWritable evaluate() {
    result.set(result.get() + 1);
    return result;
  }
}
// End UDFRowSequence.java
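The counter logic of this UDF is small enough to mirror in Python. The class below is only an analogy of the stateful evaluate method; the class name RowSequence is made up, and the sketch assumes a single instance, so it says nothing about how numbers behave across parallel tasks.

```python
class RowSequence:
    """Hands out 1, 2, 3, ... on successive evaluate() calls."""

    def __init__(self):
        self.result = 0

    def evaluate(self):
        # The state kept between calls is what makes the function stateful.
        self.result += 1
        return self.result

seq = RowSequence()
numbered = [(seq.evaluate(), row) for row in ["a", "b", "c"]]
print(numbered)
```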