Introduction to Map-Reduce
Vincent Leroy
Sources
• Apache Hadoop
• Yahoo! Developer Network
• Hortonworks
• Cloudera
• Practical Problem Solving with Hadoop and Pig
«Big Data»
• Google, 2008
– 20 PB/day
– 180 GB/job (variable)
• Web index
– 50B pages
– 15 PB
• Large Hadron Collider (LHC) @ CERN: produces 15 PB/year
Capacity of a (large) server
• RAM: 256 GB
• Hard drive capacity: 24 TB
• Hard drive throughput: 100 MB/s
Solution: Parallelism
• 1 server
– 8 disks
– Read the Web: 230 days
• Hadoop cluster @ Yahoo
– 4,000 servers
– 8 disks/server
– Read the Web in parallel: 1h20
Google datacenter
[Photo: a Google datacenter]
Pitfalls in parallelism
• Synchronization
– Mutexes, semaphores…
• Difficulties
– Deadlocks
– Optimization
– Costly (requires experts)
– Not reusable
Programming models
• Shared memory (multicore)
• Message passing (MPI)
Fault tolerance
• A server fails every few months
• With 1,000 servers…
– MTBF (mean time between failures) < 1 day
• A big job may take several days
– There will be failures; this is normal
– Computations should still finish within a reasonable time
→ You cannot start over in case of failures
• Checkpointing, replication
– Hard to implement correctly
Big Data platform
• Let everyone write programs for massive datasets
– Encapsulate parallelism
• Programming model
• Deployment
– Encapsulate fault tolerance
• Detect and handle failures
→ Code once (experts), benefit to all
MAP-REDUCE MODEL
What are Map and Reduce?
• 2 simple functions inspired by functional programming
– Transformation: map
map(f, [x1, …, xn]) = [f(x1), …, f(xn)]
Ex: map(*2, [1,2,3]) = [(*2 1), (*2 2), (*2 3)] = [2,4,6]
– Aggregation: reduce
reduce(f, [x1, …, xn]) = f(x1, f(x2, f(x3, … f(xn-1, xn) …)))
Ex: reduce(+, [2,4,6]) = (+ 2 (+ 4 6)) = 12
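As a minimal illustration of the shape of these two primitives in plain Java (using java.util.stream, not Hadoop; the class name is made up):

import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

public class MapReducePrimitives {
    public static void main(String[] args) {
        List<Integer> xs = Arrays.asList(1, 2, 3);
        // map(*2, [1,2,3]) = [2,4,6]
        List<Integer> doubled = xs.stream().map(x -> x * 2).collect(Collectors.toList());
        // reduce(+, [2,4,6]) = 12
        int sum = doubled.stream().reduce(0, Integer::sum);
        System.out.println(doubled + " -> " + sum); // prints [2, 4, 6] -> 12
    }
}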
What are Map and Reduce?
• Generic
– Take a function as a parameter
• Can be instantiated and combined to solve many different problems
– map(toUpperCase, [“hello”, “data”]) = [“HELLO”, “DATA”]
– reduce(max, [87, 12, 91]) = 91
• The developer provides the function applied
Data as key/value pairs
• MapReduce does not manipulate atomic pieces of data
– Everything is a (Key, Value) pair
– Key and value can be of any type
• Ex: (Hello, 17)
– Key = Hello, type text
– Value = 17, type int
• When the initial data is not key/value, interpret it as key/value
– An input text file becomes [(#line, line_content), …]
Map-Reduce on key/value pairs
• Map and Reduce adjusted to key/value pairs
– In map, f is applied independently to every key/value pair:
f(key, value) → list(key, value)
– In reduce, f is applied to all values associated with the same key:
f(key, list(value)) → list(key, value)
– The types of the input keys and values do not have to be the same as those of the output
Example: Counting frequency of words
• Input: A file of 2 lines
– 1, "a b c aa b c"
– 2, "a bb cc a cc b"
• Output
– a, 3
– b, 3
– c, 2
– aa, 1
– bb, 1
– cc, 2
Word frequency: Mapper
• Map processes a portion (line) of text
– Split words
– For each word, count one occurrence
– Key (line number) not used in this example
• map(Int lineNumber, Text line, Output output) {
    foreach word in line.split(space) {
        output.write(word, 1)
    }
}
Word frequency: Reducer
• For each key, reduce processes all the corresponding values
– Add the numbers of occurrences
• reduce(String word, List<Int> occurrences, Output output) {
    int count = 0
    foreach int occ in occurrences {
        count += occ
    }
    output.write(word, count)
}
Execution flow (word frequency example)
Input: 1, "a b c aa b c"    2, "a bb cc a cc b"
Map output: a,1 b,1 c,1 aa,1 b,1 c,1    a,1 bb,1 cc,1 a,1 cc,1 b,1
Shuffle & Sort (group by key): a,[1,1,1]  b,[1,1,1]  c,[1,1]  aa,[1]  bb,[1]  cc,[1,1]
Reduce output: a,3  b,3  c,2  aa,1  bb,1  cc,2
How to build a Web index?
• Initial data: (URL, web_page_content)
• Goal: build an inverted index (word → list of URLs), e.g.:
Grenoble →
https://fr.wikipedia.org/wiki/Grenoble
http://www.grenoble.fr/
http://www.grenoble-tourisme.com/
http://wikitravel.org/en/Grenoble
UNIL →
http://www.unil.ch/
https://fr.wikipedia.org/wiki/Universit%C3%A9_de_Lausanne
https://twitter.com/unil
http://www.formation-continue-unil-epfl.ch/
How to build a Web index?
• map(URL pageURL, Text pageContent, Output output) {
    foreach word in pageContent.parse() {
        output.write(word, pageURL)
    }
}
How to build a Web index?
• reduce(Text word, List<URL> webPages, Output output) {
    postingList = initPostingList()
    foreach url in webPages {
        postingList.add(url)
    }
    output.write(word, postingList)
}
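The same pseudocode in Hadoop Java might look as follows. This is a sketch under assumptions not on the slides: the input format delivers (URL, content) as Text pairs (e.g., KeyValueTextInputFormat), tokenization is a simple whitespace split, and the posting list is emitted as comma-separated Text; the class names are hypothetical.

import java.io.IOException;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;

// IndexMapper.java: key = URL of the page, value = page content; emits (word, URL)
public class IndexMapper extends Mapper<Text, Text, Text, Text> {
    @Override
    protected void map(Text url, Text content, Context context)
            throws IOException, InterruptedException {
        for (String word : content.toString().split("\\s+")) {
            context.write(new Text(word), url);
        }
    }
}

// IndexReducer.java: concatenates all URLs containing a word into one posting list
public class IndexReducer extends Reducer<Text, Text, Text, Text> {
    @Override
    protected void reduce(Text word, Iterable<Text> urls, Context context)
            throws IOException, InterruptedException {
        StringBuilder postings = new StringBuilder();
        for (Text url : urls) {
            if (postings.length() > 0) postings.append(',');
            postings.append(url.toString());
        }
        context.write(word, new Text(postings.toString()));
    }
}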
APACHE HADOOP: MAPREDUCE FRAMEWORK
Objective of Hadoop MapReduce
• Provide a simple and generic programming model: map and reduce
• Deploy execution automatically
• Provide fault tolerance
• Scale to thousands of machines
• Performance is important but not the priority
– What matters is that jobs finish within a reasonable time
– If it's too slow, add servers! Kill It With Iron (KIWI principle)
Architecture
• From a monolithic architecture to composable layers
Execution steps
Shuffle & Sort: group by key and transfer to the reducers
Shuffle & Sort
• Barrier in the execution
– All map tasks must complete before reduce starts
• A partitioner assigns keys to the servers executing reduce
– Ex: hash(key) % nbServers
– Deals with load balancing
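The default behaviour corresponds to Hadoop's HashPartitioner. A custom partitioner is a small class like the following sketch (the word-count key/value types are an assumption):

import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Partitioner;

// Assigns each map output key to one of the reducers: hash(key) % nbServers
public class WordPartitioner extends Partitioner<Text, IntWritable> {
    @Override
    public int getPartition(Text key, IntWritable value, int numPartitions) {
        // mask the sign bit so the result is non-negative
        return (key.hashCode() & Integer.MAX_VALUE) % numPartitions;
    }
}

// Enabled in the driver with: job.setPartitionerClass(WordPartitioner.class);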
Combiner
• Potential problem of a map function: many key/value pairs in the output
– Materialized to disk, sent to the reducer over the network
– Costly step of the execution
• Add an operator: Combiner
– Mini-reducer executed on the data produced by map on a single machine, to start aggregating it
• The combiner may be used by Hadoop (optional)
– The correctness of the program must not depend on it
Combiner: key/value types

          Input (key, value)    Output (key, value)
Map       (MKI, MVI)            (MKO, MVO)
Combine   (CKI, CVI)            (CKO, CVO)
Reduce    (RKI, RVI)            (RKO, RVO)

Since the combiner is optional and sits between map and reduce, its input and output types must both match the map output types: (CKI, CVI) = (CKO, CVO) = (MKO, MVO) = (RKI, RVI).
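Given these type constraints, a word-count combiner for the Java code shown later could be a sketch like this (WordCountCombiner is a hypothetical name; note that the WordCountReducer shown later outputs LongWritable while the map outputs IntWritable, so the reducer class cannot be reused as the combiner directly):

import java.io.IOException;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Reducer;

// Input AND output types must both match the map output types (Text, IntWritable)
public class WordCountCombiner extends Reducer<Text, IntWritable, Text, IntWritable> {
    @Override
    protected void reduce(Text key, Iterable<IntWritable> values, Context context)
            throws IOException, InterruptedException {
        int sum = 0;
        for (IntWritable v : values) {
            sum += v.get();
        }
        // emit a partial count; the real reducer finishes the aggregation
        context.write(key, new IntWritable(sum));
    }
}

It would be enabled in the driver with job.setCombinerClass(WordCountCombiner.class); as the slides note, correctness must not depend on whether (or how many times) Hadoop actually runs it.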
Combiner: execution flow (word frequency example)
Input: 1, "a b c aa b c"    2, "a bb cc a cc b"
Map output: a,1 b,1 c,1 aa,1 b,1 c,1    a,1 bb,1 cc,1 a,1 cc,1 b,1
Combiner output (per machine): a,1 b,2 c,2 aa,1    a,2 bb,1 cc,2 b,1
Shuffle & Sort: a,[1,2]  b,[2,1]  c,[2]  aa,[1]  bb,[1]  cc,[2]
Reduce output: a,3  b,3  c,2  aa,1  bb,1  cc,2
Combiner
• Same API as reduce: (key, List<value>)
– Not the same contract! For one key, you get SOME of the values
• Often the same aggregation as reduce
– E.g., word count
• Different when using global properties
– E.g., keep words present at least 5 times (see the sketch below)
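A sketch of that last point (the ≥ 5 threshold is the slide's example; the class name is hypothetical). A combiner may still pre-sum counts, but only the reducer, which sees ALL the values of a key, may apply the global filter:

import java.io.IOException;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Reducer;

// Emit only words seen at least 5 times overall. A combiner must NOT apply
// this filter: it only sees some of the values for a key, so it could
// wrongly discard a word whose partial count is below the threshold.
public class FrequentWordReducer extends Reducer<Text, IntWritable, Text, LongWritable> {
    @Override
    protected void reduce(Text key, Iterable<IntWritable> values, Context context)
            throws IOException, InterruptedException {
        long sum = 0;
        for (IntWritable v : values) {
            sum += v.get();
        }
        if (sum >= 5) {
            context.write(key, new LongWritable(sum));
        }
    }
}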
Hadoop MapReduce as a developer
• Provide the functions performed by Map and Reduce (Java, C++)
– Application dependent
• Define the data types (keys/values)
– If not standard (Text, IntWritable…)
– Functions for serialization
• That's all.
Imports
import java.io.IOException;
import java.util.*;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.JobContext;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
import org.apache.hadoop.mapreduce.Job;
// also needed by the Main class below:
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.util.GenericOptionsParser;
import org.apache.hadoop.mapreduce.lib.input.TextInputFormat;
import org.apache.hadoop.mapreduce.lib.output.TextOutputFormat;

Do not use the old mapred API!
Mapper
// input key type, input value type, output key type, output value type
public class WordCountMapper extends Mapper<LongWritable, Text, Text, IntWritable> {
    @Override
    protected void map(LongWritable key, Text value, Context context)
            throws IOException, InterruptedException {
        // key = byte offset of the line (TextInputFormat), value = line content
        for (String word : value.toString().split("\\s+")) {
            context.write(new Text(word), new IntWritable(1));
        }
    }
}
Reducer
// input key type, input value type, output key type, output value type
public class WordCountReducer extends Reducer<Text, IntWritable, Text, LongWritable> {
    @Override
    protected void reduce(Text key, Iterable<IntWritable> values, Context context)
            throws IOException, InterruptedException {
        long sum = 0;
        for (IntWritable value : values) {
            sum += value.get();
        }
        context.write(key, new LongWritable(sum));
    }
}
Main
public class WordCountMain {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        String[] otherArgs = new GenericOptionsParser(conf, args).getRemainingArgs();
        Job job = Job.getInstance(conf, "word count");
        job.setJarByClass(WordCountMain.class);
        job.setMapOutputKeyClass(Text.class);
        job.setMapOutputValueClass(IntWritable.class);
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(LongWritable.class);
        job.setMapperClass(WordCountMapper.class);
        job.setReducerClass(WordCountReducer.class);
        job.setInputFormatClass(TextInputFormat.class);
        job.setOutputFormatClass(TextOutputFormat.class);
        FileInputFormat.addInputPath(job, new Path(otherArgs[0]));
        FileOutputFormat.setOutputPath(job, new Path(otherArgs[1]));
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}
Writable example
public class StringAndInt implements WritableComparable<StringAndInt> {
    private IntWritable iw = new IntWritable();
    private Text t = new Text();

    public StringAndInt() {}

    public StringAndInt(String s, int i) {
        this.iw.set(i);
        this.t.set(s);
    }

    @Override
    public void write(DataOutput out) throws IOException {
        // serialize both fields
        this.iw.write(out);
        this.t.write(out);
    }

    @Override
    public void readFields(DataInput in) throws IOException {
        // deserialize in the same order as write()
        this.iw.readFields(in);
        this.t.readFields(in);
    }

    @Override
    public int compareTo(StringAndInt o) {
        int c1 = this.t.compareTo(o.t);
        if (c1 != 0) { return c1; }
        return this.iw.compareTo(o.iw);
    }
}
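A usage note (an assumption, not shown on the slide): the type would be registered in the driver with job.setMapOutputKeyClass(StringAndInt.class); and, if it is used as a key with the default HashPartitioner, it should also override hashCode() consistently so that equal keys land on the same reducer.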
Terminology
• MapReduce program = job
• Jobs are submitted to the JobTracker
• A job is divided into several tasks
– A Map is a task
– A Reduce is a task
• Tasks are monitored by TaskTrackers
– A slow task is called a straggler
Job execution
• $ hadoop jar wordcount.jar org.myorg.WordCount inputPath (HDFS) outputPath (HDFS)
• Check parameters
– Is there an output directory?
– Does it already exist?
– Is there an input directory?
• Compute splits
• The job (MapReduce code), its configuration, and the splits are copied with a high replication factor
• An object to follow the progress of the tasks is created by the JobTracker
• For each split, create a Map task
• Create the default number of Reduce tasks
TaskTracker
• The TaskTracker sends a periodic signal to the JobTracker
– Shows that the node still functions
– Tells whether the TaskTracker is ready to accept a new task
• A TaskTracker is responsible for a node
– Fixed number of slots for map tasks
– Fixed number of slots for reduce tasks
– Tasks can be from different jobs
• Each task runs in its own JVM
– Prevents a task crash from crashing the TaskTracker as well
Job progress
• A Map task reports its progress, i.e., the fraction of the split processed
• For a Reduce task, 3 states
– copy
– sort
– reduce
• Reports are sent to the TaskTracker
• Every 5 seconds, the report is forwarded to the JobTracker
• The user can see the JobTracker state through a Web interface
Progress
[Screenshot: job progress in the JobTracker Web interface]
End of job
• The output of each reducer is written to a file
• The JobTracker notifies the client and writes a report for the job:

14/10/28 11:54:25 INFO mapreduce.Job: Job job_1413131666506_0070 completed successfully
Job Counters
    Launched map tasks=392
    Launched reduce tasks=88
    Data-local map tasks=392
    [...]
Map-Reduce Framework
    Map input records=622976332
    Map output records=622952022
    Reduce input groups=54858244
    Reduce input records=622952022
    Reduce output records=546559709
    [...]
Server failure during a job
• Bug in a task
– The task JVM crashes → the TaskTracker JVM is notified
– The task is removed from its slot
• Task becomes unresponsive
– Timeout after 10 minutes
– The task is removed from its slot
• Each task may be re-run up to N times (default 7) in case of crashes
HDFS: DISTRIBUTED FILE SYSTEM
Random vs sequential disk access
• Example
– DB of 100M users
– 100 B/user
– Alter 1% of the records
• Random access
– Seek, read, write: 30 ms
– 1M users → 8h20
• Sequential access
– Read ALL, write ALL
– 2 × 10 GB @ 100 MB/s → ≈ 3 minutes
→ It is often faster to read everything and write everything sequentially
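Checking the arithmetic (a quick sanity check, not on the slide): random access touches 1% of 100M = 1M records at 30 ms each, i.e., 1M × 30 ms = 30,000 s ≈ 8h20. Sequentially, the whole table is 100M × 100 B = 10 GB; reading it and writing it back is 20 GB at 100 MB/s = 200 s ≈ 3.3 minutes.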
Distributed File System (HDFS)
• Goals
– Fault tolerance (redundancy)
– Performance (parallel access)
• Large files
– Sequential reads
– Sequential writes
• "In place" data processing
– Data is stored on the machines that process it
• Better usage of machines (no dedicated filer)
• Fewer network bottlenecks (better performance)
HDFS model
• Data organized in files and directories
→ mimics a standard file system
• Files divided into blocks (default: 64 MB) spread over the servers
• HDFS reports the data layout to the Map-Reduce framework
→ If possible, process data on the machines where it is already stored
Fault tolerance
• File blocks are replicated (default: 3×) to tolerate failures
• Placement according to different parameters
– Power supply
– Network equipment
– Diverse servers, to increase the probability of having a "close" copy
• Checksum of the data to detect corrupted blocks (also available in modern file systems)
Master/Worker architecture
• One master, the NameNode
– Manages the file name space
– Manages access rights
– Supervises operations on files, blocks…
– Supervises the health of the file system (failures, load balance…)
• Many (1000s of) workers, the DataNodes
– Store the data (blocks)
– Perform read and write operations
– Perform copies (replication, ordered by the NameNode)
NameNode
• Stores the metadata of each file and block (inode)
– File name, directory, associated blocks, position of these blocks, number of replicas…
• Keeps everything in main memory (RAM)
– Limiting factor = number of files
– 60M objects in 16 GB
DataNode
• Manages and monitors the state of the blocks stored on the host file system (often Linux)
• Directly accessed by the clients
→ data never transits through the NameNode
• Sends heartbeats to the NameNode to show that the server has not failed
• Reports to the NameNode if blocks are corrupted
Writing a file
• The client sends a query to the NameNode to create a new file
• The NameNode checks
– Client authorizations
– File system conflicts (existing file…)
• The NameNode chooses DataNodes to store the file and its replicas
– DataNodes are "pipelined"
• Blocks are allocated on these DataNodes
• The stream of data is sent to the first DataNode of the pipeline
• Each DataNode forwards the data it receives to the next DataNode in the pipeline
Reading a file
• The client sends a request to the NameNode to read a file
• The NameNode checks that the file exists and builds a list of DataNodes containing the first blocks
• For each block, the NameNode sends the addresses of the DataNodes hosting it
– List ordered w.r.t. proximity to the client
• The client connects to the closest DataNode containing the 1st block of the file
• When a block read ends:
– Close the connection to the DataNode
– Open a new connection to the DataNode containing the next block
• When all these blocks are read:
– Query the NameNode to retrieve the next batch of blocks
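From a client program, this whole exchange is hidden behind the HDFS Java API. A minimal sketch of writing then reading a file (the path and contents are made up for illustration; FileSystem, Path, and the stream classes are the standard org.apache.hadoop.fs API):

import java.nio.charset.StandardCharsets;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataInputStream;
import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IOUtils;

public class HdfsReadWrite {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        FileSystem fs = FileSystem.get(conf); // talks to the NameNode
        Path path = new Path("/dir/toto.txt"); // hypothetical path

        // Write: the client streams data to a pipeline of DataNodes
        try (FSDataOutputStream out = fs.create(path)) {
            out.write("hello hdfs\n".getBytes(StandardCharsets.UTF_8));
        }

        // Read: the client fetches blocks directly from the DataNodes
        try (FSDataInputStream in = fs.open(path)) {
            IOUtils.copyBytes(in, System.out, 4096, false);
        }
    }
}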
HDFS structure
[Figure: HDFS structure, with the blocks (1, 2, 3, 4) of a file replicated across several DataNodes]
HDFS commands (directories)
• Create directory dir
$ hadoop dfs -mkdir /dir
• List HDFS content
$ hadoop dfs -ls
• Remove directory dir
$ hadoop dfs -rmr /dir
HDFS commands (files)
• Copy local file toto.txt to HDFS dir/
$ hadoop dfs -put toto.txt dir/toto.txt
• Copy HDFS file to local disk
$ hadoop dfs -get dir/toto.txt ./
• Read file /dir/toto.txt
$ hadoop dfs -cat /dir/toto.txt
• Remove file /dir/toto.txt
$ hadoop dfs -rm /dir/toto.txt