DISTRIBUTED RESEARCH ON EMERGING APPLICATIONS & MACHINES
dream-lab.in | Indian Institute of Science, Bangalore
DREAM:Lab

SE252: Lecture 13-14, Feb 24/25
ILO3: Algorithms and Programming Patterns for Cloud Applications (Hadoop)

©DREAM:Lab, 2014. This work is licensed under a Creative Commons Attribution 4.0 International License.

SE252:Lecture 12, Feb 18 ILO3:3:Algorithms and …cds.iisc.ac.in/faculty/simmhan/SE252/lectures/SE252.Jan2015... · DREAM:Lab MapReduce: Programming Model How now Brown cow How does


ILO 3


Patterns & Technologies


MapReduce


MapReduce Design Pattern


MapReduce: Data-parallel Programming Model

map(ki, vi) → List<km, vm>[]

reduce(km, List<vm>[]) → List<kr, vr>[]


MR Borrows from Functional Programming


MapReduce & MPI Scatter-Gather


MapReduce: Programming Model

Map(k1, v1) → list(k2, v2)
Reduce(k2, list(v2)) → list(v2)

Input: "How now Brown cow", "How does It work now"

Map output: <How,1> <now,1> <brown,1> <cow,1> <How,1> <does,1> <it,1> <work,1> <now,1>

Grouped by key: <How,1 1> <now,1 1> <brown,1> <cow,1> <does,1> <it,1> <work,1>

Reduce output: brown 1, cow 1, does 1, How 2, it 1, now 2, work 1


Map


Map

map(String input_key, String input_value):
    // input_key: line number
    // input_value: line of text
    for each Word w in input_value.tokenize()
        EmitIntermediate(w, "1");

(0, “How now brown cow”) → [(“How”, 1), (“now”, 1), (“brown”, 1), (“cow”, 1)]


Map as a transformation:

let map(k, v) = emit(k.toUpper(), v.toUpper())
    (“foo”, “bar”) → (“FOO”, “BAR”)
    (“Foo”, “other”) → (“FOO”, “OTHER”)
    (“key2”, “data”) → (“KEY2”, “DATA”)

Map as a filter:

let map(k, v) = if (isPrime(v)) then emit(k, v)
    (“foo”, 7) → (“foo”, 7)
    (“test”, 10) → (nothing)
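The two map functions above can be sketched in plain Java, without Hadoop, to show that a mapper may emit exactly one pair (a transformation) or zero pairs (a filter). The class name `MapExamplesSketch`, the method names, and the use of a returned list in place of `emit` are assumptions made for illustration only.

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

class MapExamplesSketch {
    // Transformation: always emits one pair with key and value upper-cased.
    static List<Map.Entry<String, String>> upperMap(String k, String v) {
        List<Map.Entry<String, String>> out = new ArrayList<>();
        out.add(new SimpleEntry<>(k.toUpperCase(), v.toUpperCase()));
        return out;
    }

    // Trial-division primality check (hypothetical helper for isPrime).
    static boolean isPrime(int n) {
        if (n < 2) return false;
        for (int d = 2; (long) d * d <= n; d++) {
            if (n % d == 0) return false;
        }
        return true;
    }

    // Filter: emits the pair unchanged if v is prime, otherwise emits nothing.
    static List<Map.Entry<String, Integer>> primeFilterMap(String k, int v) {
        List<Map.Entry<String, Integer>> out = new ArrayList<>();
        if (isPrime(v)) out.add(new SimpleEntry<>(k, v));
        return out;
    }

    public static void main(String[] args) {
        System.out.println(upperMap("foo", "bar"));     // [FOO=BAR]
        System.out.println(primeFilterMap("foo", 7));   // [foo=7]
        System.out.println(primeFilterMap("test", 10)); // []
    }
}
```

Returning a list keeps the sketch close to the Map(k1, v1) → list(k2, v2) signature: an empty list is how a filter says "nothing".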


Reduce


Reduce

reduce(String output_key, Iterator intermediate_values):
    // output_key: a word
    // intermediate_values: a list of counts
    int sum = 0;
    for each v in intermediate_values
        sum += ParseInt(v);
    Emit(output_key, AsString(sum));

(“A”, [1, 1, 1]) → (“A”, 3)
(“B”, [1, 1]) → (“B”, 2)


Word count data flow:

Map 1: for each w in value do emit(w,1)
    "How now Brown cow" → <How,1> <now,1> <brown,1> <cow,1>

Map 2: for all w in value do emit(w,1)
    "How does It work now" → <How,1> <does,1> <it,1> <work,1> <now,1>

Shuffle (grouped by key, one group set per reducer):
    <How,1 1> <now,1 1>
    <brown,1> <cow,1>
    <does,1> <it,1> <work,1>

Reduce: sum = sum + value; emit(key,sum)
    How 2, now 2
    brown 1, cow 1
    does 1, it 1, work 1
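The word count flow above can be reproduced in a single JVM as a rough sketch of the three phases (map, group by key, reduce). This is not Hadoop code: the class name `WordCountSketch` and its methods are hypothetical, and lower-casing every word is an assumption made so that "Brown" and "brown" fall into the same group.

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

class WordCountSketch {
    // map: one line of text -> list of <word, 1> pairs
    static List<Map.Entry<String, Integer>> map(String line) {
        List<Map.Entry<String, Integer>> out = new ArrayList<>();
        for (String w : line.trim().split("\\s+")) {
            // Assumption: words are lower-cased before being emitted.
            if (!w.isEmpty()) out.add(new SimpleEntry<>(w.toLowerCase(), 1));
        }
        return out;
    }

    // reduce: <word, [1, 1, ...]> -> total count for that word
    static int reduce(String key, List<Integer> values) {
        int sum = 0;
        for (int v : values) sum += v;
        return sum;
    }

    static Map<String, Integer> run(List<String> lines) {
        // Shuffle: group all emitted values by key (TreeMap keeps keys sorted).
        Map<String, List<Integer>> groups = new TreeMap<>();
        for (String line : lines) {
            for (Map.Entry<String, Integer> kv : map(line)) {
                groups.computeIfAbsent(kv.getKey(), k -> new ArrayList<>())
                      .add(kv.getValue());
            }
        }
        // Reduce: one call per distinct key.
        Map<String, Integer> result = new TreeMap<>();
        for (Map.Entry<String, List<Integer>> g : groups.entrySet()) {
            result.put(g.getKey(), reduce(g.getKey(), g.getValue()));
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(run(Arrays.asList("How now Brown cow", "How does It work now")));
        // prints {brown=1, cow=1, does=1, how=2, it=1, now=2, work=1}
    }
}
```

In a real Hadoop job the grouping step is done by the framework's shuffle and sort; only `map` and `reduce` are written by the programmer.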


Anagram Example

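import java.io.IOException;
import java.util.Arrays;

import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapred.MapReduceBase;
import org.apache.hadoop.mapred.Mapper;
import org.apache.hadoop.mapred.OutputCollector;
import org.apache.hadoop.mapred.Reporter;

public class AnagramMapper extends MapReduceBase
        implements Mapper<LongWritable, Text, Text, Text> {

    private Text sortedText = new Text();
    private Text originalText = new Text();

    public void map(LongWritable key, Text value,
            OutputCollector<Text, Text> outputCollector, Reporter reporter)
            throws IOException {
        String word = value.toString();
        char[] wordChars = word.toCharArray();
        Arrays.sort(wordChars);
        String sortedWord = new String(wordChars);
        sortedText.set(sortedWord);
        originalText.set(word);
        // Sort the word's letters and emit <sorted word, word>;
        // all anagrams of a word share the same sorted key.
        outputCollector.collect(sortedText, originalText);
    }
}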


Anagram Example…

// Reusable output holders (declarations assumed; not shown on the slide)
private Text outputKey = new Text();
private Text outputValue = new Text();

public void reduce(Text anagramKey, Iterator<Text> anagramValues,
        OutputCollector<Text, Text> results, Reporter reporter)
        throws IOException {
    String output = "";
    while (anagramValues.hasNext()) {
        Text anagram = anagramValues.next();
        output = output + anagram.toString() + "~";
    }
    StringTokenizer outputTokenizer = new StringTokenizer(output, "~");
    // If the values contain more than one word,
    // we have spotted an anagram.
    if (outputTokenizer.countTokens() >= 2) {
        output = output.replace("~", ",");
        outputKey.set(anagramKey.toString());
        outputValue.set(output);
        results.collect(outputKey, outputValue);
    }
}


5-min Assignment


MapReduce for Histogram

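int bucketWidth = 4
Map(k, v) {
    emit(v/bucketWidth, 1)
}

Reduce(k, v[]) {
    sum = 0;
    foreach(w in v[]) sum++;
    emit(k, sum)
}

Input splits (one per mapper):
    7 2 9 6 0 2 5
    2 1 10 3 5 4 0
    11 11 6 2 1 8 1
    2 4 6 8 10 11 0

Map output (bucket,1), one line per mapper:
    1,1 0,1 2,1 1,1 0,1 0,1 1,1
    0,1 0,1 2,1 0,1 1,1 1,1 0,1
    2,1 2,1 1,1 0,1 0,1 2,1 0,1
    0,1 1,1 1,1 2,1 2,1 2,1 0,1

Reduce input (grouped by bucket):
    2,1 2,1 2,1 2,1 2,1 2,1 2,1 2,1
    0,1 0,1 0,1 0,1 0,1 0,1 0,1 0,1 0,1 0,1 0,1 0,1
    1,1 1,1 1,1 1,1 1,1 1,1 1,1 1,1

Reduce output: 2,8 0,12 1,8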


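MapReduce for Histogram

int bucketWidth = 4
Map(k, v) {
    emit(v/bucketWidth, 1)
}

Combine(k, v[]) {
    // same code as reducer
}

Reduce(k, v[]) {
    sum = 0;
    foreach(w in v[]) sum += w;
    emit(k, sum)
}

Input splits (one per mapper):
    8 2 9 6 0 2 5
    2 1 10 3 5 4 0
    11 11 6 2 1 8 1
    2 4 6 8 10 11 0

Map output (bucket,1), one line per mapper:
    2,1 0,1 2,1 1,1 0,1 0,1 1,1
    0,1 0,1 2,1 0,1 1,1 1,1 0,1
    2,1 2,1 1,1 0,1 0,1 2,1 0,1
    0,1 1,1 1,1 2,1 2,1 2,1 0,1

Combiner input (grouped by bucket):
    first combiner:  2,1 2,1 2,1 | 1,1 1,1 1,1 1,1 | 0,1 0,1 0,1 0,1 0,1 0,1 0,1
    second combiner: 2,1 2,1 2,1 2,1 2,1 2,1 | 1,1 1,1 1,1 | 0,1 0,1 0,1 0,1 0,1

Combiner output:
    first combiner:  2,3 1,4 0,7
    second combiner: 2,6 1,3 0,5

Reduce input (grouped by bucket):
    2,3 2,6
    1,4 1,3
    0,7 0,5

Reduce output: 2,9 1,7 0,12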
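The histogram with a combiner can be sketched in plain Java to show why the reducer must use `sum += w` rather than `sum++`: after combining, each incoming value is a partial count, not a single 1. The class `HistogramSketch` and its methods are hypothetical names, the input values are the four splits as read from the slide, and running the combiner once per split is an assumption (the slide appears to combine the output of two mappers at a time; the final totals are the same either way).

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

class HistogramSketch {
    static final int BUCKET_WIDTH = 4;

    // map: value -> bucket id (the emitted pair is <bucket, 1>)
    static int bucketOf(int v) {
        return v / BUCKET_WIDTH;
    }

    // reduce, reused as the combiner: add up partial counts for one bucket
    static int sum(List<Integer> counts) {
        int s = 0;
        for (int c : counts) s += c;
        return s;
    }

    // Map one split, then combine locally: bucket -> partial count
    static Map<Integer, Integer> mapAndCombine(int[] split) {
        Map<Integer, List<Integer>> grouped = new TreeMap<>();
        for (int v : split) {
            grouped.computeIfAbsent(bucketOf(v), b -> new ArrayList<>()).add(1);
        }
        Map<Integer, Integer> partial = new TreeMap<>();
        for (Map.Entry<Integer, List<Integer>> e : grouped.entrySet()) {
            partial.put(e.getKey(), sum(e.getValue()));
        }
        return partial;
    }

    static Map<Integer, Integer> run(int[][] splits) {
        // Shuffle: collect the combined partial counts per bucket.
        Map<Integer, List<Integer>> shuffled = new TreeMap<>();
        for (int[] split : splits) {
            for (Map.Entry<Integer, Integer> e : mapAndCombine(split).entrySet()) {
                shuffled.computeIfAbsent(e.getKey(), b -> new ArrayList<>())
                        .add(e.getValue());
            }
        }
        // Reduce: same summing code as the combiner.
        Map<Integer, Integer> histogram = new TreeMap<>();
        for (Map.Entry<Integer, List<Integer>> e : shuffled.entrySet()) {
            histogram.put(e.getKey(), sum(e.getValue()));
        }
        return histogram;
    }

    public static void main(String[] args) {
        int[][] splits = {
            {8, 2, 9, 6, 0, 2, 5},
            {2, 1, 10, 3, 5, 4, 0},
            {11, 11, 6, 2, 1, 8, 1},
            {2, 4, 6, 8, 10, 11, 0}
        };
        System.out.println(run(splits)); // prints {0=12, 1=7, 2=9}
    }
}
```

Reusing the reducer as the combiner works here because addition is associative and commutative; a reducer that counts its inputs with `sum++` would report the number of partial results instead of the number of values.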


Hadoop Execution Model


Hadoop MapReduce & HDFS


HDFS Read/Write


Scheduling a MR Job


MapReduce w/ 1 & N Reducers


Map only job


Pipelining during Shuffle & Sort


Sorting using MapReduce (Map Only)


Sorting using MapReduce


MapReduce Execution Overview


MapReduce Execution Overview


MapReduce Execution Overview


MapReduce Resources


MapReduce Resources


MapReduce Execution Overview


MapReduce Execution Overview


MapReduce Execution Overview


MapReduce Execution Overview


Locality


Fault Tolerance


Optimizations


Reminder