Hadoop Overview 2



Agenda

• What is Big Data & Hadoop?
• Core Hadoop
• The Hadoop Ecosystem
• Use Case


Hadoop Base Platform


Hadoop Cluster


Network Topology

A common Hadoop cluster architecture consists of a two-level network topology, as illustrated in the diagram. Typically there are 30 to 40 servers per rack (only three are shown in the diagram), with a 1 Gb switch for the rack and an uplink to a core switch or router (which is normally 1 Gb or better). The salient point is that the aggregate bandwidth between nodes on the same rack is much greater than that between nodes on different racks.
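Hadoop learns this rack layout from a user-supplied topology script, configured via the topology.script.file.name property (net.topology.script.file.name in later releases). The script is handed one or more hostnames or IP addresses as arguments and must print one rack path per argument. The sketch below is a minimal, illustrative example that assumes hostnames encode the rack (e.g. rack1-node07); a real cluster would consult an inventory instead.

#!/usr/bin/env python
# Minimal rack-topology script (illustrative). Hadoop calls it with hostnames/IPs
# as arguments and reads the corresponding rack paths from stdout.
import sys

def rack_of(host):
    # Assumption for this sketch: hostnames look like "rack1-node07".
    prefix = host.split("-")[0]
    return "/" + prefix if prefix.startswith("rack") else "/default-rack"

if __name__ == "__main__":
    args = sys.argv[1:]
    print(" ".join(rack_of(h) for h in args) if args else "/default-rack")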


Hadoop Distributed File System (HDFS)

• Files split into 128 MB blocks
• Blocks replicated across several datanodes (usually 3)
• Single namenode stores metadata (file names, block locations, etc.)
• Optimized for large files, sequential reads
• Files are append-only

[Diagram: a namenode holding the block metadata, with blocks 1-4 replicated across several datanodes]
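A quick back-of-the-envelope helper (plain Python, not an HDFS API) makes the block model concrete: with the defaults above, a 1 GB file splits into 8 blocks, and 3-way replication means 24 block replicas are stored across the cluster.

# Illustrative only: how many blocks and replicas a file of a given size produces
BLOCK_SIZE_MB = 128
REPLICATION = 3

def hdfs_footprint(file_size_mb):
    blocks = -(-file_size_mb // BLOCK_SIZE_MB)   # ceiling division
    return blocks, blocks * REPLICATION

print(hdfs_footprint(1024))   # -> (8, 24)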


Block Replication

The namenode makes all decisions regarding replication of blocks. It periodically receives a Heartbeat and a Block report from each of the datanodes in the cluster. Receipt of a Heartbeat implies that the datanode is functioning properly. A Block report contains a list of all blocks on a datanode.
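The bookkeeping behind Heartbeats and Block reports can be sketched in a few lines of illustrative Python (this is not the namenode's actual implementation, and the timeout value is an assumption for the sketch):

import time

HEARTBEAT_TIMEOUT_SECS = 10 * 60   # assumption; HDFS derives its timeout from configuration

last_heartbeat = {}                # datanode id -> time of last Heartbeat
block_map = {}                     # datanode id -> set of block ids it reported

def on_heartbeat(datanode_id):
    # A Heartbeat implies the datanode is functioning properly
    last_heartbeat[datanode_id] = time.time()

def on_block_report(datanode_id, block_ids):
    # A Block report lists all blocks currently stored on the datanode
    block_map[datanode_id] = set(block_ids)

def dead_datanodes(now=None):
    now = time.time() if now is None else now
    return [d for d, t in last_heartbeat.items() if now - t > HEARTBEAT_TIMEOUT_SECS]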


MapReduce data flow


MapReduce Programming Model

Data type: key-value records

Map function: (K_in, V_in) → list(K_inter, V_inter)

Reduce function: (K_inter, list(V_inter)) → list(K_out, V_out)
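As a concrete instantiation of these signatures, here is an illustrative Python sketch based on the word count example that follows (the type names are informal stand-ins for K_in, V_in, and so on):

from typing import Iterable, Iterator, Tuple

# K_in = byte offset of the line, V_in = line text
# K_inter = word, V_inter = 1; K_out = word, V_out = total count
def map_fn(k_in: int, v_in: str) -> Iterator[Tuple[str, int]]:
    for word in v_in.split():
        yield (word, 1)

def reduce_fn(k_inter: str, v_inter: Iterable[int]) -> Iterator[Tuple[str, int]]:
    yield (k_inter, sum(v_inter))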


MapReduce Operation

[Diagram: a large data collection (words such as "cat", "bat", "dog" and other words, terabytes in size) is split into split 1 … split n to supply multiple processors; each split feeds a Map task that emits <key, 1> pairs, and the Reducers (say, Count) write per-key counts to output partitions P-0000, P-0001, P-0002.]

MAP: input data → <key, value> pairs
REDUCE: <key, value> pairs → <result>


MapReduce Example: Word Count

def mapper(line):
    for word in line.split():
        output(word, 1)

def reducer(key, values):
    output(key, sum(values))
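The pseudocode above assumes an output() helper provided by the framework. A runnable equivalent can be written as Hadoop Streaming scripts (a sketch; the file names mapper.py and reducer.py are illustrative). The framework sorts the map output by key, so the reducer only has to sum counts for consecutive identical words.

# mapper.py -- emit "word<TAB>1" for every word read from stdin
import sys

for line in sys.stdin:
    for word in line.split():
        print(word + "\t1")

# reducer.py -- input arrives sorted by key; sum counts per word
import sys

current_word, current_count = None, 0
for line in sys.stdin:
    word, count = line.rstrip("\n").split("\t", 1)
    if word == current_word:
        current_count += int(count)
    else:
        if current_word is not None:
            print(current_word + "\t" + str(current_count))
        current_word, current_count = word, int(count)
if current_word is not None:
    print(current_word + "\t" + str(current_count))

Both scripts are typically submitted with the Hadoop Streaming JAR via its -mapper and -reducer options.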


Word Count Execution

Input:
  "the quick brown fox"
  "the fox ate the mouse"
  "how now brown cow"

Map output:
  Map 1: (the, 1) (quick, 1) (brown, 1) (fox, 1)
  Map 2: (the, 1) (fox, 1) (the, 1) (ate, 1) (mouse, 1)
  Map 3: (how, 1) (now, 1) (brown, 1) (cow, 1)

After Shuffle & Sort, Reduce output:
  Reduce 1: (brown, 2) (fox, 2) (how, 1) (now, 1) (the, 3)
  Reduce 2: (ate, 1) (cow, 1) (mouse, 1) (quick, 1)


An Optimization: The Combiner

• A combiner is a local aggregation function for repeated keys produced by the same map task

• Works for associative and commutative functions like sum, count, max

• Decreases the size of intermediate data

• Example: map-side aggregation for Word Count:

def combiner(key, values):
    output(key, sum(values))
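The same idea can be applied by hand inside a streaming mapper (sometimes called in-mapper combining). The sketch below is illustrative rather than a Hadoop API, but it shows exactly what the combiner buys: repeated keys from one map are aggregated locally before anything is shuffled.

# Aggregate repeated words locally before emitting them
import sys
from collections import defaultdict

counts = defaultdict(int)
for line in sys.stdin:
    for word in line.split():
        counts[word] += 1            # local aggregation for repeated keys

for word, count in counts.items():
    print(word + "\t" + str(count))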


Word Count with Combiner

Input:
  "the quick brown fox"
  "the fox ate the mouse"
  "how now brown cow"

Map & Combine output:
  Map 1: (the, 1) (quick, 1) (brown, 1) (fox, 1)
  Map 2: (the, 2) (fox, 1) (ate, 1) (mouse, 1)    ← the combiner merged the two "the"s locally
  Map 3: (how, 1) (now, 1) (brown, 1) (cow, 1)

After Shuffle & Sort, Reduce output:
  Reduce 1: (brown, 2) (fox, 2) (how, 1) (now, 1) (the, 3)
  Reduce 2: (ate, 1) (cow, 1) (mouse, 1) (quick, 1)


MapReduce Execution Details

• Single master controls job execution on multiple slaves

• Mappers preferentially placed on same node or same rack as their input block

– Minimizes network usage

• Mappers save outputs to local disk before serving them to reducers

– Allows recovery if a reducer crashes
– Allows having more reducers than nodes


Anatomy of a MapReduce Job Run

• Asks the jobtracker for a new job ID (by calling getNewJobId() on JobTracker) (step2).

• Checks the output specification of the job. For example, if the output directory has not been specified or it already exists, the job is not submitted and an error is thrown to the MapReduce program.

• Computes the input splits for the job. If the splits cannot be computed, because the input paths don’t exist, for example, then the job is not submitted and an error is thrown to the MapReduce program.

• Copies the resources needed to run the job, including the job JAR file, the config-uration file, and the computed input splits, to the jobtracker’s filesystem in a directory named after the job ID. The job JAR is copied with a high replication factor (controlled by the mapred.submit.replication property, which defaults to 10) so that there are lots of copies across the cluster for the tasktrackers to access when they run tasks for the job (step 3)

• Tells the jobtracker that the job is ready for execution (by calling submitJob() on JobTracker) (step 4)

• The client, which submits the MapReduce job.


Fault Tolerance in MapReduce

1. If a task crashes:
– Retry on another node
» OK for a map because it has no dependencies
» OK for reduce because map outputs are on disk

– If the same task fails repeatedly, fail the job or ignore that input block (user-controlled)

Note: For these fault tolerance features to work, your map and reduce tasks must be side-effect-free


Fault Tolerance in MapReduce

2. If a node crashes:
– Re-launch its current tasks on other nodes
– Re-run any maps the node previously ran

» Necessary because their output files were lost along with the crashed node


Fault Tolerance in MapReduce

3. If a task is going slowly (a straggler):
– Launch a second copy of the task on another node ("speculative execution")
– Take the output of whichever copy finishes first, and kill the other

Speculative execution is surprisingly important in large clusters:
– Stragglers occur frequently due to failing hardware, software bugs, misconfiguration, etc.
– A single straggler may noticeably slow down a job
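The "run a backup copy and keep whichever finishes first" idea can be sketched in plain Python (purely illustrative; this is not how Hadoop schedules speculative tasks internally):

import concurrent.futures

def speculative_run(task, *args):
    # Launch two copies of the same task; keep the first result and cancel the
    # other copy if it has not started yet.
    with concurrent.futures.ThreadPoolExecutor(max_workers=2) as pool:
        futures = [pool.submit(task, *args) for _ in range(2)]
        done, not_done = concurrent.futures.wait(
            futures, return_when=concurrent.futures.FIRST_COMPLETED)
        for f in not_done:
            f.cancel()
        return next(iter(done)).result()

# Example: both copies compute the same word count; the faster one wins.
print(speculative_run(lambda text: len(text.split()), "the quick brown fox"))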


HBase

HBase Architecture

• HBase is a distributed, column-oriented database built on top of HDFS
• HBase is the Hadoop application to use when you require real-time read/write random access to very large datasets

What if you need the database features that Hive doesn't provide, like row-level updates, rapid query response times, and transactions?
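For a flavour of that random read/write access from Python, here is a sketch using the third-party happybase client (assumptions for illustration: it talks to HBase through the Thrift gateway, and a 'users' table with an 'info' column family already exists; the host and names are made up):

import happybase

connection = happybase.Connection('hbase-thrift-host')   # illustrative host name
table = connection.table('users')

# Random write: one row keyed by user id, columns in the 'info' family
table.put(b'user42', {b'info:name': b'Alice', b'info:age': b'23'})

# Random read: fetch the row back by key
row = table.row(b'user42')
print(row[b'info:name'])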


HBase Cluster Members

• HBase depends on ZooKeeper, and by default it manages a ZooKeeper instance as the authority on cluster state

• HBase hosts vitals such as the location of the root catalog table and the address of the current cluster Master via ZooKeeper

• Regionserver slave nodes are listed in the HBase conf/regionservers file, as you would list datanodes and tasktrackers in the Hadoop conf/slaves file

• There are multiple implementations of the filesystem interface (one for the local filesystem, one for the KFS filesystem, one for Amazon's S3, and one for HDFS, the Hadoop Distributed Filesystem), and HBase can persist to any of these implementations

• By default, unless told otherwise, HBase writes to the local filesystem


Apache Oozie

Apache Oozie is an open-source workflow/coordination service to manage data processing jobs for Hadoop, developed and then open-sourced by Yahoo!

Example workflow application layout:

max-temp-workflow/
├── lib/
│   └── hadoop-examples.jar
└── workflow.xml


Pig
• Started at Yahoo! Research
• Runs about 30% of Yahoo!'s jobs
• Features:
– Expresses sequences of MapReduce jobs
– Data model: nested "bags" of items
– Provides relational (SQL) operators (JOIN, GROUP BY, etc.)
– Easy to plug in Java functions
– Pig Pen development environment for Eclipse


Pig

• Higher-level data flow language
• Converts scripts into MapReduce jobs and runs them
• Provides good functionality (JOINs, partitioners)
• Very compact!
• A comparison between Pig & Java:
– Faster to develop
– Slower to run


An Example Problem

Suppose you have user data in one file, page view data in another, and you need to find the top 5 most visited pages by users aged 18 - 25.

Load Users Load Pages

Filter by age

Join on name

Group on url

Count clicks

Order by clicks

Take top 5

Example from http://wiki.apache.org/pig-data/attachments/PigTalksPapers/attachments/ApacheConEurope09.ppt


In MapReduce

import java.io.IOException;
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.io.Writable;
import org.apache.hadoop.io.WritableComparable;
import org.apache.hadoop.mapred.FileInputFormat;
import org.apache.hadoop.mapred.FileOutputFormat;
import org.apache.hadoop.mapred.JobConf;
import org.apache.hadoop.mapred.KeyValueTextInputFormat;
import org.apache.hadoop.mapred.Mapper;
import org.apache.hadoop.mapred.MapReduceBase;
import org.apache.hadoop.mapred.OutputCollector;
import org.apache.hadoop.mapred.RecordReader;
import org.apache.hadoop.mapred.Reducer;
import org.apache.hadoop.mapred.Reporter;
import org.apache.hadoop.mapred.SequenceFileInputFormat;
import org.apache.hadoop.mapred.SequenceFileOutputFormat;
import org.apache.hadoop.mapred.TextInputFormat;
import org.apache.hadoop.mapred.jobcontrol.Job;
import org.apache.hadoop.mapred.jobcontrol.JobControl;
import org.apache.hadoop.mapred.lib.IdentityMapper;

public class MRExample {
    public static class LoadPages extends MapReduceBase
            implements Mapper<LongWritable, Text, Text, Text> {
        public void map(LongWritable k, Text val,
                OutputCollector<Text, Text> oc, Reporter reporter) throws IOException {
            // Pull the key out
            String line = val.toString();
            int firstComma = line.indexOf(',');
            String key = line.substring(0, firstComma);
            String value = line.substring(firstComma + 1);
            Text outKey = new Text(key);
            // Prepend an index to the value so we know which file it came from.
            Text outVal = new Text("1" + value);
            oc.collect(outKey, outVal);
        }
    }

    public static class LoadAndFilterUsers extends MapReduceBase
            implements Mapper<LongWritable, Text, Text, Text> {
        public void map(LongWritable k, Text val,
                OutputCollector<Text, Text> oc, Reporter reporter) throws IOException {
            // Pull the key out
            String line = val.toString();
            int firstComma = line.indexOf(',');
            String value = line.substring(firstComma + 1);
            int age = Integer.parseInt(value);
            if (age < 18 || age > 25) return;
            String key = line.substring(0, firstComma);
            Text outKey = new Text(key);
            // Prepend an index to the value so we know which file it came from.
            Text outVal = new Text("2" + value);
            oc.collect(outKey, outVal);
        }
    }

    public static class Join extends MapReduceBase
            implements Reducer<Text, Text, Text, Text> {
        public void reduce(Text key, Iterator<Text> iter,
                OutputCollector<Text, Text> oc, Reporter reporter) throws IOException {
            // For each value, figure out which file it's from and store it accordingly.
            List<String> first = new ArrayList<String>();
            List<String> second = new ArrayList<String>();
            while (iter.hasNext()) {
                Text t = iter.next();
                String value = t.toString();
                if (value.charAt(0) == '1') first.add(value.substring(1));
                else second.add(value.substring(1));
                reporter.setStatus("OK");
            }
            // Do the cross product and collect the values
            for (String s1 : first) {
                for (String s2 : second) {
                    String outval = key + "," + s1 + "," + s2;
                    oc.collect(null, new Text(outval));
                    reporter.setStatus("OK");
                }
            }
        }
    }

    public static class LoadJoined extends MapReduceBase
            implements Mapper<Text, Text, Text, LongWritable> {
        public void map(Text k, Text val,
                OutputCollector<Text, LongWritable> oc, Reporter reporter) throws IOException {
            // Find the url
            String line = val.toString();
            int firstComma = line.indexOf(',');
            int secondComma = line.indexOf(',', firstComma);
            String key = line.substring(firstComma, secondComma);
            // drop the rest of the record, I don't need it anymore,
            // just pass a 1 for the combiner/reducer to sum instead.
            Text outKey = new Text(key);
            oc.collect(outKey, new LongWritable(1L));
        }
    }

    public static class ReduceUrls extends MapReduceBase
            implements Reducer<Text, LongWritable, WritableComparable, Writable> {
        public void reduce(Text key, Iterator<LongWritable> iter,
                OutputCollector<WritableComparable, Writable> oc, Reporter reporter) throws IOException {
            // Add up all the values we see
            long sum = 0;
            while (iter.hasNext()) {
                sum += iter.next().get();
                reporter.setStatus("OK");
            }
            oc.collect(key, new LongWritable(sum));
        }
    }

    public static class LoadClicks extends MapReduceBase
            implements Mapper<WritableComparable, Writable, LongWritable, Text> {
        public void map(WritableComparable key, Writable val,
                OutputCollector<LongWritable, Text> oc, Reporter reporter) throws IOException {
            oc.collect((LongWritable)val, (Text)key);
        }
    }

    public static class LimitClicks extends MapReduceBase
            implements Reducer<LongWritable, Text, LongWritable, Text> {
        int count = 0;
        public void reduce(LongWritable key, Iterator<Text> iter,
                OutputCollector<LongWritable, Text> oc, Reporter reporter) throws IOException {
            // Only output the first 100 records
            while (count < 100 && iter.hasNext()) {
                oc.collect(key, iter.next());
                count++;
            }
        }
    }

    public static void main(String[] args) throws IOException {
        JobConf lp = new JobConf(MRExample.class);
        lp.setJobName("Load Pages");
        lp.setInputFormat(TextInputFormat.class);
        lp.setOutputKeyClass(Text.class);
        lp.setOutputValueClass(Text.class);
        lp.setMapperClass(LoadPages.class);
        FileInputFormat.addInputPath(lp, new Path("/user/gates/pages"));
        FileOutputFormat.setOutputPath(lp, new Path("/user/gates/tmp/indexed_pages"));
        lp.setNumReduceTasks(0);
        Job loadPages = new Job(lp);

        JobConf lfu = new JobConf(MRExample.class);
        lfu.setJobName("Load and Filter Users");
        lfu.setInputFormat(TextInputFormat.class);
        lfu.setOutputKeyClass(Text.class);
        lfu.setOutputValueClass(Text.class);
        lfu.setMapperClass(LoadAndFilterUsers.class);
        FileInputFormat.addInputPath(lfu, new Path("/user/gates/users"));
        FileOutputFormat.setOutputPath(lfu, new Path("/user/gates/tmp/filtered_users"));
        lfu.setNumReduceTasks(0);
        Job loadUsers = new Job(lfu);

        JobConf join = new JobConf(MRExample.class);
        join.setJobName("Join Users and Pages");
        join.setInputFormat(KeyValueTextInputFormat.class);
        join.setOutputKeyClass(Text.class);
        join.setOutputValueClass(Text.class);
        join.setMapperClass(IdentityMapper.class);
        join.setReducerClass(Join.class);
        FileInputFormat.addInputPath(join, new Path("/user/gates/tmp/indexed_pages"));
        FileInputFormat.addInputPath(join, new Path("/user/gates/tmp/filtered_users"));
        FileOutputFormat.setOutputPath(join, new Path("/user/gates/tmp/joined"));
        join.setNumReduceTasks(50);
        Job joinJob = new Job(join);
        joinJob.addDependingJob(loadPages);
        joinJob.addDependingJob(loadUsers);

        JobConf group = new JobConf(MRExample.class);
        group.setJobName("Group URLs");
        group.setInputFormat(KeyValueTextInputFormat.class);
        group.setOutputKeyClass(Text.class);
        group.setOutputValueClass(LongWritable.class);
        group.setOutputFormat(SequenceFileOutputFormat.class);
        group.setMapperClass(LoadJoined.class);
        group.setCombinerClass(ReduceUrls.class);
        group.setReducerClass(ReduceUrls.class);
        FileInputFormat.addInputPath(group, new Path("/user/gates/tmp/joined"));
        FileOutputFormat.setOutputPath(group, new Path("/user/gates/tmp/grouped"));
        group.setNumReduceTasks(50);
        Job groupJob = new Job(group);
        groupJob.addDependingJob(joinJob);

        JobConf top100 = new JobConf(MRExample.class);
        top100.setJobName("Top 100 sites");
        top100.setInputFormat(SequenceFileInputFormat.class);
        top100.setOutputKeyClass(LongWritable.class);
        top100.setOutputValueClass(Text.class);
        top100.setOutputFormat(SequenceFileOutputFormat.class);
        top100.setMapperClass(LoadClicks.class);
        top100.setCombinerClass(LimitClicks.class);
        top100.setReducerClass(LimitClicks.class);
        FileInputFormat.addInputPath(top100, new Path("/user/gates/tmp/grouped"));
        FileOutputFormat.setOutputPath(top100, new Path("/user/gates/top100sitesforusers18to25"));
        top100.setNumReduceTasks(1);
        Job limit = new Job(top100);
        limit.addDependingJob(groupJob);

        JobControl jc = new JobControl("Find top 100 sites for users 18 to 25");
        jc.addJob(loadPages);
        jc.addJob(loadUsers);
        jc.addJob(joinJob);
        jc.addJob(groupJob);
        jc.addJob(limit);
        jc.run();
    }
}

Example from http://wiki.apache.org/pig-data/attachments/PigTalksPapers/attachments/ApacheConEurope09.ppt


In Pig Latin

Users = load 'users' as (name, age);
Filtered = filter Users by age >= 18 and age <= 25;
Pages = load 'pages' as (user, url);
Joined = join Filtered by name, Pages by user;
Grouped = group Joined by url;
Summed = foreach Grouped generate group, count(Joined) as clicks;
Sorted = order Summed by clicks desc;
Top5 = limit Sorted 5;
store Top5 into 'top5sites';

Example from http://wiki.apache.org/pig-data/attachments/PigTalksPapers/attachments/ApacheConEurope09.ppt


Ease of Translation

Notice how naturally the components of the job translate into Pig Latin:

Load Users       → Users = load …
Load Pages       → Pages = load …
Filter by age    → Filtered = filter …
Join on name     → Joined = join …
Group on url     → Grouped = group …
Count clicks     → Summed = … count()…
Order by clicks  → Sorted = order …
Take top 5       → Top5 = limit …

Example from http://wiki.apache.org/pig-data/attachments/PigTalksPapers/attachments/ApacheConEurope09.ppt


Ease of Translation

Notice how naturally the components of the job translate into Pig Latin.

[Diagram: the same dataflow, annotated with the MapReduce job boundaries Pig creates (Job 1, Job 2, Job 3): the load/filter/join stage, the group/count stage, and the order/limit stage each compile to a separate MapReduce job.]

Example from http://wiki.apache.org/pig-data/attachments/PigTalksPapers/attachments/ApacheConEurope09.ppt


Hive

• Developed at Facebook
• Used for the majority of Facebook jobs
• "Relational database" built on Hadoop
– Maintains a list of table schemas
– SQL-like query language (HQL)
– Can call Hadoop Streaming scripts from HQL
– Supports table partitioning, clustering, complex data types, and some optimizations
• Translates SQL into MapReduce jobs, so you can write

Select count(*) from users where user_id=56

and Hive will translate it into MapReduce jobs.


Hive Architecture

Apache Hive is built on top of Hadoop to provide data warehouse services.


Sample Hive Queries

• Find the top 5 pages visited by users aged 18-25:

SELECT p.url, COUNT(1) as clicks
FROM users u JOIN page_views p ON (u.name = p.user)
WHERE u.age >= 18 AND u.age <= 25
GROUP BY p.url
ORDER BY clicks DESC
LIMIT 5;

• Filter page views through a Python script:

SELECT TRANSFORM(p.user, p.date)
USING 'map_script.py'
AS dt, uid
CLUSTER BY dt
FROM page_views p;
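The map_script.py referenced above is not shown in the deck. Hive's TRANSFORM passes the selected columns to the script as tab-separated lines on stdin and reads tab-separated output columns back, so a minimal illustrative version might look like this (the timestamp format is an assumption):

# map_script.py (illustrative): input lines are "user<TAB>date",
# output lines are "dt<TAB>uid"
import sys

for line in sys.stdin:
    user, date = line.rstrip("\n").split("\t")
    dt = date.split(" ")[0]          # assume "YYYY-MM-DD HH:MM:SS" timestamps
    print(dt + "\t" + user)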



Pig


Sqoop

• Often, valuable data in an organization is stored in relational database systems (RDBMS)

• Sqoop is an open-source tool that allows users to extract data from a relational database into Hadoop for further processing.

• It’s even possible to use Sqoop to move data from a relational database into HBase.
