HADOOP SEMINAR ON Guided by: Prof. D.V.Chaudhari Seminar by: Namrata Sakhare Roll No: 65 B.E.Comp

SEMINAR ON HADOOP

Guided by: Prof. D.V. Chaudhari

Seminar by: Namrata Sakhare

Roll No: 65, B.E. Comp

HISTORY OF HADOOP

Large businesses needed to process terabytes and petabytes of data. Initially this data was handled by a single powerful computer, but such a machine can only handle data up to a certain limit. To solve this problem, Google published MapReduce.

MapReduce: a system that supports distributed computing on large data sets across clusters of computers.

Many other businesses faced the same scaling problem. Therefore, Doug Cutting developed an open-source implementation of the MapReduce system called Hadoop.

WHAT IS HADOOP?

• Hadoop is a framework of tools.
• Its objective is to support running applications on big data.
• It is an open-source set of tools distributed under the Apache License.
• It is a powerful tool designed for deep analysis and transformation of very large data sets.

BIG DATA

• The keyword behind Hadoop is big data.
• Big data poses three challenges: volume, velocity, and variety.

TRADITIONAL APPROACH

[Diagram: big data is processed by a single powerful computer, which runs into processing limits.]

HADOOP APPROACH

[Diagram: big data is broken into pieces.]

COMPUTATION OF DATA

[Diagram: the big data is split into pieces, a separate computation runs on each piece, and the partial results are merged into one combined result.]
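The split, compute, and combine idea can be illustrated with a small single-machine sketch. This is not Hadoop code: the function names (`split_into_pieces`, `compute`, `combine`, `word_count`) are made up for illustration, word counting is just a convenient example computation, and threads stand in for the separate machines of a real cluster.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor


def split_into_pieces(lines, num_pieces):
    """Break the input into num_pieces pieces (some may be empty)."""
    return [lines[i::num_pieces] for i in range(num_pieces)]


def compute(piece):
    """Per-piece computation: count the words in one piece of the data."""
    counts = Counter()
    for line in piece:
        counts.update(line.lower().split())
    return counts


def combine(partial_results):
    """Merge the partial counts from every piece into one combined result."""
    total = Counter()
    for partial in partial_results:
        total.update(partial)
    return total


def word_count(lines, num_pieces=4):
    pieces = split_into_pieces(lines, num_pieces)
    # Assumption: threads play the role of the cluster's machines here.
    with ThreadPoolExecutor(max_workers=num_pieces) as pool:
        partial_results = list(pool.map(compute, pieces))
    return combine(partial_results)


if __name__ == "__main__":
    sample = ["big data big cluster", "data node", "big data"]
    print(word_count(sample))
```

Because each piece is processed independently, adding more workers does not change the result, only how the work is spread out.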

ARCHITECTURE

• MapReduce (processing): Job Tracker, Task Tracker
• HDFS (storage): Name Node, Data Node

MASTER SLAVE ARCHITECTURE

[Diagram: the master runs the Job Tracker, the Name Node, a Task Tracker, and a Data Node; each of the four slaves runs a Task Tracker and a Data Node.]

JOB TRACKER

[Diagram: the Job Tracker on the master node (alongside the Name Node, a Task Tracker, and a Data Node), with four slaves each running a Task Tracker and a Data Node.]

NAME NODE

[Diagram: the Name Node on the master (alongside the Job Tracker, a Task Tracker, and a Data Node), with four slaves each running a Task Tracker and a Data Node.]

TASK TRACKER AND DATA NODE

[Diagram: four slave nodes, each pairing a Task Tracker with a Data Node.]

FAULT TOLERANCE FOR DATA

[Diagram: master-slave cluster with the HDFS layer marked: the Name Node on the master and the Data Nodes on the slaves.]
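HDFS achieves fault tolerance for data by keeping several copies of each block on different Data Nodes (three by default), with the Name Node tracking where the copies live. The toy model below sketches that idea only. The `NameNode` class and its methods are hypothetical, block placement is deliberately naive, and real HDFS additionally considers racks, disk space, and heartbeats.

```python
REPLICATION = 3  # HDFS's default replication factor


class NameNode:
    """Toy stand-in for the Name Node: it only tracks where block copies live."""

    def __init__(self, data_nodes):
        self.live_nodes = list(data_nodes)
        self.block_table = {}  # block id -> set of data nodes holding a copy

    def store_block(self, block_id):
        # Naive placement (an assumption): use the first REPLICATION live nodes.
        self.block_table[block_id] = set(self.live_nodes[:REPLICATION])

    def handle_node_failure(self, failed_node):
        """Forget the failed node and copy its blocks to other live nodes."""
        self.live_nodes.remove(failed_node)
        for holders in self.block_table.values():
            holders.discard(failed_node)
            for candidate in self.live_nodes:
                if len(holders) >= REPLICATION:
                    break
                holders.add(candidate)  # no-op if it already holds a copy


if __name__ == "__main__":
    name_node = NameNode(["dn1", "dn2", "dn3", "dn4"])
    name_node.store_block("block-1")
    name_node.handle_node_failure("dn2")
    print(sorted(name_node.block_table["block-1"]))
```

The point of the sketch is that losing one Data Node never loses data, because the remaining copies are used to restore the replication level.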

FAULT TOLERANCE FOR PROCESSING

[Diagram: master-slave cluster with the MapReduce layer marked: the Job Tracker on the master and the Task Trackers on the slaves.]
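On the processing side, the Job Tracker hands tasks to Task Trackers and, when a Task Tracker fails, gives its tasks to another one. The sketch below models only that reassignment step. The `JobTracker` class, its method names, and the least-loaded scheduling rule are assumptions made for illustration, not Hadoop's actual scheduler.

```python
class JobTracker:
    """Toy stand-in for the Job Tracker: it remembers which tracker runs which task."""

    def __init__(self, task_trackers):
        self.live_trackers = list(task_trackers)
        self.assignments = {}  # task id -> task tracker name

    def assign(self, task_id):
        # Assumed policy: pick the live tracker with the fewest tasks.
        load = {tracker: 0 for tracker in self.live_trackers}
        for tracker in self.assignments.values():
            if tracker in load:
                load[tracker] += 1
        chosen = min(self.live_trackers, key=lambda tracker: load[tracker])
        self.assignments[task_id] = chosen
        return chosen

    def handle_tracker_failure(self, failed_tracker):
        """Reassign every task that was running on the failed tracker."""
        self.live_trackers.remove(failed_tracker)
        orphaned = [task for task, tracker in self.assignments.items()
                    if tracker == failed_tracker]
        for task_id in orphaned:
            del self.assignments[task_id]
            self.assign(task_id)
        return orphaned


if __name__ == "__main__":
    job_tracker = JobTracker(["tt1", "tt2", "tt3"])
    for task in ["map-1", "map-2", "map-3", "map-4"]:
        job_tracker.assign(task)
    job_tracker.handle_tracker_failure("tt1")
    print(job_tracker.assignments)
```

The job as a whole still completes after a failure; only the tasks that were on the lost machine are run again elsewhere.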

MASTER BACK UP

[Diagram: master-slave cluster.] The tables held on the master are backed up.

EASY PROGRAMMING

The programmer doesn't have to worry about:

• Where the file is located
• How to manage failures
• How to break computations into pieces
• How to program for scaling
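What is left for the programmer is, roughly, a map function and a reduce function. The sketch below separates the two sides: `mapper` and `reducer` are what a programmer would write, while `run_job` is a toy single-process stand-in for everything the framework handles. All names and the sample "year temperature" records are hypothetical, and this is not Hadoop's API.

```python
from collections import defaultdict


# --- What the programmer writes ---------------------------------------
def mapper(record):
    """Turn one input record ("year temperature") into (key, value) pairs."""
    year, temperature = record.split()
    yield year, int(temperature)


def reducer(year, temperatures):
    """Collapse all values seen for one key into a single result."""
    return year, max(temperatures)


# --- What the framework would take care of (toy version) ---------------
def run_job(records, mapper, reducer):
    """Apply the mapper to every record, group values by key, then reduce."""
    grouped = defaultdict(list)
    for record in records:
        for key, value in mapper(record):
            grouped[key].append(value)
    return dict(reducer(key, values) for key, values in sorted(grouped.items()))


if __name__ == "__main__":
    records = ["1949 111", "1950 0", "1950 22", "1949 78"]  # made-up data
    print(run_job(records, mapper, reducer))
```

Note that neither `mapper` nor `reducer` mentions file locations, failures, or the number of machines; those concerns live entirely in the framework side.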

FEATURES OF HADOOP

Main features of Hadoop:

• Works on a distributed model: it runs on numerous low-cost computers instead of a single powerful computer.
• Linux-based set of tools: it runs on the Linux operating system.

TOOLS IN HADOOP

• Sqoop
• Flume
• Oozie
• Pig
• Mahout
• HBase
• Hive

IMPLEMENTATION OF HADOOP

• Yahoo
• IBM
• Facebook
• Amazon
• American Airlines
• The New York Times
• eBay

THANK YOU…