VGU BIS2010: MapReduce and Batch Processing

MapReduce and Batch Processing

VGU BIS2010, Group 13

Son Pham: phamtranthaison@gmail.com

Phong Le: bigbangvn@gmail.com

Lam Pham: lam.pts.vn@gmail.com

Chuong Nguyen: chuongit@gmail.com

Chapter 4

Content

Part 1: Son Pham

Batch Layer

Part 2: Phong Le

MapReduce

Part 3: Lam Pham

MapReduce

Part 4: Chuong Nguyen

Demo

Batch Layer

Lambda Architecture

Batch Layer

• Precomputation
• High latency
• Linearly scalable

Batch Layer

On-the-fly computation:

Precomputation:
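The contrast between the two approaches can be sketched in a few lines of Python. This is only an illustration under assumptions: the page-view records, the URLs, and the function names are hypothetical and do not come from the slides. On-the-fly computation scans the whole dataset for every query, while precomputation builds a batch view once and answers queries with a lookup.

```python
from collections import Counter

# Hypothetical master dataset: one record per page view (url, day).
master_dataset = [
    ("example.com/a", "2015-01-01"),
    ("example.com/b", "2015-01-01"),
    ("example.com/a", "2015-01-02"),
    ("example.com/a", "2015-01-02"),
]


def pageviews_on_the_fly(dataset, url):
    """On-the-fly: scan every record each time a query arrives."""
    return sum(1 for record_url, _day in dataset if record_url == url)


def build_batch_view(dataset):
    """Precomputation: one pass over all data produces a batch view."""
    return Counter(record_url for record_url, _day in dataset)


def pageviews_from_view(batch_view, url):
    """Query against the batch view: a cheap lookup, no scan."""
    return batch_view.get(url, 0)


batch_view = build_batch_view(master_dataset)
print(pageviews_on_the_fly(master_dataset, "example.com/a"))
print(pageviews_from_view(batch_view, "example.com/a"))
```

Both calls return the same count. The difference is where the cost is paid: the batch view is expensive to build (hence the high latency of the batch layer) but cheap to query.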

Batch Layer – Linear Scalability

“Scalability is the ability of a system to maintain performance under increased load by adding more resources”

Linear vs. Non-Linear Scalability

[Figure panels: Linear Scalability, Non-Linear Scalability]

“A linearly scalable system can maintain performance under increased load by adding resources in proportion to the increased load”
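A small arithmetic sketch may make the definition concrete. The baseline of 10 machines and the quadratic growth model for the non-linear case are assumptions chosen purely for illustration; a real non-linear system could follow a different curve.

```python
import math


def machines_linear(base_machines, load_factor):
    """Linear scalability: resources grow in proportion to the load."""
    return math.ceil(base_machines * load_factor)


def machines_quadratic(base_machines, load_factor):
    """Hypothetical non-linear system: resources grow with the square of the load."""
    return math.ceil(base_machines * load_factor ** 2)


# Assumed baseline: 10 machines handle the current load (load factor 1).
for factor in (1, 2, 4):
    print(factor, machines_linear(10, factor), machines_quadratic(10, factor))
```

Under these assumptions, a fourfold load needs 40 machines in the linear case but 160 in the quadratic case, which is why linear scalability matters for the batch layer.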

MapReduce

A distributed computing paradigm originally pioneered by Google

Inspired by the “Map” and “Reduce” functions commonly used in functional programming (LISP)

Operating on data stored in a distributed filesystem (HDFS…)

A popular free implementation is Apache Hadoop.

MapReduce

MapReduce - “Word count” Example
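The word count example can be sketched in plain Python to show the three phases. This is a single-process illustration, not Hadoop code: the function names and the sample lines are assumptions, and a real framework would run the map and reduce tasks on different machines.

```python
from collections import defaultdict


def map_phase(line):
    """Map: emit a (word, 1) pair for every word in one input line."""
    for word in line.lower().split():
        yield (word, 1)


def shuffle_phase(pairs):
    """Shuffle: group all emitted values by their key (the word)."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(word, counts):
    """Reduce: sum the counts collected for one word."""
    return (word, sum(counts))


def word_count(lines):
    """Run map, shuffle, and reduce in sequence over the input lines."""
    pairs = (pair for line in lines for pair in map_phase(line))
    grouped = shuffle_phase(pairs)
    return dict(reduce_phase(word, counts) for word, counts in grouped.items())


# Hypothetical input: each string stands for one line of a text file.
lines = ["deer bear river", "car car river", "deer car bear"]
print(word_count(lines))
```

Because each map call sees only one line and each reduce call sees only one word, the framework is free to run those calls on different machines.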

MapReduce

Scalability

Automatically parallelize the computation across the cluster of machines

Fault-Tolerance

Reassign failed tasks
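Both properties can be imitated on one machine with a thread pool. The sketch below is an analogy under assumptions, not how Hadoop is implemented: threads stand in for cluster machines, the failure is simulated, and `MAX_ATTEMPTS`, `make_flaky`, and `run_with_retry` are hypothetical names. Each chunk is processed as an independent task, and a task that fails is simply run again.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor

MAX_ATTEMPTS = 3  # hypothetical retry limit, not a Hadoop setting


def count_words(chunk):
    """One map task: count the words in a single chunk of input."""
    return Counter(chunk.split())


def make_flaky(task, fail_first_for):
    """Wrap a task so its first run on one chosen chunk fails (simulated crash)."""
    already_failed = set()

    def wrapper(chunk):
        if chunk == fail_first_for and chunk not in already_failed:
            already_failed.add(chunk)
            raise RuntimeError("simulated worker failure")
        return task(chunk)

    return wrapper


def run_with_retry(task, chunk):
    """Fault tolerance: run a failed task again, up to MAX_ATTEMPTS times."""
    for attempt in range(1, MAX_ATTEMPTS + 1):
        try:
            return task(chunk)
        except RuntimeError:
            if attempt == MAX_ATTEMPTS:
                raise


def run_job(chunks, task, workers=3):
    """Scalability: independent chunks are processed in parallel, then merged."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partials = list(pool.map(lambda chunk: run_with_retry(task, chunk), chunks))
    total = Counter()
    for partial in partials:
        total.update(partial)
    return total


chunks = ["a b a", "b c", "a c c"]
flaky_task = make_flaky(count_words, fail_first_for="b c")
result = run_job(chunks, flaky_task)
print(dict(result))
```

The final counts are the same as in a run without failures, because a task has no side effects and can be repeated safely.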

THANK YOU
