
Learn Hadoop Online

DESCRIPTION

Learn Hadoop Online by Vibloo: Hadoop Admin & Developer online and classroom training. The Hadoop online training course is designed to enhance your knowledge and skills so that you can become a successful Hadoop developer. The course covers the core concepts in depth, along with their implementation in varied industry use cases.

Page 1: Learn Hadoop Online

Learn

Page 2: Learn Hadoop Online

The Hadoop online training course is designed to enhance your knowledge and skills so that you can become a successful Hadoop developer. The course covers the core concepts in depth, along with their implementation in varied industry use cases.

Take a look at the Hadoop Admin and Developer course content.

Skype Id: info.vibloo Email: [email protected] USA: +1-248-809-1418 IND: +91-40-3296-5222

Page 3: Learn Hadoop Online

What is Hadoop?

The Hadoop Distributed File System

How Hadoop MapReduce Works

Anatomy of a Hadoop Cluster

Master Daemons

Name node

Introduction to Hadoop

Job Tracker

Secondary name node

Slave Daemons

Data node

Task tracker

www.vibloo.com/Hadoop-Online-Training

Page 4: Learn Hadoop Online

Blocks and Splits

Input and HDFS Splits

Data Replication

Hadoop Rack Awareness

Data high availability

Data Integrity

Cluster architecture and block placement

Accessing HDFS

JAVA & CLI Approach

HDFS (Hadoop Distributed File System)

Programming Practices

Developing MapReduce Programs in

Running without HDFS and MapReduce

Running all daemons in a single node

Running daemons on dedicated nodes

Local Mode

Pseudo-distributed Mode

Fully distributed mode


Page 5: Learn Hadoop Online

Make a fully distributed Hadoop cluster on a single laptop/desktop

Name Node in Safe mode

Meta Data Backup

Integrating Kerberos security in Hadoop

Set up Hadoop clusters using the Apache, Cloudera, and Hortonworks distributions


Page 6: Learn Hadoop Online

Examining a Sample MapReduce Program, with several examples

Basic API Concepts

The Driver Code

The Mapper

The Reducer

Hadoop's Streaming API

Writing a MapReduce Program
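
As a rough illustration of the Mapper and Reducer roles listed on this slide, here is a minimal word-count sketch in plain Python. It only imitates the map, shuffle/sort, and reduce phases in memory and does not call Hadoop at all; the function names `mapper`, `reducer`, and `run_job` are hypothetical and are not part of any Hadoop API.

```python
from itertools import groupby
from operator import itemgetter


def mapper(line):
    """Emit a (word, 1) pair for every whitespace-separated word."""
    for word in line.lower().split():
        yield word, 1


def reducer(word, counts):
    """Sum all counts that were grouped under one word."""
    return word, sum(counts)


def run_job(lines):
    """Imitate map -> shuffle/sort -> reduce in memory (no Hadoop involved)."""
    # Map phase: apply the mapper to every input record.
    pairs = [pair for line in lines for pair in mapper(line)]
    # Shuffle/sort phase: bring identical keys next to each other.
    pairs.sort(key=itemgetter(0))
    # Reduce phase: call the reducer once per distinct key.
    return dict(
        reducer(word, (count for _, count in group))
        for word, group in groupby(pairs, key=itemgetter(0))
    )


if __name__ == "__main__":
    print(run_job(["Hadoop stores data", "Hadoop processes data"]))
```

In a real job the framework performs the shuffle and sort between the two phases; the sketch does it with an in-memory sort only to make the data flow visible.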


Page 7: Learn Hadoop Online

The configure and close Methods

Sequence Files

Record Reader

Record Writer

Role of Reporter

Output Collector

Performing several Hadoop jobs

Processing XML files

Counters

Directly Accessing HDFS

Tool Runner

Using The Distributed Cache


Page 8: Learn Hadoop Online

Sorting and Searching

Indexing

Classification/Machine Learning

Term Frequency - Inverse Document Frequency

Word Co-Occurrence

Common MapReduce Algorithms

Creating an Inverted Index

Identity Mapper

Identity Reducer

MapReduce applications
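
To make the "Creating an Inverted Index" topic more concrete, the following sketch builds a small inverted index in plain Python. It is only an in-memory approximation of the idea, assuming documents arrive as a dictionary of ID to text; `map_document` and `build_inverted_index` are hypothetical names, not Hadoop functions.

```python
from collections import defaultdict


def map_document(doc_id, text):
    """Map step: emit one (word, doc_id) pair per distinct word."""
    for word in set(text.lower().split()):
        yield word, doc_id


def build_inverted_index(documents):
    """Reduce step: collect, for each word, the sorted IDs of documents containing it."""
    index = defaultdict(set)
    for doc_id, text in documents.items():
        for word, source in map_document(doc_id, text):
            index[word].add(source)
    # Sort the document IDs so the result is deterministic.
    return {word: sorted(ids) for word, ids in index.items()}


if __name__ == "__main__":
    docs = {"d1": "hadoop hdfs", "d2": "hadoop yarn"}
    print(build_inverted_index(docs))
```

The word acts as the key and the document ID as the value, which is the same key/value pattern a MapReduce job would use for this task.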


Page 9: Learn Hadoop Online

Testing with MRUnit

Logging

Other Debugging Strategies

Debugging MapReduce Programs


Page 10: Learn Hadoop Online

A Recap of the MapReduce Flow

The Secondary Sort

Customized Input Formats and Output Formats

Advanced MapReduce Programming


Page 11: Learn Hadoop Online

Counters

Skipping Bad Records

Rerunning failed tasks with Isolation Runner

Monitoring and debugging on a Production Cluster


Page 12: Learn Hadoop Online

Reducing network traffic with combiner

Partitioners

Using Compression

Reusing the JVM

Running with speculative execution

Refactoring code and rewriting algorithms

Parameters affecting Performance

Other Performance Aspects

Tuning for Performance in MapReduce


Page 13: Learn Hadoop Online

HBase

HBase concepts

HBase architecture

Region server architecture

File storage architecture

HBase basics

Column access

Scans

HBase use cases

Install and configure HBase on a multi node cluster

Create database

Develop and run sample applications

Access data stored in HBase using clients like Java, Python, and Perl

HBase and Hive Integration

HBase admin tasks

Defining schema and basic operations


Page 14: Learn Hadoop Online

PIG

Pig basics

Install and configure PIG on a cluster

PIG Vs MapReduce and SQL

Pig Vs Hive

Write sample Pig Latin scripts

Modes of running PIG

Running in Grunt shell

Programming in Eclipse

Running as Java program

PIG UDFs

Pig Macros


Page 15: Learn Hadoop Online

Flume, Chukwa, Avro, Scribe, Thrift

Flume and Chukwa concepts

Use cases of Thrift

Avro and Scribe

Install and configure Flume on a cluster

Create a sample application to capture logs from Apache using Flume


Page 16: Learn Hadoop Online

CDH4 Enhancements

Name Node High Availability

Name Node federation

Fencing

YARN


Page 17: Learn Hadoop Online

Hadoop Challenges

Hadoop disaster recovery

Suitable use cases for Hadoop
