The Data Center and Hadoop Jacob Rapp, Cisco [email protected]


DESCRIPTION

What do data center operators need to know when deploying Hadoop in the data center? Multi-tenancy, network topology, workload types, and myriad other factors affect the way applications run and perform in the data center. Understanding the performance characteristics of the distributed system is key not only to optimizing for Hadoop, but also to allowing Hadoop to operate seamlessly side by side with existing applications.


Page 1: The Data Center and Hadoop

The Data Center and Hadoop

Jacob Rapp, Cisco

[email protected]

Page 2: The Data Center and Hadoop

Hadoop Considerations

• Traffic Types, Job Patterns, Network Considerations, Compute

Network Integration

• Co-exist with current Data Center infrastructure

• Open, Programmable and Application-Aware Networks

Multi-tenancy

• Remove the “Silo clusters”


Page 3: The Data Center and Hadoop


Page 4: The Data Center and Hadoop

Each job pattern ends in a Reduce phase; the patterns differ in the ratio of ingress to egress data set size:

• Analyze: Ingress vs. Egress Data Set 1:0.3
• Extract Transform Load (ETL): Ingress vs. Egress Data Set 1:1
• Explode: Ingress vs. Egress Data Set 1:2
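As a rough illustration of the ingress-to-egress ratios on this slide, the sketch below estimates how much data a job writes for a given input size. The helper name and the 1000 GB input are assumptions made for illustration; only the three ratios come from the slide.

```python
# Ingress:egress ratios from the slide, expressed as egress per unit of ingress.
EGRESS_PER_INGRESS = {
    "analyze": 0.3,  # 1:0.3
    "etl": 1.0,      # 1:1
    "explode": 2.0,  # 1:2
}


def estimated_egress_gb(job_pattern, ingress_gb):
    """Approximate output size in GB for a job pattern (hypothetical helper)."""
    return ingress_gb * EGRESS_PER_INGRESS[job_pattern]


if __name__ == "__main__":
    # 1000 GB is an arbitrary example input, not a figure from the presentation.
    for pattern in EGRESS_PER_INGRESS:
        print(pattern, estimated_egress_gb(pattern, 1000.0), "GB out per 1000 GB in")
```

The point of the ratio is that the same input volume can produce very different amounts of output (and therefore replication) traffic depending on the job pattern.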

The time at which the reducers start depends on mapred.reduce.slowstart.completed.maps. It doesn't change the amount of data sent to the reducers, but it may change the timing of when that data is sent.
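The slow-start setting is a fraction of map tasks that must complete before reducers are scheduled. The sketch below shows how that fraction translates into a map count; the function name is hypothetical and the round-up behavior is an assumption, so treat it as an illustration rather than Hadoop's exact scheduling logic.

```python
import math


def maps_needed_before_reducers(total_maps, slowstart_fraction):
    """Map tasks that must finish before reducers may start (illustrative).

    slowstart_fraction stands in for mapred.reduce.slowstart.completed.maps.
    Rounding up is an assumption of this sketch.
    """
    if not 0.0 <= slowstart_fraction <= 1.0:
        raise ValueError("slowstart_fraction must be between 0 and 1")
    # round() guards against floating-point noise before taking the ceiling.
    return math.ceil(round(total_maps * slowstart_fraction, 9))


if __name__ == "__main__":
    for fraction in (0.0, 0.25, 0.5, 1.0):
        print(fraction, maps_needed_before_reducers(400, fraction))
```

A low fraction lets the shuffle overlap with the map phase; a fraction of 1.0 delays all shuffle traffic until every map has finished, which concentrates the same amount of data into a shorter window.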

Page 5: The Data Center and Hadoop

• Small Flows/Messaging (admin related, heartbeats, keep-alives, delay-sensitive application messaging)
• Small to Medium Incast (Hadoop Shuffle)
• Large Flows (HDFS Ingest)
• Large Incast (Hadoop Replication)

Page 6: The Data Center and Hadoop

Many-to-Many Traffic Pattern

[Diagram labels: Map 1, Map 2, Map 3 … Map N; Shuffle; Reducer 1, Reducer 2, Reducer 3 … Reducer N; Output Replication; HDFS; NameNode; JobTracker; ZooKeeper]
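To make the many-to-many pattern concrete, the sketch below counts shuffle flows under the simplifying assumption that every map task sends one partition to every reducer. The function names and the task counts in the example are hypothetical.

```python
def shuffle_flows(num_maps, num_reducers):
    """Upper bound on map-to-reducer transfers if every map feeds every reducer."""
    return num_maps * num_reducers


def fan_in_per_reducer(num_maps):
    """Number of senders converging on a single reducer (the incast width)."""
    return num_maps


if __name__ == "__main__":
    # 200 maps and 20 reducers are arbitrary example values.
    print("flows:", shuffle_flows(200, 20))
    print("senders per reducer:", fan_in_per_reducer(200))
```

The fan-in figure is what produces the small-to-medium incast seen during the shuffle: many senders converge on each reducer's switch port at roughly the same time.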

Page 7: The Data Center and Hadoop

• Analyze: simulated with Shakespeare WordCount
• Extract Transform Load (ETL): simulated with Yahoo TeraSort
• Extract Transform Load (ETL): simulated with Yahoo TeraSort with output replication

Job patterns have varying impact on network utilization

Page 8: The Data Center and Hadoop


Page 9: The Data Center and Hadoop

Integration Considerations

Network Attributes

• Architecture
• Availability
• Capacity, Scale & Oversubscription
• Flexibility
• Management & Visibility

Page 10: The Data Center and Hadoop

• Single 1GE: 100% utilized
• Dual 1GE: 75% utilized
• 10GE: 40% utilized

1GE is generally being used today, largely because of the cost/performance trade-offs, though 10GE can provide benefits depending on the workload.
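The utilization figures on this slide can be turned into approximate traffic rates. The sketch below assumes each percentage applies to the total attached capacity (so dual 1GE is treated as 2 Gbps); that interpretation and the helper name are assumptions, not statements from the presentation.

```python
# (label, total capacity in Gbps, utilization from the slide)
ATTACHMENTS = [
    ("Single 1GE", 1.0, 1.00),
    ("Dual 1GE", 2.0, 0.75),  # assumes 75% of the 2 Gbps aggregate
    ("10GE", 10.0, 0.40),
]


def carried_gbps(capacity_gbps, utilization):
    """Traffic actually carried, given capacity and fractional utilization."""
    return capacity_gbps * utilization


if __name__ == "__main__":
    for label, capacity, utilization in ATTACHMENTS:
        print(f"{label}: {carried_gbps(capacity, utilization):.1f} Gbps")
```

Read this way, the wider attachments carry more traffic while leaving headroom, which is one way to interpret the note that 10GE can help depending on the workload.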

Page 11: The Data Center and Hadoop

• No single point of failure from the network viewpoint. No impact on job completion time

• NIC bonding configured in Linux, using the LACP bonding mode

• Effective load-sharing of traffic flows across the two NICs

• Recommended to change the hashing to src-dst-ip-port (on both the network side and the Linux NIC bonding) for optimal load-sharing
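The reason for including ports in the hash can be sketched in a few lines. The code below is not the algorithm used by the Linux bonding driver or by any switch; it uses CRC32 as a stand-in hash, and the addresses and port numbers are made up. It only illustrates that an IP-only hash pins every flow between two hosts to one link, while adding ports gives the hash something to vary on.

```python
import zlib


def pick_link(flow, num_links, use_ports):
    """Choose a bond member for a flow (illustrative stand-in hash, not a real driver)."""
    src_ip, dst_ip, src_port, dst_port = flow
    key = f"{src_ip}|{dst_ip}"
    if use_ports:
        key += f"|{src_port}|{dst_port}"
    return zlib.crc32(key.encode("ascii")) % num_links


# 64 hypothetical flows between the same two hosts, differing only in source port.
FLOWS = [("10.0.0.1", "10.0.0.2", 40000 + i, 50060) for i in range(64)]

links_ip_only = {pick_link(f, 2, use_ports=False) for f in FLOWS}
links_with_ports = {pick_link(f, 2, use_ports=True) for f in FLOWS}

if __name__ == "__main__":
    print("links used, src-dst-ip hash:", sorted(links_ip_only))
    print("links used, src-dst-ip-port hash:", sorted(links_with_ports))
```

Because Hadoop nodes often exchange many parallel flows with the same peer, a hash without ports can leave one NIC idle while the other saturates.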


Page 12: The Data Center and Hadoop

1GE vs. 10GE Buffer Usage

[Chart: buffer cell usage and job completion plotted over time (x-axis ticks 1 to 793). Series: 1G Buffer Used, 10G Buffer Used, 1G Map %, 1G Reduce %, 10G Map %, 10G Reduce %.]

Moving from 1GE to 10GE actually lowers the buffer requirement at the switching layer.

By moving to 10GE, the data node has a wider pipe to receive data, which lessens the need for buffers on the network, because the total aggregate transfer rate and the amount of data do not increase substantially. This is due, in part, to the limits of I/O and compute capabilities.
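A simple fluid model shows why a faster server link can reduce switch buffering when the offered traffic stays about the same. This is a sketch under stated assumptions, not the measurement method behind the chart: the 3 Gbps burst rate and 10 ms duration are hypothetical, and the model ignores TCP dynamics entirely.

```python
def peak_backlog_bytes(arrival_gbps, link_gbps, burst_seconds):
    """Bytes queued at the egress port if traffic arrives faster than the link drains.

    Fluid approximation: backlog grows at (arrival - link) for the burst duration.
    """
    excess_gbps = max(0.0, arrival_gbps - link_gbps)
    return excess_gbps * 1e9 / 8 * burst_seconds


if __name__ == "__main__":
    # Hypothetical incast: senders deliver 3 Gbps toward one node for 10 ms.
    for link in (1.0, 10.0):
        backlog = peak_backlog_bytes(3.0, link, 0.010)
        print(f"{link:>4} GE server link: {backlog / 1e6:.2f} MB queued")
```

If disk and CPU limits keep the aggregate arrival rate well below 10 Gbps, the 10GE port drains as fast as data arrives and little or nothing accumulates in the switch.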

Page 13: The Data Center and Hadoop

Goals

• Extensive validation of Hadoop workloads

• Reference architecture

Make it easy for the enterprise

Demystify the network for Hadoop deployment

Integrate with the enterprise through efficient choices of network topology and devices

Findings

• 10G and/or dual-attached servers provide consistent job completion time and better buffer utilization

• 10G provides reduced burst at the access layer

• A dual-attached server is the recommended design, at 1G or 10G; 10G for future-proofing

• Rack failure has the biggest impact on job completion time

• A non-blocking network is not required

• Latency does not matter much in Hadoop workloads

More details from Hadoop Summit 2012 at:

http://www.slideshare.net/Hadoop_Summit/ref-arch-validated-and-tested-approach-to-define-a-network-design

http://youtu.be/YJODsK0T67A

Page 14: The Data Center and Hadoop


Page 15: The Data Center and Hadoop

© 2011 Cisco and/or its affiliates. All rights reserved. Cisco Confidential

n3548-001# show interface brief
--------------------------------------------------------------------------------
Ethernet      VLAN   Type Mode   Status  Reason                   Speed     Port
Interface                                                                   Ch #
--------------------------------------------------------------------------------
Eth1/1        1      eth  access up      none                     10G(D)    --
Eth1/2        1      eth  access up      none                     10G(D)    --
Eth1/3        1      eth  access up      none                     10G(D)    --
Eth1/4        1      eth  access up      none                     10G(D)    --
Eth1/5        1      eth  access up      none                     10G(D)    --
.
.
Eth1/33       1      eth  access up      none                     10G(D)    --
Eth1/34       1      eth  access up      none                     10G(D)    --
Eth1/35       1      eth  access down    SFP not inserted         10G(D)    --
Eth1/36       1      eth  access down    SFP not inserted         10G(D)    --
Eth1/37       1      eth  access down    Administratively down    10G(D)    --
.

Page 16: The Data Center and Hadoop

n3548-001# show mac address-table dynamic
Legend:
        * - primary entry, G - Gateway MAC, (R) - Routed MAC, O - Overlay MAC
        age - seconds since first seen,+ - primary entry using vPC Peer-Link
   VLAN     MAC Address      Type      age     Secure NTFY    Ports
---------+-----------------+--------+---------+------+----+------------------
* 1        e8b7.484d.a208    dynamic   60570      F    F  Eth1/31
* 1        e8b7.484d.a20a    dynamic   60560      F    F  Eth1/31
* 1        e8b7.484d.a73e    dynamic   60560      F    F  Eth1/34
* 1        e8b7.484d.a740    dynamic   60560      F    F  Eth1/34
* 1        e8b7.484d.ad15    dynamic   60560      F    F  Eth1/28
* 1        e8b7.484d.ad17    dynamic   60560      F    F  Eth1/28
* 1        e8b7.484d.b3e9    dynamic   60570      F    F  Eth1/25
* 1        e8b7.484d.b3eb    dynamic   60560      F    F  Eth1/25
.
.

MAC addresses of the connected devices … and the port they are on …
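One way to turn the MAC table output on this slide into a port-to-device lookup is to parse it with a regular expression. The sketch below is an illustration only and is not taken from the scripts referenced in the presentation; the function name and the pattern are assumptions, and the sample rows are copied from the output on this slide.

```python
import re
from collections import defaultdict

# Three rows copied from the `show mac address-table dynamic` output on this slide.
SAMPLE = """\
* 1 e8b7.484d.a208 dynamic 60570 F F Eth1/31
* 1 e8b7.484d.a20a dynamic 60560 F F Eth1/31
* 1 e8b7.484d.a73e dynamic 60560 F F Eth1/34
"""

# Columns: marker, VLAN, MAC, type, age, Secure, NTFY, port.
ROW = re.compile(
    r"^\*\s+(\d+)\s+((?:[0-9a-f]{4}\.){2}[0-9a-f]{4})\s+dynamic"
    r"\s+\d+\s+\S+\s+\S+\s+(\S+)$"
)


def macs_by_port(text):
    """Group learned MAC addresses by the switch port they were seen on."""
    ports = defaultdict(list)
    for line in text.splitlines():
        match = ROW.match(line.strip())
        if match:
            _vlan, mac, port = match.groups()
            ports[port].append(mac)
    return dict(ports)


if __name__ == "__main__":
    for port, macs in macs_by_port(SAMPLE).items():
        print(port, macs)
```

A table like this is the starting point for mapping switch ports to server names, since each MAC can then be resolved to a host.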

Page 17: The Data Center and Hadoop

© 2013 Cisco and/or its affiliates. All rights reserved. Cisco Confidential

n3548-001# portServerMap
=======================================
Port      Server FQDN
---------------------------------------
Eth1/1    c200-m2-10g2-001.cluster10g.com
Eth1/2    c200-m2-10g2-002.cluster10g.com
Eth1/3    c200-m2-10g2-003.cluster10g.com
Eth1/4    c200-m2-10g2-004.cluster10g.com
Eth1/5    c200-m2-10g2-005.cluster10g.com
Eth1/6    c200-m2-10g2-006.cluster10g.com
Eth1/7    c200-m2-10g2-031.cluster10g.com
Eth1/8    c200-m2-10g2-008.cluster10g.com
Eth1/9    c200-m2-10g2-009.cluster10g.com
Eth1/11   c200-m2-10g2-011.cluster10g.com
.
.
.

Page 18: The Data Center and Hadoop

n3548-001# trackerList
===========================================
Port      Server              Server Port
-------------------------------------------
Eth1/2    c200-m2-10g2-002    50544
Eth1/3    c200-m2-10g2-003    41909
Eth1/4    c200-m2-10g2-004    36480
Eth1/5    c200-m2-10g2-005    38179
Eth1/6    c200-m2-10g2-006    51375
Eth1/7    c200-m2-10g2-031    41915
Eth1/8    c200-m2-10g2-008    50983
Eth1/9    c200-m2-10g2-009    37056
Eth1/11   c200-m2-10g2-011    35882
Eth1/12   c200-m2-10g2-012    44551
.
.
.

Page 19: The Data Center and Hadoop

n3548-001# bufferServerMap
===================================================================
Port      Server              1sec     5sec     60sec    5min     1hr
-------------------------------------------------------------------
Eth1/1    c200-m2-10g2-001    0KB      0KB      0KB      0KB      0KB
Eth1/2    c200-m2-10g2-002    384KB    384KB    1536KB   2304KB   2304KB
Eth1/3    c200-m2-10g2-003    384KB    384KB    1152KB   1536KB   1536KB
Eth1/4    c200-m2-10g2-004    384KB    384KB    2304KB   2304KB   2304KB
Eth1/5    c200-m2-10g2-005    384KB    384KB    768KB    1536KB   1536KB
Eth1/6    c200-m2-10g2-006    384KB    2304KB   2304KB   2304KB   2304KB
Eth1/7    c200-m2-10g2-031    384KB    384KB    3456KB   3840KB   3840KB
Eth1/8    c200-m2-10g2-008    768KB    768KB    2688KB   2688KB   2688KB
Eth1/9    c200-m2-10g2-009    384KB    384KB    2304KB   2304KB   2304KB
Eth1/11   c200-m2-10g2-011    384KB    384KB    1920KB   1920KB   1920KB
.
.
.

Eth1/1 (c200-m2-10g2-001) has 0 buffer usage because it's the name node
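Output in this per-port layout is easy to post-process. The sketch below parses a few rows copied from the bufferServerMap listing on this slide and picks the port with the highest buffer usage in a chosen window. The function names are hypothetical and this is not the script behind the command; it only shows one way the data could be consumed.

```python
WINDOWS = ("1sec", "5sec", "60sec", "5min", "1hr")

# Rows copied from the bufferServerMap output on this slide.
SAMPLE = """\
Eth1/1 c200-m2-10g2-001 0KB 0KB 0KB 0KB 0KB
Eth1/2 c200-m2-10g2-002 384KB 384KB 1536KB 2304KB 2304KB
Eth1/7 c200-m2-10g2-031 384KB 384KB 3456KB 3840KB 3840KB
"""


def parse_buffer_rows(text):
    """Turn 'Port Server 1sec 5sec 60sec 5min 1hr' rows into dictionaries."""
    rows = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) != 2 + len(WINDOWS) or not fields[0].startswith("Eth"):
            continue  # skip headers, rulers, and ellipsis lines
        if not all(value.endswith("KB") for value in fields[2:]):
            continue
        usage = {w: int(v[:-2]) for w, v in zip(WINDOWS, fields[2:])}
        rows.append({"port": fields[0], "server": fields[1], "kb": usage})
    return rows


def busiest_port(rows, window):
    """Return the row with the largest buffer usage in the given window."""
    return max(rows, key=lambda row: row["kb"][window])


if __name__ == "__main__":
    top = busiest_port(parse_buffer_rows(SAMPLE), "5min")
    print(top["port"], top["server"], top["kb"]["5min"], "KB")
```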

Page 20: The Data Center and Hadoop

n3548-001# jobsBuffer
Hadoop Job Info ...
===================================================================
1 jobs currently running
JobId                    RunTime(secs)   User     Priority
job_201306131423_0009    120             hadoop   NORMAL
===================================================================
Buffer Info - Per Port
Port      Server              1sec     5sec     60sec    5min     1hr
-------------------------------------------------------------------
Eth1/1    c200-m2-10g2-001    0KB      0KB      0KB      0KB      0KB
Eth1/2    c200-m2-10g2-002    384KB    384KB    768KB    768KB    768KB
Eth1/3    c200-m2-10g2-003    384KB    384KB    1152KB   1152KB   1152KB
Eth1/4    c200-m2-10g2-004    384KB    1536KB   1536KB   1536KB   1536KB
Eth1/5    c200-m2-10g2-005    384KB    768KB    1152KB   1152KB   1152KB
.
.

What jobs were running during peak buffer usage … and for how long were they running

Page 21: The Data Center and Hadoop

n3548-001(config)# jobsBuffer
Hadoop Job Info ...
===================================================================
0 jobs currently running
JobId                    RunTime(secs)   User     Priority
===================================================================
Buffer Info - Per Port
Port      Server              1sec     5sec     60sec    5min     1hr
-------------------------------------------------------------------
Eth1/1    c200-m2-10g2-001    0KB      0KB      0KB      0KB      0KB
Eth1/2    c200-m2-10g2-002    0KB      0KB      0KB      1920KB   1920KB
Eth1/3    c200-m2-10g2-003    0KB      0KB      0KB      2304KB   2304KB
Eth1/4    c200-m2-10g2-004    0KB      0KB      0KB      2688KB   2688KB
Eth1/5    c200-m2-10g2-005    0KB      0KB      0KB      2304KB   2304KB
Eth1/6    c200-m2-10g2-006    0KB      0KB      0KB      2304KB   2304KB
Eth1/7    c200-m2-10g2-031    0KB      0KB      0KB      1920KB   2688KB
.

Historic look at the buffer usage …

Page 22: The Data Center and Hadoop

Page 23: The Data Center and Hadoop

Page 24: The Data Center and Hadoop

Page 25: The Data Center and Hadoop

[Chart: Buffer Usage over time (x-axis 0 to 780), annotated with the Map, Shuffle, Reduce, and Replication phases.]

Page 26: The Data Center and Hadoop

[Diagram: data is pushed over a Python socket (Push Data) to an Analyze stage; a PTP Grandmaster is optional.]

github.com/datacenter
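The push model on this slide could look roughly like the following. This is a minimal sketch, not the code from the referenced repository: the record fields, function names, and newline-delimited JSON framing are all assumptions, and `socket.socketpair()` stands in for a real network connection so the example is self-contained.

```python
import json
import socket


def encode_sample(port, server, buffer_kb, timestamp):
    """Serialize one buffer reading as a newline-terminated JSON record."""
    record = {"port": port, "server": server, "buffer_kb": buffer_kb, "ts": timestamp}
    return (json.dumps(record) + "\n").encode("utf-8")


def decode_sample(line):
    """Parse one newline-terminated JSON record back into a dictionary."""
    return json.loads(line.decode("utf-8"))


def demo():
    """Push two readings through a local socket pair and collect them."""
    sender, collector = socket.socketpair()
    with sender:
        # The timestamps are placeholders; a PTP-disciplined clock would supply them.
        sender.sendall(encode_sample("Eth1/2", "c200-m2-10g2-002", 384, 0.0))
        sender.sendall(encode_sample("Eth1/3", "c200-m2-10g2-003", 384, 1.0))
    # The sender is closed here, so the collector reads both records and then EOF.
    with collector, collector.makefile("rb") as stream:
        return [decode_sample(line) for line in stream]


if __name__ == "__main__":
    for sample in demo():
        print(sample)
```

Accurate timestamps matter when readings from several switches are merged, which is presumably why a PTP grandmaster appears as an optional component in the diagram.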

Page 27: The Data Center and Hadoop


Page 28: The Data Center and Hadoop

Various Multitenant Environments

• Hadoop + HBase: need to understand traffic patterns
• Job based: scheduling dependent
• Department based: permissions and scheduling dependent

Page 29: The Data Center and Hadoop

[Diagram labels: Map 1, Map 2, Map 3 … Map N; Shuffle; Reducer 1, Reducer 2, Reducer 3 … Reducer N; Output Replication; HDFS; two Region Servers, each with Major Compaction; two Clients issuing Read and Update operations]

Page 30: The Data Center and Hadoop

HBase During Major Compaction

Read/Update Latency: Comparison of Non-QoS vs. QoS Policy

[Chart: Latency (us), 0 to 9000, over Time. Series: UPDATE-AverageLatency(us), READ-AverageLatency(us), QoS-UPDATE-AverageLatency(us), QoS-READ-AverageLatency(us).]

~45% improvement for Read

Switch Buffer Usage: with network QoS policy to prioritize HBase Update/Read operations

Page 31: The Data Center and Hadoop

HBase + Hadoop MapReduce

Switch Buffer Usage: with network QoS policy to prioritize HBase Update/Read operations

[Chart: BufferUsed over Timeline (x-axis ticks 1 to 5935). Series: Hadoop TeraSort, HBase.]

Read/Update Latency: Comparison of Non-QoS vs. QoS Policy

[Chart: Latency (us), 0 to 40000, over Time. Series: UPDATE-AverageLatency(us), READ-AverageLatency(us), QoS-UPDATE-AverageLatency(us), QoS-READ-AverageLatency(us).]

~60% improvement for Read

Page 32: The Data Center and Hadoop

Cisco Unified Data Center

• Unified Fabric: highly scalable, secure network fabric
• Unified Computing: modular stateless computing elements
• Unified Management: automated management; manages enterprise workloads

THANK YOU FOR LISTENING

www.cisco.com/go/ucs
www.cisco.com/go/nexus
http://www.cisco.com/go/workloadautomation

Cisco.com Big Data: www.cisco.com/go/bigdata

Data Center Script Examples from Presentation: github.com/datacenter