17
VL2: A Scalable and Flexible Data Center Network Microsoft Research Presented by: Ankita Mahajan

VL2: A scalable and flexible Data Center Network

Embed Size (px)

DESCRIPTION

This Data Center Network Architecture introduces a virtual layer 2.5 in the protocol stack of hosts and uses a directory service to achieve efficient forwarding. It uses separate location/identifier IPs

Citation preview

Page 1: VL2: A scalable and flexible Data Center Network

VL2: A Scalable and Flexible Data Center Network

Microsoft ResearchPresented by: Ankita Mahajan

Page 2: VL2: A scalable and flexible Data Center Network

INTRODUCTION

• Cloud services need data centers with hundreds of thousands of servers and that concurrently support a large number of distinct services.

• To be profitable, DC must achieve high utilization, and key to this is agility — the capacity to assign any server to any service.

• Agility promises: improved risk management and cost savings.

Page 3: VL2: A scalable and flexible Data Center Network

Agility in today’s DCN

• Designs for today’s DCN prevent agility in several ways:

• Oversubscription : Existing architectures do not provide enough capacity between the servers they interconnect

• Traffic flood in 1 service affects other services.• Topologically significant IP addresses.• Dividing servers among VLANs: reates

Fragmentation of Address Space.

Page 4: VL2: A scalable and flexible Data Center Network

Objectives of building VL2

To overcome limitations we need a n/w with following objectives:• Uniform high capacity: Hot-spot free• Performance isolation• Layer-2 semantics: Just as if the servers were

on a LAN it should be easy to assign any server to any service and configure that server with whatever IP address the service expects.

Page 5: VL2: A scalable and flexible Data Center Network

• 20 to 40 servers per rack, each singly connected to• a Top of Rack (ToR) switch with a 1 Gbps link. • ToRs connect to two aggregation switches for redundancy, and• these switches aggregate further connecting to access routers. • At the top, core routers carry traffic between access routers.• All links use Ethernet as a physical-layer protocol• To limit overheads (e.g., packet flooding and ARP broadcasts) and to isolate

different services servers are partitioned into virtual LANs (VLANs)

Page 6: VL2: A scalable and flexible Data Center Network

Limitations3 fundamental Limitations:• Limited server-to-server capacity: ToRs are

1:5 to 1:20 oversubscribed and paths through the highest layer can be 1:240 oversubscribed.

• Fragmentation of resources: spare capacity is reserved by individual services.

• Poor reliability and utilization: Resilience model forces each device and link to be run up to at most 50% of its maximum utilization.

Page 7: VL2: A scalable and flexible Data Center Network

Data-Center Traffic Analysis

1. The ratio of traffic volume between servers into data centers, to, traffic entering/leaving data centers is currently around 4:1

2. data-center computation is focused where high speed access to data on memory or disk is fast and cheap.

3. The demand for bandwidth between servers inside a data center is growing faster than the demand for bandwidth to external hosts.

4. The network is a bottleneck to computation.

Page 8: VL2: A scalable and flexible Data Center Network

Flow Distribution Analysis:Distribution of flow sizes:

• Similar to Internet traffic, 99% of flows are smaller than 100 MB.

• But the distribution is simpler and more uniform than Internet.

• More than 90% of bytes are in flows between 100MB and 1 GB.

Page 9: VL2: A scalable and flexible Data Center Network

Flow Distribution Analysis:Number of Concurrent Flows:

• More than 50% of the time, an average machine has about ten concurrent flows.

• At least 5% of the time it has greater than 80 concurrent ows.• We almost never see more than 100 concurrent ows.Both the above Flow Distribution Analysis imply that VLB will perform well on this traffic. Since even big flows are only 100 MB.adaptive routing schemes may be dicult to implement in the data center since any reactive traffic engineering will need to run at least once a second if it wants to react to individual flows.

Page 10: VL2: A scalable and flexible Data Center Network

Traffic Matrix Analysis

• Poor summarizability of trac patterns:Is there regularity in the trac that might be exploited through careful measurement and trac engineering?TM(t)ij clustering: a day’s worth of trac in the datacenter, even when approximating with 50-60 clusters, the fitting error remains 60%

• Instability of trac patterns: how predictable is the trac in the next interval given the current trac?

Page 11: VL2: A scalable and flexible Data Center Network

Failure Characteristics

1. pattern of networking equipment failures:• Most failures are small in size: 50% of network

device failures involve < 4 devices and 95% of network device failures involve < 20 devices

• Large correlated failures are rare: the largest correlated failure involved 217 switches

• downtimes can be signicant: 95% of failures are resolved in 10min, 98% in < 1 hr, 99.6% in < 1 day, but 0.09% last > 10 days.

Page 12: VL2: A scalable and flexible Data Center Network

impact of networking equipment failure?

• in 0.3% of failures all redundant components in a network device group became unavailable

• The main causes of these downtimes are network miscongurations, firmware bugs, and faulty components (e.g., ports).

• With no obvious way to eliminate all failures from the top of the hierarchy, VL2’sapproach is to broaden the topmost levels of the network so that the impact of failures is muted and performance degrades gracefully,

• moving from 1:1 redundancy to n:m redundancy.

Page 13: VL2: A scalable and flexible Data Center Network

Terminology

• Goodput: useful information delivered per second to the application layer.

• VLB: each server independently picks a path at random through the network for each of the flows it sends to other servers in the data center.

• ECMP: distributes traffic across equal-cost paths

• anycast addresses for the Directory System

Page 14: VL2: A scalable and flexible Data Center Network
Page 15: VL2: A scalable and flexible Data Center Network
Page 16: VL2: A scalable and flexible Data Center Network
Page 17: VL2: A scalable and flexible Data Center Network