Page 1

CloudEngine Switch 16800 & Cloud Fabric

Almaz Mazitov

Page 2

Huawei Confidential

Four Phases of DCN Development: Data Center Switches Must Evolve to Next-Generation Models to Meet Service Requirements

Phases and market size:
• Traditional DCs (installed base network)
• Virtualization (US$3.5 billion, CAGR: -3%)
• Cloud computing (US$5 billion, CAGR: +15%)
• AI (US$2 billion, CAGR: +50%)

Service drivers:
• Server centralization and independent data center (DC) construction. The network is stable and reliable, and services are not interrupted.
• Virtualization: resource pool-based sharing, improving utilization
• Cloud-based services, optimizing provisioning efficiency
• AI and big data: data value mining, realizing business monetization

Network requirements:
• STP
• Switches with high-density ports and large buffers
• Large Layer 2: stack, M-LAG, VXLAN
• Association with computing resources
• Pool-based management through the SDN controller
• Interconnection with the cloud platform to implement self-service provisioning
• Accelerated distributed storage and AI high-speed computing
• Integration of computing, storage, and data networks
• AI engine and 400G

Products:
• Sx7 Series Campus Switches for the Data Center Market
• CloudEngine 12800/12800S (16, 08, 04, 04S, 08S), released in August 2012
• CloudEngine 16800 (16, 08, 04), released in January 2019

Since its launch in 2012, the Huawei CloudEngine 12800 has supported the development of virtualization and cloud. In 2019, the CloudEngine 16800 was released to meet DC development requirements in the AI era.

Page 3

CloudEngine Series Data Center Switch Portfolio

Core switches:
• CloudEngine 16800 (new): CloudEngine 16816, CloudEngine 16808, CloudEngine 16804

Access switches:
• CloudEngine 6881-48S6CQ: 10GE TOR switch (new)
• CloudEngine 6863-48S6CQ: 25GE TOR switch (new)

Page 4

CloudEngine 16800: Leading Hardware Architecture, Extensive Software Features, and Complete Solution Mapping Capabilities

Models: CloudEngine 16816, CloudEngine 16808, CloudEngine 16804

Line cards: 36*100GE, 18*100GE, 36*40GE, 24*40GE, 48*10GE

Leading Hardware Architecture
• Orthogonal architecture, backplane-free cabling, strict front-to-back airflow, cell switching
• Mixed-flow fan, VC phase change heat dissipation
• Smooth evolution to 400G
• AI engine (V1R19C10)

Extensive Software Features
• Flexible NSH: Flexible and simplified VAS deployment
• High security: Microsegmentation (VM-level security isolation)
• Telemetry technology, detecting the network quality in real time
• Edge intelligence and local processing of network behaviors

Complete Solution Mapping Capabilities
• Agile Controller-DCN provides simplified deployment capabilities throughout the life cycle.
• FabricInsight analyzes TCP flows and network-wide health.

CloudEngine 16800: The 400G platform supports 10GE, 40GE, and 100GE interfaces and an AI engine.

Page 5

Hardware Architecture: Industry-leading Architecture Design and Innovative Heat Dissipation

• Orthogonal architecture: backplane-free cabling, higher chassis bandwidth
• Strict front-to-back airflow design: independent front-to-back airflow, even heat dissipation, basic requirements for data centers
• Non-blocking switching: cell switching and VoQ, balanced traffic distribution, higher bandwidth usage
• Mixed-flow fan, VC phase change heat dissipation: air volume three times higher than the industry average, greatly reducing noise; leading energy-saving design

The CloudEngine 16800 supports the network lifecycle of four generations of servers and smooth evolution to 400G.

[Figure: line card; traffic split into equal 1/3 shares; VC heat dissipation substrate, heat dissipation fin, and chip; air intake and air exhaust]

Page 6

Introduction to CloudEngine 16800

Specification                 CE16804                       CE16808                       CE16816
Dimensions (W x D x H, mm)    482.6 x 990.3 x 437 (10U)     482.6 x 990.3 x 703.6 (16U)   482.6 x 1149.2 x 1435.7 (32U)
Switching capacity            43 Tbit/s                     86 Tbit/s                     173 Tbit/s
Packet forwarding rate        11,280 Mpps                   22,560 Mpps                   45,120 Mpps
LPU slots                     4                             8                             16
Number of fan trays           3                             3                             3
Number of power supplies      6                             10                            20

Common to all three models:
• MPU: 1+1
• SFUs: 6 (scalable to 9 for future expansion)
• Architecture: Clos switching architecture, cell switching, VoQ
• Power input: DC: 2200 W (-48 V/-60 V); AC/HVDC: 3000 W (AC: 220 V, HVDC: 240 V/380 V)

Chassis callouts (CloudEngine 16808):
• Two MPUs: 1+1 redundancy
• The CloudEngine 16808 has 10 power modules in total.
• The CloudEngine 16808 has a total of eight slots.
• The CloudEngine 16808 has three fan trays.
• The CloudEngine 16808 has up to nine SFUs and supports N+1 or N+M redundancy.

Page 7

CloudEngine 16800: 100GE/40GE/10GE Line Cards

Card name      Type    Ports              Supported port modes                    ACL
CEL36CQFD-G    100GE   36*100GE QSFP28    36*100GE/36*40GE/144*25GE/144*10GE      6*7.5K
CEL18CQFD-G    100GE   18*100GE QSFP28    18*100GE/18*40GE/72*25GE/72*10GE        3*7.5K
CEL36LQFD-G    40GE    36*40GE QSFP+      36*40GE/144*10GE                        3*7.5K
CEL24LQFD-G    40GE    24*40GE QSFP+      24*40GE/96*10GE                         2*7.5K
CEL48XSFD-G    10GE    48*10GE SFP+       48*10GE                                 1*7.5K

Table entries (all cards)                               Standard mode   Large routing mode   Large MAC mode
MAC address table                                       96K             32K                  256K
FIB (IPv4/IPv6)                                         220K/80K        256K/80K             128K/64K
ND                                                      80K             80K                  64K
ARP (non-contiguous and contiguous MAC addresses)       96K-220K        96K-256K             96K-128K
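The 25GE and 10GE port counts in the line card table are consistent with splitting each QSFP28 or QSFP+ port into four lower-speed ports. The sketch below reproduces that arithmetic. The 4-way breakout factor is an assumption inferred from the numbers, not stated on the slide, and the names `NATIVE_PORTS` and `breakout_ports` are illustrative only.

```python
# Assumption: each QSFP28/QSFP+ port breaks out into 4 lower-speed ports.
BREAKOUT_FACTOR = 4

# Native port count per card, taken from the line card table.
NATIVE_PORTS = {
    "CEL36CQFD-G": 36,  # 100GE QSFP28
    "CEL18CQFD-G": 18,  # 100GE QSFP28
    "CEL36LQFD-G": 36,  # 40GE QSFP+
    "CEL24LQFD-G": 24,  # 40GE QSFP+
}


def breakout_ports(card: str) -> int:
    """Return the port count after a 4-way breakout of every native port."""
    return NATIVE_PORTS[card] * BREAKOUT_FACTOR


if __name__ == "__main__":
    for card in NATIVE_PORTS:
        print(card, breakout_ports(card))
```

The results match the 144, 72, 144, and 96 port figures listed for the four QSFP-based cards; the 48*10GE SFP+ card has no breakout mode in the table.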

Page 8

MPUs of the CloudEngine 16800

• The CloudEngine 16804/CloudEngine 16808 uses half-width MPUs, and the active and standby MPUs are installed side by side.
• The CloudEngine 16816 uses full-width MPUs, and the active and standby MPUs are arranged vertically.

MPU components:
• HiSilicon CPU: 16-core, single-core 1.8 GHz
• Memory: 8 GB
• CMU
• Integrated AI chip (GA in February 2020)
• 1588v2 (GA in February 2020)

MPU             Description
CE-MPUD-HALF    Half-width MPU, adapting to the CloudEngine 16804/CloudEngine 16808
CE-MPUD-FULL    Full-width MPU, adapting to the CloudEngine 16816

Page 9

SFUs of the CloudEngine 16800

SFU type   SFU            Performance
SFU04      CE-SFU04G-G    8.4 Tbit/s
SFU04      CE-SFU04F-G    4.2 Tbit/s
SFU08      CE-SFU08G-G    16.8 Tbit/s
SFU08      CE-SFU08F-G    8.4 Tbit/s
SFU16      CE-SFU16G-G    28.8 Tbit/s
SFU16      CE-SFU16F-G    16.8 Tbit/s

Page 10

Mapping Between Cards and SFUs of the CloudEngine 16800

Device models: CE16804/CE16808/CE16816

Card        SFU                                      Number of SFUs required for line-rate forwarding
36*100GE    CE-SFU04G-G/CE-SFU08G-G/CE-SFU16G-G      5
36*40GE     CE-SFU04F-G/CE-SFU08F-G/CE-SFU16F-G      4
36*40GE     CE-SFU04G-G/CE-SFU08G-G/CE-SFU16G-G      4
48*10GE     CE-SFU04F-G/CE-SFU08F-G/CE-SFU16F-G      4
48*10GE     CE-SFU04G-G/CE-SFU08G-G/CE-SFU16G-G      4
18*100GE    CE-SFU04F-G/CE-SFU08F-G/CE-SFU16F-G      5
18*100GE    CE-SFU04G-G/CE-SFU08G-G/CE-SFU16G-G      5
24*40GE     CE-SFU04F-G/CE-SFU08F-G/CE-SFU16F-G      4
24*40GE     CE-SFU04G-G/CE-SFU08G-G/CE-SFU16G-G      4

Remarks: The CloudEngine 16800 uses the 6-plane SFU design.
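With a 6-plane design and four or five SFUs needed for line rate, the remaining SFUs act as spares. The sketch below works out that headroom per card type. It assumes spare capacity is simply the installed count minus the required count; the slide does not describe how the switch actually schedules traffic across planes, and `REQUIRED_SFUS` and `spare_sfus` are illustrative names.

```python
# SFUs required for line-rate forwarding, per line card (from the mapping table).
REQUIRED_SFUS = {
    "36*100GE": 5,
    "18*100GE": 5,
    "36*40GE": 4,
    "24*40GE": 4,
    "48*10GE": 4,
}


def spare_sfus(card: str, installed: int = 6) -> int:
    """Spare SFUs once line-rate needs are met (assumes spare = installed - required)."""
    required = REQUIRED_SFUS[card]
    if installed < required:
        raise ValueError(
            f"{card} needs {required} SFUs for line rate, only {installed} installed"
        )
    return installed - required


if __name__ == "__main__":
    for card in REQUIRED_SFUS:
        print(card, "spare SFUs with 6 installed:", spare_sfus(card))
```

Under this assumption, six installed SFUs give N+1 headroom for the 100GE cards and N+2 for the 40GE and 10GE cards.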

Page 11

Programmable Key Components, Flexible Customization of Service Functions

Rapid response to service requirements: hardware BFD, microsegmentation, NSH-based SFC, VXLAN over IPv6, ...

• CPU (intra-card CPU chip): quad-core CPU for protocol packet processing, FIB entry delivery, ...
• Co-processor: hardware BFD, high-performance sFlow, ...
• Forwarding chip: adjustable processes, new service processes, adjustable entry resources, enhanced service processes

[Figure: software stack with VRP and a Linux container on top of Linux and driver; interface labels NETCONF, CLI, SNMP, gRPC, OpenFlow, SSH, and FuncEdit; additional label: fragmentation and reassembly]

Page 12

NSH-based SFC Provides Easy VAS Orchestration

• Simplified deployment: The SDN controller defines SFC in drag-and-drop mode.
• Efficient forwarding: Traffic is diverted only once, saving ACL resources and simplifying configuration.
• Flexible orchestration: VAS functions are decoupled from fabrics, providing flexible orchestration.

[Figure: Web and App traffic steered by switches through a VAS resource pool (FW, IDS, LB, NAT)]
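The "divert traffic once" idea can be sketched as follows. In NSH, a packet is classified one time and then carries a service path identifier (SPI) and a service index (SI) that each hop uses to pick the next function. This is a toy model under stated assumptions: the path ID 10, the classification rule, and all function names are hypothetical, and nothing here reflects how the CloudEngine or the controller implements SFC.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional

# Hypothetical service path: path ID 10 visits these VAS functions in order.
SERVICE_PATHS: Dict[int, List[str]] = {10: ["FW", "IDS", "LB", "NAT"]}
INITIAL_SI = 255  # common starting value for the NSH service index


@dataclass
class Packet:
    src: str
    dst: str
    spi: Optional[int] = None  # service path identifier, set once by the classifier
    si: Optional[int] = None   # service index, decremented after each function


def classify(pkt: Packet) -> Packet:
    """Classify once at the ingress switch (toy rule: all traffic uses path 10)."""
    pkt.spi, pkt.si = 10, INITIAL_SI
    return pkt


def next_function(pkt: Packet) -> Optional[str]:
    """Pick the next function from (SPI, SI) alone, with no further classification."""
    chain = SERVICE_PATHS[pkt.spi]
    position = INITIAL_SI - pkt.si
    return chain[position] if position < len(chain) else None


def traverse(pkt: Packet) -> List[str]:
    """Walk the whole chain, decrementing SI after each service function."""
    visited: List[str] = []
    classify(pkt)
    while (fn := next_function(pkt)) is not None:
        visited.append(fn)
        pkt.si -= 1
    return visited


if __name__ == "__main__":
    print(traverse(Packet(src="10.0.0.1", dst="10.0.1.1")))
```

Because the forwarding decision after classification depends only on the two header fields, intermediate switches do not need a per-hop ACL match, which is the resource saving the slide refers to.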

Page 13

NSH: NSH Copes with Challenges Brought by Diversified DC Security to the Network

Challenges:
• Static traffic diversion depends on the physical topology. The security service is coupled with the physical topology, leading to low scalability.
• Inline deployment causes complex configuration of the control plane.
• Diversified policies (QoS, routing, O&M, and security policies for App 1, App 2, ..., App n) are deployed, and ACLs become the bottleneck.

Requirements:
• The switch needs to eliminate the ACL bottleneck.
• Security policies need to be configured on the GUI.
• Security devices are pooled, implementing scaling on demand.

Page 14

Microsegmentation Achieves Fine-grained Isolation and Service Security

As is: subnet-based isolation. To be: VM-level isolation.

[Figure: VM 1 to VM 3 (1.1.1.1, 1.1.1.2, 1.1.1.3) and VM 4 to VM 6 (2.2.2.1, 2.2.2.2, 2.2.2.3)]

• Fine-grained defense: Define applications based on VM names and discrete IP addresses, with fine granularity.
• Flexible deployment: Define services based on application groups and decouple them from subnets to achieve flexible deployment.
• Distributed security: Traffic is filtered nearby on access switches, and east-west isolation is implemented without using firewalls.

Page 15

Microsegmentation Copes with Challenges Caused by Diversified DC Security

Challenges:
• Traditional security depends on different service partitions (Web, App, Database).
• Traditional isolation causes traffic detours.
• Due to diversified isolation policies, ACLs become scarce resources.

Requirements:
• Cloud sharing and security isolation create a conflict.
• Access switches support security isolation.
• Switches need to eliminate the ACL bottleneck.

Zero-trust security model (source: Forrester Research): The zero-trust security model was proposed in 2012. [Figure labels: external network, internal network, untrusted]

Grouping granularity:
• Segmentation: subnet
• Microsegmentation: VM name/container, discrete IP address

[Figure: spine switches above two server leaf switches (VTEPs), each connecting servers that run OVS and VMs]

Page 16

Microsegmentation Provides Fine-grained Security Isolation

Microsegmentation solves the problem of the zero-trust security model. Compared with the zero-trust security model, microsegmentation provides more fine-grained security isolation. It covers physical machines and addresses east-west security issues.

Grouping criteria:
• Segmentation: subnet
• Microsegmentation: VM name/container, discrete IP address, OS type, organization name

Example groups:
• VM name = Web*: Web 1, Web 2, Web 3, Web 4
• Security group = App: App 1, App 2, App 3, App 4
• Operating system = Linux: four Linux hosts
• IP: IP1=10.0.0.1, IP2=10.0.0.2
• MAC: MAC1=11-11-11, MAC2=22-22-22
• VLAN=10: DB1, DB2, DB3, DB4
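The grouping criteria on this slide (VM name pattern, operating system, discrete IP addresses) can be sketched as a simple classifier. This is an illustration only: the `Endpoint` type, the group names, and the rule set are hypothetical, and the slide does not show the controller's actual data model.

```python
from dataclasses import dataclass
from fnmatch import fnmatchcase
from typing import List


@dataclass
class Endpoint:
    vm_name: str
    ip: str
    os: str


def groups_for(ep: Endpoint) -> List[str]:
    """Return every hypothetical group whose rule matches the endpoint."""
    rules = {
        "web": fnmatchcase(ep.vm_name, "Web*"),            # VM name = Web*
        "linux": ep.os == "Linux",                         # operating system = Linux
        "listed-ip": ep.ip in {"10.0.0.1", "10.0.0.2"},    # discrete IP addresses
    }
    return [name for name, matched in rules.items() if matched]


if __name__ == "__main__":
    print(groups_for(Endpoint(vm_name="Web 1", ip="10.0.0.1", os="Linux")))
    print(groups_for(Endpoint(vm_name="DB1", ip="10.0.0.9", os="Windows")))
```

Membership depends on endpoint attributes rather than on the subnet, so a VM keeps its group when it moves to another subnet.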

Page 17

Microsegmentation Solves the ACL Bottleneck of Switches

PBR depends heavily on ACL entries. Microsegmentation overcomes entry restrictions.

Solution 1: Divert traffic to the firewall
• 3 ACL rules
• 3 policy-based routes
• ACL entry bottleneck

Solution 2: Microsegmentation-based isolation
• 0 ACL rule
• 0 microsegmentation policy

Case 1: At a bank, PBR and antivirus preempt ACLs. As a result, ACLs are insufficient and services fail to be provisioned (due to conflicts with security policies).

Solution benefits: Microsegmentation is used to isolate east-west traffic on switches instead of firewalls.

[Figure: external network, VAS resource pool, and VM resource pools with servers running OVS and VMs, shown for both solutions]
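One way to see why per-flow ACL rules run out is to compare how entry counts grow. The sketch below uses a deliberately simplified model: it assumes one ACL rule per source/destination IP pair versus one policy per pair of groups, and it ignores the entries needed to record group membership. Neither function describes how the switch hardware actually allocates entries, and the VM counts are hypothetical.

```python
def per_pair_acl_entries(sources: int, destinations: int) -> int:
    """Entries needed if every source/destination IP pair gets its own ACL rule."""
    return sources * destinations


def group_policy_entries(group_pairs: int) -> int:
    """Entries needed if one policy covers each pair of groups."""
    return group_pairs


if __name__ == "__main__":
    # Hypothetical example: 40 web VMs talking to 20 database VMs.
    print("per-IP rules:", per_pair_acl_entries(40, 20))
    print("group policies:", group_policy_entries(1))
```

In this model the per-IP count grows with the product of the two VM populations, while the group-based count depends only on how many group pairs need a policy.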

Page 18

Industry-leading Telemetry Technology Achieves Visualized and Controllable Networks or Services in Real Time

As-Is: Network Devices Used as Black Boxes
• SNMP/NETCONF query/response mechanism, and minute-level reporting
• Microburst detection is not supported, and traffic details cannot be detected.
• The traditional network device reports only logs and alarms, but cannot collect packet characteristic information such as the delay and packet loss.

To-Be: Visualized Network Management and Control
• gRPC subscription/active reporting mechanism, and millisecond-level reporting
• The CloudEngine 16800 monitors the microburst status, detects traffic details, and predicts congestion in real time.
• The CloudEngine 16800 uses the intelligent analysis algorithm to detect packet characteristic information such as the delay, packet loss, and packet loss location in real time.

[Figure: as-is, the CPU and forwarding chip report to a traditional NMS (labels: SNMP, NETCONF, NetStream, ERSPAN); to-be, the CPU, forwarding chip, NP, and AI chip report to a collector and analyzer (labels: flow table, Protobuf over UDP, gRPC, ERSPAN+)]
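The difference between minute-level and millisecond-level reporting can be illustrated with a small simulation. The trace below is synthetic and the function names are hypothetical; the sketch only shows why an averaged minute-level value hides a short burst that per-millisecond samples expose, not how the CloudEngine detects microbursts.

```python
from statistics import mean
from typing import List


def minute_average(samples_ms: List[float]) -> float:
    """What a minute-level poll effectively reports: one averaged value."""
    return mean(samples_ms)


def microbursts(samples_ms: List[float], threshold: float) -> List[int]:
    """Millisecond offsets at which utilization exceeds the threshold."""
    return [i for i, value in enumerate(samples_ms) if value > threshold]


# Synthetic trace: 60,000 ms at 10% utilization with a 5 ms burst at 100%.
trace = [10.0] * 60_000
for i in range(30_000, 30_005):
    trace[i] = 100.0

if __name__ == "__main__":
    print("minute-level view:", minute_average(trace))
    print("millisecond-level view:", microbursts(trace, threshold=90.0))
```

The averaged value stays close to the 10% baseline, while the per-millisecond view pinpoints the five samples in which the link was saturated.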

Page 19

RDMA Effectively Improves Throughput and Reduces Latency, but Current Network Bearer Solutions Have Disadvantages

Introduction to RDMA/RoCE

Technical description:
• RDMA technology implements kernel bypass and zero copy of the buffer, provides RDMA read/write access between remote nodes, and implements the control plane protocol in the NIC hardware.
• RDMA technology is used in HPC, distributed storage, and AI scenarios to reduce the CPU load and latency, greatly improving the application performance.
• RoCEv2 migrates RDMA traffic to the ETH/IP network. In this way, the ETH/IP network supports HPC, distributed storage, and AI application deployment, and is required to provide the same network performance as memory access.

Challenges:
• Packet loss: A packet loss rate of 1% decreases the RoCE throughput from 100% to 0. However, packet loss on traditional Ethernet networks in best-effort (BE) mode is inevitable.

Current RDMA Network Bearer Solutions (IB vs. CEE)

RDMA over InfiniBand (proprietary technology, dedicated network)
• Advantage: Zero packet loss, low latency, and high throughput
• Disadvantage: Manual O&M performed by dedicated personnel, high cost

RDMA over CEE (current) (open Ethernet, converged network)
• Advantage: SDN automation, low price
• Disadvantage: High latency and low throughput

               IB                   CEE
Performance    High                 Low
O&M            Difficult            Easy
Price          High                 Low
Scale          Small                Ultra-large
Others         Dedicated network    Cloud-network synergy

Page 20

AI Fabric Implements Zero Packet Loss, Low Latency, and High Throughput Based on Ethernet to Meet Service Requirements in the AI Era

Basic flow control model (a queue with an ECN threshold and a PFC threshold):
• Set priorities through multiple queues
• Prevent packet loss through PFC backpressure
• Use ECN to notify the transmit end to avoid congestion

Question: statically configured threshold, static queue type

ECN approaches compared on the slide:
• Static ECN: Local device-level intelligence (implemented by the CPU). The threshold is set by the CPU. Hardware: CPU, LSW chip.
• Dynamic ECN: Local device-level intelligence (implemented by the intelligent chip). Set the optimal threshold based on the current traffic model. Hardware: CPU, LSW chip, intelligent chip.
• AI ECN: Global network-level intelligence (optimal application experience). Application-based priority queues are generated based on application requirements. Hardware: AI chip. Date label: November 2019.

Other labels on the slide:
• The CPU sensitivity is a key indicator.
• The queue type and threshold are the key.
• Application-oriented optimal queue on the entire network
• Local optimal threshold based on intelligent chip detection
• Local optimal threshold based on the CPU's dynamic ECN
• Static ECN performance: 50% higher than that of other vendors
• Static ECN performance: 30% higher than that of other vendors
• Column labels: CloudEngine 6865/8850/8861, CloudEngine 16800, mainstream solutions in the industry
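The basic flow control model on this slide, one queue with an ECN threshold and a PFC threshold, can be sketched as a single decision function. The sketch assumes the ECN threshold sits below the PFC threshold so that ECN acts first, and it uses a deterministic cut-off where real switches typically mark probabilistically. The threshold values and the function name are hypothetical.

```python
def flow_control_action(queue_depth: int, ecn_threshold: int, pfc_threshold: int) -> str:
    """Decide what a queue does at a given depth (all values in the same unit)."""
    if ecn_threshold >= pfc_threshold:
        # Assumption: ECN is meant to react before PFC backpressure kicks in.
        raise ValueError("ECN threshold must be below the PFC threshold")
    if queue_depth >= pfc_threshold:
        return "send PFC pause"  # backpressure the upstream device to prevent loss
    if queue_depth >= ecn_threshold:
        return "mark ECN"        # notify the transmit end so that it slows down
    return "forward"


if __name__ == "__main__":
    for depth in (50, 150, 250):
        print(depth, flow_control_action(depth, ecn_threshold=100, pfc_threshold=200))
```

A static configuration fixes both thresholds in advance. The dynamic and AI variants described on the slide would instead adjust the `ecn_threshold` argument as the traffic model changes.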

Page 21

Five Scenarios of CloudFabric Solution: Based on Whether the Controller and Cloud Platform Are Available

• Scenario 1: Underlay, without the cloud platform or controller
• Scenario 2: Cloud platform and third-party controller
• Scenario 3: Computing and hosting with the controller but no cloud platform
• Scenario 4: Cloud platform, third-party controller, and OpenStack interconnection
• Scenario 5: Cloud platform, third-party controller, and container cloud interconnection

[Figure: labels recovered from the scenario diagram]
• Roles: network administrator, computing administrator, service administrator
• Network types: underlay, network overlay, network overlay extension, hybrid overlay
• Platforms and controllers: Agile Controller-DCN, SecoManager, FusionSphere, third-party OpenStack, System Center/vCenter, VMware NSX controller, Kubernetes, third-party configuration tools such as Ansible or Microsoft Azure
• Devices: CloudEngine Layer 2 VTEP, CloudEngine 1800V
• Other labels: computing, hosting, cloud platform and network association, container platform and network association, new

Remarks: The network overlay supports centralized and distributed deployment. The distributed solution is recommended. The centralized mode will not evolve further. The hybrid overlay supports only the distributed mode.

Page 22

Agile Controller-DCN Provides GUI-based Drag-and-Drop Service Provisioning, Verification, Ultra-Large-Scale Device Management, and Three-Layer Network Visibility

Intent: design, conversion, pre-event check, automatic delivery, service verification

10x more efficient deployment
• GUI-based drag-and-drop service deployment
    • Drag-and-drop service deployment, provisioning a single VM in 18 seconds
    • Microsegmentation and SFC, eliminating the ACL bottleneck
• vs. numerous steps and parameters
    • Many parameters entered using forms, provisioning a single VM in 3 minutes
    • DCs do not support NSH-based SFC. (Cisco APIC supports drag-and-drop mode starting from V4.0.)

10x devices managed
• Ultra-large-scale, standard open protocols
    • Standard protocols, easy integration
    • 4200 devices can be managed, achieving smooth evolution
• vs. medium scale, proprietary protocols
    • Proprietary protocol, prone to vendor lock-in
    • 400 devices can be managed

100% service correctness
• Closed-loop verification and reliable service provisioning
    • Pre-event resource check and post-event service verification
• vs. NAE post-event verification and service rollback
    • Pre-event resources are not checked, and services are only verified after the event

100% service visibility
• Three-layer network visibility
    • Visibility of application, logical, and physical networks, meeting requirements of DC services
• vs. three-layer network visibility not supported
    • Cisco APIC adds the logical network layer starting from V4.0.

Additional capabilities:
• Comprehensive server access: BMs, VMs, and containers
• Largest service provisioning scope: IPv4 and IPv6, and unicast and multicast
• Cloud-network-security integration: microsegmentation and SFC

Note: Cisco APIC supports drag-and-drop mode and the logical network layer in the latest version.

Page 23

FabricInsight Detects Faults in Seconds, Locates Faults in Minutes, and Provides Predictive Maintenance

Visualization capabilities, fault location, and predictive maintenance better than Cisco. Detect faults in 1 minute, locate faults in 3 minutes, and rectify faults in 5 minutes.

FabricInsight:
• TCP/UDP/RoCE full flow analysis, locating faults in minutes
• Visibility of applications and networks and fault detection in seconds
• Predictive maintenance based on optical transceivers
• Lightweight deployment (3 VMs + 1 server)

vs.:
• No edge intelligence, no RoCE analysis
• No association between applications and networks
• No prediction of optical module faults
• At least 6 servers + 2 N9000s

Obtaining service flows and network KPIs based on Telemetry in seconds

[Figure: a DCN (spine, leaf, and servers hosting Web, App 1, App 2, and Database) sends information such as traffic characteristics, packet loss, and delay through Telemetry to a collector and analyzer; intelligent chip full-flow analysis]

Fault location (As-Is vs. To-Be):
• 36 types of network connection faults, covering over 90% of faults on the live network
• Fault location in 15 minutes
• Fault categories: device, protocol, network, reliability, service quality
• Process: sample import, intelligent learning, rapid location

Page 24

Intelligent O&M: FabricInsight Provides Specified Flow Analysis, Edge Intelligence + Cloud Training, and 100% Traffic Visualization

Data sources (switch-based load balancing across collectors):
• SNMP: device management
• ERSPAN: full flows
• gRPC: performance indicators
• NetStream v9: specified flows

FabricInsight: big data query, filter, and aggregation; visualized TCP, UDP, and RoCE flows

• Distributed intelligence: Switches provide edge intelligence, and analyze flows and send them to the cloud for processing (co-processor, edge intelligence, cloud training). The analyzer configuration is reduced by a factor of five.
    • Device types (V1R19C10): CloudEngine 6881, CloudEngine 6863, CloudEngine 16800
    • Device types (V1R19C00): CloudEngine 6865, CloudEngine 8850-64CQ, CloudEngine 6857, CloudEngine 12800
• TCP fine-grained capability: FabricInsight analyzes all packets of a specified flow and displays the network quality on the GUI.
    • Device types: CloudEngine 6800, CloudEngine 7800, CloudEngine 8800, CloudEngine 12800, CloudEngine 16800
• Multi-protocol processing capability: Distributed flow awareness based on Telemetry and multi-protocol full-data packet analysis (TCP/UDP/RoCE)

Page 25

CloudFabric: Independent Deployment, Interconnection with Huawei or Third-Party Systems, and Easy Integration with Open Ecosystem

CloudFabric can be independently deployed, integrated with Huawei systems, or integrated with third-party systems.

1. CloudFabric is independently deployed (underlay network)
   Application scenario: Network and service decoupling. The IP department manages the underlay network. The IT department uses the host overlay or traditional network. IP and IT platforms are connected using standard protocols.

2. CloudFabric connects to Huawei FusionCloud
   Application scenario: End-to-end cloud-network integration. Integrated network and IT deployment, E2E integration of CloudFabric and FusionCloud.

3. CloudFabric connects to a third-party platform
   Application scenario: Interconnection with third-party systems. Continue promotion of available integration solutions. The new CloudEngine switch models need to be integrated by integrators or customers. Open-source northbound APIs are being considered for CloudFabric.

[Figure: for each option, servers running OVS and VMs on an underlay network, with an overlay network in options 2 and 3. Labels: eSight, FabricInsight, FusionSphere, third-party cloud platform, virtualization management platform, third-party VAS device, integrator, network administrator, computing administrator, DC administrator]

Page 26

Baidu: Huawei AI Fabric Realizes Dedicated Network Performance at Ethernet Prices, Emphasizing Separate-Bid Procurement

AI is at the core of Baidu's current business. In 2018, Baidu implemented large-scale global deployment of its distributed storage and AI training services. AI applications: autonomous driving, facial recognition, data mining, life science.

Challenges
• AI training for autonomous driving is slow, with networks as the bottleneck, hampering the L4 GTM plan for 2021.
• SSD replacement does not markedly improve the performance of the distributed storage system, and storage efficiency remains low.

Why Huawei
• Based on open Ethernet: Lower price compared to InfiniBand and no dedicated technicians required
• Innovative algorithm + dedicated chip: VIQ, dynamic ECN, fast CNP, and other innovative algorithms (not supported by Cisco)

Strategy
• Cost (May 2018): Cost of InfiniBand network with same performance. Opportunity emerges.
• Performance (June 2018 to July 2018): Test performance of multiple vendors. VIQ and dynamic ECN.
• Procurement (August 2018-): Emphasize separate-bid procurement. Seize opportunity. Performance of InfiniBand at Ethernet prices.

Benefits
• Autonomous driving training: 40% training efficiency, 53% TCO
• Distributed storage: 25% IOPS

[Figure: CE12800 and CE6865 switches; queue diagram with a threshold, VIQ1, and VIQ2]

Page 27

CloudEngine 16800 Roadmap

[Roadmap chart. Labels recovered from the chart:]
• Port speeds: 10GE, 40GE, 100GE, 400GE
• CE16800 (16, 08, 04) line cards: 48*10GE, 36*40GE, 36*100GE, 18*100GE, 24*40GE, 48*100GE, 36*400GE, 48*400GE POC
• Other port configurations: 72*25/10+6*100GE, 48*25/10+4*100GE
• Milestones: GA on September 30, 2019; GA on July 30, 2020
• Versions: V2R5C20, V3R20C00
• CE12800/CE12800S (16, 08, 04, 04S, 08S) line cards: 48x10G, 48*10G FD, 36*40G FD, 24*40G FD, 12*100G FD, 36*100G FD, 16*100G FD, 8*100G FG, 36*100G FG, 36*100G SD, 18*40G+18*100G
• Card annotations: FD1: supports 25G; IEEE 1588v2; FG: 4M FIB; uplink 2*40GE+2*100GE; 24 GB buffer; 8 GB buffer; 16 GB buffer, 2 MB FIB; 16 GB buffer, MACsec, IEEE 1588v2; 4 GB buffer, MACsec, 2 MB FIB; 64 MB buffer, cost-effective

Page 28

CloudEngine TOR Switch Roadmap

[Roadmap chart. Labels recovered from the chart:]
• Timeline: ~2018, 2019, 2020
• Port speeds: GE, 10G, 25G, 40G, 100G, 400G
• Tiers: Low (Layer 2), Middle, High (large buffer), ENP
• Models: CE5855, CE6851, CE6856, CE6880, CE6870, CE6875, CE6860, CE7855, CE8850-32, CE8860, CE6865, CE8861, CE8850-64, CE5880, CE6857, CE6810, CE6881, CE6863, CE6820
• Annotations: 25G, AI Fabric, 1588, microsegmentation; GE VXLAN
• Versions and milestones: V2R5C20; V3R20C00; GA on September 30, 2019; GA on July 30, 2020
• Models with port configurations: CE8851: 32*100+8*400GE; CE8852: 96*100GE; CE6866 HI: 48*25+8*100GE; CE6866: 48*25+8*100GE

Page 29

Thank you