Scalable and Flexible Routing Service for Tencent Cloud Access Network Allen Lv, Tencent Aug, 2020

Agenda

• Challenges

• Architecture

• Design Details

• Experience and Future Work

Tencent Cloud Infrastructure Overview

• 54+ AZs

• 27+ Regions

• 100T+ public network bandwidth reserved

• 15EB+ storage

| Tencent Cloud Access Network Overview

[Diagram: Enterprise Branch and Custom IDC sites reach Access Site 1 and Access Site 2 over Private Line or VPN; ISPs connect through the Tencent Internet Exchange (TIX); the Tencent Cloud Network hosts VPCs with CVM and CDB instances.]

Challenges

• Massive-scale forwarding tables, VRFs, tunnels…

• Roll out network features fast

• Scale up easily to handle rapid growth in traffic volume

• Low Cost

| Traditional Commodity Router

• Hardware & Software Vendor Lock-in

• Hard to Scale

• Lack of feature velocity

• High Cost

[Diagram: a chassis router with a Primary Processor, a Secondary Processor, a Switching Fabric, and three Line Cards.]

| Design Philosophies

• Scalability: each component scales up independently on demand

• Flexibility: fast feature delivery (~2 weeks)

• Reliability: NSF, NSR, fast failover

• Operationality

Software Defined Router (SDR) Overview

• NFV-Based Forwarding System: Data Plane … Data Plane

• NFV-Based Routing System: Routing Plane … Routing Plane

• NFV-Based Controller: Control Plane … Control Plane

• NFV-Based Orchestrator: Orchestrator … Orchestrator

[Diagram: the four NFV-based systems sit on a fabric of CS and AS switches, with layers labeled Overlay and Underlay. External connections: ISP (eBGP), Customer Router (eBGP), CPE (IPsec). Other labels: Tencent Cloud, Tencent FW, Tencent DDoS.]

Software Defined Router (SDR) Inside

• NGW: Data Plane

• BGP: Routing Plane

• RNSO: Control Plane

• GNSO: Orchestrator

[Diagram: External Router, Edge Access (EA), NGW, BGP, RNSO, GNSO, OSS/BSS, and VPC. Link labels: BGP/BFD, FIB/ARP, config/monitor, gRPC, TGRE, VxLAN.]

| Customer Access (Private-Line GW & VPNGW)

Interoperating with both Traditional Network and SDN-Based Network at large scale

[Diagram: Customer Router, Internet, PLGW (SDR), VPNGW (SDR), EA, and VPC 10.0.0.0/16, with two BGP sessions. The picture is split into a Traditional Network side and an SDN-Based Network side.]

| End-user Access (Tencent Internet Exchange)

Large-scale forwarding table (10M) and flexible traffic engineering

[Diagram: ISP Router1 with a BGP session to TIX1 (SDR), and ISP Router2 with a BGP session to TIX2 (SDR). Other labels: EA1, EA2, VxLAN Fabric, VPC1 115.159.246.0/24, VPC2 116.150.247.0/24.]

| Flexibility – On-Demand Traffic Engineering

• Flexible traffic engineering based on user demand

<SIP,DIP> ---> <SDR2, VNI>

[Diagram: a VPC, SDR1 at Site1 and SDR2 at Site2 joined by a VxLAN Fabric, and External Peers 1 to 4. Other label: BGP route.]
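
The `<SIP,DIP> ---> <SDR2, VNI>` entry above can be read as a policy lookup keyed on source and destination prefixes. The following is a minimal Python sketch of that idea, not the SDR implementation: the `TeRule` and `TePolicy` names, the most-specific-match tie-break, the fallback to the ordinary BGP route, and the sample prefixes and VNI are all assumptions made for illustration.

```python
import ipaddress
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class TeRule:
    """One hypothetical <SIP,DIP> ---> <next hop, VNI> policy entry."""
    src: ipaddress.IPv4Network
    dst: ipaddress.IPv4Network
    next_hop: str
    vni: int


class TePolicy:
    """Illustrative policy table; a linear scan stands in for a real classifier."""

    def __init__(self) -> None:
        self._rules: List[TeRule] = []

    def add(self, src: str, dst: str, next_hop: str, vni: int) -> None:
        self._rules.append(
            TeRule(ipaddress.ip_network(src), ipaddress.ip_network(dst), next_hop, vni)
        )

    def lookup(self, sip: str, dip: str) -> Optional[Tuple[str, int]]:
        s = ipaddress.ip_address(sip)
        d = ipaddress.ip_address(dip)
        matches = [r for r in self._rules if s in r.src and d in r.dst]
        if not matches:
            # Assumption: with no policy hit, the caller uses the normal BGP route.
            return None
        # Assumption: the most specific source prefix wins, then destination.
        best = max(matches, key=lambda r: (r.src.prefixlen, r.dst.prefixlen))
        return best.next_hop, best.vni


policy = TePolicy()
# Sample values only: steer one source/destination pair to SDR2 over VNI 5001.
policy.add("10.0.1.0/24", "203.0.113.0/24", "SDR2", 5001)
```

Calling `policy.lookup("10.0.1.7", "203.0.113.9")` returns the `SDR2` next hop and its VNI, while traffic from any other source returns `None` and would follow the BGP route.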

| Flexibility - FW Service

• Supports >100k flex rules for FW purposes

<DIP> --> <FW, VNI>

<SIP> --> <FW, VNI>

[Diagram: External Router, EA, SDR (Data Plane), VxLAN Fabric, FW Service, and VPC.]

| Flexibility - DDoS Service

• Redirect attack traffic to DDoS service efficiently

• Data Plane only processes the real traffic

EA entries:

180.10.1.1/32, DDoS

0.0.0.0/0, DP

[Diagram: External Router, EA, SDR (Data Plane), DDoS Service, and VPC. Other label: BGP route 180.10.1.1/32.]
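
The two EA entries on this slide (`180.10.1.1/32, DDoS` and `0.0.0.0/0, DP`) work by longest-prefix match: the attacked host's /32 overrides the default route, so only that address is diverted. Here is a minimal Python sketch under stated assumptions: the `Fib` class and its `add`, `withdraw`, and `lookup` methods are hypothetical names, and the linear scan is for readability rather than how a production forwarding table would be built.

```python
import ipaddress
from typing import Dict, Optional


class Fib:
    """Illustrative forwarding table with longest-prefix-match lookup."""

    def __init__(self) -> None:
        self._routes: Dict[ipaddress.IPv4Network, str] = {}

    def add(self, prefix: str, target: str) -> None:
        self._routes[ipaddress.ip_network(prefix)] = target

    def withdraw(self, prefix: str) -> None:
        self._routes.pop(ipaddress.ip_network(prefix), None)

    def lookup(self, dip: str) -> Optional[str]:
        addr = ipaddress.ip_address(dip)
        best_net = None
        best_target = None
        for net, target in self._routes.items():
            # Keep the matching route with the longest prefix.
            if addr in net and (best_net is None or net.prefixlen > best_net.prefixlen):
                best_net, best_target = net, target
        return best_target


ea = Fib()
ea.add("0.0.0.0/0", "DP")        # default: everything goes to the Data Plane
ea.add("180.10.1.1/32", "DDoS")  # the attacked address goes to the DDoS service
```

In this sketch `ea.lookup("180.10.1.1")` selects the DDoS service and every other destination selects the Data Plane; withdrawing the /32 once an attack ends (an assumed workflow) sends that address back to the Data Plane.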

| Flexibility - Interoperability

• Interoperate with existing traditional routers

• Speed up deployment of SDR

[Diagram: External Router (eBGP), SDR (Routing Plane and Data Plane), MPLS Switch, MPLS Fabric, Existing Commodity Router, and VPC.]

| Scalability

• Each component scales independently

• Each network can be operated independently

• 3.2Tbps forwarding capacity

• NGW: Data Plane

• FCR: Routing Plane

• RNSO: Control Plane

• GNSO: Orchestrator

[Diagram: NGW, FCR, RNSO, and GNSO clusters, each attached to its own pair of AS switches beneath a layer of CS switches, with eBGP sessions between the switch layers.]

| Scalability - Hardware Acceleration

• Introduce programmable switch for hardware acceleration

• > 10Tbps forwarding capacity

[Diagram: External Router, EA, SDR (Control Plane and Data Plane), Tencent SmartSwitch, and VPC. Link labels: Elephant flow info, Flow offloading, Static flow info for high-volume traffic.]
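
The "Elephant flow info" and "Flow offloading" labels suggest that heavy flows are identified in software and then handed to the programmable switch. The slide does not say how flows are classified, so the following Python sketch only illustrates one common approach, a per-flow byte counter with a fixed threshold. The `ElephantDetector` class, the 5-tuple flow key, and the threshold value are assumptions, not details from the talk.

```python
from collections import defaultdict
from typing import Dict, Set, Tuple

# Hypothetical flow key: (src_ip, dst_ip, src_port, dst_port, protocol)
FlowKey = Tuple[str, str, int, int, int]


class ElephantDetector:
    """Counts bytes per flow and marks a flow for offload past a threshold."""

    def __init__(self, threshold_bytes: int) -> None:
        self.threshold_bytes = threshold_bytes
        self._bytes: Dict[FlowKey, int] = defaultdict(int)
        self.offloaded: Set[FlowKey] = set()

    def observe(self, flow: FlowKey, length: int) -> bool:
        """Account for one packet; return True when the flow is newly offloaded."""
        if flow in self.offloaded:
            return False  # already handled by the switch in this model
        self._bytes[flow] += length
        if self._bytes[flow] >= self.threshold_bytes:
            self.offloaded.add(flow)
            del self._bytes[flow]  # the software counter is no longer needed
            return True
        return False


# Sample values only: a tiny threshold so three packets are enough to show the effect.
detector = ElephantDetector(threshold_bytes=3000)
flow = ("10.0.0.5", "198.51.100.7", 40000, 443, 6)
events = [detector.observe(flow, 1500) for _ in range(3)]
```

In a real deployment the `True` result would trigger whatever mechanism installs the flow entry on the switch; that step is left out here because the slide gives no detail on it.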

| Reliability – NSF & NSR

• Single node failure will not affect the system

• Data Plane supports Non-Stop Forwarding (NSF)

• Routing Plane supports Non-Stop Routing (NSR)

[Diagram: ExternalRouter1 and ExternalRouter2 connect to a Routing System (RoutingPlane1 and RoutingPlane2), a Control System (Control Plane), and a Forwarding System (four NGW Data Plane nodes).]

| Operationality - Monitoring

• 3-level Data Plane probing: cluster-level, server-level, and core-level health checks

• Critical resources monitoring

• Various statistics and events

[Diagram: RMOS probes a Data Plane cluster made up of server0 and server1, each with core0 … corex.]
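
The three probing levels can be pictured as results rolling up from cores to servers to the cluster. The slide does not state how results are combined, so the Python sketch below uses an assumed policy: a server is healthy only if all of its cores pass, and the cluster is healthy if at least `min_healthy_servers` servers are healthy. The function names, the report layout, and the sample probe data are hypothetical.

```python
from typing import Dict, List

# Hypothetical input shape: server name -> {core name -> did the probe pass?}
ProbeResults = Dict[str, Dict[str, bool]]


def server_healthy(cores: Dict[str, bool]) -> bool:
    """Assumed rule: a server is healthy only if it has cores and all pass."""
    return bool(cores) and all(cores.values())


def cluster_report(probes: ProbeResults, min_healthy_servers: int = 1) -> dict:
    """Roll core-level probe results up to server level and cluster level."""
    servers = {name: server_healthy(cores) for name, cores in probes.items()}
    failed_cores: List[str] = sorted(
        f"{server}/{core}"
        for server, cores in probes.items()
        for core, ok in cores.items()
        if not ok
    )
    healthy_count = sum(servers.values())
    return {
        "cluster_ok": healthy_count >= min_healthy_servers,
        "servers": servers,
        "failed_cores": failed_cores,
    }


# Sample data only: one failed core on server1.
probes = {
    "server0": {"core0": True, "core1": True},
    "server1": {"core0": True, "core1": False},
}
report = cluster_report(probes)
```

With this sample input the report flags `server1/core1` and marks `server1` unhealthy, while the cluster as a whole still passes because one healthy server remains.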

| Operational Experience

• Move manual configurations to centralized orchestrator as much as possible.

• Provide robust “One-Click” operation to quickly turn off the whole system.

• Keep the message queues among different components reliable and efficient.

| Future Work

• End-to-End network quality detection and analysis system for different network layers

• Automatic traffic engineering based on more network metrics like latency, link utilization…

• Simulation and verification system to detect and fix abnormal behaviors in advance

| Conclusion

• Disaggregate functionalities into individual components

• High scalability of each component at each level

• Fast feature velocity via software programming

• Low Cost

[Diagram: switch, Data Plane, Routing Plane, Control Plane, and Orchestrator, each shown as multiple instances, with the labels Scalability and Flexibility.]

Thanks