
Lowering Cost-Per-Bit with 40G ATCA


DESCRIPTION

Discover how operators can meet the demands of exploding mobile video traffic while increasing profitability and driving an improved customer experience. Learn how ATCA-based platforms deliver increased efficiency, decreased delivery costs and improved performance. Presented by: Eric Gregory - Director, Platform Systems, Karl Wale - Director, Product Line Management and Jeff Sharpe, Sr. Product Marketing Manager

Page 1: Lowering Cost-Per-Bit with 40G ATCA

1 Radisys Corporation Confidential

Welcome!

June 12

Cloud Media Processing

June 19

Lowering Cost-Per-Bit with 40G ATCA

June 26

Breaking Ground: Fast Track to RAN-Aware Policy Enforcement

To register for the remaining webinars: http://go.radisys.com/Unlocking.html

Unlocking New Revenues: Optimize & Monetize Your LTE Infrastructure

Page 2: Lowering Cost-Per-Bit with 40G ATCA

Lowering Cost-Per-Bit with 40G ATCA

Speakers:

Eric Gregory, Director – Platform Management

Karl Wale, Director – Product Line Management

Jeff Sharpe – Sr. Product Line Manager

June 19, 2012

Page 3: Lowering Cost-Per-Bit with 40G ATCA

Agenda

Setting the Stage

Decreased Cost per Bit

Increasing Performance

• Packet Processing on x86

Increasing Market Velocity

• Embedded Solution Starter Kits

• DPI & Load Balancing in Policy & Monitoring Systems

Page 4: Lowering Cost-Per-Bit with 40G ATCA

[Diagram: LTE network architecture with Radisys solution areas]

Network domains: Radio Access Network, Evolved Packet Core, Policy Control, IMS (IP Multimedia Subsystem), Internet

Network elements shown: User Equipment, eNodeB, Home eNodeB, LTE Security Gateway, Mobility Management Entity, Serving Gateway, Packet Gateway, Policy & Charging Rules Function, Policy & Charging Enforcement Function, Application Server, Media Resource Function

Callouts:
• Macro / Small Cells: 60+ Customer Wins
• Audio / Video Conf: ~65% Market Share
• 10G / 40G ATCA: ~40% ATCA Market Share
• Dumb / Smart Pipes: Traffic Management

Radisys is the One-Stop Shop: Embedded Wireless Infrastructure Solutions

Page 5: Lowering Cost-Per-Bit with 40G ATCA

Radisys ATCA Platforms History of Success

Radisys is the #1 supplier of global telecom platforms and ATCA solutions

10-year history in ATCA platform innovation

Radisys’ T-Series family moves beyond pre-integrated, application-ready hardware platforms to provide a robust solution complete with software and services that Telecom Equipment Manufacturers (TEMs) can deploy now

Page 6: Lowering Cost-Per-Bit with 40G ATCA

Mobile Video Is Driving Change

Example of data growth driver…

• 2008 Beijing Olympics: 310,000 Streams
• 2010 Vancouver Olympics: 2.1M Streams
• 2012 London Olympics: 10M Streams?

Growth: >5X

Operators are upgrading the access network to keep up with demand, BUT… core network elements need a corresponding upgrade

TEMs need a platform that delivers …

1. Decreased cost-per-bit to help operators reduce their service delivery costs
2. Increased performance to keep pace with the huge up-tick in access speeds
3. Time-to-market improvements allowing TEMs to rapidly deliver new applications

Page 7: Lowering Cost-Per-Bit with 40G ATCA

Introducing Radisys’ T40 T-Series 40G ATCA Platform

Decreased cost per bit

• 40G platforms inherently decrease the cost of delivering traffic

• The reduction in cost per bit over 10G systems is ~50%

Increased performance

• Integration of multiple 40G technologies for end-to-end 40G performance

• Every component received an upgrade

• Widest selection of blades enabling TEMs to build any application

Increased market velocity

• T-Series platform software enables TEMs to focus on their applications

• Software suite includes platform management, switch redundancy and load balancing

The T-Series 40G platform (T40) is a telecom-grade system that is pre-integrated with the largest selection of blades, processors and software

Page 8: Lowering Cost-Per-Bit with 40G ATCA

Radisys T40 Platform Increased Performance

• High speed 40G chassis with improved cooling
• 640G aggregate throughput and 40G to every slot
• 160G of flexible I/O on the hub
• 28 Intel processors (488 Cores) with more than 1.5TB of memory in a chassis
• 6TB of Storage per slot
• Multiple NPU options
• Trillium protocol software
• T-Series platform software ties the hardware together into a platform

Page 9: Lowering Cost-Per-Bit with 40G ATCA

Agenda

Setting the Stage

Decreased Cost per Bit

Increasing Performance

• Packet Processing on x86

Increasing Market Velocity

• Embedded Solution Starter Kits

• DPI & Load Balancing in Policy & Monitoring Systems

Page 10: Lowering Cost-Per-Bit with 40G ATCA

It’s All About the Money, Money…

Page 11: Lowering Cost-Per-Bit with 40G ATCA

Let’s Start with Yesterday’s Technology

Assuming 5G worth of total throughput…

Each ATCA board supports 9,000 Video Sessions

Page 12: Lowering Cost-Per-Bit with 40G ATCA

A Full 10G ATCA Chassis

12 Node Boards in a 14-slot 10G chassis provide

• 60G worth of processing throughput
• 108,000 video sessions

Results in…

Page 13: Lowering Cost-Per-Bit with 40G ATCA

Let’s Upgrade to 40G

ATCA-2340

40G Switching

More I/O

Roughly 3X the performance at 1.3X the price

Need only 4 x 40G Node Blades to get the same capacity

Assumed the same price for Node Blades and Hub Blades

Need a 40G switch to carry the traffic

Page 14: Lowering Cost-Per-Bit with 40G ATCA

Same Throughput, Lower Price

4 x 40G Node Blades in a 14-slot 40G chassis provide

• 60G worth of processing throughput
• 108,000 video sessions

Provides

Page 15: Lowering Cost-Per-Bit with 40G ATCA

Let’s Load it Up

12 x 40G node blades in a 14-slot 40G chassis provide

• 180G worth of processing throughput
• 320,000 video sessions

Results in…
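
The chassis arithmetic on the preceding slides can be checked with a short sketch. It uses only the figures quoted there (5G and 9,000 video sessions per 10G board, and a 40G blade at roughly 3X the performance for 1.3X the price). The relative price of 1.0 for a 10G board and all names in the code are assumptions made for illustration. The sketch counts node blades only and ignores chassis, hub and switch costs, so its result is not the system-level ~50% figure quoted elsewhere in the deck.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Blade:
    name: str
    throughput_gbps: float   # processing throughput per blade
    video_sessions: int      # video sessions per blade
    relative_price: float    # price relative to one 10G board (assumed 1.0)


# Slide figures: a 10G board handles 5G / 9,000 sessions; a 40G blade is
# "roughly 3X the performance at 1.3X the price".
BLADE_10G = Blade("10G node board", 5.0, 9_000, 1.0)
BLADE_40G = Blade("40G node blade", 3 * 5.0, 3 * 9_000, 1.3)


def chassis_totals(blade: Blade, count: int) -> tuple[float, int, float]:
    """Return (throughput in G, video sessions, relative blade cost)."""
    return (blade.throughput_gbps * count,
            blade.video_sessions * count,
            blade.relative_price * count)


def cost_per_gbps(blade: Blade) -> float:
    """Relative blade price per G of processing throughput."""
    return blade.relative_price / blade.throughput_gbps


def blade_cost_reduction() -> float:
    """Fractional drop in blade cost per G when moving from 10G to 40G."""
    return 1.0 - cost_per_gbps(BLADE_40G) / cost_per_gbps(BLADE_10G)


if __name__ == "__main__":
    for blade, count in [(BLADE_10G, 12), (BLADE_40G, 4), (BLADE_40G, 12)]:
        print(count, "x", blade.name, chassis_totals(blade, count))
    print(f"blade-only cost-per-G reduction: {blade_cost_reduction():.0%}")
```

With an exact 3X multiplier, twelve 40G blades come to 324,000 sessions, which the slide states as 320,000. Four 40G blades match the 60G and 108,000 sessions of the full 10G chassis.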

Page 16: Lowering Cost-Per-Bit with 40G ATCA

Reduced Delivery Costs

Page 17: Lowering Cost-Per-Bit with 40G ATCA

Agenda

Setting the Stage

Decreased Cost per Bit

Increasing Performance

• Packet Processing on x86

Increasing Market Velocity

• Embedded Solution Starter Kits

• DPI & Load Balancing in Policy & Monitoring Systems

Page 18: Lowering Cost-Per-Bit with 40G ATCA

Simplification

TEMs have always tried to minimize the number of unique blades they support in their ATCA systems.

• Minimize the validation / engineering effort
• Reduce operation / supply chain cost & complexity

However, the data plane and control plane have different hardware requirements

• Control plane – mainly signaling; a general computing blade with minimum I/O
• Data plane – more complex processing and manipulation of packets & content, significant I/O, and many cores to process at high line rates

TEMs’ ATCA systems usually had switching, compute, packet processing, and sometimes DSP-based blades

Page 19: Lowering Cost-Per-Bit with 40G ATCA

Intel Processors & Radisys ATCA SBC

Intel “Tick Tock” model: Tick (Shrink), Tock (Innovate), 2 years per process node

• 45nm: Shrink / Derivative Harpertown - 2008; New Microarchitecture Nehalem - 2009
• 32nm: Shrink / Derivative Westmere - 2010; New Microarchitecture Sandy Bridge - 2012
• 22nm: Shrink / Derivative Ivy Bridge - 2013; New Microarchitecture Haswell

Radisys boards shown: A45XX and XE60/XE80; A46XX and XE100/XE100+

See the “Intel Architecture and Silicon Cadence” whitepaper: http://download.intel.com/technology/eep/cadence-paper.pdf

Radisys decided to follow Intel’s Tick Tock model through Ivy Bridge. This provides:

• Leading-edge performance for x86 technology
• Two generations of processors with a single HW implementation
• A cost-effective path for a performance jump with the same hardware

Page 20: Lowering Cost-Per-Bit with 40G ATCA

Increasing Performance & Bandwidth

[Chart: Relative SPECint_rate_base2006 (y-axis, 0 to 12) per board generation, with each board's fabric connection marked as 1GbE, 10GbE or 40GbE Fabric]

• A4300: DP 2C Sossaman (2005)
• A4400: DP 2C Woodcrest (2007)
• DP 4C Harpertown (2008)
• XE60/A4580: DP 4C L5518 (2009)
• XE80: DP 6C L5638 (2010)
• A4600/XE100: DP 8C E5-2448L/E5-2648L (2012)

Page 21: Lowering Cost-Per-Bit with 40G ATCA

Radisys SBC Designed for Packet Processing

In 2009 it was clear: during the Sandy Bridge timeframe, some customers would move their packet processing application to x86

XE100 – ideal for packet processing

• Dual 40GbE NIC on board
• Support for 80GbE to fabric and dual 10GbE to front
• RTM will support an additional 8 x 10GbE ports

CPM 9 – ideal for control plane with some packet processing

• Single 40GbE NIC to fabric
• Mezzanine card site based on MXM Standard, optionally supporting acceleration functions

Page 22: Lowering Cost-Per-Bit with 40G ATCA

New Software Models

Traditional control applications usually run on standard Linux

• Most OS distributions are focused on data center workloads and take advantage of the latest technologies for compute-intensive tasks
• There are limited requirements for handling more than 10GbE of traffic

Data plane applications are different

• Need to handle the high throughput requirements
• Traffic needs to be quickly moved from the NIC to the CPU
– At 40GbE line rate a 128 byte packet arrives every ~28ns
• The networking stack in standard OS distributions is not optimized for this type of workload

Two Options:

• Modify a standard Linux OS

• Or…….
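
The per-packet time budget quoted above can be estimated with a few lines of arithmetic. This is a minimal sketch: the function names are illustrative, and the 20-byte default overhead (8 bytes of preamble and start-of-frame delimiter plus the 12-byte minimum inter-frame gap) is standard Ethernet framing rather than a figure from the slide.

```python
def packet_interval_ns(frame_bytes: int, link_gbps: float,
                       overhead_bytes: int = 20) -> float:
    """Time between back-to-back frames at line rate, in nanoseconds.

    overhead_bytes defaults to 20: 8 bytes of preamble/start-of-frame
    delimiter plus the 12-byte minimum Ethernet inter-frame gap.
    """
    bits_on_wire = (frame_bytes + overhead_bytes) * 8
    return bits_on_wire / link_gbps   # bits / (Gbit/s) = nanoseconds


def packets_per_second(frame_bytes: int, link_gbps: float,
                       overhead_bytes: int = 20) -> float:
    """Maximum packet rate at line rate for a given frame size."""
    return 1e9 / packet_interval_ns(frame_bytes, link_gbps, overhead_bytes)


if __name__ == "__main__":
    print(packet_interval_ns(128, 40))                     # with framing overhead
    print(packet_interval_ns(128, 40, overhead_bytes=0))   # frame bits only
    print(f"{packets_per_second(128, 40) / 1e6:.1f} Mpps")
```

For a 128-byte frame at 40GbE this gives about 29.6 ns with framing overhead and 25.6 ns counting frame bits only, which brackets the ~28ns on the slide. Either way the CPU has only a few tens of nanoseconds per packet, which is the motivation for bypassing the standard networking stack.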

Page 23: Lowering Cost-Per-Bit with 40G ATCA

Radisys Support for Intel DPDK (aka Intel Data Plane Development Kit)

The Intel DPDK is a set of Data Plane libraries and optimized NIC drivers designed and optimized for packet processing on IA. It is offered as a standalone solution for integration with proprietary customer applications or as part of commercial data plane solutions from leading ecopartners.

Intel DPDK is a starting-point packet processing framework for customers. It is also integrated with fully-featured solutions delivered from Intel’s lead ecosystem partners.

[Diagram: Intel DPDK software stack on XE100 / A46XX]

• Customer / Eco Applications (three shown)
• Data Plane Libraries: Packet Flow Classification, Poll Mode Drivers, Queue Management, Buffer / Memory Mgmt
• Environment Abstraction Layer
• Linux or Bare Metal Execution Environment
• Hardware: XE100 / A46XX

Page 24: Lowering Cost-Per-Bit with 40G ATCA

Radisys Value Add to DPDK

Radisys is porting the Mellanox CX3 Poll Mode Driver (PMD) to DPDK

• Will continue to sustain the CX3 driver on DPDK through different DPDK drops
• Will add features and support new drops of the Mellanox FW

Provide a support channel for DPDK via a Technical Service Agreement

Ability to support the Trillium Fast Path protocols on DPDK

Continue to work with Intel Partners on DPDK

Page 25: Lowering Cost-Per-Bit with 40G ATCA

Reducing Overall Cost

Migration of data plane applications to Intel architecture will continue. Intel is solving the compute aspect with its CPU roadmap and DPDK

Radisys’ deep understanding of DPI and other data plane applications allowed us to solve the I/O problem with our Sandy Bridge-based blades

Our support and services allow customers to get to market quicker

Customers can reduce cost by migrating data plane applications that make sense to Radisys blades using Intel Architecture and DPDK

Page 26: Lowering Cost-Per-Bit with 40G ATCA

Audience Participation

When do you plan to move your packet processing

applications over to a DPDK-enabled platform?

• In process or already have

• 3-6 months

• 6-12 months

• Will remain with packet processing

Page 27: Lowering Cost-Per-Bit with 40G ATCA

Agenda

Setting the Stage

Decreased Cost per Bit

Increasing Performance

• Packet Processing on x86

Increasing Market Velocity

• Embedded Solution Starter Kits

• DPI & Load Balancing in Policy & Monitoring Systems

Page 28: Lowering Cost-Per-Bit with 40G ATCA

Blended Approach to Reduced Cost

Technology Cost Reductions

• Next generation silicon

• Alternatives : MIPS / IA / NPU

Velocity = Time to Market

• Platform vs Boards & Bits

• Solutions vs board / OS level

• OS : Linux / DPDK / Bare Metal

Velocity = Better Network…Faster

• Monitor, Manage, Monetize

• …this is end goal of above

• How can you get there faster ?

[Chart: Revenue vs. Traffic Growth across the Voice Era and the Data Era, with curves for Revenues and Traffic and the gap between them labeled “Options to close gap?” Source: Heavy Reading]

Page 29: Lowering Cost-Per-Bit with 40G ATCA

Load Balancing in Policy Solutions

[Diagram: two load balancing stages in front of Application Processing, with Flow Configuration & Management and Server weighting]

1st Stage Load Balancer: L4 / hash based load balancing

• Hub Switch
- FlowEngine for Switching
- 5 tuple based decision
- Hash directs to target ‘bucket’

2nd Stage Load Balancer (optional): Stateful / Application Aware Load Balancing

• CPU Based
- FlowEngine stateful load balancer
- Deeper parsing of L4 headers
- Application awareness (L7)
- Network awareness (GTP)

Application Processing

Page 30: Lowering Cost-Per-Bit with 40G ATCA

Creating Market Velocity

[Diagram: delivery options shown side by side: Modules & Solutions, Application Modules, FlowEngine Appliance and Vertically Integrated Application, backed by Services]

Building blocks shown: Switch SDK, Load Balancer, Stateful Load Balancer, Wireless Protocols, DPI Library, CPU SDK, Switch API/SNMP, Load Bal. Config. Mgmt, Application

Components named: FlowEngine, FlowEngine Load Balancer, Trillium, 6WIND, Qosmos, Vineyard

FlowEngine Appliance

- TCP/IP Re-direct, video gateway, content caching
- GTP load balancer, network probe, IOGW

Application

- Internet Offload GW, RAN aware network probe, Femto Gateway, PCEF, S/P Gateway

Services

Page 31: Lowering Cost-Per-Bit with 40G ATCA

L4 Switch-Based Load Balancing

Arriving Packets

5 Tuple Hash on User Packet…src IP, dest IP…..

Buckets assignment…blade-CPU-core-thread
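
The hash-to-bucket step above can be sketched as follows. This is an illustration only: the slides do not say which hash function the switch uses, so BLAKE2 from the Python standard library stands in for it, and the bucket labels, class and function names are hypothetical.

```python
import hashlib
import ipaddress
import struct
from typing import NamedTuple


class FiveTuple(NamedTuple):
    src_ip: str
    dst_ip: str
    src_port: int
    dst_port: int
    protocol: int  # IP protocol number, e.g. 6 = TCP, 17 = UDP


def normalized(flow: FiveTuple) -> FiveTuple:
    """Order the two endpoints so both directions of a flow hash alike."""
    lo, hi = sorted([(flow.src_ip, flow.src_port), (flow.dst_ip, flow.dst_port)])
    return FiveTuple(lo[0], hi[0], lo[1], hi[1], flow.protocol)


def flow_hash(flow: FiveTuple) -> int:
    """Deterministic 32-bit hash of the 5-tuple (hash choice is an assumption)."""
    packed = (ipaddress.ip_address(flow.src_ip).packed
              + ipaddress.ip_address(flow.dst_ip).packed
              + struct.pack("!HHB", flow.src_port, flow.dst_port, flow.protocol))
    return int.from_bytes(hashlib.blake2b(packed, digest_size=4).digest(), "big")


def pick_bucket(flow: FiveTuple, buckets: list[str]) -> str:
    """Map a flow to one target bucket (blade-CPU-core-thread)."""
    return buckets[flow_hash(flow) % len(buckets)]


if __name__ == "__main__":
    # Hypothetical bucket labels in blade-CPU-core-thread form.
    buckets = [f"blade{b}-cpu0-core{c}-thread0" for b in (3, 4) for c in range(4)]
    uplink = FiveTuple("10.0.0.5", "198.51.100.7", 40000, 80, 6)
    downlink = FiveTuple("198.51.100.7", "10.0.0.5", 80, 40000, 6)
    print(pick_bucket(normalized(uplink), buckets))
    print(pick_bucket(normalized(downlink), buckets))  # same bucket as uplink
```

Every packet of a flow lands in the same bucket because the hash depends only on the 5-tuple. Normalizing the tuple first is one way to keep both directions of a session on the same thread.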

Page 32: Lowering Cost-Per-Bit with 40G ATCA

FlowEngine: Stateful Load Balancer

[Diagram: a received packet passes through Pre-processing, Key Extraction and the FlowTable, all configured by a Management Application]

Management Application

- Configure parsing rules
- Set / Adjust FlowTable Entries

Pre-processing

- Pre-processor walks through packet to find embedded user packet
- Supports: MPLS, IP in IP (IPv4 & IPv6), GTP, L2TP/GRE, De-fragment

Key Extraction

- Set fields for hash
- Generates unique key for each session/flow

FlowTable

- FlowTable entry for each session, identified using L4 hash
- Supports: millions of flows, pre-config with mgmt interface, new flows dynamically added, mgmt interface to reset target CPU

Classifier function

- Initial LB rule set
- Supports: Src/Dest IP, Src/Dest port, Protocol, Ingress port #, GTP TEID, VLAN ID
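
The pre-processing and key-extraction steps can be illustrated with a small sketch. It assumes the packet has already been parsed into a list of headers, outermost first. The `Header` and `FlowKey` types, the field names and the tunnel markers are hypothetical and are not the FlowEngine API.

```python
from dataclasses import dataclass, field
from typing import Optional

# Header kinds treated as tunnels in this sketch (illustrative names).
TUNNEL_KINDS = {"mpls", "ipip", "gtp", "l2tp", "gre"}


@dataclass
class Header:
    kind: str                                   # e.g. "vlan", "ip", "udp", "gtp", "tcp"
    fields: dict = field(default_factory=dict)


@dataclass(frozen=True)
class FlowKey:
    src_ip: str
    dst_ip: str
    src_port: Optional[int]
    dst_port: Optional[int]
    protocol: int
    ingress_port: int
    gtp_teid: Optional[int]
    vlan_id: Optional[int]


def user_packet(headers: list[Header]) -> list[Header]:
    """Skip past the innermost tunnel header to reach the embedded user packet."""
    last_tunnel = max((i for i, h in enumerate(headers) if h.kind in TUNNEL_KINDS),
                      default=-1)
    return headers[last_tunnel + 1:]


def extract_key(headers: list[Header], ingress_port: int) -> FlowKey:
    """Build a hashable per-flow key from the user packet plus outer context."""
    inner = user_packet(headers)
    ip = next(h for h in inner if h.kind == "ip")
    l4 = next((h for h in inner if h.kind in ("tcp", "udp")), None)
    teid = next((h.fields["teid"] for h in headers if h.kind == "gtp"), None)
    vlan = next((h.fields["id"] for h in headers if h.kind == "vlan"), None)
    return FlowKey(
        src_ip=ip.fields["src"], dst_ip=ip.fields["dst"],
        src_port=l4.fields["sport"] if l4 else None,
        dst_port=l4.fields["dport"] if l4 else None,
        protocol=ip.fields["proto"], ingress_port=ingress_port,
        gtp_teid=teid, vlan_id=vlan,
    )


if __name__ == "__main__":
    gtp_packet = [
        Header("vlan", {"id": 100}),
        Header("ip", {"src": "192.0.2.1", "dst": "192.0.2.2", "proto": 17}),
        Header("udp", {"sport": 2152, "dport": 2152}),
        Header("gtp", {"teid": 0x1234}),
        Header("ip", {"src": "10.0.0.5", "dst": "198.51.100.7", "proto": 6}),
        Header("tcp", {"sport": 40000, "dport": 80}),
    ]
    print(extract_key(gtp_packet, ingress_port=1))
```

Because `FlowKey` is a frozen dataclass it is hashable, so it can serve directly as the lookup key of a flow table keyed per session.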

Page 33: Lowering Cost-Per-Bit with 40G ATCA

FlowEngine: Packet Path (Known Flow)

[Diagram: packet path with the known-flow branch highlighted]

• Stage 1: De-tunnel
• Stage 2: L4 Header ID
• Stage 3: Flow Table
• Stage 4: Known Flow, Apply Rule (Unknown flows go to the Classification Engine)
• Stage 5: Application processing
- Rate shaping
- Monitoring / Analysis
- Video compression
- Policy enforcement

Page 34: Lowering Cost-Per-Bit with 40G ATCA

FlowEngine: Packet Path (New Flow)

[Diagram: packet path with the new-flow branch highlighted]

• Stage 1: De-tunnel
• Stage 2: L4 Header ID
• Stage 3: Flow Table
• Stage 4: Unknown, Classification Engine (Known flows go straight to Apply Rule)
• Stage 5: Application processing
- Rate shaping
- Monitoring / Analysis
- Video compression
- Policy enforcement
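
The difference between the known-flow and new-flow paths comes down to a flow table lookup with a classification fallback. The sketch below shows that control flow only. The class, the rule labels and the toy classifier are hypothetical and do not represent the FlowEngine implementation.

```python
from typing import Callable, Hashable

Rule = str  # illustrative rule labels such as "http" or "default"


def classify_by_payload(key: Hashable, payload: bytes) -> Rule:
    """Toy classification engine: label a flow from its first payload bytes."""
    if payload.startswith((b"GET ", b"POST ")):
        return "http"
    return "default"


class FlowEngineSketch:
    def __init__(self, classify: Callable[[Hashable, bytes], Rule]):
        self.flow_table: dict[Hashable, Rule] = {}
        self.classify = classify
        self.hits = 0
        self.misses = 0

    def handle(self, key: Hashable, payload: bytes) -> Rule:
        """Return the rule to apply to this packet."""
        rule = self.flow_table.get(key)
        if rule is None:
            # New flow: no hit in the flow table, so classify and add an entry.
            self.misses += 1
            rule = self.classify(key, payload)
            self.flow_table[key] = rule
        else:
            # Known flow: apply the stored rule without re-classifying.
            self.hits += 1
        return rule


if __name__ == "__main__":
    engine = FlowEngineSketch(classify_by_payload)
    print(engine.handle("flow-a", b"GET /video.mp4 HTTP/1.1"))  # new flow
    print(engine.handle("flow-a", b"more bytes"))               # known flow
    print(engine.hits, engine.misses)
```

Only the first packet of a flow pays the classification cost. Later packets take the short path through the table, which is what makes the known-flow path fast.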

Page 35: Lowering Cost-Per-Bit with 40G ATCA

Classification State Machine

[Diagram: Classification Engine, L7 based example]

• No hit in flow table: …arriving packets enter the Classification State Machine
• Extract information from flows…. (HTTP, GMAIL, Metadata)
• Buffered Packets
• Add new entry by default… or wait for application
• State Machine API: application adds table entry & rule
• Management Application: set rules for each flow identified
• Apply Rule, e.g. add VLAN
• User Application, Server Load

Page 36: Lowering Cost-Per-Bit with 40G ATCA

Example – Video Gateway

Video re-direct solution required

• Analyze packets for matching flows: HTTP / Video
• Intercept and re-direct TCP session to local server
• Local CPU compresses video to reduce bandwidth demands

Solution

• FlowEngine stateful load balancer
• Configuration API, flow table, DPI state machine with TCP intercept / re-direct function
• Delivered as application embedded on CPU blade

Other applications

• Technology also suitable for content caching applications

Page 37: Lowering Cost-Per-Bit with 40G ATCA

Example – GTP Load Balance

Solution required to analyze and forward matching GTP sessions

• Identify and load balance GTP sessions
• Correlate GTP-C and GTP-U messages
• Analyze traffic and forward to back-end CPU blades/servers

Solution

• Leverage FlowEngine tools and Trillium stacks to build stateful load balancer
• Add GTP analysis functions
• Package and deliver as appliance on blade / platform

Other applications

• Internet offload gateways, network probes, enhanced PCEF

Page 38: Lowering Cost-Per-Bit with 40G ATCA

Where Radisys Can Assist

Increased Performance

• 40G platform (vs 10G) is key to keeping up with increased access speeds

• All the latest silicon to deliver the highest performance

– DPDK is a key driver

• High-density I/O to deliver the necessary connectivity

Decreased Cost-per-Bit

• 40G platform provides ~50% cost-per-bit decrease over today’s 10G systems

Improved Time-to-Market

• Radisys T-Series platforms come pre-loaded with software that enables TEMs to immediately start application development

• DPI capabilities deliver cost saving opportunities

• Active & strong ecosystem

• Higher integration of load balancing solutions delivers market velocity and development savings

And we have other tools to help in other areas…

Page 39: Lowering Cost-Per-Bit with 40G ATCA

Q&A

Contact us!

Eric Gregory

[email protected]

John Long

[email protected]

Karl Wale

[email protected]

~ Please fill out our short survey ~

THANK YOU FOR ATTENDING!