The Next Step of OpenStack Evolution for NFV Deployments

Dirk Kutscher, NEC
Chris Wright, Red Hat

Page 2 © NEC Corporation 2015

Intro

▌Dirk
• Chief Researcher for Networking @ NEC Laboratories Europe
• SDN Architect
• IRTF Information-Centric Networking RG
• OPNFV TSC

▌Chris
• Chief Technologist @ Red Hat
• Linux developer
• Cloud, KVM, network virtualization
• OpenDaylight & OPNFV Board


NEC – Communications and IT Solutions

▌Cloud Infrastructure

▌Telecom networks and services

▌World's first commercial LTE deployment

▌World's first commercial vEPC deployment

▌Linux- (and generally OSS)-based product range


NEC NFV Platform


NEC NFV Solutions History

Linux-based ATCA systems

First generation of virtualized systems with proprietary resource manager

OpenStack-based VIM and orchestration systems

OPNFV-based solutions


Relevant Upstream Projects

Linux Kernel

KVM

Open vSwitch

DPDK

Libvirt

OpenStack

Neutron

Nova


Red Hat Upstream Leadership

❖ Red Hat has a nearly 20-year history in open source; we have the experience and resources to:
➢ Support production-ready customers globally
➢ Drive new features
➢ Influence strategy and direction of the project
➢ Enable partner collaboration

❖ Wide-ranging participation, in contrast with most others, who are more narrowly focused

❖ All of these efforts allow us to create an enterprise-grade distribution with the ecosystem, lifecycle, and support that customers expect from Red Hat


Red Hat Product Mapping

Linux Kernel

KVM

Open vSwitch

DPDK

Libvirt

OpenStack

Neutron

Nova

RHEL w/KVM

RHEL OSP

Virt stack (QEMU + libvirt)

OVS + DPDK

SDN Controller

Ceph

CloudForms

Compute Network Storage Management


Working together inside OPNFV on Requirements

Virtual Network Functions

Orchestration and Management

Continuous Build and Integration

Continuous Deployment and Testing

NFV/Platform Requirements

Upstream/Partner Projects

Compute Storage Network

Octopus/CI

Bootstrap/GetStarted

Pharos compliant lab …

FuncTest

QTip

…

Doctor

Promise

…

OVS

KVM

OpenStack

…

OpenDaylight

…


Requirements from a Telecommunications Perspective

▌General objectives for NFV-based networks
1. Automation (deployment, life-cycle management, elasticity)
2. Flexibility: adding/removing new features fast
3. Cost efficiency (consolidation of functions onto fewer physical boxes)

Specific Requirements

▌Availability and Fault Management
• Faults can happen
• Detect root causes reliably and react quickly
• Minimize downtime

▌Performance
• Balance virtualization with optimal resource usage

▌Multi-domain operation
• Extend NFV domains across DC boundaries


Work Items for OpenStack

No. | Work Item | Example
1 | Detecting and Notifying about Hardware Failures | Reporting HW failures to Guest layer to initiate application failover
2 | Collecting Information and Configuring VM Allocation | Correlation between vCPU and pCPU/NIC for pinning configuration
3 | Multi-domain orchestration | Multiple NFV domain interworking across DC networks
4 | OpenStack (Controller Node) Availability | Support controller node failover, isolation of controller node failure from VM operation
5 | Physical Server Scale-out | Automatic PM set up including installation of agent software
6 | Live System Upgrade | Update mechanism minimizing impact to others
7 | VM connectivity | VLAN tagging usage, mapping to dedicated physical NIC to each virtual NW
8 | VM Control Commands | VM shutdown and reboot control from outside


WI-O01: Detecting and Notifying about Hardware Failures

▌Infrastructure Failures
• Can and will always happen
• Want to avoid impact on (critical) service availability

▌ATCA approach
• Standby components
• Intensive monitoring
• Monitoring and management blades per box
• Integration into carriers' network management infrastructure

▌NFV and Cloud approach
• Have to maintain service availability levels
• Want to find appropriate telemetry and reaction approach ...
• ... without losing benefits of virtualization and automation



▌Physical Machine Failure
• Failure of devices: CPU, Memory, Disks (IDE, SCSI, SAS), IPMB Bus, Fan, Chipset, etc.
• Device warning: Temperature Anomaly, Abnormal Voltage, etc.
• System error: Kernel, File System, Block Device, Boot, etc.
• State problem (notification): NIC Link, M-State, etc.

▌Chassis Failure
• EM Card Error and Warning, Switch Module Failure, etc.

▌Storage Failure
• Controller, Physical DK, Logical DK, Power, FAN, Battery, Monitoring Bus, Bus between shared DK, etc.

▌LAN Redundancy Error
• Problems reported in Health Check: LAN Port Error, Communication Error, etc.



Mid WI-K03

[Figure: ETSI NFV reference architecture. OSS/BSS; EMS 1, EMS 2, EMS 3; VNF 1, VNF 2, VNF 3; NFVI (Virtual Computing, Virtual Storage, Virtual Network over a Virtualisation Layer and Computing, Storage and Network Hardware resources); NFV Management and Orchestration (Orchestrator, VNF Manager(s), Virtualised Infrastructure Manager(s)); Service, VNF and Infrastructure Description. Reference points: Or-Vi, Or-Vnfm, Vi-Vnfm, Os-Ma, Se-Ma, Ve-Vnfm, Nf-Vi, Vn-Nf, Vl-Ha (execution, main NFV and other reference points). Annotations: Option 1; Option 2; Notify H/W failure; Execute recovery action(s) (e.g. deactivate VNFC, recreate VM); Execute recovery action(s) (e.g. switch over)]

How do HV hosts notify or relay H/W failures to the Guest OS? What is the appropriate notification I/F?

Report HW failure to app (VNF instance) to initiate application failover
1. Report HW failures directly from hypervisor to VMs
2. Detect HW failures and report them to an orchestrator like Heat
3. Use existing monitoring solutions, e.g., Zabbix, Nagios



▌Option 1: reporting from HV to VM
Relay error notification
• Relay a NIC error by setting tap devices down (by Neutron L2 Plugin Agent)
• Emulate error as Machine Check Exceptions (MCE)
Use qemu-guest-agent to send commands from HV to VM(s)
• Requires extra packages to be added to guest OS

▌Option 2: reporting to orchestrator
Detect H/W failures (e.g. abnormal CPU temperature) by Ceilometer agent(s) and report them to an orchestrator like Heat

▌Option 3: Zabbix or Nagios
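As a rough illustration of the first relay mechanism in Option 1 (setting tap devices down when a NIC fails), the sketch below only builds the iproute2 commands an agent might run; it does not execute them. The NIC-to-tap mapping, the device names and the function name are hypothetical and not taken from Neutron.

```python
from typing import Dict, List

# Hypothetical mapping: physical NIC -> tap devices of the guests that use it.
# A real agent would derive this from its own port database.
NIC_TO_TAPS: Dict[str, List[str]] = {
    "eth0": ["tap-vm1-0", "tap-vm2-0"],
    "eth1": ["tap-vm7-0"],
}


def relay_nic_failure(failed_nic: str,
                      nic_to_taps: Dict[str, List[str]]) -> List[List[str]]:
    """Return the `ip link` commands that would set the affected taps down.

    The guest then sees a link-down event on its virtual NIC, which is the
    relay effect described in Option 1.
    """
    taps = nic_to_taps.get(failed_nic, [])
    return [["ip", "link", "set", "dev", tap, "down"] for tap in taps]


if __name__ == "__main__":
    for argv in relay_nic_failure("eth0", NIC_TO_TAPS):
        print(" ".join(argv))
```

Returning argument lists rather than running them keeps the sketch side-effect free; a caller could pass each list to `subprocess.run` on a host where that is appropriate.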



▌Status as of Kilo / April 2015

Option 2
• Ceilometer Performance Improvements
– Database data TTL (Juno)
» https://blueprints.launchpad.net/ceilometer/+spec/db-ttl
» https://review.openstack.org/#/c/30635/
– Support Time To Live on Event Database
» https://blueprints.launchpad.net/ceilometer/+spec/event-database-ttl
» https://review.openstack.org/#/c/153943/
» https://review.openstack.org/#/c/146367/
– Time Series Database (Gnocchi)
» https://wiki.openstack.org/wiki/Gnocchi
• OPNFV Doctor project identified requirements
• Russell Bryant's blog post
– http://blog.russellbryant.net/2015/03/10/the-different-facets-of-openstack-ha/





WI-O02: Collecting Information and Configuring VM Allocation

▌Performance requirements for virtualized carrier networks

[Figure: Compute node with two sockets, each with local memory. Socket #0 is NUMA Node 0 and Socket #1 is NUMA Node 1; each socket has Core ID #0 and Core ID #1 with two threads per core. Socket #0: CPU #0 and #4 (Core ID #0), CPU #1 and #5 (Core ID #1). Socket #1: CPU #2 and #6 (Core ID #0), CPU #3 and #7 (Core ID #1)]



Compute Resource Management

Requirements:
A) Collect information of H/W resources
B) Configure VM allocation (e.g. specify pCPU as scheduler hint)
C) Allocating physical resources to specific VMs
1. CPU pinning
2. RAM allocation
3. NIC: Mapping to dedicated physical NIC to each virtualized network

[Figure: CPU Architecture of Compute node. Socket #0 (Node 0) with CPU #0, #4, #1, #5; Socket #1 (Node 1) with CPU #2, #6, #3, #7]

CPU Resource Management Schema

CPU | Node | Core ID | Status
0 | 0 | 0 | VM0-vCPU0
1 | 0 | 1 |
2 | 1 | 0 | VM0-vCPU1
3 | 1 | 1 |
4 | 0 | 0 | VM1-vCPU0
5 | 0 | 1 |
6 | 1 | 0 | disable (Reserve for Host OS)
7 | 1 | 1 |

Memory Resource Management Schema

Node | Huge page size | Total pages | Available pages
0 | 2M | 80 | 40
1 | 1G | 2 | 0
1 | 2M | 40 | 40
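One way to picture the CPU and memory resource management schemas is as two small in-memory tables that a scheduler queries. The sketch below copies the rows from the slide; the class and function names are hypothetical, and treating an empty Status as "available" is an assumption, since the slide does not say so explicitly.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


@dataclass
class CpuEntry:
    cpu: int
    node: int
    core_id: int
    status: Optional[str]  # None = empty Status cell, assumed to mean available


# Rows of the CPU Resource Management Schema.
CPU_TABLE: List[CpuEntry] = [
    CpuEntry(0, 0, 0, "VM0-vCPU0"),
    CpuEntry(1, 0, 1, None),
    CpuEntry(2, 1, 0, "VM0-vCPU1"),
    CpuEntry(3, 1, 1, None),
    CpuEntry(4, 0, 0, "VM1-vCPU0"),
    CpuEntry(5, 0, 1, None),
    CpuEntry(6, 1, 0, "disable (Reserve for Host OS)"),
    CpuEntry(7, 1, 1, None),
]

# Memory Resource Management Schema:
# (node, huge page size) -> (total pages, available pages)
HUGE_PAGES: Dict[Tuple[int, str], Tuple[int, int]] = {
    (0, "2M"): (80, 40),
    (1, "1G"): (2, 0),
    (1, "2M"): (40, 40),
}


def free_cpus_by_node(table: List[CpuEntry]) -> Dict[int, List[int]]:
    """Group CPUs with no status by NUMA node."""
    free: Dict[int, List[int]] = {}
    for entry in table:
        free.setdefault(entry.node, [])
        if entry.status is None:
            free[entry.node].append(entry.cpu)
    return free


def nodes_with_pages(pages: Dict[Tuple[int, str], Tuple[int, int]],
                     size: str, needed: int) -> List[int]:
    """Return the NUMA nodes that still have `needed` pages of `size`."""
    return sorted(node for (node, page_size), (_total, available)
                  in pages.items()
                  if page_size == size and available >= needed)


if __name__ == "__main__":
    print(free_cpus_by_node(CPU_TABLE))
    print(nodes_with_pages(HUGE_PAGES, "2M", 40))
```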



Compute Resource Allocation

1. User sets "Resource Control Level" for VMs

Resource control level | CPU pinning | Avoid crossing NUMA node | Avoid sharing physical core
0 | disable | disable | disable
1 | enable | disable | disable
2 | enable | enable | disable
3 | enable | disable | enable
4 | enable | enable | enable

2. Orchestrator allocates compute resources to VM according to "Resource Control Level"

[Figure: example placements of VM1 and VM2 on a host with NUMA Node 0 and NUMA Node 1. Avoid crossing NUMA node: VM2 may span both nodes at level 1 or 3 and is kept within one node at level 2 or 4. Avoid sharing physical core: VM2 may share a physical core with another VM at level 1 or 2; at level 3 or 4, VM2's vCPUs cannot share a CPU core with another VM's vCPUs. Legend: Virtual CPU, Available CPU, Assigned CPU, Blocked CPU]
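The resource control levels can be read as three boolean policies that constrain which pCPUs an orchestrator may pick. The sketch below is one possible interpretation, not the orchestrator's actual algorithm: the `Policy` class, the `allocate` function and the first-fit selection are assumptions, and the host topology reuses the two-socket, two-threads-per-core layout from the compute node figure.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Set, Tuple


@dataclass(frozen=True)
class Policy:
    cpu_pinning: bool
    avoid_crossing_numa: bool
    avoid_sharing_core: bool


# Resource control levels 0 to 4, as in the table on the slide.
LEVELS: Dict[int, Policy] = {
    0: Policy(False, False, False),
    1: Policy(True, False, False),
    2: Policy(True, True, False),
    3: Policy(True, False, True),
    4: Policy(True, True, True),
}

# pCPU id -> (NUMA node, core id); CPUs 0/4, 1/5, 2/6, 3/7 are sibling threads.
TOPOLOGY: Dict[int, Tuple[int, int]] = {
    0: (0, 0), 1: (0, 1), 2: (1, 0), 3: (1, 1),
    4: (0, 0), 5: (0, 1), 6: (1, 0), 7: (1, 1),
}


def allocate(level: int, n_vcpus: int, topology: Dict[int, Tuple[int, int]],
             used: Set[int]) -> Optional[Tuple[List[int], List[int]]]:
    """First-fit pCPU selection; returns (pinned pCPUs, blocked siblings)."""
    policy = LEVELS[level]
    if not policy.cpu_pinning:
        return None  # level 0: vCPUs are not pinned at all

    free = sorted(cpu for cpu in topology if cpu not in used)
    if policy.avoid_sharing_core:
        # Skip cores that already host another VM's vCPU on a sibling thread.
        busy_cores = {topology[cpu] for cpu in used}
        free = [cpu for cpu in free if topology[cpu] not in busy_cores]

    if policy.avoid_crossing_numa:
        chosen = None
        for node in sorted({topology[cpu][0] for cpu in free}):
            on_node = [cpu for cpu in free if topology[cpu][0] == node]
            if len(on_node) >= n_vcpus:
                chosen = on_node[:n_vcpus]
                break
        if chosen is None:
            raise ValueError("no single NUMA node has enough free pCPUs")
    else:
        if len(free) < n_vcpus:
            raise ValueError("not enough free pCPUs")
        chosen = free[:n_vcpus]

    blocked: List[int] = []
    if policy.avoid_sharing_core:
        # Unused sibling threads of the chosen cores must stay unassigned.
        chosen_cores = {topology[cpu] for cpu in chosen}
        blocked = [cpu for cpu in free
                   if cpu not in chosen and topology[cpu] in chosen_cores]
    return chosen, blocked


if __name__ == "__main__":
    vm1_cpus = {0, 1}  # another VM already pinned to pCPUs 0 and 1
    for level in range(5):
        print(level, allocate(level, 3, TOPOLOGY, vm1_cpus))
```

The second element of the result corresponds to the "Blocked CPU" entries in the legend: threads that remain idle so that no other VM lands on the same physical core.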



Upstream development status

▌Virt driver guest vCPU topology configuration (Implemented in Juno)
[BP] https://blueprints.launchpad.net/nova/+spec/virt-driver-vcpu-topology
This feature aims to give users the ability to control the vCPU topology through flavor and image metadata.

▌Virt driver guest NUMA node placement & topology (Implemented in Kilo)
[BP] https://blueprints.launchpad.net/nova/+spec/virt-driver-numa-placement
This feature aims to enhance the libvirt driver to be able to do intelligent NUMA node placement for guests.

▌Virt driver pinning guest vCPUs to host pCPUs (Implemented in Kilo)
[BP] https://blueprints.launchpad.net/nova/+spec/virt-driver-cpu-pinning
User can specify preferred and max counts of sockets, cores and threads

▌Virt driver large page allocation for guest RAM (Implemented in Kilo)
[BP] https://blueprints.launchpad.net/nova/+spec/virt-driver-large-pages

▌I/O (PCIe) Based NUMA Scheduling (Implemented in Kilo)
[BP] https://blueprints.launchpad.net/nova/+spec/input-output-based-numa-scheduling
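These Nova features are typically requested through flavor extra specs. The sketch below assembles an `openstack flavor set` command line from a set of `hw:` properties as documented for Nova's libvirt driver; the flavor name `nfv.small`, the chosen values and the helper function are illustrative assumptions, and the exact set of supported keys depends on the release in use.

```python
from typing import Dict, List

# Flavor extra specs relating to the blueprints listed on this slide.
NFV_EXTRA_SPECS: Dict[str, str] = {
    "hw:cpu_policy": "dedicated",  # pin guest vCPUs to host pCPUs
    "hw:numa_nodes": "1",          # place the guest on a single NUMA node
    "hw:mem_page_size": "large",   # back guest RAM with large pages
    "hw:cpu_sockets": "1",         # preferred guest vCPU topology
    "hw:cpu_cores": "2",
    "hw:cpu_threads": "2",
}


def flavor_set_command(flavor: str, extra_specs: Dict[str, str]) -> List[str]:
    """Build the argument list for `openstack flavor set` (not executed)."""
    argv = ["openstack", "flavor", "set"]
    for key, value in sorted(extra_specs.items()):
        argv += ["--property", f"{key}={value}"]
    argv.append(flavor)
    return argv


if __name__ == "__main__":
    # "nfv.small" is a hypothetical flavor name.
    print(" ".join(flavor_set_command("nfv.small", NFV_EXTRA_SPECS)))
```

Keeping the properties in one dictionary makes it easy to review which placement guarantees a given VNF flavor actually asks for.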


OPNFV Doctor: Fault management use case

[Figure: Consumers C1, C2 and C3 sit above the OpenStack Northbound Interface of the Virtualized Infrastructure Manager (VIM), e.g. OpenStack. The VIM holds a Resource Map for a Resource Pool of Hardware Servers S1, S2 and S3, each running a hypervisor. VM-1 and VM-2 run on S1, VM-7 on S2 and VM-4 on S3. A fault (X) occurs in the resource pool.]

Resource Map

Server – VM mapping
Server S1: VM-1, VM-2
Server S2: VM-7
Server S3: VM-4

Ownership information
VM-1, VM-7: Consumer C1
VM-2: Consumer C2
VM-4: Consumer C3

1. Fault Monitoring
- Hardware fault
- Hypervisor fault
- Host OS fault
2. Inform the Consumer? If YES, find owner of affected VMs from database
3. FaultNotification(VM ID, Fault ID)
4. Switch to SBY configuration
5. Instruction(VM ID)
6. Execute Instruction, e.g. migrate VM

• VIM cannot detect certain NFVI faults; detecting these faults and notifying the Consumer is necessary to ensure the proper functioning of EPC VNFs like MME and S/P-GW
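Steps 2 and 3 of the fault management use case amount to a lookup in the resource map followed by one notification per affected VM. The sketch below uses the server, VM and consumer names from the slide; the `FaultNotification` tuple layout, the `notify_fault` function and the fault identifier are hypothetical and do not represent an OpenStack or Doctor API.

```python
from typing import Dict, List, NamedTuple

# Resource map from the slide.
SERVER_TO_VMS: Dict[str, List[str]] = {
    "S1": ["VM-1", "VM-2"],
    "S2": ["VM-7"],
    "S3": ["VM-4"],
}
VM_OWNER: Dict[str, str] = {
    "VM-1": "C1",
    "VM-7": "C1",
    "VM-2": "C2",
    "VM-4": "C3",
}


class FaultNotification(NamedTuple):
    consumer: str
    vm_id: str
    fault_id: str


def notify_fault(server: str, fault_id: str,
                 server_to_vms: Dict[str, List[str]] = SERVER_TO_VMS,
                 vm_owner: Dict[str, str] = VM_OWNER) -> List[FaultNotification]:
    """Map a faulty server to one notification per affected VM and owner."""
    return [FaultNotification(vm_owner[vm], vm, fault_id)
            for vm in server_to_vms.get(server, [])]


if __name__ == "__main__":
    # "hw-fault-1" is an illustrative fault identifier.
    for note in notify_fault("S1", "hw-fault-1"):
        print(note)
```

The consumer receiving a notification would then decide on the recovery action (steps 4 and 5), for example switching to a standby configuration before instructing the VIM to migrate the VM.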


OPNFV Doctor: Maintenance use case

[Figure: Consumers C1, C2 and C3 sit above the OpenStack Northbound Interface of the Virtualized Infrastructure Manager (VIM), e.g. OpenStack. The VIM holds a Resource Map for a Resource Pool of Hardware Servers S1, S2 and S3, each running a hypervisor. VM-1 and VM-2 run on S1, VM-7 on S2 and VM-4 on S3. An Administrator sends the maintenance request to the VIM.]

Resource Map

Server – VM mapping
Server S1: VM-1, VM-2
Server S2: VM-7
Server S3: VM-4

Ownership information
VM-1, VM-7: Consumer C1
VM-2: Consumer C2
VM-4: Consumer C3

1. MaintenanceRequest(Server S3) from the Administrator
2. Which VMs are affected? Find Consumer owning the VM(s) from the database.
3. MaintenanceNotification(VM ID)
4. Switch to SBY configuration
5. Instruction(VM ID)
6. Execute Instruction, e.g. migrate VM

• VIM needs to receive maintenance instructions from the Consumer, i.e. the operator/administrator of the VNF


Doctor: Two Blueprints So Far

• Notification Alarm Evaluator
– https://review.openstack.org/#/c/172893/

• New nova API call to mark nova-compute down
– https://blueprints.launchpad.net/python-novaclient/+spec/support-force-down-service

https://wiki.opnfv.org/doctor


Summary and Outlook

▌Open Source NFV infrastructure is vital for achieving agile development of robust, high-performance and open solutions

▌NFV platform spans multiple Open Source projects

▌Red Hat and NEC: upstream-first approach

▌OpenStack Telco WG for developing/analyzing use cases within the OpenStack community

▌OPNFV: implementing the ETSI NFV framework, developing new requirements with an upstream-first approach


Thank you
