NFV Reality & Assuring Services on Hybrid Networks

A Heavy Reading white paper produced for SevOne Inc.

AUTHOR: SANDRA O'BOYLE, SENIOR ANALYST, HEAVY READING

Independent market research and competitive analysis of next-generation business and technology solutions for service providers and vendors




© HEAVY READING | MARCH 2017 | NFV REALITY & ASSURING SERVICES ON HYBRID NETWORKS 2

INTRODUCTION

Heavy Reading's research shows that every operator is interested in network function virtualization (NFV) in order to benefit from virtualizing network functions and running a software-defined network (SDN) on cloud infrastructure. Half of global operators (51 percent) have already started executing NFV, one third (34 percent) are in the process of defining an NFV strategy, and the remaining 15 percent have either defined plans or are still thinking about it.

Operators are making progress on virtualizing the mobile packet core, IP Multimedia Subsystem (IMS) and virtual customer premises equipment (vCPE), but not all critical network functions will be virtualized as quickly as expected, or in the cloud-native way operators want. Building virtual network functions (VNFs) from microservices running in containers on a cloud-native architecture, with standard, reusable components that are highly available and hyper-scalable, is a worthy goal, but one that will take time to achieve.

The NFV journey continues, but operators are now exploring how to operate hybrid virtual and physical networks, and how to manage services in a common operating model across the different network layers and domains. Service assurance plays a critical role in maintaining the consistency of performance metrics for services as the virtual and physical elements of the service change over time. The dynamic nature of NFV requires new approaches to assurance. Operators are looking at NFV management and orchestration (MANO) and new requirements for assuring services end-to-end across a hybrid physical and virtual network infrastructure.

Today's siloed approach to operations support systems (OSS) carries a significant cost burden. Operators' legacy OSSs are often composed of thousands of vendor-, technology- and service-specific network management tools. Manual processes are required to knit these disparate tools together. Greater automation would enable telcos to reduce the cost burden of OSS, which accounts for around 15 percent of their total operating expenses today.

This paper examines the main NFV management concerns of operators, what's required to monitor and assure services on hybrid virtual networks at scale, and the types of solutions and features to consider.

OPERATOR NFV MANAGEMENT CONCERNS

• Operators consider high service availability to be more important than high hardware availability.

• 70 percent of operators say they want consistent cloud management platforms for managing NFV and non-NFV cloud applications.

• In order to deliver a consistent quality of experience, operators need visibility into how everything in the cloud network environment is working together – from the bare metal server to the application itself.

• There is data in every layer that makes up the application or service; existing network monitoring tools can't tell the full story of customer experience in an NFV environment.


• Operators are looking for a holistic view of services through the eyes of customers and how factors impact the service over time. Operators are not just looking at the storage, compute, hypervisors or the network, but also at how they interact, and at the customer and service context.

• Services are no longer tied to a fixed set of appliances. They will comprise a collection of variables with paths through the infrastructure that must be tracked and monitored, e.g., virtual private network (VPN) connection, firewall, hosted voice, load balancer, service provisioning, active testing component, etc.

• One third of operators say it's critical to be able to create any service on the fly, mixing and matching interoperable VNFs to achieve that.

• An intelligent performance management system must connect the dots; show what needs investigating; tie events, faults and alarms to customer-impacting issues; and alert service-level agreement (SLA) management systems, only showing what's relevant to business operations and suppressing noise.

• Infrastructures must react and automatically scale out when an application or service demands, and then scale back when demand decreases. It's essential for the performance management platform to help make recommendations and provide automated actionable intelligence.

• Visualization of performance issues requires a retooling of how we think about infrastructure problems. In the past, if you had a pool of 100 servers, you'd want to know which server is not performing as it typically does. But with NFV and SDN, you must ask: "Which server is not behaving like the other servers in the pool?"

• From Heavy Reading's research in the fourth quarter of 2016, just one third of mobile operators currently monitor their network end-to-end across domains; this has to change with NFV deployments, where performance management must be end-to-end in order to interact with the infrastructure and orchestrate change based on intelligence gathered.

• Other NFV management concerns, according to Heavy Reading's research, include the inability to match real-time network/service impact with physical/virtual device failure, and in general immature service assurance tools. However, this is starting to change as we see vendors focus on delivering service assurance solutions to support operators moving to NFV.
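
The peer-comparison question raised in the list above (which server is not behaving like the rest of its pool) can be illustrated with a short sketch. The Python below is a minimal example only, not drawn from any operator's or vendor's tooling: the function name `peer_outliers`, the CPU-utilization sample data and the 3.5 cut-off are all assumptions chosen for illustration. It scores each server against the pool's median using the median absolute deviation (MAD), a spread estimate that a single misbehaving server cannot distort much.

```python
from statistics import median


def peer_outliers(metric_by_server, threshold=3.5):
    """Return (sorted) names of servers whose metric deviates strongly from the pool median.

    metric_by_server: dict mapping a server name to one numeric sample
    (hypothetical input shape; e.g. CPU utilization in percent).
    threshold: cut-off on the MAD-based score (an assumed, commonly used default).
    """
    values = list(metric_by_server.values())
    pool_median = median(values)
    # Median absolute deviation: robust measure of how spread out the pool is.
    mad = median(abs(v - pool_median) for v in values)
    if mad == 0:
        # At least half the pool sits exactly on the median, so any
        # server that differs at all is treated as an outlier.
        return sorted(n for n, v in metric_by_server.items() if v != pool_median)
    outliers = []
    for name, value in metric_by_server.items():
        # 0.6745 rescales the MAD so the score is comparable to a z-score
        # when the underlying data is roughly normal.
        score = 0.6745 * abs(value - pool_median) / mad
        if score > threshold:
            outliers.append(name)
    return sorted(outliers)


# Illustrative pool of 100 servers with CPU utilization between 40 and 44 percent,
# plus one server that has drifted far away from its peers.
pool = {f"server-{i:03d}": 40.0 + (i % 5) for i in range(100)}
pool["server-042"] = 97.0
print(peer_outliers(pool))  # prints ['server-042']
```

Comparing against peers rather than against each server's own history suits elastic pools, where instances may be too short-lived to have a meaningful baseline of their own.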

"We'll be operating hybrid infrastructure for some time, so you must be able to monitor and manage both physical and virtual aspects from a single platform in order to understand dependencies and business impacts in these highly elastic and complex environments," says Andre Fuetsch, President AT&T Labs and Chief Technology Officer at AT&T.

"Virtualization requires new approaches to managing the traditional stack of services and can be an enabler for reduced operating costs while building out a more reactive and extensible network through the dynamic allocation of capacity that is available in a virtualized environment," says John Dye, Director of Application Development at Sprint. "To do this, we will have to drive a rationalized view of the network that can account for the mix of virtual and standalone components based on the operational state at any point in time. Network virtualization must be sufficiently evolved to support operational integrity and to integrate with the core functions of service assurance and performance management in order to maintain the service levels for our customers."


CURRENT STATE OF NFV

Although the European Telecommunications Standards Institute (ETSI) has taken the lead in NFV standardization (Figure 1), open source communities also play a fundamental role. Most of the initial NFV architectures driven by the largest operators (e.g., AT&T, Verizon, Telefónica, Deutsche Telekom, NTT, etc.) use OpenStack as the virtualized infrastructure manager (VIM) and use virtual machines (VMs) with hypervisors, such as KVM and VMware ESXi, to run the VNFs. NFV infrastructure (NFVI) is the term for the cloud platform that provides a virtual execution environment for VNFs on top of physical server, storage and networking resources.

Figure 1: ETSI NFV – Architectural Framework

Source: ETSI

The NFV goals of agility and launching new services quickly involve virtualizing entire classes of network node functions into building blocks that may be chained together to create communications services. VNFs can consist of one or more VMs or containers running on standard servers, switches and storage or cloud computing infrastructure, instead of having custom hardware appliances for each network function.

A recent Heavy Reading survey of more than 120 operators globally reveals that 86 percent of telecom operators consider OpenStack to be essential or important to their success; 22 percent are currently using OpenStack for NFV; 39 percent are testing new use cases on OpenStack; and 38 percent are considering it for NFV. Benefits cited include offering new services more quickly, faster data center virtualization and lower operating and software costs. Telecom operators also identify the Internet of Things (40 percent in production) and 5G (30 percent in production) as major NFV use cases.

There are also various open source initiatives, including ETSI Open MANO and ECOMP/Open-O, that are working on how OSS must change to support the new, dynamic environment of NFV. A key challenge that NFV operators face is defining the interactions between existing OSS and NFV MANO. The NFV MANO component takes care of the NFVI (through the VIM), VNFs and the orchestration of network services (composed of VNFs and NFVI resources), but it does not handle services that span both physical and virtual infrastructures. Yet, such hybrid physical/virtual network services will likely be the reality for most operators for the foreseeable future – up to a decade, according to some operators.

Also, while service activation and provisioning appear to be handled by MANO, other aspects (such as service assurance, fault finding and long-term capacity planning) are less well catered for. This can be fixed by adding new capabilities to the MANO, with new performance monitoring and assurance tools, and by transforming traditional OSS. For many years, OSS has been seen as an obstacle, a web of legacy systems that slow down new service launches and innovation. NFV provides an opportunity to rethink OSS functions, such as service performance, fault monitoring, SLA management, etc., so that they are better equipped to handle services composed of VNFs. For example, many VNF workloads:

• Are highly variable, in contrast to IT workloads, and many have very specific, fine-grained requirements for their execution environment.

• Need consistent, predictable performance with extremely low levels of latency.

• Exchange enormous amounts of traffic with the external network in contrast with IT workloads, which generate high volumes of intra-data center (East/West) traffic.

• Are interdependent by definition, as component functions of an interconnected network and network services.

Since operators will be running hybrid networks for the foreseeable future, with a combination of physical and virtualized network functions, this will require OSS, cloud management, data management and new orchestration and performance management functions to work together more closely. To provide end-to-end service assurance of hybrid networks, a service assurance solution will need to rely on legacy network information (e.g., performance counters, alarms, events, etc.), NFV (e.g., OpenStack) and SDN controllers (e.g., OpenDaylight), as well as real-time big data analytics and correlation as inputs for actionable and eventually automated decision making.

Verizon's NFV architecture (see Figure 2) is heavily grounded in the reality of hybrid networks. NFV has to coexist with legacy networks, represented in the figure by physical network functions (PNFs) controlled by an element management system (EMS). However, Verizon expects that with the standardization of control protocols and data models, EMSs will gradually be replaced by new systems (such as SDN controllers and generic VNF managers) that work across vendor and domain boundaries.


Figure 2: Verizon SDN-NFV Architecture

Source: Verizon SDN-NFV Reference Architecture

NFV SERVICE ASSURANCE & UNIFIED OPERATIONS

Network operations are under pressure to reduce operating costs and complexity, and to ensure that NFVI works, that they can pinpoint service faults in real time across physical and virtual network elements, and that they can proactively monitor service performance for customer SLAs. Getting this right is essential to meet the business goals of NFV: to make it easier and faster to launch new services and make changes to existing services, deliver a better customer experience, grow revenue through third-party services and ecosystems, etc.

Migrating to NFV architecture will impact operations across the business as NFV becomes a catalyst to drive new business processes. Much of this is organizational change, as the entire operations model – including processes, tools and technology, as well as people and organization – must be redesigned for each functional area within Service Design and Fulfilment, Service Operations and Readiness, and Service Assurance (see Figure 3).

A significant impact of NFV and SDN implementation will be negating the need for vendor-specific and/or service-specific network management tools that service providers have been using to configure, monitor and troubleshoot their networks. Operations will need to evolve from service/operational silos to standardized cross-services operations management. ETSI also argues that full automation of the capacity management, optimization and reconfiguration cycle should be done by orchestration and cloud management techniques with open and multi-vendor components, rather than vendor-specific management solutions.

Operators must reduce operating expenses to support the NFV business case and migrate to "automation first, management by exception." Common errors and performance degradations must be identified and addressed via automated self-healing and self-optimization rules, e.g., scaling out additional VMs or containers. The network is so complex that it is impossible for humans to identify and manage the sheer number of alarms, events and issues causing problems. Operators are looking to machine learning and intelligence, as well as complex algorithms, to identify anomalies and unknown interdependencies and patterns and to prioritize issues that are actually impacting customers' services and experience.

Figure 3: Functional Domains Across the Service Lifecycle

Service Lifecycle: Service Fulfillment
Functional Domain: Activation and Provisioning
Automation Opportunity: Service portal and templates with service definition and composition using YANG, NETCONF and TOSCA to enable end-to-end chaining from multiple vendors. NFV Orchestrator requests VNF manager to provision VNFs and service chains, and active testing confirms service is operational.

Service Lifecycle: Operations Support & Readiness
Functional Domain: Change Management; Capacity Management; Inventory Management; DevOps
Automation Opportunity: Real-time topology view and inventory management system provides visibility on network and service state. Analytics layer using streaming telemetry data to adjust network resources (bandwidth, traffic priorities).

Service Lifecycle: Service Assurance
Functional Domain: Performance Management; Fault Management; SLA Management; End-to-end Service Management
Automation Opportunity: Error detection and fault reporting; reroute services automatically to limit disruptions. Real-time picture of end-to-end service quality, network components and infrastructure performance – both physical and virtual.

Source: Heavy Reading

Critical functions, such as fault, outage and performance management, must be supported with smooth handoffs across different teams that maintain physical and virtual network resources. This is facilitated by having a common unified view of "service lifecycle performance, faults and outages," even if teams are using different underlying tools to resolve issues, etc.

A DevOps-based model will drive closer coordination between operations and development teams to improve service agility and quality, as well as develop new skills and common understanding. Network engineers will need to apply DevOps principles to the network's tolerance for frequent changes and automated testing. Operations teams become more deeply involved in solution design and end-to-end testing of software prior to going live, and feedback from operations is then rolled back into development.

Also key to enabling better integration of service assurance solutions are the application programming interfaces (APIs) and common information models that allow multiple vendors' systems to coexist on one network. Traditional service assurance solutions were typically designed as closed systems that required proprietary integration with other OSS and network systems. This imposed a large cost burden for operators. Under NFV, the proposition changes such that standards bodies and consortia are defining APIs to which solution vendors must conform. At the service assurance level, these APIs could be for alarm management or processor performance data. Service assurance systems should support open APIs to enable a microservices-based architecture and support standard data models, such as TM Forum's Open Digital Ecosystem data model, the IETF's YANG and OASIS's TOSCA.

BEST PRACTICES FOR NFV SERVICE ASSURANCE

For effective management of services in a hybrid virtualized environment where performance is highly dependent on underlying cloud infrastructure, self-learning and predictive techniques must be developed to manage end-to-end service performance by intelligently correlating inputs at all levels and across locations. This can be achieved by adopting some of the leading practices as outlined below:

• End-to-end, real-time visibility of infrastructure performance, including network components and cloud infrastructure. This involves real-time telemetry to monitor physical network functions (Layers 2/3), as well as VNFs and cloud infrastructure, including network, compute and storage across OpenStack, VMware, Docker, etc.

• There are fixed and limited resources that must be tracked and trended in order to understand the capacity and performance of Layer 1-3 VNF functions. Monitoring the performance of those functions on a day-to-day basis is needed, including whether those devices have the necessary capacity to continue functioning properly, e.g., CPU utilization, memory utilization, buffers and statistics.

• A single view of hybrid NFV networks (physical and virtual networks) using a multi-vendor performance management solution that can provide a single source of truth. This must be a system that can be easily integrated with other vendor fault management systems for deeper root cause analysis.

• The solutions must also have the ability to adapt the virtual service in real time, yet maintain all health and performance historical records as the service evolves.

• Point-in-time visualization will play a crucial role in allowing service providers to find any particular point back in time, analyze the network or service, and see its performance during that exact period. Diagnosing, troubleshooting and triaging problems in live, dynamic virtual networks will rely heavily upon this method and progress.

• New or revised key performance/quality indicators (KPIs/KQIs), e.g., Infrastructure Response Time, VNF Contention Analysis, etc., and sophisticated algorithms that can correlate inputs at all levels and provide insightful performance views across VNFs and virtual and physical infrastructure are essential.

• Predictive analytics must be leveraged to proactively manage resources based on predicted faults, and to dynamically update policies and rules based on real-time traffic characteristics. This can help detect anomalies and potential issues before they happen, minimizing the occurrence of issues across the virtualized infrastructure.

• Self-optimization and automation capabilities must be introduced in performance management modules that can optimize configuration based on current network performance, e.g., scale up VMs, add new VNF instances for load balancing, configure new routes between VMs, etc.

• Early fault detection and mitigation is key to delivering carrier-grade service availability and improving the customer experience. With the ability to proactively correlate physical- and virtual-level faults at a service level and to perform VNF/network topology reconfiguration, SLAs can be proactively monitored and mean time to repair (MTTR) can be greatly reduced.


• Dynamic SLA management is needed with NFV to be able to monitor SLAs in real time, even as the service evolves and changes. Additionally, service assurance should be more closely integrated with service fulfilment in order to enable operators to rapidly provision and assure services.

• SLAs and operating-level agreements (OLAs) must include key operational parameters, such as service response time and scalability, packets lost, etc., and not be limited to the time in which an assigned ticket is acknowledged. Leading operators are also using predictive analytics on SLAs to flag cases where SLAs will not be met, so that proactive action can be taken to scale resources and avoid SLA penalties. This could certainly be automated in the future.

• To manage and meet expectations on a per-customer basis for multiple services, the focus must shift from merely monitoring network- and node-level KPIs, and turn toward analysis and correlation of service performance at every layer of the network stack.

• End-to-end service quality management (SQM) with integrated dashboards that provide the ability to drill down along the VNF chain all the way to the underlying virtual resources and help localize issues.

• Cross-domain correlation based on metrics for service accessibility, integrity and retention, which are built on new/revised KPIs/KQIs with inputs from VNFs, virtualized infrastructure and network layers.

• Understand the impact of infrastructure on the business: What's happening in the context of a "business transaction" – situation-aware and context-aware monitoring and assurance for network and IT operations. This helps to avoid finger pointing over whether it's a network or application issue. End-to-end service management also must integrate with OSS and business support system (BSS) for full service lifecycle and customer experience management.
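
One of the practices above, using predictive analytics to flag cases where an SLA will not be met, can be sketched in simplified form. The Python below is an illustration under stated assumptions rather than a description of any operator's implementation: the function name `forecast_breach`, the latency samples, the 30 ms limit and the use of a straight-line trend are all hypothetical choices. It fits a least-squares line to recent KPI samples and checks whether the extrapolated value would exceed the SLA limit within a chosen look-ahead window.

```python
def forecast_breach(samples, sla_limit, horizon):
    """Extrapolate a linear trend and report whether it exceeds sla_limit.

    samples: list of (time, value) pairs, e.g. (minutes, latency in ms).
    sla_limit: KPI value that must not be exceeded (hypothetical SLA term).
    horizon: how far past the last sample to look, in the same time unit.
    Returns (breach_expected, projected_value).
    """
    n = len(samples)
    if n < 2:
        raise ValueError("need at least two samples to fit a trend")
    mean_t = sum(t for t, _ in samples) / n
    mean_v = sum(v for _, v in samples) / n
    var_t = sum((t - mean_t) ** 2 for t, _ in samples)
    if var_t == 0:
        raise ValueError("samples must span more than one timestamp")
    # Ordinary least-squares slope and intercept for value = slope * t + intercept.
    slope = sum((t - mean_t) * (v - mean_v) for t, v in samples) / var_t
    intercept = mean_v - slope * mean_t
    last_t = max(t for t, _ in samples)
    projected = slope * (last_t + horizon) + intercept
    return projected > sla_limit, projected


# Illustrative data: latency rising by about 2 ms every 5 minutes.
latency = [(0, 20.0), (5, 22.0), (10, 24.0), (15, 26.0)]
breach, projected = forecast_breach(latency, sla_limit=30.0, horizon=15)
print(breach)  # True: the fitted trend crosses the 30 ms limit within 15 minutes
```

A production system would likely use seasonality-aware or learned models rather than a straight line, but the same pattern applies: project the KPI forward, compare it with the SLA threshold, and trigger scaling or an alert before the breach occurs.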

CONCLUSION

Given the challenges associated with the transition to virtualized networks, service assurance solutions have become critical in the areas of network topology, service and application performance, planning for future capacity needs and guaranteeing SLAs – all while integrating with the legacy network already in existence.

Historically, operators have tended to build the network first, then implement management tools after; with NFV, the focus is shifting to management and assuring the customer experience first. NFV significantly increases the pace of change of network topology, thus requiring greater autonomy and automation in the network, as well as real-time service assurance.

In implementing NFV, service providers must avoid the trap of considering it as a separate silo to their existing network. Any service assurance solution should be cross-domain, encompassing both NFV and physical networks. In physical networks, service assurance should provide a common view across multiple services, such as voice and data. It should even span the network and IT assets of the operator, providing monitoring capabilities and cross-layer correlation. Within the NFV environment, in addition to collecting traditional data (such as throughput, drop rates and errors), CPU utilization of the underlying computing and switching hardware should be monitored. The MANO will be a key data source for service assurance, as well as its VIM and VNF manager subcomponents.

Looking ahead to when operators will operate cloud-native networks and hyper-scale cloud infrastructure with VNFs running in containers, service assurance solutions will need to be able to manage and monitor containers and microservices and track the components of a service path through that infrastructure. The other trend toward mobile edge computing (MEC) and distributed cloud architectures means that assurance and performance monitoring will need to reach the network edge to assure latency-sensitive data apps that will be placed as close to the customer as possible.

Operators that reengineer their service assurance systems for a hybrid NFV world should be able to meet the high service quality their customers expect, while keeping network operating costs under control. Simultaneous instantiation and assurance of VNFs is an imperative for delivering reliable services on demand – the key to the new revenue potential from NFV-based services.

THE SEVONE NFV SERVICE ASSURANCE SOLUTION

The SevOne NFV Service Assurance solution gives communications service providers (CSPs) the real-time visibility they need to efficiently monitor and manage physical and virtual network environments using a single, unified platform. Key benefits of the solution include:

• Accurate, real-time information on the state of the infrastructure used to provide access, transport, switching and services, so that operators can identify customer experience performance-affecting behaviors.

• Visualizations of what traffic was flowing through which links at what times, assisting operators with identifying traffic behaviors that impact their users or applications.

• Actionable insights into events occurring in the infrastructure that trigger customer experience-affecting behaviors, so CSPs can identify and remedy issues as they occur.

• Accurate representations of what end users experience when accessing applications as connectivity in the network changes dynamically.

• Assurance for each of the key layers of reference architectures.

• Lifecycle management of the physical and software resources supporting virtualization, and necessary integration of existing B/OSSs external to the NFV system.

• CSPs can set and adjust KPIs across multiple vendor implementations of:

o VNF Instances – EPC, IMS, RAN, etc.

o VNF Infrastructure Instances: compute, storage, network and VMs.

ABOUT SEVONE

SevOne provides the world's largest CSPs, MSPs and enterprises with the most comprehensive technology portfolio to collect, analyze and visualize network and infrastructure performance data to deliver actionable insights to compete and win in the connected world. SevOne serves organizations that are looking to complex, dynamic next-generation infrastructure, such as SDNs, orchestrated containers and cloud technologies to support their business goals.