Optimizing Performance in OpenStack Clouds
Steve Croce, Senior Product Strategist, Cloud Solutions, Dell
Alok Prakash, Product Manager, Cloud Platforms Group, Intel Corporation
Today’s session
• Dell’s OpenStack Cloud Solution design tenets
– Co-engineering
– Reference Architectures
– Extensions
• Intel: Why and how performance matters in OpenStack
– The hidden impacts if performance is not managed
– Noisy neighbors
– Service Compute Units, workload placement
OpenStack for the enterprise
• Repeatable, validated end-to-end
• Fully supported software, infrastructure
• Massively scalable, elastic infrastructure
• Solutions to get you agile
• Efficient deployment, rapid time to value
• Performance tuned configurations
• Extended lifecycle support
• Expertise to complement in-house skills
Snowflakes and Houses of Cards
Unstable
Not Repeatable
Not Optimized
Hard to Support
Impossible to Modify
It WILL collapse! One-of-a-Kind!
• Address the gaps around OpenStack for Enterprise use cases via a jointly engineered solution
• Deliver enterprise-grade, certified, highly-scalable, secure, and fully supported cloud infrastructure solutions
• Provide a fulfillment, deployment, and support experience like commercial software
• Support specific enterprise use cases for cloud applications, software defined storage, and development and test
Dell Red Hat Cloud Solutions make OpenStack the de facto OPEN cloud solution in the enterprise
Solution components
Dell PowerEdge
Dell EqualLogic
Dell Networking
Dell reference architecture
Red Hat Enterprise Linux OpenStack Platform
Dell Professional Services
Red Hat Professional Services
Dell ProSupport
Dell Red Hat Enterprise Cloud
Solution Extensions
Co-Engineered Reference Architecture
• Validated and Prescriptive
– Server configurations
– Network design and configurations
– Storage configurations
– System sizing recommendations
• Robust, Reliable, Repeatable
– Blueprint for success
– Capture and package joint expertise
– Use cases, best practices
– Accelerate your time to value
No more houses of cards
Solution extensions*
• Cloud Management
• Performance Management
• Software Defined Networking
• Application Management
* Consult Dell Ref. Architecture for complete details
∞ Investing in OpenStack
∞ Vendors you can count on!
∞ Validated with Dell Ref. Architecture*
∞ Deployment guidance*
∞ Sizing guidance and recommendations*
∞ Professional services and training
∞ Deployment services and tools
∞ Collaborative support
Dell and Intel Service Assurance Administrator
* Under development
Intel: Why performance matters
Software Defined Infrastructure – Service Assurance
Automate On Standard IA Hardware
[Figure labels: static, manual infrastructure; resource pool (storage, network, compute); orchestration software; infrastructure attributes (governance, policy, cooling, power, trust and compliance); applications A to D; intelligent service monitoring and operational event alerts; SLA requests; monitoring (utilization, performance, location, thermals, compliance, power); trust; performance; public and private application workloads and data; physical and virtualized machine management]
Challenges in Running Enterprise and Telco Workloads in the Cloud
CPU, IOPS, memory, memory bandwidth, CPU cache, instruction set
Capability, capacity, consumption, availability, resilience
Data regulations, legacy workloads, compliance
Telemetry is hidden
• Is my workload running on trusted infrastructure?
• How do I avoid noisy neighbors?
• Will my workload get expected compute cycles?
Intel® Service Assurance Administrator enhances OpenStack Cloud Services
[Diagram components: Nova Scheduler Plug-In; Machine Flavor Creator; Analysis & Remediation Engine; Service Assurance Controller; Monitoring Engine; Capacity Insight; REST API; Web Admin Console; Compute Nodes with Compute Node Agent]
The Challenge
• Workloads do not get predictable performance
• Limited trust of multi-tenant nodes
• Grey machines (unhealthy state) are not avoided
• Workload scheduling is less than optimal
www.intel.com/assurance
The Solution
• Intelligent Workload Placement based on ability of nodes to meet service level objectives
• Compute node capacity and utilization metrics normalized for use across generations of CPUs
• VM performance monitoring and assurance with cache contention and ‘Noisy Neighbor’ detection
• Reporting of trust of VMs and nodes – BIOS, Hypervisor whitelisting and attestation
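The placement idea in the bullets above can be pictured as a filter-then-rank step. The sketch below is only an illustration under stated assumptions: the `Node` fields, the `rank_nodes` function, and the "most free SCUs first" scoring rule are hypothetical and are not taken from Intel SAA or the Nova scheduler.

```python
from dataclasses import dataclass


@dataclass
class Node:
    """Hypothetical per-node telemetry snapshot (field names are assumptions)."""
    name: str
    scu_capacity: float   # total normalized compute capacity, in SCUs
    scu_used: float       # SCUs already reserved by the OS and VMs
    trusted: bool         # result of boot-time attestation
    healthy: bool         # False for a "grey" machine (e.g. fan failure)


def rank_nodes(nodes, scu_needed, require_trust=False):
    """Return candidate nodes, best first, for a VM needing `scu_needed` SCUs."""
    candidates = [
        n for n in nodes
        if n.healthy
        and (n.trusted or not require_trust)
        and n.scu_capacity - n.scu_used >= scu_needed
    ]
    # Illustrative scoring rule: prefer the node with the most free SCUs.
    return sorted(candidates, key=lambda n: n.scu_capacity - n.scu_used, reverse=True)


nodes = [
    Node("node-a", scu_capacity=40.0, scu_used=35.0, trusted=True, healthy=True),
    Node("node-b", scu_capacity=40.0, scu_used=10.0, trusted=False, healthy=True),
    Node("node-c", scu_capacity=32.0, scu_used=8.0, trusted=True, healthy=True),
    Node("node-d", scu_capacity=64.0, scu_used=0.0, trusted=True, healthy=False),
]
print([n.name for n in rank_nodes(nodes, scu_needed=4.0, require_trust=True)])
# prints ['node-c', 'node-a']
```

The unhealthy node is dropped even though it has the most free capacity, which is the "grey machine" avoidance named in the challenge list.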
Why Intel® Service Assurance Administrator?
Match workloads to platforms, based on capability and capacity
• Performance metric for infrastructure capacity and utilization
Find and address software-defined infrastructure issues
• Cache contention
• System health
• Resource metering
Service assurance for trust-attestation
• Boot time trust attestation of nodes
• Whitelisting
Intel® Architecture Platform Monitoring and Control
“What size virtual machine must I use for my app?”
“My app is slow sometimes – how do I diagnose?”
“Do I have ‘noisy neighbor’ VMs?”
“Are my VMs running on trusted platforms?”
Intel® Service Assurance Administrator
Automation
• Enhance OpenStack* to provision and monitor machine flavors with specified service levels
• Automated software-defined infrastructure
Efficiency
• Integrate with IT operations tools to determine probable root cause, report, and help remediate issues
• Efficient service assurance and administration
Agility
• Run workloads with confidence on software-defined infrastructure
• Agile business service deployment
Enhanced Machine Flavors
Key Challenges in Cloud Performance Assurance
• How big a VM do I need for my workload?
• What resources should be reserved, how much burst capacity?
• What metric do I use to specify and measure performance?
• What is the performance capacity of the node?
• What is the performance capacity of the machine flavor?
• Am I getting specified performance without interference from ‘noisy neighbors’?
Intel® Service Assurance Administrator defines Service Compute Unit as a performance metric for use by cloud administrators
Platform monitoring to detect and avoid ‘noisy neighbors’
[Diagram labels: Linux compute node; KVM; VMs; apps; services; monitoring; reporting; cache contention; memory bandwidth; resource metering]
Noisy Neighbor’s impact on VM performance
[Diagram labels: shared infrastructure; two groups of eight cores, each with a shared L3 cache; hypervisor; OS; application & services; VMs]
What is a Service Compute Unit (SCU)?
Provisioning, Orchestration
Match workloads to platforms, based on capability and capacity
Service Compute Units: performance metric for cloud infrastructure capacity and utilization
vCPU approach vs. Intel® Service Assurance Administrator
• Resource allocation: the vCPU approach is static and processor dependent; SCU is dynamic and processor independent
• Machine portability: the vCPU approach is not portable across processor families or generations; SCU is portable across generations of Intel CPUs and across processor families
• Performance service level monitoring: the vCPU approach has none; Intel® SAA provides service assurance of SCU units
• SCU = function of (frequency, throughput, instruction set efficiency, cache size)
• 1 GCEU = 2.5 SCUs (approximate)
• 1 EC2 = 2.5 to 3 SCUs (approximate)
• SCUs can be set to define a floor (guaranteed capacity) and ceiling (burst capacity)
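As a worked example of the approximate ratios quoted above, the sketch below converts the other units into SCUs. The function and constant names are hypothetical, and the ratios are the slide's approximations rather than exact definitions; the slide does not give the SCU function itself, so nothing here attempts to compute an SCU from frequency or cache size.

```python
# Approximate ratios quoted on the slide; treat them as rough guidance only.
SCU_PER_GCEU = 2.5
SCU_PER_EC2_RANGE = (2.5, 3.0)


def gceu_to_scu(gceu):
    """Convert GCEUs to an approximate SCU count."""
    return gceu * SCU_PER_GCEU


def ec2_units_to_scu_range(ec2_units):
    """Convert EC2 units to an approximate (low, high) SCU range."""
    low, high = SCU_PER_EC2_RANGE
    return (ec2_units * low, ec2_units * high)


print(gceu_to_scu(4))             # prints 10.0
print(ec2_units_to_scu_range(2))  # prints (5.0, 6.0)
```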
How much performance capacity is available on my host?
For each compute node, performance capacity status for the OS and VMs is monitored continuously, displaying how much resource is available, e.g. SCUs and number of cores available
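The per-node view described here amounts to simple bookkeeping. A minimal sketch, assuming a monitoring agent reports the hypothetical inputs below (none of these names come from the product):

```python
def available_capacity(total_scu, total_cores, os_scu, vms):
    """Summarize what is left on a node.

    `vms` is a list of (reserved_scu, pinned_cores) tuples; all inputs are
    hypothetical values that a monitoring agent would report.
    """
    used_scu = os_scu + sum(scu for scu, _ in vms)
    used_cores = sum(cores for _, cores in vms)
    return {
        "scu_available": max(total_scu - used_scu, 0.0),
        "cores_available": max(total_cores - used_cores, 0),
    }


print(available_capacity(total_scu=40.0, total_cores=16, os_scu=2.0,
                         vms=[(1.0, 0), (4.0, 2), (2.5, 1)]))
# prints {'scu_available': 30.5, 'cores_available': 13}
```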
How much cache contention is there in the compute nodes?
Using Cache Monitoring Technology, the console displays LLC cache contention status for the OS and VMs on each compute node
Extending flavors with VM bursting and assurance capabilities
VM will only be created on nodes that have been trust-attested to be safe during boot
VM will reserve 1 SCU of compute performance, and burst up to 2 SCU based on available compute resources
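The reserve-and-burst behaviour of this flavor can be sketched as a floor plus a capped share of spare capacity. This is an assumption about how such a policy could be computed, not the product's algorithm; `grant_scu` and its parameters are hypothetical.

```python
def grant_scu(demand, floor=1.0, ceiling=2.0, node_spare=0.0):
    """Return the SCUs granted to a VM for the current interval.

    floor      -- SCUs reserved for the VM (always granted)
    ceiling    -- most SCUs the VM may use when bursting
    node_spare -- unreserved SCUs currently free on the node
    """
    # Burst only covers demand above the floor, and never exceeds the ceiling.
    burst_wanted = max(min(demand, ceiling) - floor, 0.0)
    return floor + min(burst_wanted, max(node_spare, 0.0))


print(grant_scu(0.5, node_spare=5.0))   # idle VM keeps its reservation
print(grant_scu(3.0, node_spare=5.0))   # busy VM is capped at the ceiling
print(grant_scu(3.0, node_spare=0.25))  # burst limited by what the node has free
```

With the defaults matching the flavor above (floor 1 SCU, ceiling 2 SCU), the three calls cover the idle, fully bursting, and capacity-limited cases.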
Extending flavors with trust and core-pinning capabilities
VM will reserve one dedicated core from the CPU compute resource
VM will only be created on nodes that have been trust-attested to be safe during boot
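A flavor that combines trust and core pinning effectively narrows the set of eligible hosts. The check below is a minimal sketch; the dictionary keys (`require_trusted_boot`, `dedicated_cores`, `boot_attested`, `free_dedicated_cores`) are made up for illustration and are not OpenStack or Intel SAA flavor keys.

```python
def host_accepts(flavor, host):
    """Check a host against a flavor's trust and core-pinning requirements."""
    if flavor.get("require_trusted_boot") and not host.get("boot_attested"):
        return False
    return host.get("free_dedicated_cores", 0) >= flavor.get("dedicated_cores", 0)


# Hypothetical flavor: one dedicated core, trusted hosts only.
pinned_trusted = {"require_trusted_boot": True, "dedicated_cores": 1}
hosts = {
    "node-a": {"boot_attested": True, "free_dedicated_cores": 0},
    "node-b": {"boot_attested": False, "free_dedicated_cores": 4},
    "node-c": {"boot_attested": True, "free_dedicated_cores": 2},
}
eligible = [name for name, h in hosts.items() if host_accepts(pinned_trusted, h)]
print(eligible)  # prints ['node-c']
```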
Finding out the performance consumption level for each compute node at the OS level
Allocating SCU for OS level consumption of compute resource
Finding out VM core-pinning status and metrics
Core pinning: VM received a dedicated core from the CPU
VM SLO status includes number of cores assigned and VM uptime
Learning more about each VM's SCU utilization metrics
Real-time VM SCU utilization and core pinning metrics
Conducting probable cause analysis of “Noisy Neighbors” problems
Cloud Administrator can take remedial action (e.g. evacuate a VM to a different node) once Intel® SAA has identified Noisy Neighbors (VMs aggressively using shared compute resources in the platform, like CPU cache) and VMs that are affected.
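One simple way to picture the probable-cause step is to flag VMs that occupy a large share of the shared last-level cache and treat the remaining VMs on that node as possibly affected. The sketch below assumes per-VM cache occupancy figures are already available; the function name, the input format, and the 50% threshold are illustrative assumptions, not the detection logic used by Intel® SAA.

```python
def find_noisy_neighbors(llc_occupancy_kb, llc_size_kb, share_threshold=0.5):
    """Split VMs on one node into suspected aggressors and possibly affected VMs.

    llc_occupancy_kb -- mapping of VM name to last-level cache occupancy in KB
    share_threshold  -- fraction of the LLC above which a VM is flagged
                        (an arbitrary illustrative cutoff)
    """
    aggressors = sorted(
        vm for vm, kb in llc_occupancy_kb.items()
        if kb / llc_size_kb > share_threshold
    )
    # With no aggressor present, no VM is reported as affected.
    affected = sorted(set(llc_occupancy_kb) - set(aggressors)) if aggressors else []
    return aggressors, affected


sample = {"vm-1": 1024, "vm-2": 14336, "vm-3": 2048}
print(find_noisy_neighbors(sample, llc_size_kb=20480))
# prints (['vm-2'], ['vm-1', 'vm-3'])
```

An administrator could then consider evacuating either the flagged VM or the affected ones to a different node, as the slide describes.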
Intel® Service Assurance Administrator Key Feature Summary
Feature: Compute throughput metering
Customer pain point: How do I know how many VMs and workloads I can put on a compute node? How do I specify how much compute capability is needed by my VM?
Solution: SCU (Service Compute Unit), a portable compute performance metric
• Characterize node’s compute capacity
• Specify required VM capacity
• Measure utilization
Feature: Host OS and App Services Assurance
Customer pain point: How do I know that applications or OS services such as Ceph or backup are not creating issues for my VMs?
Solution: Enable creation of Assurance SLA to grant compute quota to Host Apps and OS services
Feature: Contention Aware Scheduling
Customer pain point: How do I detect ‘noisy neighbor’ VMs and affected VMs? How do I automate the VM scheduling process?
Solution: Intelligent Workload Placement based on ranking of node’s ability to meet service level objectives, using telemetry
Feature: Grey Machine Avoidance
Customer pain point: How do I avoid systems that might not be healthy, e.g. running hot with fan failure?
Solution: Gather system health info via IPMI protocols and use as a metric in workload placement
Feature: Trusted Compute Pool – Policy enforcement
Customer pain point: How do I find out that my workloads are running on trusted nodes?
Solution: Use Intel® Trusted Execution Technology for boot time attestation of each node's FW/SW
Resources & Contacts
Contact the Dell Red Hat OpenStack Solutions team
• [email protected]
Engage a Dell Solution Center (contact your Dell Rep)
• dell.com/solutioncenters
Dell Cloud resources sites
• Dell.com/openstack
• Dell.com/redhat
• youtube.com/Dell
Intel SAA Resources Site
• www.intel.com/assurance
• Contact Intel SAA team by clicking “Contact Us”
Questions?
Thank you!
Intel Confidential - Do Not Forward
Legal Disclaimers:
INFORMATION IN THIS DOCUMENT IS PROVIDED IN CONNECTION WITH INTEL PRODUCTS. NO LICENSE, EXPRESS OR IMPLIED, BY ESTOPPEL OR OTHERWISE, TO ANY INTELLECTUAL PROPERTY RIGHTS IS GRANTED BY THIS DOCUMENT. EXCEPT AS PROVIDED IN INTEL'S TERMS AND CONDITIONS OF SALE FOR SUCH PRODUCTS, INTEL ASSUMES NO LIABILITY WHATSOEVER AND INTEL DISCLAIMS ANY EXPRESS OR IMPLIED WARRANTY, RELATING TO SALE AND/OR USE OF INTEL PRODUCTS INCLUDING LIABILITY OR WARRANTIES RELATING TO FITNESS FOR A PARTICULAR PURPOSE, MERCHANTABILITY, OR INFRINGEMENT OF ANY PATENT, COPYRIGHT OR OTHER INTELLECTUAL PROPERTY RIGHT.
A "Mission Critical Application" is any application in which failure of the Intel Product could result, directly or indirectly, in personal injury or death. SHOULD YOU PURCHASE OR USE INTEL'S PRODUCTS FOR ANY SUCH MISSION CRITICAL APPLICATION, YOU SHALL INDEMNIFY AND HOLD INTEL AND ITS SUBSIDIARIES, SUBCONTRACTORS AND AFFILIATES, AND THE DIRECTORS, OFFICERS, AND EMPLOYEES OF EACH, HARMLESS AGAINST ALL CLAIMS COSTS, DAMAGES, AND EXPENSES AND REASONABLE ATTORNEYS' FEES ARISING OUT OF, DIRECTLY OR INDIRECTLY, ANY CLAIM OF PRODUCT LIABILITY, PERSONAL INJURY, OR DEATH ARISING IN ANY WAY OUT OF SUCH MISSION CRITICAL APPLICATION, WHETHER OR NOT INTEL OR ITS SUBCONTRACTOR WAS NEGLIGENT IN THE DESIGN, MANUFACTURE, OR WARNING OF THE INTEL PRODUCT OR ANY OF ITS PARTS.
Intel may make changes to specifications and product descriptions at any time, without notice. Designers must not rely on the absence or characteristics of any features or instructions marked "reserved" or "undefined". Intel reserves these for future definition and shall have no responsibility whatsoever for conflicts or incompatibilities arising from future changes to them. The information here is subject to change without notice. Do not finalize a design with this information.
The products described in this document may contain design defects or errors known as errata which may cause the product to deviate from published specifications. Current characterized errata are available on request.
Intel product plans in this presentation do not constitute Intel plan of record product roadmaps. Please contact your Intel representative to obtain Intel's current plan of record product roadmaps.
Intel processor numbers are not a measure of performance. Processor numbers differentiate features within each processor family, not across different processor families. Go to: http://www.intel.com/products/processor_number.
Contact your local Intel sales office or your distributor to obtain the latest specifications and before placing your product order.
Copies of documents which have an order number and are referenced in this document, or other Intel literature, may be obtained by calling 1-800-548-4725, or go to: http://www.intel.com/design/literature.htm
Code names featured are used internally within Intel to identify products that are in development and not yet publicly announced for release. Customers, licensees and other third parties are not authorized by Intel to use code names in advertising, promotion or marketing of any product or services and any such use of Intel's internal code names is at the sole risk of the user
Intel, and the Intel logo are trademarks of Intel Corporation in the United States and other countries.
*Other names and brands may be claimed as the property of others.
Copyright ©2013, 2014, 2015 Intel Corporation. All rights reserved. Intel and the Intel logo are trademarks of Intel Corporation in the U.S. and/or other countries. *Other names and brands may be claimed as the property of others.
The OpenStack Word Mark and OpenStack Logo are either registered trademarks/service marks or trademarks/service marks of the OpenStack Foundation, in the United States and other countries and are used with the OpenStack Foundation's permission. We are not affiliated with, endorsed or sponsored by the OpenStack Foundation, or the OpenStack community.
Optimization Notice
Intel's compilers may or may not optimize to the same degree for non-Intel microprocessors for optimizations that are not unique to Intel microprocessors. These optimizations include SSE2, SSE3, and SSSE3 instruction sets and other optimizations. Intel does not guarantee the availability, functionality, or effectiveness of any optimization on microprocessors not manufactured by Intel.
Microprocessor-dependent optimizations in this product are intended for use with Intel microprocessors. Certain optimizations not specific to Intel microarchitecture are reserved for Intel microprocessors. Please refer to the applicable product User and Reference Guides for more information regarding the specific instruction sets covered by this notice.
Notice revision #20110804
Dell and Red Hat Enterprise Cloud Solutions
Prescriptive, practical, right-sized choices fit for you
PoC System
• Dev/test
• Concept testing
• Prototyping
• Rapid deploy
Pilot System
• Mid-scale production
• Scale testing
• Sizing options, 1-3 racks
• 10G networking
• High Availability
• Optimized Inktank Ceph
Scale out System
• Data center scale
• Production workloads
• PowerEdge R, C Series
• 10G networking
• High Availability
• Optimized Inktank Ceph
• Tailored designs
• Customized integrations with OpenStack ecosystem technologies
Shift your focus from infrastructure to service
[Diagram labels: IT silos (server, storage, networking, software, management); management; open standards-based hardware; infrastructure control; enterprise applications & workloads; multi-vendor, cross-platform unified management; orchestration stack]