Copyright © 2014 Oracle and/or its affiliates. All rights reserved.

Design Considerations for Large Scale Deployment of Oracle VM in Oracle’s Managed Cloud Service

Jose Fernando Niño Higuera

Safe Harbor Statement

The following is intended to outline our general product direction. It is intended for information purposes only, and may not be incorporated into any contract. It is not a commitment to deliver any material, code, or functionality, and should not be relied upon in making purchasing decisions. The development, release, and timing of any features or functionality described for Oracle’s products remains at the sole discretion of Oracle.

Program Agenda

1. What is Oracle Managed Cloud Services (OMCS)
2. OMCS Design Criteria
3. How We Use Oracle VM
4. Lessons Learned and Best Practices
5. Benefits of Oracle VM and OMCS

Oracle Managed Cloud Services (OMCS)

Three Ways for Customers to Use Oracle Software

1. Purchase Software Product and self-host
2. Subscribe to Software as a Service (SaaS)
3. Have OMCS host the customer’s software

Oracle SaaS, PaaS, IaaS Cloud Offerings

Software as a Service
• Customers: Marketing, Sales, Service
• People: Global Human Resources, Talent Management
• Business: Financials, Procurement, Project Portfolio Management
• Supply Chain: Value Chain Execution, Product Value Chain
• Enterprise Performance: Enterprise Planning, Financial Planning
• Social: Social Network, Social Marketing, Social Engagement & Monitoring, Social Data & Insight

Platform as a Service
• Database, Java, Database Backup, Developer, Documents, Business Intelligence, Mobile

Infrastructure as a Service
• Compute, Storage, Messaging

Cloud Marketplace

Multi-Tenant, Shared Machines (typically): Oracle owns Hardware and Software – Customer pays for usage

Oracle Managed Cloud Services (OMCS)

Technology Managed Cloud Service
• Oracle Database, Fusion Middleware, WebCenter, Engineered Systems, Identity Management

Applications Managed Cloud Service
• E-Business Suite, PeopleSoft, Siebel, JD Edwards, Hyperion, Business Intelligence, Commerce, Agile, Retail, Governance Risk & Compliance
• Fusion Applications, Demand Management, Markdown Optimization, Information Discovery, Project Management, Beehive Collaboration, Transportation Management, User Productivity Kit, Retail Predictive Application

Extended Managed Cloud Service
• Backup, Refresh, Upgrade, Migration, CEMLI Management, Business Transaction Monitoring, Security, PCI & HIPAA Compliance, Disaster Recovery, Non-Production Environment Service, Other Extended Services

Single-Tenant, Dedicated Machines: Customer owns Software – Oracle owns Hardware and manages everything

Typically in the Oracle Data Center, but sometimes @customer/partner

Oracle Managed Cloud Services (OMCS)

Why use OMCS

– Single Provider: Hardware, Software, Network, Storage, Interoperability. If there’s a problem, it’s Oracle’s problem.
– Expertise: Let Oracle manage Oracle. Large expert pool available around the clock. Direct Access to Product Development Groups.
– Leverage: Design & Optimize once, Repeat often

Oracle Virtualization Strategy

• At the core of Oracle’s cloud strategy

• Integrated VM lifecycle & cloud management solution with Oracle Enterprise Manager

• Supports both x86 and SPARC

• Integrated with OpenStack

• Cloud platform for Oracle & Non-Oracle applications
  – Supports Oracle Linux, Oracle Solaris, Microsoft Windows

About Oracle VM

x86 and SPARC

Oracle Virtual Networking

Oracle Linux, Oracle Solaris

Oracle VM Templates: Oracle Real Application Clusters (RAC), Oracle E-Business Suite, Oracle JD Edwards EnterpriseOne, …

Oracle Enterprise Manager

Integrated for scale & ease of deployment

Boosts Performance by 30%.

End-to-end Management: Physical to Virtual to Cloud

OMCS Design Criteria

1. Security

2. Isolation of Users (Customers)

3. Stability

4. Disaster Containment

5. Large Scale

6. Performance

7. Cost

OMCS Design Criteria: Security

• Protect the Customer’s Data under all circumstances
• Rigorous Security Review Process
  – Deployment Architecture
  – Need-to-know access to data
  – Authentication, Authorization & Audit for all Activities
• Guard against unauthorized Access
  – Intrusion Detection
• Perimeter Security around the Deployment Cloud
  – Additional Segregation & Firewalls within
• Security Patches via Routine Maintenance

Virtualization allows us the luxury of having separate Machines for each Tier within each Instance

OMCS Design Criteria: Isolation of Users (Customers)

• Strong Separation between Customers is mandatory
  – One customer must never see another’s data or activities
• After years of testing and Operation
  – Oracle VM Virtual Machines are proven just as safe as physical machines
  – Virtual disks with backing store from a storage pool are safe
• Flat Network alone doesn’t isolate enough for OMCS
  – iptables + ebtables + Perimeter Firewall provide Security
  – vLAN + Internal Firewall provide Isolation

Oracle VM Virtual Machines plus vLANs meet our Segregation Requirements

OMCS Design Criteria: Stability

• Preservation of Data must never be in question

• What matters is the Customer’s App
  – Infrastructure components have to support this goal
• Avoid Single Points of Failure
  – Redundancy wherever possible

Choose proven, stable Infrastructure Components with active/active or active/passive Failover Capabilities

OMCS Design Criteria: Disaster Containment

• In spite of the best prevention, high-impact infrastructure breakdowns can happen

• Backups, of course
  – Online snapshots, on-site storage, off-site
• Limit the number of VMs that can be affected
• Fully segregated Zones
• We accept certain limitations
  – e.g. limited migration mobility between zones
• Optional Disaster Recovery at a different Data Center

We partition each data center into smaller self-contained “Zones”

OMCS Design Criteria: Large Scale

• Design for 100,000 Virtual Machines
  – Actual number of deployed VMs is approaching 20,000
• Allow very large VMs
  – 100+ vCPUs and multi-TB memory
  – Typical today: 4 – 32 vCPUs and 16 – 128 GB
  – Typically 4 – 8 GB per vCPU for Oracle Applications
• Accommodate Multiple concurrent Operators
  – Start, stop, resize, clone, etc.
• Service 1,000+ Customers
  – Each Customer is a Corporation
  – Quick turn-around time at scale

Oracle VM 3 with multiple Oracle VM Managers and lots of Server Pools

OMCS Design Criteria: Performance

• The Deployment Architecture must not substantially limit performance provided by the underlying raw hardware

• The Networking Stack in VMs must run at GigE+ speeds

• Use the best Virtualization Method available for each use case

Select Paravirtualization wherever possible.

Otherwise Hardware Virtualization with PV Drivers

OMCS Design Criteria: Cost

Reduce Cost through standard building blocks and repeatable Process

• Standard Hardware
  – Over-provisioning in the interest of Uniformity is acceptable
• “Certified Configuration” images for all Applications and Databases
  – Pre-Design a library of standard building blocks
  – Invest in Tuning and Testing
  – Re-use these for every customer
• Repeatable Standard Process Cookbook
  – Automation wherever possible
• Share Infrastructure where Possible
  – Segregate where Necessary

Standard Hardware stays fixed for one model year
Certified Configurations with Periodic Updates
Process Cookbooks

Oracle VM in OMCS

Timeline (year markers on the slide: 2005, 2006 – 2007, 2007, 2008, 2014):

• Introduced first Xen-based Virtual Machine
• Evangelize, Certify, Prove Security
• Experiment with HVM vs. ParaVirt vs. other
• Introduce Windows VM
• Performance Optimization Work
• Oracle VM 0.9 (pre-release and joint Beta), Linux Paravirtualized only
• Oracle VM Pre-release beta
• Oracle VM Initial Release
• All Server Deployments are virtualized by default
• Oracle VM 2.2.2 in use in some Legacy Zones
• Oracle VM 3.2.4 in use since 2012
• All Server Virtualization in OMCS uses Oracle VM
• 15,000+ VMs in operation

Oracle VM Deployment in OMCS

Zone components: Network, Server Pool, Storage, Oracle VM Manager

• Self-contained Zones
  – Even a catastrophic failure of one zone cannot affect the other zones
• Network: all required networking equipment
  – Switch/Router, Load Balancer, Firewall, Security
  – Single switch hop from any node to any node in the zone (full 10GigE bandwidth, no shared uplinks)
• Storage: Redundant Storage
  – NAS per Server Pool
  – SAN per Zone
• Server Pool: Four Oracle VM Server Pools with 12 physical servers each (48 total)
• Oracle VM Manager: One Oracle VM Manager Instance
• Also: Legacy Zones
  – Giant Zones, being migrated / converted
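
The zone layout lends itself to a quick capacity check. The sketch below combines the per-zone figures from this slide with two numbers stated elsewhere in this deck (a target of 100 zones and a design goal of 100,000 VMs). The function names are illustrative, and it assumes every zone uses the standard layout, which the legacy zones do not. The VMs-per-server result is derived arithmetic, not a figure from the presentation.

```python
# Back-of-the-envelope zone capacity, using figures quoted in this deck.
SERVER_POOLS_PER_ZONE = 4      # "Four Oracle VM Server Pools"
SERVERS_PER_POOL = 12          # "12 physical servers each"
TARGET_ZONES = 100             # "Target 100 zones total"
DESIGN_TARGET_VMS = 100_000    # "Design for 100,000 Virtual Machines"


def servers_per_zone() -> int:
    """Physical servers in one standard zone."""
    return SERVER_POOLS_PER_ZONE * SERVERS_PER_POOL


def average_vms_per_server(total_vms: int, zones: int) -> float:
    """Average VM density if all zones follow the standard layout (assumption)."""
    return total_vms / (zones * servers_per_zone())


if __name__ == "__main__":
    print("servers per zone:", servers_per_zone())
    print("servers at target scale:", TARGET_ZONES * servers_per_zone())
    print("average VMs per server:",
          round(average_vms_per_server(DESIGN_TARGET_VMS, TARGET_ZONES), 1))
```

Under these assumptions the design target works out to roughly 21 VMs per physical server across 4,800 servers.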

Global Deployment

[Diagram: Data Center 1 and Data Center 2; Zones 1 to 3, each with Network, Server Pool, Storage, and Oracle VM Manager; Shared Services; one global Enterprise Manager]

• 7 Data Centers
  – 4 Primary
  – 3 Disaster Recovery
• Multiple Zones per Data Center
  – Target 100 zones total
• Shared Service Zone
  – Redundant or non-critical
• Enterprise Manager
  – One global Instance
  – Redundant

What do OMCS Customers share?

• Shared
  – Data Center Real Estate
  – Power, Cooling
  – Generic Network: Internet, WAN, LAN
  – Storage Pool
  – Physical Server Pool
• Dedicated
  – Customer Network: WAN and/or VPN, DNS Name Space, LAN Subnets and VLAN
  – Customer-specific Gear
  – Customer Storage: Shares, Projects, LUNs
  – Machines: Virtual and Physical

OMCS Server Hardware

• Standard Building Block
  – Sun Server X4-2 with 24 cores (48 Threads) and 512 GB memory
  – CPU oversubscription yields approx. 50 – 100 vCPUs
  – Bonded dual 10GigE NIC
• Specialty Configurations
  – Sun Server X4-4 for high performance applications
  – Sun Server X4-2L with SSD for low latency transient storage
  – 128 GB memory configuration for certain 32-bit VMs
  – FibreChannel and Infiniband
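
To see how the vCPU range relates to the hardware, the following sketch works through the arithmetic for the standard building block. It uses the 48 threads, 512 GB, and 50 – 100 vCPU figures from this slide and the 4 – 8 GB per vCPU guideline from the Large Scale slide. It ignores memory reserved for the hypervisor and Dom-0, which is a simplifying assumption, and the function names are illustrative.

```python
# Sizing arithmetic for the standard building block (figures from this deck).
THREADS = 48                   # 24 cores, 48 hardware threads
MEMORY_GB = 512
VCPU_RANGE = (50, 100)         # "approx. 50 - 100 vCPUs"
GB_PER_VCPU_RANGE = (4, 8)     # "Typically 4-8 GB per vCPU"


def oversubscription_ratio(vcpus: int, threads: int = THREADS) -> float:
    """vCPUs handed out per hardware thread."""
    return vcpus / threads


def vcpus_supported_by_memory(memory_gb: int, gb_per_vcpu: int) -> int:
    """How many vCPUs the memory can back at a given GB-per-vCPU guideline.

    Assumption: all host memory is available to guests.
    """
    return memory_gb // gb_per_vcpu


if __name__ == "__main__":
    for vcpus in VCPU_RANGE:
        print(vcpus, "vCPUs ->", round(oversubscription_ratio(vcpus), 2), "per thread")
    for gb in GB_PER_VCPU_RANGE:
        print(gb, "GB/vCPU ->", vcpus_supported_by_memory(MEMORY_GB, gb), "vCPUs")
```

At 8 GB per vCPU the memory backs 64 vCPUs and at 4 GB per vCPU it backs 128, which brackets the quoted 50 – 100 vCPU range. The quoted range corresponds to roughly one to two vCPUs per hardware thread.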

Networking

• Bonded dual 10GigE NIC per physical Server
  – vLANs in the Hypervisor
• Administrative vLANs
  – Dom-0 login
  – Cluster Heartbeat
  – Live Migration
  – NFS Network
• Separate vLANs per Customer
  – Public and Private Middle Tier Access
  – Privileged and Restricted Database Access
  – Database Cluster Interconnect
• PVLAN and XVLAN

Engineered Systems

Engineered Systems are an integral part of OMCS Deployments

• Exadata Database Machine: Physical Machines only
• Oracle Big Data Appliance, Oracle Exalytics In-Memory Machine: Physical Machines only
• Oracle Exalogic Elastic Cloud: With Server Virtualization, IB Partitioning
• Oracle SuperCluster: With Hardware Virtualization

Standard and Custom Automation

• Custom Scripting against the Oracle VM Manager and Oracle Enterprise Manager APIs

• Library of Partial and complete Workflows

• Build abstract composite objects (“Instance”) in one command

• Infrastructure Provisioning
  – Subnet, IP, DNS, vLAN, Firewall, LoadBalancer

• Application Provisioning

Automate Provisioning of Server, Network, Storage at once
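
One way to picture building a composite “Instance” in one command is as a function that expands a single request into an ordered list of provisioning steps. The sketch below is purely hypothetical: it does not call the Oracle VM Manager or Oracle Enterprise Manager APIs, and `InstancePlan`, `build_instance_plan`, and the step ordering are assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import List

# Infrastructure items named on the slide, in an assumed dependency order.
INFRASTRUCTURE_STEPS = ["subnet", "ip", "dns", "vlan", "firewall", "load_balancer"]


@dataclass
class InstancePlan:
    """Hypothetical composite object: everything one customer instance needs."""
    customer: str
    name: str
    steps: List[str] = field(default_factory=list)


def build_instance_plan(customer: str, name: str, tiers: List[str]) -> InstancePlan:
    """Expand one request into an ordered list of provisioning steps."""
    plan = InstancePlan(customer=customer, name=name)
    # Infrastructure first, so that VMs have a network to attach to.
    plan.steps += [f"provision {item}" for item in INFRASTRUCTURE_STEPS]
    # One VM with its storage per tier, then the application on top.
    for tier in tiers:
        plan.steps.append(f"create storage for {tier} VM")
        plan.steps.append(f"create {tier} VM")
    plan.steps.append("provision application")
    return plan


if __name__ == "__main__":
    for step in build_instance_plan("example-customer", "prod", ["database", "middle-tier"]).steps:
        print(step)
```

Keeping the plan as plain data makes it possible to review or replay a workflow before any real provisioning call is made.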

Do we use Physical Machines?

No, not normally.
• Virtualization is the Default
• We even virtualize single VMs using the whole machine

Yes, sometimes.
• Third-party Applications which are not certified as VMs
• Specialty Applications: Appliances, some Infiniband, FC, DAS

Lessons Learned

• Don’t hesitate to virtualize machines
  – Oracle VM is mature, stable, enterprise-proven
• Don’t put all your eggs in one basket
  – Compartmentalize large domains into smaller zones
• Oracle VM Managers can themselves be hosted in VMs
  – Just no circular References!
• Virtualization is also a great tool to right-size machines for License Compliance
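
The “no circular references” rule can be checked mechanically. The sketch below assumes a simple inventory in which each manager is mapped to the manager that controls the server pool it runs in (or `None` if it runs outside any managed pool). The data model and the function name `find_circular_hosting` are illustrative and are not part of any Oracle tool.

```python
from typing import Dict, List, Optional


def find_circular_hosting(hosted_by: Dict[str, Optional[str]]) -> List[str]:
    """Return managers whose hosting chain loops back on itself.

    hosted_by maps a manager name to the manager controlling the pool that
    hosts it, or None when it does not run inside a managed pool.
    """
    offenders = []
    for start in hosted_by:
        seen = set()
        current: Optional[str] = start
        # Walk up the hosting chain until it ends or repeats.
        while current is not None and current not in seen:
            seen.add(current)
            current = hosted_by.get(current)
        if current is not None:  # the walk stopped because a manager repeated
            offenders.append(start)
    return sorted(offenders)


if __name__ == "__main__":
    inventory = {"mgr-a": "mgr-b", "mgr-b": "mgr-a", "mgr-c": None}
    print(find_circular_hosting(inventory))
```

A manager that hosts itself, or two managers that host each other, could not be restarted after a zone-wide outage, which is why such chains are flagged.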

Re-Sizing VMs

We always stop / start the VM after vCPU or Memory resizing

Reasons:
1. Most Applications choke on resource reductions
2. Most Applications ignore resource increases
3. Those that can deal with it require re-tuning

Live Migration and Ksplice

• Ksplice
  – Allows us to patch the kernel in the running VM
• Live Migration
  – Used occasionally for Server (HW) Maintenance
  – Used occasionally for Capacity Rebalancing

We cannot always impose a reboot (downtime) on our customers

How We Migrate Machines

• Our Data Center safety zones prevent Live Migration across Zones
• Occasional Live Migrations within a Zone
• The majority of moves are Cold Migrations
  a) Shutdown
  b) Image move/copy across zone via Router
  c) Restart
• Special Case: LiveMigration-to-self
  – Useful to re-initialize certain driver functions
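
The migration rules on this slide reduce to a small decision: live migration is only available inside a zone, and everything else is a cold migration. The sketch below encodes that rule. The function and constant names are illustrative, and treating a cross-zone live request as an error is an assumption about how an operator tool might behave.

```python
from typing import List

# Cold migration steps as listed on the slide.
COLD_MIGRATION_STEPS: List[str] = [
    "shutdown",
    "image move/copy across zone via router",
    "restart",
]


def migration_method(source_zone: str, target_zone: str, want_live: bool) -> str:
    """Pick 'live' or 'cold' for a move between two zones."""
    same_zone = source_zone == target_zone
    if want_live and not same_zone:
        # Zones are fully segregated, so live migration cannot cross them.
        raise ValueError("live migration across zones is not possible")
    return "live" if want_live else "cold"


if __name__ == "__main__":
    print(migration_method("zone-1", "zone-1", want_live=True))
    print(migration_method("zone-1", "zone-2", want_live=False))
    print(COLD_MIGRATION_STEPS)
```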

The VM Tetris Problem

32-bit VMs have to reside in lowest 128GB of physical Memory

Start 32-bit VMs first, then 64-bit

Repeated start-stop cycles of mixed 32-bit and 64-bit VMs lead to fragmentation

Eventually, no low memory can be found, and VMs fail to start or live migrate

Our Solution: Limit 32-bit VMs to small physical machines (128GB memory)
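
A toy model helps show why start order matters. The sketch below tracks only two counters, free memory below and above the 128 GB boundary, and assumes that a 64-bit VM consumes low memory first (a worst-case assumption for illustration, not a description of how the hypervisor actually allocates). It does not model fragmentation of individual memory ranges. The `Host` class is hypothetical.

```python
LOW_LIMIT_GB = 128  # 32-bit VMs must reside below this boundary


class Host:
    """Toy host model: counts free memory below and above the low boundary."""

    def __init__(self, total_gb: int) -> None:
        self.low_free = min(total_gb, LOW_LIMIT_GB)
        self.high_free = max(total_gb - LOW_LIMIT_GB, 0)

    def start_vm(self, size_gb: int, is_32bit: bool) -> bool:
        """Try to start a VM; return False if it does not fit."""
        if is_32bit:
            if size_gb > self.low_free:
                return False
            self.low_free -= size_gb
            return True
        if size_gb > self.low_free + self.high_free:
            return False
        # Worst-case assumption: a 64-bit VM takes low memory first.
        from_low = min(size_gb, self.low_free)
        self.low_free -= from_low
        self.high_free -= size_gb - from_low
        return True


if __name__ == "__main__":
    bad_order = Host(512)
    bad_order.start_vm(120, is_32bit=False)
    print("32-bit VM after 64-bit VM:", bad_order.start_vm(16, is_32bit=True))

    good_order = Host(512)
    good_order.start_vm(16, is_32bit=True)
    print("64-bit VM after 32-bit VM:", good_order.start_vm(120, is_32bit=False))
```

On a 128 GB host all memory is low memory, so the ordering problem cannot arise, which is consistent with the solution stated above.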

Core Dumps

• Core Dumps can become large and take a long time
• Our Environment:
  – Hypervisor: core dumps entire memory to local disk
  – VMs: core dump to local disk
  – NetDump no longer used (too slow)

Performance Considerations

• We found that paravirtualized VMs run at nearly the native speed of the physical machine

• HVM imposes some performance decrease, which can be greatly reduced through PV drivers

• The Network stack is fast enough for most applications

We treat VMs as fully equivalent to Physical Machines

The Advantages of Oracle VM far outweigh some negligible performance loss

Designed & Tested Together

• Oracle Managed Cloud Services – 15,000+ VMs

• Internal Testing

– 22,700 Oracle x86 servers supporting 182,400 Oracle Virtual Machines

– 26,700,000 test and production hours per week

– Workloads: software/hardware development, corporate infrastructure

• Test Environments

– Oracle x86 Server Hardware

– Oracle Storage

– Oracle Operating Systems (Oracle Solaris and Oracle Linux)

– Oracle VM

– Oracle Database, Oracle Middleware, Oracle Applications

Oracle Develops & Uses The Stack Internally

Stay Connected: Join the Oracle VM and Oracle Cloud Communities

Oracle VM
• @ORCL_Virtualize
• Facebook.com/OracleVirtualization
• Blogs.oracle.com/virtualization
• Oracle VM Group
• YouTube.com/OracleVirtualization
• Download: edelivery.oracle.com/oraclevm
• Visit us: oracle.com/virtualization

Oracle Cloud
• @OracleCloudZone, #OracleCloud
• Facebook.com/OracleCloudComputing
• Blogs.oracle.com/cloud
• Learn more: oracle.com/cloud
• Try now: cloud.oracle.com

What OMCS Customers Are Saying:

“We are a diverse business with a multitude of internet and media properties growing at varying rates, as well as frequent acquisitions and divestures with a global reach. Oracle Managed Cloud Services provides us with the scalability and flexibility to successfully manage an expansive and complex organization that supports several finance and accounting management teams.”
Paul Scribano, Vice President, Finance, Mindspark Interactive Inc., part of IAC Search and Media Inc., IAC

“We have a small IT staff, so we must work very efficiently and optimize our resources. Oracle Managed Cloud Services is critical to our ability to run and optimize our Oracle E-Business Suite environment. It ensures extremely high availability, timely patches and maintenance, industry-leading 24/7 support, and world-class system backup and recovery.”
Cindy Shieh, Information Systems Manager, Greenball Corp.

“Oracle E-Business Suite running through Oracle Managed Cloud Services provides a compelling value proposition for Genworth Financial. It allows us to take advantage of industry-leading enterprise applications and gain the expertise of Oracle managing the applications on Oracle technology.”
JP Raffenot, Director of IT/Applications, Genworth Financial Inc.