High Availability for Enterprise Clouds

High Availability for Enterprise Clouds: Oracle Solaris Cluster and OpenStack

Eve Kleinknecht, Principal Product Manager

Thorsten Früauf, Principal Software Engineer

November 18, 2015

Copyright © 2015, Oracle and/or its affiliates. All rights reserved.

Safe Harbor Statement

The following is intended to outline our general product direction. It is intended for information purposes only, and may not be incorporated into any contract. It is not a commitment to deliver any material, code, or functionality, and should not be relied upon in making purchasing decisions. The development, release, and timing of any features or functionality described for Oracle’s products remains at the sole discretion of Oracle.


Agenda

1. OpenStack on Oracle Solaris
2. Oracle Solaris Cluster for OpenStack
3. HA for OpenStack cloud controller on Oracle Solaris
   – two main topologies to achieve HA
     • fine grained approach
     • blackbox approach
   – pros/cons for those topologies
4. Discussion - Q / A


OpenStack Overview
What is OpenStack?

• Open source cloud software
  – Generic solution for IaaS, PaaS and SaaS
• Oracle OpenStack optimized for
  – Database as a Service, Java as a Service
• Combines compute, network and storage resources
  – Self-service dashboard
  – Services exposed through REST APIs

[Figure: a single management pane over VMs and virtualized data center resources]


OpenStack Services
Overview of Core Components

Component   Description
Nova        Compute virtualization
Glance      Image management and deployment
Cinder      Block storage
Swift       Object storage
Neutron     Software defined networking
Heat        Application and VM orchestration
Keystone    Authentication between cloud services
Murano      Application catalog
Horizon     Web based dashboard
Trove       Database as a Service


OpenStack Across Oracle’s Portfolio
Built into the Infrastructure

• Horizon: Centralized Cloud Management
• Nova / Ironic: Self-Service Compute and Bare Metal (Zones and Kernel Zones)
• Neutron: Software Defined Networking (Elastic Virtual Switch and Open vSwitch)
• Cinder / Swift: Cloud Scale Storage (ZFS File System)
• Heat / Glance / Murano / Trove: Platform as a Service (Unified Archives)


Benefits of Running OpenStack on Oracle Solaris
OS. Virtualization. SDN. OpenStack. Complete.

• Engineered for security and compliance
  – Minimal privileges for cloud services
  – Lock down infrastructure with immutability
• Assured reliability and scale
  – Automatic service restart and node dependencies
  – Guaranteed data integrity
• Seamless upgrade, instant roll-back


Mission-Critical Cloud Requirements

If you need:
• Mission-critical service level
• Minimal downtime for maintenance
• Business Continuity

Oracle Solaris Cluster delivers:
• Local, fast, automatic failover for application and services
• Managed switchover of applications and resources among servers or sites
• Safe, reliable, orchestrated recovery from site failure


Oracle Solaris Cluster Functions

• Monitor health of all cluster components:
  – Servers, storage, network, OS, virtual machines, applications
• Deliver resiliency to failures through
  – Hardware redundancy
  – Robust cluster protection algorithms
  – Policy-based cluster infrastructure and applications recovery procedures
• Enable low-impact maintenance


Oracle Solaris Cluster Services

• Data services: failover, scalable
• Storage services: global file system, failover, scalable
• Network services: logical hostname, load balancing
• Dependencies management
• Monitoring services


Applications High Availability

• Built-in application agents

• Fine-grained control of application: specific start, stop and probing procedures

• Do not require any change in application

• Fully tested in physical and virtualized environment

• Build-your-own agent toolkit for easy creation of custom agents


Oracle Solaris Cluster and Virtualization

• Choice of VM or application centric model
• Choice of technology: Oracle VM for SPARC domain or zone
• Built-in asset optimization with load balancing, affinity and dependency management at application or VM level

Application Failover: Fine-grained control of application (app, web, db) inside zone or domain
Workload Failover: Zone or domain is blackbox


Failover Zones: VM HA

• Managed zone switchover with cold, warm or live migration (kernel zone)
• Automatic zone restart or zone failover upon node failure
• No modification of workload
• Dependencies and load management at zone level

Planned Maintenance: Workload migration
Unplanned Outage: Immediate workload restart or failover


Zone Clusters: Application HA with Virtualization

• Application specific protection: policy based management and fault isolation
• Ease of use: configuration and administration across virtual cluster
• Security isolation: delegated administration and security model extended across cluster
• Dependencies and load management at application level

[Figure: three zone clusters (app, db, web) on Solaris 11 nodes]


HA approaches for the OpenStack cloud controller

A) fine grained control over OpenStack services by Solaris Cluster
   ● best practices as found in other Oracle Optimized Solutions for multi-tiered applications and the approach taken on Linux (OpenStack HA guide)
   ● published white paper describes this approach with specific example
   ● prioritize fast failure detection and recovery time of individual services

B) blackbox approach by using HA failover kernel zones
   ● prioritize simplicity of administration
   ● Solaris Cluster manages the kernel zones to protect against global node failures


Example HA node deployment

• Example HA OpenStack node deployment:
  – Clustered cloud controller nodes with Oracle Solaris Cluster (OSC)
  – Clustered Oracle ZFS Storage Appliance (ZFS SA)
    • shared storage for OSC
    • quorum device for OSC
    • Cinder driver for iSCSI targets provided to nova compute
  – Swift storage nodes (optional)
    • configure HA Swift ring


HA for OpenStack cloud controller
fine grained approach (white paper)

• all OpenStack cloud controller components are under cluster control (start, stop, probe)
• IP addresses and shared file systems used by services under cluster control
• usage of the cluster load balancer for scalable services
• define inter-component dependencies on the specific service level
  – orchestration of service start/stop across zones
  – fast failure detection and failover times
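The idea of orchestrating service start/stop from declared inter-component dependencies can be illustrated with a small Python sketch. This is not how Oracle Solaris Cluster implements it; the service names and the dependency map below are hypothetical and only show how a dependency graph yields a start order and, reversed, a stop order.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Hypothetical dependency map (an assumption for illustration):
# each service lists the services that must be running before it starts.
DEPENDS_ON = {
    "mysql": [],
    "rabbitmq": [],
    "keystone": ["mysql"],
    "glance-api": ["keystone", "mysql"],
    "nova-api": ["keystone", "mysql", "rabbitmq"],
    "horizon": ["keystone"],
}


def start_order(depends_on):
    """Return a start order in which every service follows its dependencies."""
    return list(TopologicalSorter(depends_on).static_order())


def stop_order(depends_on):
    """Return a stop order: dependents are stopped before what they depend on."""
    return list(reversed(start_order(depends_on)))


if __name__ == "__main__":
    print("start:", start_order(DEPENDS_ON))
    print("stop: ", stop_order(DEPENDS_ON))
```

In a real cluster the dependencies are declared on cluster resources, and the cluster framework, not a script like this, enforces the ordering across zones and nodes.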


HA for OpenStack cloud controller - HA SMF proxy (1)

• The HA SMF proxy data service is a central component for HA OpenStack in the fine grained topology:
  – implements a dedicated cluster SMF restarter
  – enables/disables SMF services on behalf of cluster
  – ability to specify resource dependencies to other cluster services running in different resource groups, within different zones or nodes for orchestration
  – comes in three flavors: failover, multi-master and scalable


HA for OpenStack cloud controller - HA SMF proxy (2)

• OpenStack components are deeply integrated with SMF on Solaris
  – get started as dedicated non-root UNIX users
  – some with additional or reduced set of privileges configured
  – some making use of a variety of SMF method tokens, to expand SMF properties as option variables for the method script
  – OpenStack components are implemented in Python
    • even the Python method scripts import SMF functions, and thus must be started within an SMF context
    • SMF is also used to capture the sometimes verbose Python messages and stack traces in the dedicated SMF service log file


HA for OpenStack cloud controller - HA SMF proxy (3)

• Generic approach to provide HA for OpenStack SMF services:
  – failover services (stateful active/passive)
    • configure HAStoragePlus/ScalMountPoint resource to store dynamic FS content
    • configure SUNW.LogicalHostname resource for service endpoint
    • configure SUNW.Proxy_SMF_failover resource for SMF service
  – scalable services (stateless active/active)
    • ensure static content is identical across nodes/zones
    • configure failover RG with SUNW.SharedAddress resource for service endpoint
    • configure scalable RG with SUNW.Proxy_SMF_scalable resource for SMF service
• OpenStack service configuration specifies the corresponding IP address and storage managed by the cluster
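The recipe above can be summarized as a small data model. The following Python sketch is only an illustration, not cluster tooling: it lists which resource types the slide names for a failover versus a scalable service. The resource group naming scheme (`<service>-rg`, `<service>-sa-rg`) and the `plan_resources` helper are assumptions made for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Resource:
    group: str    # resource group (RG) name; naming scheme is hypothetical
    rtype: str    # cluster resource type as named on the slide
    purpose: str  # what the resource provides to the service


def plan_resources(service, kind):
    """List the cluster resources for one SMF service, per the slide's recipe."""
    if kind == "failover":
        rg = f"{service}-rg"
        return [
            # The slide also allows ScalMountPoint in place of HAStoragePlus.
            Resource(rg, "SUNW.HAStoragePlus", "dynamic file system content"),
            Resource(rg, "SUNW.LogicalHostname", "service endpoint"),
            Resource(rg, "SUNW.Proxy_SMF_failover", "SMF service"),
        ]
    if kind == "scalable":
        return [
            # The shared address lives in its own failover RG.
            Resource(f"{service}-sa-rg", "SUNW.SharedAddress", "service endpoint"),
            # The SMF proxy runs in a separate scalable RG.
            Resource(f"{service}-rg", "SUNW.Proxy_SMF_scalable", "SMF service"),
        ]
    raise ValueError(f"unknown service kind: {kind!r}")


if __name__ == "__main__":
    for res in plan_resources("keystone", "failover"):
        print(res)
```

The point of the split is visible in the result: a failover service keeps storage, address and proxy in one group that moves as a unit, while a scalable service separates the shared address from the proxy resource.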


Fine grained approach - pros and cons

• Pro:
  – fast failure detection per service
    • option to further improve by adding OpenStack service specific probes
  – fast takeover time in case of unplanned outages
  – usage of cluster load balancer allows stateless services to be configured in a scalable way out of the box (rabbitmq, OpenStack api, Horizon, etc.)
  – matches industry wide approach to provide HA for OpenStack on Linux
• Con:
  – interdigitation with OpenStack installation more involved
    • order of install and some pre-setup and post-setup tasks required for cluster
  – small changes in administration
    • svcadm vs. clrs for OpenStack services
    • zone cluster
  – strict change management required
    • OpenStack upgrade procedure
    • configuration files to be kept in sync across cluster nodes
  – not easy to apply to already existing non-HA OpenStack deployments
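The "service specific probes" listed among the pros can be pictured with a minimal Python sketch. It is not an Oracle Solaris Cluster agent and does not use any cluster API; it only shows the simplest kind of probe, a TCP connect check against a service endpoint. The function name, host and port are assumptions for illustration.

```python
import socket


def tcp_probe(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, unreachable hosts and timeouts.
        return False


if __name__ == "__main__":
    # Hypothetical endpoint; replace with the address the service listens on.
    print("reachable:", tcp_probe("127.0.0.1", 5000))
```

A service specific probe would typically go one step further than a connect check, for example by issuing a request the service must answer correctly, which is what allows faster and more accurate failure detection per service.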


HA for OpenStack cloud controller
blackbox approach with failover zones

• cluster only manages (start, stop, probe) the failover kernel zones
  – optional monitoring of suri used in KZ config
• individual OpenStack services and IP addresses not managed by cluster
• inter-component dependencies can only be configured at kernel zone granularity
  – though there is an option with sczsmf
• ability to distribute kernel zones across global cluster nodes


Blackbox approach - pros and cons

• Pro:
  – separation of cluster and OpenStack installation and upgrade
  – administration and upgrade of OpenStack services near identical to non-HA setup
  – from S11.3 onwards live migration can be used for failover kernel zones to reduce planned downtime considerably
• Con:
  – longer takeover time after node failure (KZ boot in addition)
  – individual OpenStack service failure can't trigger failover
    • rely purely on SMF to detect service failure
    • in case sczsmf is used, conflict with live migration
  – scalability of services requires extra external HA load balancer (hardware or software)


Flexibility through mix and match of topologies

• HA approaches are not either-or - they can be combined
  – start out with blackbox HA
  – separation in tiers allows each tier to be adapted as required
  – ability to use e.g. MySQL cluster within a zone cluster without changing the overall architecture
• both topologies have security isolation between tiers by design
• scalability can be addressed by component as needed by specific use cases
  – some need to scale horizon as users bang on the BUI
  – some may not require BUI, instead focus on usage of OpenStack CLI or Heat
  – option to use cluster load balancer, but also switch to hardware load balancer


Discussion - Q / A


References

• Oracle OpenStack for Oracle Solaris
  http://www.oracle.com/technetwork/server-storage/solaris11/technologies/openstack-2135773.html
• Oracle Solaris Cluster
  http://www.oracle.com/technetwork/server-storage/solaris-cluster/overview/index.html
• Oracle Solaris Cluster technical resources
  http://www.oracle.com/technetwork/server-storage/solaris-cluster/documentation/cluster-how-to-1389544.html
• White Paper: Providing High Availability to the OpenStack Cloud Controller on Oracle Solaris with Oracle Solaris Cluster
  http://www.oracle.com/technetwork/server-storage/solaris-cluster/documentation/ha-for-openstack-cloud-2537455.pdf
