Designed for Uptime: Hands-on High Availability Lab for SAP HANA on AWS
Abdelrahman Mohamed, Alliances Solution Architect, SUSE
Publication Date: 2020-04-24
Contents
1 Introduction
2 SAP HANA Failover Test Case
3 SAP HANA Primary Database Migration
Revision 1.0 from 24.04.2020
SUSE LLC
10 Canal Park Drive
Suite 200
Cambridge MA 02142
USA
http://www.suse.com/documentation
Copyright © 2018 SUSE LLC and contributors. All rights reserved.
Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2 or (at your option) version 1.3; with the Invariant Section being this copyright notice and license. A copy of the license version 1.2 is included in the section entitled GNU Free Documentation License.
For SUSE trademarks, see the Trademark and Service Mark list at http://www.suse.com/company/legal/. Linux* is a registered trademark of Linus Torvalds. All other third-party trademarks are the property of their respective owners. A trademark symbol (®, ™, etc.) denotes a SUSE trademark; an asterisk (*) denotes a third-party trademark.
All information found in this book has been compiled with utmost attention to detail. However, this does not guarantee complete accuracy. Neither SUSE LLC, the authors, nor the translators shall be held liable for possible errors or the consequences thereof.
1 Introduction
The labs listed in this guide cover the following:
A failover test case from the primary SAP HANA node to the secondary SAP HANA node
Migrating a primary SAP HANA role using the cluster migration commands
1.1 Connectivity Test
The SAP HANA HA cluster primary and secondary nodes are deployed in private subnets. The SAP HANA instances cannot be reached directly over the Internet unless a Virtual Private Network (VPN) or Direct Connect connection is established between the client network you are accessing from and the network where the servers are running.
Use the Bastion Host Server deployed in the public subnet (already deployed or set up using the quick start template) or AWS Systems Manager to access the SAP HANA servers.
EXAMPLE 1: CONNECT TO THE SAP HANA INSTANCES USING THE BASTION HOST SERVER:
1. SSH to the Bastion Host Server:
$ ssh -i <SSH KEY> ec2-user@<BASTION HOST SERVER IP ADDRESS>
2. Verify that you can connect to the SAP HANA cluster nodes from the Bastion Host Server:
$ ssh -i <SSH KEY> ec2-user@<PRIMARY SAP HANA NODE IP ADDRESS>
$ ssh -i <SSH KEY> ec2-user@<SECONDARY SAP HANA NODE IP ADDRESS>
1.2 Validating the Lab Setup
The first thing to do after the SAP HANA systems have been set up with the SUSE High Availability Extension (HAE) using the quick start template is to validate that the cluster has been deployed and is working properly.
Below is a list of the tools and commands used to validate the cluster:
HAWK Web User Interface (UI) and the crm_mon command: cluster tools that help to check the status of the cluster and take actions as needed
hdbnsutil and HDBSettings.sh systemReplicationStatus.py: SAP HANA tools used to show the SAP HANA System Replication (HSR) status and manage replication
SAPHanaSR-showAttr: a SUSE HAE tool that shows the replication status and some other quick reports
1.2.1 Validating the SAP HANA HA Cluster Deployment Status Using the crm_mon Command
EXAMPLE 2: CHECK THE STATUS OF THE SAP HANA CLUSTER USING THE crm_mon COMMAND
1. SSH to the Bastion Host Server. From the Bastion Host Server, you will be able to connect to the private networks that the SAP HANA cluster nodes are located in:
$ ssh -i <SSH KEY> ec2-user@<BASTION HOST SERVER IP ADDRESS>
2. SSH into the primary SAP HANA node, which should be node01:
$ ssh -i <SSH KEY> ec2-user@<PRIMARY SAP HANA NODE IP ADDRESS>
3. Switch to user root:
ec2-user@node01:~> sudo su -
4. Run the crm_mon -rnf1 command, which will show the online nodes, the master SAP HANA database, and the started resources. See the output example below:
node01:~ # crm_mon -rnf1
Stack: corosync
Current DC: node01 (version 1.1.18+20180430.b12c320f5-3.15.1-b12c320f5) - partition with quorum
Last updated: Wed Jan 29 11:37:00 2020
Last change: Wed Jan 29 11:36:12 2020 by root via crm_attribute on node01

2 nodes configured
6 resources configured

Node node01: online
    res_AWS_STONITH (stonith:external/ec2): Started
    res_AWS_IP (ocf::suse:aws-vpc-move-ip): Started
    rsc_SAPHanaTopology_HDB_HDB00 (ocf::suse:SAPHanaTopology): Started
    rsc_SAPHana_HDB_HDB00 (ocf::suse:SAPHana): Master
Node node02: online
    rsc_SAPHanaTopology_HDB_HDB00 (ocf::suse:SAPHanaTopology): Started
    rsc_SAPHana_HDB_HDB00 (ocf::suse:SAPHana): Slave

No inactive resources

Migration Summary:
* Node node01:
* Node node02:
5. Repeat the above-mentioned steps on the secondary SAP HANA node.
1.2.2 Validating the SAP HANA System Replication Status Using the SAPHanaSR-showAttr Tool
EXAMPLE 3: USE THE SAPHanaSR-showAttr TOOL TO CHECK SAP HANA SYSTEM REPLICATION STATUS:
Tip: The following procedures can be done on any cluster node.
1. You must be logged in as the root user.
2. Run the SAPHanaSR-showAttr command on the two nodes. Ensure that the sync_state value is SOK.
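The exact columns vary with the installed SAPHanaSR version, but the output will look roughly like the sketch below (the hostnames and site names HAP/HAS match this lab; the scores and timestamps are illustrative). The sync_state column for the secondary must show SOK:

node01:~ # SAPHanaSR-showAttr
Hosts  clone_state lpa_hdb_lpt node_state op_mode   remoteHost roles                            score site srmode sync_state vhost
----------------------------------------------------------------------------------------------------------------------------------
node01 PROMOTED    1580297820  online     logreplay node02     4:P:master1:master:worker:master 150   HAP  sync   PRIM       node01
node02 DEMOTED     30          online     logreplay node01     4:S:master1:master:worker:master 100   HAS  sync   SOK        node02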
1.2.3 Validating the SAP HANA System Replication Status Using the hdbnsutil Command
EXAMPLE 4: USE THE hdbnsutil COMMAND TO CHECK THE SAP HANA SYSTEM REPLICATION STATUS:
1. SSH into the primary SAP HANA node:
$ ssh -i <SSH KEY> ec2-user@<PRIMARY SAP HANA NODE IP ADDRESS>
2. Switch to the <sid>adm user (which is the hdbadm user in this lab):
ec2-user@node01:~> sudo su - hdbadm
3. Run the hdbnsutil -sr_state command to view the current SAP HANA system replication state. Find below an example from the primary SAP HANA node:
hdbadm@node01:/usr/sap/HDB/HDB00> hdbnsutil -sr_state
System Replication State
~~~~~~~~~~~~~~~~~~~~~~~~

online: true

mode: primary
operation mode: primary
site id: 1
site name: HAP

is source system: true
is secondary/consumer system: false
has secondaries/consumers attached: true
is a takeover active: false

Host Mappings:
~~~~~~~~~~~~~~

node01 -> [HAS] node02
node01 -> [HAP] node01

Site Mappings:
~~~~~~~~~~~~~~
HAP (primary/primary)
|---HAS (sync/logreplay)

Tier of HAP: 1
Tier of HAS: 2

Replication mode of HAP: primary
Replication mode of HAS: sync

Operation mode of HAP: primary
Operation mode of HAS: logreplay

Mapping: HAP -> HAS
done.
4. Repeat the above-mentioned steps on the secondary SAP HANA node.
1.2.4 Validating the SAP HANA System Replication Status Using the systemReplicationStatus.py Script
EXAMPLE 5: USE THE HDBSettings.sh systemReplicationStatus.py PYTHON SCRIPT TO CHECK THE SAP HANA SYSTEM REPLICATION STATE:
1. Stay logged in as the hdbadm user on the primary SAP HANA node.
2. Run the HDBSettings.sh systemReplicationStatus.py command to view the overall replication state. See the example below:
hdbadm@node01:/usr/sap/HDB/HDB00> HDBSettings.sh systemReplicationStatus.py
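The script prints a table of the replication channels. The sketch below is illustrative (ports, services, and volume IDs depend on your instance number and configuration); the healthy state is Replication Status ACTIVE on every channel and an overall status of ACTIVE:

|Database |Host   |Port  |Service Name |Volume ID |Site ID |Site Name |Secondary Host |Secondary Port |Secondary Site Name |Replication Mode |Replication Status |
|-------- |------ |----- |------------ |--------- |------- |--------- |-------------- |-------------- |------------------- |---------------- |------------------ |
|SYSTEMDB |node01 |30001 |nameserver   |        1 |      1 |HAP       |node02         |30001          |HAS                 |SYNC             |ACTIVE             |
|HDB      |node01 |30003 |indexserver  |        3 |      1 |HAP       |node02         |30003          |HAS                 |SYNC             |ACTIVE             |

status system replication site "2": ACTIVE
overall system replication status: ACTIVE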
3. Type exit to return to user ec2-user.
2 SAP HANA Failover Test Case
We will test a failover scenario, observing how the SUSE HA cluster behaves when the primary HANA database crashes.
What is the current situation?
The primary HANA database is running on the node01 host.
What is the expected outcome?
1. The cluster detects the stopped primary HANA database (on node01) and marks the resource failed.
2. The cluster promotes the secondary HANA database (on node02) to take over as primary.
3. The cluster migrates the IP address to the new primary (on node02).
4. After some time, the cluster shows the sync_state of the stopped primary (on node01) as SFAIL.
5. Because the AUTOMATED_REGISTER value is set to true, the cluster registers the failed HANA database against the new primary.
6. After the automated register and resource refresh, the system replication pair is marked as in sync (SOK).
Important: PREFER_SITE_TAKEOVER and AUTOMATED_REGISTER are important SAPHana resource agent (RA) settings that govern the cluster behavior as follows:
PREFER_SITE_TAKEOVER defines whether the SAPHana resource agent (RA) prefers to switch over to the slave instance instead of restarting the master locally. The default value is yes.
AUTOMATED_REGISTER defines whether a former primary should be automatically registered to be secondary to the new primary. With this parameter, you can adapt the level of system replication automation. The default value is false.
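Both parameters are set as instance attributes on the SAPHana primitive. Below is a minimal sketch of how the resource definition might look in crm configure show for this lab's SID and instance number; the DUPLICATE_PRIMARY_TIMEOUT value and the operation timeouts are illustrative and deployment-specific:

node01:~ # crm configure show rsc_SAPHana_HDB_HDB00
primitive rsc_SAPHana_HDB_HDB00 ocf:suse:SAPHana \
        params SID=HDB InstanceNumber=00 PREFER_SITE_TAKEOVER=true \
               DUPLICATE_PRIMARY_TIMEOUT=7200 AUTOMATED_REGISTER=true \
        op monitor interval=60 role=Master timeout=700 \
        op monitor interval=61 role=Slave timeout=700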
EXAMPLE 6: SIMULATE A COMPLETE BREAKDOWN OF THE PRIMARY DATABASE SYSTEM ON NODE01:
1. Connect via SSH to the Bastion Host Server and launch two terminals:
$ ssh -i <SSH KEY> ec2-user@<BASTION HOST SERVER IP ADDRESS>
2. From the Bastion Host Server, connect via SSH to the primary and secondary HANA nodes:
$ ssh -i <SSH KEY> ec2-user@<PRIMARY SAP HANA NODE IP ADDRESS>
$ ssh -i <SSH KEY> ec2-user@<SECONDARY SAP HANA NODE IP ADDRESS>
3. Switch to user root on both nodes:
ec2-user@node01:~> sudo su -
ec2-user@node02:~> sudo su -
4. From the secondary node (node02), run the command crm_mon -rnf to see the current cluster status:
node02:~ # crm_mon -rnf
5. From the primary node (node01), run the command SAPHanaSR-showAttr to ensure the sync_state is SOK:
node01:~ # SAPHanaSR-showAttr
6. From the primary node (node01), switch to user <sid>adm (which is the hdbadm user in this lab), then run the command HDB kill -9:
node01:~ # sudo su - hdbadm
hdbadm@node01:/usr/sap/HDB/HDB00> HDB kill -9
7. Keep an eye on the running cluster monitoring window on node02 to see the expected outcome that was shown at the beginning.
Tip: As we are working in a test environment, the failover may take 2-3 minutes.
8. After the failover has completed successfully, validate that the node02 host owns the Overlay IP and is the new HANA primary node. As root on the node02 host, run the command crm_mon -rnf1:
node02:~ # crm_mon -rnf1
Stack: corosync
Current DC: node01 (version 1.1.18+20180430.b12c320f5-3.15.1-b12c320f5) - partition with quorum
Last updated: Tue Feb 18 16:45:01 2020
Last change: Tue Feb 18 16:44:41 2020 by root via crm_attribute on node02

2 nodes configured
6 resources configured

Node node01: online
    res_AWS_STONITH (stonith:external/ec2): Started
    rsc_SAPHanaTopology_HDB_HDB00 (ocf::suse:SAPHanaTopology): Started
    rsc_SAPHana_HDB_HDB00 (ocf::suse:SAPHana): Slave
Node node02: online
    res_AWS_IP (ocf::suse:aws-vpc-move-ip): Started
    rsc_SAPHanaTopology_HDB_HDB00 (ocf::suse:SAPHanaTopology): Started
    rsc_SAPHana_HDB_HDB00 (ocf::suse:SAPHana): Master

No inactive resources

Migration Summary:
* Node node01:
   rsc_SAPHana_HDB_HDB00: migration-threshold=5000 fail-count=1 last-failure='Tue Feb 18 16:39:33 2020'
* Node node02:

Failed Actions:
* rsc_SAPHana_HDB_HDB00_monitor_60000 on node01 'master (failed)' (9): call=28, status=complete, exitreason='',
    last-rc-change='Tue Feb 18 16:39:33 2020', queued=0ms, exec=0ms
9. Look at the AWS control panel to validate that the Overlay IP has moved.
10. On node02 (the new primary database), switch to the hdbadm user and execute the following commands to validate the SAP HANA system replication status:
ec2-user@node02:~> sudo su - hdbadm
hdbadm@node02:/usr/sap/HDB/HDB00> HDBSettings.sh systemReplicationStatus.py
hdbadm@node02:/usr/sap/HDB/HDB00> hdbnsutil -sr_state
11. On any node, execute the SAPHanaSR-showAttr command and ensure that the sync_state of the new secondary HANA database (node01) is SOK.
node02:~ # SAPHanaSR-showAttr
12. On any node, switch to user root and run the following command to clean up the failed actions reported by the crm_mon tool:
node01:~ # crm resource refresh rsc_SAPHana_HDB_HDB00
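Depending on the installed crmsh version, the equivalent cleanup command may be crm resource cleanup (the form used in the slide material later in this document):

node01:~ # crm resource cleanup rsc_SAPHana_HDB_HDB00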
3 SAP HANA Primary Database Migration
As a result of crashing the primary HANA database in the previous lab, the current state is as follows:
The primary HANA database is running on the node02 instance.
The secondary HANA database is running on the node01 instance.
This lab aims to exchange the roles of the nodes. This means that at the end the primary should run on the node01 instance and the secondary should run on the node02 instance.
EXAMPLE 7: MOVING AN SAP HANA PRIMARY USING THE CLUSTER TOOLSET:
1. Connect via SSH to the Bastion Host Server and launch two terminals:
$ ssh -i <SSH KEY> ec2-user@<BASTION HOST SERVER IP ADDRESS>
2. From the Bastion Host Server, connect via SSH to the primary and secondary nodes:
$ ssh -i <SSH KEY> ec2-user@<PRIMARY SAP HANA NODE IP ADDRESS>
$ ssh -i <SSH KEY> ec2-user@<SECONDARY SAP HANA NODE IP ADDRESS>
3. On both terminals, switch to user root:
ec2-user@node01:~> sudo su -
ec2-user@node02:~> sudo su -
4. On any node, ensure that the node01 sync_state value is SOK:
node02:~ # SAPHanaSR-showAttr
5. On any node, identify the SAP HANA master/slave resource agent name by executing the following command:
node02:~ # crm resource status
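The output will look roughly like the following sketch (the resource states reflect the situation after the previous lab, with the master on node02; the exact layout depends on the crmsh version):

 res_AWS_STONITH (stonith:external/ec2): Started
 res_AWS_IP      (ocf::suse:aws-vpc-move-ip): Started
 Clone Set: cln_SAPHanaTopology_HDB_HDB00 [rsc_SAPHanaTopology_HDB_HDB00]
     Started: [ node01 node02 ]
 Master/Slave Set: msl_SAPHana_HDB_HDB00 [rsc_SAPHana_HDB_HDB00]
     Masters: [ node02 ]
     Slaves: [ node01 ]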
Tip: In this lab, the SAP HANA master/slave resource agent name is msl_SAPHana_HDB_HDB00.
6. On node01, monitor the cluster status by executing the crm_mon command:
node01:~ # crm_mon -rnf
7. On node02, create a "move away from this node" rule by using the force option:
node02:~ # crm resource move msl_SAPHana_HDB_HDB00 force
What is the expected outcome?
Because of the "move away" (force) rule, the cluster will stop the current primary site (node02). After that, the cluster will promote the slave resource agent on the secondary site (node01) to be the new master if the system replication was in sync before.
Important: Migration without the force option would cause a takeover without the former primary being stopped. Only migration with the force option is supported.
8. Keep an eye on the running cluster monitoring window on node01 to see the expected outcome.
9. Confirm that the secondary node has completely taken over the new primary role by executing the command SAPHanaSR-showAttr. Ensure that the "roles" attribute for the new primary starts with "4:P".
node02:~ # SAPHanaSR-showAttr
10. Clear the SAP HANA master/slave resource agent location constraint ban to allow the cluster to start the new secondary:
node02:~ # crm resource clear msl_SAPHana_HDB_HDB00
11. Confirm that the new secondary has started by executing the command SAPHanaSR-showAttr and checking the "roles" attribute for the new secondary. It must start with "4:S":
node02:~ # SAPHanaSR-showAttr
Designed For Uptime
HOL-1060
Hands-on HA Lab for SAP HANA on AWS
HA Lab For SAP HANA On AWS
1. SUSE + AWS + SAP
2. SAP HANA Quick Start
3. SUSE Cluster Components
4. Lab Introduction
5. Resource Agent Constraints
6. Test and Manage the Cluster
Customer Adoption

[Timeline figure: SUSE-on-AWS milestones from 2010 to today (years marked on the slide: 2014, 2015, 2016, 2019). The key distinguishes three milestone types: First to Market with EC2 (among Enterprise Linux vendors), New SUSE Solution Available on AWS, and SUSE Public Cloud Engineering + EC2 Milestone. Milestones include:]
• SUSE Linux available on AWS
• SAP HANA Quick Start featuring SUSE Linux Enterprise Server for SAP Applications High Availability
• SUSE Manager and SUSE Linux Enterprise Server for SAP Applications available on AWS
• Support for new X1 and X1e EC2 instances (2 TB / 4 TB)
• SUSE Linux Enterprise Server for SAP Applications available on the AWS Marketplace
• Nitro (C5 / M5) instances for SAP
• SUSE Cloud Application Platform on AWS EKS
• Support for new EC2 Bare Metal for HANA
• Support for new i3 instance types
• SUSE Linux Enterprise Server available on the AWS Free Tier
• SUSE Linux Enterprise Server available on AWS GovCloud
• SAP Certified High Availability Solution for SAP HANA supported and available on AWS Quick Start
• SUSE Linux Enterprise Server for SAP Applications available on AWS Marketplace via AISPL (for India)
SUSE + AWS + SAP
SUSE Linux Enterprise Server for SAP Applications

[Feature overview: SAP-specific features from SUSE, grouped by the SUSE products and services that provide them:]
• Reliability and Resilience: SUSE Linux Enterprise High Availability, Remote Storage Encryption Management, SAP HANA HA Resource Agents & Cluster Connector, SAP HANA Firewall
• Performance: Workload Memory Management, Performance Configuration and Tuning
• Ease of Use and Deployment: Installation Wizard & YaST for SAP HA, SUSE Connect, Public Cloud Platform Images, S/4HANA Transition Support, SUSE Package Hub
• Base OS and Support: SUSE Linux Enterprise Server, SAP-Specific Update Channel, 24x7 Priority Support, Extended Service Pack Overlap Support
SUSE and AWS: High Availability Portfolio

• SAP HANA SR Performance Optimized Scenario on AWS
• Multiple SAP HANA systems on one host in a scale-up cluster
• Multi-tenant SAP HANA database
• SAP HANA SR Performance Optimized Scale-out Scenario
• SAP NetWeaver High Availability Cluster
• SUSE is certified to run both ENSA1 and ENSA2 in a high availability cluster
• SAP ASE using SAP ASE Disaster Recovery (white paper release being finalized)
SAP HANA Quick Start
[Diagram: SAP HANA Quick Start architecture - a VPC (172.16.0.0/16) in the AWS Cloud spanning two Availability Zones. Each Availability Zone contains a public subnet (172.16.128.0/20 and 172.16.144.0/20) with a bastion host and a NAT gateway, and a private subnet (172.16.0.0/19 and 172.16.32.0/19). The primary and secondary SAP HANA nodes run in the private subnets in an Auto Scaling group, connected by SAP HANA System Replication.]
HANA HA Step-by-Step

[Diagram sequence: building the HA cluster inside the VPC (172.16.0.0/16), with the primary SAP HANA node in the private subnet 172.16.0.0/19 (Availability Zone 1) and the secondary SAP HANA node in the private subnet 172.16.32.0/19 (Availability Zone 2). The build-up steps are:]
1. ValidateParam: validate the deployment parameters; deploy the primary SAP HANA node.
2. Deploy the secondary SAP HANA node.
3. PreHAConfig: update the network config on both nodes; take a full HANA backup.
4. HAConfig: set up awscli, set up tags, disable the Src/Dest check, and so on; enable HSR and register the secondary with the primary (SAP HANA System Replication).
5. HAConfig: set up the corosync key and copy it to the secondary.
6. HAConfig: create the corosync.conf file; start Pacemaker.
7. HAConfig: create res_AWS_STONITH; create res_AWS_IP.
8. HAConfig: create rsc_SAPHanaTopology; create rsc_SAPHana.
Result: a SUSE HA cluster with an Overlay IP in front of the two nodes and SAP HANA System Replication between them.
SUSE Cluster Components
Routing From Internal

The current overlay IP address agent allows application servers inside a virtual private cloud (VPC) to access a protected SAP HANA server in that VPC, but it doesn't provide access to on-premises applications. It requires applications like HANA Studio to be managed inside the VPC via RDP or a jump server. The Route 53 agent works around this restriction by using a name-based approach to allow on-premises users to connect to the VPC.

The two agents operate in parallel: the overlay IP agent routes traffic from the overlay IP address to the active node, and the Route 53 agent updates the name of the SAP HANA server with the current IP address.
SUSE on AWS: Cluster Agents

The SUSE on AWS scale-up scenario uses OCF agents with SUSE as the provider. The resource agents are found in /usr/lib/ocf/resource.d/suse:
• SAPHana
• SAPHanaTopology
• aws-vpc-move-ip
• Optional: aws-vpc-route53

The AWS STONITH agent is defined as external/ec2:
• Directory path: /usr/lib64/stonith/plugins/external/ec2
SAPHana: SAP HANA Agent

Location: /usr/lib/ocf/resource.d/suse/SAPHana
Use case: Agent for SAP HANA databases in scale-up scenarios
Function:
The agent manages takeover for an SAP HANA database with system replication in an OCF master/slave configuration; the RA performs the actual check of the SAP HANA database instances.
Managing the two SAP HANA instances means that the resource agent controls the start/stop of the instances and checks the synchronization status of the two SAP HANA databases.
The cluster avoids a takeover to the secondary site if the state is not "SOK".
SAPHanaTopology: SAP HANA Agent

Location: /usr/lib/ocf/resource.d/suse/SAPHanaTopology
Use case: Helps to manage two SAP HANA databases with system replication.
Function:
Agent that analyzes the SAP HANA topology and "sends" all findings via the node status attributes to all nodes in the cluster.
aws-vpc-move-ip: The Overlay IP Agent

Location: /usr/lib/ocf/resource.d/suse/aws-vpc-move-ip
Use case: Used for application-server-to-database traffic
Function:
Changes an AWS routing table entry
Sends traffic for an IP address to an instance ID
Checks whether the routing table has the correct setting
ec2: The STONITH Agent

Location: /usr/lib64/stonith/plugins/external/ec2
Use case: Kill the other node using AWS infrastructure
Function:
It fences the other cluster node
It starts and stops the other cluster node
It monitors the other cluster node
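A minimal sketch of how this STONITH resource could be defined with crm, assuming an EC2 instance tag named pacemaker and an AWS CLI profile named cluster (both illustrative, as are the timeouts):

primitive res_AWS_STONITH stonith:external/ec2 \
        params tag=pacemaker profile=cluster \
        op start interval=0 timeout=180 \
        op stop interval=0 timeout=180 \
        op monitor interval=120 timeout=60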
/etc/hosts

This file describes a number of hostname-to-address mappings for the TCP/IP subsystem. It is the recommended name resolution method for a cluster. You will add the cluster nodes to the bastion server's host file. If you are using an On-Demand SUSE subscription, the public cloud update server will be listed in the host file.
Considerations:
Changes to /etc/hosts need to be made across all the nodes.
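For illustration, the bastion host's entries might look like the sketch below; the IP addresses are hypothetical, chosen within this lab's two private subnets:

172.16.1.10    node01   # primary SAP HANA node, AZ 1 (172.16.0.0/19)
172.16.33.10   node02   # secondary SAP HANA node, AZ 2 (172.16.32.0/19)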
Resource Agents and Constraints
Resource Types

• Primitive
• Group
• Clone
• Multi-state
Primitive Resource

aws-vpc-move-ip is a primitive resource without special conditions.
The resource runs as a single instance:
• Runs on one node
It uses a resource agent from a provider.
Configuration includes timeout settings for:
• Start, Stop, Monitor
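A minimal sketch of such a primitive, assuming an illustrative overlay IP address and routing table ID (the real values come from your VPC; the timeouts are deployment-specific):

primitive res_AWS_IP ocf:suse:aws-vpc-move-ip \
        params ip=192.168.10.15 routing_table=rtb-0123456789abcdef0 \
               interface=eth0 profile=cluster \
        op start interval=0 timeout=180 \
        op stop interval=0 timeout=180 \
        op monitor interval=60 timeout=60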
Clone Resource

SAPHanaTopology is a clone resource agent:
• The resource runs simultaneously on both nodes
• To avoid all clones being stopped or started together, use the option interleave=true
• SAPHanaTopology is stateless
• The resource agent state does not interact with the clones running on other nodes
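A sketch of the clone wrapper around the SAPHanaTopology primitive, using the cln_SAPHanaTopology_HDB_HDB00 name referenced elsewhere in this document (the meta attribute values are illustrative):

clone cln_SAPHanaTopology_HDB_HDB00 rsc_SAPHanaTopology_HDB_HDB00 \
        meta clone-node-max=1 interleave=true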
Multi-state Resource

SAPHana is a specialized type of clone resource.
The resource agents are in one of two states, active or passive:
• Active - rsc_SAPHana_HDB_HDB00 (ocf::suse:SAPHana): Master
• Passive - rsc_SAPHana_HDB_HDB00 (ocf::suse:SAPHana): Slave
Multi-state resources add two operations, Promote and Demote:
• Start, Stop, Monitor, Promote and Demote
To avoid all clones being stopped or started together, use the option interleave=true:
• Constraints are used
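A sketch of the multi-state wrapper around the SAPHana primitive, using the msl_SAPHana_HDB_HDB00 name referenced in the labs (the meta attribute values are illustrative):

ms msl_SAPHana_HDB_HDB00 rsc_SAPHana_HDB_HDB00 \
        meta clone-max=2 clone-node-max=1 interleave=true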
Constraints

A constraint is used to influence resource agent behaviour.
Types of constraint:
• Colocation and Order are used in the AWS-deployed Performance Optimized Scenario
Constraints have scores:
• INFINITY - must happen
• -INFINITY - must not happen
• Intermediate values assign relative priority
Colocation Constraints

Colocation constraints tell the cluster that the location of one resource depends on the location of another one. The resources res_AWS_IP and msl_SAPHana_HDB_HDB00 are configured to run on the same node:

crm configure show
colocation col_IP_Primary 2000: res_AWS_IP:Started msl_SAPHana_HDB_HDB00:Master

A positive constraint keeps resources together. When using colocation with a multi-state resource, append one of the following after the resource name: Master or Slave.
Order Constraint - SAP

SAPHanaTopology is started before SAPHana:
• SAPHanaTopology "sends" all findings via the node status attributes to all nodes in the cluster

An order constraint ensures that two resources run in the correct order. The score determines the relationship between the resources: the constraint is mandatory if the score is greater than zero.

crm configure show
order ord_SAPHana 2000: cln_SAPHanaTopology_HDB_HDB00 msl_SAPHana_HDB_HDB00
SUSE HA Lab Commands
crm_mon: Executed Using root

crm_mon provides a summary of the cluster's current state. It outputs varying levels of detail in a number of different formats.
Example: crm_mon -rfn1
-r displays inactive resources
-f displays resource fail counts
-n displays resources by node
-1 displays the cluster status once and exits
SUSE on AWS: Cluster Agents

Node %name%: status
    %given_name% (agent): status/role

Node prihana: online
    rsc_SAPHana_HDB_HDB00 (ocf::suse:SAPHana): Master
    res_AWS_STONITH (stonith:external/ec2): Started
    rsc_SAPHanaTopology_HDB_HDB00 (ocf::suse:SAPHanaTopology): Started
    res_AWS_IP (ocf::suse:aws-vpc-move-ip): Started
Node sechana: online
    rsc_SAPHanaTopology_HDB_HDB00 (ocf::suse:SAPHanaTopology): Started
    rsc_SAPHana_HDB_HDB00 (ocf::suse:SAPHana): Slave
crm: Executed Using root

The crm shell is a command-line based cluster configuration and management tool. crm works both as a command-line tool to be called directly from the system shell, and as an interactive shell with extensive tab completion and help. The crm shell can be used to manage every aspect of configuring and maintaining a cluster.
Example: crm resource cleanup %resource_name%
The command tells the cluster to forget about past states and to reprobe the updated SAP HANA database systems.
SAPHanaSR-showAttr: Executed Using root

SAPHanaSR-showAttr shows Linux cluster attributes for SAP HANA system replication automation. The overall system replication (SR) state is shown as well as the HANA state on each node.
The output shows the following per host: the hostname (Hosts), the state of the Linux cluster resource (clone_state), the sync state (sync_state, e.g. SOK), the actual master score on that node (score), and the site where the host sits (site).
Example: SAPHanaSR-showAttr
Default User Accounts

There are two user accounts, hdbadm and hacluster, that are created during the HA and HANA setup. The passwords are set using the corresponding SAP HANA Quick Start parameter.

hacluster is a Linux user that is created during the SUSE HA configuration and is added to the haclient group so that it has access to the Hawk web cluster console.

hdbadm is a Linux user that is created during the setup and enables administering SAP HANA at the command line on the SUSE server.
hdbnsutil: Executed Using hdbadm

hdbnsutil is an SAP command-line tool that can set up or manage SAP HANA system replication.
Example: hdbnsutil -sr_state
-sr_state shows status information about the system replication site
HDBSettings.sh systemReplicationStatus.py: Executed Using hdbadm

HDBSettings.sh systemReplicationStatus.py is a script that prints a human-readable table of the system replication channels and their status. The key replication status states are located in the Replication Status column, which should show ACTIVE.
The command needs to be executed on the primary.
Example: HDBSettings.sh systemReplicationStatus.py
Test Case: Crash Primary Database
Hands-on Labs
Test Case 1: Crash the Database on the Primary HANA Node (prihana)

Command: HDB kill -9
Expected Outcome:
• The cluster detects the crashed primary HANA database (on prihana) and marks the resource failed
• The cluster promotes the secondary HANA database (on sechana) to take over as primary
• The cluster migrates the IP address to the new primary (on sechana)
• The cluster starts the database on prihana, which joins as the secondary
• Run resource cleanup
[Diagram sequence: failover in the SAP HANA HA Quick Start architecture - a VPC (172.16.0.0/16) with private subnets 172.16.0.0/19 and 172.16.32.0/19 across two Availability Zones, a SUSE HA cluster with an Overlay IP, an NLB, Amazon S3, Amazon Route 53, and a corporate data center with a SAPGUI logon group. The sequence shows:]
1. The HANA database crash on the primary node is detected.
2. SAP HANA takeover: the secondary becomes the new primary.
3. Route table update: the Overlay IP is moved to the new primary.
4. Optional NLB update.
5. The former primary HANA database becomes the secondary replication target.
Manage the Cluster
Change the Primary Cluster Node

• Identify the SAP HANA "Master-Slave" resource agent name.
• Create a "move away from the primary node" rule by using the force option.
• What is the expected outcome?
Because of the "move away" (force) rule, the cluster will stop the current primary site (prihana). After that, the cluster will promote the slave resource agent on the secondary site (sechana) to be the new master if the system replication was in sync before.
Sometimes Things Change…
Overview: Software Management

Software version control is essential. All cluster nodes should run the same software versions:
• Two patch levels within the cluster are allowed while patching the cluster
Good software management is good practice.
Repositories provide updates:
• AWS Marketplace or EC2-purchased instances have access to the Public Cloud Update Infrastructure
• Cluster nodes with a direct connection to SCC - do not use this method
• SUSE Manager server
• An update SMT/RMT server dedicated to the cluster
zypper Command (1/2)

zypper is a command that can install, update, and remove packages, manage repositories, and perform various queries.
There are too many options to list, so here are a few key commands:
zypper in %package name% - installs a package
zypper up %package name% - updates the package to the new version
zypper patch %package name% - patches the current package version
zypper lu - lists all available updates to installed packages
zypper lp - lists all available patches
zypper up - updates all packages to the new version
zypper patch - patches the current version
zypper Command (2/2)

More commands:
zypper lr - lists repositories
zypper search -f %file path% - finds out what package installed the file
zypper info --provides %package name% - finds out what the package provides

Package names for the agents used:
SAPHanaSR - contains SAPHanaTopology and SAPHana
cluster-glue - contains ec2 and aws-vpc-move-ip on 12 SP3 and earlier releases
resource-agents - contains aws-vpc-move-ip on 12 SP4 and later

Quick reference:
https://en.opensuse.org/images/1/17/Zypper-cheat-sheet-1.pdf
https://en.opensuse.org/images/3/30/Zypper-cheat-sheet-2.pdf
Upgrading and Patching

Always check the appropriate documentation. If it is a major release update, read the release notes.
Backup:
• Test that the backup can be restored
Run the procedure in a test environment.
Please submit your questions online