High Availability Almost Everywhere?


Ramon Kagan
Computing and Network Services
York University

Agenda
  Analysis of previous pseudo-high availability solutions
  Additions to those solutions to create high availability or higher availability
  Look at what we have done at YorkU
  What this achieves over and above high availability
  Technical in the middle

The Situation
  Lack of financial resources for a large mid/mainframe true clustering server
  Lack of financial resources for overtime hours for regular maintenance
  SLAs – understood or written

Past Solutions
  Purchase multiple small systems with combined computing power
  Create service "clusters"
  Use of load balancing techniques
    DNS Shuffle records
    Proxies
    Switch/Router load balancing
    Linux Virtual Server (LVS)

Past Solutions – DNS Shuffle
  Advantages
    Simple to set up
    Simple round-robin (RR) scheme
    No special requirements for servers
  Disadvantages
    RR doesn't account for different system types
    Faulty system still receives 1 in every N requests
    Sys admin may not be in control of DNS
    External DNS update only after TTL
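The "1 in every N" disadvantage can be illustrated with a minimal sketch, assuming a plain rotation over three hypothetical addresses. The pool, the function name and the request count are illustrative assumptions and are not taken from the YorkU setup; real resolvers and caches add their own behaviour on top of this.

```python
from collections import Counter
from itertools import cycle


def round_robin_answers(addresses, requests):
    """Return the address handed out for each request under plain round-robin."""
    rotation = cycle(addresses)
    return [next(rotation) for _ in range(requests)]


# Hypothetical pool; 192.0.2.0/24 is a documentation range, not a real server farm.
pool = ["192.0.2.1", "192.0.2.2", "192.0.2.3"]
faulty = "192.0.2.2"

answers = round_robin_answers(pool, 300)
hits = Counter(answers)

# Round-robin has no health check, so the faulty host keeps its 1-in-N share.
print(f"{hits[faulty]} of {len(answers)} requests went to the faulty host")
```

With three addresses in rotation, a third of the requests keep landing on the faulty host until someone edits the record and the TTL expires.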

Past Solutions – Proxies
  Advantages
    Sys admin in full control
    Easily updated on the fly
    No special requirements for servers
    Ability to log activity
  Disadvantages
    Single point of failure
    Proxy must be capable of handling the entire bandwidth
    Faulty system still receives traffic until redirection rules are modified

Past Solutions – Switch/Router Load Balancing
  Advantages
    No additional hardware
    Less "bouncing" around
    No special requirements for servers
  Disadvantages
    Expensive licensing
    Sys admin may not be in control
    Faulty system still receives traffic – health checking still not reliable and delay may be over 30 sec

Past Solutions – LVS - NAT
  Advantages
    No special requirements for servers
    Load balancing across VLANs, algorithms
    Easily updated on the fly
    Sys admin in control
  Disadvantages
    Single point of failure
    Server must be capable of handling the entire bandwidth
    Faulty server still receives traffic
    Issues with persistent connections
    Good command of iptables a must

Past Solutions – LVS - DR
  Advantages
    Easy setup
    Bandwidth minimal
    Director requirements minimal
    Load balancing algorithms
  Disadvantages
    Single point of failure
    Load balancing on a single VLAN
    Extra Ethernet card for Windows servers
    Faulty server still receives traffic
    Issues with persistent connections

Past Solutions - Summary
  None meet all the requirements
  Need to address single points of failure for proxies and LVS
  Need to address faulty system redirection for all
  High bandwidth services not practical for proxies and LVS – NAT
  DNS solutions have TTL issues – especially in failures

KeepaliveD
  Addresses single point of failure
    Director availability using failover protocols – VRRPv2 (RFC 2338)
  Addresses faulty system redirections
    Real server availability using health-checking
  Designed for LVS
  Can be manipulated to increase availability for non-LVS Linux-based services

Terminology
  VIP – virtual IP, aka service IP
    e.g. webmail.yorku.ca
  Real Server – actual host of the service
  Server Pool – farm of real servers
  Virtual Server – access point to server pool (load balancer or director)
  Virtual Service – service being served by virtual server under VIP

Health Checking Framework
  4 avenues for health checking
    TCP_CHECK – layer 4, basic vanilla TCP connection attempt
    HTTP_GET – layer 5, performs an HTTP GET, computes the MD5 sum and validates it against the expected value
    SSL_GET – same as HTTP_GET but uses SSL connections
    MISC_CHECK – the kitchen sink – define your own test parameters for the service and return 0 or 1
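The digest comparison behind HTTP_GET, and the 0 or 1 return convention of MISC_CHECK, can be sketched roughly as follows. This is an illustration only, not KeepaliveD's own code: it assumes the digest is taken over the returned page content and that 0 means healthy, and the function names and sample page are hypothetical.

```python
import hashlib


def digest_matches(body: bytes, expected_md5: str) -> bool:
    """Compare the MD5 hex digest of a response body with the expected value."""
    return hashlib.md5(body).hexdigest() == expected_md5.lower()


def check_exit_code(body: bytes, expected_md5: str) -> int:
    """Map the result to a 0 (healthy) / 1 (failed) exit code, MISC_CHECK style."""
    return 0 if digest_matches(body, expected_md5) else 1


# Hypothetical page content standing in for a fetched /index.html.
page = b"<html><body>ok</body></html>"
expected = hashlib.md5(page).hexdigest()

print(check_exit_code(page, expected))           # page unchanged
print(check_exit_code(b"error page", expected))  # content differs from the digest
```

A check of this kind catches a server that still accepts TCP connections but serves the wrong content, which a plain TCP_CHECK would miss.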

Failover – VRRP Framework
  Election for control of VIP addresses
  Dynamic failover of IPs on failures
  Main functionalities are:
    Failover
    VRRP instance synch
    Nice fallback
    Advert packet integrity – via IPSEC – Multicasts
    System call capabilities
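The election for control of the VIP addresses can be pictured with a small sketch, assuming the simplest rule: the live director with the highest priority holds the VIPs. The class, the director names and the priorities are hypothetical, and real VRRP also involves advert timers and tie-breaking that are left out here.

```python
from dataclasses import dataclass


@dataclass
class Director:
    name: str
    priority: int
    alive: bool = True


def elect_master(directors):
    """Return the live director with the highest priority, or None if all are down."""
    candidates = [d for d in directors if d.alive]
    return max(candidates, key=lambda d: d.priority, default=None)


# Hypothetical pair of directors; the names and priorities are illustrative.
primary = Director("director-a", priority=250)
standby = Director("director-b", priority=200)
pair = [primary, standby]

print(elect_master(pair).name)   # the higher-priority director holds the VIPs

primary.alive = False            # the primary stops sending adverts
print(elect_master(pair).name)   # the standby takes over the VIPs
```

When the primary returns, the same rule hands the VIPs back to it.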

KeepaliveD & LVS – DR
[Diagram, shown on two slides: a User reaches the Service 1, Service 2 and Service 3 Clusters through a pair of Directors]

KeepaliveD & LVS – DR Configuration Example
Section 1 – Global definitions
Who to notify, how to notify and whom to notify as

    global_defs {
        notification_email {
            [email protected]
        }
        notification_email_from [email protected]
        smtp_server 130.63.236.104
        smtp_connect_timeout 30
        lvs_id CNSLB
    }

KeepaliveD & LVS – DR Configuration Example
Section 2 – VRRP instance definition

    vrrp_instance VI_1 {
        state MASTER
        interface eth0
        virtual_router_id 51
        priority 250
        smtp_alert
        advert_int 1
        authentication {
            auth_type AH
            auth_pass passwrd1
        }
        virtual_ipaddress {
            130.63.236.146
            130.63.236.223
            130.63.236.212
        }
    }

    vrrp_instance VI_2 {
        state BACKUP
        interface eth0
        virtual_router_id 91
        priority 200
        smtp_alert
        advert_int 1
        authentication {
            auth_type AH
            auth_pass passwrd2
        }
        virtual_ipaddress {
            130.63.236.140
            130.63.236.137
        }
    }

KeepaliveD & LVS – DR Configuration Example
Section 3 – Virtual Service Definition

    # OPTERA.CCS.YORKU.CA - HTTP (Port 80)
    virtual_server 130.63.236.137 80 {
        delay_loop 10
        lb_algo wrr
        lb_kind DR
        protocol TCP

        # estrela.ccs.yorku.ca
        real_server 130.63.236.224 80 {
            weight 1
            HTTP_GET {
                url {
                    path /index.html
                    digest 254440db00e00a3eb49b266de0d457c9
                }
                connect_timeout 20
                nb_get_retry 3
                delay_before_retry 15
            }
        }

        # etoile.ccs.yorku.ca
        real_server 130.63.236.225 80 {
            weight 1
            HTTP_GET {
                url {
                    path /index.html
                    digest bd32b6a8c221083362c056c88c2ccb87
                }
                connect_timeout 20
                nb_get_retry 3
                delay_before_retry 15
            }
        }
    }
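The lb_algo wrr setting in the virtual server definition selects weighted round-robin, where the weight of each real server decides its share of new connections. The sketch below is a deliberately simplified illustration that repeats each server according to its weight. It is not the scheduler LVS actually implements, and the weight of 2 is a hypothetical value chosen to make the effect visible.

```python
from itertools import cycle, islice


def weighted_rotation(weights):
    """Yield server names forever, each appearing `weight` times per cycle."""
    expanded = [name for name, weight in weights.items() for _ in range(weight)]
    return cycle(expanded)


# Real server names from the slide; the weight of 2 is a hypothetical change
# (the configuration on the slide gives both servers weight 1).
weights = {"estrela": 2, "etoile": 1}

first_six = list(islice(weighted_rotation(weights), 6))
print(first_six)
```

With both weights left at 1, as in the slide, the rotation degenerates to plain round-robin between the two servers.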

KeepaliveD & LVS – DR Configuration Example
IPVSADM OUTPUT

    orite:~# ipvsadm
    IP Virtual Server version 1.0.10 (size=4096)
    Prot LocalAddress:Port Scheduler Flags
      -> RemoteAddress:Port           Forward Weight ActiveConn InActConn
    TCP  optera.ccs.yorku.ca:www wrr
      -> estrela.ccs.yorku.ca:www     Route   1      0          0
      -> etoile.ccs.yorku.ca:www      Route   1      0          0

KeepaliveD with a Twist
  Some services don't run seamlessly under LVS
    Long-term connection-based services like IMAP
    TCP timeout issues exist with both over- and under-compensating
    Not a real error, but an annoyance for users as the clients needlessly pop up a message
  Need a different way to get closer to HA and load balancing


[Diagram: DNS Shuffle and the IMAP Cluster]


Achievements
  Automatic failover on system down
  TTL problems resolved
Deficiencies
  Health checking – would need to be done at the DNS shuffle record level
  Load of one server is transferred completely to another
  Additional IP address needed PER server in cluster

KeepaliveD with a Twist - Configuration

    vrrp_instance VI_1 {
        state BACKUP
        interface eth0
        virtual_router_id 201
        priority 55
        …
        virtual_ipaddress {
            130.63.236.201
        }
    }

    vrrp_instance VI_2 {
        state MASTER
        interface eth0
        virtual_router_id 202
        priority 60
        …
        virtual_ipaddress {
            130.63.236.202
        }
    }

    vrrp_instance VI_3 {
        state BACKUP
        interface eth0
        virtual_router_id 203
        priority 45
        …
        virtual_ipaddress {
            130.63.236.203
        }
    }

    vrrp_instance VI_4 {
        state BACKUP
        interface eth0
        virtual_router_id 204
        priority 25
        …
        virtual_ipaddress {
            130.63.236.204
        }
    }

    vrrp_instance VI_5 {
        state BACKUP
        interface eth0
        virtual_router_id 205
        priority 15
        …
        virtual_ipaddress {
            130.63.236.205
        }
    }
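One way to read this configuration: each service IP is its own VRRP instance, every server is the preferred owner of one IP, and the same server is a lower-priority backup for the rest. The sketch below illustrates that idea under stated assumptions. The three-server cluster, the server names and the priority table are hypothetical, since the slide shows only one server's side of the configuration.

```python
def vip_owners(priorities, down=frozenset()):
    """Map each VIP to the live server holding the highest priority for it."""
    owners = {}
    for vip, per_server in priorities.items():
        live = {s: p for s, p in per_server.items() if s not in down}
        owners[vip] = max(live, key=live.get) if live else None
    return owners


# Hypothetical three-server cluster: every server is the preferred owner of one
# VIP and a lower-priority backup for the others. Names and numbers are illustrative.
priorities = {
    "vip-1": {"imap1": 60, "imap2": 55, "imap3": 45},
    "vip-2": {"imap1": 45, "imap2": 60, "imap3": 55},
    "vip-3": {"imap1": 55, "imap2": 45, "imap3": 60},
}

print(vip_owners(priorities))                  # normal operation
print(vip_owners(priorities, down={"imap2"}))  # imap2 fails, its VIP moves
```

In this sketch the VIP of the failed server moves as a whole to a single surviving server, which matches the deficiency noted earlier: the load of one server is transferred completely to another.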

(KeepaliveD with a Twist)²
  For some services LVS & KeepaliveD, and KeepaliveD with a Twist, are not enough
    Databases are a prime example (MySQL)
    In a replicated environment only the master must be written to
    LVS & KeepaliveD are excellent for the read-only operations across the replicated environment
    Health checking is only sufficient to validate the service, not to take corrective actions


How do you deal with a master failure?
  Apply LVS & KeepaliveD and KeepaliveD with a Twist simultaneously
    LVS & KeepaliveD for read-only operations
    KeepaliveD with a Twist for master failover, using the system call capabilities to make the necessary changes


[Diagram: two Directors; write operations (Write Ops) go to the master (M)]


    vrrp_instance VI_1 {
        state BACKUP
        interface eth0
        virtual_router_id 96
        priority 100
        advert_int 1
        smtp_alert
        authentication {
            auth_type AH
            auth_pass yourpass
        }
        virtual_ipaddress {
            # mysql.yorku.ca
            130.63.236.230
        }
        notify_master "/usr/local/etc/notify_takeover"
    }

    #!/bin/sh

    # Stop replicating from the old master, then reset this host's binary logs
    # before it takes over as master.
    /usr/bin/mysql -e "stop slave;"
    sleep 2
    /usr/bin/mysql -e "reset master;"

    # Send an empty-bodied mail announcing the takeover.
    /usr/bin/mailx -s "`hostname` has taken over as master for mysql.yorku.ca" [email protected] < /dev/null


(KeepaliveD with a Twist)² – Future Consideration
[Diagram: two Directors in front of three masters (M M M)]

Where we are at YorkU
LVS & KeepaliveD
  Public subnet
    3 directors balancing:
      Mail delivery services (3 x Debian Linux)
        ClamAV, DCC, MIMEDefang, SpamAssassin, Bogofilter, Procmail, Sendmail
      Web-based email for students (3 x Debian Linux)
        Apache, PHP, Horde, IMP, Turba, Mnemo
      Web-based email for staff (2 x Debian Linux)
        Apache, PHP, Horde, IMP, Turba, Mnemo
      Web Registration and Enrolment (2 x Solaris)
        Apache, WebObjects

Where we are at YorkU
LVS & KeepaliveD
    3 directors (cont'd):
      Central web proxy service (2 x Debian Linux)
        Apache2 – mod_proxy & mod_rewrite
      Central web services (2 x Debian Linux)
        Apache & the kitchen sink
      LDAP (3 x Debian Linux)
        OpenLDAP
  Private subnet
    2 directors balancing:
      SSL Proxy Service (2 x Debian Linux)
        Apache2 – mod_proxy

Where we are at YorkU
KeepaliveD with a Twist
  Staff Postoffice – IMAP/POP
    5 Servers (Debian Linux)
    UW-IMAP
  Student Postoffice – IMAP/POP
    3 Servers (Debian Linux)
    Courier – only the IMAP and POP components

Where we are at YorkU
(KeepaliveD with a Twist)²
  Project for 2004
    MySQL (3 x Debian Linux)
    Health checking to be conducted by 3 public subnet directors
    Investigation into multiple masters still pending results

So where's the balance for maintenance?
  LVS & KeepaliveD
    Remove a real server from the service midday with little to no effect on the service
  KeepaliveD with a Twist
    Remove a real server during off-hours (turn off KeepaliveD), work on the server the next day, and add the server back into service during off-hours after maintenance
    Off-hours work can be cron'd

Summary
  KeepaliveD allows us to achieve high availability, or close to it, for many services
  KeepaliveD can be manipulated to be a failover mechanism for Linux systems
  It is possible to balance the high uptime and maintenance paradox in many cases

Questions?
