Scalable High Availability Technologies

Gergely Tomka

Scalable High Availability Technologies - Óbudai Egyetem, users.nik.uni-obuda.hu/poserne/eap/Scalable_High... · 2012-11-28



Session Outline


• What is high availability (a practical approach)

• Variations:

• Hardware solutions

• Hot/Cold clusters

• Large clusters

• Requirements, implementations and results


High availability

• The service must be available whenever we want it

• 24/7/365 availability – is it realistic?

• Problems:

− How to detect a service failure?

− How to detect if the service is working?

− How to stop the service reliably?

− How to start a service reliably?

• Where is the single point of failure?
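The questions above are harder than they look: "is it running?" breaks into at least two distinct checks. A minimal sketch, with sleep standing in for a daemon and a hypothetical HTTP probe shown only as a comment:

```shell
# Two different answers to "is the service working?". The daemon here is a
# stand-in (sleep); the HTTP endpoint below is hypothetical.
sleep 30 &
pid=$!

# Check 1: does the process exist? Necessary but not sufficient.
kill -0 "$pid" 2>/dev/null && echo "process exists"

# Check 2: does the service actually answer? A real check must speak the
# service's protocol, because a hung process passes check 1 while serving
# nothing:
#   curl -fsS --max-time 5 http://localhost:8080/health   # hypothetical

kill "$pid" 2>/dev/null        # stop the stand-in daemon
wait "$pid" 2>/dev/null || true
```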


Single Point Of Failure

• Desktop machine: one disk, one NIC, one CPU, one PSU

• Server machine: one motherboard/system bus, multiple CPUs, PSUs, NICs

• Mainframe: three cores; two work on the same task while the third checks that their results match.

• Two node cluster from server machines: no SPOF in hardware!

• Are you sure?


Hardware redundancy

• Every cable must be duplicated

• Every network device must be replicated

• The network protocol must be able to handle the loss of a link

• SAN connections and devices must be doubled too

• System operators must be redundant and available too, no lonely heroes...


Single Point Of Failure

• Problems with the two nodes:

− Connection for checking service availability

− Common resources, mostly SAN disks

− Different levels of disaster: citywide? Or just a small fire in the server room?

− Price

• Example of risk assessment:

• A large bomb could cause destruction within a 20 km radius

• So place the two servers at least 30 km apart

• If more than one 10 Mt H-bomb goes off, trading will stop anyway.


Hot/Cold clusters

• Two nodes, one is working, one is waiting for trouble

• Users must define:

− Resources: disks, network interfaces, processes

− Scripts: for checking, starting and stopping the service or the resources

• Veritas Cluster Server (VCS) or Coyote could be used for this purpose

• Failover: the event in which the standby node takes over the active role

• Heartbeat: various solutions to allow the nodes to check on each other, usually via network.
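The check/start/stop scripts a user must define can be folded into one agent script. A minimal sketch using a pidfile convention and a sleep stand-in daemon; nothing here is the actual VCS agent API:

```shell
# Toy resource agent: start, stop and monitor one "service" (a sleep).
# The pidfile path and the daemon are invented for the demo.
PIDFILE=/tmp/demo_service.pid

service_start() {
  sleep 60 &                     # stand-in for the real daemon
  echo $! > "$PIDFILE"
  echo "started"
}

service_stop() {
  [ -f "$PIDFILE" ] && kill "$(cat "$PIDFILE")" 2>/dev/null
  rm -f "$PIDFILE"
  echo "stopped"
}

service_monitor() {
  # Online only if the pidfile exists AND that process is alive.
  if [ -f "$PIDFILE" ] && kill -0 "$(cat "$PIDFILE")" 2>/dev/null; then
    echo "online"
  else
    echo "offline"
  fi
}

service_monitor    # offline
service_start      # started
service_monitor    # online
service_stop       # stopped
service_monitor    # offline
```

The cluster framework would call monitor periodically and trigger a failover when it reports offline on the active node.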


Failure modes of Hot/cold clusters

• Maintenance script issues: the service is down, but the check script reports it as up, etc.

− Solution: better scripts

• Owner issues: users can usually manage the service without VCS, which leaves the service outside the cluster's control.

− Hard to detect, as the service is up, but the VCS tools are not in sync with reality

− Good recipe for disaster

• Lost connection: lost heartbeat, can lead to split brain

− Network or overload

• Split brain: when the two nodes are both hot

− Perfect recipe for disaster, especially with shared storage

− Gazelle/SCSI disk reservation

• Failed resources
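A lost heartbeat need not mean a dead peer. One common mitigation is to require several consecutive misses before declaring failure, so one dropped packet does not trigger a failover and a possible split brain. A minimal sketch, where a flag file is an invented stand-in for a real network probe:

```shell
# Heartbeat sketch: declare the peer dead only after MAX_MISSES consecutive
# missed beats. The flag file stands in for a real ping or TCP probe.
MAX_MISSES=3
misses=0

peer_alive() {
  # Stand-in check; a real cluster would probe the peer over the network.
  [ -e /tmp/peer_heartbeat ]
}

check_heartbeat() {
  if peer_alive; then
    misses=0
    echo "peer OK"
  else
    misses=$((misses + 1))
    if [ "$misses" -ge "$MAX_MISSES" ]; then
      echo "peer DEAD after $misses misses - would initiate failover"
    else
      echo "peer missed heartbeat ($misses/$MAX_MISSES)"
    fi
  fi
}

touch /tmp/peer_heartbeat
check_heartbeat      # peer answers
rm -f /tmp/peer_heartbeat
check_heartbeat      # first miss
check_heartbeat      # second miss
check_heartbeat      # third miss: declared dead
```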


Special cases

• Hot/hot node: two services, each node being the default owner of one.

• Various fencing solutions:

− Human: only manual failover

− SCSI disk reservation


Multinode clusters

• Ultimately one server is not powerful enough

• Variations:

− Load-balancing clusters

− Grids

− ESX

• Applications must be modified

• Performance can scale almost without limit

• Can be built cost-effectively


Load balancing


Load balancers

• The load balancer is a tool to distribute load between computing nodes

• Works best with many clients and small jobs (transactions)

• It is, in itself, a single point of failure

• Clustering, quick failover and state preservation are therefore important

• Methods for distributing the load:

− DNS: slow, not flexible, easy

− Router: quick, flexible, but the traffic must go through the load balancer

− Layer 7/application-level load balancing: the load balancer must examine the traffic, not just pass it to the nodes
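The DNS method can be as simple as publishing several A records for one name and letting resolvers rotate among them. A hypothetical zone fragment (addresses from the documentation range), with a short TTL because DNS offers no quick way to pull a dead node out of rotation:

```
www  300  IN  A  192.0.2.10
www  300  IN  A  192.0.2.11
www  300  IN  A  192.0.2.12
```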


Load balancers

• Selecting nodes:

− Round robin

− Least loaded

− Standby nodes

• F5 could be used, but the Linux Virtual Server is also very good for testing
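Round robin, the simplest selection policy, fits in a one-liner. A toy sketch with invented node and request names, using the same text tools as the grid troubleshooting example:

```shell
# Assign four incoming requests to three hypothetical nodes in a fixed
# cycle. A real balancer would also track node health and load.
printf '%s\n' req1 req2 req3 req4 | awk '
  BEGIN { n = split("web1 web2 web3", nodes, " ") }
  { print $0, "->", nodes[(NR - 1) % n + 1] }'
# req1 -> web1
# req2 -> web2
# req3 -> web3
# req4 -> web1
```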


Grid

• In simple terms a grid is a group of computers working on the same tasks

• Works with big customers and parallel tasks

• Distributing the tasks and collecting the results is itself a major job

− Overloaded fileservers

− Torrent-like task and input data distribution

− Bandwidth

• The application must be rewritten for hundreds of independent nodes


Grid

• Nodes must be:

− Cheap

− Efficient

− Easy to replace

− Properly monitored


Grid effect


Grid Management

• Managing hundreds of nodes is a different art

• No individual alerts for errors

• No hardware repairs

• Statistical analysis of logfiles

• Geospatial analysis: find broken rooms/racks/networks/buildings


Sample Grid troubleshooting

/afs/grid_utils/bin/search "Waiting for busy volume" /var/log/syslog |
  /afs/grid_utils/bin/message_filter.pl

This shows the message occurs about 50,000 times a day on 17,000 hosts, roughly 3 per host per day; the busiest host has 13 a day. So it is not a small group of hosts, not a single machine's personality, not application-related, and probably not a big deal at all: just background noise.

Is it AFS-cell related? No, the distribution of the error is even among the AFS cells:

$ cut -d" " -f15 afs.txt | sort | uniq -c

Is it tied to a single AFS volume, or a few? Not sure: 157 volumes show the error per day, but only 12 have 1000+ errors and 20 have 10-1000; the rest have fewer than 3. Top AFS volume IDs: 5879181, 5369925, 5368179.

$ cut -d" " -f11 afs.txt | sort | uniq -c | sort -nr

Is it tied to a certain time of the day? Cutting the hours:minutes out of the timestamps shows it is nicely distributed throughout the day; every hour has a few. What stands out is that it always occurs at :00 minutes, every hour, with higher activity at 7:00, 15:00 and 21:00 - now that's suspicious.

$ cut -d" " -f4 afs | cut -d":" -f1,2 | sort | uniq -c
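The hour:minute histogram step works on any syslog-style file. A self-contained toy with fabricated log lines; the field positions differ from the real afs file, so the cut offsets here are adjusted for classic syslog, where the timestamp is field 3:

```shell
# Build a tiny syslog-like sample and histogram its hour:minute stamps,
# the same cut | sort | uniq -c pattern as above. Lines are invented.
cat > /tmp/afs_sample.txt <<'EOF'
Nov 28 07:00:01 host1 afs: Waiting for busy volume
Nov 28 07:00:02 host2 afs: Waiting for busy volume
Nov 28 15:00:01 host3 afs: Waiting for busy volume
Nov 28 21:17:45 host4 afs: Waiting for busy volume
EOF
cut -d" " -f3 /tmp/afs_sample.txt | cut -d":" -f1,2 | sort | uniq -c
#   2 07:00
#   1 15:00
#   1 21:17
```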


ESX

• VMware ESXi is one of the current solutions for virtualized desktops

• Advantages of virtual desktops:

− Smaller, quieter devices in the workspace (thin client, old desktop machines)

− Desktop applications could be closer to the data

• Disadvantages:

− Added complexity

− Very new idea, lots of new problems


ESX

• Effects on teamwork:

− Windows desktops

− vCenter runs on Windows servers

− ESXi is a stripped-down Linux

− Storage is on NetApp filers


ESX clusters

• vMotion – seamless migration of virtual machines between nodes

− vCenter software is necessary

− Cluster size is limited to 16 nodes

− One vCenter can handle only 300 nodes

− Good for maintenance and for performance

• vCenter is a serious disadvantage when you need hundreds of nodes

• Thousands of desktops are troublesome

− Very different culture in maintenance (reboot!)

− Disk activity, scheduled tasks (antivirus)


Storage solutions

• Cache systems

− Can help on read operations

− Cannot help on write operations

• Deduplication

− For ESX

− Needs more CPU to recognize identical blocks/files, but uses less space

• Speed

− Hand-made NFS file server: 10 GB writes saturated it immediately
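The deduplication trade-off (CPU spent hashing vs space saved) can be illustrated at file level with stock tools. This sketch uses invented paths and data; real filers deduplicate at block level, but the principle is the same:

```shell
# Hash file contents; identical hashes mean the data need be stored once.
# Files and contents are fabricated for the demo.
mkdir -p /tmp/dedup_demo
printf 'same data\n'  > /tmp/dedup_demo/a
printf 'same data\n'  > /tmp/dedup_demo/b
printf 'other data\n' > /tmp/dedup_demo/c
sha256sum /tmp/dedup_demo/a /tmp/dedup_demo/b /tmp/dedup_demo/c |
  sort |
  awk '{ if ($1 == prev) print $2, "is a duplicate"; prev = $1 }'
# /tmp/dedup_demo/b is a duplicate
```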


Q & A
