Cluster Computing: An Introduction
Chung-Ta King
Department of Computer Science, National Tsing Hua University
king@cs.nthu.edu.tw
Clusters Have Arrived
What is a Cluster?
- A collection of independent computer systems working together as if they were a single system
- Coupled through a scalable, high-bandwidth, low-latency interconnect
- The nodes can exist in a single cabinet or be separated and connected via a network
- Faster, closer connection than a network (LAN); looser connection than a symmetric multiprocessor
Outline
- Motivations of Cluster Computing
- Cluster Classifications
- Cluster Architecture and Its Components
- Cluster Middleware
- Representative Cluster Systems
- Task Forces on Cluster Computing
- Resources and Conclusions
Motivations of Cluster Computing
How to Run Applications Faster?
There are three ways to improve performance:
- Work harder
- Work smarter
- Get help
Computer analogy:
- Work harder: use faster hardware, e.g., reduce the time per instruction (clock cycle)
- Work smarter: optimized algorithms and techniques
- Get help: use multiple computers to solve the problem
=> The techniques of parallel processing are mature and can be exploited commercially
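A standard way to bound the payoff of "getting help" (not on the original slide) is Amdahl's law: if a fraction p of the work can be parallelized over N computers, then

    speedup(N) = 1 / ((1 - p) + p / N)

For example, with p = 0.9 and N = 16 nodes, the speedup is 1 / (0.1 + 0.9/16) = 6.4, so clusters pay off most when nearly all of the work can be parallelized.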
Motivation for Using Clusters
- Performance of workstations and PCs is rapidly improving
- Communications bandwidth between computers is increasing
- Vast numbers of under-utilized workstations with a huge number of unused processor cycles
- Organizations are reluctant to buy large, high-performance computers, due to the high cost and short useful life span
Motivation for Using Clusters
- Workstation clusters are thus a cheap and readily available approach to high-performance computing
- Clusters are easier to integrate into existing networks
- Development tools for workstations are mature: threads, PVM, MPI, DSM, C, C++, Java, etc.
- Use of clusters as a distributed compute resource is cost-effective: incremental growth of the system!
  - Individual node performance can be improved by adding resources (new memory blocks/disks)
  - New nodes can be added, or nodes can be removed
  - Clusters of clusters and metacomputing
Key Benefits of Clusters
- High performance: running cluster-enabled programs
- Scalability: adding servers to the cluster, adding more clusters to the network as the need arises, or adding CPUs to an SMP node
- High throughput
- High availability (HA): clusters offer inherent high system availability due to the redundancy of hardware, operating systems, and applications
- Cost-effectiveness
Why Cluster Now?
Hardware and Software Trends
Important advances have taken place in the last five years:
- Network performance has increased while cost has dropped
- Workstation performance has improved:
  - The average number of transistors on a chip grows about 40% per year
  - Clock frequency grows about 30% per year
  - Expect 700-MHz processors with 100M transistors in early 2000
- Powerful and stable operating systems (Linux, FreeBSD) are available with source-code access
Why Clusters NOW?
Clusters gained momentum when three technologies converged:
- Very high-performance microprocessors: workstation performance = yesterday's supercomputers
- High-speed communication
- Standard tools for parallel/distributed computing, and their growing popularity
Also driving clusters:
- Time to market => performance
- Internet services: huge demand for scalable, available, dedicated Internet servers (big I/O, big compute)
Efficient Communication
The key enabling technology: from killer micro to killer switch
- Single-chip building block for scalable networks: high bandwidth, low latency, very reliable
Challenges for clusters:
- Greater routing delay and less-than-complete reliability
- Constraints on where the network connects into the node
- UNIX has a rigid device and scheduling interface
Putting Them Together ...
- Building block = complete computers (HW & SW) shipped in 100,000s: killer micro, killer DRAM, killer disk, killer OS, killer packaging, killer investment
- Leverage the billion-dollar-per-year investment
- Interconnecting the building blocks => Killer Net:
  - High bandwidth
  - Low latency
  - Reliable
  - Commodity (ATM, Gigabit Ethernet, Myrinet)
Windows of Opportunity
The resources available in average clusters offer a number of research opportunities, such as:
- Parallel processing: use multiple computers to build an MPP/DSM-like system for parallel computing
- Network RAM: use the memory associated with each workstation as an aggregate DRAM cache
- Software RAID: use the arrays of workstation disks to provide cheap, highly available, and scalable file storage
- Multipath communication: use the multiple networks for parallel data transfer between nodes
Windows of Opportunity
- Most high-end scalable WWW servers are clusters: end services (data, web, enhanced information services, reliability)
- Network mediation services are also cluster-based: the Inktomi traffic server, clustered proxy caches, clustered firewalls, etc.
=> These object-web applications are increasingly compute-intensive
=> These applications are an increasing part of "scientific computing"
Classification of Cluster Computers
Cluster Classification 1
Based on focus (in the market):
- High-performance (HP) clusters: grand challenge applications
- High-availability (HA) clusters: mission-critical applications
HA Clusters
Cluster Classification 2
Based on workstation/PC ownership:
- Dedicated clusters
- Non-dedicated clusters: adaptive parallel computing; can be used for CPU-cycle stealing
Cluster Classification 3
Based on node architecture:
- Clusters of PCs (CoPs)
- Clusters of Workstations (COWs)
- Clusters of SMPs (CLUMPs)
Cluster Classification 4
Based on node component architecture and configuration:
- Homogeneous clusters: all nodes have a similar configuration
- Heterogeneous clusters: nodes based on different processors and running different OSs
Cluster Classification 5
Based on levels of clustering:
- Group clusters (# nodes: 2-99): a set of dedicated/non-dedicated computers, mainly connected by a SAN (system area network) such as Myrinet
- Departmental clusters (# nodes: 99-999)
- Organizational clusters (# nodes: many 100s)
- Internet-wide clusters = global clusters (# nodes: 1000s to many millions): metacomputing
Clusters and Their Commodity Components
Cluster Computer Architecture
Cluster Components 1a: Nodes
Multiple high-performance components:
- PCs
- Workstations
- SMPs (CLUMPs)
- Distributed HPC systems, leading to metacomputing
They can be based on different architectures and run different OSs.
Cluster Components 1b: Processors
There are many (CISC/RISC/VLIW/vector...):
- Intel: Pentiums, Xeon, Merced...
- Sun: SPARC, UltraSPARC
- HP PA
- IBM RS6000/PowerPC
- SGI MIPS
- Digital Alphas
Integrating memory, processing, and networking into a single chip:
- IRAM (CPU & memory): http://iram.cs.berkeley.edu
- Alpha 21364 (CPU, memory controller, NI)
Cluster Components 2: Operating Systems
State-of-the-art OSs:
- Tend to be modular: can easily be extended, and new subsystems can be added without modifying the underlying OS structure
- Multithreading has added a new dimension to parallel processing (see the sketch below)
Popular OSs used on cluster nodes:
- Linux (Beowulf)
- Microsoft NT (Illinois HPVM)
- Sun Solaris (Berkeley NOW)
- IBM AIX (IBM SP2)
- ...
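To make the multithreading point concrete, here is a minimal POSIX threads (pthreads) sketch in C; the thread count and the trivial worker function are illustrative assumptions, not from the slides:

```c
/* Minimal POSIX threads sketch: one process on an SMP cluster node
 * runs several threads in a shared address space. */
#include <pthread.h>
#include <stdio.h>

#define NTHREADS 4   /* illustrative thread count */

static void *worker(void *arg)
{
    long id = (long)arg;
    printf("thread %ld working\n", id);   /* stand-in for real work */
    return NULL;
}

int main(void)
{
    pthread_t t[NTHREADS];

    for (long i = 0; i < NTHREADS; i++)
        pthread_create(&t[i], NULL, worker, (void *)i);
    for (int i = 0; i < NTHREADS; i++)
        pthread_join(t[i], NULL);         /* wait for all threads */
    return 0;
}
```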
Cluster Components 3: High-Performance Networks
- Ethernet (10 Mbps)
- Fast Ethernet (100 Mbps)
- Gigabit Ethernet (1 Gbps)
- SCI (Dolphin; ~12 µs MPI latency)
- ATM
- Myrinet (1.2 Gbps)
- Digital Memory Channel
- FDDI
Cluster Components 4: Network Interfaces
- Dedicated processing power and storage embedded in the network interface
- An I/O card today; on the chip tomorrow?
[Figure: a Sun Ultra 170 node, with processor, cache, and memory on a 160 MB/s memory bus, and a Myricom NIC on the 50 MB/s I/O bus (S-Bus) attached to the Myricom network]
Cluster Components 4: Network Interfaces
- Network interface card: Myrinet has a NIC
- User-level access support: VIA
- The Alpha 21364 processor integrates processing, memory controller, and network interface into a single chip
Cluster Components 5: Communication Software
- Traditional OS-supported facilities (but heavyweight due to protocol processing): sockets (TCP/IP), pipes, etc. (see the sketch below)
- Lightweight, user-level protocols: minimal interface into the OS
  - The user transmits directly into and receives from the network without OS intervention
  - Communication protection domains are established by the interface card and the OS
  - Message loss is treated as an infrequent case
  - Active Messages (Berkeley), Fast Messages (Illinois), ...
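For contrast with the user-level approach, a minimal sketch in C of the traditional kernel-mediated path: every connect() and send() is a system call that goes through in-kernel TCP/IP protocol processing. The peer address and port are hypothetical:

```c
/* Traditional heavyweight path: each call below traps into the kernel,
 * which does TCP/IP protocol processing and data copies. */
#include <arpa/inet.h>
#include <netinet/in.h>
#include <stdio.h>
#include <sys/socket.h>
#include <unistd.h>

int main(void)
{
    int s = socket(AF_INET, SOCK_STREAM, 0);          /* kernel object */
    struct sockaddr_in peer = { 0 };
    peer.sin_family = AF_INET;
    peer.sin_port   = htons(5000);                    /* hypothetical port */
    inet_pton(AF_INET, "10.0.0.2", &peer.sin_addr);   /* hypothetical node */

    if (connect(s, (struct sockaddr *)&peer, sizeof peer) < 0) {
        perror("connect");
        return 1;
    }
    const char msg[] = "hello, cluster";
    send(s, msg, sizeof msg, 0);   /* system call + protocol processing */
    close(s);
    return 0;
}
```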
Cluster Components 6a: Cluster Middleware
Resides between the OS and applications and offers an infrastructure for supporting:
- Single System Image (SSI): makes a collection of computers appear as a single machine (globalized view of system resources)
- System Availability (SA): supports checkpointing, process migration, etc.
Cluster Components 6b: Middleware Components
- Hardware: DEC Memory Channel, DSM (Alewife, DASH), SMP techniques
- OS / gluing layers: Solaris MC, UnixWare, GLUnix
- Applications and subsystems:
  - System management and electronic forms
  - Runtime systems (software DSM, PFS, etc.)
  - Resource management and scheduling (RMS): CODINE, LSF, PBS, NQS, etc.
Cluster Components 7a: Programming Environments
- Threads (PCs, SMPs, NOW, ...): POSIX threads, Java threads
- MPI: on Linux, NT, and many supercomputers (see the sketch below)
- PVM
- Software DSMs (Shmem)
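A minimal MPI program in C of the kind the slide alludes to. This is standard MPI-1; the compile and launch commands (typically mpicc and mpirun) vary by implementation:

```c
/* Minimal MPI "hello": the same source runs unchanged on Linux
 * clusters, NT clusters, and supercomputers with an MPI library. */
#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv)
{
    int rank, size;

    MPI_Init(&argc, &argv);                 /* join the parallel job */
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);   /* this process's id */
    MPI_Comm_size(MPI_COMM_WORLD, &size);   /* total process count */

    printf("hello from process %d of %d\n", rank, size);

    MPI_Finalize();
    return 0;
}
```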
Cluster Components 7b: Development Tools
- Compilers: C/C++/Java/...
- RAD (rapid application development) tools: GUI-based tools for parallel-processing modeling
- Debuggers
- Performance monitoring and analysis tools
- Visualization tools
Cluster Components 8: Applications
- Sequential
- Parallel/distributed (cluster-aware) applications:
  - Grand challenge applications: weather forecasting, quantum chemistry, molecular biology modeling, engineering analysis (CAD/CAM), ...
  - Web servers, data mining
Cluster Middleware and Single System Image
Middleware Design Goals
- Complete transparency: let users see a single cluster system
  - Single entry point; ftp, telnet, software loading, ...
- Scalable performance: easy growth of the cluster
  - No change of API; automatic load distribution
- Enhanced availability: automatic recovery from failures
  - Employ checkpointing and fault-tolerance technologies
  - Handle consistency of data when replicated
Single System Image (SSI)
A single system image is the illusion, created by software or hardware, that a collection of computers is a single computing resource.
Benefits:
- Transparent use of system resources
- Improved reliability and higher availability
- Simplified system management
- Reduced risk of operator errors
- Users need not be aware of the underlying system architecture to use the machines effectively
Desired SSI Services
- Single entry point: telnet cluster.my_institute.edu rather than telnet node1.cluster.my_institute.edu
- Single file hierarchy: AFS, Solaris MC Proxy
- Single control point: manage from a single GUI
- Single virtual networking
- Single memory space: DSM
- Single job management: GLUnix, Condor, LSF
- Single user interface: like a workstation/PC windowing environment
SSI Levels
Single-system support can exist at different levels within a system, and one level can be built on another:
- Application and subsystem level
- Operating system kernel level
- Hardware level
Availability Support Functions
- Single I/O space (SIO): any node can access any peripheral or disk device without knowing its physical location
- Single process space (SPS): any process can create processes on any node, and they can communicate through signals, pipes, etc., as if they were on a single node
- Checkpointing and process migration: save the process state and intermediate results in memory or on disk; migrate processes for load balancing (see the sketch below)
- Reduction in the risk of operator errors
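A minimal sketch of checkpointing at the application level in C, assuming a hypothetical checkpoint file and state layout. Real cluster middleware checkpoints processes transparently; this only illustrates the save/restore idea:

```c
/* Application-level checkpointing sketch: periodically save state to
 * disk so a restart, possibly on another node, resumes from the last
 * checkpoint instead of from scratch. */
#include <stdio.h>

#define CKPT_FILE "app.ckpt"   /* hypothetical checkpoint file */

struct state { long iter; double partial_sum; };

static void checkpoint(const struct state *s)
{
    FILE *f = fopen(CKPT_FILE, "wb");
    if (f) { fwrite(s, sizeof *s, 1, f); fclose(f); }
}

static int restore(struct state *s)
{
    FILE *f = fopen(CKPT_FILE, "rb");
    if (!f) return 0;                       /* no checkpoint: fresh start */
    int ok = fread(s, sizeof *s, 1, f) == 1;
    fclose(f);
    return ok;
}

int main(void)
{
    struct state s = { 0, 0.0 };
    if (restore(&s))
        printf("resuming at iteration %ld\n", s.iter);

    for (; s.iter < 1000000; s.iter++) {
        s.partial_sum += 1.0 / (s.iter + 1);   /* stand-in for real work */
        if (s.iter % 100000 == 0)
            checkpoint(&s);                    /* save state periodically */
    }
    printf("sum = %f\n", s.partial_sum);
    remove(CKPT_FILE);                         /* done: discard checkpoint */
    return 0;
}
```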
Relationship among Middleware Modules
Strategies for SSI
- Build SSI as a layer on top of an existing OS (e.g., GLUnix). Benefits:
  - Makes the system quickly portable, tracks vendor software upgrades, and reduces development time
  - New systems can be built quickly by mapping new services onto the functionality provided by the layer beneath, e.g., GLUnix/Solaris MC
- Build SSI at the kernel level (a true cluster OS)
  - Good, but can't leverage the vendor's OS improvements; e.g., UnixWare and MOSIX (built using BSD Unix)
Representative Cluster Systems
Cluster Research Projects
- Beowulf: CalTech, JPL, and NASA
- Condor: University of Wisconsin
- DQS (Distributed Queuing System): Florida State U.
- HPVM (High Performance Virtual Machine): UIUC & UCSB
- Gardens: Queensland U. of Technology, AU
- NOW (Network of Workstations): UC Berkeley
- PRM (Prospero Resource Manager): USC
Commercial Cluster Software
- CODINE (Computing in Distributed Network Environment): GENIAS GmbH, Germany
- LoadLeveler: IBM Corp.
- LSF (Load Sharing Facility): Platform Computing
- NQE (Network Queuing Environment): CraySoft
- RWPC: Real World Computing Partnership, Japan
- UnixWare: SCO
- Solaris MC: Sun Microsystems
Comparison of 4 Cluster Systems
Task Forces on Cluster Computing
IEEE Task Force on Cluster Computing (TFCC)
http://www.dgs.monash.edu.au/~rajkumar/tfcc/
http://www.dcs.port.ac.uk/~mab/tfcc/
TFCC Activities
- Mailing list, workshops, conferences, tutorials, web resources, etc.
- Resources for introducing the subject at the senior undergraduate and graduate levels
- Tutorials/workshops at IEEE chapters, and so on
- Visit the TFCC page for more details: http://www.dgs.monash.edu.au/~rajkumar/tfcc/
Efforts in Taiwan
- PC Farm Project at the Academia Sinica Computing Center: http://www.pcf.sinica.edu.tw/
- NCHC PC Cluster Project: http://www.nchc.gov.tw/project/pccluster/
NCHC PC Cluster
A Beowulf-class cluster
System Hardware
5 Fast Ethernet switching hubs
System Software
Conclusions
- Clusters are promising and fun
- They offer incremental growth and match funding patterns
- New trends in hardware and software technologies are likely to make clusters even more promising
- Cluster-based HP and HA systems can be seen everywhere!
The Future
- Cluster systems built from the idle cycles of existing computers will continue
- Individual nodes will have multiple processors
- Fast and Gigabit Ethernet will be in widespread use and will become the de facto cluster networks
- Cluster software will bypass the OS as much as possible
- Unix-based OSs are likely to remain the most popular, but the steadily improving and increasingly accepted NT will not be far behind
The Challenges
- Programming: enabling applications, reducing programming effort; distributed object/component models?
- Reliability (RAS): programming effort; reliability with scalability to 1000s of nodes
- Heterogeneity: performance, configuration, architecture, and interconnect
- Resource management (scheduling, performance prediction)
- System administration/management
- Input/output (both network and storage)
Pointers to Literature on Cluster Computing
Reading Resources 1a: Internet & WWW
- Computer architecture: http://www.cs.wisc.edu/~arch/www/
- PFS and parallel I/O: http://www.cs.dartmouth.edu/pario/
- Linux parallel processing: http://yara.ecn.purdue.edu/~pplinux/Sites/
- Distributed shared memory: http://www.cs.umd.edu/~keleher/dsm.html
Reading Resources 1b: Internet & WWW
- Solaris MC: http://www.sunlabs.com/research/solaris-mc
- Microprocessors (recent advances): http://www.microprocessor.sscc.ru
- Beowulf: http://www.beowulf.org
- Metacomputing: http://www.sis.port.ac.uk/~mab/Metacomputing/
Reading Resources 2: Books
- In Search of Clusters, by G. Pfister, Prentice Hall, 2nd ed., 1998
- High Performance Cluster Computing (Volume 1: Architectures and Systems; Volume 2: Programming and Applications), edited by Rajkumar Buyya, Prentice Hall, NJ, USA
- Scalable Parallel Computing, by K. Hwang and Z. Xu, McGraw-Hill, 1998
Reading Resources 3: Journals
- "A Case for NOW (Networks of Workstations)", IEEE Micro, Feb. 1995, by Anderson, Culler, and Patterson
- "Fault-Tolerant COW with SSI", IEEE Concurrency, by Kai Hwang, Chow, Wang, Jin, and Xu
- "Cluster Computing: The Commodity Supercomputer", Software: Practice and Experience, by Mark Baker and Rajkumar Buyya