Page 1: How to make PC Cluster Systems? Tomo Hiroyasu Doshisha University Kyoto Japan tomo@is.doshisha.ac.jp

How to make PC Cluster Systems?

Tomo Hiroyasu Doshisha University Kyoto Japan tomo@is.doshisha.ac.jp

Cluster

A Cluster is a type of parallel or distributed processing system, which consists of a collection of interconnected stand alone/complete computers cooperatively working together as a single, integrated computing resource.

clus·ter n. 1. A group of the same or similar elements gathered or occurring closely together; a bunch: “She held out her hand, a small tight cluster of fingers” (Anne Tyler).

2. Linguistics. Two or more successive consonants in a word, as cl and st in the word cluster.

Why Parallel Processing?

Evolutionary Computation

Features
It simulates the mechanisms of heredity and evolution in living creatures.
It can be applied to several types of problems.
It has a huge computational cost.

There are several individuals.
Tasks can be divided into subtasks.

High Performance Computing

Parallel Computers

Top500 http://www.top500.org

Ranking  Name               Rmax (Gflops)  # Proc
1        ASCI White         4938           8192
2        ASCI Red           2379           9632
3        ASCI Blue Pacific  2144           5808
4        ASCI Blue          1608           6144
5        SP Power III       1417           1336

Commodity Hardware

CPU: Pentium, Alpha, Power, etc.

Networking: Internet, LAN, WAN, Gigabit, cableless, etc.

PCs + Networking = PC Clusters

Why PC Cluster?

Hardware: Commodity, Off-the-shelf
Software: Open source, Freeware
Peopleware: University students and staff, Lab nerds

High ability
Low cost
Easy to set up
Easy to use
Possession

Top500 http://www.top500.org

Ranking  Name                   Rmax (Gflops)  # Proc
60       Los Lobos              237            512
84       CPlant Cluster         232.6          580
126      CLIC PIII 800 MHz      143.3          528
215      Kepler PIII 650 MHz    96.2           196
396      SCore II/PIII 800 MHz  64.7           132

Contents of this tutorial

Concept of PC Clusters
Small Cluster
Advanced Cluster
  Hardware
  Software
Books, Web sites, …
Conclusions

What are cluster computing systems?

Beowulf Cluster

A Beowulf is a collection of personal computers (PCs) interconnected by widely available networking running any one of several open-source Unix-like operating systems. Some Linux clusters are built for reliability instead of speed. These are not Beowulfs. The Beowulf Project was started by Donald Becker when he moved to CESDIS in early 1994. CESDIS was located at NASA's Goddard Space Flight Center, and was operated for NASA by USRA.

http://beowulf.org/

Avalon

Los Alamos National Laboratory
Alpha (140) + Myrinet
Beowulf
First Beowulf in the Top 500 ranking

http://cnls.lanl.gov/Frames/avalon-a.html

The Berkeley NOW project

The Berkeley NOW project is building system support for using a network of workstations (NOW) to act as a distributed supercomputer on a building-wide scale.
April 30, 1997: NOW makes LINPACK Top 500!
June 15, 1998: NOW Retreat Finale

http://now.cs.berkeley.edu/

Cplant Cluster

Sandia National Laboratory
Alpha (580) + Myrinet

http://www.cs.sandia.gov/cplant/

RWCP Cluster

A typical Japanese cluster
SCore, OpenMP
Myrinet

http://pdswww.rwcp.or.jp/

Doshisha Cluster

Pentium III 0.8 GHz (256) + Fast Ethernet
Pentium III 1.0 GHz (2*64) + Myrinet 2000

http://www.is.doshisha.ac.jp/cluster/index.html

Let’s start building a simple cluster system!

Simple Cluster

$10000

8 nodes + gateway (file server)
Fast Ethernet
Switching hub

What do we need? Hardware

CPU
Memory
Motherboard
Hard disc
Case
Network card
Cable
Hub

Normal PCs

Classification of Parallel Computers

What do we need? Software

OS
Tools
Editor
Compiler
Parallel library

Message passing

Message Passing Libraries

PVM (Parallel Virtual Machine)
PVM was developed at Oak Ridge National Laboratory and the University of Tennessee.
http://www.epm.ornl.gov/pvm/pvm_home.html

MPI (Message Passing Interface)
MPI is an API for message passing.
1992: MPI Forum
1994: MPI 1
1997: MPI 2
http://www-unix.mcs.anl.gov/mpi/index.html

Implementations of MPI

Free implementations
MPICH
LAM
WMPI: Windows 95, NT
CHIMP/MPI
MPI Light

Vendor implementations
Implementations for parallel computers
MPI/PRO

Procedure of constructing clusters

Prepare several PCs

Connect the PCs

Install the OS and tools

Install development tools and a parallel library

Installing MPICH/LAM

RPM packages:
# rpm -ivh lam-6.3.3b28-1.i386.rpm
# rpm -ivh mpich-1.2.0-5.i386.rpm

Debian packages:
# dpkg -i lam2_6.3.2-3.deb
# dpkg -i mpich_1.1.2-11.deb

or, using apt:
# apt-get install lam2
# apt-get install mpich

Parallel programming (MPI)

[Figure: massively parallel computer and PC cluster; labels: user, gateway, jobs, tasks]

Programming style sheet

#include "mpi.h"

int main( int argc, char **argv )
{
    MPI_Init( &argc, &argv ) ;   /* Initialization */

    MPI_Comm_size( …… ) ;        /* Communicator: acquiring number of processes */
    MPI_Comm_rank( …… ) ;        /* Communicator: acquiring rank */

    /* parallel procedure */

    MPI_Finalize( ) ;            /* Termination */

    return 0 ;
}

Communications

One by one communication
Process A and Process B: receive/send data

Group communication
Receive/send data

One by one communication

[Sending]
MPI_Send( void *buf, int count, MPI_Datatype datatype, int dest, int tag, MPI_Comm comm )

void *buf : Sending buffer starting address (IN)
int count : Number of data (IN)
MPI_Datatype datatype : Data type (IN)
int dest : Receiving point (IN)
int tag : Message tag (IN)
MPI_Comm comm : Communicator (IN)

One by one communication

[Receiving]
MPI_Recv( void *buf, int count, MPI_Datatype datatype, int source, int tag, MPI_Comm comm, MPI_Status *status )

void *buf : Receiving buffer starting address (OUT)
int source : Sending point (IN)
int tag : Message tag (IN)
MPI_Status *status : Status (OUT)

~Hello.c~

#include <stdio.h>
#include "mpi.h"

/* Run with 2 processes: ranks 0 and 1 exchange "hello". */
int main(int argc, char *argv[])
{
    int myid, src, dest, tag = 1000, count;
    char inmsg[10], outmsg[] = "hello";
    MPI_Status stat;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &myid);
    count = sizeof(outmsg) / sizeof(char);   /* includes the terminating '\0' */

    if (myid == 0) {                         /* rank 0 sends first, then receives */
        src = 1; dest = 1;
        MPI_Send(outmsg, count, MPI_CHAR, dest, tag, MPI_COMM_WORLD);
        MPI_Recv(inmsg, count, MPI_CHAR, src, tag, MPI_COMM_WORLD, &stat);
        printf("%s from rank %d\n", inmsg, src);
    } else if (myid == 1) {                  /* rank 1 receives first, then sends */
        src = 0; dest = 0;
        MPI_Recv(inmsg, count, MPI_CHAR, src, tag, MPI_COMM_WORLD, &stat);
        MPI_Send(outmsg, count, MPI_CHAR, dest, tag, MPI_COMM_WORLD);
        printf("%s from rank %d\n", inmsg, src);
    }

    MPI_Finalize();
    return 0;
}

One by one communication

A receive followed by a send:

MPI_Recv(inmsg, count, MPI_CHAR, src, tag, MPI_COMM_WORLD, &stat);
MPI_Send(outmsg, count, MPI_CHAR, dest, tag, MPI_COMM_WORLD);

can be combined into a single call:

MPI_Sendrecv(outmsg, count, MPI_CHAR, dest, tag, inmsg, count, MPI_CHAR, src, tag, MPI_COMM_WORLD, &stat);

Calculation of PI (approximation)

[Figure: plot of y (0 to 4) against x (0 to 1)]

\[ \pi = \int_0^1 \frac{4}{1+x^2}\,dx \]

-Parallel conversion-

The integral is divided into subsections.

Each subsection is allotted to a processor.

The results of the calculation are assembled.
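The three steps can be written out with the midpoint rule. The strip count n, the process count P, and the round-robin assignment of strips to processes are assumptions chosen for illustration; the slides do not say how the subsections are assigned.

```latex
\pi = \int_0^1 \frac{4}{1+x^2}\,dx
    \approx h \sum_{i=0}^{n-1} \frac{4}{1+x_i^2},
    \qquad h = \frac{1}{n}, \quad x_i = \left(i + \tfrac{1}{2}\right) h

% Process p (0 \le p < P) sums only the strips with i \equiv p \pmod{P}:
S_p = h \sum_{\substack{0 \le i < n \\ i \equiv p \ (\mathrm{mod}\ P)}} \frac{4}{1+x_i^2},
    \qquad \pi \approx \sum_{p=0}^{P-1} S_p
```

Each S_p can be computed independently; only the final sum over p needs communication.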

Group communication

Broadcast
MPI_Bcast( void *buf, int count, MPI_Datatype datatype, int root, MPI_Comm comm )

void *buf : Data
int root : Rank of sending point

Group Communication

Communication and operation (reduce)
MPI_Reduce( void *sendbuf, void *recvbuf, int count, MPI_Datatype datatype, MPI_Op op, int root, MPI_Comm comm )

MPI_Op op : Operation handle (MPI_SUM, MPI_MAX, MPI_MIN, MPI_PROD)
int root : Rank of receiving point

Approximation of PI Programming flow
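One way the flow could look in C is sketched below; it is not the author's program. The integral of 4/(1+x^2) over [0,1] is approximated with the midpoint rule, MPI_Bcast distributes the strip count, and MPI_Reduce with MPI_SUM assembles the partial sums on rank 0. The function name `partial_pi`, the round-robin assignment of strips, the strip count, and the `USE_MPI` macro are assumptions made for this sketch. The arithmetic is kept in a plain C function so it can be checked without an MPI installation.

```c
/* pi_midpoint.c: midpoint-rule estimate of pi.
 * MPI build (assumes the usual mpicc/mpirun wrappers):
 *   mpicc -DUSE_MPI pi_midpoint.c -o pi && mpirun -np 4 ./pi
 */
#include <assert.h>
#include <stdio.h>

/* Sum of the strips owned by `rank` when n strips are dealt out
 * round-robin to nprocs processes (strip i belongs to rank i % nprocs). */
double partial_pi(int rank, int nprocs, long n)
{
    double h, sum = 0.0;
    long i;

    assert(n > 0 && nprocs > 0 && rank >= 0 && rank < nprocs);
    h = 1.0 / (double)n;
    for (i = rank; i < n; i += nprocs) {
        double x = h * ((double)i + 0.5);   /* midpoint of strip i */
        sum += 4.0 / (1.0 + x * x);
    }
    return h * sum;
}

#ifdef USE_MPI
#include <mpi.h>

int main(int argc, char **argv)
{
    int rank, nprocs;
    long n = 1000000;                       /* hypothetical strip count */
    double mine, pi = 0.0;

    MPI_Init(&argc, &argv);
    MPI_Comm_size(MPI_COMM_WORLD, &nprocs);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    /* Rank 0 owns the problem size; every other rank receives it. */
    MPI_Bcast(&n, 1, MPI_LONG, 0, MPI_COMM_WORLD);

    mine = partial_pi(rank, nprocs, n);

    /* Add all partial sums into `pi` on rank 0. */
    MPI_Reduce(&mine, &pi, 1, MPI_DOUBLE, MPI_SUM, 0, MPI_COMM_WORLD);

    if (rank == 0)
        printf("pi is approximately %.12f\n", pi);

    MPI_Finalize();
    return 0;
}
#endif
```

Round-robin assignment was chosen because it needs no special handling when the strip count is not a multiple of the process count; contiguous blocks would work equally well.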

More Cluster systems !!

Hardware: CPU

Intel Pentium III, IV http://www.intel.com/
AMD Athlon http://www.amd.com/
Transmeta Crusoe http://www.transmeta.com/

Hardware: Network

Gigabit
Wake On LAN

Ethernet, Gigabit Ethernet, Myrinet, QsNet, Giganet, SCI, Atoll, VIA, InfiniBand

Hardware: Hard disc

SCSI
IDE
RAID
Diskless cluster http://www.linuxdoc.org/HOWTO/Diskless-HOWTO.html

Hardware: Case

Rack
Box

Considerations: inexpensive, compact, maintenance

Software

OS

Linux Kernels

Open source
Network
Freeware

Features
The /proc file system
Loadable kernel modules
Virtual consoles
Package management

OS

Linux Kernels http://www.kernel.org/

Linux Distributions
Red Hat www.redhat.com
Debian GNU/Linux www.debian.org
S.u.S.E. www.suse.com
Slackware www.slackware.org

Administration software

NFS (Network File System)
NIS (Network Information System)
NTP (Network Time Protocol)

[Figure: one server and three clients]

Resource Management and Scheduling

Process distribution
Load balance
Job scheduling of multiple tasks

CONDOR http://www.cs.wisc.edu/condor/
DQS http://www.scri.fsu.edu/~pasko/dqs.html
LSF http://www.platform.com/index.html
The Sun Grid Engine http://www.sun.com/software/gridware/

Tools for Program Development

Editor: Emacs

Language: C, C++, Fortran, Java

Compiler

GNU http://www.gnu.org/

NAG http://www.nag.co.uk

PGI http://www.pgroup.com/

VAST http://www.psrv.com/

Absoft http://www.absoft.com/

Fujitsu http://www.fqs.co.jp/fort-c/

Intel http://developer.intel.com/software/products/compilers/index.htm

Tools for Program Development

Make
CVS

Debugger
Gdb
Total View http://www.etnus.com

Free MPI Implementations

mpich
http://www-unix.mcs.anl.gov/mpi/index.html
Easy to use
High portability: for UNIX, NT/Win, Globus

LAM
http://www.lam-mpi.org/
High availability

MPICH VS LAM (SMP)

[Figure: two panels, LAM (6.3.2) and MPICH (1.2.1); x-axis: Number of Process (0 to 60); y-axis: Speedup (0 to 60); series: 1X, 5X, 10X, 50X, 100X, 500X]

# node: 32, 2
Processor type: Pentium III 700 MHz
Memory: 128 Mbytes
OS: Linux 2.2.16
Network: Fast Ethernet, TCP/IP, Switching HUB
DGA
gcc (2.95.3), mpicc -O2 -funroll-loops

MPICH VS LAM (# process)

[Figure: two panels, LAM (6.4-a3) and MPICH (1.2.0); x-axis: Number of Process (0 to 30); y-axis: Speedup (0 to 8); series: 1X, 5X, 10X, 50X, 100X, 500X]

# node: 8
Processor: Pentium III 850 MHz
Memory: 256 Mbytes
OS: Linux 2.2.17
Network: Fast Ethernet, TCP/IP, Switching HUB
DGA
gcc (2.95.3), mpicc -O2 -funroll-loops

Profiler

MPE (MPICH)
Paradyn http://www.cs.wisc.edu/paradyn/
Vampir http://www.pallas.de/pages/vampir.htm

Message passing library for Win

PVM
PVM3.4
WPVM

MPI
mpich
WMPI (Critical Software)
MPICH/NT (Mississippi State Univ.)
MPI Pro (MPI Software Technology)

Cluster Distribution

FAI http://www.informatik.uni-koeln.de/fai/

Alinka http://www.alinka.com/

Mosix http://www.mosix.cs.huji.ac.il/

Bproc http://www.beowulf.org/software/bproc.html

Scyld http://www.scyld.com/

Score http://pdswww.rwcp.or.jp/dist/score/html/index.html

Math Library

PhiPac from Berkeley
FFTW from MIT www.fftw.org
ATLAS: Automatically Tuned Linear Algebra Software www.netlib.org/atlas/

ATLAS is an adaptive software architecture. It is faster than all other portable BLAS implementations and comparable with the machine-specific libraries provided by vendors.

Math Library

PETSc
PETSc is a large suite of data structures and routines for both uniprocessor and parallel scientific computing.

http://www-fp.mcs.anl.gov/petsc/

Parallel Genetic Algorithms

Models of Parallel GAs

Master Slave (Micro grained)

Cellular (Fine grained)

Distributed GAs (Island, Coarse grained)

Master Slave model

Master node: crossover, mutation, evaluation, selection
Clients: evaluate

a) The master delivers each individual to a slave.
b) Each slave returns the value as soon as it finishes the calculation.
c) The master sends the next non-evaluated individual.
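The evaluation step of the master slave model could be sketched with MPI as follows. This is only an illustration under stated assumptions, not the implementation used in the tutorial: the objective function `evaluate`, the sizes `GENES` and `POP`, the `TAG_STOP` convention, and the `USE_MPI` macro are all hypothetical, and crossover, mutation, and selection on the master are left out. The message tag carries the index of the individual, so results can arrive in any order.

```c
/* Build with MPI (assumes mpicc/mpirun): mpicc -DUSE_MPI ms.c && mpirun -np 4 ./a.out */
#include <assert.h>
#include <stdio.h>

#define GENES    4      /* hypothetical genome length */
#define POP      16     /* hypothetical population size */
#define TAG_STOP POP    /* a tag outside 0..POP-1 means "no more work" */

/* Hypothetical objective (sphere function): smaller is better. */
double evaluate(const double *genes, int len)
{
    double s = 0.0;
    int i;
    assert(len > 0);
    for (i = 0; i < len; i++)
        s += genes[i] * genes[i];
    return s;
}

#ifdef USE_MPI
#include <mpi.h>

/* Send individual `next` to slave `w`, or a stop message if none remain. */
static void dispatch(double pop[POP][GENES], int *next, int w)
{
    if (*next < POP) {
        MPI_Send(pop[*next], GENES, MPI_DOUBLE, w, *next, MPI_COMM_WORLD);
        (*next)++;
    } else {
        MPI_Send(pop[0], 0, MPI_DOUBLE, w, TAG_STOP, MPI_COMM_WORLD);
    }
}

static void master(int nprocs)
{
    double pop[POP][GENES], fit[POP], f;
    int next = 0, done = 0, i, g, w;
    MPI_Status st;

    for (i = 0; i < POP; i++)               /* toy initial population */
        for (g = 0; g < GENES; g++)
            pop[i][g] = 0.1 * (i + g);

    for (w = 1; w < nprocs; w++)            /* a) one individual per slave */
        dispatch(pop, &next, w);

    while (done < POP) {                    /* b) collect, c) refill */
        MPI_Recv(&f, 1, MPI_DOUBLE, MPI_ANY_SOURCE, MPI_ANY_TAG,
                 MPI_COMM_WORLD, &st);
        fit[st.MPI_TAG] = f;
        done++;
        dispatch(pop, &next, st.MPI_SOURCE);
    }
    for (i = 0; i < POP; i++)
        printf("individual %2d: fitness %.3f\n", i, fit[i]);
}

static void slave(void)
{
    double genes[GENES], f;
    MPI_Status st;

    for (;;) {
        MPI_Recv(genes, GENES, MPI_DOUBLE, 0, MPI_ANY_TAG, MPI_COMM_WORLD, &st);
        if (st.MPI_TAG == TAG_STOP)
            break;
        f = evaluate(genes, GENES);
        MPI_Send(&f, 1, MPI_DOUBLE, 0, st.MPI_TAG, MPI_COMM_WORLD);
    }
}

int main(int argc, char **argv)
{
    int rank, nprocs;

    MPI_Init(&argc, &argv);
    MPI_Comm_size(MPI_COMM_WORLD, &nprocs);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    if (nprocs < 2) {
        fprintf(stderr, "need at least 2 processes\n");
    } else if (rank == 0) {
        master(nprocs);
    } else {
        slave();
    }
    MPI_Finalize();
    return 0;
}
#endif
```

Because the master hands out work one individual at a time, a fast slave simply receives more individuals than a slow one, which gives basic load balancing without extra bookkeeping.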

Cellular GAs

Distributed Genetic Algorithms (Island GAs)

subpopulation

migration
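A single migration step between subpopulations might look like the sketch below, with one MPI process per island. Several things here are assumptions for illustration only: the ring topology, migrating one individual per step, replacing the worst resident with the immigrant, the one-variable objective `fitness`, the size `SUBPOP`, and the `USE_MPI` macro. The genetic operators that would run between migrations are omitted.

```c
/* Build with MPI (assumes mpicc/mpirun): mpicc -DUSE_MPI island.c && mpirun -np 4 ./a.out */
#include <assert.h>
#include <stdio.h>

#define SUBPOP 8    /* hypothetical number of individuals per island */

/* Hypothetical objective of one variable, to be minimized. */
double fitness(double x) { return x * x; }

/* Index of the smallest fitness (best individual when minimizing). */
int best_index(const double *fit, int n)
{
    int b = 0, i;
    assert(n > 0);
    for (i = 1; i < n; i++)
        if (fit[i] < fit[b]) b = i;
    return b;
}

/* Index of the largest fitness (worst individual when minimizing). */
int worst_index(const double *fit, int n)
{
    int w = 0, i;
    assert(n > 0);
    for (i = 1; i < n; i++)
        if (fit[i] > fit[w]) w = i;
    return w;
}

#ifdef USE_MPI
#include <mpi.h>

int main(int argc, char **argv)
{
    int rank, nprocs, to, from, i, w;
    double x[SUBPOP], fit[SUBPOP], out, in;
    MPI_Status st;

    MPI_Init(&argc, &argv);
    MPI_Comm_size(MPI_COMM_WORLD, &nprocs);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    for (i = 0; i < SUBPOP; i++) {          /* toy subpopulation */
        x[i] = (double)rank + 0.1 * (double)i;
        fit[i] = fitness(x[i]);
    }

    to = (rank + 1) % nprocs;               /* ring: send to the next island */
    from = (rank + nprocs - 1) % nprocs;    /* and receive from the previous one */

    /* Migration: emigrant leaves, immigrant arrives, in one call. */
    out = x[best_index(fit, SUBPOP)];
    MPI_Sendrecv(&out, 1, MPI_DOUBLE, to, 0,
                 &in, 1, MPI_DOUBLE, from, 0, MPI_COMM_WORLD, &st);

    w = worst_index(fit, SUBPOP);           /* immigrant replaces the worst */
    x[w] = in;
    fit[w] = fitness(in);

    printf("island %d: sent %.2f, received %.2f\n", rank, out, in);
    MPI_Finalize();
    return 0;
}
#endif
```

MPI_Sendrecv is used so that every island can send and receive in the same step without the ordering concerns of separate blocking MPI_Send and MPI_Recv calls around a ring.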

Searching Ability of DGAs

Books and Web sites

Books

“Building Linux Clusters”
“How to Build Beowulf”
“High Performance Cluster Computing”

Web sites

IEEE Computer Society Task Force on Cluster Computing http://www.ieeetfcc.org/
White Paper http://www.dcs.port.ac.uk/~mab/tfcc/WhitePaper/
Cluster top 500 http://clusters.top500.org/
Beowulf Project http://www.beowulf.org/
Beowulf Underground http://www.beowulf-underground.org/

In this tutorial…

Concept of cluster systems
How to build systems
Parallel Genetic Algorithms

SSI (Single System Image)

Entry point
File directory
Control point
Virtual network
Memory space
Job manager
User interface
Misc

Global Computing (GRID)

Internet

There are several types of computers

Powerful calculation resources

ex. SETI@home

Project rc5

From global to space computing

Distributed Computing