Grid Computing
Jean-Marc Pierson
October 2008
pierson at irit.fr
A Brain is a Lot of Data! (Mark Ellisman, UCSD)
We need to get to one micron to know the location of every cell; we are just now starting to get to 10 microns.
And comparisons must be made among many brains.
Outline
- A very short introduction to Grids
- A brief introduction to parallelism
- A not so short introduction to Grids
- "Gridification" of a sequential program
- Some grid middlewares and grid projects
- In-depth study of the Globus middleware
Grid concepts : an analogy
Electric power distribution: the electric network and high voltage
Grid concepts
Computer power distribution: the Internet network and high performance computing (parallelism and distribution)
A Definition of Grid Computing
- Ian Foster: "Resource sharing & coordinated problem solving in dynamic, multi-institutional virtual organizations". http://www-fp.mcs.anl.gov/~foster/
- Current views: see for instance the survey from Heinz Stockinger in 2006: http://hst.web.cern.ch/hst/publications/DefiningTheGrid-1.1.pdf
- Oldest: Corbato in 1965: http://www.multicians.org/fjcc3.html
- See http://en.wikipedia.org/wiki/Grid_computing for more or alternate definitions
Concrete examples of Grids
- Europe: EGEE (www.eu-egee.org/): 240 institutions, 45 countries, 72,000 CPUs, 20 PB disk, 200,000 concurrent jobs; High Energy Physics, BioInformatics, Astrophysics
- Several projects in US Labs
- Japan's Naregi, GridBus in Australia, …
- France:
An Example Virtual Organization: CERN's Large Hadron Collider
1800 Physicists, 150 Institutes, 32 Countries
100 PB of data by 2010; 70,000 CPUs
A Market or a niche?
- Only a scientist thing? Market: $4 billion by 2008 (www.gridtoday.com)
- Major vendors are on the hype: IBM, Sun, Intel, Microsoft, Oracle, HP, Hitachi, …
- About 30 big companies involved in the Open Grid Forum: www.gridforum.org
- Plus a myriad of smaller companies, plus services/consulting companies (CS-SI, IBM, Atos, Accenture, Cap Gemini, …)
- Users: IN2P3, CEA, CNES, Météo France, Airbus, IFP, banks, Peugeot, FT, …
- Applications: simulations (physics, environment, finance), genetics, chemistry, games, data centers, …
Where does all this come from???
Parallelism: an introduction
- Grids date back only to 1996; parallelism is older! (first classification in 1972)
- Motivations:
  - need more computing power (weather forecasting, atomic simulation, genetics, …)
  - need more storage capacity (petabytes and more)
  - in a word: improve performance! Three ways:
    Work harder  --> use faster hardware
    Work smarter --> optimize algorithms
    Get help     --> use more computers!
Parallelism: the old classification
- Flynn (1972): parallel architectures classified by the number of instructions (single/multiple) performed and the number of data items (single/multiple) treated at the same time
- SISD: Single Instruction, Single Data
- SIMD: Single Instruction, Multiple Data
- MISD: Multiple Instructions, Single Data
- MIMD: Multiple Instructions, Multiple Data
SIMD architectures
- in decline since 1997 (disappeared from the marketplace)
- concept: the same instruction performed on several CPUs (as many as 16,384) on different data
- data are treated in parallel
MIMD architectures
- different instructions are performed in parallel on different data
- divide and conquer: many subtasks in parallel to shorten global execution time
large heterogeneity of systems
Another taxonomy
based on how memories and processors interconnect
- SMP: Symmetric Multiprocessors
- MPP: Massively Parallel Processors
- Constellations
- Clusters
- Distributed systems
Symmetric Multi-Processors (1/3)
- Small number of identical processors (2-64)
- Share-everything architecture:
  - single memory (shared-memory architecture)
  - single I/O
  - single OS
  - equal access to resources
[Diagram: several CPUs sharing a single memory and disk over one network]
Symmetric Multi-Processors (2/3)
- Pro: easy to program:
  - only one address space to exchange data (but the programmer must take care of synchronization in memory accesses: critical sections)
Symmetric Multi-Processors (3/3)
- Cons: poor scalability: when the number of processors increases, the cost to transfer data becomes too high; more CPUs = more memory accesses over the network = more need for memory bandwidth!
- Direct transfer from processor to processor (-> MPP)
- Different interconnection schemes (full interconnection impossible! it grows in O(n^2) while the number of processors grows in O(n)): bus, crossbar, multistage crossbar, ...
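A quick count shows why a full interconnection does not scale (a back-of-the-envelope check, assuming one dedicated link per pair of processors):

    L(n) = \binom{n}{2} = \frac{n(n-1)}{2} = O(n^2)
    e.g. n = 64 processors already need L(64) = 2016 point-to-point links.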
Massively Parallel Processors (1/2)
- Several hundred nodes with a high-speed interconnection network/switch
- A share-nothing architecture: each node owns its memory (distributed memory) and one or more processors, and each runs its own OS copy
[Diagram: nodes, each with its own CPU(s) and memory, connected by an interconnection network]
Massively Parallel Processors (2/2)
- Pros: good scalability
- Cons:
  - communication between nodes is longer than in shared memory; improved interconnection schemes: hypercube, 2D or 3D meshes, torus, fat-tree, multistage crossbars
  - harder to program:
    - data and/or tasks have to be explicitly distributed to nodes
    - remote procedure calls (RPC, Java RMI)
    - message passing between nodes (PVM, MPI), synchronous or asynchronous communications
- DSM (Distributed Shared Memory): a virtually shared memory
- Upgrade: processors and/or communication?
Constellations
- a small number of processors (up to 16) clustered in SMP nodes (fast connection)
- SMPs are connected through a less costly network with "poorer" performance
- With DSM, memory may be addressed globally: each CPU has a global view of memory, and memory and cache coherence are guaranteed (ccNUMA)
[Diagram: several SMP nodes (CPUs sharing a memory over a local network) linked by an interconnection network, with shared peripherals]
Clusters
- a collection of workstations (PCs for instance) interconnected through a high-speed network, acting as an MPP/DSM with network RAM and software RAID (redundant storage, parallel I/O)
- clusters = a specialized version of NOW: Network Of Workstations
- Pros:
  - low cost
  - standard components
  - take advantage of unused computing power
Distributed systems
- interconnection of independent computers
- each node runs its own OS
- each node might be any of: SMP, MPP, constellation, cluster, individual computer, …
- the heart of the Grid!
- "A distributed system is a collection of independent computers that appear to the users of the system as a single computer." Distributed Operating Systems, A. Tanenbaum, Prentice Hall, 1994
Where are we today (Oct 20, 2008)?
- a source of reliable and up-to-date information: www.top500.org
- the 500 best machines!
- we are at 1 PetaFlops = 1000 TeraFlops
- 1 Flops = 1 floating-point operation per second
- 1 TeraFlops = 1000 GigaFlops = 1 000 000 MegaFlops = 1 000 000 000 000 Flops = one thousand billion operations per second
Today's best machines, compared on the same matrix mathematics test (Linpack): solve Ax = b
NEC Earth Simulator
- an MIMD with distributed memory
- single-stage crossbar: 2700 km of cables
- 700 TB disk space
- 1.6 PB mass storage
- area: 4 tennis courts, 3 floors
How does it grow?
- in 1993 (14 years ago!):
  - n°1: 59.7 GFlops
  - n°500: 0.4 GFlops
  - Sum = 1.17 TFlops
- in 2007 (yesterday?):
  - n°1: 280 TFlops (x4666)
  - n°500: 4 TFlops (x10000)
  - Sum = 4920 TFlops
Problems of parallelism
- Two models of parallelism:
  - driven by data flow: how to distribute data?
  - driven by control flow: how to distribute tasks?
- Scheduling:
  - which task to execute, on which data, when?
  - how to make the best use of compute time (overlap communication and computation?)
- Communication:
  - using shared memory?
  - using explicit node-to-node communication?
  - what about the network?
- Concurrent access:
  - to memory (in shared-memory systems)
  - to input/output (parallel I/O)
The performance?
- Ideally grows linearly
- Speed-up:
  - if TS is the best time to treat a problem sequentially, its time should be TP = TS/P with P processors!
  - Speedup = TS/TP
  - limited (Amdahl's law): any program has a sequential part and a parallel part: TS = F + T//, thus the speedup is limited: S = (F + T//) / (F + T///P) < 1/F (with times normalized so that TS = 1, i.e. F is the sequential fraction)
- Scale-up:
  - if TPS is the time to treat a problem of size S with P processors, then TPS should also be the time to treat a problem of size n*S with n*P processors
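A minimal worked example of Amdahl's law in the slide's notation (with TS normalized to 1 so that f = F is the sequential fraction):

    S(P) = \frac{T_S}{T_P} = \frac{F + T_{//}}{F + T_{//}/P} \xrightarrow{P \to \infty} \frac{1}{f}
    Example: f = 0.1 (10% sequential). With P = 1000 processors, S = 1/(0.1 + 0.9/1000) ≈ 9.9, and S can never exceed 10 however many processors are added.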
Network performance analysis
- scalability: can the network be extended? (limited wire length, physical problems)
- fault tolerance: what if one node is down? (for instance in a hypercube)
- multiple access to the medium? deadlock?
- The metrics:
  - latency: time to initiate a communication
  - bandwidth: measured in MB/s
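The two metrics are often combined into a simple first-order cost model for sending a message of m bytes (an assumed textbook model, not from the original slides):

    T(m) = \ell + \frac{m}{B}   where \ell is the latency and B the bandwidth
    Example: with \ell = 100 µs and B = 100 MB/s, a 1 KB message costs about 110 µs (latency-bound), while a 100 MB message costs about 1 s (bandwidth-bound).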
Tools/environment for parallelism (1/2)
Communication between nodes:
- by global memory! (if possible, physical or virtual)
- otherwise:
  - low-level communication: sockets
      s = socket(AF_INET, SOCK_STREAM, 0);
  - mid-level communication library (PVM, MPI)
      info = pvm_initsend(PvmDataDefault);
      info = pvm_pkint(array, 10, 1);
      info = pvm_send(tid, 3);
  - remote service/object call (RPC, RMI, CORBA): the service runs on a distant node; only its name and parameters (in, out) have to be known
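For comparison with the PVM calls above, a minimal MPI sketch in C (standard MPI calls; the send/receive pattern and tag value are only illustrative):

    /* rank 0 sends an array of 10 ints to rank 1, tag 3 (like the pvm_send above).
     * Compile with mpicc, run with: mpirun -np 2 ./a.out */
    #include <mpi.h>
    #include <stdio.h>

    int main(int argc, char **argv) {
        int rank, array[10] = {0};
        MPI_Init(&argc, &argv);
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);
        if (rank == 0) {
            for (int i = 0; i < 10; i++) array[i] = i;
            MPI_Send(array, 10, MPI_INT, 1, 3, MPI_COMM_WORLD);
        } else if (rank == 1) {
            MPI_Recv(array, 10, MPI_INT, 0, 3, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
            printf("rank 1 received array[9] = %d\n", array[9]);
        }
        MPI_Finalize();
        return 0;
    }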
Tools/environment for parallelism (2/2)
Programming tools
- threads: lightweight processes
- data-parallel languages (for distributed-memory architectures):
  - HPF (High Performance Fortran): say how data (arrays) are placed; the system will infer the best placement of computation (to minimize total computation time, e.g. further communications)
- task-parallel languages (for shared-memory architectures):
  - OpenMP: compiler directives and library routines, based on threads. The parallel program stays close to the sequential one; it is a step-by-step transformation
    - parallel loop directives (PARALLEL DO)
    - task-parallel constructs (PARALLEL SECTIONS)
    - PRIVATE and SHARED data declarations
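The OpenMP constructs named above, shown in C syntax (the slide lists the Fortran-style directive names; this is a small illustrative sketch, compiled with an OpenMP-enabled compiler, e.g. gcc -fopenmp):

    #include <omp.h>
    #include <stdio.h>

    int main(void) {
        const int n = 1000000;
        double sum = 0.0;

        /* parallel loop: iterations shared among threads, sum combined by reduction */
        #pragma omp parallel for reduction(+:sum)
        for (int i = 0; i < n; i++)
            sum += 1.0 / (i + 1);

        /* task-parallel sections: each section may run on a different thread */
        #pragma omp parallel sections
        {
            #pragma omp section
            printf("section A on thread %d\n", omp_get_thread_num());
            #pragma omp section
            printf("section B on thread %d\n", omp_get_thread_num());
        }

        printf("sum = %f\n", sum);
        return 0;
    }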
Bibliography / Webography
- G.C. Fox, R.D. Williams and P.C. Messina, "Parallel Computing Works!", Morgan Kaufmann, 1994, ISBN 1-55860-253-4
- M. Cosnard and D. Trystram, "Parallel Algorithms and Architectures", Thomson Learning, 1994, ISBN 1-85032-125-6
- M. Gengler, S. Ubéda and F. Desprez, "Initiation au parallélisme : concepts, architectures et algorithmes", Masson, 1995, ISBN 2-225-85014-3
- Parallelism: www.ens-lyon.fr/~desprez/SCHEDULE/tutorials.html, www.buyya.com/cluster
- Grids: www.lri.fr/~fci/Hammamet/Cosnard-Hammamet-9-4-02.ppt
- TOP 500: www.top500.org
- PVM: www.csm.ornl.gov/pvm
- OpenMP: www.openmp.org
- HPF: www.crpc.rice.edu/HPFF
Computational grid
- "HW and SW infrastructure that provides dependable, consistent, pervasive and inexpensive access to high-end computational capabilities"
- Performance criteria: security, reliability, computing power, latency, services, throughput
Grid Definition Refined
- uses open protocols
- is decentralized
- delivers non-trivial QoS
Levels of cooperation
- End system (computer, disk, sensor, …): multithreading, local I/O
- Cluster (heterogeneous): synchronous communications, DSM, parallel I/O, parallel processing
- Intranet: heterogeneity, distributed administration, distributed FS and databases, low supervision, resource discovery, high throughput
- Internet: no control, collaborative systems, (international) WAN, brokers, negotiation
Grid Characteristics
- Large scale
- Heterogeneity
- Multiple domains of administration
- Autonomy… but coordination
- Dynamicity
- Flexibility
- Extensibility
- Security
Basic services
- Authentication / Authorization / Traceability
- Activity control (monitoring)
- Resource information
- Resource brokering
- Scheduling
- Job submission, data access/migration and execution
- Accounting
Layered Grid Architecture (by analogy to the Internet architecture)
- Application
- Collective: "Coordinating multiple resources": ubiquitous infrastructure services, application-specific distributed services
- Resource: "Sharing single resources": negotiating access, controlling use
- Connectivity: "Talking to things": communication (Internet protocols) & security
- Fabric: "Controlling things locally": access to, & control of, resources
(Internet Protocol Architecture analogue: Application / Transport / Internet / Link)
From I. Foster
Elements of the Problem
- Resource sharing
  - computers, storage, sensors, networks, …
  - heterogeneity of device, mechanism, policy
  - sharing is conditional: negotiation, payment, …
- Coordinated problem solving
  - integration of distributed resources
  - compound quality-of-service requirements
- Dynamic, multi-institutional virtual organizations
  - dynamic overlays on classic organizational structures
  - map to underlying control mechanisms
From I. Foster
Aspects of the Problem
- Need for interoperability when different groups want to share resources
  - diverse components, policies, mechanisms
  - e.g., standard notions of identity, means of communication, resource descriptions
- Need for shared infrastructure services to avoid repeated development and installation
  - e.g., one port/service/protocol for remote access to computing, not one per tool/application
  - e.g., Certificate Authorities: expensive to run
- A common need for protocols & services
From I. Foster
Resources
- Description
- Advertising
- Cataloging
- Matching
- Claiming
- Reserving
- Checkpointing
Resource layers
- Application layer: tasks, resource requests
- Application resource management layer: inter-task resource management, execution environment
- System layer: resource matching, global brokering
- Owner layer: owner policy (who may use what)
- End-resource layer: end-resource policy (e.g. the OS)
Resource management (1)
- Services and protocols depend on the infrastructure
- Some parameters:
  - stability of the infrastructure (same set of resources or not)
  - freshness of the resource-availability information
  - reservation facilities
  - multiple-resource or single-resource brokering
- Example request: I need from 10 to 100 CEs (computing elements), each with at least 128 MB RAM and a computing power of 50 MIPS
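The example request could be matched against resource advertisements roughly as follows (a toy C sketch; the types and field names are hypothetical, not any real broker API):

    #include <stdio.h>

    typedef struct { const char *name; int ram_mb; int mips; } ce_t;          /* advertised CE  */
    typedef struct { int min_count, max_count, min_ram_mb, min_mips; } req_t; /* user request   */

    /* Does one CE satisfy the per-node constraints of the request? */
    static int matches(const ce_t *ce, const req_t *r) {
        return ce->ram_mb >= r->min_ram_mb && ce->mips >= r->min_mips;
    }

    int main(void) {
        ce_t pool[] = { {"ce01", 256, 80}, {"ce02", 64, 120}, {"ce03", 512, 55} };
        req_t req = { 10, 100, 128, 50 };   /* 10-100 CEs, >= 128 MB RAM, >= 50 MIPS */
        int found = 0;
        for (unsigned i = 0; i < sizeof pool / sizeof pool[0]; i++)
            if (matches(&pool[i], &req)) { printf("candidate: %s\n", pool[i].name); found++; }
        printf("%d candidate(s), at least %d needed\n", found, req.min_count);
        return 0;
    }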
Resource management and scheduling (1)
- Levels of scheduling:
  - job scheduling (global level; performance metric: throughput)
  - resource scheduling (performance metrics: fairness, utilization)
  - application scheduling (performance metrics: response time, speedup, produced data, …)
- Mapping/scheduling:
  - resource discovery and selection
  - assignment of tasks to computing resources
  - data distribution
  - task scheduling on the computing resources
  - (communication scheduling)
- Individual performances are not necessarily consistent with the global (system) performance!
Resource management and scheduling (2)
- Grid problems:
  - predictions are not definitive: dynamicity!
  - heterogeneous platforms
  - checkpointing and migration
A Resource Management System example (Globus)
[Diagram: the application expresses its needs in RSL; a broker, exchanging queries & info with the Information Service, specializes the RSL into a ground RSL; a co-allocator splits it into simple ground RSL handed to GRAM instances, which drive local resource managers such as LSF, Condor and NQE.]
Resource information (1)
- What is to be stored?
  - organizations, people, computing resources, software packages, communication resources, event producers, devices, …
  - what about data???
- A key issue in such dynamic environments
- A first approach: a (distributed) directory (LDAP)
  - easy to use
  - tree structure, distribution
  - static, mostly read; not efficient for updates
  - hierarchical
  - poor procedural (query) language
Resource information (2)
- But:
  - dynamicity
  - complex relationships
  - frequent updates
  - complex queries
- A second approach: a (relational) database
Programming the grid: potential programming models
- Message passing (PVM, MPI)
- Distributed Shared Memory
- Data parallelism (HPF, HPC++)
- Task parallelism (Condor)
- Client/server - RPC
- Agents
- Integration systems (CORBA, DCOM, RMI)
Program execution: issues
- Parallelize the program with the right job structure, communication patterns/procedures, algorithms
- Discover the available resources
- Select the suitable resources
- Allocate or reserve these resources
- Migrate the data (or the code)
- Initiate computations
- Monitor the executions; checkpoints?
- React to changes
- Collect results
Data management
- It was long forgotten!!! Though it is a key issue!
- Issues: indexing, retrieval, replication, caching, traceability (auditing)
- And security!!!
From computing grids to information grids
From computing grids to information grids (1)
- Grids long lacked most of the tools needed to share (index, search, access), analyze, secure and monitor semantic data (information)
- Several reasons: history, money, difficulty
- Why is it so difficult?
  - sensitivity, but openness
  - multiple administrative domains, multiple actors, heterogeneity, but a single global architecture/view/system
  - dynamicity and unpredictability, but robustness
  - wide area, but high performance
From computing grids to information grids (2)
Example: the Replica Management Problem
- Maintain a mapping between logical names for files and collections and one or more physical locations
- Decide where and when a piece of data must be replicated
- Important for many applications; example: CERN high-level trigger data
  - multiple petabytes of data per year
  - a copy of everything at CERN (Tier 0)
  - subsets at national centers (Tier 1)
  - smaller regional centers (Tier 2)
  - individual researchers have copies of pieces of data
- Much more complex with sensitive and complex data like medical data!!!
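A toy sketch of the logical-to-physical mapping at the heart of the problem (illustrative only; real catalogues such as the Replica Location Service shown below are distributed services, and the file names and hosts here are made up):

    #include <stdio.h>
    #include <string.h>

    #define MAX_REPLICAS 4
    typedef struct {
        const char *lfn;               /* logical file name            */
        const char *pfn[MAX_REPLICAS]; /* physical replicas (per tier) */
        int n;
    } mapping_t;

    int main(void) {
        mapping_t catalogue[] = {
            { "lhc/run42/event.dat",
              { "gsiftp://tier0.example.org/data/event.dat",
                "gsiftp://tier1.example.org/data/event.dat" }, 2 },
        };
        const char *wanted = "lhc/run42/event.dat";
        for (unsigned i = 0; i < sizeof catalogue / sizeof catalogue[0]; i++)
            if (strcmp(catalogue[i].lfn, wanted) == 0)
                for (int r = 0; r < catalogue[i].n; r++)
                    printf("replica %d: %s\n", r, catalogue[i].pfn[r]);
        return 0;
    }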
From computing grids to information grids (3)
Some (still…) open issues
- Security, security, security (incl. privacy, monitoring, traceability, …) at a semantic level
- Access protocols (incl. replication, caching, migration, …)
- Indexing tools
- Brokering of data (incl. accounting)
- (Content-based) query optimization and execution
- Mediation of data
- Data integration, data warehousing and analysis tools
- Knowledge discovery and data mining
Functional View of Grid Data Management
[Diagram: the Application consults a Metadata Service (location based on data attributes), a Replica Location Service (location of one or more physical replicas) and Information Services (state of grid resources, performance measurements and predictions); a Planner chooses data locations, replicas, and compute and storage nodes; an Executor initiates data transfers and computations through Data Movement and Data Access services over Compute and Storage Resources; Security and Policy apply throughout.]
Security: Why Grid Security is Hard
- Resources being used may be extremely valuable & the problems being solved extremely sensitive
- Resources are often located in distinct administrative domains; each resource may have its own policies & procedures
- Users may be different
- The set of resources used by a single computation may be large, dynamic, and/or unpredictable; not just client/server
- The security service must be broadly available & applicable: standard, well-tested, well-understood protocols; integration with a wide variety of tools
Grid security: various views
User view:
1) Easy to use
2) Single sign-on
3) Run applications: ftp, ssh, MPI, Condor, Web, …
4) User-based trust model
5) Proxies/agents (delegation)
Resource owner view:
1) Specify local access control
2) Auditing, accounting, etc.
3) Integration with local systems: Kerberos, AFS, license managers
4) Protection from compromised resources
Developer view:
- API/SDK with authentication, flexible message protection, flexible communication, delegation, …
- Direct calls to various security functions (e.g. GSS-API), or security integrated into higher-level SDKs (e.g. GlobusIO, Condor)
Grid security: requirements
- Authentication
- Authorization and delegation of authority
- Assurance
- Accounting
- Auditing and monitoring
- Traceability
- Integrity and confidentiality
Query optimization and execution
- Old wine in new bottles? Yes and no: the problem itself has not changed, but the operational context has changed so much that classical heuristics and methods are no longer pertinent
- Key issues: dynamicity, unpredictability, adaptability
- Very few works have specifically addressed this problem
- Use mobile agents?
Service Oriented Architecture
- Open Grid Services Architecture (OGSA)
- WSRF: Web Services Resource Framework
- Everything is a resource (WS-Resources)
- Access through services
[Figure omitted: © Globus Tutorial]
Bibliography
- The Grid 2: Blueprint for a New Computing Infrastructure. Ian Foster, Carl Kesselman
- Grid Computing: The Savvy Manager's Guide. Pawel Plaszczak, Richard Wellner Jr.