Upload
others
View
5
Download
0
Embed Size (px)
Citation preview
INFSORI508833
Enabling Grids for EsciencE
www.euegee.org
Grids and Grid Applications
C. Loomis (LALOrsay)EGEE Induction (ClermontFerrand)March 22, 2005
Grid Intro. – C. Loomis – 22/03/2005 2
Enabling Grids for EsciencE
INFSORI508833
Acknowledgements
• Based on previous presentations by:– Dave Berry (NeSC) & David Fergusson
• Contains material from:– Andrew Grimshaw (Univ. of Virginia)– Bob Jones (EGEE Tech. Director)– Mark Parsons (EPCC)– EDG Training Team– Roberto Barbera (INFN)– Ian Foster (Argonne National Laboratories)– Jeffrey Grethe (SDSC)– The National eScience Centre– M. Petitdidier (EGAPP presentation)– O. Gervasi (EGAPP presentation)
Grid Intro. – C. Loomis – 22/03/2005 3
Enabling Grids for EsciencE
INFSORI508833
Contents
• EGEE: Enabling Grids for EsciencE• Introduction to Grid Computing
– Motivation– Expectations & Constraints– Historical Perspective– Grid Architectures– Converging Technologies
• Grid Applications (eScience)– Characteristics of eScience– EGEE application areas– Typical Scenarios
• Summary/Questions
Grid Intro. – C. Loomis – 22/03/2005 4
Enabling Grids for EsciencE
INFSORI508833
Goal of Grid Computing
• Goal in one sentence:– Allow scientists from multiple domains to use, share, and
manage geographically distributed resources transparently.
• Simple statement, many consequences:– Not specific to a particular application.– Jobs, policies cross administrative & political domains.– Sharing requires a means for accounting.– Transparency implies standardized services & APIs.– Access control for data and services.– Dynamic and heterogeneous resources.
Grid Intro. – C. Loomis – 22/03/2005 5
Enabling Grids for EsciencE
INFSORI508833
Grid Actors
• Users– Scientists with tasks requiring computational resources.
• Virtual Organizations– People from different institutions with common goals.– Share computational resources to achieve those goals.
• System Administrators– People responsible for keeping an institute's resources running.– Ensuring efficient and correct use of available resources.
• Real Organizations– Institutes, funding agencies, governments, ...
• Standards Bodies– OASIS, GGF, W3C, IETF, ...
Grid Intro. – C. Loomis – 22/03/2005 6
Enabling Grids for EsciencE
INFSORI508833
Grid Vision
Grid
“M
iddl
ewar
e”
Grid Intro. – C. Loomis – 22/03/2005 7
Enabling Grids for EsciencE
INFSORI508833
Grid Vision
Grid
“M
iddl
ewar
e”Grid technology allows scientists:
• access resources universally
• interact with colleagues
• analyse voluminous data
• share results
Grid technology allow scientists:– access resources universally– interact with colleagues– analyze voluminous data– share results
Grid Intro. – C. Loomis – 22/03/2005 8
Enabling Grids for EsciencE
INFSORI508833
Grid Vision
Grid
“M
iddl
ewar
e” Includes traditional resources:
– raw compute power– storage (disk, tape, ...)– network connectivity
Resources are:– heterogeneous– dynamic
Grid Intro. – C. Loomis – 22/03/2005 9
Enabling Grids for EsciencE
INFSORI508833
Grid Vision
Grid
“M
iddl
ewar
e” Detectors produce huge amounts
of data for analysis.
Nontraditional resources:– scientific instruments– conferencing technologies
video audio chat
Grid Intro. – C. Loomis – 22/03/2005 10
Enabling Grids for EsciencE
INFSORI508833
Grid Vision
Grid
“M
iddl
ewar
e” Access to data:
– data files and datasets– databases– replica metadata– application metadata
Manage data:– transfer and copy data– locate relevant data
Grid Intro. – C. Loomis – 22/03/2005 11
Enabling Grids for EsciencE
INFSORI508833
Grid Vision
Grid
“M
iddl
ewar
e” Services:
– highlevel services to facilitate use of grid e.g. job brokering
– applicationspecific services e.g. portals
Grid Intro. – C. Loomis – 22/03/2005 12
Enabling Grids for EsciencE
INFSORI508833
Grid Vision
Grid
“M
iddl
ewar
e” What is the grid?
• Middleware: service interoperability highlevel services
• Resources: provided by participants shared for efficient use
Grid Intro. – C. Loomis – 22/03/2005 13
Enabling Grids for EsciencE
INFSORI508833
Scientific Motivation
• Avoid reinventing the wheel:– Many computational tasks are common.– Highlevel, standardized services avoid duplication.– Scientists concentrate on results rather than tools.
• Resource needs grow with time:– Start small for testing.– Push limits for ultimate sensitivity.– Grid APIs make finding and using additional resources easier.
• Data access:– Find and access existing data more easily.– Share results for others to build upon.
Grid Intro. – C. Loomis – 22/03/2005 14
Enabling Grids for EsciencE
INFSORI508833
Economic Motivation
• Use of computing resources varies with time.– Analysis rush before major conferences.– Endofquarter financial analyzes.– July and August holidays.
• Current solutions:– Buy peak needed capacity; idle in nonpeak periods.– Buy average capacity; delay results.
• Grid solution:– Share resources to timeshift availability.– Buy average capacity but get timely results!– Improve reliability with automatic failover.
Grid Intro. – C. Loomis – 22/03/2005 15
Enabling Grids for EsciencE
INFSORI508833
Grid Projects WorldwideAccess GridDISCOMDOE Science GridCondorESG (Earth System Grid)Fusion CollaboratoryGlobusGrADSoft (Grid Application Development Software)Grid CanadaGRIDS (Grid Research Integration Development & Support Center)GriPhyN (Grid Physics Network)iVDGL (International Virtual Data Grid Laboratory)Music GridNASA Information Power GridNCSA Alliance Access Grid
AstroGridAVO (Astrophysical Virtual Observatory)CombechemCrossGridDAME (Distributed Aircraft Maintenance Environment)DAMIEN (Distributed Applications and Middleware for Industrial Networks)DataTAGDiscovery NetDutchGridEDG (European DataGrid)EGSO (European Grid of Solar Observations)GEODISE (Grid Enabled Optimisation & Design Search for Engineering)
GRIA (Grid Resources for Industrial Applications)GridIrelandGridLab (Grid Application Toolkit and Testbed)GridPPLCG (LHC Computing Grid)MyGridNGIL (National Grid for Learning Scotland)NorduGrid (Nordic Testbed for Wide Area Computing and Data Handling)PIONIER GridReality GridScotGrid
ApGridApBioNetGrid Forum KoreaPRAGMA (Rim Applications and Grid Middleware Assembly)Grid Datafarm for Petascale Data Intensive ComputingGridbus Project
Grid Intro. – C. Loomis – 22/03/2005 16
Enabling Grids for EsciencE
INFSORI508833
Major European Grid Projects• European Funded
– European DataGrid– CrossGrid– DataTAG– LHC Computing Grid– GridLab– EUROGRID– DEISA– EGEE
• National Grid Efforts– INFN Grid– NorduGrid– UK eScience Programme
Grid Intro. – C. Loomis – 22/03/2005 17
Enabling Grids for EsciencE
INFSORI508833
Family Tree
EDG
Globus Condor
LCG Alien
EGEE
...
PanEuropean testbed.
Complete, functional set of services.
Significant productions demonstrated.
Subset of EDG services.
Improved robustness, scalability.
Worldwide production service.
Reengineering:RobustnessStandard interfaces
Expanded set of services.
Grid Intro. – C. Loomis – 22/03/2005 18
Enabling Grids for EsciencE
INFSORI508833
Underlying Technology
• Relative CPU, storage, and network capability impacts computing architecture.
Gilder’s Law(32X in 4 yrs)
Storage Law (16X in 4yrs)
Moore’s Law(5X in 4yrs)
Perfo
rman
ce p
er D
olla
r Spe
nt Optical Fibre(bits per second)
Chip capacity(numb. transistors)
Data Storage(bits per sq. inch)
0 1 2 3 4 5
9 12 18
Doubling Time(months)
Triumph of Light – Scientific American. George Stix, January 2001
Gilder’s Law(32X in 4 yrs)
Storage Law (16X in 4yrs)
Moore’s Law(5X in 4yrs)
Perfo
rman
ce p
er D
olla
r Spe
nt Optical Fibre(bits per second)
Chip capacity(numb. transistors)
Data Storage(bits per sq. inch)
0 1 2 3 4 5
9 12 18
Doubling Time(months)
Triumph of Light – Scientific American. George Stix, January 2001Number of Years
Grid Intro. – C. Loomis – 22/03/2005 19
Enabling Grids for EsciencE
INFSORI508833
Historical Perspective
• Local Computing– All computing resources at single site.– People move to resources to work.
• Remote Computing– Resources accessible from distance.– All significant resources still centralized.
• Distributed Computing– Resources geographically distributed.– Specialized access; largely data transfers.
• Grid Computing– Resources and services geographically distributed.– Standard interfaces; transfers of computations and data.
Grid Intro. – C. Loomis – 22/03/2005 20
Enabling Grids for EsciencE
INFSORI508833
LCG Architecture
submit
Broker
...
Resource
Resource
InfoSys.
User
submit
publish descriptionsquery
Batchlike architecture.
Stable, wellmaintained resources.
Secured via Public Key Infrastructure (PKI)
Heavy support infrastructure.
Can handle large range of resources.
Grid Intro. – C. Loomis – 22/03/2005 21
Enabling Grids for EsciencE
INFSORI508833
“PeertoPeer” Architecture
Peertopeer architecture (BOINC, XtremeWeb)
Volatile resources.
Limited security (client identifies server).
Lightweight infrastructure.
Handles limited types of resources.
App. DB
...
Resource
ResourceUser pull task
...
Grid Intro. – C. Loomis – 22/03/2005 22
Enabling Grids for EsciencE
INFSORI508833
Service Oriented Architecture
Existing LCG system is largely serviceoriented.
EGEE evolving to a clean SOA:standard interfacesstandard technologies
Client Service
Registry
querylocation
publish description
interaction
Grid Intro. – C. Loomis – 22/03/2005 23
Enabling Grids for EsciencE
INFSORI508833
Convergence of Technologies
• Web Services– Clean, complete specification of service APIs.– Supported technology:
Good support within commercial sector. Adequate support within opensource community.
– Very active ➔ proposed standards rapidly evolving.
• EGEE Service Evolution– Plain web services:
Avoid “proprietary” protocols and interfaces. Fairly stable, will ease further evolution.
– Adopt WSRF and/or WS* standards as appropriate.
• Expect uservisible changes in APIs.
Grid Intro. – C. Loomis – 22/03/2005 24
Enabling Grids for EsciencE
INFSORI508833
eScience
• eScience: Pushing frontiers of scientific discovery by exploiting advanced computational methods.
• Use grid technology to:– Generate, curate, and analyze research data.– Develop and explore models and simulations.– Facilitate sharing of data, results, and resources.
Grid Intro. – C. Loomis – 22/03/2005 25
Enabling Grids for EsciencE
INFSORI508833
EGEE Applications• Biomedical Applications
– imaging, diagnosis, treatment– genome and protein studies
• HighEnergy Physics– simulation of particle interactions– analysis of detector data
• Earth Science– observing terrestrial conditions– natural resources
• Computational Chemistry– simulation of chemical properties
Grid Intro. – C. Loomis – 22/03/2005 26
Enabling Grids for EsciencE
INFSORI508833
Typical Scenarios
• Batch Use– Use the grid as a huge computational resource.– Simulation plays an important role in nearly all fields.
• Portal Use– Use grid for load balancing of standardized applications.– Provides easy interface which hides grid complexities.
• Agent Use– Centralized control and monitoring of a large production.– “Agent” jobs contact central database when started.
• Interactive Use– Provide improved response time for intensive calculations.– Debugging of applications in situ.
Grid Intro. – C. Loomis – 22/03/2005 27
Enabling Grids for EsciencE
INFSORI508833
Badly Adapted Uses
• Multisite parallel jobs:– Can't guarantee simultaneous start of all processes.– Can't easily control bandwidth and latencies between processes.– Can expose MPIenabled site as a grid resource.
• Tasks requiring realtime response.– Same as above: start up and response latencies, bandwidth.– If start up latency OK, can use “agent” model.
Grid Intro. – C. Loomis – 22/03/2005 28
Enabling Grids for EsciencE
INFSORI508833
Summary
Grid technology attractive to many scientific endeavors:• Provides means of sharing resources to:
– reduce overall hardware cost– reduce response times– improve reliability
• Standardized, highlevel APIs:– allow services to interoperate effectively– allow scientists to concentrate on science rather than tools
• EGEE:– improving the technology– deploying a powerful, worldwide grid
Grid Intro. – C. Loomis – 22/03/2005 29
Enabling Grids for EsciencE
INFSORI508833
Questions
• Questions?