Introduction to Grids
Dr Kevin Vella
Department of Computer Science
University of Malta
2
Overview
• Introduction
• Grid Architecture
• Grid Applications
• Grid Projects
3
Grid Computing
• An analogy with the electrical power grid
  • It is pervasive, just log (or plug) in and use anywhere
  • It is a utility, you ask for resources and you get what you asked for
  • You do not know or care where the resources are coming from (power station/data centre)
• The Web enables seamless sharing of information
  • Developed to enable information sharing between researchers
• The Grid enables seamless sharing of information, compute power, storage, databases and applications
  • Revolutionising the way research is conducted
  • Empowering researchers in remote areas with limited facilities
4
Grid Computing
• Grid Computing: controlled sharing of geographically distributed resources that are owned and administered by several organisations
  • Supercomputers (Roadrunner – c. 1 petaflop/s)
  • Big Science equipment (telescopes – SKA, particle accelerators – LHC)
  • Digital archives
• Grid: distributed supercomputer with access to huge data resources and world-class equipment (EGI/EGEE, OSG, WLCG)
• Agreed-upon standards and protocols are crucial (OGF, OGSA)
• Compliant middleware (often based on web service standards) needs to be present on all participating nodes (gLite, Globus)
• Batch jobs/interactive web-based grid applications
5
Related Technologies
• Distributed computing
• Cluster computing
• Meta-computing
• Application service provision (ASP)
• Software-as-a-Service (SaaS)
• Utility computing
• Cloud computing
• P2P computing
• Web services, WSRF
Grid technologies complement existing distributed technologies by extending the distribution across organisational boundaries.
6
Resources and Services
• The grid needs to schedule access by multiple users (human or machine) to a wide range of distributed shared resources
  • Multiple servers or peers
• Standard protocols guarantee interoperability across a range of systems (hardware, OS, programming language)
• Services are defined solely in terms of
  • Protocol
  • Behaviour
  thus abstracting away internal heterogeneity of resources
  • All resources are exposed as services
• New generation grid middleware uses web service standards (see WSRF)
• Standard services include
  • Data access
  • Resource discovery
  • Access to computational resources
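The idea that a service is defined only by its protocol and behaviour can be sketched in a few lines of Python. Everything named here (`ComputeService`, `BatchClusterService`, `DesktopPoolService`, `submit_everywhere`) is hypothetical and invented for illustration; it is not the interface of any real grid middleware.

```python
from typing import Protocol


class ComputeService(Protocol):
    """Hypothetical contract: the only thing a client may rely on."""

    def submit(self, command: str) -> str:
        """Queue a command and return a job identifier."""
        ...

    def status(self, job_id: str) -> str:
        """Return 'queued', 'done' or 'unknown' for a job identifier."""
        ...


class BatchClusterService:
    """Backed by an in-memory list, standing in for a batch scheduler."""

    def __init__(self) -> None:
        self._queue: list[str] = []

    def submit(self, command: str) -> str:
        self._queue.append(command)
        return f"batch-{len(self._queue)}"

    def status(self, job_id: str) -> str:
        index = int(job_id.split("-")[1])
        return "queued" if 1 <= index <= len(self._queue) else "unknown"


class DesktopPoolService:
    """Backed by a dictionary, standing in for a pool of idle desktops."""

    def __init__(self) -> None:
        self._jobs: dict[str, str] = {}

    def submit(self, command: str) -> str:
        job_id = f"pool-{len(self._jobs) + 1}"
        self._jobs[job_id] = "done"  # pretend the job ran immediately
        return job_id

    def status(self, job_id: str) -> str:
        return self._jobs.get(job_id, "unknown")


def submit_everywhere(services: list[ComputeService], command: str) -> list[str]:
    """Submit one command to every service and report each job's status."""
    return [svc.status(svc.submit(command)) for svc in services]


statuses = submit_everywhere([BatchClusterService(), DesktopPoolService()], "simulate")
print(statuses)
```

The client function never inspects which backend it is talking to, which is the sense in which internal heterogeneity is abstracted away.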
7
Virtual Organisations
• A virtual organisation is a community whose members are sharing a set of resources/services
  • Multiple institutions may be involved (e.g. universities and research labs, business consortia)
  • Sophisticated rules govern sharing within a VO
  • Users and resources may be geographically dispersed
  • VOs are long-lived and dynamic
• VOs tend to be domain-specific (e.g. LHC, Biomed)
  • The LHC VO in EGEE enables European physicists to analyse immense datasets produced by the CERN LHC experiment using several supercomputers across Europe
• A grid generally contains several VOs
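A rough way to picture a VO is as a membership list plus a sharing policy, with a grid hosting several such VOs. The sketch below assumes a very simple rule format (resource name mapped to permitted actions); the class names, user names and resource names are all made up, and real VO policies are considerably richer.

```python
from dataclasses import dataclass, field


@dataclass
class VirtualOrganisation:
    name: str
    members: set[str] = field(default_factory=set)
    # Assumed policy format: resource name -> actions members may perform.
    policy: dict[str, set[str]] = field(default_factory=dict)

    def may(self, user: str, action: str, resource: str) -> bool:
        """A member may act on a resource only if the VO policy allows it."""
        return user in self.members and action in self.policy.get(resource, set())


@dataclass
class Grid:
    vos: list[VirtualOrganisation] = field(default_factory=list)

    def authorise(self, user: str, action: str, resource: str) -> bool:
        """Access is granted if any VO hosted on the grid permits it."""
        return any(vo.may(user, action, resource) for vo in self.vos)


# Two hypothetical, domain-specific VOs sharing one grid.
physics = VirtualOrganisation(
    "physics", {"alice"}, {"collision-data": {"read"}, "cluster-a": {"submit"}}
)
biomed = VirtualOrganisation("biomed", {"bob"}, {"cluster-a": {"submit"}})
grid = Grid([physics, biomed])

print(grid.authorise("alice", "read", "collision-data"))
print(grid.authorise("bob", "read", "collision-data"))
```

Both users can submit to the shared cluster, but only the physics member can read the physics dataset, since rights come from VO membership rather than from the grid as a whole.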
9
Grid Architecture
Application
Collective (Groups)
Resource (Services)
Connectivity (TCP/IP, IPv6)
Fabric (Physical resources)
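The layer stack above can be written down as an ordered enumeration. This is only an illustrative model: the `Layer` enum and `path_to_fabric` helper are invented here, and the sketch assumes a call passes through every layer beneath it, whereas a real component may talk directly to any lower layer.

```python
from enum import IntEnum


class Layer(IntEnum):
    """Grid architecture layers, numbered from the bottom of the stack up."""

    FABRIC = 1        # physical resources
    CONNECTIVITY = 2  # TCP/IP, IPv6
    RESOURCE = 3      # services
    COLLECTIVE = 4    # groups
    APPLICATION = 5


def path_to_fabric(start: Layer) -> list[str]:
    """Names of the layers a call crosses on its way down from `start`."""
    return [layer.name for layer in sorted(Layer, reverse=True) if layer <= start]


print(path_to_fabric(Layer.COLLECTIVE))
```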
10
Fabric Layer
• The fabric layer exposes local resources located at various sites to be shared across the Grid
  • Inspect and change a resource’s state through a standard protocol
• Richer functionality enables more sophisticated sharing at higher layers
  • Advance reservation and co-scheduling
• Simpler functionality simplifies integration with existing management interfaces to resources
• Resource capabilities vary
  • Computational resources
  • Storage resources (file operations, free space)
  • Network resources (bandwidth reservation, load inspection)
  • Code repositories (source and object, e.g. CVS)
  • Catalogs (query and update, e.g. databases)
• ‘Exactly-once’ semantics are required
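The slide does not say how ‘exactly-once’ semantics are achieved. One common approach, assumed here purely for illustration, is to let clients retry freely and have the resource deduplicate by request identifier, so a repeated request changes state only once. `StorageResource` and its methods are hypothetical names.

```python
class StorageResource:
    """Toy storage resource whose state can be inspected and changed."""

    def __init__(self, free_bytes: int) -> None:
        self.free_bytes = free_bytes
        self._completed: dict[str, bool] = {}  # request id -> earlier outcome

    def inspect(self) -> dict[str, int]:
        """Inspection side of the protocol: report current state."""
        return {"free_bytes": self.free_bytes}

    def reserve(self, request_id: str, nbytes: int) -> bool:
        """Change state at most once per request id, even if retried."""
        if request_id in self._completed:
            return self._completed[request_id]  # replay the recorded outcome
        granted = nbytes <= self.free_bytes
        if granted:
            self.free_bytes -= nbytes
        self._completed[request_id] = granted
        return granted


disk = StorageResource(free_bytes=1000)
first = disk.reserve("req-1", 400)
retry = disk.reserve("req-1", 400)  # e.g. the client timed out and resent
print(first, retry, disk.inspect())
```

The retry returns the same answer as the first attempt and the free space is reduced only once, which is the observable effect the slide asks for.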
11
Connectivity Layer
• The connectivity layer provides communication and authentication protocols
  • Currently TCP/IP is used for communication
  • Authentication requirements
    • Single sign-on (user authenticates only once, at the start of a session)
    • Delegation (user delegates rights to a program)
    • Integration with existing security solutions (e.g. Kerberos, UNIX)
    • User-based trust (multiple sites can interact with each other on behalf of a user, without specific intervention by individual site administrators)
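Single sign-on and delegation can be pictured as a chain of credentials: the user signs on once, then hands a program a derived credential with fewer rights and a shorter lifetime, and each site checks the chain back to the user. The sketch below is a toy model with no cryptography at all; `Credential`, `delegate` and `site_accepts` are hypothetical names, and timestamps are plain numbers passed in explicitly.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Credential:
    subject: str
    rights: frozenset[str]
    expires_at: float
    issuer: Optional[Credential] = None  # None marks the user's own sign-on

    def delegate(self, to: str, rights: set[str], lifetime: float, now: float) -> Credential:
        """Derive a credential that can never exceed this one in rights or lifetime."""
        granted = frozenset(rights) & self.rights
        return Credential(to, granted, min(self.expires_at, now + lifetime), self)

    def root_user(self) -> str:
        """Walk the delegation chain back to the user who signed on."""
        cred = self
        while cred.issuer is not None:
            cred = cred.issuer
        return cred.subject


def site_accepts(cred: Credential, right: str, now: float, trusted_users: set[str]) -> bool:
    """A site trusts the originating user, not the individual program."""
    return cred.root_user() in trusted_users and right in cred.rights and now < cred.expires_at


# The user authenticates once, at the start of the session (time 0).
user = Credential("alice", frozenset({"read", "submit"}), expires_at=1000.0)
proxy = user.delegate("job-runner", {"submit", "delete"}, lifetime=100.0, now=0.0)

print(site_accepts(proxy, "submit", now=50.0, trusted_users={"alice"}))
print(site_accepts(proxy, "delete", now=50.0, trusted_users={"alice"}))
print(site_accepts(proxy, "submit", now=150.0, trusted_users={"alice"}))
```

The program cannot gain the "delete" right because the user never held it, and its credential lapses after the shorter lifetime even though the user's session is still valid.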
12
Resource Layer
• The resource layer enables sharing of individual resources on the grid by accessing fabric APIs through the connectivity layer
• Resource sharing is done using
  • Information protocols to obtain resource state (configuration, load, usage policy)
  • Management protocols to negotiate access (advance reservation, QoS, operation to be performed, status monitoring, accounting and payment) e.g. GridFTP, LDAP
• Small and focused set of resource abstractions based on fabric characterisation
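The split between information and management protocols might look like the following for a single compute resource. This is a hedged sketch: `ComputeResource`, `describe` and `negotiate` are invented names, and the negotiation rule (grant the smallest of the request, the free capacity and a per-user cap) is an assumption made for the example.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ComputeResource:
    name: str
    total_cpus: int
    busy_cpus: int
    max_cpus_per_user: int  # assumed form of a usage policy

    def describe(self) -> dict[str, int]:
        """Information protocol: report configuration, load and usage policy."""
        return {
            "total_cpus": self.total_cpus,
            "free_cpus": self.total_cpus - self.busy_cpus,
            "max_cpus_per_user": self.max_cpus_per_user,
        }

    def negotiate(self, cpus_wanted: int) -> Optional[int]:
        """Management protocol: grant what load and policy allow, or refuse."""
        free = self.total_cpus - self.busy_cpus
        granted = min(cpus_wanted, free, self.max_cpus_per_user)
        if granted <= 0:
            return None
        self.busy_cpus += granted
        return granted


cluster = ComputeResource("cluster-a", total_cpus=32, busy_cpus=20, max_cpus_per_user=8)
print(cluster.describe())
print(cluster.negotiate(16))  # capped by the per-user policy
print(cluster.negotiate(16))  # capped by the remaining free CPUs
print(cluster.negotiate(16))  # nothing left to grant
```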
13
Collective Layer
• The collective layer deals with the coordination of collections of resources
  • Persistent services across groups – a ‘session’ between several parties
  • Collective state is shared across multiple resources
• Collective layer services include
  • Directories (e.g. to query resources available on a VO)
  • Co-allocation, scheduling and brokering
  • Data replication (e.g. file system caching)
  • Software discovery
  • Collaboration
  • Grid-enabled software run-time environments
  • Monitoring and diagnostics
• Collective components may include individual resources as well as other collective components
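Co-allocation and brokering can be illustrated with a toy broker that consults a directory of free CPUs per site and either satisfies the whole request across several sites or allocates nothing. The greedy, largest-site-first strategy, the `co_allocate` function and the site names are all assumptions for the example rather than how any particular grid broker works.

```python
def co_allocate(directory: dict[str, int], cpus_needed: int) -> dict[str, int]:
    """Return an all-or-nothing allocation plan mapping site -> CPUs taken."""
    plan: dict[str, int] = {}
    remaining = cpus_needed
    # Greedy choice: fill the request from the sites with the most free CPUs.
    for site, free in sorted(directory.items(), key=lambda item: item[1], reverse=True):
        if remaining == 0:
            break
        take = min(free, remaining)
        if take > 0:
            plan[site] = take
            remaining -= take
    # Co-allocation only makes sense if every part of the request is met.
    return plan if remaining == 0 else {}


# Hypothetical directory contents: free CPUs currently advertised per site.
directory = {"site-a": 8, "site-b": 64, "site-c": 16}

print(co_allocate(directory, 72))
print(co_allocate(directory, 100))
```

The first request spans two sites; the second exceeds the 88 CPUs available in total, so the broker returns an empty plan instead of a partial one.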
15
CERN and the Grid
CERN Large Hadron Collider
The world’s most powerful particle accelerator
4 Large Experiments
ATLAS
16
CERN and the Grid
Example from LHC: starting from this event
We are looking for this “signature”
Selectivity: 1 in 10^13
Like looking for 1 person in a thousand world populations, or for a needle in 20 million haystacks!
• ~100,000,000 electronic channels
• 0.0002 Higgs per second
• 15 PBytes of data a year
• (10 Million GBytes = 14 Million CDs)
[Figure: one year’s data from LHC would fill a stack of CDs 20 km high, compared with Concorde (15 km) and Mt. Blanc (4.8 km)]
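The figures on this slide can be sanity-checked with a little arithmetic. The calculation below takes the slide's own numbers (0.0002 Higgs per second, 10 million GBytes) and assumes a CD holds about 0.7 GB and is 1.2 mm thick; it also assumes the accelerator runs all year, which overstates the event count. It is an order-of-magnitude check, not a reproduction of how the slide's values were derived.

```python
SECONDS_PER_YEAR = 365 * 24 * 3600  # 31,536,000, ignoring leap years and downtime

HIGGS_PER_SECOND = 0.0002           # from the slide
DATA_GB = 10_000_000                # "10 Million GBytes" from the slide
CD_CAPACITY_GB = 0.7                # assumption: a 700 MB CD
CD_THICKNESS_KM = 1.2e-6            # assumption: 1.2 mm per disc, no case

higgs_per_year = HIGGS_PER_SECOND * SECONDS_PER_YEAR
cds_needed = DATA_GB / CD_CAPACITY_GB
stack_height_km = cds_needed * CD_THICKNESS_KM

print(f"Higgs events per year: about {higgs_per_year:.0f}")
print(f"CDs needed: about {cds_needed / 1e6:.1f} million")
print(f"Stack height: about {stack_height_km:.1f} km")
```

Under these assumptions the result is a few thousand Higgs events a year, roughly 14 million CDs, and a stack of about 17 km, which is the same order as the 20 km quoted on the slide.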
17
Grid Applications
• A wide variety of scientific applications are running on European grids
• High Energy Physics: Large Hadron Collider experiments (ATLAS, CMS, ALICE, LHCb) at CERN
• Biomedical Applications
  • GPS@ portal: protein sequence similarity searches, sites and signatures detection, multiple alignment, secondary structure prediction and primary structure analysis
  • WISDOM: finding new drugs against malaria, H5N1, etc.
• Astrophysics Applications
  • ESA is simulating the forthcoming Planck satellite mission and testing the data pipelines, thus providing input to the mission’s hardware requirements
  • Processing data from MAGIC, an imaging atmospheric telescope located on the Canary Islands that is used for astro-particle physics research
18
Grid Applications
• Earth Science and Geophysics Applications
  • Analysis of ozone profiles from the GOME satellite and oil spill data from the ERS/SAR satellite, facilitating data sharing within the earth observation community
  • Monte Carlo simulations for seawater intrusion in a coastal aquifer of the Mediterranean basin
• Other areas such as Computational Chemistry, Financial and Economic Research, Digital Libraries, Fusion Research
20
The EuroMed Scenario
• A pan-EU high-speed research network with full operational support is available
  • GÉANT / GÉANT-II
  • EUMEDCONNECT
• An overlying European Grid Infrastructure
  • EGEE / EGEE-II / EGEE-III
  • EUMEDGRID, SEEGRID, EUCHINAGRID, EELA
• Fostering collaboration between researchers from Europe and other countries
• Bridging the digital divide among areas with different levels of technological development
21
The EUMEDGRID Project
• EU-funded project (FP6 SSA)
• Principal objectives:
  • build the first high performance computing grid extending across the Mediterranean
  • foster National Grid Initiatives in the Mediterranean region
“Computing and storage capacity on demand for researchers in the Mediterranean”
22
EUMEDCONNECT
23
EUMEDGRID
24
GEANT 2
25
EGEE
Taken from EGEE 2008 report
26
>250 sites
48 countries
>50,000 CPUs
>20 PetaBytes
>10,000 users
>150 VOs
>150,000 jobs/day
Application areas include:
Archeology
Astronomy
Astrophysics
Civil Protection
Comp. Chemistry
Earth Sciences
Finance
Fusion
Geophysics
High Energy Physics
Life Sciences
Multimedia
Material Sciences
…
Taken from EGEE 2008 report
27
EGI: The European Grid Initiative
• To ensure long-term sustainability of the European Grid beyond the fixed-term EGEE projects
• To facilitate integration and interaction between European National Grid Initiatives (NGIs)
28
Collaborating e-Infrastructures
Taken from EGEE 2008 report
29
Information Sources
• Foster et al. The Anatomy of the Grid. Intl. J. Supercomputer Applications, 2001
• EGEE site. www.eu-egee.org
• EUMEDGRID site. www.eumedgrid.org
• Grid Café. www.gridcafe.org
• Vella et al. EUMEDGRID: Grid computing in Malta and the Mediterranean. CSAW 2006