Summary report: An EGEE Comparative study: Grids and Clouds – evolution or revolution?

Marc-Elian Bégin, [email protected]
Six² Sàrl, Switzerland, www.sixsq.com

Session: Exploring Cloud Computing, OGF23, Barcelona, Spain, June 2, 2008

EGEE-III INFSO-RI-222667, Enabling Grids for E-sciencE, www.eu-egee.org

Content

• Context of comparative study
• Grid: EGEE/gLite
• Cloud: Amazon Web Services
• Comparison summary
• Conclusions
• Recommendations

Context of comparative study

• This presentation is a summary of the report:
  – “An EGEE Comparative study: Grids and Clouds – evolution or revolution?”
  – https://edms.cern.ch/file/925013/3/EGEE-Grid-Cloud.pdf
• Objective:
  – As cloud computing gains popularity and traction, grid computing needs to be positioned with respect to cloud computing
  – Compare real implementations and production offerings: the EGEE/gLite grid production service and Amazon Web Services, with a focus on EC2 and S3
• Outcome:
  – Identified convergence paths
  – Recommendations for managing convergence going forward

Acknowledgment

• Many people provided comments, suggestions and feedback
• Special thanks go to:
  – Bob Jones, CERN
  – James Casey, CERN
  – Charles Loomis, CNRS and Six² partner

EGEE – What does it deliver?

• Infrastructure operation
  – Sites distributed across many countries
  – Large quantity of CPUs and storage
  – Continuous monitoring of grid services & automated site configuration/management
  – Support multiple Virtual Organisations from diverse research disciplines
• Middleware
  – Production middleware distributed under business friendly open source licence
  – Implements a service-oriented architecture that virtualises resources
  – Adheres to recommendations on web service inter-operability and evolving towards emerging standards
• User Support - Managed process from first contact through to production usage
  – Training
  – Expertise in grid-enabling applications
  – Online helpdesk
  – Networking events (User Forum, Conferences etc.)

Archeology, Astronomy, Astrophysics, Civil Protection, Comp. Chemistry, Earth Sciences, Finance, Fusion, Geophysics, High Energy Physics, Life Sciences, Multimedia, Material Sciences, …

>250 sites, 48 countries, >50,000 CPUs, >20 PetaBytes, >10,000 users, >150 VOs, >150,000 jobs/day

Users and resources distribution

European Grid Initiative

• Need to prepare permanent, common Grid infrastructure
• Ensure the long-term sustainability of the European e-Infrastructure independent of short project funding cycles
• Coordinate the integration and interaction between National Grid Infrastructures (NGIs)
• Operate the production Grid infrastructure on a European level for a wide range of user communities

Grid: EGEE/gLite

EGEE highlights:
• Federated but separately administered resources (multiple sites, countries and continents)
• Heterogeneous resources
• Distributed, multiple research user communities grouped in Virtual Organisations (VO)
• Mostly publicly funded at local, national and international levels
• Wide range of data models, from massive, hard-to-replicate data sources to transient datasets composed of files of varied sizes

Grid: EGEE/gLite (2)

Provided services:
• Basic services (focus of comparison with AWS)
  – Computing Element (CE)
  – Storage Element (SE)
• Higher-level services
  – Workload Management System (WMS)
  – File & Metadata Catalog Services
  – File Transfer Service (FTS)
  – Virtual Organization Membership Service (VOMS)
• For more info:
  – Bob Jones, EGEE Project Director, CERN, [email protected]

Amazon Web Services

• EC2 (Elastic Compute Cloud) is the computing service of Amazon
  – Based on hardware virtualisation (Xen)
  – Users request virtual machine instances, pointing to an image (public or private) stored in S3
  – Users have full control over each instance (e.g. access as root, if required)
  – Requests can be issued via SOAP and REST
• S3 (Simple Storage Service) is a service for storing and accessing data on the Amazon cloud
  – From a user’s point of view, S3 is independent from the other Amazon services
  – Data is organised hierarchically, grouped in buckets (i.e. containers) that hold objects
  – Data is accessible via SOAP, REST and BitTorrent
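
The bucket/object addressing described above can be sketched in a few lines of Python. This is a minimal illustration, not part of the original study: it assumes the path-style S3 endpoint and the "?torrent" sub-resource as S3 offered them at the time, and the bucket and key names are hypothetical. Authentication is left out.

```python
from urllib.parse import quote

# Path-style S3 endpoint, assumed here for illustration.
S3_ENDPOINT = "https://s3.amazonaws.com"


def object_url(bucket: str, key: str, torrent: bool = False) -> str:
    """Build the REST URL of an object stored in a bucket.

    The bucket is the container; the key names the object inside it.
    With torrent=True the "?torrent" sub-resource is appended, which asks
    for a .torrent file describing the object instead of the object itself.
    """
    url = f"{S3_ENDPOINT}/{quote(bucket)}/{quote(key)}"
    return url + "?torrent" if torrent else url


if __name__ == "__main__":
    # Hypothetical bucket and key names.
    print(object_url("my-images", "worker-node.img"))
    print(object_url("my-images", "worker-node.img", torrent=True))
```

The same URL serves both plain HTTP(S) downloads and BitTorrent distribution; only the sub-resource changes.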

Amazon Web Services (2)

• Other AWS services:
  – SQS (Simple Queue Service)
  – SimpleDB
  – Billing services: DevPay
  – Elastic IP (Static IPs for Dynamic Cloud Computing)
  – Multiple Locations

Costs

• Cost study for computing upgrade at CERN for LHC (by Ian Bird, Tony Cass, Bernd Panzer-Steindel and Les Robertson)
• Cost summary for providing 40 MSI2000 of computing:
  – Custom data centre construction: 4.4 MCHF (~2.7 M€)
  – Using EC2: 92 MCHF (~56.9 M€)
• The 4.4 MCHF figure does not include software licence and manpower costs
• The comparison is made difficult by the reference Amazon uses for its EC2 Compute Unit
  – e.g. “EC2 Compute Unit (ECU) provides the equivalent CPU capacity of a 1.0-1.2 GHz 2007 Opteron or 2007 Xeon processor”
• Our own calculation for 40 MSI2000 on EC2: 57 MCHF (~35.3 M€)
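
To put the three figures side by side, the short Python sketch below computes how many times more expensive each EC2 estimate is than the custom data centre, and the CHF to EUR rate implied by each pair. It uses only the numbers quoted on this slide. Because the 4.4 MCHF baseline excludes software licence and manpower costs, the ratios overstate the real gap.

```python
# Figures quoted on the slide: (millions of Swiss francs, millions of euros).
COSTS = {
    "Custom data centre": (4.4, 2.7),
    "EC2, CERN cost study": (92.0, 56.9),
    "EC2, this report": (57.0, 35.3),
}


def cost_ratio(option_mchf: float, baseline_mchf: float) -> float:
    """How many times more expensive an option is than the baseline."""
    return option_mchf / baseline_mchf


def implied_eur_per_chf(mchf: float, meur: float) -> float:
    """Exchange rate implied by one (MCHF, M EUR) pair from the slide."""
    return meur / mchf


if __name__ == "__main__":
    baseline_mchf, _ = COSTS["Custom data centre"]
    for name, (mchf, meur) in COSTS.items():
        ratio = cost_ratio(mchf, baseline_mchf)
        rate = implied_eur_per_chf(mchf, meur)
        print(f"{name}: {ratio:.1f}x baseline, {rate:.3f} EUR/CHF implied")
```

On these numbers, the CERN study puts EC2 at roughly 21 times the data centre construction cost, and the report's own calculation at roughly 13 times.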

Costs: EGEE workload in 2007

[Figure: chart with segments labelled Xfer, CPU and Storage]

• CPU: 114 million hours
• Data:
  – 25 PB stored
  – 11 PB transferred
• Estimated cost if performed with Amazon’s EC2 and S3: ~38 M€
  – http://gridview.cern.ch/GRIDVIEW/same_index.php
  – http://calculator.s3.amazonaws.com/calc5.html? (17/05/08: $58688679.08)
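
As a rough sanity check on the estimate, the Python sketch below derives two numbers from the slide's own figures: the blended cost per CPU hour (with storage and transfer charges folded in) and the USD to EUR rate implied by the dollar and euro totals. No AWS price list is assumed; every input comes from the slide.

```python
CPU_HOURS = 114e6              # CPU hours delivered by EGEE in 2007
ESTIMATE_USD = 58_688_679.08   # calculator total dated 17/05/08
ESTIMATE_EUR = 38e6            # rounded euro figure quoted on the slide


def blended_usd_per_cpu_hour(total_usd: float, cpu_hours: float) -> float:
    """Total bill divided by CPU hours; storage and transfer are folded in."""
    return total_usd / cpu_hours


def implied_eur_per_usd(total_eur: float, total_usd: float) -> float:
    """Exchange rate implied by the two totals quoted on the slide."""
    return total_eur / total_usd


if __name__ == "__main__":
    per_hour = blended_usd_per_cpu_hour(ESTIMATE_USD, CPU_HOURS)
    rate = implied_eur_per_usd(ESTIMATE_EUR, ESTIMATE_USD)
    print(f"Blended cost: {per_hour:.2f} USD per CPU hour")
    print(f"Implied rate: {rate:.3f} EUR per USD")
```

The blended figure comes to about 0.51 USD per CPU hour. It is not a compute price, since the 25 PB stored and 11 PB transferred are part of the same total.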

High-level deployment of LCG grid resources

Where could the cloud be, given that transferring data across the cloud border costs money?

Can BitTorrent help?

Using BitTorrent, transfers are not metered by the cloud if requesting the same files

Where could the cloud be, given that transferring data across the cloud border costs money?

Performance

• EC2, S3 bandwidth performance summary

Test type     Transfer (MB/sec)   Remarks
EC2 -> EC2    75.0                Using curl on 1-2 GB files, without SSL
S3 -> EC2     49.8                Using 8 x curl on 1 GB files, with SSL
S3 -> EC2     51.5                Using 8 x curl on 1 GB files, without SSL
EC2 -> S3     53.8                Using 12 x curl on 1 GB files, with SSL

• The conclusions from [6] regarding the EC2 -> EC2 transfers are that “basically we’re getting a full gigabit between the instances”.
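
To compare the MB/sec measurements with the gigabit figure in the quote, the rates can be converted to Gbit/s. The Python sketch below does only that unit conversion. Whether the source meant 10^6 or 2^20 bytes per MB is not stated, so both are supported, and the values are payload throughput without protocol overhead.

```python
# Transfer rates from the table, in MB/sec.
MEASUREMENTS_MB_PER_S = {
    "EC2 -> EC2 (curl, without SSL)": 75.0,
    "S3 -> EC2 (8 x curl, with SSL)": 49.8,
    "S3 -> EC2 (8 x curl, without SSL)": 51.5,
    "EC2 -> S3 (12 x curl, with SSL)": 53.8,
}


def mb_per_s_to_gbit_per_s(rate_mb_per_s: float, bytes_per_mb: int = 1_000_000) -> float:
    """Convert a payload rate in MB/sec to Gbit/s (1 Gbit = 10^9 bits)."""
    return rate_mb_per_s * bytes_per_mb * 8 / 1e9


if __name__ == "__main__":
    for test, rate in MEASUREMENTS_MB_PER_S.items():
        decimal = mb_per_s_to_gbit_per_s(rate)
        binary = mb_per_s_to_gbit_per_s(rate, bytes_per_mb=1024 ** 2)
        print(f"{test}: {decimal:.2f} Gbit/s (decimal MB), {binary:.2f} Gbit/s (binary MB)")
```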

Performance (2)

• Like AWS, CERN has opted for a storage / compute farms separation

• CERN can deliver a sustained 70 GB/s data throughput between the storage and compute farms

• A large-scale performance analysis is not available for AWS

Scale

• Is EC2 (Elastic Compute Cloud) really “elastic”?
• The scale of EGEE is already established and well documented
• The scale of AWS is unknown, although the latest experiments seem to indicate good scaling
• Both systems now have SLAs in place, including penalties (partial refund) from Amazon when not honoured
• Elastic IP and Multiple Locations provide building blocks for users to deploy resilient services, while EGEE is already massively distributed (>250 sites)

AWS Cloud interfaces

No middleware!!

Resource-side grid middleware?

Ease of Use

• Key to the success of AWS is the choice of technologies
  – HTTP(S)/REST and support for ROA (Resource Oriented Architecture)
  – Hardware virtualisation (Xen based)
  – X.509 certificates
• This backs up the claim from Amazon that AWS requires “no middleware” (for the user!)
• However, the level of service provided by AWS is lower than that provided by EGEE
• For EGEE/gLite, several MB of client software are required to use the grid
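
The “no middleware” point can be illustrated with nothing but a standard library: in a resource-oriented interface each resource has a URL and the operation is carried by the HTTP verb. The Python sketch below only builds the requests and never sends them. The endpoint, bucket and key names are hypothetical, and the request signing that a real service would require is omitted.

```python
from typing import Optional
from urllib.request import Request

# Hypothetical endpoint, used only to show the shape of the requests.
BASE_URL = "https://storage.example.org"


def resource_request(method: str, bucket: str, key: str,
                     body: Optional[bytes] = None) -> Request:
    """Build (but do not send) an HTTP request addressing one resource."""
    return Request(f"{BASE_URL}/{bucket}/{key}", data=body, method=method)


if __name__ == "__main__":
    requests = [
        resource_request("PUT", "results", "run-001.dat", b"payload"),  # create or replace
        resource_request("GET", "results", "run-001.dat"),              # read
        resource_request("DELETE", "results", "run-001.dat"),           # remove
    ]
    for req in requests:
        print(req.get_method(), req.full_url)
```

Any HTTP client in any language can issue the same three calls, which is what keeps the client-side footprint small.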

Service Mapping

• “Ease of use comes at a cost: ‘The cost of simplicity’”
• The basic constructs that the EC2 and S3 services offer do not currently meet all the requirements of grid users and do not replace the high-level services provided by gLite, e.g.:
  – File Transfer Service (FTS)
  – Workload Management System (WMS)
  – Grid catalogues such as ARDA Metadata Catalogue (AMGA), LCG File Catalog (LFC) or GANGA
• Are all users using the grid the same way?
• Should we revisit the way the grid is used and accessed?
• Who should be responsible for providing different levels of functionality?

Collaboration and Virtual Organisations

• Grids are used by large and/or distributed communities of collaborators

• Virtual Organisations support this concept, with services such as VOMS

• AWS provides only primitive ACLs. Can we bridge the gap?

• Scientific collaborations include the need for resources to be contributed and “connected” to the grid. Can the cloud be “augmented” by custom data centres?

Application Software Deployment

• Grid application software is often required to be installed at data centres for jobs to execute successfully

• Several operating systems and platforms are required to host grid jobs

• Hardware virtualisation could alleviate these burdens
  – Grid application software can be “baked” into a virtual image
  – Data centres do not have to provide a specific operating system; it is defined at the level of the VM

• Hardware virtualisation provides a high level of control to the user (e.g. root) and strong control and security for hosts

Interoperability

• Assuming that several cloud computing providers come to be…

• Which interface matters?

BOTH!!!

Standards

• Since “simple is beautiful”, if the interfaces proposed by cloud services like AWS become popular with grid users, they might change the standardisation landscape
• HTTP, REST, Xen and BitTorrent are already largely standardised
• What is left at that level?
  – REST access to storage
  – Virtual image formats
  – Instantiation API (perhaps based on REST)
  – Metering interfaces (including monitoring)
• A reference open source implementation is missing
• What about higher-level services? Which ones?

Conclusions

• Cloud computing is gaining traction, especially with the commercial Amazon Web Services (AWS) offering

• Grid (e.g. EGEE) has a larger scope; however, the technological choices and simple interfaces of services like AWS are relevant to the grid world

• The question “what is the usage pattern that will emerge in the coming years?” remains unanswered and will have to be carefully tracked

• None of the resources contributed to the EGEE grid come from commercial offerings, such as Amazon. Will this change?

• Technologies such as REST, HTTP, hardware virtualisation and BitTorrent could displace existing means of access to grid resources

Conclusions (2)

• EGEE has an opportunity to lead the next generation e-Infrastructure by integrating new advancements such as cloud computing

• Hardware virtualisation could lower the operations cost of large infrastructures

• It is important that new development does not distract from ensuring the continuity of the current production grid

• A roadmap should be defined to include cloud technology in current e-Infrastructures in an incremental and harmonious fashion

Recommendations

1. Promote/support the development of an open source cloud middleware distribution, based on interfaces similar to current commercial offerings

2. Promote the standardisation of the cloud, with the above-mentioned implementation as a potential reference

3. Identify a convergence path between cloud services such as EC2 and S3 and the current EGEE security model based on VOMS

4. Virtualise all key grid services (e.g. information system, metadata catalogues, security service) with the goal of being able to deploy these on EC2-like resources

5. Promote/lobby the need for experiments (i.e. LHC/HEP, Life science) and other grid users to virtualise their applications, with the goal of being able to deploy them on EC2-like resources

6. As a follow-on to point 5, promote/lobby for all service dependencies of grid user applications to be virtualised as well

7. Launch/support a feasibility study to verify that monitoring of cloud jobs can be performed at the hypervisor level, such that monitoring is independent from the virtualised applications

8. Upgrade current metadata catalogues to support HTTP(S) endpoints and S3-like metadata

9. Explore feasibility of running BitTorrent on grid sites
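
Recommendation 8 mentions S3-like metadata. In S3, user-defined metadata travels with an object as HTTP headers carrying the `x-amz-meta-` prefix. The Python sketch below shows one way a catalogue entry could be mapped to and from such headers. It is an illustration under assumptions, not an existing gLite or AWS function: the function names and the attribute names are hypothetical.

```python
from typing import Dict, Mapping

# Header prefix S3 uses for user-defined object metadata.
USER_METADATA_PREFIX = "x-amz-meta-"


def to_s3_style_headers(attributes: Mapping[str, object]) -> Dict[str, str]:
    """Turn catalogue attributes into S3-style user-metadata HTTP headers."""
    return {
        USER_METADATA_PREFIX + name.strip().lower().replace(" ", "-"): str(value)
        for name, value in attributes.items()
    }


def from_s3_style_headers(headers: Mapping[str, str]) -> Dict[str, str]:
    """Recover the attributes, ignoring headers that are not user metadata."""
    return {
        name[len(USER_METADATA_PREFIX):]: value
        for name, value in headers.items()
        if name.lower().startswith(USER_METADATA_PREFIX)
    }


if __name__ == "__main__":
    # Hypothetical catalogue attributes for one file.
    headers = to_s3_style_headers({"Run Number": 42, "experiment": "demo"})
    print(headers)
    print(from_s3_style_headers({**headers, "Content-Type": "text/plain"}))
```

Values are flattened to strings on the way out, so a catalogue that needs typed attributes would have to keep type information separately.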