
The Academic Service Partnership


Page 1: The Academic Service Partnership

e-Science Centre of Excellence

The Academic Service Partnership

Peter Dew & Joanna Schmidt, plus White Rose Grid Teams

Page 2: The Academic Service Partnership

Overview

• The White Rose Grid as an example of an inter-university collaboration providing an early production Grid service
• Computing Services roles
• Delivering the WRG
  – Organisational structure
  – Technical implementation
  – User management
• Lessons learned
• Benefits
• Concluding remarks

Page 3: The Academic Service Partnership

About the WRG

• The White Rose Grid (WRG) works under the auspices of the White Rose University Consortium (WRUC), an association of the three research Universities (Leeds, York & Sheffield)
  – Employs complementary skill bases to support both:
    • larger projects than can be delivered by any one University, and
    • a broad research agenda
  – WRUC featured as a model of collaboration and enterprise in the HEFCE White Paper
• Why Grid?
  – Enhances the competitive position of the three Universities to attract funding
  – Enables optimisation of inter-enterprise computing resources and an increase in the effectiveness of service delivery to researchers

Page 4: The Academic Service Partnership

WRG aims

• to strengthen e-Science research (using experience gained from e-Science projects such as DAME, HYDRA, or gViz)
  – initial focus:
    • decision support (engineering, health, social science)
    • scientific visualisation
• to support and enlarge new scientific communities including bio-technology, aerospace, tissue engineering and healthcare
• to assess and grow, in collaboration with YF, regional demand for Grid technology

Page 5: The Academic Service Partnership

Commitment

• Senior staff from the three Universities (White Rose Grid Executive: Chief Exec of White Rose University Consortium, M Doxey; P Dew and K Brodlie, Leeds; J Austin, York; P Fleming, Sheffield)
• Senior Computing Services staff (C Cartledge, Sheffield; S Chidlow, Leeds) & computing staff from the Computer Science Dept (A Turner, York)
• White Rose Grid staff
• WRG Project Teams (Computing Services staff & Computer Science staff)
• IT Vendors – Esteem involving Sun & Streamline

Page 6: The Academic Service Partnership

Computing Services roles

• To provide a stable, well-managed and responsive HPC service
• To promote effective use of HPC facilities across the three Universities through a variety of training including joint seminars and user group meetings
• To offer user-support & training in basic HPC techniques (e.g. parallel programming), Globus and e-Science applications
• To support an early production Grid service under the leadership of WRG technical staff

Page 7: The Academic Service Partnership

WRG staff responsibilities

• Technical developments (Leeds)
• Grid training (Sheffield)
• Coordination of joint activities (Leeds)
• Liaison with e-Science communities within the WRG, UK e-Science and others (Leeds)
• Business outreach (York)

Page 8: The Academic Service Partnership

Setting up the WRG

• To offer both:
  – local HPC services (75% resources)
  – the Grid infrastructure (25% resources)
• Each node specialises in the provision of a distinct service
• Purposely acquired, with over £3M investment: 4 HPC nodes (in total nearly 500 CPUs)
• A heterogeneous facility comprising 3 clusters of Sun shared-memory systems and 2 Intel processor-based Beowulf clusters

[Figure: WRG resource allocation (75% local HPC services, 25% Grid infrastructure)]
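As a rough illustration of the 75%/25% split, the sketch below divides a CPU pool between local HPC services and the Grid infrastructure. The pool size of 500 is an assumption standing in for the slide's "nearly 500 CPUs", and the function name `split_cpus` is hypothetical; this is plain arithmetic, not a description of how the WRG scheduler enforces its shares.

```python
def split_cpus(total_cpus, grid_share=0.25):
    """Return (local_cpus, grid_cpus) for a given Grid share of the pool."""
    if not 0.0 <= grid_share <= 1.0:
        raise ValueError("grid_share must be between 0 and 1")
    grid_cpus = round(total_cpus * grid_share)
    local_cpus = total_cpus - grid_cpus
    return local_cpus, grid_cpus


# Assumed pool size: the slide says "nearly 500 CPUs" across the four nodes.
local, grid = split_cpus(500)
print(f"local HPC: {local} CPUs, Grid: {grid} CPUs")
```

Under that assumption, roughly a quarter of the pool would be available to Grid users at any time, with the remainder reserved for each site's local service.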

Page 9: The Academic Service Partnership

The WRG architecture

[Figure: the four WRG nodes]
• General Purpose HPC node
• Computer Science node
• CFD node
• Engineering Application node

Page 10: The Academic Service Partnership

WRG software & hardware stack

[Figure: WRG stack diagram. Recoverable components:]
• WRG Portal: GPDK Portal interface, Tomcat/Apache
• Global Grid Infrastructure: MyProxy, Globus Toolkit 2.4
• Campus Grid Infrastructure: Grid Engine Enterprise Edition, Sun™ HPC Cluster Tools, Sun ONE Studio
• Solaris™ and Linux Operating Environments
• Sun Enterprise™ and Sun Fire™ Servers, Sun StorEdge™ Systems, Intel thin servers

• Software stack composed largely from open source software

Page 11: The Academic Service Partnership

Delivering our Grid

• Procedures & resources:
  – Strong organisational structure
  – Computing infrastructure:
    • computer systems
    • storage, currently being expanded with a Storage Area Network (SAN)
    • networking infrastructure (YHMAN reprocurement underway, implementation due Nov'04)
    • software: traditional HPC tools, Grid software (Globus Toolkit) and Portals
  – Mixture of experienced support staff and research staff working in teams

Page 12: The Academic Service Partnership

WRG project teams

[Figure: WRG project teams serving WRG users. Recoverable labels:]
• Teams: Architecture Team; Authentication, Authorisation & Accounting Team; Technical Team; Training Team; Business Outreach Team
• Areas of work: Globus, MyProxy, portals; stable service; user management, usage accounting, X.509 digital certificates; HPC techniques, Grid access & applications; working with regional companies & Universities

Page 13: The Academic Service Partnership

Access to the WRG

• Users baffled by Globus (due to its novelty and lack of easy documentation)
• Preferred way of access:
  – using Grid portals developed by the WRG
  – running Grid-enabled applications, e.g. the gViz project has developed Grid-enabled IRIS Explorer modules
• Portals need to be developed

Page 14: The Academic Service Partnership

The DAME XTO portal

Enables aeronautical engineers to identify abnormal behaviour in aircraft engines by performing DSP analyses of vibration data from onboard sensors.

Page 15: The Academic Service Partnership

User management

• Includes:
  – user registration
  – user authorisation for access to resources
  – user validation & approval of requests for a digital certificate
  – accounting for resource usage
  – documentation of procedures
• These schemes and user administration processes were developed taking into consideration:
  – the distributed nature of WRG
  – the cultural differences in registering and managing users at the three sites
  – the existence of two distinct classes of users (local and WRG), as well as other academic and commercial partners

Page 16: The Academic Service Partnership

Managing information

• New registration forms needed to be developed
• Documentation for users, e.g.
  – how to register & obtain a digital certificate
  – how to access the WRG systems
  – further local user documentation
• Documentation for system administrators such as:
  – registering users
  – propagation of Grid distinguished names between systems and mapping them to local UNIX user names in the grid-map files
  – producing usage accounting reports
• Development of Web pages
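The mapping between Grid distinguished names and local UNIX user names can be illustrated with a short sketch. It assumes the common Globus grid-map layout of one quoted distinguished name followed by a local user name (or a comma-separated list of names) per line. The sample entries, the user names and the helper `parse_grid_mapfile` are hypothetical and are not taken from any WRG system.

```python
import shlex


def parse_grid_mapfile(text):
    """Map each quoted distinguished name (DN) to its local UNIX user names."""
    mapping = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        parts = shlex.split(line)  # honours the quotes around the DN
        if len(parts) != 2:
            raise ValueError(f"malformed grid-map entry: {raw_line!r}")
        dn, users = parts
        mapping[dn] = users.split(",")  # assumed: a DN may map to several accounts
    return mapping


# Hypothetical entries for illustration only.
SAMPLE = '''
# WRG users
"/C=UK/O=eScience/OU=Leeds/L=ISS/CN=example user" wrgusr01
"/C=UK/O=eScience/OU=York/L=CS/CN=another user" wrgusr02,localusr
'''

if __name__ == "__main__":
    for dn, users in parse_grid_mapfile(SAMPLE).items():
        print(dn, "->", users)
```

Keeping the parser strict about malformed lines is a deliberate choice here: when entries are propagated between three sites, a silently skipped line would leave a registered user unable to log in at one of them.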

Page 17: The Academic Service Partnership

User registration & authorisation

[Figure: flowchart of user registration, authorisation and certificate issue. Box labels as extracted:]
• Local user registration
• Completion of the Application Form for WRG Resources
• Approval of the new WRG project and the user by PI
• Allocation of local & remote WRG usernames
• User registration at remote site 1
• User registration at remote site 2
• User registration details
• Request for user registration
• A centralised database of WRG users
• Approval of the new WRG project and the user by the local WRG Executive member
• Validation of the new WRG user by Comp Services
• Industrial partners
• University users
• User requests a digital certificate
• User validation by Computing Service
• Approval of a UK CA digital certificate
• Email requesting update of grid-map files
• Local Computing Service
• Authorisation
• Obtaining a digital certificate
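One way to read the registration and authorisation steps listed above is as an ordered sequence of approvals. The sketch below encodes that reading as a minimal state machine. The step order is an assumption inferred from the box labels (the original flowchart arrows are not recoverable), and the `Step` names and `Application` class are hypothetical.

```python
from dataclasses import dataclass, field
from enum import IntEnum


class Step(IntEnum):
    # Assumed order, inferred from the flowchart's box labels.
    FORM_COMPLETED = 1
    PI_APPROVED = 2
    EXECUTIVE_APPROVED = 3
    VALIDATED_BY_COMPUTING_SERVICES = 4
    USERNAMES_ALLOCATED = 5
    REGISTERED_AT_REMOTE_SITES = 6


@dataclass
class Application:
    applicant: str
    completed: list = field(default_factory=list)

    @property
    def registered(self):
        """True once every step has been recorded."""
        return len(self.completed) == len(Step)

    def advance(self, step):
        """Record a step, refusing to skip any earlier one."""
        if self.registered:
            raise ValueError("application is already fully registered")
        expected = Step(len(self.completed) + 1)
        if step != expected:
            raise ValueError(f"expected {expected.name}, got {step.name}")
        self.completed.append(step)


app = Application("example user")
for step in Step:
    app.advance(step)
print(app.registered)
```

Rejecting out-of-order steps mirrors the intent of the procedure: a user should not be registered at remote sites before the project and the user have been approved locally.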

Page 18: The Academic Service Partnership

Digital certification

• Globus requires personal X.509v3 digital certificates
• WRG systems support certificates from the UK e-Science Grid Certification Authority (CA)
• The e-Science CA at Rutherford Appleton Laboratory (RAL) is being run as part of the Grid Support Centre
• Registration Authorities (RA) were established at the three Universities
• Training of User Administration staff
  – courses available at RAL
  – a short introduction to digital certificates issued by the UK e-Science CA available at: http://www.grid-support.ac.uk/ca/
• At present the RA interface to the CA system will only work reliably with Netscape 4.79
• All certificates issued will expire after one year
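Because every certificate expires after one year, a site may want to remind users ahead of renewal. The sketch below is a minimal illustration using dates only. The 365-day approximation of "one year", the 30-day warning window and the function names are assumptions; a real check would read the expiry date from the certificate itself.

```python
from datetime import date, timedelta

CERT_LIFETIME = timedelta(days=365)   # "one year", approximated as 365 days
WARNING_WINDOW = timedelta(days=30)   # assumed lead time for renewal reminders


def expiry_date(issued_on):
    """Return the date on which a certificate issued on `issued_on` expires."""
    return issued_on + CERT_LIFETIME


def renewal_status(issued_on, today):
    """Classify a certificate as 'valid', 'renew soon' or 'expired'."""
    expires = expiry_date(issued_on)
    if today >= expires:
        return "expired"
    if expires - today <= WARNING_WINDOW:
        return "renew soon"
    return "valid"


issued = date(2004, 3, 1)  # hypothetical issue date
for check in (date(2004, 6, 1), date(2005, 2, 15), date(2005, 3, 2)):
    print(check, renewal_status(issued, check))
```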

Page 19: The Academic Service Partnership

Lessons learned

• New user management procedures must be fully endorsed by Computing Services (e.g. the WRG local identity had to be modified several times to take this into account)
• Continuous staff training is required (due to rapidly changing technology, e.g. Globus)
• The local system administrators must be involved in installing Globus and other Grid fabric on their own systems

Page 20: The Academic Service Partnership

Benefits to Computing Services

• Added dynamic to support issues
• Enlarged support team memberships
• Broadened knowledge through working in collaboration with other sites
• More interesting job specifications for system and user support staff (i.e. including a research approach)

Page 21: The Academic Service Partnership

Overall lessons learned

• Complexity due to:
  – geographically distributed support teams (lack of full understanding of how the three sites work)
  – large number of support staff involved (new issues may cause confusion over who is doing what)
  – innovative technology (lack of good understanding of new implementations and software dependency/interoperability; lack of good documentation)
  – human interaction factor (caused by misunderstandings etc.)
  – communication issues within a VO (due to its size)
  – constantly posed questions of ownership and trust (due to crossing organisational boundaries)
  – distributed resource management (e.g. software revisions)
  – software licensing issues (need licensing for a Grid)
  – increased exposure to security issues
  – lack of a central Help Desk

Page 22: The Academic Service Partnership

Addressing issues within WRG

• Many problems resolved through:
  – effective organisational structure led by the WRG Executive
  – Computing Services staff involvement and expertise
  – research element (portal development, Globus installations) led by Computer Sciences staff but with the involvement of Computing Services' practical approach and using their well-established support infrastructure

Page 23: The Academic Service Partnership

WRG Evolution

[Figure: WRG evolution diagram. Recoverable labels: e-Science Grid; WUN Grid; Companies; WRGCom- Services; Academic Service Infrastructure; "Buy Services"; WRG]

Page 24: The Academic Service Partnership

Concluding remarks

• The WRG serves as a test-bed Grid environment
• Addresses a large variety of problems and issues, including key sociological constraints (human interactions, ownership, trust) reflected in global Grids
• Computing Services staff expertise is vital to Grid success
• Many gaps (see the e-Science Gap Analysis at http://www.nesc.ac.uk/technical_papers/UKeS-2003-01/index.html )
• Need to continue to work with Grid users, increase the number of Grid applications and enhance outreach

Page 25: The Academic Service Partnership

References

• WRG web site: http://www.wrgrid.org.uk/
• DAME XTO portal: http://iri02.leeds.ac.uk:8080/damexto/damexto
• P M Dew, J G Schmidt, M Thompson, P Morris, "The White Rose Grid: practice and experience", in the proceedings of the All Hands conference
• e-Science Gap Analysis: http://www.nesc.ac.uk/technical_papers/UKeS-2003-01/index.html

Page 26: The Academic Service Partnership


Thank you for your attention