EGEE Grid in Asia
Simon C. Lin
Academia Sinica Grid Computing Centre
Taipei, Taiwan
16 November 2007
Do-Son ACGrid School in Hanoi, Vietnam
e-Science Reminder
• Definition
  • “e-Science is about global collaboration in key areas of science and the next generation of infrastructure that will enable it.” (John Taylor, http://www.e-science.clrc.ac.uk)
• Objectives
  • Support research by e-Science, on data intensive sciences and cross disciplinary collaboration
• Why e-Science is necessary in Asia
  • The global infrastructure is being established quickly
  • Take advantage of sharing and collaboration to bridge the gap between Asia and the world
  • To address the challenge of regional cooperation
ISGC2007
Enabling Grids for E-sciencE
INFSO-RI-508833
Collaborating e-Infrastructures
Potential for linking ~80 countries
TWGRID
“Production” = Reliable, sustainable, with commitments to quality of service
EGEE-II INFSO-RI-031688
The EGEE project
• Flagship European grid infrastructure project, now in 2nd phase with 91 partners in 32 countries
• Objectives
  – Large-scale, production-quality grid infrastructure for e-Science
  – Attracting new resources and users from industry as well as science
  – Maintain and further improve gLite Grid middleware
• Structure
  – EGEE: 1 April 2004 – 31 March 2006
  – EGEE-II: 1 April 2006 – 31 March 2008
  – Leveraging national and regional grid activities worldwide
  – Funded by the EC at a level of ~37 M Euros for 2 years
  – Support of related projects for infrastructure extension, application, specific services
• EGEE-III: 1 April 2008 – 31 March 2009
  – Reaching self-sustainable state
EGEE07, Budapest, 1-5 October 2007
240 sites, 45 countries, 41,000 CPUs, 5 PetaBytes, >10,000 users, >150 VOs, >100,000 jobs/day
Archeology, Astronomy, Astrophysics, Civil Protection, Comp. Chemistry, Earth Sciences, Finance, Fusion, Geophysics, High Energy Physics, Life Sciences, Multimedia, Material Sciences, …
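For a rough sense of scale, the headline EGEE figures can be turned into per-resource averages. This is illustrative arithmetic only: it treats the quoted lower bounds (such as >100,000 jobs/day) as exact values and assumes 1 PetaByte = 1,000 TB. The variable names are not from the slides.

```python
# Headline EGEE figures from the slide; lower bounds are treated as exact values.
sites = 240
cpus = 41_000
storage_pb = 5
jobs_per_day = 100_000

# Averages across the whole infrastructure (real sites vary widely in size).
jobs_per_cpu_per_day = jobs_per_day / cpus
jobs_per_second = jobs_per_day / (24 * 3600)
avg_cpus_per_site = cpus / sites
avg_storage_tb_per_site = storage_pb * 1000 / sites  # assumes 1 PB = 1000 TB

print(f"{jobs_per_cpu_per_day:.2f} jobs per CPU per day")
print(f"{jobs_per_second:.2f} jobs submitted per second on average")
print(f"{avg_cpus_per_site:.0f} CPUs per site on average")
print(f"{avg_storage_tb_per_site:.1f} TB of storage per site on average")
```

Under these assumptions the grid handles a little over one job per second, which is the kind of sustained load a "production" infrastructure has to absorb.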
EGEE Applications
• HEP: scale & performance testing, 4000 users worldwide, ~10PByte/year, strict deadlines
• Life Sciences: diverse community, secured access, data encryption, complex workflows
• Earth Sciences: large community, integration of geospatial services for diverse data sources and formats
• Computational Chemistry: development of license models, advanced MPI usage, liaison to GEMS project
• Astronomy & Astrophysics: access to vast databases and catalogs, large sensor networks, support PLANCK, MAGIC & AUGER
• Fusion: liaison with major Fusion projects (e.g. ITER), EU initiatives (e.g. EUFORIA) and interoperability between grids and supercomputers
• Grid Observatory: engage computer science community and improve grid reliability/usage
High Energy Physics
Large Hadron Collider (LHC):
• One of the most powerful instruments ever built to investigate matter
• 40 million particle collisions per second
• 4 experiments: ALICE, ATLAS, CMS, LHCb
• ~15 PetaBytes/year from the 4 experiments
• First beams in 2007
[Figure labels: Mont Blanc (4810 m); Downtown Geneva]
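The ~15 PetaBytes/year figure for the four LHC experiments can be converted into an average sustained data rate. This is a minimal sketch, assuming decimal units (1 PB = 10^15 bytes) and data spread uniformly over a 365-day year. Real data taking is bursty, so peak rates would be higher.

```python
# Convert the quoted annual LHC data volume into an average data rate.
# Assumptions (not from the slides): decimal petabytes, uniform 365-day year.
PETABYTE = 10**15                    # bytes
SECONDS_PER_YEAR = 365 * 24 * 3600

annual_volume_bytes = 15 * PETABYTE  # ~15 PB/year from the 4 experiments

avg_rate_mb_per_s = annual_volume_bytes / SECONDS_PER_YEAR / 10**6
avg_rate_gbit_per_s = annual_volume_bytes * 8 / SECONDS_PER_YEAR / 10**9

print(f"Average rate: {avg_rate_mb_per_s:.0f} MB/s "
      f"({avg_rate_gbit_per_s:.1f} Gbit/s)")
```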
Docking parameters:
translation step = 2.0 Å
quaternion step = 20 degrees
torsion step = 20 degrees
number of energy evaluations = 1.5 × 10^6
max. number of generations = 2.7 × 10^4
number of runs = 50
[Workflow diagram labels]
Compound selection: 2D compound library; “drug-like” filter (Lipinski’s RO5); ionization, tautomerization; 3D structure generation, energy minimization; 3D structure library of 308,585 compounds (6 known drugs)
Targets: 8 structures (including 1 original type)
Grid Data Challenge
Drug Analysis: Modeling Complex
Molecular docking (Autodock): ~137 CPU years, 600 GB data
Data challenge on EGEE, Auvergrid, TWGrid: ~6 weeks on ~2000 computers
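The data challenge figures can be cross-checked with some back-of-the-envelope arithmetic. This is a sketch under stated assumptions, none of which the slides confirm: every one of the 308,585 compounds is docked against all 8 target structures, each of the ~2000 computers contributes one CPU, and 1 GB = 10^9 bytes.

```python
# Back-of-the-envelope check of the avian flu data challenge figures.
# Assumptions: full compound x target cross product, one CPU per computer,
# 365-day years, decimal gigabytes.
compounds = 308_585
target_structures = 8
cpu_years = 137
output_gb = 600
computers = 2000
elapsed_days = 6 * 7                      # ~6 weeks of execution

dockings = compounds * target_structures
cpu_hours = cpu_years * 365 * 24

minutes_per_docking = cpu_hours * 60 / dockings
kb_per_docking = output_gb * 10**9 / dockings / 1000

# Wall-clock time if all computers were busy 100% of the time.
ideal_days = cpu_hours / computers / 24
utilization = ideal_days / elapsed_days

print(f"{dockings:,} dockings, ~{minutes_per_docking:.0f} CPU-minutes each")
print(f"~{kb_per_docking:.0f} kB of output per docking")
print(f"Ideal wall time {ideal_days:.0f} days vs {elapsed_days} days elapsed "
      f"(~{utilization:.0%} average utilization)")
```

Under these assumptions the workload would take about 25 days on a perfectly used pool of 2000 CPUs, so the ~6 weeks reported suggests that the resources were kept busy a bit more than half of the time on average.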
Modified from DDT vol. 3, 4, 160-178(1998)
[Figure labels: focused library; screening; hit rate; cost ($); “To improve hit rate”]
Modeling as a complement to HTS in drug discovery
Can Grid help?
History of Grid Drug Discovery on Avian Flu
• 1st WISDOM data challenge on Malaria (autumn 2005)
• Pre-activity before the 1st EGEE user forum (1 month of work during the Christmas holiday in 2005)
  • DIANE/GANGA technology
  • Contacting biologists for the use case
• 1st EGEE user forum (March 2006)
  • Where the biologists (application users) and grid engineers (resource providers) met
• 1st avian flu data challenge
  • 2 weeks for preparation
  • 6 weeks for real execution, starting in April 2006
• Data analysis and post-processing
  • Long process of collecting the distributed data
  • In-vitro test
• 2nd avian flu data challenge
  • Development phase addressing the issues
  • Deploying and testing the new environment
  • Production started at the end of August 2007
ASGC
Large Hadron Collider (LHC)
Avian Flu Drug Discovery
Grid Application Platform
Worldwide Grid Infrastructure
Asia Pacific Regional Operation Center
TWGrid Introduction
• Consortium Initiated and hosted by ASGC in 2002
• Objectives
• Gateway to the Global e-Infrastructure & e-Science Applications
• Providing Asia Pacific Regional Operation Services
• Fostering e-Science Applications collaboratively in AP
• Dissemination & Outreach
• Taiwan Grid/e-Science portal
• Providing the access point to the services and demonstrating the activities and achievements
• Integration of Grid Resources of Taiwan
• VO of general Grid applications in Taiwan
NTCU
EGEE Asia Federation is
• Extending the gLite Infrastructure, currently led by ASGC
• Engaging more user communities to join worldwide e-Science collaboration
• Building regional e-Infrastructure and e-Science application
• Conducting and supporting a production e-Infrastructure
• Working together to provide better user support
• Conducting more business and industry cooperation for new business models and opportunities
Production Infrastructure
• Asia Pacific Regional Operation Center (APROC) Mission
  – Provide deployment support facilitating Grid expansion
  – Maximize the availability of Grid services
• Supports EGEE sites in Asia Pacific since April 2005
  – 21 production sites, 8 countries
  – 9 sites joined EGEE since last year
• Resources
  – 2,047 CPU cores and 500 TB disk space currently
  – Will have 3,500 CPU cores and close to 2 PB of disk by end of 2007
  – Provided 3.5 million KSI2K-hours in the last 12 months
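The resource figures above can be read as an average delivered capacity and a planned growth factor. This is a minimal sketch, assuming the "last 12 months" span 8,760 hours, that "close to 2 PB" is taken as exactly 2 PB, and that 1 PB = 1,000 TB. KSI2K is the SPECint2000-based capacity unit used in the slide. No per-core rating is assumed.

```python
# Average delivered capacity and planned growth for the Asia Pacific sites.
# Assumptions: a 365-day year, "close to 2 PB" rounded to 2 PB, 1 PB = 1000 TB.
HOURS_PER_YEAR = 365 * 24

delivered_ksi2k_hours = 3.5e6           # last 12 months
cores_now, cores_planned = 2_047, 3_500
disk_tb_now, disk_tb_planned = 500, 2_000

# Capacity that would have to run non-stop to deliver the same work.
avg_sustained_ksi2k = delivered_ksi2k_hours / HOURS_PER_YEAR

core_growth = cores_planned / cores_now
disk_growth = disk_tb_planned / disk_tb_now

print(f"Average sustained delivery: ~{avg_sustained_ksi2k:.0f} KSI2K")
print(f"Planned growth by end of 2007: CPU x{core_growth:.2f}, disk x{disk_growth:.1f}")
```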
Joining EGEE Infrastructure
• Contact APROC: http://www.twgrid.org/aproc/join/newrc/
• If domestic CA is not available
– Register as an ASGCCA RA
– Obtain user and host certificates
• Dedicate an administrator with Unix experience
• Allocate servers
• Study user guide and installation manual
• Send configuration file to APROC for review before deployment
• Complete registration and certification process
Long Term Operations
• Establish domestic CA if none exists
• Increase availability and resource levels
• Establish domestic operations structure
  • Operations procedures
  • Tools: monitoring and notification, ticketing system
  • User and administrator support
  • Training for administrators and users
• Collaborate with APROC in Regional operations
• Support VOs of application development and production service separately
Grid Applications of Interest in Asia
EUAsiaGrid
• Identify and engage scientific communities which can benefit from the use of state-of-art Grid technologies;
• Disseminate EGEE middleware in Asian countries by means of public events and written and multimedia material;
• Provide training resources and organise training events for potential and actual Grid users;
• Support the scientific applications and create a human network of scientific communities by building on and leveraging the e-Science Grid infrastructure.
Work Packages of EUAsiaGrid
• WP1: Project administrative and technical management
• WP2: Requirement capture and coordination policy definition
  • To collect from the scientific communities of the Asian countries their computing and storage requirements
  • To develop a model for the promotion of sustainable National Grid initiatives
  • To define a roadmap towards a common e-Science Asian Grid infrastructure
• WP3: Support of scientific applications
  • To give support to EGEE applications, selected on the basis of already existing collaborations between EU and Asian partners
  • To identify new user communities which could profit from the Asian e-Infrastructure
  • To provide support for adaptation of the regional applications on top of the gLite middleware
• WP4: Dissemination
  • To enhance the awareness about the EUAsiaGrid project and the Grid paradigm in Asia
  • To facilitate the information and experience exchange for the potential new research communities and encourage them to use the e-Infrastructure for their applications
  • To promote EUAsiaGrid as a Grid service facilitator to the user communities across Asia
• WP5: Training
  • To train the technical personnel to manage the e-Infrastructure and the user applications by using the Grid tools effectively
  • To foster the use of the Grid e-Infrastructure by the scientific communities in the Asian countries
Summary
• e-Science envisages a whole new way of doing collaborative science
• For a sustainable Grid e-Infrastructure, we have to focus more on community building than on just offering technologies.
• Asia Pacific Region has great potential to adopt the e-Infrastructure:
  • More and more Asian countries will deploy Grid systems and take part in the e-Science world
  • However, applications of and for Asia Pacific scientists are largely lacking, and these are crucial!
• Extending from EGEE Asia Federation to EUAsiaGrid, we are widening the uptake of e-Science through close regional and international collaboration