EuroCAMP, Malaga, October 19, 2006 DEISA requirements for federations and AA Jules Wolfrat SARA www.deisa.org

EuroCAMP, Malaga, October 19, 2006

DEISA requirements for federations and AA

Jules Wolfrat

SARA

www.deisa.org

Outline

• Introduction to DEISA

• AA and User administration

• Federation issues

DEISA objectives

• To enable Europe’s terascale science by the integration of Europe’s most powerful supercomputing systems.

• Enabling scientific discovery across a broad spectrum of science and technology is the only criterion for success.

• DEISA is a European supercomputing service built on top of existing national services. This service is based on the deployment and operation of a persistent, production quality, distributed supercomputing environment with continental scope.

• The integration of national facilities and services, together with innovative operational models, is expected to add substantial value to existing infrastructures.

• Main focus is High Performance Computing (HPC).

Participating Sites

BSC        Barcelona Supercomputing Centre                                                  Spain
CINECA     Consorzio Interuniversitario per il Calcolo Automatico                           Italy
CSC        Finnish Information Technology Centre for Science                                Finland
EPCC/HPCx  University of Edinburgh and CCLRC                                                UK
ECMWF      European Centre for Medium-Range Weather Forecasts                               UK (int)
FZJ        Research Centre Juelich                                                          Germany
HLRS       High Performance Computing Centre Stuttgart                                      Germany
IDRIS      Institut du Développement et des Ressources en Informatique Scientifique - CNRS  France
LRZ        Leibniz Rechenzentrum Munich                                                     Germany
RZG        Rechenzentrum Garching of the Max Planck Society                                 Germany
SARA       Dutch National High Performance Computing and Networking centre                  The Netherlands

The DEISA supercomputing environment (21,900 processors and 145 Tf in 2006, more than 190 Tf in 2007)

• IBM AIX Super-cluster

– FZJ Jülich, 1312 processors, 8.9 teraflops peak

– RZG Garching, 748 processors, 3.8 teraflops peak

– IDRIS, 1024 processors, 6.7 teraflops peak

– CINECA, 512 processors, 2.6 teraflops peak

– CSC, 512 processors, 2.6 teraflops peak

– ECMWF, 2 systems of 2276 processors each, 33 teraflops peak

– HPCx, 1600 processors, 12 teraflops peak

• BSC, IBM PowerPC Linux system (MareNostrum), 4864 processors, 40 teraflops peak

• SARA, SGI ALTIX Linux system, 416 processors, 2.2 teraflops peak

• LRZ, Linux cluster (2.7 teraflops) moving to SGI ALTIX system (5120 processors and 33 teraflops peak in 2006, 70 teraflops peak in 2007)

• HLRS, NEC SX8 vector system, 646 processors, 12.7 teraflops peak

• Systems interconnected with a dedicated 1 Gb/s network, currently being upgraded to 10 Gb/s, provided by GEANT and the NRENs

How is DEISA enhancing HPC services in Europe?

• Running larger parallel applications in individual sites, by a cooperative reorganization of the global computational workload on the whole infrastructure, or by the operation of the job migration service inside the AIX super-cluster.

• Enabling workflow applications with UNICORE (complex applications that are pipelined over several computing platforms)

• Enabling coupled multi-physics Grid applications (when it makes sense)

• Providing a global data management service whose primary objectives are:

– Integrating distributed data with distributed computing platforms

– Enabling efficient, high performance access to remote datasets (with Global File Systems and striped GridFTP). We believe that this service is critical for the operation of (possible) future European petascale systems

– Integrating hierarchical storage management and databases in the supercomputing Grid.

• Deploying portals as a way to hide complex environments from new user communities, and to interoperate with other existing grid infrastructures.

The most basic DEISA services

• UNIfied access to COmputing REsources (UNICORE). Global access to all the computing resources for batch processing, including workflow applications (in production)

• Co-scheduling service. Needed to support grid applications with synchronous access to resources, as well as high performance data movement

• Global data management. Integrating distributed data with distributed computing platforms, including hierarchical storage management and databases. Major highlights are:

– High performance remote I/O and data sharing with global file systems, using full network bandwidth (in production)

– High performance transfers of large data sets, using full network bandwidth (end 2006): GridFTP and co-scheduled, parallel datamover tasks

Basic services: workflow simulations using UNICORE

UNICORE supports complex simulations that are pipelined over several heterogeneous platforms (workflows).

UNICORE handles a workflow as a single job and transparently moves the output of each step to the input of the next along the pipeline.

UNICORE clients that monitor the application can run on laptops.

UNICORE has a user friendly graphical interface. DEISA has developed a command line interface for UNICORE.

UNICORE infrastructure including all sites has full production status.

DEISA Global File System integration in 2006 (based on IBM’s GPFS)

[Figure: sites joined in the DEISA global file system. AIX IBM domain: CINECA (IT), FZJ (DE), ECMWF (UK), IDRIS (FR), RZG (DE), CSC (FI). Linux SGI: SARA (NL), LRZ (DE). Linux PowerPC: BSC (ES).]

• HPC Common Global File System: similar architectures / operating systems, high bandwidth (10 Gbit/s)

• High Performance Common Global File System: various architectures / operating systems, high bandwidth (up to 10 Gbit/s)

Enabling science

• The DEISA Extreme Computing Initiative: identification, deployment and operation of a number of « flagship » applications in selected areas of science and technology.

• Applications are selected on the basis of scientific excellence, innovation potential and relevance criteria (the application must require the extended infrastructure services)

• European call for proposals: May-June every year (first one took place in 2005)

• Evaluation: June to September.

• 56 Extreme Computing proposals were received in 2005 and 40 in 2006.

• 29 projects were retained for operation in 2005-2006. For the 2006 call, 23 projects were retained. Full information is available on the DEISA web server (www.deisa.org).

Extreme Computing proposals

• Bioinformatics 4

• Biophysics 3

• Astrophysics 11

• Fluid Dynamics 6

• Materials Sciences 11

• Cosmology 3

• Climate, Environment 5

• Quantum Chemistry 5

• Plasma Physics 2

• QCD, Quantum computing 3

Profiles of applications in operation in 2005 – 2006

• Huge parallel applications running in single remote nodes (dominant)

• Data Intensive applications of different kinds.

• Workflows (about 10%)

AA and User Administration

• Users authenticate with login/password at their home organization or through UNICORE.

• For GPFS and LL-MC, authZ is based on POSIX uids and gids

• Uids and gids for DEISA users have to be synchronized across all sites

• Each site has its own local administration, e.g. LDAP, NIS, passwd replication. It wasn’t feasible to couple these systems directly

• A separate DEISA administration system has been built, based on LDAP

[Figure: the partner sites in the DEISA administration system: BSC, CINECA, CSC, ECMWF, EPCC, FZJ, HLRS, IDRIS, LRZ, RZG, SARA]
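The synchronization requirement can be illustrated with a minimal sketch that checks whether a login carries the same uid at every site. The site names, logins and uid numbers are invented for illustration, and the function `find_uid_conflicts` is hypothetical; it is not part of any DEISA tool.

```python
from collections import defaultdict


def find_uid_conflicts(site_accounts):
    """Return {login: {site: uid}} for logins whose uid differs between sites.

    site_accounts maps a site name to a {login: uid} dictionary
    (hypothetical data layout, chosen for this sketch).
    """
    seen = defaultdict(dict)
    for site, accounts in site_accounts.items():
        for login, uid in accounts.items():
            seen[login][site] = uid
    # A login is consistent when all sites report exactly one distinct uid.
    return {
        login: by_site
        for login, by_site in seen.items()
        if len(set(by_site.values())) > 1
    }


# Invented example data: "bob" has drifted out of sync at siteB.
site_accounts = {
    "siteA": {"alice": 60001, "bob": 60002},
    "siteB": {"alice": 60001, "bob": 61002},
}
conflicts = find_uid_conflicts(site_accounts)
print(conflicts)
```

A mismatch matters because the shared file system authorizes by numeric uid and gid, so the same person must map to the same numbers everywhere.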

User Administration (1)

• Each partner is responsible for the registration of the users affiliated to it (their home organization)

• The other partners update their local user administration daily with data from the other sites. This is based on trust between partners!

[Figure: a DEISA user is added to the LDAP server at site A; the administrator at site B creates a local account on the HPC system at site B, based on an LDAP query.]
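The daily update step could look roughly like the sketch below, which compares entries already fetched from another site's LDAP server with the accounts that exist locally. The entry layout, the attribute names `uid` and `status`, and the function `plan_account_updates` are assumptions made for this example; the actual DEISA schema and tooling are not described in the slides.

```python
def plan_account_updates(remote_entries, local_accounts):
    """Decide which local accounts to create or deactivate.

    remote_entries: list of dicts as they might come back from an LDAP
                    query at the user's home site (hypothetical layout).
    local_accounts: set of logins that already exist on the local system.
    Returns (to_create, to_deactivate) as two lists of logins.
    """
    to_create, to_deactivate = [], []
    for entry in remote_entries:
        login = entry["uid"]
        # Assumption: an entry without a status attribute counts as active.
        active = entry.get("status", "active") == "active"
        if active and login not in local_accounts:
            to_create.append(login)
        elif not active and login in local_accounts:
            to_deactivate.append(login)
    return to_create, to_deactivate


# Invented example data for one nightly run at site B.
remote_entries = [
    {"uid": "alice", "status": "active"},
    {"uid": "bob", "status": "inactive"},
    {"uid": "carol", "status": "active"},
]
local_accounts = {"bob", "carol"}
to_create, to_deactivate = plan_account_updates(remote_entries, local_accounts)
print(to_create, to_deactivate)
```

The sketch only produces a plan; in the process shown on the slide an administrator at the receiving site still creates the account.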

User Administration (2)

• Around 20 attributes used for the registration of users using existing object classes and a DEISA defined schema

• The information in LDAP is not only used for the creation and maintenance of user accounts on the systems. It contains additional information too, e.g.

– Phone number, email address, Science field, Nationality, Status, Project

• Additional information is needed to comply with the partners’ requirements

– Nationality because of export regulations for some of the systems in use

• To avoid overlap between DEISA uid numbers and local numbers each site uses reserved ranges

• Policies for administrators have been formulated, e.g. for when a user is to be deactivated.
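The reserved uid ranges mentioned above can be sanity-checked with a short sketch. The range boundaries and site names below are invented; the slides do not give the real allocations, and both helper functions are hypothetical.

```python
from itertools import combinations


def overlapping_ranges(ranges):
    """Return pairs of sites whose reserved uid ranges overlap.

    ranges maps a site name to an inclusive (low, high) tuple.
    """
    overlaps = []
    for (site_a, (lo_a, hi_a)), (site_b, (lo_b, hi_b)) in combinations(
        sorted(ranges.items()), 2
    ):
        # Two inclusive intervals overlap when each starts before the other ends.
        if lo_a <= hi_b and lo_b <= hi_a:
            overlaps.append((site_a, site_b))
    return overlaps


def site_for_uid(uid, ranges):
    """Return the site whose reserved range contains uid, or None."""
    for site, (low, high) in ranges.items():
        if low <= uid <= high:
            return site
    return None


# Invented allocations, for illustration only.
reserved = {
    "siteA": (60000, 60999),
    "siteB": (61000, 61999),
    "siteC": (61500, 62499),  # deliberately clashes with siteB
}
print(overlapping_ranges(reserved))
print(site_for_uid(60010, reserved))
```

With disjoint ranges, the uid number alone identifies the site that registered the user, which is what keeps DEISA numbers from colliding with local ones.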

X.509 certificates

• UNICORE AuthN and AuthZ is based on X.509 certificates

– AuthZ is based on mapping the Subject Name to uids in the UUDB (like the grid-mapfile)

– The UUDB is maintained at each site, so a site can decide whether a user gets access through UNICORE, e.g. based on the project the user is working on. Subject names are distributed using the LDAP system.

– A subject name can be mapped to more than one uid; the user can specify with UNICORE which uid to use
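The mapping described above can be sketched as a lookup from certificate subject name to a list of local logins. This is an illustration only: the real UUDB has its own storage format and tools, the subject name and logins are invented, and falling back to the first login when none is requested is an assumption of this sketch.

```python
def resolve_login(uudb, subject_dn, requested_login=None):
    """Map a certificate subject name to a local login.

    uudb: dict from subject DN to a list of permitted logins
          (a stand-in for a site's UUDB, not its real format).
    requested_login: the login the user asked for, if any.
    Raises PermissionError when the site has not authorized the request.
    """
    logins = uudb.get(subject_dn)
    if not logins:
        # No entry: the site has not granted this user UNICORE access.
        raise PermissionError(f"no UUDB entry for {subject_dn!r}")
    if requested_login is None:
        # Assumption of this sketch: default to the first mapped login.
        return logins[0]
    if requested_login not in logins:
        raise PermissionError(
            f"{requested_login!r} is not mapped to {subject_dn!r}"
        )
    return requested_login


# Invented example entry: one subject name mapped to two logins.
uudb = {"CN=Alice Example,O=Site A,C=NL": ["alice", "alice_proj"]}
dn = "CN=Alice Example,O=Site A,C=NL"
print(resolve_login(uudb, dn))
print(resolve_login(uudb, dn, "alice_proj"))
```

Because each site keeps its own table, removing an entry locally is enough to withdraw UNICORE access at that site without touching the certificate.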

UNICORE AA

[Figure: a typical UNICORE user, holding a user certificate, submits an AJO from the client through the Gateway to the NJS. The UUDB maps user certificates (Certificate 1 to 5) to user logins (Login A to E). The diagram also shows the IDB and the TSI.]

[Figure: UNICORE configuration per site. Each of FZJ, CNE, RZG, IDR, CSC, SARA, BSC and LRZ is drawn with its users, a DEISA gateway in the DMZ, and an NJS on the intranet.]

Federation issues

• Internally

– X.509 based AA alone is not enough for sites. Access to additional user attributes is needed, e.g. uid, nationality

• Discussion on deployment of portal software. Sites don’t accept access to their systems based on a shared account

– Currently the concept of a VO is not deployed. Users are managed at the individual level or the project level.

– How to make it more dynamic?

– User attributes are replicated to local systems, which is error prone.

• Interoperability with other (grid) infrastructures

– Public key authN based on X.509 certs issued by IGTF accredited CAs will work with any other relying party.

– AuthZ will be difficult; deploying VOMS may help here, but internally support from UNICORE is needed.

– Work towards a common attribute schema?!