47
Managed by XtreemOS IP project is funded by the European Commission under contract IST-FP6-033576 XtreemOS: a Linux-based Operating System for Large Scale Dynamic Grids Christine Morin XtreemOS Scientific Coordinator INRIA Rennes - Bretagne Atlantique [email protected]

XtreemOS: a Linux-based Operating System for Large Scale ...Next Generation Grids "A fully distributed, dynamically reconfigurable, scalable and autonomous infrastructure to provide

  • Upload
    others

  • View
    1

  • Download
    0

Embed Size (px)

Citation preview

Page 1: XtreemOS: a Linux-based Operating System for Large Scale ...Next Generation Grids "A fully distributed, dynamically reconfigurable, scalable and autonomous infrastructure to provide

Managed by

XtreemOS IP project is funded by the European Commission under contract IST-FP6-033576

XtreemOS: a Linux-based Operating Systemfor Large Scale Dynamic Grids

Christine MorinXtreemOS Scientific Coordinator

INRIA Rennes - Bretagne [email protected]

Page 2: XtreemOS: a Linux-based Operating System for Large Scale ...Next Generation Grids "A fully distributed, dynamically reconfigurable, scalable and autonomous infrastructure to provide

2

What is XtreemOS?

Page 3: XtreemOS: a Linux-based Operating System for Large Scale ...Next Generation Grids "A fully distributed, dynamically reconfigurable, scalable and autonomous infrastructure to provide

3

Linux-based Operating System

with native Virtual Organization support

for Next Generation Grids

Page 4: XtreemOS: a Linux-based Operating System for Large Scale ...Next Generation Grids "A fully distributed, dynamically reconfigurable, scalable and autonomous infrastructure to provide

4

Data Centers

Internetof the

Future

Service Infrastructures

Cloud Computing

NextGeneration

Grids

Page 5: XtreemOS: a Linux-based Operating System for Large Scale ...Next Generation Grids "A fully distributed, dynamically reconfigurable, scalable and autonomous infrastructure to provide

5

Next Generation Grids

"A fully distributed, dynamically reconfigurable, scalableand autonomous infrastructure to provide locationindependent, pervasive, reliable, secure and efficientaccess to a coordinated set of services encapsulatingand virtualizing resources (computing power, storage,instruments, data, etc.) in order to generateknowledge"

Page 6: XtreemOS: a Linux-based Operating System for Large Scale ...Next Generation Grids "A fully distributed, dynamically reconfigurable, scalable and autonomous infrastructure to provide

6

VO A

VO B

Organization 1

Site 1

Organization 1

Site 2

Organization 2

Site 4

Organization 3

Site 3

VO

Page 7: XtreemOS: a Linux-based Operating System for Large Scale ...Next Generation Grids "A fully distributed, dynamically reconfigurable, scalable and autonomous infrastructure to provide

7

Site 1

Site 2 Site 4

Site 3

VO A

VO B

Organization 1

Organization 1

Organization 3

Organization 2

VO Admin

VO Admin

WAN

Site admin

Site admin

Site admin

Site admin

Page 8: XtreemOS: a Linux-based Operating System for Large Scale ...Next Generation Grids "A fully distributed, dynamically reconfigurable, scalable and autonomous infrastructure to provide

8

What are the Actors’ Needs?

Page 9: XtreemOS: a Linux-based Operating System for Large Scale ...Next Generation Grids "A fully distributed, dynamically reconfigurable, scalable and autonomous infrastructure to provide

9

UsersEnd users - Service Administrators

– Ease of use• Do not want to care with Grid issues

• Want to work with familiar interfaces

• Want to use their non Grid-aware legacy applications

• Simple login as a Grid user in a VO

– Secure and reliable application/service execution

– High performance

– Ubiquitous access to services, applications & data

Page 10: XtreemOS: a Linux-based Operating System for Large Scale ...Next Generation Grids "A fully distributed, dynamically reconfigurable, scalable and autonomous infrastructure to provide

10

Administrators

Site administrators– Ease of management– Autonomous management of local resources– Should not be impacted by every single change in

a VO

VO administrators– Ease of management– Flexibility in VO policies– Accounting

Site admin

VO Admin

VO Admin

Site adminSite admin

Page 11: XtreemOS: a Linux-based Operating System for Large Scale ...Next Generation Grids "A fully distributed, dynamically reconfigurable, scalable and autonomous infrastructure to provide

11

Developers’ Needs

Ease of development of Grid applications– Reuse existing code

Stable API Conformance to standard API

– Familiar API Posix– Grid application standards

Page 12: XtreemOS: a Linux-based Operating System for Large Scale ...Next Generation Grids "A fully distributed, dynamically reconfigurable, scalable and autonomous infrastructure to provide

12

Operating System

Application

Operating System

HardwareSingle computer

Set of integrated services(user account, process, file, memory segment, sockets, access rights)

Page 13: XtreemOS: a Linux-based Operating System for Large Scale ...Next Generation Grids "A fully distributed, dynamically reconfigurable, scalable and autonomous infrastructure to provide

13

Why a Grid Operating System?

Page 14: XtreemOS: a Linux-based Operating System for Large Scale ...Next Generation Grids "A fully distributed, dynamically reconfigurable, scalable and autonomous infrastructure to provide

14

Middleware Approach

Grid Middleware

OS

Hardware

Page 15: XtreemOS: a Linux-based Operating System for Large Scale ...Next Generation Grids "A fully distributed, dynamically reconfigurable, scalable and autonomous infrastructure to provide

15

Example: Globus Toolkit

Page 16: XtreemOS: a Linux-based Operating System for Large Scale ...Next Generation Grids "A fully distributed, dynamically reconfigurable, scalable and autonomous infrastructure to provide

16

Grid Operating System

Grid OS

Page 17: XtreemOS: a Linux-based Operating System for Large Scale ...Next Generation Grids "A fully distributed, dynamically reconfigurable, scalable and autonomous infrastructure to provide

17

Grid Operating System

A comprehensive set of cooperating system services

providing a stable interface

for a wide-area dynamic distributed infrastructure

composed of heterogeneous resources

spanning multiple administrative domains

Page 18: XtreemOS: a Linux-based Operating System for Large Scale ...Next Generation Grids "A fully distributed, dynamically reconfigurable, scalable and autonomous infrastructure to provide

18

XtreemOS Grid OS

Page 19: XtreemOS: a Linux-based Operating System for Large Scale ...Next Generation Grids "A fully distributed, dynamically reconfigurable, scalable and autonomous infrastructure to provide

19

XtreemOSA Grid OS based on Linux with Native VO Support

XtreemOS

Page 20: XtreemOS: a Linux-based Operating System for Large Scale ...Next Generation Grids "A fully distributed, dynamically reconfigurable, scalable and autonomous infrastructure to provide

20

Application Spectrum

Wide range of applications…– Grid aware distributed applications– Grid unaware (legacy) applications executed in a Grid

… in different domains– E-business

• Services…– Scientific applications

… XtreemOS is an OS!

Page 21: XtreemOS: a Linux-based Operating System for Large Scale ...Next Generation Grids "A fully distributed, dynamically reconfigurable, scalable and autonomous infrastructure to provide

21

XtreemOS Fundamental Properties

Virtual Organization

Scal

abili

ty

Tran

spar

ency

Application Management

Data Management

Page 22: XtreemOS: a Linux-based Operating System for Large Scale ...Next Generation Grids "A fully distributed, dynamically reconfigurable, scalable and autonomous infrastructure to provide

22

Scalability

Scale– Thousands of nodes in thousands sites in a wide area

infrastructure– Thousands of users

Consequences of scale– Heterogeneity

• Node hardware & software configuration• Network performance

– Multiple administrative domains– High churn of nodes

Page 23: XtreemOS: a Linux-based Operating System for Large Scale ...Next Generation Grids "A fully distributed, dynamically reconfigurable, scalable and autonomous infrastructure to provide

23

XtreemOS Service Scalability

Scalability with the number of entities & their geographicaldistribution– Avoid contention points & save network bandwidth

(performance)– Run over multiple administrative domains (security)

Adaptation to evolving system composition (dynamicity)– Run with partial vision of the system– Self-managed services

• Transparent service migration– Critical services highly available

• No single point of failure

Page 24: XtreemOS: a Linux-based Operating System for Large Scale ...Next Generation Grids "A fully distributed, dynamically reconfigurable, scalable and autonomous infrastructure to provide

24

XtreemOS Fundamental Properties

Virtual Organization

Scal

abili

ty

Tran

spar

ency

Application Management

Data Management

Page 25: XtreemOS: a Linux-based Operating System for Large Scale ...Next Generation Grids "A fully distributed, dynamically reconfigurable, scalable and autonomous infrastructure to provide

25

TransparencyUser’s Point of View

Bring the Grid to standard Linux users– Feeling to work with a Linux machine

• Standard way of launching applications

• ps command to check status of own jobs

– No limit on the kind of applications supported• Interactive applications

– Grid-aware user sessions• Grid-aware shell taking care of Grid related issues

– VO can be built to isolate or share resources• Parameter defined by VO administrator

Page 26: XtreemOS: a Linux-based Operating System for Large Scale ...Next Generation Grids "A fully distributed, dynamically reconfigurable, scalable and autonomous infrastructure to provide

26

TransparencyApplication & Application Developer’s Point of View

Make Grid executions transparent– Hierarchy of jobs in the same way as Unix process hierarchy– Same system calls: wait for a job, send signals to a job– Processes in a job treated as threads in a Unix process

Files stored in XtreemFS Grid file system– Posix interface and semantics to access files regardless of their

location

Transparent fault tolerance to applications Clusters transparent to applications

– Single System Image

Page 27: XtreemOS: a Linux-based Operating System for Large Scale ...Next Generation Grids "A fully distributed, dynamically reconfigurable, scalable and autonomous infrastructure to provide

27

XtreemOS Services

Page 28: XtreemOS: a Linux-based Operating System for Large Scale ...Next Generation Grids "A fully distributed, dynamically reconfigurable, scalable and autonomous infrastructure to provide

28

XtreemOSA VO-aware OS based on Linux

XtreemOSInfrastructure for highly available & scalable services

AEM VOM XtreemFS

XtreemOS API (based on SAGA & Posix)

Extensions to Linux for VO support & checkpointing

Secu

rity

Page 29: XtreemOS: a Linux-based Operating System for Large Scale ...Next Generation Grids "A fully distributed, dynamically reconfigurable, scalable and autonomous infrastructure to provide

29

Virtual Organization Management

Objectives– To allow secure interaction between users and resources

• Authentication, authorization, accounting Challenges

– Interoperability with diverse VO frameworks and securitymodels

– Flexible administration of VOs• Flexibility of policy languages• Customizable isolation, access control and auditing

– Scalability of management of dynamic VOs– Embedded support for VOs in the OS– No compromise on efficiency, backward compatibility

Page 30: XtreemOS: a Linux-based Operating System for Large Scale ...Next Generation Grids "A fully distributed, dynamically reconfigurable, scalable and autonomous infrastructure to provide

30

Use Cases

VO Admin

Site admin

VO user

Manage VO lifecycle

Manageusers

Manageresources

ManageVO policies

Registerresources

ManageNode policies

Manage relationships

with VOs

Register with a VO

Manageuser policies

Logonto a VO

Page 31: XtreemOS: a Linux-based Operating System for Large Scale ...Next Generation Grids "A fully distributed, dynamically reconfigurable, scalable and autonomous infrastructure to provide

31

Security in XtreemOS

VO-centric security architecture– Grid level security services

• Global entities: VO, users, nodes (identified by public keycertificates)

– Node (OS) level services• Local entities: OS users (uid), OS resources (files (inode),

process (pid))– Hierarchical policy management

• Resource access control• Resource usage

Interoperability with third party security infrastructures– Kerberos, LDAP, Shibboleth…

Single-Sign-On

Page 32: XtreemOS: a Linux-based Operating System for Large Scale ...Next Generation Grids "A fully distributed, dynamically reconfigurable, scalable and autonomous infrastructure to provide

32

System-Level VO Support

Policies specified by a VO finally checked & ensured atresource nodes by the local instance of the OS– Standard Linux unaware of VOs– Isolation & access control mainly rely on user accounts,

process id, file permission bits What is needed for Linux OS to be able to enforce VO

policies– OS kernel should deal with VO & VO users identities– Identity information should be exploited in standard access

control mechanisms– Linux OS should supply identity information to Grid level

services (XtreemFS, AEM) NO modification of Linux kernel

– Mapping of VO level identities & policies into local ones fullyrecognized by Linux

Page 33: XtreemOS: a Linux-based Operating System for Large Scale ...Next Generation Grids "A fully distributed, dynamically reconfigurable, scalable and autonomous infrastructure to provide

33

System-Level VO Support

VO-customizable, dynamic mapping of Grid users ontolocal accounts– Integration of Grid user management into Linux using

• Pluggable Authentication Modules (PAM) Multiple low level authentication technologies into a common high

level API• Name Service Switch (NSS)

Interfacing with the Grid authentication services– Development of PAM modules to accommodate multiple VO

models• Authentication, authorization, session management

User space credential translation– NS-Switch

Access control & logging– Caching of authentication data related to a process within

the kernel

Page 34: XtreemOS: a Linux-based Operating System for Large Scale ...Next Generation Grids "A fully distributed, dynamically reconfigurable, scalable and autonomous infrastructure to provide

34

CredentialDistribution Authority

(CDA)

Identity service (IS)

Attribute service (AS)

VO membership (X-VOMS)

VO policy service (VOPS)

Accounting Service (ACS) VO Admin

VO scope

VO-centric Security Architecture

Resource Certification Authority (RCA)

Account Mapping Service(AMS)

AEM

XtreemFS

PAMmodule

User & VO isolation

enforcement

Node policy service(NOPS)

Site adminVO user

Site scopeResource

scope

XtreemOS services XOS-Cert

Page 35: XtreemOS: a Linux-based Operating System for Large Scale ...Next Generation Grids "A fully distributed, dynamically reconfigurable, scalable and autonomous infrastructure to provide

35

Key Contributions

Maximum transparency– Grid unaware applications & tools can be used without being

modified or recompiled Integration of Grid level authentication with

system level authentication– Creation of dynamic on-the-fly mappings for Grid users in a

clean & scalable way– No centralized Grid wide data base

Grid user mappings invisible to local users VO are easier to setup and manage

– No grid map file needed– User management does not necessitate any resource

reconfiguration

Page 36: XtreemOS: a Linux-based Operating System for Large Scale ...Next Generation Grids "A fully distributed, dynamically reconfigurable, scalable and autonomous infrastructure to provide

36

XtreemOSApplication Execution Management

XtreemOSInfrastructure for highly available & scalable services

AEM VOM XtreemFS

XtreemOS API (based on SAGA & Posix)

Extensions to Linux for VO support & checkpointing

Secu

rity

Page 37: XtreemOS: a Linux-based Operating System for Large Scale ...Next Generation Grids "A fully distributed, dynamically reconfigurable, scalable and autonomous infrastructure to provide

37

Application Execution Management

Objectives– Start, monitor, control applications– Discover, select, allocate resources to applications

Challenges– Deal with a large variety of resources with changing

conditions over time– Cost to obtain system information and take appropriate

decisions has to be orders of magnitude less than in Gridmiddleware-based systems

– Take advantage of accurate information for betterscheduling control

Page 38: XtreemOS: a Linux-based Operating System for Large Scale ...Next Generation Grids "A fully distributed, dynamically reconfigurable, scalable and autonomous infrastructure to provide

38

AEM Architecture

AEMClient

Com

mun

icat

ion Job Manager

Execution Manager

Resource ManagerCom

mun

icat

ion

Com

mun

icat

ion

Client node

SEDA concept

ResourceMatching

Job Directory

ResourceSelectionService

DistributedServices

VO scope

(JobID, @JobOwner…)

Job Description

(JSDL)

XOSD

Job Manager

Execution Manager

Resource ManagerCom

mun

icat

ion

Com

mun

icat

ion

Resource 2

ResourceDescription

(GLUE)

Resource 1

Job Manager

Execution Manager

Resource ManagerCom

mun

icat

ion

Com

mun

icat

ion

Page 39: XtreemOS: a Linux-based Operating System for Large Scale ...Next Generation Grids "A fully distributed, dynamically reconfigurable, scalable and autonomous infrastructure to provide

39

Main Features

No global job scheduler Distributed management of jobs No assumption on local node RMS

– AEM can be used without any batch system Resource discovery based on overlay networks

– Structured and unstructured– Multi-criteria and range of values queries

Page 40: XtreemOS: a Linux-based Operating System for Large Scale ...Next Generation Grids "A fully distributed, dynamically reconfigurable, scalable and autonomous infrastructure to provide

40

Advanced Features

Flexible monitoring Accounting Reservation

– Nodes with a local resource manager– Co-allocation of resources

Checkpoint/restart mechanisms for grid jobs Migration of grid jobs when the user agreement cannot be

met anymore Interactive applications support Support for external workflow engines

Page 41: XtreemOS: a Linux-based Operating System for Large Scale ...Next Generation Grids "A fully distributed, dynamically reconfigurable, scalable and autonomous infrastructure to provide

41

XtreemOSData Management

XtreemOSInfrastructure for highly available & scalable services

AEM VOM XtreemFS

XtreemOS API (based on SAGA & Posix)

Extensions to Linux for VO support & checkpointing

Secu

rity

Page 42: XtreemOS: a Linux-based Operating System for Large Scale ...Next Generation Grids "A fully distributed, dynamically reconfigurable, scalable and autonomous infrastructure to provide

42

Data Management

Objectives– Providing to users a global view of their files & transparent

access to data through a Grid file system Challenges

– Efficient location-independent access to data throughstandard Posix interface in a Grid environment• Grid users from multiple VO• Data storage in different administrative domains

– Autonomous data management with self-organizedreplication and distribution

– Consistent data sharing– Advanced meta data management

Page 43: XtreemOS: a Linux-based Operating System for Large Scale ...Next Generation Grids "A fully distributed, dynamically reconfigurable, scalable and autonomous infrastructure to provide

43

XtreemFS Grid File System

• Object-based• Replicated• Parallel

Page 44: XtreemOS: a Linux-based Operating System for Large Scale ...Next Generation Grids "A fully distributed, dynamically reconfigurable, scalable and autonomous infrastructure to provide

44

Conclusion

XtreemOS is not yet another Grid middleware– Operating system for large scale wide-area platforms

distributed over multiple administrative domains• Comprehensive set of cooperating services• Stable Posix interface

– Grid-aware Linux distribution

Native Virtual Organization Support– Flexible & scalable VO management– Multi-VO & short-term VO support

Secure, reliable, efficient application/service execution & easeof use and management

Attractive in the context of the new emerging computingmodels

Page 45: XtreemOS: a Linux-based Operating System for Large Scale ...Next Generation Grids "A fully distributed, dynamically reconfigurable, scalable and autonomous infrastructure to provide

45

Get Involved!

Download the first XtreemOS public release in afew days (GPL/BSD)– http://www.xtreemos.eu

– Open development

[email protected] to register in the pioneeruser group

XtreemOSBeta Users

Page 46: XtreemOS: a Linux-based Operating System for Large Scale ...Next Generation Grids "A fully distributed, dynamically reconfigurable, scalable and autonomous infrastructure to provide

46

XtreemOS consortium

– http://www.xtreemos.eu

PARIS project-team @ INRIA Rennes - Bretagne Atlantique

– http://www.irisa.fr/paris

Acknowledgements

Page 47: XtreemOS: a Linux-based Operating System for Large Scale ...Next Generation Grids "A fully distributed, dynamically reconfigurable, scalable and autonomous infrastructure to provide

Managed by

XtreemOS IP project is funded by the European Commission under contract IST-FP6-033576

Thank you for your Attention

http://www.xtreemos.eu