Extending ProActive for QosCosGrid: Support for Advance Reservation and Multi-Cluster Allocation

Krzysztof Kurowski and Mariusz Mamonski (Poznan), Gabor Szemes and George Kampis (Collegium Budapest), Walter De Back and Laszlo Gulyas (Aitia), Christian Delbé (ActiveEon)

QosCosGrid Project

• EU 6th Framework Programme STREP Project
• 2.5 years, ends in 03/2009
• 11 partners (2 private companies) from 10 countries

Aim: provide Quasi-Opportunistic Supercomputing for COmplex Systems on GRIDs

1. Quasi (i.e. not really) opportunistic
  • Reservations, predictable performance
2. Framework for complex systems on Grids
  • Very broad application class with widely varying requirements (no implicit restrictions on applications)
3. For the Grids…

QosCosGrid Project Status

• GRMS Grid Scheduler
  • Reservation and orchestration of resources
  • Specific XML job description (Job Profile)
    • Resource needs, process affinity, …

[Figure: GRMS Portal, JobProfile, and OpenDSP/LSF on Cluster 1 … Cluster N]

• Programming Framework
  • Fault-tolerant cluster-to-cluster message passing libraries based on Open MPI (FORTRAN/C/C++) and ProActive (Java)
• 9 Use Cases
  • Written in C/MPI and in Java/ProActive
  • Benchmarked on a multi-cluster testbed

QosCosGrid/ProActive challenge 1: Deployment

1. Preserve ProActive deployment properties (Nodes, Virtual Nodes, …)
2. Provide end users with the JobProfile as a single description (no explicit deployment descriptor)
3. Avoid the need for direct connections to remote cluster machines (ssh, …)

• Provide a GRMS deployment process? Unfortunately…
  • The main process must be connected to the deployed processes during the execution
• Provide a two-step submission, i.e. submit the main process, which then submits the rest of the application? Unfortunately…
  • Sub-jobs are not supported by GRMS (reservation and accounting)

Submit the ProActive application as a whole with a specific asynchronous deployment process

Deployment for QosCosGrid: ProActive Node Coordinator

[Figure: GRMS Portal, JobProfile, OpenDSP/LSF on each cluster, ProActive Node Coordinator (PNC), and QCG-PA Wrappers (WRAP) around main and the runtimes (rt)]

1. Submit Job Profile
2. Create reservation and submit QCG-PA Wrappers
3. Start QCG-PA Wrappers
  1. Main is started and registered to the PNC
  2. Runtimes are started and registered to the PNC
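
The registration steps on this slide can be sketched as a small rendezvous: wrappers start in any order, each registers with the coordinator, and the application proceeds only once main and every expected runtime are known. This is an illustration under stated assumptions, not the QCG-ProActive implementation: `NodeCoordinator`, `register_main`, `register_runtime`, `wait_for_nodes` and `deploy` are hypothetical names, and local threads stand in for the remote wrapper processes.

```python
import threading


class NodeCoordinator:
    """Hypothetical stand-in for the ProActive Node Coordinator (PNC)."""

    def __init__(self, expected_runtimes):
        self.expected_runtimes = expected_runtimes
        self.main = None
        self.runtimes = []
        self._cond = threading.Condition()

    def register_main(self, name):
        with self._cond:
            self.main = name
            self._cond.notify_all()

    def register_runtime(self, name):
        with self._cond:
            self.runtimes.append(name)
            self._cond.notify_all()

    def wait_for_nodes(self, timeout=5.0):
        """Block until main and all expected runtimes have registered."""
        with self._cond:
            ready = self._cond.wait_for(
                lambda: self.main is not None
                and len(self.runtimes) >= self.expected_runtimes,
                timeout=timeout,
            )
            if not ready:
                raise TimeoutError("not all wrappers registered in time")
            return sorted(self.runtimes)


def deploy(runtime_names):
    """Start one wrapper thread per process; the start order is arbitrary."""
    pnc = NodeCoordinator(expected_runtimes=len(runtime_names))
    wrappers = [threading.Thread(target=pnc.register_runtime, args=(name,))
                for name in runtime_names]
    wrappers.append(threading.Thread(target=pnc.register_main, args=("main",)))
    for wrapper in reversed(wrappers):  # here main happens to start first
        wrapper.start()
    nodes = pnc.wait_for_nodes()
    for wrapper in wrappers:
        wrapper.join()
    return pnc.main, nodes
```

Because every process contacts the coordinator rather than the other way round, the sketch needs no direct connection from the submitter to the cluster machines, which is the property the deployment challenge asks for.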

QosCosGrid/ProActive challenge 2: Connectivity

• QCG is multi-cluster, without restrictions on applications
  • Communication must be possible from anywhere to anywhere
• But clusters are usually behind a firewall and/or a NAT
• Use ProActive’s RMISSH on port 22? Unfortunately…
  • Does not deal with NATs

Provide a new protocol that supports NATs: RMIQCG

Extending inter-cluster communications for QosCosGrid

• RMIQCG uses the SOCKS protocol instead of SSH
  • SOCKS server deployed on the front-end node
  • One port must be externally available
• A single proxy per cluster implies contention…
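
To make the contention point concrete, the sketch below routes each message either directly (same cluster) or through the destination cluster's single front-end proxy, then counts how many messages each proxy relays. It is an illustrative model only: the host names, the `route` and `proxy_load` helpers, port 1080 (the conventional SOCKS port), and the choice to route inbound traffic through the destination cluster's proxy are assumptions, not RMIQCG's actual configuration.

```python
from collections import Counter

# Hypothetical layout: host -> cluster, and cluster -> its front-end proxy.
CLUSTER_OF = {"a1": "A", "a2": "A", "b1": "B", "b2": "B"}
PROXY_OF = {"A": ("frontend-a", 1080), "B": ("frontend-b", 1080)}


def route(src, dst):
    """Return the hop list for a message from src to dst."""
    if CLUSTER_OF[src] == CLUSTER_OF[dst]:
        return [dst]                      # intra-cluster: direct connection
    proxy_host, _port = PROXY_OF[CLUSTER_OF[dst]]
    return [proxy_host, dst]              # inter-cluster: via the one open port


def proxy_load(messages):
    """Count how many messages each front-end proxy has to relay."""
    load = Counter()
    for src, dst in messages:
        hops = route(src, dst)
        if len(hops) == 2:                # a relayed message has a proxy hop
            load[hops[0]] += 1
    return load
```

With an all-to-all exchange among these four hosts, 8 of the 12 messages cross the cluster boundary, and each front end relays 4 of them. The per-proxy load grows with the number of hosts on both sides, while the proxy count per cluster stays at one.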

Benchmarks Testbed

• Testbed portal @ node2.qoscosgrid.man.poznan.pl/gridsphere/gridesphere

UseCase 9: Distributed MultiAgent Simulation

[Figure legend: Active Object, Network Communication]

• Cellular Automata
  1. Partition
  2. Deploy
  3. Iterate
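
The partition and iterate steps can be illustrated with a one-dimensional cellular automaton on a ring. The sketch is a toy under stated assumptions, not the use case's code: it uses rule 90 (each cell becomes the XOR of its two neighbours) as an arbitrary example, plain lists stand in for the active objects that the deploy step would place on nodes, and the exchange of boundary cells stands in for network communication. All function names are hypothetical.

```python
def step_global(cells):
    """Reference: one rule-90 step computed on the whole ring at once."""
    n = len(cells)
    return [cells[(i - 1) % n] ^ cells[(i + 1) % n] for i in range(n)]


def partition(cells, parts):
    """Partition: split the ring into equally sized contiguous blocks."""
    if len(cells) % parts != 0:
        raise ValueError("cell count must be divisible by the block count")
    size = len(cells) // parts
    return [cells[p * size:(p + 1) * size] for p in range(parts)]


def iterate(blocks, steps):
    """Iterate: exchange boundary cells, then update each block locally."""
    for _ in range(steps):
        k = len(blocks)
        # Each block receives one ghost cell from each of its ring neighbours.
        left = [blocks[(p - 1) % k][-1] for p in range(k)]
        right = [blocks[(p + 1) % k][0] for p in range(k)]
        new_blocks = []
        for p, block in enumerate(blocks):
            padded = [left[p]] + block + [right[p]]
            new_blocks.append([padded[i - 1] ^ padded[i + 1]
                               for i in range(1, len(padded) - 1)])
        blocks = new_blocks
    return blocks
```

Only the two boundary cells of each block travel between blocks per step, so the communication volume depends on the number of blocks and not on their size.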

UseCase 9: Distributed MultiAgent Simulation

• On 8 machines (8 and 4+4)

Scalability issue with RMIQCG

Conclusion

• Important external contribution
  • Dedicated QCGProActive version (based on 3.9)
  • Ongoing integration in official ProActive 4
• Provides solutions to scalability problems inherent to RMIQCG
  • Similar solutions are studied in the OASIS team
• Successful partnership between QosCosGrid and ActiveEon
  • Support for QCGProActive deployment design
  • Support for upgrade from 3.2 to 3.9
  • Support for use case application and design

Thank you! Questions?

QosCosGrid Project

• EU 6th Framework Programme STREP Project
• 2.5 years, ends in 03/2009
• 2 800 000 Euro
• Strong QCG Consortium: 11 partners (2 private companies) from 10 countries

UseCase 8

[Figure legend: Active Object, Network Communication]