Systems and Environments for High Performance Java Computing Vladimir Getov 4 January 2006

Part 1: Introduction

Relevant publications:

V. Getov. Java in High-Performance Computing – Guest Editorial. FGCS, vol. 18(2), v-vi, Oct. 2001.

M. Philippsen, R. Boisvert, V. Getov, R. Pozo, J. Moreira, D. Gannon, G. Fox. JavaGrande – High Performance Computing with Java. Proceedings of PARA 2000 Conference, LNCS, Springer, vol. 1947, 20-36, 2001.

High Performance Computing Using Java Technology – Tutorial lecture. ACM JavaGrande and JavaOne2001 Conferences, San Francisco, June 2001.

Some Important Facts About Current Computer Systems

The gap between processing speed and memory access speed continues to grow;

So does the gap between high-level programming models and underlying hardware architectures;

The wide variety of hardware architectures makes it particularly difficult to achieve portable high performance;

The pace of innovation is such that investment in tuning for one machine may not pay off before that machine is obsolete.

Pros/Cons of using Java

Pros: Java offers tremendous potential for portability and heterogeneous execution - Bytecode Representation, RMI, Object Serialization

Cons: Java still suffers a significant performance penalty and, as with any new language, the thought of rewriting existing codes is met with reluctance and a lack of enthusiasm.

Background Observations

Java thread model is insufficient

Message Passing model is important to support

Performance is critical

Many applications need “high” performance

Proper numerical computing

Complex, arrays, performance, reproducibility

Part 2: High Performance Java

Relevant publications:

V. Getov. A Mixed-Language Programming Methodology for High Performance Java Computing. In: R. Boisvert and P. Tang (Eds.) The Architecture of Scientific Software. Kluwer Academic Publishers, 333-347, 2001.

Q. Lu, V. Getov. Mixed-Language High-Performance Computing for Plasma Simulations. Scientific Programming, vol. 11(1), 57-66, 2003.

Mixed-Language Programming with Java

Java is a highly-portable language

Java adheres to the “Write once, run anywhere” philosophy

Java has a well-established collection of scientific library bindings

Java’s execution speed is suitable for HPC

C/Fortran are highly-portable languages

C/Fortran adhere to the “Write once, run anywhere” philosophy

C/Fortran have well-established scientific libraries

C/Fortran execution speeds are suitable for HPC

So, What Language to Use?

Java is a highly-portable language

Java adheres to the “Write once, run anywhere” philosophy

C/Fortran have well-established scientific library bindings

C/Fortran execution speeds are suitable for HPC

Utilize Java for its portability and standardization, but focus on using Java as a wrapper for porting of native code in the form of shared libraries.

This involves the least amount of work and guarantees maximum performance on different platforms.
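
The wrapper idea can be sketched as below. This is only an illustration under stated assumptions: the library name "hpcblas" and the native method ddotNative are hypothetical and do not come from the slides. The sketch tries to load a shared library and falls back to plain Java when it is absent, so the same class file still runs on any platform.

```java
// Sketch only: "hpcblas" and ddotNative are hypothetical names, not a real library.
final class NativeDotSketch {
    // True only if the (hypothetical) shared library was found and loaded.
    private static final boolean NATIVE_AVAILABLE = tryLoad("hpcblas");

    // Would be implemented in C or Fortran and exported through JNI.
    private static native double ddotNative(int n, double[] x, double[] y);

    private static boolean tryLoad(String libraryName) {
        try {
            System.loadLibrary(libraryName);
            return true;
        } catch (UnsatisfiedLinkError | SecurityException e) {
            return false; // library not installed: use the pure-Java path
        }
    }

    /** Portable entry point: native code when present, plain Java otherwise. */
    static double dot(double[] x, double[] y) {
        if (x.length != y.length) {
            throw new IllegalArgumentException("vectors must have equal length");
        }
        if (NATIVE_AVAILABLE) {
            return ddotNative(x.length, x, y);
        }
        double sum = 0.0;
        for (int i = 0; i < x.length; i++) {
            sum += x[i] * y[i];
        }
        return sum;
    }

    public static void main(String[] args) {
        System.out.println(dot(new double[] {1, 2, 3}, new double[] {4, 5, 6}));
    }
}
```

Callers only see dot(), so the choice between native and Java code stays hidden behind one Java signature.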

Difficulties in binding a native library to Java

Data formats in Java and C differ: sizes of primitive types; C pointers; multidimensional arrays; C structures;

Several different Java native method interfaces still exist;

A native interface is inadequate for calling existing library functions.
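
One of the data-format differences, multidimensional arrays, can be illustrated with a small sketch. C libraries usually expect a matrix as one contiguous block, while a Java double[][] is a set of separate row objects. The helper names pack and unpack are hypothetical, and row-major order is an assumption (a Fortran library would expect column-major order).

```java
// Sketch only: pack/unpack are illustrative helpers, not part of any binding tool.
final class RowMajorPacking {
    /** Copies a rectangular double[][] into one contiguous row-major array. */
    static double[] pack(double[][] a) {
        int rows = a.length;
        int cols = rows == 0 ? 0 : a[0].length;
        double[] flat = new double[rows * cols];
        for (int i = 0; i < rows; i++) {
            if (a[i].length != cols) {
                throw new IllegalArgumentException("row " + i + " has a different length");
            }
            System.arraycopy(a[i], 0, flat, i * cols, cols);
        }
        return flat;
    }

    /** Inverse of pack: rebuilds an array of arrays from the flat buffer. */
    static double[][] unpack(double[] flat, int rows, int cols) {
        if (flat.length != rows * cols) {
            throw new IllegalArgumentException("buffer size does not match rows * cols");
        }
        double[][] a = new double[rows][cols];
        for (int i = 0; i < rows; i++) {
            System.arraycopy(flat, i * cols, a[i], 0, cols);
        }
        return a;
    }
}
```

A wrapper would typically call pack before the native call and unpack afterwards, which is part of the copying cost a binding has to pay.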

JCI Block Diagram

Legacy libraries bound to Java so far

Library     Lang.  Func.  C     Java
MPI         C      128    4434  439
BLACS       C      76     5702  489
BLAS        F77    21     2095  169
PBLAS       C      22     2567  127
PB-BLAS     F77    30     4973  241
LAPACK      F77    14     765   65
ScaLAPACK   F77    38     5373  293

Mixed-language programming based on JVM

NPB EP kernel on IBM SP2

Mixed-language programming based on HPCJ

NPB IS kernel on IBM SP2

Part 3: Message Passing in Java

Relevant publications:

B. Carpenter, V. Getov, G. Judd, A. Skjellum, G. Fox. MPJ: MPI-like Message Passing for Java. Concurrency: Practice and Experience, vol. 12 (11), 1019-1038, 2000.

V. Getov, P. Gray, V. Sunderam. Aspects of Portability and Distributed Execution for JNI-Wrapped Message Passing Libraries. Concurrency: Practice and Experience, vol. 12 (11), 1039-1050, 2000.

V. Getov, M. Philippsen. Java Communications for Large-Scale Parallel Computing. Proceedings of SciComp'01 Conference, LNCS, Springer, vol. 2179, 33-45, 2001.

Message Passing - Motivation

The existing communication packages in Java - RMI, API to BSD sockets - are optimized for Client/Server programming

The symmetric model of communication is captured in the MPI standard - MPI-1 and MPI-2

An MPI-like message-passing API specification is needed to enable the development of portable JavaGrande applications

Early MPI-like Efforts - 1

mpiJava - Modeled after the C++ binding for MPI. Implementation through JNI wrappers to native MPI software.

JavaMPI - Automatic generation of wrappers to legacy MPI libraries. C-like implementation based on the JCI code generator.

MPIJ - Pure Java implementation of MPI closely based on the C++ binding. A large subset of MPI is implemented using native marshaling of primitive Java types.

Early MPI-like Efforts - 2

JMPI - MPI Soft Tech Inc. have announced a commercial effort under way to develop a message passing environment for Java.

Others

Existing ports - Linux, Solaris (both WS clusters and SMPs), AIX (both WS clusters and SP2), Windows NT clusters, Origin-2000, Fujitsu AP3000, and Hitachi SR2201.

Java + MPI codes - growing variety including full applications

MPJ API Specification

Builds on the MPI-1 Specification and the Java Specification.

Immediate standardization for common message passing programs in Java

Basis for conversion between C, C++, Fortran and Java.

Eventually, support for aspects of MPI-2 as well as possible improvements to the Java language.

Multidimensional arrays

In Java an “n-dimensional array” is equivalent to a one-dimensional array of (n - 1)-dimensional arrays.

In MPJ, message buffers are always one-dimensional arrays, but element type may be an object, which may have array type - hence multidimensional arrays can appear as message buffers.
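
The point about element types can be checked in plain Java, without any message-passing library. In the sketch below, a double[][] is viewed as a one-dimensional Object[] whose elements are double[] rows. The helper elementTypes and its (buf, offset, count) parameters are hypothetical and only mimic how a buffer section might be described.

```java
// Sketch only: elementTypes is an illustrative helper, not an MPJ call.
final class BufferElementsSketch {
    /** Returns the type names of the elements buf[offset] .. buf[offset + count - 1]. */
    static String[] elementTypes(Object[] buf, int offset, int count) {
        String[] names = new String[count];
        for (int i = 0; i < count; i++) {
            names[i] = buf[offset + i].getClass().getSimpleName();
        }
        return names;
    }

    public static void main(String[] args) {
        double[][] a = new double[4][4];
        Object[] buf = a; // legal: every element of a is a double[] object
        for (String name : elementTypes(buf, 1, 2)) {
            System.out.println(name);
        }
    }
}
```

The buffer itself stays one-dimensional; the second dimension lives inside the elements.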

Java multidimensional arrays

[Figure: array of arrays. A[0] to A[3] each reference a separate row of four elements, A[i][0] to A[i][3].]

Java multidimensional arrays

[Figure: row references A[0] to A[3] and B[0] to B[3], with rows labelled A[0][0] to A[0][3], A[1][0] to A[1][1], A[2][0] to A[2][3] and B[3][0] to B[3][2].]

Java multidimensional arrays are not indivisible objects: could have intra-array aliasing and "partial overlaps" with other arrays
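
Both effects are easy to reproduce. The following sketch uses hypothetical method names and array sizes chosen only for illustration: one method makes two rows of the same array share storage, the other makes a row of one array also a row of another.

```java
// Sketch only: names and sizes are illustrative.
final class ArrayAliasingSketch {
    /** Intra-array aliasing: two row slots of A reference the same row object. */
    static double intraArrayAlias() {
        double[][] a = new double[4][4];
        a[1] = a[0];     // row 1 now aliases row 0
        a[0][2] = 7.0;   // write through row 0
        return a[1][2];  // the write is visible through row 1
    }

    /** Partial overlap: one row is shared between two different arrays. */
    static double partialOverlap() {
        double[][] a = new double[4][4];
        double[][] b = new double[4][];  // rows of b are not allocated yet
        b[3] = a[3];     // b and a now share exactly one row
        b[3][0] = 9.0;   // write through b
        return a[3][0];  // the write is visible through a
    }

    public static void main(String[] args) {
        System.out.println(intraArrayAlias());
        System.out.println(partialOverlap());
    }
}
```

A message-passing layer that sends such arrays therefore cannot assume that distinct rows occupy distinct memory.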

Naming Conventions

All MPI classes belong to the package mpi.

Conventions for capitalization, etc., in class and member names generally follow the recommendations of Sun's Java code conventions

They are consistent with the MPI C++ binding

Error codes

Unlike the C and Fortran interfaces, the Java interfaces to MPI calls will not return explicit error codes.

Instead, the Java exception mechanism will be used to report errors.
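
The difference in style can be sketched as follows. Everything here is hypothetical: MessagingException, INVALID_RANK and send are illustrative stand-ins, not the actual names of any Java MPI binding. The point is only that the call returns nothing and failures surface as exceptions.

```java
// Sketch only: all names below are hypothetical, not a real binding's API.
final class ErrorReportingSketch {
    static final int INVALID_RANK = 1; // illustrative error class value

    /** Stand-in for the exception type a binding might define. */
    static final class MessagingException extends Exception {
        private final int errorClass;

        MessagingException(int errorClass, String message) {
            super(message);
            this.errorClass = errorClass;
        }

        int errorClass() {
            return errorClass;
        }
    }

    /** Returns nothing on success; an invalid destination raises an exception. */
    static void send(Object[] buf, int dest, int worldSize) throws MessagingException {
        if (dest < 0 || dest >= worldSize) {
            throw new MessagingException(INVALID_RANK, "invalid destination rank " + dest);
        }
        // A real binding would hand buf to the transport layer here.
    }

    /** Caller side: no return code to test, only a try/catch. */
    static String trySend(int dest) {
        try {
            send(new Object[0], dest, 4);
            return "ok";
        } catch (MessagingException e) {
            return "error " + e.errorClass() + ": " + e.getMessage();
        }
    }
}
```

With a checked exception the compiler forces the caller to handle or declare the failure, which a C-style integer return code cannot do.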

Ping-Pong Timings

[Figure: execution time (sec, 1e-4 to 1e-2) against message length (bytes, 1e+1 to 1e+6) for Java/LAM-MPI, C/LAM-MPI and C/IBM-MPI.]

Part 4: Grid Systems and Environments

Relevant publications:

V. Getov, G. von Laszewski, M. Philippsen, I. Foster. Multi-Paradigm Communications in Java for Grid Computing. Communications of the ACM, vol. 44(10), 118-125, Oct. 2001.

V. Getov, M. Gerndt, A. Hoisie, A. Malony, B. Miller (Eds.) Performance Analysis and Grid Computing. Kluwer Academic Publishers, 2003.

V. Getov, S. Newhouse, O. Rana, E. Sharakan. Developing Grid Services with Jini and JXTA. Proc of ICCC 2004, 1402-1408, ICCC Press, 2004 (best paper award).

V. Getov, A. Puliafito, O. Rana. Computational Grid and Web Services: Concepts, Functionalities, and Comparisons. Proc of ICCC 2004, 10-15, ICCC Press, 2004.

Roadmap of Communication Frameworks

What is the Grid?

Benefits

Increased productivity by reducing the total cost of ownership

Any-type, anywhere, anytime services by/for all

Infrastructure for dynamic virtual organisations

Backbone for the next generation Internet services

“A Grid provides an abstraction for resource sharing and collaboration across multiple administrative domains…” (Source: NGG Expert Group, 16 June 2003, “European Grid Research 2005-2010”)

[Figure labels: e-Science; Industry & Business; Grids]

Java in Grid Computing

Main motivation - need to solve bigger problems with resource requirements beyond the current limits

Recent advances in computer communications make it possible to couple geographically distributed resources - Grid computing

In contrast with low-level approaches Java can support a single object-oriented communication framework for Grande applications

Example Application: An Advanced Scientific Instrument

[Figure labels: Avatar; Virtual Reality Cave; Scientist; Advanced Photon Source; Electronic Library and Databases; Computing Portal Clients; Supercomputer]

Lightweight Grid Platform

New generic approach to designing the next generation Grid systems with dynamic properties – components-based design.

To develop a lightweight Grid platform suitable for resource limited devices, to support our design.

To provide a design that will allow for the efficient integration of mobile devices into the Grid.

To provide enhanced security, centralized management and monitoring, roaming, fault tolerance and a high level of autonomy in this mobile wireless environment.

Challenges in Mobile Grids

Limited available resources.

Increased power consumption sensitivity.

Increased heterogeneity and software non-interoperability.

Unpredictable long periods of complete disconnectivity.

Unreliable, low-bandwidth and high latency communication links.

Very frequent, dynamic and unpredictable changes to the network layout.

Hybrid Environment: Virtual “Cluster” Approach

Clustered Approach Benefits

Single point of entry to the wireless cluster.

Centralized cluster management and monitoring.

Encapsulation of heterogeneity and dynamicity.

Masking of internal failures and silent recovering locally without affecting the regular Grid operation.

Ten Reasons to Use Java in High-Performance Computing

Language

Class Libraries

Components

Deployment

Portability

Maintenance

Performance

Gadgets

Industry

Education

Acknowledgements

Bryan Carpenter (Uni Syracuse)

Susan Flynn-Hummel (IBM - T.J. Watson)

Gregor von Laszewski (Argonne NL)

Sava Mintchev (Uni Westminster)

Jose Moreira (IBM - T.J. Watson)

Michael Philippsen (Uni Karlsruhe)

Antonio Puliafito (Uni Messina)

Omer Rana (Uni Cardiff)

Eric Sharakan (Sun Microsystems)

Mary Thomas (San Diego Supercomputer Center)

Experiments - CTC, IBM - T.J. Watson, SDSC, Southampton and Westminster Universities