SEKE-2007, July 10, 2007
Improving Separation of Concerns in the Development of Scientific Applications
S. Masoud Sadjadi, Juan Martinez, Tatiana Soldo, Luis Atencio
Florida International University, Miami, Florida, U.S.A.
Rosa M. Badia and Jorge Ejarque
Barcelona Supercomputing Center, Barcelona, Spain
Improving Separation of Concerns in the Development of Scientific Applications, by Masoud Sadjadi et al., SEKE 2007. 2
Outline
• Motivation
• Background
  • GRID superscalar
  • TRAP/J
• Case Study: Matmul
• Transparent Grid Enablement
• Results
• Related Work
• Conclusions
• Future Work
Motivation
• High performance computing (HPC) is gaining popularity for running complex scientific applications.
• The current HPC programming standards (e.g., MPI, OpenMP, and Grid computing toolkits) are not designed for scientists who develop their own scientific applications.
  • For example, the Weather Research and Forecasting (WRF) model is 200,000+ lines of Fortran 90 code that uses MPI and OpenMP.
• This lack of separation of concerns has resulted in scientific applications with rigid code, which entangles non-functional concerns (e.g., the parallel code and the platform-specific code) with functional concerns (i.e., the core business logic).
• Effectively, this tangled code hinders the maintenance and evolution of these applications.
Transparent Grid Enablement: Goals
• To separate the task of developing the business logic of a scientific application from the task of improving its performance.
• To increase the modularity of the code by separating crosscutting, parallel-programming-related code from the business logic of the scientific application.
• To develop an automatic (or semi-automatic) Grid enablement process that requires no manual modifications to the business logic of the scientific application and is hence “transparent” to the scientists and their sequential code.
• TGE achieves these goals by integrating two existing software tools, namely, TRAP/J and GRID superscalar.
Background: GRID superscalar
• Inspired by superscalar processors, GRID superscalar provides an easy programming paradigm for developing parallel programs.
• Similar to superscalar processors, which provide out-of-order and parallel execution of machine instructions by bookkeeping their dependencies, GRID superscalar provides parallelism to the functions of a program written in a high-level programming language such as Java.
• GRID superscalar enables the development of applications for a computational Grid by hiding the details of job deployment, scheduling, and dependencies, and it enables the exploitation of the concurrency of these applications at runtime.
• In TGE, the actual gridification of the application is obtained through GRID superscalar, and the GRID superscalar calls are woven transparently into the scientific application using TRAP/J.
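The dependency bookkeeping described above can be illustrated with a small sketch. This is not the GRID superscalar API: the class `DependencySketch`, the `Task` type, and the file names are hypothetical, and the sketch only assumes that each task reads some files and updates one file, as an inout parameter would.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical illustration of dependency bookkeeping; not the GRID superscalar API. */
final class DependencySketch {

    /** A task that reads some files and updates (reads and writes) one file. */
    static final class Task {
        final String name;
        final List<String> reads;
        final String updates;

        Task(String name, List<String> reads, String updates) {
            this.name = name;
            this.reads = reads;
            this.updates = updates;
        }
    }

    /** For each task, lists the indices of the earlier tasks it must wait for. */
    static List<List<Integer>> dependencies(List<Task> tasks) {
        List<List<Integer>> deps = new ArrayList<>();
        for (int i = 0; i < tasks.size(); i++) {
            Task later = tasks.get(i);
            List<Integer> waitsFor = new ArrayList<>();
            for (int j = 0; j < i; j++) {
                Task earlier = tasks.get(j);
                // The later task uses a file that the earlier task updates.
                boolean usesEarlierResult = later.reads.contains(earlier.updates)
                        || later.updates.equals(earlier.updates);
                // The later task overwrites a file that the earlier task still reads.
                boolean overwritesEarlierInput = earlier.reads.contains(later.updates);
                if (usesEarlierResult || overwritesEarlierInput) {
                    waitsFor.add(j);
                }
            }
            deps.add(waitsFor);
        }
        return deps;
    }

    /** Three block updates: the first two accumulate into the same block. */
    static List<Task> demoTasks() {
        List<Task> tasks = new ArrayList<>();
        tasks.add(new Task("t0", List.of("A00", "B00"), "C00"));
        tasks.add(new Task("t1", List.of("A01", "B10"), "C00"));
        tasks.add(new Task("t2", List.of("A00", "B01"), "C01"));
        return tasks;
    }

    public static void main(String[] args) {
        // t1 waits for t0 (both update C00); t2 is independent of both.
        System.out.println(dependencies(demoTasks())); // prints [[], [0], []]
    }
}
```

Tasks with an empty dependency list can be dispatched immediately and in parallel; the others wait only for the tasks they are listed against.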
Background: TS, TRAP, and TRAP/J
• Transparent Shaping (TS) is a programming model that enables software adaptation through interception and redirection of interactions among different parts of a software system, without the need to manually modify the code.
• TRAP (Transparent Reflective Aspect Programming) is an extension of Transparent Shaping for object-oriented programming languages.
• TRAP/J is a realization of TRAP in Java that enables static and dynamic adaptation in Java programs at startup and runtime, respectively.
• Other realizations: TRAP/C++, TRAP/BPEL, and TRAP.NET.
Background: TRAP/J
• Using TRAP/J, we can insert generic hooks/interceptors at important/sensitive points in a Java program.
• Later, we can use these hooks to intercept and redirect the flow of control to new code.
[Figure: flow of control. In the original application, invoking the original task executes the original task. In the adapt-ready application, a TRAP/J hook asks "Adapt?": if no, the original task executes; if yes, the new task executes.]
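The hook-and-redirect idea above can be sketched in plain Java. This is a hand-written illustration, not code generated by TRAP/J and not its API: the class `HookSketch`, the `delegate` field, and both tasks are hypothetical names chosen for the example.

```java
import java.util.function.IntUnaryOperator;

/** Hypothetical hand-written hook; TRAP/J would generate the equivalent automatically. */
final class HookSketch {

    /** The new task to redirect to; null means "do not adapt". */
    static volatile IntUnaryOperator delegate = null;

    /** Adapt-ready entry point: the hook decides which task runs. */
    static int compute(int x) {
        IntUnaryOperator d = delegate;   // hook: "Adapt?"
        if (d != null) {
            return d.applyAsInt(x);      // yes: execute the new task
        }
        return originalCompute(x);       // no: execute the original task
    }

    /** The original task, left unmodified. */
    static int originalCompute(int x) {
        return x * x;
    }

    public static void main(String[] args) {
        System.out.println(compute(3));  // original task: prints 9
        delegate = x -> x * x * x;       // install a new task without touching compute()
        System.out.println(compute(3));  // new task: prints 27
    }
}
```

The caller keeps invoking `compute` in both cases, so the original code and its callers do not change when the new behavior is installed.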
Background: TRAP/J
• TRAP/J allows crosscutting concerns to be separated from the functional logic not only at development time, but also at run time.
[Figure: the application before TRAP/J and after TRAP/J]
Case Study: Matmul
• Matmul is a simple matrix multiplication program written in Java.
• It uses a sequential matrix multiplication algorithm, which computes C = A·B, where A, B, and C are matrices of size N×N.
• This typical “row by column” sequential algorithm involves O(N^3) operations.
[Figure: A × B = C]
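For reference, the “row by column” algorithm can be written as three nested loops. The sketch below is a minimal in-memory version that assumes square matrices stored as `double[][]`; the class name `SequentialMatmul` is hypothetical, and the actual Matmul program works on files through its own `Block` class.

```java
/** Minimal sketch of the sequential "row by column" algorithm for square matrices. */
final class SequentialMatmul {

    /** Returns C = A * B for N x N matrices, using N^3 multiply-add steps. */
    static double[][] multiply(double[][] a, double[][] b) {
        int n = a.length;
        double[][] c = new double[n][n];
        for (int i = 0; i < n; i++) {          // row of A
            for (int j = 0; j < n; j++) {      // column of B
                for (int k = 0; k < n; k++) {  // position along the row and column
                    c[i][j] += a[i][k] * b[k][j];
                }
            }
        }
        return c;
    }

    public static void main(String[] args) {
        double[][] a = {{1, 2}, {3, 4}};
        double[][] b = {{5, 6}, {7, 8}};
        double[][] c = multiply(a, b);
        System.out.println(java.util.Arrays.deepToString(c)); // [[19.0, 22.0], [43.0, 50.0]]
    }
}
```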
Transparent Grid Enablement
C00 = C00 + A00*B00    C00 = C00 + A01*B10
C01 = C01 + A00*B01    C01 = C01 + A01*B11
C10 = C10 + A10*B00    C10 = C10 + A11*B10
C11 = C11 + A10*B01    C11 = C11 + A11*B11
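The eight block updates above follow one pattern: every block C[i][j] accumulates A[i][k]*B[k][j] over k. The sketch below shows that pattern in memory, assuming the matrix size is divisible by the number of pieces per dimension. The names `BlockedMatmul` and `multiplyAcc` are hypothetical and only mirror the role of the multiply_acc operation; they are not the TGE code.

```java
/** Hypothetical in-memory sketch of blocked multiply-accumulate. */
final class BlockedMatmul {

    /** Performs C[bi][bj] += A[bi][bk] * B[bk][bj] on blocks of size bs x bs. */
    static void multiplyAcc(double[][] c, double[][] a, double[][] b,
                            int bi, int bj, int bk, int bs) {
        for (int i = bi * bs; i < (bi + 1) * bs; i++) {
            for (int j = bj * bs; j < (bj + 1) * bs; j++) {
                for (int k = bk * bs; k < (bk + 1) * bs; k++) {
                    c[i][j] += a[i][k] * b[k][j];
                }
            }
        }
    }

    /** Multiplies N x N matrices split into pieces x pieces blocks. */
    static double[][] multiply(double[][] a, double[][] b, int pieces) {
        int n = a.length;
        if (pieces <= 0 || n % pieces != 0) {
            throw new IllegalArgumentException("matrix size must be divisible by pieces");
        }
        int bs = n / pieces;
        double[][] c = new double[n][n];
        for (int i = 0; i < pieces; i++) {
            for (int j = 0; j < pieces; j++) {
                // Updates to different (i, j) blocks are independent of each other;
                // the updates to one block over k must be applied one after another.
                for (int k = 0; k < pieces; k++) {
                    multiplyAcc(c, a, b, i, j, k, bs);
                }
            }
        }
        return c;
    }

    public static void main(String[] args) {
        double[][] a = {{1, 2}, {3, 4}};
        double[][] b = {{5, 6}, {7, 8}};
        // With pieces = 2, each block is a single element and the loop issues
        // exactly the eight updates listed above.
        System.out.println(java.util.Arrays.deepToString(multiply(a, b, 2)));
    }
}
```

With 2 pieces per dimension there are 4 independent C blocks, and with 3 pieces there are 9, which matches the maximum parallelism figures of 4 and 9 used in the rest of the talk.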
Transparent Grid Enablement
• Finer-grain parallelism: adaptive code for a maximum parallelism of 9.
• Coarser-grain parallelism: adaptive code for a maximum parallelism of 4.
[Figure: the scientist provides the sequential Matmul application and the HPC expert provides the new parallel approach; TRAP/J and GRID superscalar combine them, through startup-time adaptation, into the adapt-ready, Grid-enabled application.]
Transparent Grid Enablement
[Figure: TRAP/J turns the Matmul application into the adapt-ready Matmul application; its MultiplyMatrices delegate issues multiply_acc() calls, declared in the Matmul IDL, through GRID superscalar.]
Transparent Grid Enablement
Sequential matrix multiplication:
    public static void main(String[] args) {
        . . .
        Multiply_Matrices(size, args[1], args[2], args[3]);
    }

    public static void Multiply_Matrices(int size, String fileC, String fileA, String fileB) {
        Block A = new Block(fileA, size);   // load input block A from disk
        Block B = new Block(fileB, size);   // load input block B from disk
        Block C = new Block(size);          // result block
        C.Multiply(A, B);
        C.blockToDisk(fileC);               // write the result to disk
    }
Matmul IDL:
    interface MATMUL {
        void multiply_acc(inout File f3, in File f1, in File f2, in int size);
    };
Transparent Grid Enablement
Multiply Matrices delegate class:
    public class Matmul_Del implements DelegateInterface {
        public static void Multiply_Matrices(int size, String fileC, String fileA, String fileB) {
            GSMaster.On();
            for (int i = 0; i < num_of_pieces; i++) {
                for (int j = 0; j < num_of_pieces; j++) {
                    for (int k = 0; k < num_of_pieces; k++) {
                        // Method sent to each node in the grid
                        Matmul.multiply_acc(C[i][j], A[i][k], B[k][j], …);
                    }
                }
            }
            GSMaster.Off();
            MergeFiles(); // Merge files after computation
            …
        }
    }
Results
Initially we got:
Matrix Size (N) | Sequential (ms) | Parallel with 4 blocks (ms) | Speedup (S/P)
144             | 674             | 61512                       | 0.010957212
288             | 2031            | 66096                       | 0.030728032
576             | 9527            | 69365                       | 0.137345924
1152            | 62269           | 172787                      | 0.360380121
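The Speedup (S/P) column is the sequential time divided by the parallel time, so a value below 1 means the parallel run was slower than the sequential one. The short sketch below recomputes the column from the two timing columns; the class name `SpeedupSketch` is hypothetical.

```java
/** Recomputes the Speedup (S/P) column from the timings in the first results table. */
final class SpeedupSketch {

    /** Speedup = sequential time / parallel time; below 1 means a slowdown. */
    static double speedup(double sequentialMs, double parallelMs) {
        return sequentialMs / parallelMs;
    }

    public static void main(String[] args) {
        int[] sizes = {144, 288, 576, 1152};
        double[] sequentialMs = {674, 2031, 9527, 62269};
        double[] parallelMs = {61512, 66096, 69365, 172787};
        for (int i = 0; i < sizes.length; i++) {
            System.out.printf("N=%d speedup=%.9f%n",
                    sizes[i], speedup(sequentialMs[i], parallelMs[i]));
        }
    }
}
```

For N = 144, for example, 674 / 61512 is about 0.011, which means the initial parallel run took roughly 91 times longer than the sequential one.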
Results
Matrix Size (N) | Seq. (ms) | Par. w/ 4 blocks and 2 workers (ms) | Par. w/ 4 blocks and 4 workers (ms) | Par. w/ 9 blocks and 6 workers (ms)
144             | 5576      | 79221                               | 57656                               | 145331
288             | 14934     | 86259                               | 62013                               | 146744
576             | 44755     | 108107                              | 78096                               | 148240
1152            | 19318     | 176464                              | 133058                              | 176464
2304            | 79837     | 643925                              | 441891                              | 474215
More disappointment!
Results: Optimization
• Problem in GS: GS_Off()
  • It frees resources and deletes temporary files after the calls to the grid methods finish.
  • Since all the data is distributed across the nodes, this cleanup costs extra time.
  • Solution: avoiding the cleanup! ;)
• Optimizing the use of GridFTP
  • TCP has a slow start.
  • You can instruct GridFTP to open more TCP connections with a bigger starting window to compensate for the slow start of TCP.
Results: Optimization
Using Network File System
<?xml version="1.0" encoding="UTF-8"?>
<project isSimple="yes" masterBandwidth="100000" masterBuildScript=""
    masterInstallDir="/home/lion-e/globus2/matmul_java_master"
    masterName="la-blade-01.cs.fiu.edu"
    masterSourceDir="/a/lion.cs.fiu.edu./disk/216/e/globus2/matmul_java_master"
    name="Matmul" workerBuildScript=""
    workerSourceDir="/a/lion.cs.fiu.edu./disk/216/e/globus2/matmul_java_worker">
  <disks>
    <disk name="_MasterDisk_"/>
    <disk name="_WorkingDisk_la-blade-02_cs_fiu_edu_"/>
    <disk name="_WorkingDisk_la-blade-03_cs_fiu_edu_"/>
  </disks>
  <directories>
    <directory disk="_MasterDisk_" isWorkingPath="yes" path="/home/lion-e/globus2/matmul_java_master"/>
  </directories>
  <workers>
    <worker Arch="" GFlops="1.0" LimitOfJobs="1" Mem="16" NCPUs="1" NetKbps="100000" OpSys="" Queue="none" Quota="0"
        deploymentStatus="deployed" installDir="/home/lion-e/globus2/matmul_java_worker" name="la-blade-02.cs.fiu.edu">
      <directories>
        <directory disk="_WorkingDisk_la-blade-02_cs_fiu_edu_" isWorkingPath="yes" path="/home/lion-e/globus2/matmul_java_worker"/>
      </directories>
    </worker>

<?xml version="1.0" encoding="UTF-8"?>
<project isSimple="yes" masterBandwidth="100000" masterBuildScript=""
    masterInstallDir="/home/lion-e/globus2/matmul_java_master"
    masterName="la-blade-01.cs.fiu.edu"
    masterSourceDir="/a/lion.cs.fiu.edu./disk/216/e/globus2/matmul_java_master"
    name="Matmul" workerBuildScript=""
    workerSourceDir="/a/lion.cs.fiu.edu./disk/216/e/globus2/matmul_java_worker">
  <disks>
    <disk name="_MasterDisk_"/>
    <disk name="_WorkingDisk_la-blade"/>
    <disk name="_sharedDisk_la-blade"/>
  </disks>
  <directories>
    <directory disk="_MasterDisk_" isWorkingPath="yes" path="/home/lion-e/globus2/matmul_java_master"/>
  </directories>
  <workers>
    <worker Arch="" GFlops="1.0" LimitOfJobs="1" Mem="16" NCPUs="1" NetKbps="100000" OpSys="" Queue="none" Quota="0"
        deploymentStatus="deployed" installDir="/home/lion-e/globus2/matmul_java_worker" name="la-blade-01.cs.fiu.edu">
      <directories>
        <directory disk="_WorkingDisk_la-blade" isWorkingPath="yes" path="/home/lion-e/globus2/matmul_java_worker"/>
        <directory path="shared_path" disk="_SharedDisk_la-blade" isWorkingPath="no"/>
      </directories>
    </worker>
Results
After the optimizations we got (speedup relative to the sequential run):
Matrix Size (N) | Sequential | Parallelism (4) - 2 workers | Parallelism (4) - 4 workers | Parallelism (9) - 6 workers
144             | 1          | 0.070385378                 | 0.09671153                  | 0.038367588
288             | 1          | 0.17312976                  | 0.240820473                 | 0.101769067
576             | 1          | 0.413987993                 | 0.573076726                 | 0.301909066
1152            | 1          | 1.094750204                 | 1.451878128                 | 1.094750204
Results
In the speedup graph, we see that our approach performs almost twice as well as the sequential one.
[Figure: Algorithms Speedup. x-axis: Matrix Size (144, 288, 576, 1152, 2304); y-axis: Speedup (0 to 2); series: Sequential, Parallelism (4) - 2 workers, Parallelism (4) - 4 workers, Parallelism (9) - 6 workers.]
Related Work
• Satin
  • is a Java-based programming model for the Grid that allows explicit expression of divide-and-conquer parallelism.
  • Satin uses marker interfaces to indicate that certain method invocations need to be considered for potentially parallel (spawned) execution.
  • Synchronization is also explicitly marked whenever it is required to wait for the results of parallel method invocations.
• Higher-Order Components (HOCs)
  • is a component-oriented approach based on a master-worker schema.
  • HOCs express recurring patterns of parallelism that are provided to the user as program building blocks, pre-packaged with distributed implementations.
Related Work
• ASSIST
  • is a programming environment aimed at providing parallel programmers with user-friendly, efficient, portable, and fast ways of implementing parallel applications.
  • It includes a skeleton-based parallel programming language (ASSISTcl, where cl stands for coordination language) and a set of compiling tools and run-time libraries.
  • Together, these allow parallel programs written in ASSISTcl to run seamlessly on top of workstation networks supporting POSIX and ACE (the Adaptive Communication Environment, an external, open source library used within the ASSISTcl run-time support).
• ProActive
  • is a Java GRID middleware library for parallel, distributed, and multi-threaded computing.
  • With a reduced set of simple primitives, ProActive provides a comprehensive API to simplify the programming of Grid computing applications distributed on a Local Area Network (LAN), on clusters of workstations, or on Internet GRIDs.
  • ProActive is made only of standard Java classes and requires no changes to the Java Virtual Machine, no preprocessing, and no compiler modification, leaving programmers to write standard Java code.
  • Architected with interception and reflection, the library is itself extensible, making the system open to adaptations and optimizations.
  • The current implementation focuses on the CoreGRID NoE specification of the Grid Component Model (GCM).
Related Work
• None of the above-mentioned approaches provides an explicit separation of concerns that identifies separate tasks for scientist developers and HPC expert developers.
• TGE can be extended to use these works instead of, or as a complement to, GRID superscalar, and it can be used as an enabler for supporting interoperation among the above-mentioned approaches.
Conclusions
• In this work, we have presented an innovative approach to transparent grid-enablement of scientific applications.
• We achieved this goal by combining two of our previously developed toolkits, namely, GRID superscalar and TRAP/J.
• Although this work is still in its preliminary stage, we were able to show its effectiveness through a simple case study.
• We acknowledge that in some applications it may not be easy (and may even be impossible) to separate the parallel code from the business logic of the application; however, there are many existing applications that can benefit from TGE.
Future Work
• We are applying TGE to a real case study, namely, hurricane mitigation simulation and visualization applications.
• Currently, TGE supports static adaptation at startup time. We plan to extend it to support dynamic adaptation.
• Currently, TGE supports self-configuration and self-optimization. We plan to extend TGE to support other autonomic behavior, including self-healing and self-protection.
• We plan to extend the self-optimization and self-configuration of TGE so that it can take advantage of more worker nodes becoming available during runtime.
Acknowledgements
This work was supported in part by IBM (SUR and Student Support awards), the National Science Foundation (grants OCI-0636031, REU-0552555, and HRD-0317692), the Spanish CICYT (contract TIN2004-07739-CO2-01), and the BSC-IBM Master R&D Collaboration agreement. This work is part of the Latin American Grid (LA Grid) project.
Questions/Comments
Contact Information:
S. Masoud Sadjadi ([email protected])
Autonomic Computing Research Lab. (ACRL)
School of Computing and Information Sciences (SCIS)
Florida International University (FIU)
TGE, TRAP/J, TRAP.NET, TRAP/BPEL, ACT/J, and other Transparent Shaping tools can be downloaded from http://acrl.cis.fiu.edu/