Lecture Notes in Computer Science Edited by G. Goos, J. Hartmanis and J. van Leeuwen
1541
Springer Berlin Heidelberg New York Barcelona Hong Kong London Milan Paris Singapore Tokyo
Bo Kågström Jack Dongarra Erik Elmroth Jerzy Waśniewski (Eds.)
Applied Parallel Computing Large Scale Scientific and Industrial Problems
4th International Workshop, PARA'98 Umeå, Sweden, June 14-17, 1998 Proceedings
Springer
Series Editors
Gerhard Goos, Karlsruhe University, Germany Juris Hartmanis, Cornell University, NY, USA Jan van Leeuwen, Utrecht University, The Netherlands
Volume Editors
Bo Kågström Erik Elmroth Umeå University, Dept. of Computing Science and HPC2N S-901 87 Sweden E-mail: {bo.kagstrom,elmroth}@cs.umu.se
Jack Dongarra University of Tennessee, 107 Ayres Hall Knoxville, TN 37996-1301, USA E-mail: [email protected]
Jerzy Waśniewski Danish Computing Centre for Research and Education DTU, UNI-C, Bldg. 304, DK-2800 Lyngby, Denmark Jerzy.Wasniewski@uni-c.dk
Cataloging-in-Publication data applied for
Die Deutsche Bibliothek - CIP-Einheitsaufnahme
Applied parallel computing : large scale scientific and industrial problems ; 4th international workshop ; proceedings / PARA '98, Umeå, Sweden, June 14 - 17, 1998. Bo Kågström ... (ed.). - Berlin ; Heidelberg ; New York ; Barcelona ; Hong Kong ; London ; Milan ; Paris ; Singapore ; Tokyo : Springer, 1998
(Lecture notes in computer science ; Vol. 1541) ISBN 3-540-65414-3
CR Subject Classification (1998): G.1-2, G.4, F.1-2, D.1-3, J.1
ISSN 0302-9743 ISBN 3-540-65414-3 Springer-Verlag Berlin Heidelberg New York
This work is subject to copyright. All rights are reserved, whether the whole or part of the material is concerned, specifically the rights of translation, reprinting, re-use of illustrations, recitation, broadcasting, reproduction on microfilms or in any other way, and storage in data banks. Duplication of this publication or parts thereof is permitted only under the provisions of the German Copyright Law of September 9, 1965, in its current version, and permission for use must always be obtained from Springer-Verlag. Violations are liable for prosecution under the German Copyright Law.
© Springer-Verlag Berlin Heidelberg 1998 Printed in Germany
Typesetting: Camera-ready by author SPIN 10693025 06/3142 - 5 4 3 2 1 0 Printed on acid-free paper
Preface
The Fourth International Workshop on Applied Parallel Computing (PARA'98) was held in Umeå, Sweden, June 14-17, 1998. The workshop was organized by the High Performance Computing Center North (HPC2N) and the Department of Computing Science at Umeå University. The general theme for PARA'98 was Large Scale Scientific and Industrial Problems, focusing on:
- High-performance computing applications in academia and industry,
- Tools, languages and environments for high-performance computing,
- Scientific visualization and virtual reality applications in academia and industry,
- Future directions in high-performance computing and networking.
The workshop attracted over 140 people representing 18 different countries. PARA'98 was an international forum for the exchange of ideas and competence among specialists in high performance and parallel computing and visualization, and scientists from industry and academia involved in solving large scale computational problems. The workshop program included 20 invited presentations and 64 contributed presentations that were selected by the PARA'98 steering committee. These proceedings reflect the results of this meeting.
The PARA'98 meeting began with a one-day tutorial followed by a three-day workshop. The tutorial program included three topics: Tools and Languages for High Performance Computing (Dennis Gannon, Indiana University), Projection Based Virtual Environments for Collaboration in Scientific Visualization, Industrial Design and Art (Dan Sandin, University of Illinois, Chicago), and Scientific Visualization and Computational Steering (Chris Johnson, University of Utah). Over 80 people attended the tutorials.
The first three PARA workshops were held at the Technical University of Denmark (DTU), Lyngby (1994, 1995, and 1996). Following PARA'96, an international steering committee for the PARA meetings was appointed, and the committee decided that a workshop should take place every second year in one of the Nordic countries. One important aim of these workshops is to strengthen the ties between HPC centers, academia, and industry in the Nordic countries as well as worldwide. Sweden and Umeå University organized the 1998 workshop, and the next workshop, in the year 2000, will take place at the University of Bergen in Norway.
September 1998
Bo Kågström, Jack Dongarra, Erik Elmroth, Jerzy Waśniewski
Organization
PARA'98 was organized by the High Performance Computing Center North (HPC2N) and the Department of Computing Science at Umeå University.
Organizing Committee
Conference Chair: Bo Kågström (Umeå University, Sweden)
Conference Coordinator: Erik Elmroth (Umeå University, Sweden)
Local Organization: Krister Dackland (Umeå University, Sweden)
Lena Hellman (Umeå University, Sweden)
Per Ling (Umeå University, Sweden)
Mats Nylén (Umeå University, Sweden)
Peter Poromaa (Umeå University, Sweden)
Steering Committee
Petter Bjørstad, University of Bergen (Norway)
Jack Dongarra, University of Tennessee and Oak Ridge National Laboratory (USA)
Björn Engquist, PDC, Royal Institute of Technology, Stockholm (Sweden)
Kristjan Jonasson, University of Iceland, Reykjavik (Iceland)
Bo Kågström, Umeå University and HPC2N (Sweden), PARA'98 Chairman
Risto Nieminen, Helsinki University of Technology, Espoo (Finland)
Karstein Sørli, SINTEF, Dept of Industrial Mathematics, Trondheim (Norway)
Olle Teleman, Center for Scientific Computing (CSC), Espoo (Finland)
Jerzy Waśniewski, Danish Computing Centre for Research and Education (UNI-C), Lyngby (Denmark), PARA'94-96 Chairman
Sponsoring Institutions
Swedish Council for High Performance Computing (HPDR)
Swedish Natural Science Research Council (NFR)
Swedish Research Council for Engineering Sciences (TFR)
Umeå University (Rector, HPC2N, Department of Computing Science)
IBM Sweden
Acknowledgments
We acknowledge the enthusiastic work of the steering committee and the local organizing committee. We also acknowledge the following people for their assistance and support in the organization of PARA'98: Inga Bohman, Lena Carneland, Anne-Lie Persson, Inger Sandgren; Anders Backman, Erik Bs, Torbjörn Johansson, Mikael Rännar, Åke Sandgren, and Björn Torkelsson. PARA'98 would not have been possible without the personal involvement of all these people. We thank Krister Dackland, Isak Jonsson, and Per Ling for their professional assistance in editing the proceedings. Finally, we would also like to thank the sponsoring institutions for their generous financial support.
Table of Contents
Speaker information. 1
Communications Latency Hiding Techniques for a Reconfigurable Optical Interconnect: Benchmark Studies
A. Afsahi, N.J. Dimopoulos
Multifrontal Solvers Within the PARASOL Environment ... 7
P. Amestoy, I. Duff, J.-Y. L'Excellent
Parallelization of a 3D FD-TD Code for the Maxwell Equations Using MPI ... 12
U. Andersson
Advanced Calculations and Visualization of Enzymatic Reactions with the Combined Quantum Classical Molecular Dynamics Code ... 20
P. Bała, P. Grochowski, K. Nowiński, T. Clark, B. Lesyng, J.A. McCammon
Memory Access Profiling Tools for Alpha-based Architectures ... 28
S.M. Balle, S.C. Steely, Jr.
Parallelized Block-Structured Newton-Type Methods in Dynamic Process Simulation ... 38
J. Borchardt
Tuning the Performance of Parallel Programs on NOWs Using Performance Analysis Tool ... 43
M. Bubak, W. Funika, J. Mościński
Numerical Simulation of 3D Fully Nonlinear Water Waves on Parallel Computers ... 48
X. Cai
Fluctuations in the Defect Creation by Ion Beam Irradiation ... 56
R. Chakarova, I. Pázsit
Parallelisation of an Industrial Hydrodynamics Application Using the PINEAPL Library ... 63
T. Christensen, A.R. Krommer, J. Larsen, L. Sørensen
Hyper-Rectangle Selection Strategy for Parallel Adaptive Numerical Integration ... 71
R. Ciegis, R. Sablinskas, J. Waśniewski
1 Bold style indicates the invited speaker. Underline indicates the speaker.
Parallelising Fuzzy Queries for Spatial Data Modelling on a Cray T3D ... 76
A. Clematis, A. Coda, M. Spagnuolo, S. Spinello, T. Sloan
Hyper-Systolic Implementation of BLAS-3 Routines on the APE100/Quadrics Machine ... 82
M. Coletta, T. Lippert, P. Palazzari
Resource Management for Ultra-scale Computational Grid Applications ... 88
K. Czajkowski, I. Foster, C. Kesselman
A ScaLAPACK-Style Algorithm for Reducing a Regular Matrix Pair to Block Hessenberg-Triangular Form ... 95
K. Dackland, B. Kågström
Parallel Tight-Binding Molecular Dynamics Code Based on Integration of HPF and Optimized Parallel Libraries ... 104
B. Di Martino, M. Celino, M. Briscolini, L. Colombo, S. Filippone, V. Rosato
Parallel Computation of Multidimensional Scattering Wavefunctions for Helmholtz/Schroedinger Equations ... 112
Å. Edlund, I. Bar-On, U. Peskin
New Serial and Parallel Recursive QR Factorization Algorithms for SMP Systems ... 120
E. Elmroth, F. Gustavson
Visualization of CFD Computations ... 129
J. Engström
Improving the Performance of Scientific Parallel Applications in a Cluster of Workstations ... 134
A. Flores, J.M. García
On the Parallelisation of Non-linear Optimisation Algorithms for Ophthalmical Lens Design ... 142
E. Fontdecaba Baig, J.M. Cela Espín, J.C. Dürsteler Lopez
Modelica - A Language for Equation-Based Physical Modeling and High Performance Simulation ... 149
P. Fritzson
Distributed Georeferring of Remotely Sensed Landsat-TM Imagery Using MPI ... 161
J.D. García-Consuegra, J.A. Gallud, G. Sebastián
Parallel Test Pattern Generation Using Circuit Partitioning in a Shared-Memory Multiprocessor ... 167
C. Gil, J. Ortega, J.L. Bernier, M.D. Gil
Parallel Adaptive Mesh Refinement for Large Eddy Simulation Using the Finite Element Method ... 172
D. Golden, N. Hurley, S. McGrath
WSSMP: A High-Performance Serial and Parallel Symmetric Sparse Linear Solver ... 182
A. Gupta, M. Joshi, V. Kumar
Recursive Blocked Data Formats and BLAS's for Dense Linear Algebra Algorithms ... 195
F. Gustavson, A. Henriksson, I. Jonsson, B. Kågström, P. Ling
Superscalar GEMM-based Level 3 BLAS - The On-going Evolution of a Portable and High-Performance Library ... 207
F. Gustavson, A. Henriksson, I. Jonsson, B. Kågström, P. Ling
Parallel Solution of Some Large-Scale Eigenvalue Problems Arising in Chemistry and Physics ... 216
D.L. Harrar II, M.R. Osborne
An Embarrassingly Parallel ab initio MD Method for Liquids ... 224
F. Hedman, A. Laaksonen
A New Parallel Preconditioner for the Euler Equations ... 230
L. Hemmingsson, A. Kähäri
Partitioning Sparse Rectangular Matrices for Parallel Computations of Ax and A^T v ... 239
B. Hendrickson, T.G. Kolda
NetLink: A Modern Data Distribution Approach Applied to Transparent Access of High Performance Software Libraries ... 248
I. Holmqvist, E. Lindström
Modernization of Legacy Application Software ... 255
J. Howe, S.B. Baden, T. Grimmett, K. Nomura
Parallel Methods for Fluid-Structure Interaction ... 263
C.B. Jenssen, T. Kvamsdal, K.M. Okstad, J. Amundsen
Parallel Computing Tests on Large-Scale Convex Optimization ... 275
M. Kallio, S. Salo
Parallel Sparse Matrix Computations in the Industrial Strength PINEAPL Library ... 281
A.R. Krommer
Massively Parallel Linear Stability Analysis with P_ARPACK for 3D Fluid Flow Modeled with MPSalsa ... 286
R.B. Lehoucq, A.G. Salinger
Parallel Molecular Dynamics Simulations of Biomolecular Systems ... 296
A. Lyubartsev, A. Laaksonen
A Parallel Solver for Animal Genetics ... 304
P. Madsen, M. Larsen
Scheduling of a Parallel Workload: Implementation and Use of the Argonne Easy Scheduler at PDC ... 309
L. Malinowsky, P. Oster
An Algorithm to Evaluate Spectral Densities of High-Dimensional Stationary Diffusion Stochastic Processes with Non-linear Coefficients: The General Scheme and Issues on Implementation with PVM ... 315
Y.V. Mamontov, M. Willander
High-Performance Simulation of Evolutionary Aspects of Epidemics ... 322
W. Maniatty, B.K. Szymanski, T. Caraco
A Parallel Algorithm for Computing the Extremal Eigenvalues of Very Large Sparse Matrices ... 332
F. Manne
Technologies for Teracomputing: A European Option ... 337
A. Mathis
High Performance Fortran: Status and Prospects ... 345
P. Mehrotra, J. Van Rosendale, H. Zima
PAVOR - Parallel Adaptive Volume Rendering System ... 357
M. Meißner
Simulation Steering with SCIRun in a Distributed Environment ... 366
M. Miller, C.D. Hansen, C.R. Johnson
Addressing the Requirements of ASCI-class Systems ... 377
J.H. Mirza
A Parallel Genetic Algorithm for the Graphs Mapping Problem ... 379
O.G. Monakhov, E.B. Grosbein
Parallel Wavelet Transforms ... 385
O. Møller Nielsen
Writing a Multigrid Solver Using Co-array Fortran ... 390
R.W. Numrich, J. Reid, K. Kim
Exploiting Visualization and Direct Manipulation to Make Parallel Tools More Communicative ... 400
C.M. Pancake
Deploying Fault-Tolerance and Task Migration with NetSolve ... 418
J.S. Plank, H. Casanova, M. Beck, J. Dongarra
Comparison of Implicit and Explicit Parallel Programming Models for a Finite Element Simulation Algorithm ... 433
J. Płażek, K. Banaś, J. Kitowski
Parallel Algorithms for Triangular Sylvester Equations: Design, Scheduling and Scalability Issues ... 438
P. Poromaa
Fast and Quantitative Analysis of 4D Cardiac Images Using a SMP Architecture ... 447
V. Positano, M.F. Santarelli, L. Landini, A. Benassi
Ab Initio Electronic Structure Methods in Parallel Computers ... 452
P. Pyykkö
Iterative Solution of Dense Linear Systems Arising from Integral Equations ... 460
J. Rahola
Comparison of Partitioning Strategies for PDE Solvers on Multiblock Grids ... 468
J. Rantakokko
Ship Design Optimization ... 476
C. Risager, J.W. Perram
Parallelization Strategies for the VMEC Program ... 483
L.F. Romero, E.M. Ortigosa, E.L. Zapata, J.A. Jiménez
Rational Krylov Algorithms for Eigenvalue Computation and Model Reduction ... 491
A. Ruhe, D. Skoogh
Solution of Distributed Sparse Linear Systems Using PSPARSLIB ... 503
Y. Saad, M. Sosonkina
Parallelization of the DAO Atmospheric General Circulation Model ... 510
W. Sawyer, R. Lucchesi, P. Lyster, L. Takacs, J. Larson, A. Molod, S. Nebuda, C. Pabon-Ortiz
Dynamic Performance Callstack Sampling: Merging TAU and DAQV ... 515
S. Shende, A.D. Malony, S.T. Hackstadt
A Parallel Rational Krylov Algorithm for Eigenvalue Computations ... 521
D. Skoogh
Portable Implementation of Real-Time Signal Processing Benchmarks on HPC Platforms ... 527
J. Suh, V.K. Prasanna
Large Scale Active Networks Simulation ... 537
K. Swaminathan, R. Radhakrishnan, P.A. Wilsey, P. Alexander
Forward Dependence Folding as a Method of Communication Optimization in SPMD Programs ... 543
Z. Szczerbinski
A Parallel Genetic Clustering for Inverse Problems ... 551
H. Telega, R. Schaefer, E. Cabib
A Parallel Hierarchical Solver for Finite Element Applications ... 557
C.-A. Thole, A. Supalov, S. Mayer
Parallel Computation and Visualization of 3D, Time-Dependent, Thermal Convective Flows ... 565
P. Wang, P. Li
Recursive Formulation of Cholesky Algorithm in Fortran 90 ... 574
J. Waśniewski, B.S. Andersen, F. Gustavson
High Performance Linear Algebra Package for FORTRAN 90 ... 579
J. Waśniewski, J. Dongarra
Author Index ... 585