Some Experiences on Parallel Finite Element Computations Using IBM/SP2


Yuan-Sen Yang and Shang-Hsien Hsieh

National Taiwan University

Taipei, Taiwan, R.O.C.

Contents

• Parallel Substructure Method

• Three Issues:

– Mesh Partitioning

– Nodal Renumbering within Substructures

– Solution of Interface DOFs

• Conclusions

Parallel Substructure Method

• Partition a structure into several substructures.

• Assign each substructure to a processor.

• Matrix assembly & static condensation within each substructure.

[Figure: Condensation. Labels: Interior Nodes, Interface Nodes]
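The condensation step can be written out for a single substructure. The notation below is an assumption made for illustration (the slides define no symbols): subscript i marks interior DOFs and subscript b marks interface DOFs.

```latex
% Substructure equilibrium, partitioned into interior (i) and interface (b) DOFs
\begin{bmatrix} K_{ii} & K_{ib} \\ K_{bi} & K_{bb} \end{bmatrix}
\begin{bmatrix} u_i \\ u_b \end{bmatrix}
=
\begin{bmatrix} f_i \\ f_b \end{bmatrix}

% Eliminating u_i gives the condensed interface stiffness and load
\bar{K}_{bb} = K_{bb} - K_{bi} K_{ii}^{-1} K_{ib},
\qquad
\bar{f}_b = f_b - K_{bi} K_{ii}^{-1} f_i
```

Each processor can form its own condensed pair without communication, since only quantities of its own substructure appear.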

Parallel Substructure Method (cont.)

• Solve the displacements of interface DOFs.

• Solve the displacements of internal DOFs in each substructure.

• Perform force recovery in each substructure.

[Figure: Recovering. Labels: Interior Nodes, Interface Nodes]
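The three phases of the method (condensation, interface solution, recovery of internal displacements) can be sketched on a toy problem. This is a minimal serial illustration, not the authors' implementation: the helper names, the dense pure-Python solver, and the example (a 5-node chain of unit springs, node 0 fixed, unit load at node 4, split at node 2) are all assumptions. In a parallel run, phases 1 and 3 would execute concurrently, one substructure per processor.

```python
# Small dense helpers in pure Python so the sketch has no dependencies.
def solve(A, B):
    """Solve A X = B (B may have several columns) by Gaussian elimination."""
    n = len(A)
    M = [list(A[r]) + list(B[r]) for r in range(n)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))  # partial pivoting
        M[c], M[p] = M[p], M[c]
        for r in range(c + 1, n):
            m = M[r][c] / M[c][c]
            for k in range(c, len(M[r])):
                M[r][k] -= m * M[c][k]
    X = [[0.0] * len(B[0]) for _ in range(n)]
    for r in range(n - 1, -1, -1):
        for j in range(len(B[0])):
            s = M[r][n + j] - sum(M[r][k] * X[k][j] for k in range(r + 1, n))
            X[r][j] = s / M[r][r]
    return X

def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def sub(A, B):
    return [[a - b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]

def transpose(A):
    return [list(r) for r in zip(*A)]

def condense(Kii, Kib, Kbb, fi, fb):
    """Phase 1: eliminate interior DOFs, return condensed (Kbar, fbar)."""
    Kbi = transpose(Kib)  # stiffness matrices are symmetric
    Kbar = sub(Kbb, matmul(Kbi, solve(Kii, Kib)))
    fbar = sub(fb, matmul(Kbi, solve(Kii, fi)))
    return Kbar, fbar

def recover(Kii, Kib, fi, ub):
    """Phase 3: back-substitute interface displacements for interior ones."""
    return solve(Kii, sub(fi, matmul(Kib, ub)))

# Hypothetical substructure A: interior node 1, interface node 2.
sub_A = dict(Kii=[[2.0]], Kib=[[-1.0]], Kbb=[[1.0]], fi=[[0.0]], fb=[[0.0]])
# Hypothetical substructure B: interior nodes 3 and 4, interface node 2.
sub_B = dict(Kii=[[2.0, -1.0], [-1.0, 1.0]], Kib=[[-1.0], [0.0]],
             Kbb=[[1.0]], fi=[[0.0], [1.0]], fb=[[0.0]])

# Phase 1: independent condensation per substructure.
condensed = [condense(**s) for s in (sub_A, sub_B)]

# Phase 2: assemble and solve the interface system. Both substructures
# share the single interface DOF here, so assembly is a plain sum.
K_int = [[sum(Kbar[0][0] for Kbar, _ in condensed)]]
f_int = [[sum(fbar[0][0] for _, fbar in condensed)]]
u_b = solve(K_int, f_int)

# Phase 3: independent recovery per substructure.
u_A = recover(sub_A["Kii"], sub_A["Kib"], sub_A["fi"], u_b)
u_B = recover(sub_B["Kii"], sub_B["Kib"], sub_B["fi"], u_b)
print(u_A, u_b, u_B)
```

For this chain the exact displacement of node k is k, which gives a quick check on the recovered values. A real code would also need a mapping from each substructure's interface DOFs to the global interface numbering, which the single shared DOF hides here.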

Mesh Partitioning

• Requirements

– Automatic Partitioning

– Handling regular & irregular meshes.

– Balanced distribution of number of elements.

– Minimization of number of interface nodes.
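The last two requirements can be measured directly for any candidate partition. The sketch below is illustrative only: the function name, the data layout (one tuple of node ids per element, one part id per element), and the 4-element strip mesh are assumptions, not part of GR, RST, or METIS.

```python
from collections import Counter, defaultdict

def partition_quality(elements, part_of):
    """Return (elements per part, sorted interface nodes, imbalance ratio).

    elements: list of node-id tuples, one per element
    part_of:  list of part ids, part_of[e] is the part of element e
    """
    counts = Counter(part_of)
    parts_at_node = defaultdict(set)
    for elem, p in zip(elements, part_of):
        for node in elem:
            parts_at_node[node].add(p)
    # A node is an interface node if elements of two or more parts touch it.
    interface = sorted(n for n, ps in parts_at_node.items() if len(ps) > 1)
    # 1.0 means perfectly balanced element counts.
    imbalance = max(counts.values()) / (len(part_of) / len(counts))
    return counts, interface, imbalance

# Hypothetical mesh: a strip of four quadrilaterals,
# bottom nodes 0..4 and top nodes 5..9.
elements = [(0, 1, 6, 5), (1, 2, 7, 6), (2, 3, 8, 7), (3, 4, 9, 8)]

# Two partitions with identical element balance but different interfaces.
counts_a, iface_a, imb_a = partition_quality(elements, [0, 0, 1, 1])
counts_b, iface_b, imb_b = partition_quality(elements, [0, 1, 0, 1])
print(len(iface_a), len(iface_b))
```

Both partitions give two elements per part, yet the contiguous one has far fewer interface nodes, which is why element balance alone does not settle the quality of a partition.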

Experiences (Mesh Partitioning)

• GR, RST, and METIS are used in this work.

• Balanced distribution of number of elements is achieved.

• Condensational loads are unbalanced.

[Figure: bar chart for RST partitioning, one bar per processor (P00 to P03), horizontal axis from 0 to 30]

Substructural Nodal Renumbering

• Purpose:

– To reduce the skyline of the substructure matrix.

• Constraint:

– Interface nodes must be numbered after internal nodes.

• Reverse Cuthill-McKee (RCM; Liu & Sherman 1975) is modified and used.
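One way to respect the constraint is sketched below. The slides do not say how RCM was modified, so this is only a plausible reading and an assumption throughout: Cuthill-McKee is run on the interior nodes alone, the result is reversed, and the interface nodes are appended at the end. The function names, the minimum-degree start node, and the 6-node example graph are hypothetical.

```python
from collections import deque

def rcm_interface_last(adj, interface):
    """Order interior nodes by reverse Cuthill-McKee, then interface nodes.

    adj: dict node -> set of neighbouring nodes; interface: set of nodes.
    """
    interior = [n for n in adj if n not in interface]
    deg = {n: len(adj[n] - interface) for n in interior}  # interior degree
    order, seen = [], set()
    # Simple heuristic: start each component from a minimum-degree node.
    for start in sorted(interior, key=lambda n: (deg[n], n)):
        if start in seen:
            continue
        seen.add(start)
        queue = deque([start])
        while queue:
            n = queue.popleft()
            order.append(n)
            for m in sorted(adj[n] - interface - seen, key=lambda m: (deg[m], m)):
                seen.add(m)
                queue.append(m)
    order.reverse()
    # Interface nodes go last, sorted by their earliest interior neighbour.
    pos = {n: i for i, n in enumerate(order)}
    tail = sorted(interface, key=lambda n: (
        min((pos[m] for m in adj[n] if m in pos), default=len(order)), n))
    return order + tail

def skyline_profile(adj, order):
    """Sum over rows of the distance from the diagonal to the first nonzero."""
    pos = {n: i for i, n in enumerate(order)}
    return sum(pos[n] - min([pos[n]] + [pos[m] for m in adj[n]]) for n in adj)

# Hypothetical substructure: a chain 5-0-3-1-4-2 with a scrambled numbering;
# node 5 is the only interface node.
edges = [(5, 0), (0, 3), (3, 1), (1, 4), (4, 2)]
adj = {n: set() for n in range(6)}
for a, b in edges:
    adj[a].add(b)
    adj[b].add(a)

order = rcm_interface_last(adj, {5})
profile_before = skyline_profile(adj, list(range(6)))
profile_after = skyline_profile(adj, order)
print(order, profile_before, profile_after)
```

The result depends on the start node: on a chain where the interface node sits at the other end, the same procedure places the interior node next to the interface first and gains nothing, so a production version would need a better starting-node choice.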

Experiences (Substructure Nodal Renumbering)

• Helps to reduce the condensational loads.

• Rarely balances the condensational loads among processors.

[Figure: 30STORY model, RST partitioning, 4 processors. Two bar charts, one bar per processor (P00 to P03): without substructure nodal renumbering, and with modified RCM substructure nodal renumbering]

Solution of Interface DOFs

• Achieving high parallel efficiency for a linear equation solver is not an easy task.

• When NP increases:

– NI increases

– Parallel efficiency decreases

Experiences (Solution of Interface DOFs)

• In this work, a sequential direct method (Cholesky decomposition) is used.

• NI is affected by both NP and the performance of the partitioning algorithm.

Partitioning Algorithm   NP   NI    TI
RST                      4    48    2.2 s
RST                      8    112   7.0 s
GR                       8    127   16.5 s

NP: Number of processors.

NI: Number of interface nodes.

TI: Time for solving interface DOFs.
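The sequential direct solve can be illustrated with a textbook dense Cholesky factorization followed by forward and back substitution. This is a generic sketch, not the solver used in this work: the function names and the 2x2 system are assumptions, and a production code would likely use skyline or banded storage instead of a full matrix.

```python
import math

def cholesky(A):
    """Return lower-triangular L with A = L L^T for a symmetric positive definite A."""
    n = len(A)
    L = [[0.0] * n for _ in range(n)]
    for j in range(n):
        d = A[j][j] - sum(L[j][k] ** 2 for k in range(j))
        if d <= 0.0:
            raise ValueError("matrix is not positive definite")
        L[j][j] = math.sqrt(d)
        for i in range(j + 1, n):
            L[i][j] = (A[i][j] - sum(L[i][k] * L[j][k] for k in range(j))) / L[j][j]
    return L

def cholesky_solve(L, b):
    """Solve L L^T x = b by forward then back substitution."""
    n = len(L)
    y = [0.0] * n
    for i in range(n):  # forward: L y = b
        y[i] = (b[i] - sum(L[i][k] * y[k] for k in range(i))) / L[i][i]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):  # backward: L^T x = y
        x[i] = (y[i] - sum(L[k][i] * x[k] for k in range(i + 1, n))) / L[i][i]
    return x

# Hypothetical condensed interface system (not data from this work).
K_interface = [[4.0, 2.0], [2.0, 3.0]]
f_interface = [2.0, 1.0]
u_interface = cholesky_solve(cholesky(K_interface), f_interface)
print(u_interface)
```

For a dense n-by-n matrix the factorization costs roughly n^3/3 multiply-add operations, and this step runs on one processor while the others wait, so growth in the interface system directly limits parallel efficiency.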

Conclusions

• Mesh partitioning

– The computational load of each processor is not necessarily proportional to its number of elements.

– Minimization of interface nodes reduces the interface equations and usually improves the parallel efficiency.

• Substructural nodal renumbering

– Substructural nodal renumbering always reduces the condensational loads.

– But it rarely balances the condensational loads among processors.

• Parallel solution of interface DOFs

– High-efficiency parallel solvers for the interface equations are needed to improve the efficiency of the parallel substructure method.

Acknowledgement

• This research is supported by the National Science Council of R.O.C. under project Nos. NSC 86-2211-E-002-029 and NSC 87-2211-E-002-034.

• The parallel computations are performed on IBM/SP2 computers of the National Center for High-performance Computing, Hsin-Chu, Taiwan, R.O.C.

IBM/SP2 in NCHC

• Model

– IBM POWER2 SuperChip (P2SC)

• Floating-Point Peak Performance

– 480 MFLOPS

• Memory

– 128 Mbytes per node
