Ultra-Scalable Multiphysics Simulations for Solidification Processes in Metals (SKAMPY)

HPC-Status-Konferenz der Gauß-Allianz, 29 November 2016, Hamburg

Harald Köstler, Bauer, Schornbaum, Godenschwager, Rüde, Hammer, Wellein, Hötzer, Nestler

Chair for System Simulation, Friedrich-Alexander-Universität Erlangen-Nürnberg, Erlangen, Germany




SKAMPY Project

Harald Köstler - Chair for System Simulation, FAU Erlangen-Nürnberg, 2016

Outline

• waLBerla Framework

• SKAMPY Project


The waLBerla Framework


waLBerla Framework

• widely applicable Lattice-Boltzmann from Erlangen

• HPC software framework, originally developed for CFD simulations with the Lattice Boltzmann Method (LBM)

• evolved into a general framework for algorithms on block-structured grids

• www.walberla.net

Vocal Fold Study (Florian Schornbaum)

Fluid Structure Interaction (Simon Bogner)

Free Surface Flow


Block-structured Grids

Complex geometry given by surface

Add regular block partitioning

Discard empty blocks

Allocate block data

Load balancing
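The steps listed on this slide can be sketched in a few lines of Python. This is a minimal illustration and not waLBerla code: the sphere geometry, the function names, the block and grid sizes, and the greedy balancing strategy are all assumptions made for the example.

```python
import itertools
import numpy as np

def inside_sphere(points, center=(0.5, 0.5, 0.5), radius=0.4):
    """Hypothetical geometry: True for points inside a sphere."""
    return np.linalg.norm(points - np.asarray(center), axis=-1) <= radius

def build_blocks(blocks_per_axis=4, cells_per_block=8):
    """Partition the unit cube regularly and keep only non-empty blocks."""
    size = 1.0 / blocks_per_axis
    # Cell-center offsets inside one block, reused for every block.
    offsets = (np.arange(cells_per_block) + 0.5) * size / cells_per_block
    local = np.stack(np.meshgrid(offsets, offsets, offsets, indexing="ij"), axis=-1)
    blocks = {}
    for idx in itertools.product(range(blocks_per_axis), repeat=3):
        origin = np.asarray(idx) * size
        mask = inside_sphere(local + origin)
        if mask.any():  # discard empty blocks
            # Allocate block data only for the blocks that are kept.
            blocks[idx] = {"field": np.zeros(mask.shape),
                           "fluid_cells": int(mask.sum())}
    return blocks

def balance(blocks, num_processes):
    """Greedy load balancing: the heaviest block goes to the least-loaded process."""
    load = [0] * num_processes
    owner = {}
    for idx, blk in sorted(blocks.items(), key=lambda kv: -kv[1]["fluid_cells"]):
        rank = load.index(min(load))
        owner[idx] = rank
        load[rank] += blk["fluid_cells"]
    return owner, load

blocks = build_blocks()
owner, load = balance(blocks, num_processes=4)
print(len(blocks), "of", 4 ** 3, "blocks kept; load per process:", load)
```

Weighting blocks by the number of cells inside the geometry is one possible cost measure; a production code may use different weights and a different balancing algorithm.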


Block-structured Grids

• Domain Decomposition & Distribution to Processes:

  • regular decomposition into blocks containing uniform grids

  • grid refinement: octree-like decomposition

In most cases, if a regular decomposition of a uniform grid is used, exactly one block is assigned to each process.

Forest of octrees: each block contains a uniform grid of the same size → 2:1 balance between neighboring cells on level transitions
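The octree-like refinement and the 2:1 balance can be illustrated with a small sketch. It assumes a single root block on the unit cube, identifies a block by a hypothetical `(level, (i, j, k))` tuple, and treats blocks as neighbors when they share a face, edge or corner; none of these names or choices come from waLBerla itself.

```python
import itertools

CELLS_PER_BLOCK = 16  # every block holds a 16^3 uniform grid, whatever its level

def children(block):
    """The eight blocks obtained by halving a block in every direction."""
    level, idx = block
    return [(level + 1, tuple(2 * c + d for c, d in zip(idx, off)))
            for off in itertools.product((0, 1), repeat=3)]

def refine(leaves, block):
    leaves.remove(block)
    leaves.update(children(block))

def touching(a, b, finest):
    """True if two blocks share a face, edge or corner.
    Extents are compared in integer units of the finest level."""
    (la, ia), (lb, ib) = a, b
    sa, sb = 2 ** (finest - la), 2 ** (finest - lb)
    return all(ca * sa <= (cb + 1) * sb and cb * sb <= (ca + 1) * sa
               for ca, cb in zip(ia, ib))

def violations(leaves):
    """Pairs (coarse, fine) of touching blocks that differ by more than one level."""
    finest = max(level for level, _ in leaves)
    return [(a, b) for a in leaves for b in leaves
            if a[0] + 1 < b[0] and touching(a, b, finest)]

def enforce_2to1(leaves):
    """Refine coarse blocks until touching blocks differ by at most one level."""
    while True:
        bad = violations(leaves)
        if not bad:
            return leaves
        refine(leaves, bad[0][0])

def cell_size(block, domain_length=1.0):
    level, _ = block
    return domain_length / (2 ** level * CELLS_PER_BLOCK)

leaves = {(0, (0, 0, 0))}
refine(leaves, (0, (0, 0, 0)))   # eight level-1 blocks
refine(leaves, (1, (0, 0, 0)))   # level 2 in one corner
refine(leaves, (2, (1, 1, 1)))   # level 3 next to level-1 blocks
enforce_2to1(leaves)
print(len(leaves), "blocks, levels", sorted({lvl for lvl, _ in leaves}))
```

Because every block carries the same number of cells, a one-level difference between neighboring blocks translates directly into a 2:1 ratio of cell sizes at the transition.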


Hybrid Parallelization

• Distributed Memory Parallelization: MPI

  • data exchange on borders between blocks via ghost layers

  • support for overlapping communication and computation

  • some advanced models require more complex communication patterns (e.g. free-surface and fluid-structure interaction)

[Figure: sender process, receiver process]

(slightly more complicated for non-uniform domain decompositions, but the same general ideas still apply)
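A ghost layer exchange between two neighboring blocks can be sketched without MPI. In the example below, plain NumPy copies stand in for the send/receive pair between sender and receiver process; the block layout, the function names and the three-point smoothing stencil are assumptions chosen for illustration.

```python
import numpy as np

GHOST = 1  # ghost layer width needed by a nearest-neighbor stencil

def make_block(interior):
    """Allocate a block with one ghost layer on each side in x."""
    block = np.zeros((interior.shape[0] + 2 * GHOST, interior.shape[1]))
    block[GHOST:-GHOST] = interior
    return block

def exchange(left, right):
    """Copy border cells of each block into the ghost layer of its neighbor.
    In an MPI code each copy would be a send on one process and a receive on the other."""
    right[:GHOST] = left[-2 * GHOST:-GHOST]   # sender: left,  receiver: right
    left[-GHOST:] = right[GHOST:2 * GHOST]    # sender: right, receiver: left

def smooth(block):
    """Three-point average in x, applied to interior cells only."""
    new = block.copy()
    new[GHOST:-GHOST] = (block[:-2] + block[1:-1] + block[2:]) / 3.0
    return new

rng = np.random.default_rng(0)
field = rng.random((8, 4))

# Split the field into two blocks, exchange ghost layers, update each block.
left, right = make_block(field[:4]), make_block(field[4:])
exchange(left, right)
left, right = smooth(left), smooth(right)
result = np.vstack([left[GHOST:-GHOST], right[GHOST:-GHOST]])

# Reference: the same update on the undivided field.
reference = smooth(make_block(field))[GHOST:-GHOST]
print(np.allclose(result, reference))
```

After the exchange, each block can be updated independently and the combined result matches the update of the undivided field.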


SKAMPY Project: Application


Overview

Johannes Hötzer - Institute of Applied Materials – Computational Material Science, KIT, 2016

Application Setting

• ternary eutectic alloys

• directional solidification

• analytically moving temperature gradient

• massively parallel phase-field simulations

• large domain sizes
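The slides do not give the formula for the moving temperature gradient. One common way to impose such a field analytically in directional solidification is a linear profile whose isotherms travel at constant velocity, T(z, t) = T_ref + G (z - v t). The sketch below assumes this form; the symbol names and all numerical values are placeholders, not project parameters.

```python
import numpy as np

def temperature(z, t, t_ref=773.0, gradient=200.0, velocity=0.01):
    """Linear temperature profile whose isotherms move with constant velocity in +z.
    All default values are placeholders and are not taken from the slides."""
    return t_ref + gradient * (np.asarray(z, dtype=float) - velocity * t)

z = np.linspace(0.0, 1.0, 5)
print(temperature(z, t=0.0))
print(temperature(z, t=10.0))  # same profile, shifted in z by velocity * t
```

Because the field is a closed-form function of position and time, it can be evaluated locally in every cell and requires no additional solver or communication.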


Phase-field model


Microstructure prediction Al-Ag-Cu


Pattern features in Al-Ag-Cu


Spiral growth in ternary systems


SKAMPY Project: Performance Engineering


Work packages


Single Node Tuning


80x faster than the original version


Intranode Scaling


intranode weak scaling on SuperMUC


Single Node Optimization Summary

𝜙-Sweep: 21 % of peak

μ-Sweep: 27 % of peak

Complete Program: 25 % of peak

Single Node Optimizations

• replace/remove expensive operations like square roots and divisions

• pre-compute and buffer values where possible

• SIMD intrinsics
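The first two optimizations can be illustrated with a short sketch. It uses NumPy rather than C++ with SIMD intrinsics, and the kernels (vector normalization and scaling by a grid spacing) are hypothetical examples, not the project's sweeps.

```python
import numpy as np

def normalize_naive(gx, gy, gz):
    """One square root and three divisions per cell."""
    norm = np.sqrt(gx * gx + gy * gy + gz * gz)
    return gx / norm, gy / norm, gz / norm

def normalize_fewer_divisions(gx, gy, gz):
    """One square root and one division per cell; the reciprocal is buffered
    and reused in three multiplications."""
    inv_norm = 1.0 / np.sqrt(gx * gx + gy * gy + gz * gz)
    return gx * inv_norm, gy * inv_norm, gz * inv_norm

def scale_naive(field, dx):
    return field / (dx * dx)       # one division per cell

def scale_precomputed(field, dx):
    inv_dx2 = 1.0 / (dx * dx)      # computed once, outside the cell loop
    return field * inv_dx2         # one multiplication per cell

g = np.random.default_rng(1).random((3, 1000)) + 0.1
a = normalize_naive(*g)
b = normalize_fewer_divisions(*g)
print(all(np.allclose(x, y) for x, y in zip(a, b)))
```

Both variants agree up to floating-point rounding; the rewritten versions simply trade expensive operations for cheaper ones.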

Percent Peak on SuperMUC

Why not 100% Peak?

• unbalanced number of multiplications and additions

• divisions are counted as 1 FLOP each but cost 43 times as much as a multiplication or addition
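Both effects can be made concrete with a simple throughput estimate. The machine model below is an assumption for illustration (one addition and one multiplication issued per cycle, divisions blocking the multiplication unit), and the operation counts are invented; only the factor 43 is taken from the slide.

```python
DIV_COST = 43  # relative cost of one division, as stated on the slide

def peak_fraction(n_add, n_mul, n_div, div_cost=DIV_COST):
    """Upper bound on the fraction of peak for a kernel with the given operation mix.

    Assumed machine model (not from the slides): one addition and one
    multiplication can be issued per cycle, and a division blocks the
    multiplication unit for `div_cost` cycles.
    """
    flops = n_add + n_mul + n_div                  # each operation counts as 1 FLOP
    cycles = max(n_add, n_mul + div_cost * n_div)  # the busier unit sets the pace
    return flops / (2 * cycles)                    # peak is 2 FLOP per cycle

print(peak_fraction(100, 100, 0))  # balanced mix, no divisions
print(peak_fraction(150, 50, 0))   # unbalanced additions and multiplications
print(peak_fraction(100, 100, 2))  # balanced mix with two divisions
```

Under this model, even a handful of divisions or a skewed ratio of additions to multiplications lowers the attainable fraction of peak well below 100 %.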


Scaling


• scaling on SuperMUC up to 32,768 cores

• ghost layer based communication

• communication hiding
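Communication hiding can be sketched as follows: the ghost layer exchange is started first, the inner cells that do not depend on ghost data are updated while it is in flight, and the border cells are updated once it has completed. In this illustration a worker thread stands in for a non-blocking communication call; the 1D block layout and all function names are assumptions.

```python
from concurrent.futures import ThreadPoolExecutor
import numpy as np

def exchange_ghosts(block, left_border, right_border):
    """Stand-in for a non-blocking receive: fills the two ghost cells."""
    block[0], block[-1] = left_border, right_border

def update(block, out, lo, hi):
    """Three-point average for cells lo..hi-1 (reads cells lo-1..hi)."""
    out[lo:hi] = (block[lo - 1:hi - 1] + block[lo:hi] + block[lo + 1:hi + 1]) / 3.0

def step_with_hiding(block, left_border, right_border):
    """One time step on a 1D block laid out as [ghost, interior..., ghost]."""
    n = block.size
    out = block.copy()
    with ThreadPoolExecutor(max_workers=1) as pool:
        pending = pool.submit(exchange_ghosts, block, left_border, right_border)
        update(block, out, 2, n - 2)   # inner cells: no ghost data needed
        pending.result()               # wait for the communication to finish
    update(block, out, 1, 2)           # first interior cell reads the left ghost
    update(block, out, n - 2, n - 1)   # last interior cell reads the right ghost
    return out

block = np.arange(10, dtype=float)
hidden = step_with_hiding(block.copy(), 5.0, -1.0)

# Reference: exchange first, then update all interior cells at once.
ref_in = block.copy()
exchange_ghosts(ref_in, 5.0, -1.0)
ref_out = ref_in.copy()
update(ref_in, ref_out, 1, ref_in.size - 1)
print(np.allclose(hidden[1:-1], ref_out[1:-1]))
```

The split into inner and border cells is what makes the overlap safe: the inner update never reads the cells that the exchange writes.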


Execution-Cache-Memory Model

Julian Hammer, Georg Hager, Gerhard Wellein – RRZE HPC group, FAU Erlangen-Nürnberg, 2016


• Automatic Layer Conditions Model
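The slides only name the layer condition model. As a rough illustration, a commonly quoted form of the 3D layer condition states that the grid layers touched by a stencil must fit into the usable part of a cache level. The sketch below assumes that form; the safety factor of 0.5, the function names and the cache sizes are placeholders, not values from the project.

```python
def layer_condition_holds(nx, ny, cache_bytes, radius=1, bytes_per_cell=8, safety=0.5):
    """3D layer condition: 2*radius + 1 layers of nx*ny cells must fit into the
    usable part of the cache (`safety` is an assumed rule-of-thumb factor)."""
    layers = 2 * radius + 1
    return layers * nx * ny * bytes_per_cell <= safety * cache_bytes

def max_square_layer(cache_bytes, radius=1, bytes_per_cell=8, safety=0.5):
    """Largest n for which an n x n layer still satisfies the condition."""
    layers = 2 * radius + 1
    return int((safety * cache_bytes / (layers * bytes_per_cell)) ** 0.5)

# Placeholder cache sizes in bytes; not measured on any specific machine.
caches = {"L1": 32 * 1024, "L2": 256 * 1024, "L3": 20 * 1024 * 1024}
for name, size in caches.items():
    print(name, max_square_layer(size), layer_condition_holds(100, 100, size))
```

Evaluating the condition per cache level indicates from which level the neighboring layers have to be reloaded, which in turn feeds the data-transfer part of a performance model.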


SKAMPY Project: Software Engineering and Parallelization

Continuous Integration

• Written in C++ with Python extensions

• Hybrid parallelization (MPI + OpenMP)

• No data structures that grow with the number of processes involved

• Scales from laptop to recent petascale machines

• Parallel I/O

• Portable (Compiler/OS)

• Automated tests / CI servers

• Open source release planned

llvm/clang


Outlook: Selected Work packages I


SA2: Steering & Prototyping Interface

• based on Python

SA3: Adaptivity and Dynamic Load Balancing


Outlook: Selected Work packages II


MC1: Many-Core Data Structures

MC2: Communication Optimization and Multi-scale Data Reduction

• Communication strategies for heterogeneous systems

• Problem-specific data compression

MC3: Asynchronous Execution and Resilience


Acknowledgements

• Funded by:

• Bundesministerium für Bildung und Forschung (German Federal Ministry of Education and Research)

• KONWIHR (Bavarian project)

• DFG SPP 1648/1 – Software for Exascale computing

• Industry

• Supercomputing centers

http://www.exastencils.org/


Thank you!

Questions?