Slide 1 > Talk > Stefan Melber-Wilkending, 22.09.2005
Wind-Tunnel Simulation using TAU on a PC-Cluster: Resources and Performance
Stefan Melber-Wilkending / DLR Braunschweig
22.09.2005
Outline
New Linux PC-Cluster at Braunschweig (DLR-AS)
Performance Measurements of TAU on PC-Clusters:
Platforms
Results
Example of an application on a PC-Cluster: Wind-Tunnel Simulation
Wind-Tunnel Boundary Condition
Example: Simulation of DLR ALVAST High-Lift Configuration in Low-Speed Wind-Tunnel DNW-NWB
New Linux PC-Cluster at DLR-AS: Technical Data - General
New Linux PC-Cluster at DLR-AS / Braunschweig:
For medium-sized CFD problems
Production use for research and contract work
Size: 276 Opteron 2.6 GHz CPUs
Hardware installation and testing: 09/2005
Open for user-access: 10/2005
New Linux PC-Cluster at DLR-AS: Technical Data - Nodes
138 Dual-Opteron (AMD) Nodes (V20z, SUN)
CPU clock speed: 2.6 GHz
4 GByte DDR1/400 memory
2 x 73 GB Ultra320 SCSI HDs
Management processor ("remote power reset", monitoring, error analysis ...)
Infiniband HPC interconnect
100 MBit Ethernet interconnect
Size: 1 HU
SuSE Linux 9.3 Professional
New Linux PC-Cluster at DLR-AS: Technical Data - Frontends
2 Frontends (V40z, SUN)
4x Opteron 2.2 GHz (AMD)
8 GByte DDR1/333 memory
2 x 73 GB Ultra320 SCSI HDs
100 MBit Ethernet interconnect
Size: 3 HU
SuSE Linux 9.3 Professional
RAID system 10 TByte
Infiniband switch 144 ports (Voltaire)
PBS Pro queuing system / MAUI scheduler
New Linux PC-Cluster at DLR-AS: Technical Data - Setup
New Linux PC-Cluster at DLR-AS: Performance – Compared Systems
32 Nodes / 64 CPUs Intel Xeon 3.06 GHz NEC-Cluster (DLR-AS): 2 GByte RAM / Node, Myrinet 2000 Interconnect
128 Nodes / 256 CPUs AMD Opteron 2.0 GHz Cray-Cluster (HWW): 4 GByte RAM / Node, Myrinet 2000 Interconnect
192 Nodes / 384 CPUs AMD Opteron 2.4 GHz SUN-Cluster (DLR-AT): 4 GByte RAM / Node, Infiniband (Voltaire) Interconnect
36 Nodes / 72 CPUs AMD Opteron 2.2 GHz Cray XD1-Cluster (Cray): 4 GByte RAM / Node, RapidArray Interconnect (direct connection between network and HyperTransport channel on the CPU)
72 Nodes / 144 CPUs AMD Opteron 2.4 GHz Cray XD1-Cluster (Cray): 8 GByte RAM / Node, RapidArray Interconnect
New Linux PC-Cluster at DLR-AS: Performance – Setup
All clusters run the Linux operating system
Compiler: GCC 3.2.3
TAU-Code, Version 2004.1.2 with typical settings for complex configurations:
Central discretization
Implicit time integration (LU-SGS)
CFL-number: 5
Multigrid: 3v
Turbulence model: Menter k-ω SST
Low-Mach-number preconditioning
Cache-optimization
Case: glider with laminar-turbulent transition
Free-stream conditions: Ma = 0.078, Re = 1.1e6
Grid: 10 million points, 30 layers
New Linux PC-Cluster at DLR-AS: Performance – Test Results
CPU time for 50 cycles [s] for different numbers of CPUs
CPUs  NEC Xeon       Cray Opteron  Cray Opteron  Cray Opteron  SUN Opteron   SUN Opteron
      3.06 GHz (AS)  2.0 GHz       2.2 GHz       2.4 GHz       2.4 GHz (AT)  2.6 GHz (AS)
6     -              2303          -             -             1947          1702
8     -              1667          1307          1222          1266          1126
12    1564           1108          881           811           987           743
16    1203           760           661           621           669           572
32    643            436           347           326           339           306
48    -              -             -             -             241           236
60    -              -             -             176           183           165
New Linux PC-Cluster at DLR-AS: Performance – Test Results
Relative speedup [%] compared to the Cray Opteron cluster at HWW (2.0 GHz = 100)
CPUs  NEC Xeon       Cray Opteron  Cray Opteron  Cray Opteron  SUN Opteron   SUN Opteron
      3.06 GHz (AS)  2.0 GHz       2.2 GHz       2.4 GHz       2.4 GHz (AT)  2.6 GHz (AS)
6     -              100           -             -             118           135
8     -              100           128           136           132           148
12    71             100           126           137           121           149
16    63             100           115           114           114           133
32    68             100           126           134           129           143
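The relative speedup can be recomputed from the CPU times as 100 × t(reference) / t(system), with the Cray Opteron 2.0 GHz cluster as the reference. The following is a minimal sketch of that arithmetic, not part of the talk. The names `SYSTEMS`, `CPU_TIME` and `relative_speedup` are illustrative, and rounding to whole percent is an assumption. Recomputed values agree with most entries of the table; a few entries come out differently, which may reflect rounding or transcription in the slides.

```python
# CPU time for 50 cycles [s], transcribed from the CPU-time table.
# None marks a configuration that was not measured.
SYSTEMS = [
    "NEC Xeon 3.06 GHz (AS)",
    "Cray Opteron 2.0 GHz",
    "Cray Opteron 2.2 GHz",
    "Cray Opteron 2.4 GHz",
    "SUN Opteron 2.4 GHz (AT)",
    "SUN Opteron 2.6 GHz (AS)",
]
CPU_TIME = {
    6: [None, 2303, None, None, 1947, 1702],
    8: [None, 1667, 1307, 1222, 1266, 1126],
    12: [1564, 1108, 881, 811, 987, 743],
    16: [1203, 760, 661, 621, 669, 572],
    32: [643, 436, 347, 326, 339, 306],
}
REFERENCE = SYSTEMS.index("Cray Opteron 2.0 GHz")


def relative_speedup(times, reference=REFERENCE):
    """Return the speed of each system in percent of the reference system."""
    t_ref = times[reference]
    # A shorter CPU time means a higher speed, hence t_ref / t.
    return [None if t is None else round(100.0 * t_ref / t) for t in times]


for cpus in sorted(CPU_TIME):
    print(cpus, relative_speedup(CPU_TIME[cpus]))
```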
New Linux PC-Cluster at DLR-AS: Performance – Test Results
Speed of TAU on Opteron CPUs is a linear function of CPU clock speed
Compared to the Cray Opteron 2.0 GHz cluster, the new cluster is about 1.5 times faster
Compared to the NEC Xeon 3.06 GHz cluster (standard cluster at AS-BS), the new cluster is about 2.1 times faster
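One way to look at the clock-speed statement is a least-squares line through the Opteron results at 32 CPUs. The sketch below is illustrative only: it uses the measured CPU times from the table, normalizes the speed to the 2.0 GHz system, and fits a straight line by hand. The helper `fit_line` and the choice of the 32-CPU row are assumptions, and the fit mixes systems with different interconnects, so it is a rough check rather than a rigorous analysis.

```python
# (clock speed [GHz], CPU time for 50 cycles [s]) at 32 CPUs, Opteron systems only.
OPTERON_32_CPUS = [
    (2.0, 436),  # Cray Opteron cluster (HWW)
    (2.2, 347),  # Cray XD1
    (2.4, 326),  # Cray XD1
    (2.4, 339),  # SUN cluster (DLR-AT)
    (2.6, 306),  # SUN cluster (DLR-AS), the new cluster
]
NEC_XEON_32_CPUS = 643  # CPU time [s] of the NEC Xeon cluster at 32 CPUs


def fit_line(points):
    """Ordinary least-squares fit y = slope * x + intercept."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Speed in percent of the 2.0 GHz system (speed is proportional to 1 / time).
t_ref = OPTERON_32_CPUS[0][1]
speed = [(clock, 100.0 * t_ref / t) for clock, t in OPTERON_32_CPUS]
slope, intercept = fit_line(speed)
print(f"fit: speed = {slope:.1f} * clock + {intercept:.1f}")
for clock, s in speed:
    print(f"{clock:.1f} GHz: measured {s:.1f}, fitted {slope * clock + intercept:.1f}")

# Direct ratios for the new cluster at 32 CPUs.
t_new = OPTERON_32_CPUS[-1][1]
ratio_vs_cray = t_ref / t_new
ratio_vs_nec = NEC_XEON_32_CPUS / t_new
print(f"new cluster vs Cray 2.0 GHz: {ratio_vs_cray:.2f}, vs NEC Xeon: {ratio_vs_nec:.2f}")
```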
New Linux PC-Cluster at DLR-AS: Performance – Test Results
Speedup compared to 8 CPUs (memory restrictions of the test-case)
Nearly linear scalability of the TAU-Code up to 60 CPUs
Tested interconnects (Myrinet, Infiniband, RapidArray) have enough reserve for TAU parallelisation
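The scalability statement can be illustrated with the CPU times of the new cluster (SUN Opteron 2.6 GHz) from the CPU-time table. The sketch below computes the speedup relative to 8 CPUs and the parallel efficiency, defined here as the speedup divided by the ideal speedup n / 8. The function name `scaling` and this definition of efficiency are assumptions for illustration.

```python
# CPU time for 50 cycles [s] on the new cluster (SUN Opteron 2.6 GHz, DLR-AS).
NEW_CLUSTER_TIME = {8: 1126, 12: 743, 16: 572, 32: 306, 48: 236, 60: 165}
BASE = 8  # smallest CPU count that fits the test case into memory


def scaling(times, base=BASE):
    """Return (cpus, speedup, efficiency) tuples relative to `base` CPUs."""
    t_base = times[base]
    rows = []
    for n in sorted(times):
        speedup = t_base / times[n]
        ideal = n / base  # perfectly linear scaling
        rows.append((n, speedup, speedup / ideal))
    return rows


rows = scaling(NEW_CLUSTER_TIME)
for n, speedup, efficiency in rows:
    print(f"{n:3d} CPUs: speedup {speedup:5.2f}, efficiency {efficiency:5.2f}")
```

With these numbers the efficiency at 60 CPUs stays at roughly 0.9 of the ideal value, which is consistent with "nearly linear".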
Wind-Tunnel Simulation using TAU-Code: General
Simulation of a wind-tunnel including test-section and nozzle
Background:
Avoid uncertainties of wind-tunnel corrections
Uncorrected measurements directly comparable to CFD
Validation of wind-tunnel corrections
Extrapolation of wind-tunnel results to free flight using CFD
DLR project ForMEx (Fortschrittliche Methoden zur Extrapolation von Windkanalergebnissen auf den Freiflug; Advanced Methods for the Extrapolation of Wind-Tunnel Results to Free Flight)
"Problem": Numerical simulation of wind-tunnel including model → big grids (about 20 million points) → HPC-resources needed → new PC-cluster / AS-BS
Wind-Tunnel Simulation using TAU-Code: Wind-Tunnel Boundary Condition
Idea: Usage and extension of engine boundary-condition
Wind-tunnel inlet: Total pressure and total temperature are given
Regulation of flow-speed in wind-tunnel:
Imaginary probe in numerical test-section (same position as in experiment)
Comparison with given Mach-number
Input for static pressure regulation on tunnel-outlet
Applicable for 0 < Ma < 1
[Figure: regulation loop between TAU-Code and the numerical wind-tunnel (imaginary probe, boundary condition, pressure on outlet)]
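The regulation idea can be sketched as a simple feedback loop: read the Mach number at the imaginary probe, compare it with the target, and adjust the static pressure at the outlet. The code below is a toy illustration under strong assumptions and is not the TAU implementation. The flow solver is replaced by the isentropic relation between total and static pressure, the probe is assumed to see the outlet static pressure (no losses, constant cross-section), and the proportional update with the parameter `gain` is a hypothetical control law. All function and parameter names are invented for this example.

```python
import math

GAMMA = 1.4  # ratio of specific heats for air


def probe_mach(p_static, p_total):
    """Toy stand-in for the flow solver: isentropic Mach number at the probe."""
    ratio = (p_total / p_static) ** ((GAMMA - 1.0) / GAMMA)
    return math.sqrt(2.0 / (GAMMA - 1.0) * (ratio - 1.0))


def regulate_outlet_pressure(mach_target, p_total, p_outlet,
                             gain=0.1, tol=1e-6, max_steps=500):
    """Adjust the outlet static pressure until the probe reads mach_target."""
    steps = 0
    mach = probe_mach(p_outlet, p_total)
    while abs(mach_target - mach) >= tol and steps < max_steps:
        error = mach_target - mach
        # Probe too slow (error > 0): lower the outlet pressure to speed up
        # the flow. Probe too fast (error < 0): raise the outlet pressure.
        p_outlet *= 1.0 - gain * error
        mach = probe_mach(p_outlet, p_total)
        steps += 1
    return p_outlet, mach, steps


P_TOTAL = 101325.0  # inlet total pressure [Pa], illustrative value
p_out, mach, steps = regulate_outlet_pressure(0.2, P_TOTAL, 0.99 * P_TOTAL)
print(f"outlet pressure {p_out:.1f} Pa, probe Mach {mach:.6f}, {steps} steps")
```

In an actual solver the probe value would come from the evolving flow field, so the pressure update would be interleaved with solver iterations rather than applied to a closed-form relation.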
Wind-Tunnel Simulation using TAU-Code: Validation
Measurements in empty low-speed wind-tunnel DNW-NWB
Database for validation of numerical results
Measurements:
Boundary layer profiles
Static pressure on tunnel-outlet
Wind-Tunnel Simulation using TAU-Code: Preliminary Results DNW-NWB / ALVAST
DLR-ALVAST half-model in high-lift configuration in DNW-NWB
DLR-ALVAST:
Similar to Airbus A320
Half model mounted on peniche
Grids:
Hybrid unstructured
Centaur grid generator
20 million points
Full Navier-Stokes
Chimera technique: rotation of model without new grid generation
Wind-Tunnel Simulation using TAU-Code: Preliminary Results DNW-NWB / ALVAST
Simulation of complete lift-polars including maximum lift
Geometry variations:
Wing-root geometry (e.g. slat-horn, 16 configurations)
Comparison of wind-tunnel simulation against free-flight → wind-tunnel corrections
Influence of peniche height
Wind-Tunnel Simulation using TAU-Code: Preliminary Results DNW-NWB / ALVAST
Horseshoe vortex around peniche
[Figure labels: ALVAST, TAU, F11, Wind-Tunnel]
Conclusions
TAU tested on PC-Linux Clusters:
Good scalability and performance
New Cluster at AS/BS available for production: 10/2005
Implementation of a wind-tunnel boundary condition in TAU:
Validation with empty wind-tunnel measurements
First results of simulation of ALVAST high-lift configuration at DNW-NWB compared to the experiment
Further work: Investigation of half-model influence, variation of geometry, ...
Special thanks for testing-support and debugging of TAU-parallelisation
W. Hafemann, C. Simmendinger (T-Systems)
N. Gal, Y. Shahar (Voltaire)
J. Redmer, T. Warschko (Linux NetWorx)
Axel Köhler (SUN)
Institute of Propulsion Technology (DLR-AT)
R. Dwight, T. Alrutz (DLR-AS)
M. Wierse (Cray)