© 2013 ANSYS, Inc. November 25, 2014
Trends in Engineering Driving Demand for HPC
Increase product performance and integrity in less time
• Consider more design variants
• Find the 'optimal' design
• Ensure performance across a range of conditions
Increased product complexity
• Assess larger, more detailed models
• Consider more complex physics
• From single component to system innovations
HPC is a key enabler to gain higher-fidelity insight.
HPC is a key enabler to assure greater product integrity through robust design.
Trends in Engineering Driving Demand for HPC
Do more with existing engineering and design teams
• Scale-up of HPC to support more simulation workloads
• Centralized HPC infrastructure (or cloud) with effective remote access
• Managing collaboration and explosive growth of engineering data
HPC is a key enabler to amplify engineering productivity.
Timeline (1990–2010): Mainframes, Client/Server, Workstations/PCs, Local Clusters, Shared HPC, Grid, Cloud/Mobile
ANSYS Strategic Focus
Scalable and Cost-Effective IT Evolution
• Scale-up of high performance computing
  – New/emerging architectures
  – More and bigger workloads
• Common integrated tools and platform for IT efficiency
• Efficient centralized infrastructure (or cloud) with effective remote access
• Collaboration hubs with secure and scalable data access
• Support for mobile platforms
Scale-up of High Performance Computing
A Software Development Imperative
Today’s multi-core / many-core hardware evolution makes HPC a software development imperative.
Source: AnandTech
2010–2012
► Ideal scaling to 4096 cores (fluids)
► Hybrid parallelization (fluids)
► Network-aware partitioning (fluids)
► DDM for finite antenna arrays (HFSS 14)
► GPU acceleration with DMP (structures)
ANSYS is committed to maintaining performance leadership.
ANSYS IT Industry Partnerships
Performance / Platform Support
Partnerships with IT industry leaders ensure optimized HPC performance, a roadmap to the future, and wrap-around support.
• ANSYS and Intel – 60% speed-up on "Sandy Bridge" processors; R&D focused on Intel's Many Integrated Core architecture (Xeon Phi)
• ANSYS and NVIDIA – GPU acceleration of ANSYS Mechanical; very active R&D engagement with NVIDIA across the full portfolio
• ANSYS and Cray – Support for extreme scalability of ANSYS CFD, up to 1000s of cores
ANSYS HPC performance optimizes the utilization of licenses, hardware, and people.
ANSYS Fluent 14.5 Scaling Achievement on ~8000 Cores
• 150 million cell model
• Non-reactive species
• LES turbulence
• Running on a Cray XE6
• Scalable at ~18K cells per core
[Chart: simulations per day (rating, 10 time steps) vs. number of cores (2048–8192), measured vs. ideal]

Number of Cores | Cells/Core | Efficiency
2048 | 73K | 100%
4096 | 36K | 98%
5632 | 26K | 88%
6656 | 22K | 81%
7680 | 19K | 74%
8192 | 18K | 70%
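The cells-per-core column follows directly from dividing the 150-million-cell model across each core count; a minimal sketch (the efficiency values are quoted from the slide, not recomputed):

```python
# Cells per core for the 150M-cell Fluent benchmark at each core count.
# The efficiency figures are the ones published on the slide.
TOTAL_CELLS = 150_000_000

slide_efficiency = {2048: 1.00, 4096: 0.98, 5632: 0.88,
                    6656: 0.81, 7680: 0.74, 8192: 0.70}

for cores, eff in slide_efficiency.items():
    cells_per_core = TOTAL_CELLS // cores
    print(f"{cores:5d} cores: ~{cells_per_core // 1000}K cells/core, {eff:.0%} efficiency")
```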
ANSYS Fluent 14.5 Scaling Achievement with Complex Physics
• Hexa mesh (830,000 cells)
• Standard k-epsilon turbulence model
• VOF multiphase model (3 phases): molten steel, foamy slag, oxygen
[Chart: speedup vs. cores (12–72), measured vs. ideal]

Cores | Overall Time (h) | Measured Speedup | Ideal Speedup
12 | 0.56 | 1.00 | 1
24 | 0.29 | 1.94 | 2
36 | 0.21 | 2.60 | 3
48 | 0.17 | 3.33 | 4
72 | 0.12 | 4.76 | 6

Courtesy of MORE S.r.l.
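The measured speedup column is simply the 12-core wall time divided by each run's wall time; recomputing from the published hours gives values close to (but not exactly matching) the slide, because the times are rounded to two decimals:

```python
# Recompute speedup from the overall wall times in the table.
# The published times are rounded, so recomputed speedups can
# differ slightly from the slide's figures.
overall_time_h = {12: 0.56, 24: 0.29, 36: 0.21, 48: 0.17, 72: 0.12}

base = overall_time_h[12]
for cores, t in overall_time_h.items():
    measured = base / t
    ideal = cores / 12
    print(f"{cores:2d} cores: measured {measured:.2f}x, ideal {ideal:.0f}x")
```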
ANSYS CFX Scaling Achievement in R15.0 (Preview)
R&D effort to improve HPC scaling in CFX
• Basic and physics-specific scaling areas
• Significantly improved scalability
  – Up to 89% efficiency at 2048 cores
  – HPC improvements are "beta" level for R15.0
[Chart: 150M node duct case – up to 6.7X and 4X faster]
ANSYS CFX Scaling Achievement in R15.0 (Preview)
• Six-stage axial compressor
• 13M nodes
• 14 domains, 12 mixing planes
[Chart: up to 5X faster]
Courtesy Siemens AG, Müllheim, Germany, Paper GT2013-94639
ANSYS Fluent Scaling Achievement in R15.0 (Preview)
• 111 million cell (truck) model
• Pre-release results
• Scalable at ~10K cells per core!
[Charts: solver rating vs. number of cores (up to 12,288 and up to 4096), comparing releases 13.0.0, 14.0.0, and 15.0.0]
ANSYS Fluent Scaling Achievement in R15.0 (Preview)
[Chart: rating vs. number of cores (up to 14,336), DLR_96M LES combustion on a Cray XE6, R15.0 vs. ideal]

Number of Cores | R15.0 Rating | Efficiency
1024 | 65.7518 | 100%
2048 | 131.209 | 99.78%
4096 | 256.072 | 97.36%
8192 | 473.856 | 90.08%
10240 | 551.844 | 83.93%
12288 | 629.586 | 79.79%
13312 | 674.268 | 78.88%
14336 | 728.315 | 79.12%

• Gas-phase combustion
• Thickened Flame Model (TFM) with finite-rate chemistry
• Pressure-based coupled solver
• 96 million cells, hex-core mesh
• 84% efficiency at 10,240 cores – less than 10,000 cells per core!
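Parallel efficiency here is the per-core rating relative to the 1024-core baseline; the percentage column can be reproduced from the ratings:

```python
# Reproduce the efficiency column: rating-per-core relative to the
# 1024-core reference (rating = simulations per day, higher is better).
ratings = {1024: 65.7518, 2048: 131.209, 4096: 256.072, 8192: 473.856,
           10240: 551.844, 12288: 629.586, 13312: 674.268, 14336: 728.315}

baseline = ratings[1024] / 1024  # rating per core at the reference point

for cores, rating in ratings.items():
    efficiency = (rating / cores) / baseline
    print(f"{cores:6d} cores: {efficiency:6.2%}")
```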
ANSYS Fluent Scaling Achievement in R15.0 (Preview)
Scaling to 4096 cores on an InfiniBand-based Linux cluster
[Chart: solver rating vs. number of cores (512–4096), Truck_111m benchmark, ANSYS Fluent 15.0 (Preview): Intel MPI, Intel MPI DAPL UD, and ideal]
ANSYS Fluent GPU Scaling Achievement in R15.0 (Preview)
ANSYS Fluent 15.0 (Preview) performance, pipes model:
• 1.2M tet cells
• Steady, laminar
• Coupled PBNS, double precision
[Chart: ANSYS Fluent total time (sec), lower is better, dual-socket CPU vs. dual-socket CPU + Tesla K20X]
• 2 x Xeon X5650, only 1 core used: 539 sec (CPU) vs. 213 sec (CPU + GPU) – 2.5x
• 2 x Xeon X5650, all 12 cores used: 97 sec (CPU) vs. 66 sec (CPU + GPU) – 1.5x
NOTE: Total solution time!
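The 1.5x and 2.5x callouts are just the ratio of CPU-only to CPU+GPU total solution time:

```python
# GPU speedup from the total solution times on the slide
# (seconds, lower is better).
total_time_s = {
    "1 core used":   {"cpu": 539, "cpu+gpu": 213},
    "12 cores used": {"cpu": 97,  "cpu+gpu": 66},
}

for config, t in total_time_s.items():
    speedup = t["cpu"] / t["cpu+gpu"]
    print(f"{config}: {speedup:.1f}x")
```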
ANSYS Fluent GPU Scaling Achievement in R15.0 (Preview)
Sedan model:
• Sedan geometry, 3.6M mixed cells
• Steady, turbulent, external aerodynamics
• Coupled PBNS, double precision
• AMG F-cycle on CPU, AMG V-cycle on GPU
[Chart: ANSYS Fluent time (sec), lower is better, CPU vs. CPU + GPU; 1.9x speedup with the segregated solver, 2.2x with the coupled solver (bars: 7070, 5883, 3180 sec)]
NOTE: Total solution times!
ANSYS Fluent GPU Scaling Achievement in R15.0 (Preview)
Comparing 16 cores (2 x E5-2680 CPUs) against only 2 of those cores plus 2 GPUs.
Solver settings:
• CPU Fluent solver: F-cycle, agg8, DILU, 0 pre-sweeps, 3 post-sweeps
• GPU nvAMG solver: V-cycle, agg8, MC-DILU, 0 pre-sweeps, 3 post-sweeps
[Chart, lower is better: 2 CPU cores + 2 K20X GPUs vs. 16 CPU cores; 1.7x on Helix (tet, 1173K cells) and 2.1x on Airfoil (hex, 784K cells)]
NOTE: Times are for the solver only.
ANSYS Fluent GPU Scaling Achievement in R15.0 (Preview)
Truck body model:
• 111M cells, external aerodynamics
• Steady, k-ε turbulence, double-precision solver
• CPU: Intel Xeon SNB, 12 cores per node; GPU: Tesla K40, 4 per node
[Chart: Fluent solution time per iteration (sec), lower is better; 36 sec on 144 CPU cores vs. 18 sec on 144 CPU cores + 48 GPUs, a 2X speedup]
ANSYS Fluent Other Parallel Enhancements in R15.0 (Preview)
More scalable discrete phase particle tracking
• Over 2x for 512-way parallel
[Chart: rating vs. number of cores (16–512) for 246,000 cells with 1 million particles; Hybrid MPI vs. 2Domain]
More efficient parallel I/O and startup
• Case read time reduced significantly at high core counts
• Start-up time for 8192-way parallel reduced from 30 minutes to 30 seconds
[Chart: 150M cell case read time in seconds vs. number of cores (1024–10240); 15.0.0 vs. 14.5.0]
Effective configuration of parallel processes
• Use a different number of processes for meshing and solve modes
ANSYS CFX Other Parallel Enhancements in R15.0 (Preview)
• INT64 version of MeTiS improves partition quality on large meshes
• MPICH2 version available for Cray XE (β)
• Increased maximum number of partitions possible
• Improved parallel diagnostic output format
HPC Licensing
Scalable licensing
• ANSYS HPC (per-process)
• ANSYS HPC Pack
  – "Virtually unlimited" parallel on a single job
  – Best solution for extreme scaling needs
  – Parallel enabled increases quickly as packs are added:

    Packs per Simulation | Parallel Enabled (Cores)
    1 | 8
    2 | 32
    3 | 128
    4 | 512
    5 | 2048

• ANSYS HPC Workgroup
  – Volume access to HPC for many users running 'everyday' HPC jobs
  – 128 to 2048 parallel shared across any number of simulations (note: new HPC Workgroup 32 & 64)
• ANSYS HPC Enterprise
  – Similar to HPC Workgroup, but deploy and use anywhere in the world
Single HPC solution for FEA/CFD/FSI and any level of fidelity
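The "increases quickly" progression (8, 32, 128, 512, 2048 cores for 1 to 5 packs) is a factor of 4 per added pack; a small helper, assuming that pattern generalizes (the slide only lists these five points):

```python
def parallel_enabled(packs: int) -> int:
    """Cores enabled on a single job by N HPC Packs.

    Formula inferred from the values on the slide (1 -> 8, 2 -> 32,
    3 -> 128, 4 -> 512, 5 -> 2048): each additional pack multiplies
    the enabled core count by 4.
    """
    return 8 * 4 ** (packs - 1)

for packs in range(1, 6):
    print(f"{packs} pack(s): {parallel_enabled(packs)} cores")
```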
New Parametric Modeling and Licensing at 14.5
ANSYS HPC Parametric Pack
Explore Your Parametric Designs Faster, More Cost Effectively
New Parametric Modeling and Licensing at 14.5
Example: Mixing Vessel
Problem description
• Improve mixing while reducing energy
• Design objective: optimize the inlet velocities within their operating limits so that both the temperature spread at the outlet and the pressure drop in the vessel are minimized
• Input parameters: fluid velocity at the cold and hot inlets (8 design points)
• Detail:
  – k-epsilon model with standard wall functions
  – 52,000 nodes and 280,000 elements
  – Hardware: HP workstation with dual Intel® Xeon® E5-2687W (3.10 GHz, 16 cores), 128 GB memory
Licensing solution
• 1 ANSYS Fluent
• 2 ANSYS HPC Parametric Packs
Result/benefit
• ~4.5x speedup over sequential execution
• Easier and fully automated workflow
Acknowledgment: Paul Schofield and Jiaping Zhang, ANSYS Houston
ANSYS 15.0 License Scheme for GPUs - One HPC Task Required to Unlock One GPU
(Applies to all schemes: HPC, HPC Pack, HPC Workgroup, HPC Enterprise)
Licensing examples:
• 1 x ANSYS HPC Pack: 8 HPC tasks total (4 GPUs max). Valid configurations include 6 CPU cores + 2 GPUs, or 4 CPU cores + 4 GPUs.
• 2 x ANSYS HPC Pack: 32 HPC tasks total (16 GPUs max). Example of a valid configuration: 24 CPU cores + 8 GPUs (total use of 2 compute nodes).
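A configuration checker matching the rule stated on the slide (each GPU consumes one HPC task); note that the half-of-tasks GPU cap is inferred from the examples shown (8 tasks allow 4 GPUs max, 32 tasks allow 16), not stated as a general formula:

```python
def is_valid_config(cpu_cores: int, gpus: int, hpc_tasks: int) -> bool:
    """Check a CPU/GPU mix against an HPC task pool.

    Per the slide, one HPC task unlocks one GPU, so cores + GPUs must
    fit within the task count. The GPU cap of half the tasks is
    inferred from the slide's examples (8 tasks -> 4 GPUs max,
    32 tasks -> 16 GPUs max).
    """
    return cpu_cores + gpus <= hpc_tasks and gpus <= hpc_tasks // 2

# Examples from the slide: 1 HPC Pack = 8 tasks, 2 Packs = 32 tasks.
assert is_valid_config(6, 2, 8)
assert is_valid_config(4, 4, 8)
assert is_valid_config(24, 8, 32)
assert not is_valid_config(2, 6, 8)  # 6 GPUs exceeds the 4-GPU cap
```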
ANSYS Strategic Focus
Scalable and Cost-Effective IT Evolution
• Scale-up of high performance computing
  – New/emerging architectures
  – More and bigger workloads
• Common integrated tools and platform for IT efficiency
• Efficient centralized infrastructure (or cloud) with effective remote access
• Collaboration hubs with secure and scalable data access
• Support for mobile platforms
Flexible Deployment
ANSYS Architecture for HPC
[Diagram: ANSYS Workbench user environment/graphics on workstations or thin clients, with job submission to a multi-core/multi-node compute cluster backed by graphics servers and file servers]
High-performance computing
• ANSYS Remote Solve Manager: Workbench-based job submission with full portfolio support for Platform LSF, PBS Pro, and Microsoft Job Scheduler
• Bundled third-party message-passing software with optimized performance (Intel MPI, Platform MPI) on GigE, 10GigE, or InfiniBand cluster fabric
ANSYS Strategic Focus
Scalable and Cost-Effective IT Evolution
• Scale-up of high performance computing
  – New/emerging architectures
  – More and bigger workloads
• Common integrated tools and platform for IT efficiency
• Efficient centralized infrastructure (or cloud) with effective remote access
• Collaboration hubs with secure and scalable data access
• Support for mobile platforms
Engineering IT is Evolving to Centralized HPC Resources, with Users Remote
[Diagram: HPC users connected to compute centers]
Customer initiatives
• Datacenter consolidation
• Private / public cloud
• Globally connected R&D
• IP protection & leverage
• Virtual desktop
Enabling simulation practices
• Data management
  – Data remains in the datacenter
  – Eliminate file-transfer bottlenecks
  – Data centralization for collaboration
• Mobility and remote access
  – Beyond batch processing!
• Centralized flexible licensing
Effective global collaboration on consolidated infrastructure
Optimizing the remote simulation workflow
Current Trends / ANSYS Strategy
Spectrum of Private, Hosted and Public Cloud
• Hardware: private cloud or hosted HPC (with data security)
• Software & user environment: seamless usability and remote access
• Business: evolve the traditional license model to add "usage" flexibility
Graphic courtesy of IBM
ANSYS High-Performance Computing
Roadmap for Remote/Mobile Access
[Diagram: remote/mobile users reach an enterprise datacenter or cloud through an ANSYS portal (web/mobile UI) and browser access; work-in-progress data stays in the datacenter, graphics servers run interactive jobs over remote display, and the compute cluster runs batch jobs]
• Mobility & remote access (today): remote execution; interactive remote visualization; mobile monitoring with detach/attach
• Data management (@ R15.0): data/process management; search & retrieval; lightweight visualization; security & access controls; a management dashboard
• Simulation job management (@ R15.0, coming soon): open interface to schedulers & resource managers; job management; job controls
ANSYS Cloud – Partner-Enabled Solutions
Enable customers to "outsource HPC"
• Burst or steady-state extension of in-house capacity
Leverage partnerships with HPC cloud partners
• HPC hosting experts; data security; choice
[Diagram: combined usage (on-premise + cloud). The customer facility holds the on-premise license and data, with thin-client access to WIP data and browser access to a cloud portal (secure account access, data management, job management) fronting graphics servers for interactive jobs and a compute cluster for batch jobs. The cloud side is partner built, partner owned and operated, with a cloud license.]
SW sold and supported by ANSYS: traditional lease/perpetual or (coming) usage-based
ANSYS Cloud Customer Experience – Gompute
Remote simulation on the cloud:
• Secure VPN connection
• Portal tools / data storage
• Remote display
• HPC job submission
ANSYS Cloud Customer Experience – Gompute
• Set up an ANSYS Cloud account / evaluate
• Run sample ANSYS workloads
• Accept HW terms and conditions
ANSYS Cloud Customer Experience – CADFEM Germany
• Secure VPN connection
• Remote display: start the RealVNC Viewer on your machine and log in with your hostname
• In the Windows 7 environment, start your ANSYS application
• HPC job submission
ANSYS Cloud
ANSYS is Building out a Network of Partners
Current partner listing‡
‡ Partial list; partners will continue to be added over time. Some entries are pending / target partners.
HPC & Cloud
ANSYS Strategy
[Chart: enterprise requirements/practices mapped across the infrastructure/cloud spectrum (desktop; private cloud / enterprise HPC; hosted HPC cloud; public cloud): enterprise SW portfolio management; mobility & remote access; global collaboration, monitoring & control; work-in-progress data management; job scheduling & resource management; HPC scale-up & HW optimization]
ANSYS Collaboration Platform
• Performance at 1000s of cores
• Integrated job and data management
• Mobile / web access
• Optimized remote graphics
• Open support for IT tools
ANSYS Cloud Partners
• Hosting service experts
• True HPC cloud
• Full remote simulation experience
• Flexible business evolution
Thank You!
• Connect with me – [email protected]
• Connect with ANSYS, Inc.
  – LinkedIn: ANSYSInc
  – Twitter: @ANSYS_Inc
  – Facebook: ANSYSInc
• Follow our blog – ansys-blog.com