Upload
others
View
1
Download
0
Embed Size (px)
Citation preview
Lawrence Livermore National Laboratory
Chemical Kinetics Research on HCCI & Diesel Fuels and Computationally Efficient Modeling of High-Efficiency Clean
Combustion Engines
Dan Flowers, Bill Pitz, Marco Mehl, Mani Sarathy, Charlie Westbrook, Salvador Aceves, Nick Killingsworth, Matt McNenly, Tom Piggott, Mark Havstad, Russell Whitesides
DEER ConferenceSeptember 27, 2010 – Detroit, MI
Sponsor: VTP – Team Leaders Gurpreet Singh and Kevin StorkThis work performed under the auspices of the U.S. Department of Energy by
Lawrence Livermore National Laboratory under Contract DE-AC52-07NA27344
2LLNL-PRES- 427539 DEER 2010
Lawrence Livermore National Laboratory
Fuel Surrogate Palette for Diesel
n-alkanebranched alkanecycloalkanesaromaticsothers
butylcyclohexanedecalin
hepta-methyl-nonanen-decyl-benzenealpha-methyl-naphthalene
n-dodecanen-tridecanen-tetradecanen-pentadecanen-hexadecane
tetralin
New diesel components this year
New component last year
New component this year
We have developed mechanisms for complex long-chain species, enabling more representative diesel surrogates
3LLNL-PRES- 427539 DEER 2010
Lawrence Livermore National Laboratory
We have developed 2-methyl alkane mechanisms up to C20; branched iso-alkanes are significant components in gasoline and diesel fuels
Includes all 2-methyl alkanes up to C20 which covers the entire distillation range for gasoline and diesel fuels
7,900 species
27,000 reactions
Built with the same reaction rate rules as our successful iso-octane and iso-cetane mechanisms.
4LLNL-PRES- 427539 DEER 2010
Lawrence Livermore National Laboratory
1.0E-05
1.0E-04
1.0E-03
1.0E-02
1.0E-01
0.7 0.9 1.1 1.3 1.5 1.7
Igni
tion
Del
ay T
ime
[s]
1000/T [K]
nC8H18nC9H20nC10H22nC11H24nC12H26nC13H28nc14H30nC15H31nC16H34
13 bar
Stoichiometric fuel/air mixtures
Westbrook et al. Comb. Flame 2009
Our previous work showed that ignition characteristics of normal alkanes depend little on carbon chain length
Temperature Increases
Gre
ater
Rea
ctiv
ity
Low Temperature Heat Release
Negative Temperature Coefficient
High Temperature Heat Release
5LLNL-PRES- 427539 DEER 2010
Lawrence Livermore National Laboratory
However, 2-methyl alkanes show more reactivity with chain length at higher equivalence ratio and pressure:
100
1000
10000
0.90 1.00 1.10 1.20 1.30 1.40 1.50 1.60
Igni
tion
dela
y (u
s)
1000/T (1/K)
φ=3, P=21 atm, 1.355% fuel, O2/Ar
c10h22-2 - sarathy c12h26-2
c13h28-2 c14h30-2
c15h32-2 c16h34-2
c18h38-2 c20h42-2
Reactivity increases with chain length
C20
C10
Biggest fuel effect of chain length found in NTC region
6LLNL-PRES- 427539 DEER 2010
Lawrence Livermore National Laboratory
100
1000
10000
0.90 1.00 1.10 1.20 1.30 1.40 1.50 1.60
Igni
tion
dela
y (u
s)
1000/T (1/K)
(φ=3, P=21 atm, 1.355% fuel, O2/Ar)
nc12h26 c12h26-2
2-methylalkane less reactive than n-alkane
Less NTC behavior in 2-methylalkane
In general, 2-methylalkanes show less negative temperature coefficient (NTC) behavior than n-alkanes
7LLNL-PRES- 427539 DEER 2010
Lawrence Livermore National Laboratory
At rich conditions, n-alkanes show chain-length reactivity sensitivity
100
1000
10000
0.90 1.00 1.10 1.20 1.30 1.40 1.50 1.60
Igni
tion
dela
y (u
s)
1000/T (1/K)
(φ=3, P=21 atm, 1.355% fuel, O2/Ar)
nc10h22 - westbrook c10h22-2 - sarathync12h26 c12h26-2nc13h28 c13h28-2nc14h30 c14h30-2nc15h32 c15h32-2nc16h35 c16h34-2c18h38-2 c20h42-2
Reactivity increases with chain length
n-alkanes
2-methyl alkanes
Effect of chain length for n-alkanes now observed at Diesel Ignition conditions
C20
C10
8LLNL-PRES- 427539 DEER 2010
Lawrence Livermore National Laboratory
As mechanisms continue to grow in sophistication, the application of these mechanisms becomes more computationally intensive
Includes all 2-methyl alkanes up to C20 which covers the entire distillation range for gasoline and diesel fuels
7,900 species
27,000 reactions
Built with the same reaction rate rules as our successful iso-octane and iso-cetane mechanisms.
With full (non-sparse) solvers, numerical cost scales with the (# of Species)3
9LLNL-PRES- 427539 DEER 2010
Lawrence Livermore National Laboratory
Current state-of-the-art mechanism 7900 species needed for the elementary reaction mechanism for 2-
methyl alkanes (Sarathy et al. 2010) 2M-10M fluid cells for full 3D cylinder model (2010, CERFACS/IFP)
Fully-coupled computation cost: 42,000 Pflop If the fastest computer in the world was ideally utilized:
• 30 hrs on Jaguar (ORNL) – 240,000 core 1.76 Pflop/s(Number 1 on Top500 supercomputer list)
60 years on a workstation – six core AMD Opteron 100 Gflop/s
Solving detailed chemical kinetics combined with 3D fluid dynamics is computationally (and financially) expensive!
Combustion chemistry computational cost is the biggest barrier to complete physics-based simulation suitable for engine design.
10LLNL-PRES- 427539 DEER 2010
Lawrence Livermore National Laboratory
Spec
ies J
Reac
tor M
Species K
Reactor N108
10-8
KIVA-ANN
1. Lower-cost models:• Reduced Mechanisms• ANN Ignition Integral
2. New computing architecture
3. Advanced numerics
NVIDIA GTX480
We seek to bring the most physically accurate combustion models for the lowest cost
11LLNL-PRES- 427539 DEER 2010
Lawrence Livermore National Laboratory
Opportunities for 1000x speedup in computational chemistry cost through applied mathematics
Multi-zone ODE
system
Eigen-structure Analysis
Mode Splitting
Pre-conditioners
Sparse Solvers
JacobianReuse
Adaptive Sampling
Perturbation methods
New Computer architectures can further accelerate these gains
12LLNL-PRES- 427539 DEER 2010
Lawrence Livermore National Laboratory
Spec
ies J
Reac
tor M
Species K
Reactor N108
10-8
KIVA-ANN
1. Low-cost models:• Mechanism Reduction• ANN Ignition Integral
2. New computing architecture
3. Advanced numerics
NVIDIA GTX480
We seek to bring the most physically accurate combustion models for the lowest cost
13LLNL-PRES- 427539 DEER 2010
Lawrence Livermore National Laboratory
Originally used for graphics intensive applications:
- video games
CPU: Intel/AMDCores: up to 12Mem: up to 256GBTflop/s: up to 0.13Price: up to $1300
GPU: NVIDIA GTX 480Cores: 480Mem: 1.5 GBTflop/s: 1.35 Price: $500
General Purpose Graphical Processing Units (GPGPUs) bring Tflop/s computing power to the desktop
14LLNL-PRES- 427539 DEER 2010
Lawrence Livermore National Laboratory
GPUs are now programmed with a simple extension to the C language:• New Fermi line offers full C++ support. • NVIDIA currently provides free compilers, debuggers and code profilers
for all platforms (Linux, Mac and Windows).• 3rd party wrappers for most languages (Python, FORTRAN, etc.).
Best algorithms have high arithmetic intensity (i.e. many mathematical operations per memory access):• Researchers performing N-body simulations were early adopters
(molecular dynamics and astrophysics).• Routinely reached +100x speedup.
Computational science on the GPUs was in the news recently:• Georgia Tech Research Institute used GPUs to crack passwords.• Recommend 12-character random passwords to beat today’s GPUs.
NVIDIA’s Compute Unified Device Architecture (CUDA) has made computational science on the GPU viable
15LLNL-PRES- 427539 DEER 2010
Lawrence Livermore National Laboratory
• Images from NVIDIA’s CUDA C Programming Guide Version 3.1, 2010.
Access Type (clock cycles)
Memory AvailabilityEach function (kernel) executes N times for N threads organized in a compute grid of independent thread blocks:
shared (1 - 16)
register (0 - 24)
local (100)
global (100 - 1600)
texture (<5 - 1600)
constant (0 - 24) read-only
Optimal GPU algorithms are designed to exploit the fast shared memory
GPU architecture and memory controllers require fine-scale parallelism for best performance
16LLNL-PRES- 427539 DEER 2010
Lawrence Livermore National Laboratory
Evaluation of thermodynamic properties highlights the algorithm design considerations for the GPU
Low Temperature Polynomial (T<1000K)High Temperature Polynomial (T>1000K)
(Fluid Mechanics Resolution)*(Chemical Kinetics Resolution)
Spee
dup
Rel
ativ
e to
1 C
PU
17LLNL-PRES- 427539 DEER 2010
Lawrence Livermore National Laboratory
Conventional CPU coding• No shared memory• Peak speedup limited to
less than 30x
GPU specific coding• Load ai in shared memory• Peak speedup 55x with no
size limit on performance• Load strategies and variable
reuse have minor impact
Evaluation of thermodynamic properties highlights the algorithm design considerations for the GPU
Further optimization• 65x speedup• Every thread calculates both
temperature branches.• Only assignment is conditional.• No thread divergence.
18LLNL-PRES- 427539 DEER 2010
Lawrence Livermore National Laboratory
• More threads per block: produces greater shared memory reuse between cooperating threads.
• Fewer threads per block: allows more independent blocks to be placed on a multiprocessor effectively hiding memory latency.
Peak GPU performance is a balance between shared memory reuse and multiprocessor occupancy
19LLNL-PRES- 427539 DEER 2010
Lawrence Livermore National Laboratory
Specific heat
Enthapy
Entropy
Enthalpy and entropy calculations have greater speedup, benefiting from more arithmetic operations per memory access
20LLNL-PRES- 427539 DEER 2010
Lawrence Livermore National Laboratory
GPUs are a growth area for bringing scientific computing capability to the desktop
64 bit “no-fun” GPU’s being developed for scientific computing
Democratizing architecture for large-scale computing:• Enable greater physics for engineering design
Puts greater emphasis on programming and numerical methods for effective utilization
21LLNL-PRES- 427539 DEER 2010
Lawrence Livermore National Laboratory
Spec
ies J
Reac
tor M
Species K
Reactor N108
10-8
KIVA-ANN
1. Low-cost models:• Mechanism Reduction• ANN Ignition Integral
2. New computing architecture
3. Advanced numerics
NVIDIA GTX480
We seek to bring the most physically accurate combustion models for the lowest cost
22LLNL-PRES- 427539 DEER 2010
Lawrence Livermore National Laboratory
).,,,(
),,,(
),,,(
1
122
111
NNN
N
N
xxtft
x
xxtft
x
xxtftx
=∂∂
=∂∂
=∂∂
Explicit Update(lower cpu/step)
Implicit Update(more trajectory data)
=
∂∂
∂∂
∂∂
∂∂
∂∂
∂∂
∂∂
∂∂
∂∂
N
NNN
N
N
xf
xf
xf
xf
xf
xf
xf
xf
xf
J
21
2
2
2
1
2
1
2
1
1
1
.
During ignition:∆t (explicit) = 10-12 to 10-15 s∆t (implicit) =10-6 to 10-8 s
Explicittimestep
Spec
ies
com
posi
tion
Time
Implicit timestep
Implicit methods are necessary to integrate the chemical time scales over an engine cycle
23LLNL-PRES- 427539 DEER 2010
Lawrence Livermore National Laboratory
Block Diagonal JacobianMultizone 10x-100x faster
Sparse Solver w/o 3rd bodySingle reacting zone +6x faster
Jacobian construction/solution is more than 95% of the simulation cost – a big speedup is possible with smart solvers
24LLNL-PRES- 427539 DEER 2010
Lawrence Livermore National Laboratory
.
Generalized Minimal RESiduals GMRES Error
Eigenvalue Spectra (200 x 200)A1: fast convergence A2: slow convergence
D
r
Approximate Jacobians can be used to precondition iterative linear system solvers like GMRES
25LLNL-PRES- 427539 DEER 2010
Lawrence Livermore National Laboratory
857-species Λ = 0.992 iter = 42 (GMRES)Λ = 0.874 iter = 31 (GMRES)Λ = 0.252 iter = 9 (GMRES)Λ = 0.124 iter = 8 (GMRES)Λ = 0.018 iter = 3 (GMRES)
Direct reaction sorting shows promise to be a general low-cost preconditioner for the Jacobian
26LLNL-PRES- 427539 DEER 2010
Lawrence Livermore National Laboratory
Chemical mechanisms continue to evolve, adding more a more compounds that are present in real fuels
Develop detailed chemical kineticmodels for another series iso-alkanes:3-methyl alkanes
Validation of 2-methyl alkanes mechanism with new data from shock tubes, jet-stirred reactors, and counterflow flames
Develop detailed chemical kinetic models for alkyl aromatics:
More accurate surrogates for gasoline and diesel Further develop mechanism reduction using functional group
methodn-decylbenzene - Diesel Fuels
27LLNL-PRES- 427539 DEER 2010
Lawrence Livermore National Laboratory
Spec
ies J
Reac
tor M
Species K
Reactor N108
10-8
KIVA-ANN
1. Low-cost models 2. Improving
physical models
Heat
rele
ase r
ate
3. New computing architecture 4. Advanced
numerics
NVIDIA GTX480 image: http://www.nvidia.com/object/product_geforce_gtx_480_us.html
KIVA4+ANN andspray models
LTC sensitivity and gaseous injection
GPU compute tiles for efficient species production rates
Combine reaction sort with 2x2 sort
Continued improvement of physical models and numerical methods will enable utilization of large mechanisms