High-Performance Oracle Platforms
DOAG Jahreskonferenz
November 2013
copyright © 2013 by benchware.ch slide 2
1 Introduction
2 Benchmark Results – CPU Performance
3 Benchmark Results – In-Memory Server Performance
4 Benchmark Results – Storage Performance
5 Benchmark Results – Database Performance
Contents
Introduction
Testing an architecture - Calibrate Oracle platform
- Validate performance expectations BEFORE going into production
Health check - Discover performance bottlenecks
Platform evaluation - Challenge vendors' marketing messages
- Price performance ratio
Capacity planning
Why benchmarks?
Introduction
Benchmark as a Service - Results within one week
Licenses - Project license
- Enterprise license
Discovery Workshop - Technology overview, 1 day
- Audience: CTOs, system architects, technical project managers
Benchware Services and Licenses
Introduction
File System Volume Manager
Database System
Storage System
Operating System Virtualization
Server
Performance of complex Oracle platforms is not predictable
Networks (Application, Backup, Interconnect, …)
Storage Network
Oracle Database
Different versions, patches and options, about one hundred configuration parameters.
Server & Operating System
Different server systems, processors and CPU architectures (x86, IA-64, UltraSparc, SPARC64, Power), #cores, multithreading, main memory, bus architecture. Different operating systems and patches, over one hundred configuration parameters, virtualization of resources.
Volume & File Management
Different volume managers (VxVM, ASM) and file systems (UFS, VxFS, ext3, JFS, ZFS, raw devices), different I/O methods (async, direct), a lot of configuration parameters (#LUNS, queue depth, max i/o unit), software striping and/or mirroring, multipathing.
Storage System
Different storage systems, storage tiers and storage technology: spindle count and speed, RAID management, cache management, server interface technology, storage system options like remote copy, hardware striping and/or mirroring, virtualization of resources.
Storage Network (FC-, IB- or IP-based)
Bandwidth, latency during remote storage mirroring (sync, async) due to switches, hubs and distance.
Application Networks (IP-based)
Bandwidth, latency during remote database mirroring (sync, async) due to switches and sql*net and tcp/ip stack (frame size, etc.).
System Management, Operations, Security, Resource Management
Middleware
Application
System architects have a wide choice of components, technologies and configurations
Introduction
No benchmark results available for YOUR platform
No performance metrics for cpu, server and storage
Strict rules for setup - Sizing
- Usable features
Unrealistic hardware configurations
Only one load profile: complete CPU saturation
Impractical performance metrics - TPC-H: QphH@Size (queries per hour at a given database size)
Michael Stonebraker, Keynote at TPC Technology Conference 2009: “In short, TPC has become vendor-dominated, and it is time for TPC to reinvent itself to serve its customer community”
Useless TPC and SAP Benchmarks
Introduction
Time Consuming – Complicated – Expensive - Data masking
- Large OLTP user populations
- Interfaces (SOA, ESB)
- Consolidation of database servers
Application as load generator - Requires scalable application to utilize new hardware
- Sometimes leads to incorrect conclusions about the price/performance ratio of new hardware
- PoC provides a snapshot result - any change of application code or application data may change PoC result
Issues with Proof-of-Concepts
Introduction
Source: www.bmw.de
Key performance metrics of Oracle platforms should be as self explanatory as key performance metrics used in the automobile industry
Introduction
Benchware Performance Suite
- Benchware Loader
- Benchware Monitor
Performance measurement at the interface between application and Oracle database platform
Key Performance Metrics can be used for SLA between IT operation and business
Benchware uses Oracle database software to generate all kinds of loads for cpu, server, storage and database
File System Volume Manager
Database System
Storage System
Operating System Virtualization
Server
Complex architecture of Oracle platforms requires benchmarking
DataGuard Network
Fusion I/O Interconnect
Storage Network
Benchware Monitor
Benchware Loader
Library of Oracle benchmark tests
Oracle Server Performance: Server-bound SQL database transactions on in-memory data objects, no I/O operations
- Tests: in-memory SQL; in-memory pl/sql algorithms (quicksort)
- Proof of Efficiency: scalability, virtualization, cc-numa
- Key Performance Metrics: speed, throughput, service time
- Units: [µs], [s], [bgps], [tps], [rps], [MBps]
Oracle CPU Performance: CPU-bound operations with typical Oracle data types
- Tests: pl/sql basic operations (basic numeric operations, built-in functions); pl/sql algorithms (fibonacci, prime numbers)
- Proof of Efficiency: multithreading, virtualization, encryption
- Key Performance Metrics: speed, throughput
- Units: [s], [ops]
Rating scale for OLTP systems and DWH systems: less important, important, very important
[s] seconds [ms] milliseconds (10^-3) [µs] microseconds (10^-6) [ns] nanoseconds (10^-9)
[bps] buffer gets per second [rps] rows per second [tps] transactions per second [ops] operations per second
[MBps] megabytes per second [GBps] gigabytes per second [iops] i/o operations per second [qpm] queries per minute
Library of Oracle benchmark tests
Oracle Database Performance: Most important database operations with mixed resource usage: CPU, memory, storage
- Tests: data load (uncompressed, compressed); data scan; data aggregation & reports; OLTP transactions (insert, select, update)
- Proof of Efficiency: scalability
- Key Performance Metrics: speed, throughput, service time
- Units: [ms], [s], [rps], [tps], [qpm]
Oracle Storage Performance: I/O-bound operations for all typical Oracle I/O access patterns
- Tests: sequential I/O (1 MByte, read and write); random I/O (8 kByte, read and write)
- Proof of Efficiency: data integrity; tiering, pooling; striping; replication
- Key Performance Metrics: throughput, service time
- Units: [ms], [MBps], [GBps], [IOPS]
Introduction
Public benchmark results: www.benchware.ch/benchmarks
CPU Performance
CPU performance has a huge impact on - Oracle license (core factor) and maintenance cost - even with Unlimited License Agreement (ULA)
- performance of most database operations
- compute intensive algorithms
Why measure CPU performance?
CPU Performance
CPU Architecture
CPU                                  SPARC T5   E7-8870 Westmere   E5-2690 Sandy Bridge   POWER 7
Launch Date                          2013       2011               2012                   2012
Clock rate [GHz]                     3.6        2.4 – 2.8          2.9 – 3.8              4.0
#cores per socket                    16         10                 8                      8
Multithreading                       8-fold     2-fold             2-fold                 4-fold
Performance numbers from other benchmarks
SPECint_base2006 (speed)             -          36.4               55.4                   29.3
Oracle CPU speed in sys.aux_stats$   1’407      3’074              2’605                  1’513
Note: the SPECint_base2006 value for POWER 7 is not available; the result of a similar CPU (P7 4.14 GHz) is shown.
Remark:
Oracle keeps an internal estimate of CPU speed in sys.aux_stats$, but no estimate of CPU throughput. The Oracle speed estimate correlates neither with SPECint_base2006 numbers nor with Benchware performance results in Oracle 11g Release 2.
CPU Performance
Server Configuration
Server                               SPARC T5-4   E7-8870 Westmere   E5-2690 Sandy Bridge   POWER 7
#sockets                             4            4                  2                      4
#cores                               64           40                 16                     32
#threads                             512          80                 32                     128
Performance numbers from other benchmarks
SPECint_rate_base2006 (throughput)   ~ 1’745      ~ 1’000            668                    635
Notes: the SPARC T5-4 value is not available, the T5-8 result divided by 2 is shown; the POWER 7 value is not available, the result of a similar CPU (P7 4.14 GHz) is shown.
CPU Performance
Oracle Licensing
Oracle Enterprise Edition       SPARC T5-4   E7-8870 Westmere   E5-2690 Sandy Bridge   POWER 7
Total number of cores           64           40                 16                     32
Oracle core license factor      x 0.5        x 0.5              x 0.5                  x 1.0
Oracle license cost (list price 25th of September 2013)
Enterprise Edition (47’500)     1’520’000    950’000            380’000                1’520’000
Partition Option (11’500)       368’000      230’000            92’000                 368’000
Diagnostic Pack (5’000)         160’000      100’000            40’000                 160’000
Tuning Pack (5’000)             160’000      100’000            40’000                 160’000
Total Oracle license cost       2’208’000    1’380’000          552’000                2’208’000
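The license totals on this slide follow from cores x core factor x list price. A minimal sketch of that arithmetic, assuming the number of licenses is simply cores times core factor (the function and variable names are illustrative, not part of Benchware):

```python
# List prices per license as quoted on the slide (25th of September 2013).
LIST_PRICE = {
    "Enterprise Edition": 47_500,
    "Partition Option": 11_500,
    "Diagnostic Pack": 5_000,
    "Tuning Pack": 5_000,
}

# (total number of cores, Oracle core license factor) per server, from the slide.
SERVERS = {
    "SPARC T5-4": (64, 0.5),
    "E7-8870 Westmere": (40, 0.5),
    "E5-2690 Sandy Bridge": (16, 0.5),
    "POWER 7": (32, 1.0),
}


def license_cost(cores, core_factor):
    """Cost per product, assuming licenses = cores * core_factor."""
    licenses = cores * core_factor
    return {product: int(licenses * price) for product, price in LIST_PRICE.items()}


def total_license_cost(cores, core_factor):
    """Sum over all four products listed on the slide."""
    return sum(license_cost(cores, core_factor).values())


if __name__ == "__main__":
    for name, (cores, factor) in SERVERS.items():
        print(f"{name:22s} {total_license_cost(cores, factor):>10,d}")
```

With these inputs the totals match the "Total Oracle license cost" row, which shows why the core factor of 1.0 makes the 32-core POWER 7 as expensive to license as the 64-core SPARC T5-4.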
CPU Performance
Figure: Oracle CPU performance: arithmetic ADD, data type SIMPLE_INTEGER
X axis: number of processes (1, 2, 4, 8, 16, 32, 64, 128); Y axis: throughput in [Mops] (0 to 18'000)
Series: E5-2690 2.9, P7 4.0, E7-8870 2.4, T5-4 3.6
- 16 cores, 32 threads: 536 Mops per core
- 32 cores, 128 threads: 354 Mops per core
- 40 cores, 80 threads: 352 Mops per core
- 64 cores, 512 threads: 252 Mops per core
Single thread speed: E5 360 kops, E7 360 kops, P7 200 kops, T5 200 kops
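The per-core figures can be turned back into whole-server throughput by multiplying by the core count. A small sketch, assuming the chart annotations map to servers by their core counts as given on the server configuration slide (the names below are illustrative):

```python
# (cores, Mops per core) from the chart annotations; the server names are
# assigned by matching core counts with the server configuration slide.
PER_CORE = {
    "E5-2690": (16, 536),
    "P7": (32, 354),
    "E7-8870": (40, 352),
    "T5-4": (64, 252),
}


def total_mops(cores, mops_per_core):
    """Whole-server throughput in Mops."""
    return cores * mops_per_core


def scaling_vs(reference, server):
    """Throughput of `server` relative to `reference`."""
    return total_mops(*PER_CORE[server]) / total_mops(*PER_CORE[reference])


if __name__ == "__main__":
    for name, (cores, per_core) in PER_CORE.items():
        print(f"{name:8s} {total_mops(cores, per_core):>7,d} Mops")
    print(f"E7-8870 vs E5-2690: x{scaling_vs('E5-2690', 'E7-8870'):.2f}")
```

These derived totals stay within the 0 to 18'000 Mops range of the chart axis; they are computed here, not read from the original figure.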
CPU Performance
Figure: Oracle CPU performance: calculation of fibonacci numbers (recursive), single process
X axis: n = 39, n = 40, n = 41, n = 42; Y axis: speed in [sec] (0 to 140)
Series: E5-2690 2.9, P7 4.0, E7-8870 2.4, T5 3.6
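To illustrate the shape of this workload, here is a naive recursive Fibonacci with a timing wrapper. This is only a sketch in Python; the benchmark on the slide runs inside the Oracle database, and the function names and the small values of n used here are assumptions for illustration.

```python
import time


def fib(n):
    """Naive doubly recursive Fibonacci; run time grows exponentially with n."""
    return n if n < 2 else fib(n - 1) + fib(n - 2)


def time_fib(n):
    """Return (fib(n), elapsed seconds) for a single-process run."""
    start = time.perf_counter()
    value = fib(n)
    return value, time.perf_counter() - start


if __name__ == "__main__":
    # Small n keeps the run short; the slide uses n = 39 to 42.
    for n in (20, 22, 24):
        value, elapsed = time_fib(n)
        print(f"n = {n}: fib = {value}, {elapsed:.3f} s")
```

Because the recursion has no shared state and no I/O, elapsed time reflects single-thread CPU speed only, which is why such a test is suited to comparing single-process performance.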
CPU Performance
Performance of P7 does not justify high Oracle license cost
SPARC seems to gain on performance at least for throughput oriented applications
8 socket Intel E7 does not scale well, only factor 2 compared to Intel E5
2 socket Intel E5 provides fastest single thread performance but limited scalability
Reviewing CPU Performance
Server Performance
Applications tend to operate in memory as much as possible to avoid slow I/O operations - Some vendors build complete concepts on this idea, e.g. SAP HANA
Memory capacity of servers has become cheap
List price for 1 TByte memory: - x86 server: ~ 25’000 USD for 16 GByte DIMM
- x86 server: ~ 60’000 USD for 32 GByte DIMM
- Risc server: ~ 55’000 USD for 16 GByte DIMM
Why measure Server Performance?
Remarks:
Currently (September 2013) commercial systems may have the following RAM capacities:
• based on Intel x86 2 TByte RAM
• based on Intel Itanium 8 TByte RAM
• based on IBM Power 16 TByte RAM
• based on Sun SPARC 32 TByte RAM
Server Performance
Oracle recognized this trend and provides specific features for in-memory processing - Different Cache types for object pinning
- Parallel SQL even for large in-memory objects
- New 12c Release 2 In-Memory Option
These tests are useful to determine performance capabilities of 2 socket server (Oracle SE versus Oracle EE)
Why measure Server Performance?
Server Performance
Figure: Oracle in-memory SQL performance: full table scan, row store
X axis: number of processes (1, 2, 4, 8, 16, 32, 64, 128); Y axis: throughput in [Mrps] (0 to 160)
Series: E5-2690 2.9, P7 4.0, E7-8870 2.4, T5-4 3.6
- 16 cores, 32 threads: 4.3 Mrps per core
- 32 cores, 128 threads: 1.7 Mrps per core
- 40 cores, 80 threads: 3.2 Mrps per core
- 64 cores, 512 threads: 2.1 Mrps per core
SPARC T5-4 in-memory net data scan rate: ~ 39 GBps, 138’900’000 rps
Intel E5-2690 in-memory net data scan rate: ~ 20 GBps, 70’000’000 rps
Intel E7-8870 in-memory net data scan rate: ~ 37 GBps, 131’800’000 rps
Server Performance
Figure: Oracle in-memory SQL performance: primary key access, 1 row per transaction
X axis: number of processes (1, 2, 4, 8, 16, 32, 64, 128); Y axis: throughput in [tps] (0 to 1'400'000)
Series: E5-2690 2.9, P7 4.0, E7-8870 2.4, T5 3.6
- 16 cores, 32 threads: 32’000 tps per core
- 32 cores, 128 threads: 27’962 tps per core
- 40 cores, 80 threads: 16’775 tps per core
- 64 cores, 512 threads: 18’293 tps per core
Intel Xeon E5: > 500’000 tps, < 65 μs service time
Intel Xeon E7: > 672’000 tps, < 190 μs service time
SPARC T5-2: ~ 1’200’000 tps, < 110 μs service time
Times Ten: ~ 2 μs service time
Service time labels on the data points (assignment to series not recoverable): 40, 40, 40, 40, 40, 61, 186, 104, 82, 80, 80, 107, 84, 118, 72, 52, 40, 70, 67 μs
Server Performance
Performance of P7 does not justify high Oracle license cost
8 socket Intel E7 does not scale well compared to Intel E5
2 socket Intel E5 provides fastest single thread performance and best service times but limited scalability
SPARC T5 provides high throughput with good service times
Reviewing Server Performance
Storage Performance
Storage performance is essential not only for overall Oracle database performance, but also for system management tasks like backup, recovery and archiving
Oracle uses all I/O patterns, but different OS calls depending on the - operating system
- system load (Oracle changes system call dependent on load)
Why measure Storage performance?
Storage Performance
Oracle sequential read
- User process(es): full table scan, full index scan
- Temp segment
- Backup, restore, recovery RMAN, Export, Data Pump
- ARCH: reading online REDO logfile
Oracle random read
- User process(es)
Oracle sequential write
- Temp segment
- Backup, restore RMAN, Export, Data Pump
- LGWR: small block size
- ARCH: writing archived REDO logfile
- RVWR: flashback log file writer
- CTWR: block change tracking file
Oracle random write
- DBWR process(es)
Why measure Storage performance?
Storage Performance
Server internal PCI attached flash technology
Server System: CPUs, Main Memory, Database Buffer Cache
PCI-based connection
Flash Modules
- mirrored ASM failure groups
- complete DB in flash
- read from flash
- write to flash
Storage Performance
External FC attached flash technology
Server System: CPUs, Main Memory, Database Buffer Cache
FC-based network, IB-based network, PCI-attached
Storage System: Frontend Controller & Cache, Flash Controller & Switched Network, Flash Modules
- conventional storage system
- scalable hardware architecture
- mature software tools
- no changes for IT engineering and IT operation
Storage Performance
Figure: Sequential read, multiple processes
X axis: number of processes (1, 2, 4, 8, 16, 32); Y axis: throughput in [MBps] (0 to 20'000)
Series: PCI Flash, FC Flash 4 nodes
Similar performance for one database server, but FC flash provides more scalability
Storage Performance
Sequential read, multiple processes
CPU CPU Physical Physical Physical Physical Physical Physical REDO Hitrate Hitrate Elap
busy sys read read read write write write write db flash exa flash time
Run Tst Code #N #J #T [%] [%] [iops] [bps] [MBps] [iops] [bps] [MBps] [iops] [%] [%] [s]
---- ---- ------ ---- ----- ---- ---- ---- --------- --------- -------- --------- --------- -------- --------- -------- --------- -----
14 1 STO-12 1 1 8 14 3 4716 601032 4696 2 1 0 0 0 0 272
2 STO-12 1 2 8 17 4 5294 675396 5277 5 5 0 2 0 0 309
3 STO-12 1 4 8 18 5 5401 688771 5381 3 4 0 0 0 0 303
4 STO-12 1 8 8 18 5 5387 687070 5368 3 3 0 0 0 0 324
5 STO-12 1 16 8 19 5 5386 687199 5369 3 2 0 2 0 0 329
Table above: PCI Flash
Legend: #N number of RAC nodes #J number of jobs #T number of threads (PX) [s] elapsed time in seconds [iops] i/o operations per second [bps] blocks per second [MBps] mega byte per second
CPU CPU Physical Physical Physical Physical Physical Physical REDO Hitrate Hitrate Elap
busy sys read read read write write write write db flash exa flash time
Run Tst Code #N #J #T [%] [%] [iops] [bps] [MBps] [iops] [bps] [MBps] [iops] [%] [%] [s]
---- ---- ------ ---- ----- ---- ---- ---- --------- --------- -------- --------- --------- -------- --------- -------- --------- -----
28 6 STO-12 4 4 8 5 1 11468 1459486 11402 7 0 0 0 0 0 286
7 STO-12 4 8 8 7 1 15637 1990867 15554 17 12 0 1 0 0 311
9 STO-12 4 16 8 8 2 17985 2289876 17890 17 12 0 2 0 0 319
13 STO-12 4 32 8 8 2 18622 2371178 18525 19 11 0 2 0 0 333
Table above: FC Flash
Storage Performance
Figure: Random read
X axis: number of processes (1, 2, 4, 8, 16, 32, 64, 128, 2 node, 4 node); Y axis: throughput in [iops] (0 to 900'000)
Series: PCI Flash, FC Flash
Service time labels on the data points (assignment to series not recoverable): 100 μs, 123 μs, 188 μs, 436 μs, 1’668 μs, 841 μs, 567 μs, 552 μs, 564 μs, 418 μs, 388 μs
Storage Performance
Random read
CPU CPU Physical Physical Physical Physical Physical Physical REDO Hitrate Hitrate Elap
busy sys read read read write write write write db flash exa flash time
Run Tst Code #N #J #T [%] [%] [iops] [bps] [MBps] [iops] [bps] [MBps] [iops] [%] [%] [s]
---- ---- ------ ---- ----- ---- ---- ---- --------- --------- -------- --------- --------- -------- --------- -------- --------- -----
15 10 STO-62 1 1 1 3 2 31747 31745 248 11 23 0 2 0 0 51
11 STO-62 1 2 1 6 4 64039 64037 500 59 68 1 5 0 0 51
12 STO-62 1 4 1 12 7 124897 124895 976 164 168 1 10 0 0 52
13 STO-62 1 8 1 23 13 228361 228359 1784 356 351 3 19 0 0 57
14 STO-62 1 16 1 45 27 382520 382518 2988 629 607 5 31 0 0 68
15 STO-62 1 32 1 77 45 506050 506047 3954 854 812 6 41 0 0 103
16 STO-62 1 64 1 95 51 534207 534206 4174 879 832 7 41 0 0 196
17 STO-62 1 128 1 98 51 508957 508962 3976 810 764 6 40 0 0 305
18 STO-62 1 256 1 98 48 508889 508888 3976 793 730 6 61 0 0 310
Table above: PCI Flash
Legend: #N number of RAC nodes #J number of jobs #T number of threads (PX) [s] elapsed time in seconds [iops] i/o operations per second [bps] blocks per second [MBps] mega byte per second
CPU CPU Physical Physical Physical Physical Physical Physical REDO Hitrate Hitrate Elap
busy sys read read read write write write write db flash exa flash time
Run Tst Code #N #J #T [%] [%] [iops] [bps] [MBps] [iops] [bps] [MBps] [iops] [%] [%] [s]
---- ---- ------ ---- ----- ---- ---- ---- --------- --------- -------- --------- --------- -------- --------- -------- --------- -----
5 84 STO-62 1 1 1 0 0 2222 2226 18 15 12 0 1 0 0 305
85 STO-62 1 2 1 1 0 4375 4389 34 19 12 0 2 0 0 305
86 STO-62 1 4 1 1 0 8832 8864 69 15 12 0 1 0 0 306
87 STO-62 1 8 1 1 0 18227 18279 143 16 13 0 1 0 0 306
88 STO-62 1 16 1 1 1 36825 36896 288 17 14 0 1 0 0 307
89 STO-62 1 32 1 3 1 70686 70759 553 21 16 0 2 0 0 308
90 STO-62 1 64 1 4 1 134721 134796 1053 30 20 0 3 0 0 307
91 STO-62 1 128 1 11 7 205312 205392 1605 31 18 0 3 0 0 288
5 27 STO-62 2 256 1 14 8 416264 416434 3254 57 31 0 8 0 0 284
5 44 STO-62 3 384 1 17 8 609607 609860 4765 81 44 0 11 0 0 291
13 11 STO-62 4 736 1 27 13 806051 807746 6311 134 73 1 20 0 0 313
Table above: FC Flash
Storage Performance
PCI flash - fast solution for database servers with performance problems
- database consolidation for small number of databases
- provides fastest I/O service times
- RAC is not supported
FC Flash with Cache - delivers better scalability
- RAC is supported
Reviewing Storage Performance
Database Performance
Projects need understandable key performance metrics for capacity planning - Data load
- Data scan
- Data aggregation
- OLTP transactions
- Time windows for certain operations
Why measure Database performance?
Database Performance
Figure: Database transactional load, single process, different transaction size (LGWR stress test)
X axis: transaction size in rows per transaction [rpt] (0 to 100); Y axis: load rate in [rps] (0 to 25'000)
Series: PCI Flash, FC Flash
PCI Flash:
- can execute up to 4’000 commits per second
- avg service time < 250 μs per SQL insert statement
FC Flash with cache:
- can execute up to 4’762 commits per second
- avg service time < 205 μs per SQL insert statement
Database Performance
Database transactional load, single process, different transaction size
Legend: #N number of RAC nodes #J number of jobs #T number of threads (PX) [rps] rows per second [tps] transactions per second [iops] i/o operations per second [s] time in seconds [ms] time in milli seconds [μs] time in micro seconds
TX CPU Throughput Throughput SQL service Physical Physical Physical REDO REDO REDO REDO REDO Elap
size busy rows/sec txn/sec time write write write size writes svt sync sync svt time
Run Tst Code #N #J #T [rpt] [%] [rps] [tps] [s] [iops] [bps] [MBps] [MBps] [iops] [ms] writes [us] [s]
---- ---- ------ ---- ----- ---- ----- ---- ----------- ----------- ----------- --------- --------- -------- ------ ------- ------ ------ -------- -----
22 1 DBL-11 1 1 1 1 5 4.006E+03 4.006E+03 2.493E-04 4112 670 14 7 4008 15 7 14 312
2 DBL-11 1 1 1 2 5 5.388E+03 2.694E+03 3.704E-04 2831 747 14 7 2694 13 1 87 232
3 DBL-11 1 1 1 4 4 6.720E+03 1.680E+03 5.928E-04 1882 842 15 8 1683 11 1 75 186
4 DBL-11 1 1 1 5 4 7.102E+03 1.420E+03 7.018E-04 1700 878 16 8 1424 14 1 187 176
5 DBL-11 1 1 1 10 4 8.065E+03 8.060E+02 1.239E-03 1185 963 17 9 810 17 1 137 155
6 DBL-11 1 1 1 20 4 8.681E+03 4.340E+02 2.292E-03 879 986 17 9 439 24 2 86 144
7 DBL-11 1 1 1 50 4 9.058E+03 1.810E+02 5.476E-03 679 1022 17 9 184 23 1 178 138
8 DBL-11 1 1 1 100 4 9.259E+03 9.300E+01 1.073E-02 638 1028 17 9 95 29 1 190 135
Table above: PCI Flash
TX CPU Throughput Throughput SQL service Physical Physical Physical REDO REDO REDO REDO REDO Elap
size busy rows/sec txn/sec time write write write size writes svt sync sync svt time
Run Tst Code #N #J #T [rpt] [%] [rps] [tps] [s] [iops] [bps] [MBps] [MBps] [iops] [ms] writes [us] [s]
---- ---- ------ ---- ----- ---- ----- ---- ----------- ----------- ----------- --------- --------- -------- ------ ------- ------ ------ -------- -----
43 1 DBL-11 1 1 1 1 1 4.762E+03 4.762E+03 2.052E-04 6206 640 33 8 2026 119 2 1114 315
2 DBL-11 1 1 1 2 1 7.222E+03 3.611E+03 2.687E-04 5944 847 39 10 1929 86 3 499 225
3 DBL-11 1 1 1 4 1 1.035E+04 2.588E+03 3.678E-04 5394 1171 48 12 1716 62 1 2105 157
4 DBL-11 1 1 1 5 1 1.169E+04 2.338E+03 4.105E-04 5187 1306 53 13 1651 56 1 766 139
5 DBL-11 1 1 1 10 1 1.491E+04 1.491E+03 6.320E-04 4452 1607 63 16 1376 46 1 1415 109
6 DBL-11 1 1 1 20 1 1.747E+04 8.740E+02 1.054E-03 3079 1829 70 18 872 44 2 604 93
7 DBL-11 1 1 1 50 1 2.006E+04 4.010E+02 2.299E-03 1813 2053 78 20 409 52 1 472 81
8 DBL-11 1 1 1 100 1 2.083E+04 2.080E+02 4.378E-03 1167 2157 80 20 214 70 1 515 78
Table above: FC Flash with Cache
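The three throughput columns of these single-process tables are tied together: rows per second is roughly transactions per second times the transaction size, and the service time is roughly the reciprocal of the transaction rate. A short sketch of these relations, checked against a few PCI flash rows (the function names are illustrative, and the relations are an interpretation of the table, not a Benchware formula):

```python
def rows_per_second(tps, rows_per_txn):
    """Load rate in rows/s for a given commit rate and transaction size."""
    return tps * rows_per_txn


def time_per_transaction(tps):
    """Elapsed time per transaction in seconds for a single process."""
    return 1.0 / tps


if __name__ == "__main__":
    # (rpt, measured rps, measured tps, measured service time [s]), PCI flash rows.
    rows = [
        (1, 4.006e3, 4.006e3, 2.493e-4),
        (2, 5.388e3, 2.694e3, 3.704e-4),
        (10, 8.065e3, 8.060e2, 1.239e-3),
        (100, 9.259e3, 9.300e1, 1.073e-2),
    ]
    for rpt, rps, tps, svt in rows:
        print(f"rpt={rpt:3d}  rps model={rows_per_second(tps, rpt):8.0f} (table {rps:8.0f})  "
              f"time model={time_per_transaction(tps):.3e} s (table {svt:.3e} s)")
```

Larger transactions commit less often, so the commit rate drops while the row rate rises, which is the effect the LGWR stress test is designed to expose.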
Database Performance
Figure: Oracle OLTP select performance, 1 row per transaction
X axis: number of processes (1, 2, 4, 8, 16, 32, 64, 128, 256, 512); Y axis: load rate in [tps] (0 to 250'000)
Series: PCI Flash, FC Flash
Service time labels on the data points: 187 μs, 174 μs, 164 μs, 174 μs, 228 μs, 346 μs, 181 μs, 741 μs
PCI flash: ~ 180’000 look up tx, < 350 μs, system is CPU bound
Same test with all data in memory: > 500’000 tps, < 65 μs service time
Database Performance
Figure: Oracle OLTP select performance, 1 row per transaction
X axis: number of processes (1, 2, 4, 8, 16, 32, 64, 128, 256, 512, 2 nodes, 4 nodes); Y axis: load rate in [tps] (0 to 700'000)
Series: PCI Flash, FC Flash
Service time labels on the data points: 346 μs, 2.4 ms, 1.5 ms, 1.5 ms
Database Performance
Oracle OLTP select performance, 1 row per transaction
CPU Throughput Throughput SQL service Physical Physical REDO Hitrate Hitrate Physical Physical Elap
busy rows/sec txn/sec time read write write db flash exa flash read write time
Run Tst Code #N #J #T [%] [rps] [tps] [s] [iops] [iops] [iops] [%] [%] [MBps] [MBps] [s]
---- ---- ------ ---- ----- ---- ---- ----------- ----------- ----------- --------- --------- --------- -------- --------- -------- -------- -----
18 1 DBX-12 1 1 1 3 5.320E+03 5.320E+03 1.871E-04 4033 26 6 0 0 32 0 50
2 DBX-12 1 2 1 5 1.132E+04 1.132E+04 1.737E-04 8608 36 12 0 0 67 0 47
3 DBX-12 1 4 1 9 2.364E+04 2.364E+04 1.637E-04 18854 50 25 0 0 147 0 45
4 DBX-12 1 8 1 17 4.433E+04 4.433E+04 1.741E-04 39954 70 46 0 0 312 0 48
5 DBX-12 1 16 1 33 8.512E+04 8.512E+04 1.811E-04 82951 116 89 0 0 648 1 50
6 DBX-12 1 32 1 62 1.351E+05 1.351E+05 2.278E-04 137238 164 140 0 0 1072 1 63
7 DBX-12 1 64 1 92 1.811E+05 1.811E+05 3.456E-04 187886 205 187 0 0 1468 1 94
8 DBX-12 1 128 1 99 1.677E+05 1.677E+05 7.412E-04 197388 190 175 0 0 1542 0 203
Legend: #N number of RAC nodes #J number of jobs #T number of threads (PX) [rps] rows per second [tps] transactions per second [iops] i/o operations per second [s] time in seconds
Table above: PCI Flash
CPU CPU Throughput Throughput SQL service Physical Physical REDO Hitrate Hitrate Physical Physical Elap
busy sys rows/sec txn/sec time read write write db flash exa flash read write time
Run Tst Code #N #J #T [%] [%] [rps] [tps] [s] [iops] [iops] [iops] [%] [%] [MBps] [MBps] [s]
---- ---- ------ ---- ----- ---- ---- ---- ----------- ----------- ----------- --------- --------- --------- -------- --------- -------- -------- -----
7 1 DBX-12 1 1 1 0 0 8.050E+02 8.050E+02 1.194E-03 1652 50 0 0 0 13 1 82
2 DBX-12 1 2 1 1 0 1.535E+03 1.535E+03 1.224E-03 3101 63 1 0 0 24 1 86
3 DBX-12 1 4 1 1 0 3.070E+03 3.070E+03 1.228E-03 6071 65 1 0 0 48 1 86
4 DBX-12 1 8 1 1 0 6.212E+03 6.212E+03 1.185E-03 11982 55 1 0 0 94 1 85
5 DBX-12 1 16 1 2 1 1.427E+04 1.427E+04 1.051E-03 26563 78 2 0 0 208 1 74
6 DBX-12 1 32 1 4 1 3.352E+04 3.352E+04 8.679E-04 59094 86 12 0 0 462 1 63
7 DBX-12 1 64 1 6 2 6.925E+04 6.925E+04 8.052E-04 111404 110 8 0 0 871 1 61
8 DBX-12 1 128 1 10 3 1.320E+05 1.320E+05 8.139E-04 186584 123 13 0 0 1458 1 64
9 DBX-12 1 256 1 19 9 1.690E+05 1.690E+05 1.280E-03 208241 96 17 0 0 1627 1 100
10 DBX-12 1 512 1 21 7 2.209E+05 2.209E+05 1.884E-03 246663 109 20 0 0 1927 1 153
20 DBX-12 2 1024 1 38 11 4.305E+05 4.305E+05 1.912E-03 480650 173 49 0 0 3755 1 157
30 DBX-12 4 2048 1 41 9 5.928E+05 5.928E+05 2.814E-03 662053 241 68 0 0 5172 1 228
Table above: FC Flash
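Throughput and service time in these tables are linked by the number of concurrent jobs. A minimal sketch, assuming a closed loop in which each job issues its next transaction as soon as the previous one returns, so that throughput is approximately jobs divided by service time (the function name is illustrative):

```python
def expected_tps(jobs, service_time_s):
    """Closed-loop estimate: each job completes one transaction per service time."""
    return jobs / service_time_s


if __name__ == "__main__":
    # (#J, measured tps, SQL service time [s]) from the PCI flash DBX-12 rows.
    rows = [
        (1, 5.320e3, 1.871e-4),
        (16, 8.512e4, 1.811e-4),
        (64, 1.811e5, 3.456e-4),
        (128, 1.677e5, 7.412e-4),
    ]
    for jobs, measured, svt in rows:
        estimate = expected_tps(jobs, svt)
        print(f"#J={jobs:4d}  estimate={estimate:9.0f} tps  table={measured:9.0f} tps")
```

For these rows the estimate lies within a few percent of the measured rate. It also shows why doubling the jobs from 64 to 128 does not help once the system is CPU bound: service time more than doubles, so throughput falls.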
Database Performance
PCI flash - fast solution for database servers with performance problems
- database consolidation for small number of databases
- provides fastest I/O service times
- RAC is not supported
FC Flash with Cache - combines advantages of DRAM cache with flash modules
- delivers more scalability
- RAC is supported
Reviewing Database Performance
Benchware Ltd
For more information please contact
Manfred Drozd, Dipl.-Inform., Co-founder & Managing Partner
[email protected], +41 79 334 68 68
Benchware AG, Seestrasse 18, CH-8800 Thalwil (Switzerland)
[email protected], +41 44 722 16 16, www.benchware.ch
Manfred Drozd studied Computer Science at the University of Paderborn (Germany). He has followed relational database technology from its beginnings, starting his career in 1980 as a programmer developing a relational database system. A life science company in Basle hired him in 1984 to implement Oracle Version 3.1 at the R&D data center. Between 1986 and 1990 he managed several database development teams. From 1990 to 2001 Manfred Drozd was an employee of Oracle Corp. Switzerland, ultimately founding and heading the consulting practice Server Technology & Performance Architecture.
Currently he is working as an independent consultant designing, implementing, benchmarking and optimizing Oracle database platforms. Since 1993 Manfred Drozd has been focusing on Oracle performance and architecture. On behalf of customers he periodically runs performance tests in the benchmark centers of the hardware vendors. He also holds training courses and public seminars about scalable Oracle systems and Oracle performance tuning. He is a frequent speaker at SOUG (Swiss Oracle User Group) and DOAG (Deutsche Oracle Anwendergruppe) events.
Over the past 12 years Manfred Drozd and his team have developed benchmark tools to identify Oracle platform key performance metrics. Benchmarking helps to understand platform performance based on factual knowledge. Manfred Drozd is an advocate of a holistic Performance by Design approach: Oracle Database platforms are built from the bottom up with a complete calibration of all technology layers, focusing on the performance and availability requirements of applications. He has used this approach very successfully for the architecture of large OLTP and Data Warehouse systems in the telecommunication and financial industry.