Sept 2003 OpenVMS on Marvel EV7 Performance Characterization 1
© 2002
“Marvel” EV7 for OpenVMS:
Proof Points from Live Customer Production Systems
Tech Update, September 2003
Steve Lieman, OpenVMS Performance Group,
Marvel Performance Characterization Project
Unique OpenVMS approach
Proof points
Live customer systems
– Pre-release based on customer benchmarks
– Early adopter mission critical production systems
– … and now mainstream production systems
First use of proof points with Marvel creates the foundation and infrastructure for future work
How much benefit for you???
How much improvement will you see when you upgrade your largest most heavily loaded OpenVMS systems to Marvel EV7?
[Chart: GS 160 (931 MHz) compared with Marvel GS1280 (800 MHz); version label 7.3-1; vertical axis 0 to 70,000]
Want even more detail?
The electronic version of this presentation contains extensive notes pages for your further study, reflection, and review.
Which performance tests inspire the most confidence for you?
Chip speed, cache size, memory bandwidth?
Heavily tuned industry standard tests?
Customer developed benchmark tests?
How well do these help you predict the actual benefit that you will achieve in your situation?
… Which performance tests inspire the most confidence for you?
A Unique OpenVMS alternative to traditional methods
Production Proof Points
– from live Mission Critical Systems
– A growing series of proof points
– Each backed with detailed & extensive hard data
– Taken from early adopters & now mainstream users
– Showing before & after proofs in detail
– Running applications & software similar to your usage
Bottom Line: The unique OpenVMS approach to performance (using live production proof points) provides the highest predictive value
Definition of Headroom
Headroom helps explain performance on live customer systems
Predicted height of the roofline of maximum throughput
Actual throughput PLUS estimated spare capacity
The Point of Maximum Throughput is reached when load increases until throughput levels off. On recently upgraded live systems, this typically does not happen right away.
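The definition above can be sketched as simple arithmetic. This is a minimal illustration only: the function names and every number are hypothetical and are not taken from any customer system in this presentation.

```python
def headroom(actual_throughput, estimated_spare_capacity):
    """Headroom as defined on the slide: actual throughput PLUS estimated spare capacity."""
    return actual_throughput + estimated_spare_capacity


def headroom_gain(old_headroom, new_headroom):
    """Ratio of new headroom to old headroom (for example, 2.0 means a 2X boost)."""
    return new_headroom / old_headroom


# Hypothetical before/after figures, chosen only to illustrate the calculation.
old_system = headroom(actual_throughput=20_000, estimated_spare_capacity=5_000)
new_system = headroom(actual_throughput=22_000, estimated_spare_capacity=38_000)

gain = headroom_gain(old_system, new_system)
print(f"old headroom: {old_system}, new headroom: {new_system}, gain: {gain:.1f}X")
```

Note that in this made-up case the actual throughput barely changes after the upgrade, because the offered load is the same; the gain shows up almost entirely as spare capacity.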
Performance & Peace of Mind
Raising the Roof
A long-standing OpenVMS tradition
Marvel EV7 creates an especially strong upward step
Why is this 25-year series of systematic increases in OpenVMS headroom such an important factor for you to consider?
Why are headroom comparisons between OpenVMS systems running on older and new servers so revealing of future value?
4P head-to-head test – application Y
[Chart: Node(s) WILDFIRE 4P and MARVEL 4P; series IO directio (#1) and IO directio (#2); time axis 18:00 to 23:00 on 27-Oct-2002; vertical axis 0 to 20,000]
Approx. 2X more powerful @ 4P
Marvel finishes here
16P head-to-head test – Application Y
[Chart: Node(s) WILDFIRE 16P and MARVEL 16P; series IO directio (#1) and IO directio (#2); time axis 18:00 to 22:00 on 31-Oct-2002; vertical axis 0 to 24,000]
More than 3X more powerful @ 16P
Application Y’s SMP Scaling Curve
[Chart: throughput compared to linear scaling; horizontal axis 0 to 16 CPUs; vertical axis 0 to 2000]
Further scaling past 16p likely
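Comparing throughput to linear scaling amounts to dividing measured throughput by what perfect scaling from a baseline would give. The sketch below assumes a hypothetical set of measurements; the numbers and the function name are illustrative and do not come from Application Y.

```python
def scaling_efficiency(throughput_by_cpus):
    """Return {cpu_count: measured / linear}, using the smallest CPU count as the baseline.

    Linear scaling assumes every added CPU contributes as much throughput
    as each CPU did in the baseline configuration.
    """
    base_cpus = min(throughput_by_cpus)
    per_cpu = throughput_by_cpus[base_cpus] / base_cpus
    return {
        cpus: measured / (cpus * per_cpu)
        for cpus, measured in sorted(throughput_by_cpus.items())
    }


# Hypothetical throughput readings at 4, 8, 12, and 16 CPUs (not from the slides).
measured = {4: 500.0, 8: 950.0, 12: 1350.0, 16: 1700.0}
efficiency = scaling_efficiency(measured)

for cpus, ratio in efficiency.items():
    print(f"{cpus:2d} CPUs: {ratio:.0%} of linear")
```

A curve whose efficiency is still high at 16 CPUs, as in this made-up data, is the kind of shape that suggests further scaling past 16P is likely.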
SMP Scaling
[Chart: SMP scaling curves for EV7 Y, EV68 Z, EV7 Z, and EV7 X; horizontal axis 0 to 16 CPUs; vertical axis 0 to 2000]
Early VMS on Marvel EV7 Results Look Strong
Better than Wildfire in every case
Especially strong for SMP scaling
Large drop in MPsynch
Big jump in maximum projected headroom
Maximum gains from 1.4X to 3.5X
Gains in VMS OS Scaling = Greater TPS
[Diagram: throughput (TPS) vs. # of CPUs (this also varies by workload); curves for 7.2-1H1, 7.3, and 7.3-1 shown against linear scaling; Point of Maximum Throughput marked; note: this varies with CPU model]
VMS on Marvel EV7 Scaling Gains
[Diagram: throughput (TPS) vs. # of CPUs (this varies by workload); Wildfire scaling against Wildfire linear scaling, and Marvel scaling against Marvel linear scaling]
1.4X to 3.5X boost in maximum headroom
[Chart: GS 160 (931 MHz) compared with Marvel GS1280 (800 MHz); version label 7.3-1; vertical axis 0 to 70,000]
More than 2X increase in headroom in this case
Comparing the Relative Performance of the ES47 to the ES45
[Chart: relative performance, vertical axis 0.6 to 1.5; series: 1.0 GHz ES47 vs 1.0 GHz ES45, and 1.0 GHz ES47 vs 1.25 GHz ES45; categories: cache intense, chip speed, OLTP workload, RMS1 test, Rdb1 Test, Memory intense]
NOTE: Rdb1 Test and RMS1 test are based on VMS customer workloads
Upgrade Path for Maxed out ES45 Systems that need more scaling
For ES45 systems that have reached their maximum throughput and capacity, an ES80 or a GS1280 will prove to be an excellent and effective upgrade path.
Factors determining size of gain
– Current Alpha server and current CPU speed
– Number of CPUs
– Type of workload and its SMP scalability
– Mix and intensity of Spinlock usage
– Current operating system version
– Current versions of Oracle, TCPIP, & your application
– Current bottleneck or limiting factor
Best to focus on Marvel’s impact on your predicted headroom
What to Expect with Marvel EV7
Best server platform ever for VMS
Best SMP scaling ever for VMS
Best throughput and headroom ever for VMS
More VMS applications will get useful scaling results to 12-16 CPUs and beyond
Excellent out-of-the-box performance with further opportunities for tuning
Proof Points of Olympic Proportions
Background Slides
Passing the Baton
Upgrade to EV7
EV68 performance
Passing the Baton
What happened with other live production systems?
Let’s take a look using data captured with T4 automated collection & viewed with our internal timeline visualizer (TLViz)
Bottom Line: Massive increase in maximum OpenVMS headroom
Background Slides
16 CPU GS1280 Memory Latency
Average 170 ns
[Diagram: grid of per-CPU memory latencies in ns; values shown are 70, 136, 172, 208, and 244]
5 CPUs <= 136 ns
6 CPUs <= 172 ns
5 CPUs <= 244 ns
EV67 GS320: local latency ~330 ns; remote ~960 ns
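The 170 ns average can be cross-checked against the per-CPU figures. The sketch below assumes one reading of the diagram: 1 CPU at 70 ns, 4 at 136 ns, 6 at 172 ns, 4 at 208 ns, and 1 at 244 ns. That distribution is an interpretation chosen to match the slide's three summary buckets, not something the slide states directly.

```python
# Assumed per-CPU latencies in ns for the 16 CPUs (an interpretation of the diagram).
latencies_ns = [70] + [136] * 4 + [172] * 6 + [208] * 4 + [244]

average_ns = sum(latencies_ns) / len(latencies_ns)

# Bucket the CPUs the way the slide's summary does.
up_to_136 = sum(1 for ns in latencies_ns if ns <= 136)
up_to_172 = sum(1 for ns in latencies_ns if 136 < ns <= 172)
up_to_244 = sum(1 for ns in latencies_ns if 172 < ns <= 244)

print(f"CPUs: {len(latencies_ns)}, average latency: {average_ns:.1f} ns")
print(f"<= 136 ns: {up_to_136}, <= 172 ns: {up_to_172}, <= 244 ns: {up_to_244}")
```

Under this assumed distribution the mean rounds to the slide's 170 ns and the three buckets come out as 5, 6, and 5 CPUs.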
Performance Improvements in V7.2-2 and V7.3
V7.2-2 and V7.3 (and Penguin)
– Dedicated-CPU lock manager
– Process scheduling, idle loop
– MUTEX without SCHED Spinlock
– SYS$RESCHED (used by DECthreads and Oracle)
– SYS$GETJPI
– MailBox driver
V7.3
– Fibre fastpath
– SCSI fastpath
Performance Improvements in V7.3-1
AST Delivery
Mailboxes
RMS Global Buffer Locking
Reduce IOLOCK8 usage by Fibre/SCSI
Improved IO Completion for RAMdisk, Mailbox & Shadowing IO
Reduced Balance Slot size
Timer Queue Processing
Distributed Interrupts for Fast Path Drivers
Various NUMA Changes
Performance Improvements beyond V7.3-1
LAN
– Fastpath LAN drivers
– Fastpath PEdriver
– TCPIP
Scaling changes
– Remove WSMAX and BALSETCNT restrictions
XFC
– Alleviate SMP bottlenecks with very high cache rates
Continued reduction of SCHED Spinlock usage
LAN and PE Fastpath
LAN Drivers
– Move off of IOLOCK8 to LAN device specific spinlocks
– Allow device interrupts to CPUs other than the primary
PEdriver
– Move off of IOLOCK8 to PE specific spinlocks
– Allow a specific CPU to be chosen for PEdriver processing
TCPIP Performance
Current Synchronization Mechanisms
Single Threaded
– One user/operation in execution at any instant
Needed to guarantee synchronization of internal kernel data structures
– True regardless of the number of CPUs or users
Synchronization achieved using a single global Spinlock: IOLOCK8
– Contention with other IOLOCK8 users
DECnet, LAN drivers, SCS, etc. … Everybody!
TCPIP Performance
Future Synchronization Mechanisms
Multiple dynamic spinlocks
– No more IOLOCK8
Queue KRP (kernel request packet)
– Handled by fork thread on non-primary CPU
– Similar to dedicated lock manager
Improve concurrency
– Multiple concurrent network I/O