
Sept 2003 OpenVMS on Marvel EV7 Performance Characterization 1

© 2002

“Marvel” EV7 for OpenVMS:

Proof Points from Live Customer Production Systems

Tech Update, September 2003

Steve Lieman, OpenVMS Performance Group,



Marvel Performance Characterization Project

Unique OpenVMS approach
Proof points
Live customer systems
Pre-release based on customer benchmarks
Early adopter mission critical production systems
… and now mainstream production systems
First use of proof points with Marvel creates the foundation and infrastructure for future work


How much benefit for you?

How much improvement will you see when you upgrade your largest, most heavily loaded OpenVMS systems to Marvel EV7?

[Chart: bar chart, vertical axis 0 to 70,000. Labels: GS 160 931 MHz; Marvel 800 MHz; 7.3-1; GS 160; GS1280.]


Want even more detail?

The electronic version of this presentation contains extensive notes pages for your further study, reflection, and review.


Which performance tests inspire the most confidence for you?

Chip speed, cache size, memory bandwidth?

Heavily tuned industry standard tests?

Customer developed benchmark tests?

How well do these help you predict the actual benefit that you will achieve in your situation?


… Which performance tests inspire the most confidence for you?

A unique OpenVMS alternative to traditional methods: Production Proof Points

– From live mission critical systems
– A growing series of proof points
– Each backed with detailed & extensive hard data
– Taken from early adopters & now mainstream users
– Showing before & after proofs in detail
– Running applications & software similar to your usage

Bottom Line: The unique OpenVMS approach to performance (using live production proof points) provides the highest predictive value


Definition of Headroom

Headroom helps explain performance on live customer systems

– Predicted height of roofline of maximum throughput
– Actual throughput PLUS estimated spare capacity

The point of maximum throughput is reached when throughput levels off as load increases. On recently upgraded live systems, this typically does not happen immediately.
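As a rough illustration of the headroom definition (actual throughput plus estimated spare capacity), here is a minimal Python sketch. It assumes, purely for illustration, that spare capacity can be extrapolated linearly from the current CPU utilization; the function name, parameters, and sample numbers are hypothetical and are not taken from the presentation's data.

```python
def estimate_headroom(actual_throughput, utilization):
    """Return (spare_capacity, headroom) for a system under load.

    ASSUMPTION (not from the slides): throughput grows linearly with
    utilization, so the roofline is actual_throughput / utilization.
    """
    if not 0.0 < utilization <= 1.0:
        raise ValueError("utilization must be in the range (0, 1]")
    roofline = actual_throughput / utilization
    spare_capacity = roofline - actual_throughput
    # Headroom = actual throughput PLUS estimated spare capacity.
    headroom = actual_throughput + spare_capacity
    return spare_capacity, headroom


# Hypothetical system: 20,000 operations/s while 50% utilized.
spare, headroom = estimate_headroom(20000.0, 0.5)
print(spare, headroom)
```

Real systems rarely scale linearly all the way to saturation, so an estimate like this is better read as an upper bound than as a prediction.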


Performance & Peace of Mind

Raising the Roof: a long-standing OpenVMS tradition

Marvel EV7 creates an especially strong upward step

Why is this 25-year-long series of systematic increases in OpenVMS headroom so important a factor for you to consider?

Why are headroom comparisons between OpenVMS systems running on older and new servers so revealing of future value?


4P head-to-head test – Application Y

[Chart: IO directio(# 1) and IO directio(# 2) for nodes WILDFIRE 4P and MARVEL 4P, 18:00 to 23:00 on 27-Oct-2002; both vertical axes 0 to 20,000.]

Approx. 2X more powerful @4p

Marvel finishes here


16P head-to-head test – Application Y

[Chart: IO directio(# 1) and IO directio(# 2) for nodes WILDFIRE 16P and MARVEL 16P, 18:00 to 22:00 on 31-Oct-2002; both vertical axes 0 to 24,000.]

More than 3X more powerful @16p


Application Y’s SMP Scaling Curve

[Chart: throughput compared to linear scaling; horizontal axis 0 to 16, vertical axis 0 to 2000.]

Further scaling past 16p likely
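To make "throughput compared to linear scaling" concrete, the following Python sketch computes scaling efficiency: measured throughput divided by what perfectly linear scaling from the smallest configuration would deliver. The function name and the sample measurements are hypothetical placeholders, not values read from the chart.

```python
def scaling_efficiency(throughput_by_cpus):
    """Map CPU count -> measured throughput / ideal linear throughput.

    The smallest CPU count in the input is the baseline; ideal linear
    scaling assumes each added CPU contributes the baseline per-CPU rate.
    """
    base_cpus = min(throughput_by_cpus)
    per_cpu_rate = throughput_by_cpus[base_cpus] / base_cpus
    return {
        cpus: measured / (per_cpu_rate * cpus)
        for cpus, measured in sorted(throughput_by_cpus.items())
    }


# HYPOTHETICAL measurements (CPU count -> throughput), for illustration only.
sample = {4: 500.0, 8: 950.0, 16: 1700.0}
for cpus, eff in scaling_efficiency(sample).items():
    print(f"{cpus:2d} CPUs: {eff:.0%} of linear")
```

An efficiency that stays close to 100% as CPUs are added indicates that further scaling beyond the largest tested configuration is plausible.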


SMP Scaling

[Chart: SMP scaling curves; horizontal axis 0 to 16, vertical axis 0 to 2000. Curves: EV7 Y Curve, EV68 Z Curve, EV7 Z Curve, EV7 X Curve.]


Early VMS on Marvel EV7 Results Look Strong

– Better than Wildfire in every case
– Especially strong for SMP scaling
– Large drop in MPsynch
– Big jump in maximum projected headroom
– Maximum gains from 1.4X to 3.5X


Gains in VMS OS Scaling = Greater TPS

[Chart: throughput (TPS) versus number of CPUs, with curves for 7.2-1H1, 7.3, and 7.3-1 plotted against linear scaling, and the point of maximum throughput marked. Notes on the chart: this varies with CPU model, and also varies by workload.]


VMS on Marvel EV7 Scaling Gains

[Chart: throughput (TPS) versus number of CPUs (this varies by workload). Curves: Wildfire linear scaling, Wildfire scaling, Marvel linear scaling, Marvel scaling.]


1.4X to 3.5X boost in maximum headroom

[Chart: bar chart, vertical axis 0 to 70,000. Labels: GS 160 931 MHz; Marvel 800 MHz; 7.3-1; GS 160; GS1280.]

More than 2X increase in headroom in this case


Comparing the Relative Performance of the ES47 to the ES45

[Chart: relative performance, scale 0.6 to 1.5. Series: 1.0 GHz ES47 vs 1.0 GHz ES45; 1.0 GHz ES47 vs 1.25 GHz ES45. Categories: cache intense, chip speed, OLTP workload, RMS1 test, Rdb1 Test, memory intense.]

NOTE: Rdb1 Test and RMS1 test are based on VMS customer workloads


Upgrade Path for Maxed-Out ES45 Systems That Need More Scaling

For ES45 systems that have reached their maximum throughput and capacity, an ES80 or a GS1280 will prove to be an excellent and effective upgrade path.


Factors determining size of gain

– Current AlphaServer, current CPU speed
– Number of CPUs
– Type of workload and its SMP scalability
– Mix and intensity of spinlock usage
– Current operating system version
– Current versions of Oracle, TCPIP, & your application
– Current bottleneck or limiting factor

Best to focus on Marvel’s impact on your predicted headroom


What to Expect with Marvel EV7

– Best server platform ever for VMS
– Best SMP scaling ever for VMS
– Best throughput and headroom ever for VMS
– More VMS applications will get useful scaling results to 12-16 CPUs and beyond
– Excellent out-of-the-box performance with further opportunities for tuning


Proof Points of Olympic Proportions


Background Slides

Passing the Baton

Upgrade to EV7

EV68 performance


Passing the Baton

What happened with other live production systems?

Let’s take a look using data captured with T4 automated collection & viewed with our internal timeline visualizer (TLViz)

Bottom Line: Massive increase in maximum OpenVMS headroom


Background Slides


16 CPU GS1280 Memory Latency

Average 170 ns

[Figure: per-CPU memory latency map for a 16 CPU GS1280, with values of 70, 136, 172, 208, and 244 ns.]

5 CPUs <= 136 ns; 6 CPUs <= 172 ns; 5 CPUs <= 244 ns

EV67 GS320: local latency ~330 ns; remote ~960 ns
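The 170 ns average for the GS1280 can be cross-checked with a short Python sketch. It assumes the 16 per-CPU latencies are one at 70 ns, four at 136 ns, six at 172 ns, four at 208 ns, and one at 244 ns. That tally is inferred from the slide's 5/6/5 grouping rather than stated outright, so treat it as an assumption.

```python
# ASSUMED tally of CPUs per latency value (ns), inferred from the slide's
# grouping: 5 CPUs <= 136 ns, 6 CPUs <= 172 ns, 5 CPUs <= 244 ns.
latency_counts_ns = {70: 1, 136: 4, 172: 6, 208: 4, 244: 1}

total_cpus = sum(latency_counts_ns.values())
average_ns = sum(ns * n for ns, n in latency_counts_ns.items()) / total_cpus

# Rebuild the three groups quoted on the slide from the same tally.
group_le_136 = sum(n for ns, n in latency_counts_ns.items() if ns <= 136)
group_le_172 = sum(n for ns, n in latency_counts_ns.items() if 136 < ns <= 172)
group_le_244 = sum(n for ns, n in latency_counts_ns.items() if 172 < ns <= 244)

print(total_cpus, average_ns)                      # 16 170.125
print(group_le_136, group_le_172, group_le_244)    # 5 6 5
```

Under that tally the mean works out to 170.125 ns, which agrees with the "Average 170 ns" figure on the slide.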


Performance Improvements in V7.2-2 and V7.3

V7.2-2 and V7.3 (and Penguin)
– Dedicated-CPU lock manager
– Process scheduling, idle loop
– MUTEX without SCHED Spinlock
– SYS$RESCHED (used by DECthreads and Oracle)
– SYS$GETJPI
– MailBox driver

V7.3
– Fibre fastpath
– SCSI fastpath


Performance Improvements in V7.3-1

– AST Delivery
– Mailboxes
– RMS Global Buffer Locking
– Reduce IOLOCK8 usage by Fibre/SCSI
– Improved IO Completion for RAMdisk, Mailbox & Shadowing IO
– Reduced Balance Slot size
– Timer Queue Processing
– Distributed Interrupts for Fast Path Drivers
– Various NUMA Changes


Performance Improvements beyond V7.3-1

LAN
– Fastpath LAN drivers
– Fastpath PEdriver
– TCPIP

Scaling changes
– Remove WSMAX and BALSETCNT restrictions

XFC
– Alleviate SMP bottlenecks with very high cache rates

Continued reduction of SCHED Spinlock usage


LAN and PE Fastpath

LAN Drivers
– Move off of IOLOCK8 to LAN device specific spinlocks
– Allow device interrupts to CPUs other than the primary

PEdriver
– Move off of IOLOCK8 to PE specific spinlocks
– Allow a specific CPU to be chosen for PEdriver processing


TCPIP Performance: Current Synchronization Mechanisms

Single Threaded
– One user/operation in execution at any instant
– Needed to guarantee synchronization of internal kernel data structures
– True regardless of the number of CPUs or users

Synchronization achieved using a single global spinlock: IOLOCK8
– Contention with other IOLOCK8 users: DECnet, LAN drivers, SCS, etc. Everybody!


TCPIP Performance: Future Synchronization Mechanisms

Multiple dynamic spinlocks
– No more IOLOCK8

Queue KRP (kernel request packet)
– Handled by fork thread on non-primary CPU
– Similar to dedicated lock manager

Improve concurrency
– Multiple concurrent network I/O
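The idea of queuing request packets to a dedicated handler, rather than having every caller contend for one global lock, can be sketched as a user-space analogy in Python. This is not OpenVMS kernel code: the function name, the doubling "work", and the use of threads in place of a fork thread on a non-primary CPU are all illustrative assumptions.

```python
import queue
import threading


def run_requests(requests):
    """Process requests on one dedicated worker thread, in FIFO order.

    Callers only enqueue packets; the worker is the sole owner of the
    shared state, so that state needs no global lock of its own.
    """
    pending = queue.Queue()
    results = []  # touched only by the worker until join() returns

    def worker():
        while True:
            packet = pending.get()
            if packet is None:          # sentinel: no more requests
                break
            results.append(packet * 2)  # stand-in for real packet handling

    handler = threading.Thread(target=worker)
    handler.start()
    for packet in requests:
        pending.put(packet)
    pending.put(None)
    handler.join()
    return results


print(run_requests([1, 2, 3]))  # [2, 4, 6]
```

The design choice mirrored here is ownership: one handler serializes access to the protected structures, while submitters do only a cheap enqueue and continue.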
