Transcript
Page 1: Precise and Accurate Processor Simulation Harold Cain, Kevin Lepak, Brandon Schwartz, and Mikko H. Lipasti University of Wisconsin—Madison pharm

Precise and Accurate Processor Simulation

Harold Cain, Kevin Lepak, Brandon Schwartz, and

Mikko H. LipastiUniversity of Wisconsin—

Madisonhttp://www.ece.wisc.edu/~pharm

Page 2: Precise and Accurate Processor Simulation Harold Cain, Kevin Lepak, Brandon Schwartz, and Mikko H. Lipasti University of Wisconsin—Madison pharm

February 8, 2002 Precise and Accurate Processor Simulation--Mikko Lipasti

Architecture Research?Genius is one percent inspiration

and ninety-nine percent perspiration.

--Thomas Edison This is not a talk about inspiration

No new ideas or gimmicks This is a talk about perspiration

Mostly graduate student perspiration Infrastructure, tools, methodology

[CAECW Talk/Paper, February 2002]

Page 3: Precise and Accurate Processor Simulation Harold Cain, Kevin Lepak, Brandon Schwartz, and Mikko H. Lipasti University of Wisconsin—Madison pharm

February 8, 2002 Precise and Accurate Processor Simulation--Mikko Lipasti

Performance Modeling

Analytical models Queuing models Simulation

Trace-driven Execution-

driven Full system

Why?

Most widely used in

academic research

Perceived accuracy and precision

Page 4: Precise and Accurate Processor Simulation Harold Cain, Kevin Lepak, Brandon Schwartz, and Mikko H. Lipasti University of Wisconsin—Madison pharm

February 8, 2002 Precise and Accurate Processor Simulation--Mikko Lipasti

Performance Modeling

PerformanceModel

Inputs

Program Characteristics-instruction mix, miss rates

Execution Trace-instruction, address, both

Program binary + input sets

Operating system, program(s), etc. Performance results

Execution characteristicsBottlenecks

Etc.

Accuracy?

Precision?

garbage in, garbage out

Page 5: Precise and Accurate Processor Simulation Harold Cain, Kevin Lepak, Brandon Schwartz, and Mikko H. Lipasti University of Wisconsin—Madison pharm

February 8, 2002 Precise and Accurate Processor Simulation--Mikko Lipasti

Talk Outline Introduction & Motivation

Performance Modeling Precision, Accuracy, Flexibility PharmSim Overview Causes of Inaccuracy

O/S Effects Coherent I/O (DMA) Wrong-path Effects

Summary & Conclusions

Page 6: Precise and Accurate Processor Simulation Harold Cain, Kevin Lepak, Brandon Schwartz, and Mikko H. Lipasti University of Wisconsin—Madison pharm

February 8, 2002 Precise and Accurate Processor Simulation--Mikko Lipasti

Precision, Accuracy, Flexibility Precision

How closely simulator matches design Latency, bandwidth, resource occupancy,

etc. Accuracy

How closely simulation matches reality Requires precision Also requires replication of real-world

conditions, inputs Flexibility?

Enables exploration of broad design space

Page 7: Precise and Accurate Processor Simulation Harold Cain, Kevin Lepak, Brandon Schwartz, and Mikko H. Lipasti University of Wisconsin—Madison pharm

February 8, 2002 Precise and Accurate Processor Simulation--Mikko Lipasti

Uses for Simulation

Design andImplementation Verification

High-levelDesign

MicroarchitecturalDefinition

Design SpaceExploration

QuantitativeTradeoffAnalysis Functional validation

Late design changes

PerformanceValidation

PrecisionFlexibility

Academic Research Accuracy???

Page 8: Precise and Accurate Processor Simulation Harold Cain, Kevin Lepak, Brandon Schwartz, and Mikko H. Lipasti University of Wisconsin—Madison pharm

February 8, 2002 Precise and Accurate Processor Simulation--Mikko Lipasti

Causes of Inaccuracy Many possible causes

Software differences Hardware differences System effects Time dilation: interaction with physical

world Here, we consider:

Operating system code DMA traffic Wrong-path effects

Page 9: Precise and Accurate Processor Simulation Harold Cain, Kevin Lepak, Brandon Schwartz, and Mikko H. Lipasti University of Wisconsin—Madison pharm

February 8, 2002 Precise and Accurate Processor Simulation--Mikko Lipasti

Validating Accuracy How do we validate?

Against real hardware with perf. counters Different “input” since O/S now present Also, post-mortem: too late

Against HDL Same input as model, same error?

Without full system simulation, cannot: Replicate runtime environment Cannot really validate accuracy

Compensating errors mask inaccuracy Hence: build simulator that does not

cheat

Page 10: Precise and Accurate Processor Simulation Harold Cain, Kevin Lepak, Brandon Schwartz, and Mikko H. Lipasti University of Wisconsin—Madison pharm

February 8, 2002 Precise and Accurate Processor Simulation--Mikko Lipasti

PharmSim Overview

Device simulation, etc. from SimOS-PPC PharmSim replaces functional simulators

Full OOO core model, values in rename registers

Based on SimpleMP [Rajwar] Adds priv. mode, MMU, TLB, exceptions, interrupts,

barriers, flushes, etc.

Block

Simple

SimOS-PPC-AIX 4.3.1-Disk driver-E’net driverE

thern

et

PharmSim-OOO Core-Gigaplane

Page 11: Precise and Accurate Processor Simulation Harold Cain, Kevin Lepak, Brandon Schwartz, and Mikko H. Lipasti University of Wisconsin—Madison pharm

February 8, 2002 Precise and Accurate Processor Simulation--Mikko Lipasti

PharmSim Pipeline

Decode Execute CommitMemFetch Translate

Substantially similar to IBM Power4 Some instructions “cracked” (1:2 expansion) Others (e.g. lmw) microcode stream

Mem Stage Interface to 2-level cache model Sun Gigaplane XB snoopy MP coherence Caches contain values, must remain coherent

No cheating! No “flat” memory model for reference/redirect

Page 12: Precise and Accurate Processor Simulation Harold Cain, Kevin Lepak, Brandon Schwartz, and Mikko H. Lipasti University of Wisconsin—Madison pharm

February 8, 2002 Precise and Accurate Processor Simulation--Mikko Lipasti

Talk Outline Introduction & Motivation

Performance Modeling Precision, Accuracy, Flexibility PharmSim Overview Causes of Inaccuracy

O/S Effects Coherent I/O (DMA) Wrong-path Effects

Summary & Conclusions

Page 13: Precise and Accurate Processor Simulation Harold Cain, Kevin Lepak, Brandon Schwartz, and Mikko H. Lipasti University of Wisconsin—Madison pharm

February 8, 2002 Precise and Accurate Processor Simulation--Mikko Lipasti

Operating System Effects Fairly well-understood for

commercial: Must account for O/S references

For SPEC? Widely accepted: Safe to ignore O/S paths Most popular tool (Simplescalar)

Intercepts system calls Emulates on host, updates “flat” memory Returns “magically” with caches intact

Is this really OK?

Page 14: Precise and Accurate Processor Simulation Harold Cain, Kevin Lepak, Brandon Schwartz, and Mikko H. Lipasti University of Wisconsin—Madison pharm

February 8, 2002 Precise and Accurate Processor Simulation--Mikko Lipasti

Operating System Effects

References Modeled

Example

User-mode only Atom

User + Shared library Simplescalar with static link

User + Sh Lib + O/S H/W bus trace

User + Sh Lib + O/S + cache control ops

PharmSim

Page 15: Precise and Accurate Processor Simulation Harold Cain, Kevin Lepak, Brandon Schwartz, and Mikko H. Lipasti University of Wisconsin—Madison pharm

February 8, 2002 Precise and Accurate Processor Simulation--Mikko Lipasti

Operating System Effects

Dramatic error (5.8x in mcf, 2-3x commonplace) Note compensating errors (e.g. crafty, gzip, perl) IPC error > 100% (more detail at ISCA)

5.8x

Page 16: Precise and Accurate Processor Simulation Harold Cain, Kevin Lepak, Brandon Schwartz, and Mikko H. Lipasti University of Wisconsin—Madison pharm

February 8, 2002 Precise and Accurate Processor Simulation--Mikko Lipasti

Talk Outline Introduction & Motivation

Performance Modeling Precision, Accuracy, Flexibility PharmSim Overview Causes of Inaccuracy

O/S Effects Coherent I/O (DMA) Wrong-path Effects

Summary & Conclusions

Page 17: Precise and Accurate Processor Simulation Harold Cain, Kevin Lepak, Brandon Schwartz, and Mikko H. Lipasti University of Wisconsin—Madison pharm

February 8, 2002 Precise and Accurate Processor Simulation--Mikko Lipasti

Coherent I/O with DMA

Proc

Cache

Proc

Cache

Memory

PCI Bridge

Graphics

SCSI

Page 18: Precise and Accurate Processor Simulation Harold Cain, Kevin Lepak, Brandon Schwartz, and Mikko H. Lipasti University of Wisconsin—Madison pharm

February 8, 2002 Precise and Accurate Processor Simulation--Mikko Lipasti

DMA Traffic How do we support DMA?

No “flat” memory image in simulator Lines may be in caches

Invalidate (if DMA write) Flush (if DMA read)

Must use existing coherence protocol Everything has to work correctly

No subtle coherence bugs How much does this matter?

Affects cache miss rates Introduces bus contention

Page 19: Precise and Accurate Processor Simulation Harold Cain, Kevin Lepak, Brandon Schwartz, and Mikko H. Lipasti University of Wisconsin—Madison pharm

February 8, 2002 Precise and Accurate Processor Simulation--Mikko Lipasti

DMA Traffic PharmSim incorporates accurate DMA engine:

Issues bus invalidates, snoops Concurrent data transfer: No “magic” flat memory

Bottom line: Unimportant for SPECINT Unimportant for SPECWEB, SPECJBB

Others in progress Contrived multiprogrammed workload

4.8% of all coherence traffic due to I/O, 1% IPC effect Results understated due to “overbuilt” MP bus

MP workloads likely much more sensitive Additional evaluation in progress

Page 20: Precise and Accurate Processor Simulation Harold Cain, Kevin Lepak, Brandon Schwartz, and Mikko H. Lipasti University of Wisconsin—Madison pharm

February 8, 2002 Precise and Accurate Processor Simulation--Mikko Lipasti

Talk Outline Introduction & Motivation

Performance Modeling Precision, Accuracy, Flexibility PharmSim Overview Causes of Inaccuracy

O/S Effects Coherent I/O (DMA) Wrong-path Effects

Summary & Conclusions

Page 21: Precise and Accurate Processor Simulation Harold Cain, Kevin Lepak, Brandon Schwartz, and Mikko H. Lipasti University of Wisconsin—Madison pharm

February 8, 2002 Precise and Accurate Processor Simulation--Mikko Lipasti

Wrong-Path Execution

Branch predictor predicts control flow Branch execute redirects mispredictions Extra instructions on wrong path

NT T NT T NT TNT T

NT T NT T

NT T (TAG 1)

(TAG 2)

(TAG 3)

Page 22: Precise and Accurate Processor Simulation Harold Cain, Kevin Lepak, Brandon Schwartz, and Mikko H. Lipasti University of Wisconsin—Madison pharm

February 8, 2002 Precise and Accurate Processor Simulation--Mikko Lipasti

Wrong-path Execution Multiple effects on unarchitected state

Pollute/prefetch I-cache, D-cache, TLB Pollute/train branch predictor (BHR, PHT, RAS)

PharmSim (current status): BHR is updated and repaired PHT is not updated speculatively RAS is updated, no repair No speculative TLB fill

How can we filter wrong-path instructions? No “cheating”: don’t know branch outcomes

Page 23: Precise and Accurate Processor Simulation Harold Cain, Kevin Lepak, Brandon Schwartz, and Mikko H. Lipasti University of Wisconsin—Madison pharm

February 8, 2002 Precise and Accurate Processor Simulation--Mikko Lipasti

Eliminating Wrong-Path

RunaheadPharmSim

Branch Outcome Trace

Right-pathPharmSim

On correctly predicted branch-continue fetching (BAU)On mispredicted branch-stall instruction fetch-restart once branch resolves

Page 24: Precise and Accurate Processor Simulation Harold Cain, Kevin Lepak, Brandon Schwartz, and Mikko H. Lipasti University of Wisconsin—Madison pharm

February 8, 2002 Precise and Accurate Processor Simulation--Mikko Lipasti

Wrong-path Instructions

0%10%20%30%40%50%60%70%80%90%100%

benchmark

%e

xe

cu

ted

in

st

on

wro

ng

pa

th

Aggressive core model; 25%-40% wrong-path

Page 25: Precise and Accurate Processor Simulation Harold Cain, Kevin Lepak, Brandon Schwartz, and Mikko H. Lipasti University of Wisconsin—Madison pharm

February 8, 2002 Precise and Accurate Processor Simulation--Mikko Lipasti

Wrong-path Memory Stalls

Minor effect: better or worse

0%5%10%15%20%25%30%35%40%45%50%

crafty

gap

gcc

gzip

mcf

parser

perlbmkvortex

specjbb

specweb

benchmark

%ti

me

sta

lled

fo

r m

em

ory

exe-driven

trace-driven

Page 26: Precise and Accurate Processor Simulation Harold Cain, Kevin Lepak, Brandon Schwartz, and Mikko H. Lipasti University of Wisconsin—Madison pharm

February 8, 2002 Precise and Accurate Processor Simulation--Mikko Lipasti

Wrong-path RAS Accuracy

Prediction accuracy degrades up to 29% Could add fixup logic

0%10%20%30%40%50%60%70%80%90%100%

crafty

gap

gcc

gzip

mcf

parser

perlbmkvortex

specjbb

specweb

benchmark

RA

S P

red

icti

on

Ac

cu

rac

y

exe-driven

trace-driven

Page 27: Precise and Accurate Processor Simulation Harold Cain, Kevin Lepak, Brandon Schwartz, and Mikko H. Lipasti University of Wisconsin—Madison pharm

February 8, 2002 Precise and Accurate Processor Simulation--Mikko Lipasti

Wrong-path IPC

Negligible effect (0.9%) RAS mispredictions overlapped

0

0.5

1

1.5

2

2.5

3

benchmark

IPC exe-driven

trace-driven

Page 28: Precise and Accurate Processor Simulation Harold Cain, Kevin Lepak, Brandon Schwartz, and Mikko H. Lipasti University of Wisconsin—Madison pharm

February 8, 2002 Precise and Accurate Processor Simulation--Mikko Lipasti

Summary PharmSim

Simulator that does not cheat Can be used to validate assumptions,

simplifications, abstractions Evaluated three effects on accuracy

O/S: dramatic error, even for SPECINT DMA: not important for uniprocessors

MP, bus-constrained results TBD Wrong path: unimportant

Page 29: Precise and Accurate Processor Simulation Harold Cain, Kevin Lepak, Brandon Schwartz, and Mikko H. Lipasti University of Wisconsin—Madison pharm

February 8, 2002 Precise and Accurate Processor Simulation--Mikko Lipasti

Conclusions Ignoring O/S effects fraught with danger

Should always model O/S effects Trace-driven vs. execution-driven

Traces with O/S much better Invest in

Trace quality vs. Complexity of execution-driven simulation

Precision without accuracy? Of questionable value

Validation difficult due to compensating errors Hard to know if model is precise or accurate


Recommended