Designing Processors for Timing Speculation from the Ground Up
Brian Greskamp, Lu Wan, Ulya Karpuzcu, Jeffrey Cook, Josep Torrellas, Deming Chen, and Craig Zilles
University of Illinois
http://iacoma.cs.uiuc.edu/
BlueShift

BlueShift: Designing Processors for Timing Speculation from the Ground Up · 2010. 12. 24. · IACOMA Group · Brian Greskamp


Page 2: Clock Frequency Crash

• Clock frequency is critical to per-thread performance
• ... but frequency is now scaling slowly
• Need to squeeze more speed out of the process
• Design for better than worst case delay
• Use Timing Speculation (TS) architectures to tolerate occasional timing errors
❖ How to design processors to take advantage of TS?

Page 3: Improving Performance with Timing Speculation (TS)

[Figure: two plots, Error Rate (PE) vs. Freq and Performance vs. Freq, each marking the Rated and TS operating points. Annotations: "some paths overshoot" and "TS Perf gain".]

• Timing Speculation (TS)
  • Increase f at constant V
  • Provide error detection and correction mechanism
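
The shape of the performance curve can be sketched with a toy model. Everything in it is an assumption made for illustration rather than something taken from the talk: the error rate is zero up to the rated frequency and then grows exponentially, and every timing error costs a fixed recovery penalty in cycles. The function names and constants are hypothetical.

```python
import math


def error_rate(f, f_rated=1.0, steepness=40.0, floor=1e-9):
    """Hypothetical PE(f): no errors up to the rated frequency, then exponential growth."""
    if f <= f_rated:
        return 0.0
    return min(1.0, floor * math.exp(steepness * (f - f_rated)))


def relative_performance(f, penalty_cycles=100):
    """Throughput relative to the rated design (f = 1.0, no errors).

    Each cycle costs 1 + PE * penalty cycles on average, so performance
    is the frequency divided by that average cost.
    """
    return f / (1.0 + error_rate(f) * penalty_cycles)


def best_frequency(candidates, penalty_cycles=100):
    """Pick the candidate frequency with the highest modeled performance."""
    return max(candidates, key=lambda f: relative_performance(f, penalty_cycles))


if __name__ == "__main__":
    freqs = [1.0 + 0.01 * i for i in range(51)]  # 1.00 .. 1.50, normalized to rated
    f_best = best_frequency(freqs)
    print(f"best f = {f_best:.2f}, performance = {relative_performance(f_best):.3f}")
```

In this model performance first rises with f and then collapses once recovery overhead dominates, which is the qualitative behavior the two plots describe.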

Page 4: Contribution: BlueShift Design Methodology

[Figure: Error Rate (PE) vs. Freq and Performance vs. Freq, shown for "TS only" and for "TS with BlueShift", with the Rated frequency marked. Annotations: "TS Perf gain", "BlueShift Perf gain", and "some P cost".]

• Existing design methods produce steep PE vs f curves
• BlueShift
  • Speed up frequently overshooting paths
  • Increase f at same error rate (PE)

Page 5: Taxonomy of TS Archs

• Checking granularity
  • Stage-level: Detect and correct errors at each pipe stage
  • At-retirement: Check each instruction just before it retires
• Checker persistence
  • Always-on: TS always active (with P and area overhead)
  • On-demand: TS can be disabled to save power at lower f
• Functional correctness
  • Relaxed: Can prune functionality from some components
  • Correct: All components must be functionally complete

Page 6: Example Architectures

• Razor [Ernst03]
  • Checker persistence: always-on
  • Checking granularity: stage-level
  • Functional correctness: correct
• Paceline [Greskamp07]
  • Checker persistence: on-demand
  • Checking granularity: at-retirement
  • Functional correctness: correct

[Figure: Paceline diagram. A CMP with Core 0 and Core 1, both on the rated clock, is also shown as a Leader Core on a speculative clock (Spec. clk) and a Checker Core on the rated clock, linked by hints and State Comparison.]

[Figure: Razor diagram. Labels: Logic Cone, Main latch, Shadow latch, Clk, XOR, NOT, Error.]

Page 7: BlueShift

• Goal under BlueShift: Speed up frequently-used critical paths
  • Many circuit paths are rarely used [OptTandem07]
  • Let infrequently-used critical paths remain slow
  • TS architecture will correct these faults
• BlueShift: Profile-driven design process
  • Identify frequently-used critical paths
  • Modify design to speed them up (area, power)

[OptTandem07]: Francisco Mesa-Martínez et al., MICRO 2007

Page 8: BlueShift Steps

1. Profile gate-level design
2. Identify frequently overshooting paths
3. Speed up frequently overshooting paths

Repeat until fast enough

[Figure: Error Rate (PE) vs. Freq, with Target PE and Target f marked.]

Design goal: PE below a specified threshold at target f
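
The three steps and the "repeat until fast enough" loop can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' tool flow: `ToyDesign` is a hypothetical stand-in for the gate-level design and its profiler, PE is estimated pessimistically as the sum of per-path overshoot frequencies, and each speed-up is assumed to shorten a path by 10%.

```python
class ToyDesign:
    """Hypothetical stand-in for a gate-level design and its profiler."""

    def __init__(self, delays, activity, period):
        self.delays = dict(delays)   # path -> delay, in units of the rated period
        self.activity = activity     # path -> fraction of cycles the path is exercised
        self.period = period         # target clock period

    def profile(self):
        # Steps 1 and 2: overshoot frequency for every path slower than the target period.
        d = {p: self.activity[p] for p, t in self.delays.items() if t > self.period}
        pe = min(1.0, sum(d.values()))  # pessimistic estimate of PE (assumption)
        return pe, d

    def speed_up(self, path, factor=0.9):
        # Step 3: any design change that shortens the path (assumed 10% per change).
        self.delays[path] *= factor


def blueshift(design, target_pe, max_iters=50):
    """Repeat profile -> identify -> speed up until PE <= target_pe."""
    for iteration in range(max_iters):
        pe, d = design.profile()
        if pe <= target_pe:
            return iteration, pe
        worst = max(d, key=d.get)    # most frequently overshooting path
        design.speed_up(worst)
    return max_iters, design.profile()[0]


if __name__ == "__main__":
    design = ToyDesign(
        delays={"A": 1.2, "B": 1.3, "C": 1.25},
        activity={"A": 0.2, "B": 1e-6, "C": 0.05},
        period=1.0,
    )
    print(blueshift(design, target_pe=1e-4), design.delays)
```

In the toy run, the rarely exercised path B is never touched even though it is the slowest, which mirrors the idea of leaving infrequently used paths slow.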

Page 9: Two Possible Approaches

• Targeted Acceleration: Accelerate frequently overshooting paths without increasing worst case delay
  • Good for on-demand architectures (Paceline)
• Delay Trading: Sacrifice worst case delay to speed up frequently overshooting paths
  • Good for always-on architectures (Razor)

[Figure: Error Rate (PE) vs. Freq, one plot for Targeted Acceleration and one for Delay Trading.]

Page 10: BlueShift Design Flow

[Flowchart boxes: Initial Netlist, Physical Design, Physical Impl, Select Training Benchmarks (Bench 0, Bench 1, ..., Bench n), Gate Sim, Path Profile, Compute PE, Speed up overshooting paths, Design Changes, Final Design. Decision after Compute PE: if PE > target, speed up the overshooting paths and apply the design changes; if PE < target, the result is the Final Design.]

Page 11: 1: Profile the Design

• Perform gate-level simulation of design
• Record net transition times for each cycle
• Trace dynamic overshooting paths

[Figure: example circuit with nets labeled a, b, c, d, e, f and X, Y, Z.]
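
One way to trace a dynamic overshooting path from recorded transition times is sketched below. It assumes, purely for illustration, that the profiler provides a per-cycle map from each net that toggled to its last transition time, and that the path is recovered by walking backward from a late endpoint through the latest-switching input at every gate. The netlist, net names, and `trace_overshooting_path` itself are hypothetical.

```python
def trace_overshooting_path(fanin, arrival, endpoint, period):
    """Return the dynamic path that made `endpoint` switch late in one cycle.

    fanin:    net -> list of nets feeding the gate that drives it
    arrival:  net -> last transition time in this cycle (nets that did not
              toggle are absent)
    Returns the path from source to endpoint, or None if the endpoint met timing.
    """
    t_end = arrival.get(endpoint)
    if t_end is None or t_end <= period:
        return None

    path = [endpoint]
    net = endpoint
    while fanin.get(net):
        switching = [n for n in fanin[net] if n in arrival]
        if not switching:
            break
        net = max(switching, key=arrival.get)  # latest-arriving input caused the delay
        path.append(net)
    path.reverse()
    return path


if __name__ == "__main__":
    # Hypothetical netlist: d = g1(a, b), e = g2(b, c), Z = g3(d, e)
    fanin = {"d": ["a", "b"], "e": ["b", "c"], "Z": ["d", "e"]}
    arrival = {"a": 0.0, "b": 0.1, "d": 0.6, "e": 0.5, "Z": 1.1}  # net c did not toggle
    print(trace_overshooting_path(fanin, arrival, "Z", period=1.0))
```

Repeating this for every cycle and every endpoint yields the per-cycle sets of overshooting paths that a later step can count.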

Page 12: 2: Identify Frequently Overshooting Paths

• Frequency of overshooting d(p): Fraction of cycles on which path p exceeds target clock period
• Error rate PE is upper-bounded by [equation missing]
• Generate sorted list of d(p) for each path, averaged across all benchmarks
• Select top k fraction of paths

[Figure: d(p) plotted against path p.]
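
The bookkeeping behind d(p) can be sketched as follows. The slide's own bound on PE is not preserved here, so the sketch uses the union bound as an assumption: an error cycle needs at least one overshooting path, hence PE can never exceed the sum of the d(p) values. The input layout (one set of overshooting paths per simulated cycle, per benchmark) and all function names are hypothetical, and each benchmark is assumed to contain at least one cycle.

```python
import math
from collections import Counter


def overshoot_frequency(cycles_per_benchmark):
    """Compute d(p), averaged across benchmarks.

    cycles_per_benchmark: benchmark name -> list with one entry per cycle,
    each entry being the set of paths that overshot in that cycle.
    """
    totals = Counter()
    for cycles in cycles_per_benchmark.values():
        counts = Counter(p for paths in cycles for p in set(paths))
        for path, count in counts.items():
            totals[path] += count / len(cycles)   # per-benchmark d(p)
    n = len(cycles_per_benchmark)
    return {path: total / n for path, total in totals.items()}


def union_bound(d):
    """Upper bound on PE: at least one path must overshoot for an error to occur."""
    return min(1.0, sum(d.values()))


def top_fraction(d, k):
    """Paths sorted by decreasing d(p), truncated to the top k fraction."""
    ranked = sorted(d, key=d.get, reverse=True)
    return ranked[: math.ceil(k * len(ranked))]


if __name__ == "__main__":
    profile = {
        "bench0": [{"p1"}, {"p1", "p2"}, set(), set()],
        "bench1": [{"p1"}, set()],
    }
    d = overshoot_frequency(profile)
    print(d, union_bound(d), top_fraction(d, 0.5))
```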

Page 13: Example: On-demand Selective Biasing (OSB)

• Apply body bias to gates on frequently overshooting paths
• Gates with body bias are faster but consume more leakage power

[Figure: Original Design (standard gates only) next to the OSB Design (standard gates plus FBB gates connected to a Bias line), each with its Error Rate (PE) vs. Freq plot.]

[Flowchart: the BlueShift design flow, with "Speed up High d(p) Paths" producing a List of FBB Gates as the design change.]
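
Building the list of FBB gates from the path profile can be sketched as below. This is an illustration under assumptions rather than the authors' procedure: a path counts as "frequently overshooting" when its d(p) reaches a chosen threshold, and every gate on such a path is marked for body bias. The path-to-gate map, the threshold, and the function names are hypothetical.

```python
def select_fbb_gates(path_gates, d, threshold):
    """Return the sorted list of gates that lie on any path with d(p) >= threshold."""
    fbb = set()
    for path, gates in path_gates.items():
        if d.get(path, 0.0) >= threshold:
            fbb.update(gates)
    return sorted(fbb)


def biased_fraction(path_gates, fbb_gates):
    """Fraction of all profiled gates that receive body bias (a rough leakage proxy)."""
    all_gates = set().union(*path_gates.values()) if path_gates else set()
    return len(fbb_gates) / len(all_gates) if all_gates else 0.0


if __name__ == "__main__":
    path_gates = {
        "p_hot": ["g1", "g2", "g5"],
        "p_warm": ["g2", "g3"],
        "p_cold": ["g4", "g6"],
    }
    d = {"p_hot": 0.1, "p_warm": 1e-3, "p_cold": 1e-8}
    fbb = select_fbb_gates(path_gates, d, threshold=1e-4)
    print(fbb, biased_fraction(path_gates, fbb))
```

Because a gate shared by several hot paths is biased only once, the leakage cost grows with the number of distinct gates rather than with the number of paths.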

Page 14: OSB Operation: Normal Mode

[Figure: Bias Off. Standard gates and FBB gates share the Bias line; the Error Rate (PE) vs. Freq plot marks Rated f; the CMP runs Core 0 and Core 1 both on the rated clock.]

Page 15: OSB Operation: TS Mode

[Figure: Bias On. Standard gates and FBB gates share the Bias line; the Error Rate (PE) vs. Freq plots mark Rated f, TS f, and Target PE; the CMP runs a Leader Core on the speculative clock (Spec. clk) and a Checker Core on the rated clock, linked by hints and State Comparison.]

Page 16: Example: Path Constraint Tuning (PCT)

• Generate timing constraints for commercial CAD tools
• Relax timing constraints for paths with low freq of overshooting d(p)
• Tighten timing constraints on paths with high freq of overshooting d(p)
• Intuitively: Generate slack on low d(p) paths and move it to high d(p) paths

[Flowchart: the BlueShift design flow, with "Speed up High d(p) Paths" producing Path Constraints as the design change.]

[Figure: Error Rate (PE) vs. Freq.]
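
Constraint generation of this kind can be sketched as a small script that emits one timing constraint per profiled path. The sketch assumes the CAD tool accepts SDC-style `set_max_delay ... -from ... -to ...` commands with `get_pins` collections; the tighten and relax factors, the threshold, the pin names, and the function name are hypothetical values chosen for illustration, not settings from the talk.

```python
def path_constraints(paths, d, period, hot_threshold, tighten=0.85, relax=1.15):
    """Emit one SDC-style max-delay constraint per path.

    paths:  path name -> (start pin, end pin)
    d:      path name -> frequency of overshooting d(p)
    Paths with d(p) >= hot_threshold get a tighter budget than the clock
    period; all other paths get a relaxed budget.
    """
    lines = []
    for name, (start, end) in sorted(paths.items()):
        scale = tighten if d.get(name, 0.0) >= hot_threshold else relax
        budget = period * scale
        lines.append(
            f"set_max_delay {budget:.3f} -from [get_pins {start}] -to [get_pins {end}]"
        )
    return lines


if __name__ == "__main__":
    paths = {"hot": ("A", "Z"), "cold": ("B", "Z")}
    d = {"hot": 0.1, "cold": 1e-7}
    for line in path_constraints(paths, d, period=1.0, hot_threshold=1e-3):
        print(line)
```

The relaxed budgets are what create the slack that the tool can then spend on the tightened paths.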

Page 17: How PCT Speeds Up Dynamic Overshooting Paths

Assume: Need to speed up path A->Z because it has high d(p). All other paths are infrequently-used and can be slow.

[Figure: example circuit from A to Z with per-gate delays annotated, shown in its Original form and after each of four transformations: Refactor, Resize, Re-place, and Allocate Low Vt.]

Page 18: BlueShift Evaluation

Module          Stage   # cells   Description
sparc_exu       EXE     21896     Integer FUs, control, bypass
lsu_stb_ctl     MEM     694       Store buffer control
lsu_qctl1       MEM     2304      Load / Store queue control
lsu_dctl        MEM     3434      L1 D-cache control
sparc_ifu_dec   F/D     612       Instruction decoder
sparc_ifu_fdp   F/D     6894      Fetch datapath, PC maintenance
sparc_ifu_fcl   F/D     1921      L1 I-cache and PC control

• Sampled modules from OpenSPARC T1 (Sun Niagara)
• Implemented in commercial ASIC process
• Ran BlueShift Profiling on SPECint2006
• Evaluated with SPECint2000

Page 19: On-demand Selective Biasing: Error Rate

[Figure: two panels of Errors per Cycle (PE), log scale from 1e−07 to 1e−01, vs. Frequency (norm. to rated) from 1.0 to 1.5. Target PE is marked; the annotated operating points are f = 1.12 and f = 1.27.]

14% Frequency increase after BlueShift

Page 20: Path Constraint Tuning: Error Rate

[Figure: two panels of Errors per Cycle (PE), log scale from 1e−07 to 1e−01, vs. Frequency (norm. to rated) from 1.0 to 1.5. Target PE is marked; the annotated operating points are f = 1.28 and f = 1.43.]

Up to 12% frequency increase after BlueShift

Page 21: Microarchitecture Evaluation

• Two configurations
  • Paceline + On-demand Selective Biasing (OSB)
  • Razor + Path Constraint Tuning (PCT)
• Model out-of-order microarchitecture
  • 4-wide, 152-entry ROB
  • Simulate with SESC
• Use circuit model to estimate total PE and power consumption of BlueShifted modules in pipeline
• Use Vdd scaling to speed up SRAM modules

Page 22: Speedup

[Figure: bar chart of SPECint 2000 Performance (axis 0 to 1.4) for Non-TS, Paceline, and Razor. Paceline and Razor each show a Traditional Design bar and a BlueShift bar (OSB for Paceline, PCT for Razor); the annotated gains are 8% and 6%.]

BlueShift improves performance over traditional design

Page 23: Power

[Figure: bar chart of SPECint 2000 Per-Core Power (axis 0 to 1.8) for Non-TS, Paceline, and Razor. Paceline and Razor each show a Traditional bar and a BlueShift bar (OSB for Paceline, PCT for Razor); the annotated increases are 12% and 23%.]

BlueShift has significant power cost over traditional designs

Page 24: Summary

• Presented profile-driven BlueShift flow
• Proposed two specific optimizations aimed at different TS architectures
• Demonstrated frequency improvements
  • 14% (8% speedup) from OSB
  • 12% (6% speedup) from PCT

BlueShift: A novel approach to designing processors for TS


Page 26: Improving Performance with Timing Speculation

[Figure: three plots, Error Rate (PE) vs. Freq, # of Dyn Paths vs. Delay, and Performance vs. Freq, repeated for the Rated design, for TS, and for TS&BS (TS with BlueShift). Target PE and the Perf gain are marked.]

• Premise: Increase f at constant V
• Some paths will overshoot the clock cycle, causing errors
• Timing Speculation: Detect and correct timing errors
• BlueShift: Decrease the error rate

Page 27: OSB Performance Detail

[Figure: Performance (a) and power consumption (b) of different Paceline-based processor configurations. BS and NonBS refer to BlueShiftable and non-BlueShiftable modules, respectively. Panel (a): Speedup (% over Unpaired), axis 0 to 25, for Paceline Base and Paceline+OSB. Panel (b): Power (W), axis 0 to 12, for 2xUnpaired, Paceline Base, and Paceline+OSB, broken down into Checker, Leader NonBS, Leader BS, and Extra. Benchmarks: bzip2, crafty, gap, gcc, gzip, mcf, parser, twolf, vortex, vpr, and the mean.]

[Figure: Performance (a) and power consumption (b) of different Razor-based processor configurations. Panel (a): Speedup (% over Unpaired), axis 0 to 40, for Razor Base and Razor+PCT. Panel (b): Power (W), axis 0 to 10, for Razor Base and Razor+PCT, broken down into NonBS and BS. Same benchmarks.]

Paper excerpt accompanying the figures:

Panel (b) of the Paceline figure shows the power consumed by the processor and L1 caches in Paceline Base, Paceline+OSB, and two instances of Unpaired. The power is broken down into power consumed by the checker core (which is never BlueShifted), non-BlueShiftable modules in the leader, BlueShiftable modules in the leader, and extra Paceline structures (checkpointing, VQ, and BQ). On average, the power consumed by Paceline+OSB is 12% higher than that of Paceline Base. Consequently, BlueShift with OSB does not add much to the power consumption while delivering a significant performance gain. Note that the checker power is generally lower when the cores run in paired mode. This is because the checker core saves energy by skipping the execution of many wrong-path instructions.

Razor+PCT Performance and Power

We now compare two Razor-based architectures. Razor Base uses the Razor Base module implementation, while Razor+PCT uses the Razor+PCT one obtained by applying BlueShift with PCT. As before, Razor+PCT runs at the frequency given by the PE curves of the BlueShiftable components; then, we apply traditional voltage scaling to the non-BlueShiftable components so that they can catch up.

Panel (a) of the Razor figure shows the speedup of the Razor Base and Razor+PCT architectures over the Unpaired one used as a baseline in panel (a) of the Paceline figure. Since these Razor-based architectures target high performance, they deliver higher speedups. We see that, on average, Razor+PCT's performance is 6% higher than that of Razor Base. This is the impact of BlueShift with PCT in this design, which is not negligible considering that Razor Base was already designed for high performance. We also see that vpr and, to a lesser extent, twolf do not perform as well as the other applications under Razor+PCT. This is the result of the unfavorable PE curve for these applications.

Panel (b) of the Razor figure shows the power consumed by the two processor configurations. The power is broken down into the contributions of the non-BlueShiftable and the BlueShiftable modules. On average, Razor+PCT consumes 23% more power than Razor Base. This is because it runs at a higher frequency, uses a higher supply voltage for the non-BlueShiftable modules, and needs more shadow latches and hold-time buffers.

Given Razor+PCT's delivered speedup and power cost, we see that BlueShift with PCT is not compelling from an E × D² perspective. Instead, we see it as a technique to further speed up a high-performance design (at a power cost) when conventional techniques such as voltage scaling or body biasing do not provide further performance. Specifically, for logic (i.e., BlueShiftable) modules, BlueShift with PCT provides an orthogonal means of improving performance when further voltage scaling or body biasing becomes infeasible. In this case, however, for the pipeline as a whole, non-BlueShiftable stages remain a bottleneck that must be addressed using some other technique.

Computational Overhead

Although most of the evaluated modules were fully optimized with BlueShift in one day on our 100-core cluster, the optimization of sparc_exu took about one week. Such long turnaround times dur-

Page 28: OSB Power Detail

[Figure: power consumption of the Paceline-based processor configurations. Power (W), axis 0 to 12, per benchmark for 2xUnpaired, Paceline Base, and Paceline+OSB, broken down into Checker, Leader NonBS, Leader BS, and Extra.]

Page 29: PCT Performance Detail

[Figure: performance of the Razor-based processor configurations. Speedup (% over Unpaired), axis 0 to 40, per benchmark for Razor Base and Razor+PCT.]

Page 30: PCT Power Detail

[Figure: power consumption of the Razor-based processor configurations. Power (W), axis 0 to 10, per benchmark for Razor Base and Razor+PCT, broken down into NonBS and BS.]

Page 31: Taxonomy of TS Archs

Checking Granularity: Per-stage or at-retirement

[Figure: example architectures placed by checking granularity: Razor [Ernst03], Paceline [Greskamp07], CLS [Liu00], Optimistic Tandem [Martinez07].]

Page 32: Taxonomy of TS Archs

• Checker persistence: Is TS always-on, or only on demand?
  • If on-demand, must be careful not to decrease rated f
• Checking granularity: Check per-stage or at retirement?
  • If per-stage, can tolerate higher PE
• Is functional correctness relaxed for speculative units?
  • Functional correctness requires new techniques

Page 33: Taxonomy of TS Archs

• Checker persistence: Is TS always-on, or only on demand?
• Is functional correctness relaxed for speculative units?

Example Architecture     Persistence   Correctness
Razor [Ernst03]          always-on     correct
OptTandem [Martinez07]   always-on     relaxed
Paceline [Greskamp07]    on-demand     correct