EECS 470 Lecture 5 Slide 1 EECS 470 Lecture 5 Intro to Dynamic Scheduling (Scoreboarding) Winter 2021 Jon Beaumont http://www.eecs.umich.edu/courses/eecs470 Many thanks to Prof. Martin and Roth of University of Pennsylvania for most of these slides. Portions developed in part by Profs. Austin, Brehob, Falsafi, Hill, Hoe, Lipasti, Shen, Smith, Sohi, Tyson, Vijaykumar, and Wenisch of Carnegie Mellon University, Purdue University, University of Michigan, and University of Wisconsin.


Page 1: Lecture 5 Intro to Dynamic Scheduling

EECS 470 Lecture 5 Slide 1

EECS 470

Lecture 5

Intro to Dynamic

Scheduling (Scoreboarding)

Winter 2021

Jon Beaumont

http://www.eecs.umich.edu/courses/eecs470

Many thanks to Prof. Martin and Roth of University of Pennsylvania for most of these slides. Portions developed in part by Profs. Austin, Brehob, Falsafi, Hill, Hoe, Lipasti, Shen, Smith, Sohi, Tyson, Vijaykumar, and Wenisch of Carnegie Mellon University, Purdue University, University of Michigan, and University of Wisconsin.

Page 2: Lecture 5 Intro to Dynamic Scheduling

EECS 470 Lecture 5 Slide 2

Announcements

• Reminder: Lab #2 due Friday at 12:30p
  • Get checked off during GSI/IA OH
• Verilog assignment #2 due Wed 2/10
  • Submit to autograder by 11:59p
• HW #1 due Thursday
  • Submit to Gradescope by 11:59p
• We'll take the last few minutes of class to discuss the recent news in the department
  • Involves discussion of sexual assault
  • No one is obligated to stay

Page 3: Lecture 5 Intro to Dynamic Scheduling

EECS 470 Lecture 5 Slide 3

Last Time

• Hazards
  • Detection
  • Resolution
    • Software (avoidance)
    • Hardware (stalling, forwarding)

Page 4: Lecture 5 Intro to Dynamic Scheduling

EECS 470 Lecture 5 Slide 4

Lingering Questions

• Remember, you can submit lingering questions to cover next lecture at: https://bit.ly/3oSr5FD

Page 5: Lecture 5 Intro to Dynamic Scheduling

EECS 470 Lecture 5 Slide 5

Today

• ILP and limits of scalar pipelines

• Introduce dynamic scheduling (i.e. out-of-order execution)

• Register renaming (high level)

• Case study: scoreboard scheduling

Page 6: Lecture 5 Intro to Dynamic Scheduling

EECS 470 Lecture 5 Slide 6

Readings

For Today:

• H & P Chapter C.5-C.7, Chapter 3

For Monday:

• D. Sima “Design Space of Register Renaming Techniques”
  • Can access online from umich IP address

Page 7: Lecture 5 Intro to Dynamic Scheduling

EECS 470 Lecture 5 Slide 7

Limitations of Scalar Pipelines

Upper Bound on Scalar Pipeline Throughput

Limited by IPC=1 “Flynn Bottleneck”

Inefficient Unification Into Single Pipeline

Long latency for each instruction

Performance Lost Due to Rigid In-order Pipeline

Unnecessary stalls

Page 8: Lecture 5 Intro to Dynamic Scheduling

EECS 470 Lecture 5 Slide 8

Architectures for Instruction-Level Parallelism

Page 9: Lecture 5 Intro to Dynamic Scheduling

EECS 470 Lecture 5 Slide 9

Superscalar Machine

Page 10: Lecture 5 Intro to Dynamic Scheduling

EECS 470 Lecture 5 Slide 10

What is the real problem?

CPI of in-order pipelines degrades very sharply if the machine parallelism is increased beyond a certain point, i.e., when N×M approaches the average distance between dependent instructions.

Forwarding is no longer effective.

The pipeline may never be full due to frequent dependency stalls!

Page 11: Lecture 5 Intro to Dynamic Scheduling

EECS 470 Lecture 5 Slide 11

ILP: Instruction-Level Parallelism

ILP is a measure of the amount of inter-dependencies between instructions

Average ILP = number of instructions / number of cycles required

code1: ILP = 1, i.e. must execute serially

code2: ILP = 3, i.e. can execute at the same time

code1:
  r1 ← r2 + 1
  r3 ← r1 / 17
  r4 ← r0 - r3

code2:
  r1 ← r2 + 1
  r3 ← r9 / 17
  r4 ← r0 - r10
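The two ILP figures above can be checked with a small sketch. It assumes unit latency for every instruction, unlimited hardware, and that only true (RAW) dependencies constrain the schedule; the function name `average_ilp` and the `(dest, [sources])` encoding are made up for this illustration.

```python
def average_ilp(insns):
    """insns: list of (dest, [sources]) in program order.

    Assumes unit latency and unlimited hardware, so the number of cycles
    required is the length of the longest RAW dependence chain.
    """
    ready_cycle = {}   # register -> cycle after which its value is available
    finish = []
    for dest, srcs in insns:
        start = max((ready_cycle.get(s, 0) for s in srcs), default=0)
        ready_cycle[dest] = start + 1
        finish.append(start + 1)
    return len(insns) / max(finish)


code1 = [("r1", ["r2"]), ("r3", ["r1"]), ("r4", ["r0", "r3"])]
code2 = [("r1", ["r2"]), ("r3", ["r9"]), ("r4", ["r0", "r10"])]

print(average_ilp(code1))  # 1.0: each insn waits for the previous one
print(average_ilp(code2))  # 3.0: all three can execute in the same cycle
```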

Page 12: Lecture 5 Intro to Dynamic Scheduling

EECS 470 Lecture 5 Slide 13

The Problem With In-Order Pipelines

What’s happening in cycle 4?
• mulf stalls due to RAW hazard
  • OK, this is a fundamental problem
• subf stalls due to pipeline hazard (aka structural hazard)
  • Why? subf can’t proceed into D because mulf is there
  • That is the only reason, and it isn’t a fundamental one

Why can’t subf go into D in cycle 4 and E+ in cycle 5?

This tends to be a bigger problem with long-latency instructions
• E.g. loads w/ cache misses or FP arithmetic

                1   2   3   4   5   6   7   8   9   10  11  12  13  14  15  16
addf f0,f1,f2   F   D   E+  E+  E+  W
mulf f2,f3,f2       F   D   d*  d*  E*  E*  E*  E*  E*  W
subf f0,f1,f4           F   p*  p*  D   E+  E+  E+  W

Page 13: Lecture 5 Intro to Dynamic Scheduling

EECS 470 Lecture 5 Slide 14

New Concepts

• Two (somewhat) independent techniques, although very often employed together:
  • Dynamic scheduling (a.k.a. out-of-order processing)
  • Register renaming

Page 14: Lecture 5 Intro to Dynamic Scheduling

EECS 470 Lecture 5 Slide 15

[Figure: pipeline block diagram with I$, BP, D, insn buffer, S, regfile, and D$. The insn buffer holds, in order: add p2,p3,p4; mul p2,p4,p5; sub p2,p5,p6; div p4,4,p7]

Concept 1: Dynamic Scheduling

• Instructions are fetched/decoded into the Instruction Buffer
  • Also called “instruction window” or “instruction scheduler”
• Each cycle, hardware checks the source registers of each instruction to see if it is ready to execute
• Instructions can leave the buffer when ready, in arbitrary order
  • E.g. if "mul" takes a long time to execute, "div" can execute before "sub"
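As a rough illustration of the per-cycle readiness check, the sketch below scans an instruction buffer and reports which entries have all source registers available. The `(name, [sources], dest)` tuples and the `ready_insns` helper are hypothetical; the four instructions are the renamed add/mul/sub/div sequence from this slide, with the destination as the last operand.

```python
def ready_insns(buffer, ready_regs):
    """Return the names of buffered insns whose sources are all ready."""
    return [name for name, srcs, _dest in buffer
            if all(s in ready_regs for s in srcs)]


# (name, [source registers], destination); the immediate 4 is omitted.
add = ("add", ["p2", "p3"], "p4")
mul = ("mul", ["p2", "p4"], "p5")
sub = ("sub", ["p2", "p5"], "p6")
div = ("div", ["p4"], "p7")

# Initially only add has its sources.
print(ready_insns([add, mul, sub, div], {"p2", "p3"}))

# After add writes p4 and mul has issued but not finished (p5 not ready),
# div can leave the buffer ahead of the older sub.
print(ready_insns([sub, div], {"p2", "p3", "p4"}))
```

The second call is the "div before sub" case from the bullet above: sub is older but still waiting on mul's result.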

Page 15: Lecture 5 Intro to Dynamic Scheduling

EECS 470 Lecture 5 Slide 16

New Hazards

• Problem!
  • Out of order introduces new types of hazards!
• 3 types of hazards:
  • Read-after-write (RAW): we’re familiar with this
  • Write-after-read (WAR): “Anti dependencies”
  • Write-after-write (WAW): “Output dependencies”
  • Last two relevant to out-of-order processing
• Is RAR a thing?

WAR and WAW are also called "name" hazards

Page 16: Lecture 5 Intro to Dynamic Scheduling

EECS 470 Lecture 5 Slide 17

New Hazards

• "Write-after-write" (WAW)
  • Occurs in out-of-order processors when two instructions write to the same register

add r2,r3,r1
sub r2,r1,r3
mul r2,r3,r3
div r1, 4,r1

If "div" writes its result back before "add", r1 will have the wrong value!

• "Write-after-read" (WAR)
  • Occurs in out-of-order processors when an instruction overwrites a register before an earlier instruction has read it

add r2,r3,r1
sub r2,r1,r3
mul r2,r3,r3
div r1, 4,r1

If "div" writes its result back before "sub" reads r1, "sub" will execute incorrectly

Why are RAR hazards not a thing?

a) Programs never read the same value multiple times

b) These are already fixed with forwarding

c) Hazards only occur when a value changes

Which could solve WAR and WAW? (select all)

a) Having a bunch more registers

b) Restricting the order of instruction executions

c) Predict when there are hazards

Page 17: Lecture 5 Intro to Dynamic Scheduling

EECS 470 Lecture 5 Slide 18

Concept 2: Register Renaming

• We could just stall anytime there is a WAR or WAW hazard
• BUT, anti (WAR) and output (WAW) dependencies are “false”
  • The dependence is on name/location rather than data
  • Given infinite registers, WAR/WAW can always be eliminated
• Idea: increase number of physical registers (not visible to programmer)
  • Dynamically rename insns to use new registers; this removes WAR and WAW, but leaves RAW intact
• Example
  • Names: r1,r2,r3
  • Locations: p1,p2,p3,p4,p5,p6,p7
  • Original mapping: r1→p1, r2→p2, r3→p3; p4–p7 are “free”

MapTable
r1   r2   r3    FreeList       Orig. insns     Renamed insns
p1   p2   p3    p4,p5,p6,p7    add r2,r3,r1    add p2,p3,p4
p4   p2   p3    p5,p6,p7       sub r2,r1,r3    sub p2,p4,p5
p4   p2   p5    p6,p7          mul r2,r3,r3    mul p2,p5,p6
p4   p2   p6    p7             div r1,4,r1     div p4,4,p7

(time runs downward)
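The renaming example can be sketched in a few lines: look up each source in the map table, then take a fresh physical register from the free list for the destination. This is only an illustration under simplifying assumptions: the `rename` function and tuple encoding are hypothetical, immediates pass through unchanged, and physical registers are never returned to the free list.

```python
def rename(insns, map_table, free_list):
    """Rename (op, [sources], dest) insns; returns the renamed insns."""
    map_table = dict(map_table)   # work on copies, leave the inputs alone
    free_list = list(free_list)
    renamed = []
    for op, srcs, dest in insns:
        # Sources use the current mapping; immediates are not in the table.
        new_srcs = [map_table.get(s, s) for s in srcs]
        # The destination gets a fresh physical register.
        new_dest = free_list.pop(0)
        map_table[dest] = new_dest
        renamed.append((op, new_srcs, new_dest))
    return renamed


insns = [
    ("add", ["r2", "r3"], "r1"),
    ("sub", ["r2", "r1"], "r3"),
    ("mul", ["r2", "r3"], "r3"),
    ("div", ["r1", "4"], "r1"),
]
result = rename(insns, {"r1": "p1", "r2": "p2", "r3": "p3"},
                ["p4", "p5", "p6", "p7"])
for op, srcs, dest in result:
    print(op, ",".join(srcs + [dest]))
```

Sources must be looked up before the destination mapping is updated, otherwise an instruction such as div r1,4,r1 would read its own not-yet-produced result.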

Page 18: Lecture 5 Intro to Dynamic Scheduling

EECS 470 Lecture 5 Slide 19

Dynamic Scheduling – Full Picture

• Dynamic scheduling
  • Totally in the hardware
  • Also called “out-of-order execution” (OoO)
• Fetch many instructions into instruction window
  • Use branch prediction to speculate past (multiple) branches
  • Flush pipeline on branch misprediction
• Optional: rename to avoid false dependencies (WAW and WAR)
• Execute instructions as soon as possible
  • Register dependencies are known
  • Handling memory dependencies more tricky (much more later)
• Commit instructions in order
  • Why? To discuss later
• Current machines: 100+ instruction scheduling window

Page 19: Lecture 5 Intro to Dynamic Scheduling

EECS 470 Lecture 5 Slide 20

Going Forward: What’s Next

• We’ll build this up in steps over the next few weeks
  • “Scoreboarding” - first OoO, no register renaming
  • “Tomasulo’s algorithm” - adds register renaming
  • Handling precise state and speculation
  • P6-style execution (Intel Pentium Pro)
  • R10k-style execution (MIPS R10k)
  • Handling memory dependencies
• Let’s get started!

Page 20: Lecture 5 Intro to Dynamic Scheduling

EECS 470 Lecture 5 Slide 22

New Pipeline Diagram

Insn            D     X     W
ldf X(r1),f1    c1    c2    c3
mulf f0,f1,f2   c3    c4+   c7
stf f2,Z(r1)    c7    c8    c9
addi r1,4,r1    c8    c9    c10
ldf X(r1),f1    c10   c11   c12
mulf f0,f1,f2   c12   c13+  c16
stf f2,Z(r1)    c16   c17   c18

• Alternative pipeline diagram
  • Down: insns
  • Across: pipeline stages
    • Decode, eXecute, Writeback
  • In boxes: cycles
    • '+' means takes multiple cycles
• Why? Convenient for out-of-order

Page 21: Lecture 5 Intro to Dynamic Scheduling

EECS 470 Lecture 5 Slide 24

Instruction Buffer

• Trick: insn buffer (many names for this buffer)
  • Basically: a bunch of latches for holding insns
• Split D(ecode) into two pieces
  • Accumulate decoded insns in buffer in-order
  • Buffer sends insns down rest of pipeline out-of-order

[Figure: pipeline block diagram with I$, BP, D1, insn buffer, D2, regfile, and D$]

Page 22: Lecture 5 Intro to Dynamic Scheduling

EECS 470 Lecture 5 Slide 25

Dispatch and Issue

• Dispatch (D): first part of decode
  • Allocate slot in insn buffer
    – New kind of structural hazard (insn buffer is full)
  • In order: stall back-propagates to younger insns
• Issue (S): second part of decode
  • Send insns from insn buffer to execution units
    + Out-of-order: wait doesn’t back-propagate to younger insns

[Figure: pipeline block diagram with I$, BP, D, insn buffer, S, regfile, and D$]

Page 23: Lecture 5 Intro to Dynamic Scheduling

EECS 470 Lecture 5 Slide 26

Dispatch and Issue with Floating-Point

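[Figure: pipeline block diagram with I$, BP, D, insn buffer, S, regfile, F-regfile, and D$, plus separate execution pipelines: E/, E+ (two stages), and E* (three stages)]

We often have different "functional units" to execute different types of insts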

Page 24: Lecture 5 Intro to Dynamic Scheduling

EECS 470 Lecture 5 Slide 27

Scheduling Algorithm I: Scoreboard

• Scoreboard
  • Centralized control scheme: insn status explicitly tracked
  • Insn buffer: Functional Unit Status Table (FUST)
  • No register renaming
• First implementation: CDC 6600 [1964]
  • 16 separate non-pipelined functional units (7 int, 4 FP, 5 mem)
  • No bypassing
• Our example: “Simple Scoreboard”
  • 5 FU: 1 ALU, 1 load, 1 store, 2 FP (3-cycle, pipelined)

Page 25: Lecture 5 Intro to Dynamic Scheduling

EECS 470 Lecture 5 Slide 28

Scoreboard Data Structures

• FU Status Table
  • FU, busy, op, R, R1, R2: destination/source register names
  • T: destination register tag (FU producing the value)
  • T1,T2: source register tags (FU producing the values)
• Register Status Table
  • T: tag (FU that will write this register)
• Tags interpreted as ready-bits
  • Tag == 0 → Value is ready in register file
  • Tag != 0 → Value is not ready, will be supplied by T
• Insn status table
  • S,X bits for all active insns
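One possible software rendering of these tables is sketched below. The field names follow the slide (busy, op, R, R1, R2, T1, T2), but the `FUEntry` class, the `operands_ready` helper, and the use of `None` in place of a zero tag are assumptions made for this illustration. The populated state corresponds to cycle 2 of the walkthrough later in the lecture.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class FUEntry:
    busy: bool = False
    op: Optional[str] = None
    R: Optional[str] = None    # destination register name
    R1: Optional[str] = None   # first source register name
    R2: Optional[str] = None   # second source register name
    T1: Optional[str] = None   # FU producing R1; None plays the role of tag 0
    T2: Optional[str] = None   # FU producing R2; None plays the role of tag 0


def operands_ready(entry):
    """Tags act as ready bits: no tag means the value is in the regfile."""
    return entry.T1 is None and entry.T2 is None


# Register Status Table: register -> FU that will write it (absent = ready).
reg_status = {"f1": "LD", "f2": "FP1"}

# FU Status Table after ldf and mulf have been dispatched.
fu_status = {
    "ALU": FUEntry(),
    "LD": FUEntry(True, "ldf", "f1", None, "r1"),
    "ST": FUEntry(),
    "FP1": FUEntry(True, "mulf", "f2", "f0", "f1", None, "LD"),
    "FP2": FUEntry(),
}

print(operands_ready(fu_status["LD"]))    # ldf only needs r1, which is ready
print(operands_ready(fu_status["FP1"]))   # mulf waits for f1 from LD
```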

Page 26: Lecture 5 Intro to Dynamic Scheduling

EECS 470 Lecture 5 Slide 29

Simple Scoreboard Data Structures

• Insn fields and status bits

• Tags

• Values

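[Figure: scoreboard data structures. Fetched insns feed the FU Status table (Insn, S, X, FU, op, R, R1, R2, T, T1, T2), which sits alongside the Reg Status table (T) and the Regfile (value). The "==" tag comparators are CAMs.]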

Page 27: Lecture 5 Intro to Dynamic Scheduling

EECS 470 Lecture 5 Slide 30

Scoreboard Dispatch (D)

• Stall for WAW hazards
• Allocate scoreboard entry
• Populate T1 and T2 using Reg Status
• Update Reg Status of destination register

[Figure: scoreboard data structures diagram (Fetched insns, FU Status, Reg Status, Regfile)]

Page 28: Lecture 5 Intro to Dynamic Scheduling

EECS 470 Lecture 5 Slide 31

Scoreboard Issue (S)

• Wait for RAW register hazards
• Read registers

[Figure: scoreboard data structures diagram (Fetched insns, FU Status, Reg Status, Regfile)]

Page 29: Lecture 5 Intro to Dynamic Scheduling

EECS 470 Lecture 5 Slide 32

Issue Policy and Issue Logic

• Issue
  • If multiple insns ready, which one to choose? Issue policy
    • Oldest first? Safe
    • Longest latency first? May yield better performance
  • Select logic: implements issue policy
    • W→1 priority encoder
    • W: window size (number of scoreboard entries)
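A priority encoder that selects one of W ready entries can be modeled as "return the index of the first set bit". The sketch below assumes the ready bits are ordered oldest-first, so the lowest index wins and the policy is oldest-first; the function name `priority_select` is hypothetical.

```python
def priority_select(ready_bits):
    """Model of a W-to-1 priority encoder.

    ready_bits[i] is True if scoreboard entry i is ready to issue; entries
    are assumed to be ordered oldest first. Returns the selected index, or
    None when nothing is ready.
    """
    for index, ready in enumerate(ready_bits):
        if ready:
            return index
    return None


print(priority_select([False, True, False, True]))   # 1: the older ready entry
print(priority_select([False, False, False, False])) # None: nothing to issue
```

A different issue policy, such as longest latency first, would amount to reordering the entries (or their priorities) before the same selection step.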

Page 30: Lecture 5 Intro to Dynamic Scheduling

EECS 470 Lecture 5 Slide 33

Scoreboard Execute (X)

• Execute insn

[Figure: scoreboard data structures diagram (Fetched insns, FU Status, Reg Status, Regfile)]

Page 31: Lecture 5 Intro to Dynamic Scheduling

EECS 470 Lecture 5 Slide 34

Scoreboard Writeback (W)

• Wait for WAR hazard
• Write value into regfile, clear Reg Status entry
• Compare tag to waiting insns' input tags; match → clear input tag
• Free scoreboard entry
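The WAR check at writeback can be sketched as a search over the FU Status table: the writer must wait while some other entry has not yet issued and still names the destination register as an untagged source, meaning it will read the old value from the regfile. The dictionary layout and the `war_hazard` helper are assumptions for illustration; the state shown is the one at cycle 7 of the walkthrough, where addi wants to write r1.

```python
def war_hazard(dest, writer_fu, fu_status):
    """True if writing `dest` now would clobber a value still to be read."""
    return any(
        fu != writer_fu
        and not entry["issued"]
        and any(reg == dest and tag is None for reg, tag in entry["srcs"])
        for fu, entry in fu_status.items()
    )


# FU -> entry; srcs are (register, tag) pairs, tag None means "in regfile".
fu_status = {
    "ALU": {"op": "addi", "issued": True,  "srcs": [("r1", None)]},
    "LD":  {"op": "ldf",  "issued": False, "srcs": [("r1", "ALU")]},
    "ST":  {"op": "stf",  "issued": False, "srcs": [("f2", "FP1"), ("r1", None)]},
    "FP1": {"op": "mulf", "issued": True,  "srcs": [("f0", None), ("f1", None)]},
}

print(war_hazard("r1", "ALU", fu_status))   # True: stf has not read r1 yet

fu_status["ST"]["issued"] = True            # stf issues and reads r1
print(war_hazard("r1", "ALU", fu_status))   # False: addi may now write r1
```

The second ldf does not block the write: its r1 source carries the ALU tag, so it is waiting for the new value rather than the old one.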

[Figure: scoreboard data structures diagram (Fetched insns, FU Status, Reg Status, Regfile)]

Page 32: Lecture 5 Intro to Dynamic Scheduling

EECS 470 Lecture 5 Slide 35

Scoreboard Pipeline

• New pipeline structure: F, D, S, X, W
  • F (fetch)
    • Same as it ever was
  • D (dispatch)
    • Structural or WAW hazard ? stall : allocate scoreboard entry
  • S (issue)
    • RAW hazard ? wait : read registers, go to execute
  • X (execute)
    • Execute operation, notify scoreboard when done
  • W (writeback)
    • WAR hazard ? wait : write register, free scoreboard entry
    • W and RAW-dependent S in same cycle
    • W and structural-dependent D in same cycle

Poll: Which stages would register renaming directly reduce the number of stalls in?
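The D/S/X/W rules above can be put together in a small cycle-level sketch. This is not the course's reference model. It assumes the "Simple Scoreboard" configuration (1 ALU, 1 load, 1 store, 2 FP units, 3-cycle mulf), processes W before S before D within a cycle so that the same-cycle cases work, issues at most one insn per cycle (oldest first), and picks the first free FU, which may differ from the FU chosen on the slides. The program encoding and all names are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional

# Hypothetical encoding of the lecture's loop: (op, dest, [sources]).
PROGRAM = [
    ("ldf",  "f1", ["r1"]),
    ("mulf", "f2", ["f0", "f1"]),
    ("stf",  None, ["f2", "r1"]),
    ("addi", "r1", ["r1"]),
    ("ldf",  "f1", ["r1"]),
    ("mulf", "f2", ["f0", "f1"]),
    ("stf",  None, ["f2", "r1"]),
]
FUS_FOR_OP = {"addi": ["ALU"], "ldf": ["LD"], "stf": ["ST"], "mulf": ["FP1", "FP2"]}
LATENCY = {"addi": 1, "ldf": 1, "stf": 1, "mulf": 3}  # cycles spent in X


@dataclass
class Entry:
    idx: int                      # position in program order
    op: str
    dest: Optional[str]
    srcs: List[str]
    tags: List[Optional[str]]     # FU producing each source; None means ready
    issued_at: Optional[int] = None


def simulate(program):
    fu_status = {}    # FU name -> Entry, busy FUs only
    reg_status = {}   # register -> FU that will write it
    timing = [{} for _ in program]
    next_insn, cycle = 0, 0
    while next_insn < len(program) or fu_status:
        cycle += 1
        # W: a finished insn writes back unless an un-issued insn still
        # names its destination as an untagged source (WAR hazard).
        for fu, e in list(fu_status.items()):
            if e.issued_at is None or cycle <= e.issued_at + LATENCY[e.op]:
                continue
            war = e.dest is not None and any(
                o is not e and o.issued_at is None
                and any(s == e.dest and t is None for s, t in zip(o.srcs, o.tags))
                for o in fu_status.values())
            if war:
                continue
            if e.dest is not None and reg_status.get(e.dest) == fu:
                del reg_status[e.dest]
            for o in fu_status.values():   # clear matching input tags
                o.tags = [None if t == fu else t for t in o.tags]
            del fu_status[fu]
            timing[e.idx]["W"] = cycle
        # S: oldest un-issued insn whose source tags are all clear.
        ready = [e for e in fu_status.values()
                 if e.issued_at is None and not any(e.tags)]
        if ready:
            e = min(ready, key=lambda entry: entry.idx)
            e.issued_at = cycle
            timing[e.idx]["S"] = cycle
            timing[e.idx]["X"] = cycle + 1
        # D: in order; stall on a structural or WAW hazard.
        if next_insn < len(program):
            op, dest, srcs = program[next_insn]
            free = [fu for fu in FUS_FOR_OP[op] if fu not in fu_status]
            if free and dest not in reg_status:
                tags = [reg_status.get(s) for s in srcs]
                fu_status[free[0]] = Entry(next_insn, op, dest, list(srcs), tags)
                if dest is not None:
                    reg_status[dest] = free[0]
                timing[next_insn]["D"] = cycle
                next_insn += 1
    return timing


for (op, _, _), t in zip(PROGRAM, simulate(PROGRAM)):
    print(f"{op:5} D=c{t['D']} S=c{t['S']} X=c{t['X']} W=c{t['W']}")
```

Under these assumptions the printed D/S/X/W cycles should line up with the scoreboard columns of the cycle-by-cycle example that follows, including the WAW stall of the second mulf and the WAR wait of addi.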

Page 33: Lecture 5 Intro to Dynamic Scheduling

EECS 470 Lecture 5 Slide 36

Scoreboard Data Structures

Insn Status
Insn            D    S    X    W
ldf X(r1),f1
mulf f0,f1,f2
stf f2,Z(r1)
addi r1,4,r1
ldf X(r1),f1
mulf f0,f1,f2
stf f2,Z(r1)

Reg Status
Reg   T
f0
f1
f2
r1

FU Status
FU    busy  op    R    R1   R2   T1   T2
ALU   no
LD    no
ST    no
FP1   no
FP2   no

Page 34: Lecture 5 Intro to Dynamic Scheduling

EECS 470 Lecture 5 Slide 37

Scoreboard: Cycle 1

Insn Status
Insn            D    S    X    W
ldf X(r1),f1    c1
mulf f0,f1,f2
stf f2,Z(r1)
addi r1,4,r1
ldf X(r1),f1
mulf f0,f1,f2
stf f2,Z(r1)

Reg Status
Reg   T
f0
f1    LD
f2
r1

FU Status
FU    busy  op    R    R1   R2   T1   T2
ALU   no
LD    yes   ldf   f1   -    r1   -    -     (allocate)
ST    no
FP1   no
FP2   no

Page 35: Lecture 5 Intro to Dynamic Scheduling

EECS 470 Lecture 5 Slide 38

Scoreboard: Cycle 2

Insn Status
Insn            D    S    X    W
ldf X(r1),f1    c1   c2
mulf f0,f1,f2   c2
stf f2,Z(r1)
addi r1,4,r1
ldf X(r1),f1
mulf f0,f1,f2
stf f2,Z(r1)

Reg Status
Reg   T
f0
f1    LD
f2    FP1
r1

FU Status
FU    busy  op    R    R1   R2   T1   T2
ALU   no
LD    yes   ldf   f1   -    r1   -    -
ST    no
FP1   yes   mulf  f2   f0   f1   -    LD    (allocate)
FP2   no

Page 36: Lecture 5 Intro to Dynamic Scheduling

EECS 470 Lecture 5 Slide 39

Scoreboard: Cycle 3

Insn Status
Insn            D    S    X    W
ldf X(r1),f1    c1   c2   c3
mulf f0,f1,f2   c2
stf f2,Z(r1)    c3
addi r1,4,r1
ldf X(r1),f1
mulf f0,f1,f2
stf f2,Z(r1)

Reg Status
Reg   T
f0
f1    LD
f2    FP1
r1

FU Status
FU    busy  op    R    R1   R2   T1   T2
ALU   no
LD    yes   ldf   f1   -    r1   -    -
ST    yes   stf   -    f2   r1   FP1  -     (allocate)
FP1   yes   mulf  f2   f0   f1   -    LD
FP2   no

Page 37: Lecture 5 Intro to Dynamic Scheduling

EECS 470 Lecture 5 Slide 40

Scoreboard: Cycle 4

Insn Status
Insn            D    S    X    W
ldf X(r1),f1    c1   c2   c3   c4
mulf f0,f1,f2   c2   c4
stf f2,Z(r1)    c3
addi r1,4,r1    c4
ldf X(r1),f1
mulf f0,f1,f2
stf f2,Z(r1)

Reg Status
Reg   T
f0
f1    LD      (f1 written → clear)
f2    FP1
r1    ALU

FU Status
FU    busy  op    R    R1   R2   T1   T2
ALU   yes   addi  r1   r1   -    -    -     (allocate)
LD    no                                    (free)
ST    yes   stf   -    f2   r1   FP1  -
FP1   yes   mulf  f2   f0   f1   -    LD    (f1 (LD) is ready → issue mulf)
FP2   no

Page 38: Lecture 5 Intro to Dynamic Scheduling

EECS 470 Lecture 5 Slide 41

Scoreboard: Cycle 5

Insn Status
Insn            D    S    X    W
ldf X(r1),f1    c1   c2   c3   c4
mulf f0,f1,f2   c2   c4   c5
stf f2,Z(r1)    c3
addi r1,4,r1    c4   c5
ldf X(r1),f1    c5
mulf f0,f1,f2
stf f2,Z(r1)

Reg Status
Reg   T
f0
f1    LD
f2    FP1
r1    ALU

FU Status
FU    busy  op    R    R1   R2   T1   T2
ALU   yes   addi  r1   r1   -    -    -
LD    yes   ldf   f1   -    r1   -    ALU   (allocate)
ST    yes   stf   -    f2   r1   FP1  -
FP1   yes   mulf  f2   f0   f1   -    -
FP2   no

Page 39: Lecture 5 Intro to Dynamic Scheduling

EECS 470 Lecture 5 Slide 42

Scoreboard: Cycle 6

Insn Status
Insn            D    S    X    W
ldf X(r1),f1    c1   c2   c3   c4
mulf f0,f1,f2   c2   c4   c5+
stf f2,Z(r1)    c3
addi r1,4,r1    c4   c5   c6
ldf X(r1),f1    c5
mulf f0,f1,f2
stf f2,Z(r1)

Reg Status
Reg   T
f0
f1    LD
f2    FP1
r1    ALU

FU Status
FU    busy  op    R    R1   R2   T1   T2
ALU   yes   addi  r1   r1   -    -    -
LD    yes   ldf   f1   -    r1   -    ALU
ST    yes   stf   -    f2   r1   FP1  -
FP1   yes   mulf  f2   f0   f1   -    -
FP2   no

D stall: WAW hazard w/ mulf (f2)
How to tell? RegStatus[f2] non-empty

Page 40: Lecture 5 Intro to Dynamic Scheduling

EECS 470 Lecture 5 Slide 43

Scoreboard: Cycle 7

Insn Status
Insn            D    S    X    W
ldf X(r1),f1    c1   c2   c3   c4
mulf f0,f1,f2   c2   c4   c5+
stf f2,Z(r1)    c3
addi r1,4,r1    c4   c5   c6
ldf X(r1),f1    c5
mulf f0,f1,f2
stf f2,Z(r1)

Reg Status
Reg   T
f0
f1    LD
f2    FP1
r1    ALU

FU Status
FU    busy  op    R    R1   R2   T1   T2
ALU   yes   addi  r1   r1   -    -    -
LD    yes   ldf   f1   -    r1   -    ALU
ST    yes   stf   -    f2   r1   FP1  -
FP1   yes   mulf  f2   f0   f1   -    -
FP2   no

W wait: WAR hazard w/ stf (r1)
How to tell? Untagged r1 in FuStatus
Requires CAM

Page 41: Lecture 5 Intro to Dynamic Scheduling

EECS 470 Lecture 5 Slide 44

Scoreboard: Cycle 8

Insn Status
Insn            D    S    X    W
ldf X(r1),f1    c1   c2   c3   c4
mulf f0,f1,f2   c2   c4   c5+  c8
stf f2,Z(r1)    c3   c8
addi r1,4,r1    c4   c5   c6        (W wait)
ldf X(r1),f1    c5
mulf f0,f1,f2   c8
stf f2,Z(r1)

Reg Status
Reg   T
f0
f1    LD
f2    FP1 → FP2
r1    ALU

FU Status
FU    busy  op    R    R1   R2   T1   T2
ALU   yes   addi  r1   r1   -    -    -
LD    yes   ldf   f1   -    r1   -    ALU
ST    yes   stf   -    f2   r1   FP1  -     (f2 (FP1) is ready → issue stf)
FP1   no                                    (free: first mulf done)
FP2   yes   mulf  f2   f0   f1   -    LD    (allocate)

Page 42: Lecture 5 Intro to Dynamic Scheduling

EECS 470 Lecture 5 Slide 45

Scoreboard: Cycle 9

Insn Status
Insn            D    S    X    W
ldf X(r1),f1    c1   c2   c3   c4
mulf f0,f1,f2   c2   c4   c5+  c8
stf f2,Z(r1)    c3   c8   c9
addi r1,4,r1    c4   c5   c6   c9
ldf X(r1),f1    c5   c9
mulf f0,f1,f2   c8
stf f2,Z(r1)

Reg Status
Reg   T
f0
f1    LD
f2    FP2
r1    ALU     (r1 written → clear)

FU Status
FU    busy  op    R    R1   R2   T1   T2
ALU   no                                    (free)
LD    yes   ldf   f1   -    r1   -    ALU   (r1 (ALU) is ready → issue ldf)
ST    yes   stf   -    f2   r1   -    -
FP1   no
FP2   yes   mulf  f2   f0   f1   -    LD

D stall: structural hazard FuStatus[ST]

Page 43: Lecture 5 Intro to Dynamic Scheduling

EECS 470 Lecture 5 Slide 46

Scoreboard: Cycle 10

Insn Status
Insn            D    S    X    W
ldf X(r1),f1    c1   c2   c3   c4
mulf f0,f1,f2   c2   c4   c5+  c8
stf f2,Z(r1)    c3   c8   c9   c10
addi r1,4,r1    c4   c5   c6   c9
ldf X(r1),f1    c5   c9   c10
mulf f0,f1,f2   c8
stf f2,Z(r1)    c10

Reg Status
Reg   T
f0
f1    LD
f2    FP2
r1

FU Status
FU    busy  op    R    R1   R2   T1   T2
ALU   no
LD    yes   ldf   f1   -    r1   -    -
ST    yes   stf   -    f2   r1   FP2  -     (free, then allocate)
FP1   no
FP2   yes   mulf  f2   f0   f1   -    LD

W & structural-dependent D in same cycle

Page 44: Lecture 5 Intro to Dynamic Scheduling

EECS 470 Lecture 5 Slide 47

In-Order vs. Scoreboard

• Big speedup?
  – Only 1 cycle advantage for scoreboard
  • Why? addi WAR hazard
    • Scoreboard issued addi earlier (c8 → c5)
    • But WAR hazard delayed W until c9
    • Delayed issue of second iteration

                In-Order            Scoreboard
Insn            D     X     W       D     S     X     W
ldf X(r1),f1    c1    c2    c3      c1    c2    c3    c4
mulf f0,f1,f2   c3    c4+   c7      c2    c4    c5+   c8
stf f2,Z(r1)    c7    c8    c9      c3    c8    c9    c10
addi r1,4,r1    c8    c9    c10     c4    c5    c6    c9
ldf X(r1),f1    c10   c11   c12     c5    c9    c10   c11
mulf f0,f1,f2   c12   c13+  c16     c8    c11   c12+  c15
stf f2,Z(r1)    c16   c17   c18     c10   c15   c16   c17
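The in-order column of this comparison can be approximated with a very small timing model. The sketch assumes one insn enters D per cycle, no bypassing (a consumer's D can happen no earlier than the cycle of its producer's W), X starts the cycle after D, and the same latencies as the "Simple Scoreboard" example. The encoding and the `in_order_timing` name are hypothetical.

```python
# Hypothetical encoding of the lecture's loop: (op, dest, [sources]).
PROGRAM = [
    ("ldf",  "f1", ["r1"]),
    ("mulf", "f2", ["f0", "f1"]),
    ("stf",  None, ["f2", "r1"]),
    ("addi", "r1", ["r1"]),
    ("ldf",  "f1", ["r1"]),
    ("mulf", "f2", ["f0", "f1"]),
    ("stf",  None, ["f2", "r1"]),
]
LATENCY = {"addi": 1, "ldf": 1, "stf": 1, "mulf": 3}  # cycles spent in X


def in_order_timing(program, latency):
    """Return (D, X, W) cycles per insn for a simple in-order pipeline."""
    last_write = {}   # register -> cycle of the W that produces it
    prev_d = 0
    rows = []
    for op, dest, srcs in program:
        # D waits for the previous insn's D and for every producer's W.
        d = max([prev_d + 1] + [last_write.get(s, 0) for s in srcs])
        x = d + 1
        w = x + latency[op]
        if dest is not None:
            last_write[dest] = w
        rows.append((d, x, w))
        prev_d = d
    return rows


for (op, _, _), (d, x, w) in zip(PROGRAM, in_order_timing(PROGRAM, LATENCY)):
    print(f"{op:5} D=c{d} X=c{x} W=c{w}")
```

With this model the last stf writes back in c18, one cycle later than the scoreboard's c17, which is the 1-cycle advantage noted above.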

Page 45: Lecture 5 Intro to Dynamic Scheduling

EECS 470 Lecture 5 Slide 48

In-Order vs. Scoreboard II: Cache Miss

• Assume
  • 5 cycle cache miss on first ldf
  • Ignore FUST structural hazards
– Little relative advantage
  • addi WAR hazard (c7 → c13) stalls second iteration

                In-Order            Scoreboard
Insn            D     X     W       D     S     X     W
ldf X(r1),f1    c1    c2+   c7      c1    c2    c3+   c8
mulf f0,f1,f2   c7    c8+   c11     c2    c8    c9+   c12
stf f2,Z(r1)    c11   c12   c13     c3    c12   c13   c14
addi r1,4,r1    c12   c13   c14     c4    c5    c6    c13
ldf X(r1),f1    c14   c15   c16     c5    c13   c14   c15
mulf f0,f1,f2   c16   c17+  c20     c6    c15   c16+  c19
stf f2,Z(r1)    c20   c21   c22     c7    c19   c20   c21

Page 46: Lecture 5 Intro to Dynamic Scheduling

EECS 470 Lecture 5 Slide 49

Scoreboard Redux

• The good
  + Cheap hardware
    • InsnStatus + FuStatus + RegStatus ~ 1 FP unit in area
  + Pretty good performance
    • 1.7X for FORTRAN (scientific array) programs
• The less good
  – No bypassing
    • Is this a fundamental problem?
  – Limited scheduling scope
    • Structural/WAW hazards delay dispatch
  – Slow issue of truly-dependent (RAW) insns
    • WAR hazards delay writeback
  • Fix with hardware register renaming

Page 47: Lecture 5 Intro to Dynamic Scheduling

EECS 470 Lecture 5 Slide 50

Next Time

• Add register renaming with Tomasulo's Algorithm

• Lingering questions / feedback? I'll include an anonymous form at the end of every lecture: https://bit.ly/3oXr4Ah


That's all for today, let's take 5 and

chat about the recent news for

those who want to stay

We will be discussing sexual

assault

Page 48: Lecture 5 Intro to Dynamic Scheduling

EECS 470 Lecture 5 Slide 51

In the News

• Professor Peter Chen has been arrested pending charges of (essentially) child rape
  • Professor Chen regularly teaches Eng 100, EECS 482 (Operating Systems), is the chief advisor for the CS-Engineering undergrad program, and was interim chair of the department multiple times

• We don't know many of the details yet

• He has been placed on administrative leave pending more information

• My discussion here is not meant to confirm or deny these allegations

• This is the most recent of multiple allegations of sexual misconduct by CS faculty in the past few years


Page 49: Lecture 5 Intro to Dynamic Scheduling

EECS 470 Lecture 5 Slide 52

Impact

• However these events resolve, they can have a profoundly destructive impact on all of us • Survivors of sexual assault (or their loved ones) still feel trauma

• Trust between faculty, staff and students is strained or broken

• Take care of yourselves and each other

• Feel free to reach out to me any time


Page 50: Lecture 5 Intro to Dynamic Scheduling

EECS 470 Lecture 5 Slide 53

Resources

• CSE information on Reporting Misconduct (guidelines for reporting concerns and misconduct, including anonymously, includes a non-complete list of “responsible employees" (i.e. mandatory reporters, must report any misconduct))

• Sexual Assault Prevention and Awareness Center (SAPAC) · (Support for survivors of sexual assault, 24/7 crisis line)

• U-M Counseling and Psychological Services (CAPS) (Provides info on counseling, 24/7 crisis line)

• Office of Institutional Equity

• Other services: College of Engineering C.A.R.E. Center, Campus Mind Works, Depression Center, Services for Students with Disabilities, UHS, UM Psychiatric Emergency Services
