EECE476: Computer Architecture
Lecture 17: Pipelining
Data Hazards: Forwarding & Stalls
Chapter 6.4, 6.5
The University of British Columbia EECE 476 © 2005 Guy Lemieux
2
Implementing Forwarding
• Basic pipeline data flow, no forwarding
3
Implementing Forwarding
• Basic pipelined data flow, with forwarding
[Figure: pipelined datapath with forwarding paths. Signal labels: D.Rs, D.Rt, D.Rd, SgnExt(Imm16), X.Rd, M.Rd, W.Rd]
4
Forwarding Example
• Read after Write Hazard
• Same instruction sequence as before
SUB $2,$1,$3     <- writes $2
AND $12,$2,$5    <- reads $2
OR  $13,$6,$2    <- reads $2
ADD $14,$2,$2    <- reads $2 (twice!)
SW  $15,10($2)   <- reads $2
• Let’s see the pipeline details
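The dependences in this sequence can be found mechanically. The following is a minimal Python sketch, not part of the lecture: the tuple encoding `(text, dest, sources)` and the helper name `raw_dependences` are assumptions made for illustration. For each source register it reports the most recent earlier instruction that writes it, and how many instructions apart the two are.

```python
# Hypothetical encoding: (assembly text, destination register or None, source registers).
PROGRAM = [
    ("SUB $2,$1,$3",  2,    (1, 3)),
    ("AND $12,$2,$5", 12,   (2, 5)),
    ("OR $13,$6,$2",  13,   (6, 2)),
    ("ADD $14,$2,$2", 14,   (2, 2)),
    ("SW $15,10($2)", None, (15, 2)),  # a store reads both registers and writes none
]


def raw_dependences(program):
    """Return (producer_index, consumer_index, register, distance) tuples."""
    last_writer = {}  # register number -> index of the latest instruction writing it
    deps = []
    for i, (_text, dest, srcs) in enumerate(program):
        for reg in sorted(set(srcs)):  # report a register once even if read twice
            if reg in last_writer:
                j = last_writer[reg]
                deps.append((j, i, reg, i - j))
        if dest is not None:
            last_writer[dest] = i
    return deps


for producer, consumer, reg, distance in raw_dependences(PROGRAM):
    print(f"{PROGRAM[consumer][0]:<15} reads ${reg} written by "
          f"{PROGRAM[producer][0]} ({distance} instruction(s) earlier)")
```

The distance is what decides how the value reaches the consumer in the slides that follow: from M.Out, from W.Out, through the register file in the same cycle as the write, or by an ordinary register read.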
5
Forwarding Details
[Figure: pipeline snapshot. D: SUB $2,$1,$3 with D.Rs = 1, D.Rt = 3, D.Rd = 2, reading [$1] and [$3]. X.Rd, M.Rd, W.Rd: no values yet.]
6
Forwarding Details
[Figure: pipeline snapshot. D: AND $12,$2,$5 with D.Rs = 2, D.Rt = 5, D.Rd = 12, reading [$2] and [$5]. X: SUB $2,$1,$3 with X.Rd = 2, operands [$1] and [$3]. M.Rd, W.Rd: no values yet.]
7
Forwarding Details – from X
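[Figure: pipeline snapshot. D: OR $13,$6,$2 with D.Rs = 6, D.Rt = 2, D.Rd = 13. X: AND $12,$2,$5 with X.Rd = 12, operands [$2] and [$5], with [M.Out] forwarded in place of [$2]. M: SUB $2,$1,$3 with M.Rd = 2, M.Out = SUB.Result. W.Rd: no value yet.]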
8
Forwarding Details – from M
[Figure: pipeline snapshot. D: ADD $14,$2,$2 with D.Rs = 2, D.Rt = 2, D.Rd = 14. X: OR $13,$6,$2 with X.Rd = 13, operands [$6] and [W.Out]. M: AND $12,$2,$5 with M.Rd = 12, M.Out = AND.Result. W: SUB $2,$1,$3 with W.Rd = 2, W.Out = SUB.Result. Callout: "What value is here?"]
9
Forwarding Details – from W
[Figure: pipeline snapshot. D: ADD $14,$2,$2 with D.Rs = 2, D.Rt = 2, D.Rd = 14. X: OR $13,$6,$2 with X.Rd = 13, operands [$6] and [W.Out]. M: AND $12,$2,$5 with M.Rd = 12, M.Out = AND.Result. W: SUB $2,$1,$3 with W.Rd = 2, W.Out = SUB.Result.]
Book assumes Register File “writes before it reads”, i.e., it forwards the WB result auto-magically (we must make our clocked NIOS register file do this!)
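To make the “writes before it reads” assumption concrete, here is a toy Python model. It is only a sketch: the class `RegisterFile`, its `cycle` method, and the register values are invented for illustration and say nothing about how the NIOS register file is actually built.

```python
class RegisterFile:
    """Toy 32-entry register file with one write port and two read ports."""

    def __init__(self, write_before_read=True):
        self.regs = [0] * 32
        self.write_before_read = write_before_read

    def _write(self, reg, val):
        if reg is not None and reg != 0:  # $0 is hardwired to zero in MIPS
            self.regs[reg] = val

    def cycle(self, write_reg, write_val, read_rs, read_rt):
        """One clock cycle: the W stage writes one register, the D stage reads two."""
        if self.write_before_read:
            self._write(write_reg, write_val)  # the write lands first ...
            return self.regs[read_rs], self.regs[read_rt]  # ... so D sees the new value
        values = (self.regs[read_rs], self.regs[read_rt])  # D sees the old value
        self._write(write_reg, write_val)
        return values


# SUB $2,$1,$3 is in W while ADD $14,$2,$2 is in D (values are arbitrary).
for mode in (True, False):
    rf = RegisterFile(write_before_read=mode)
    rf.regs[2] = 111  # stale $2
    print(mode, rf.cycle(write_reg=2, write_val=42, read_rs=2, read_rt=2))
```

With `write_before_read=False` the ADD would pick up the stale $2, which is why a register file that reads first needs an explicit forwarding path from W.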
10
Forwarding Details
[Figure: pipeline snapshot. D: SW $15,10($2) with D.Rs = 2, D.Rt = 15, D.Rd = ?. X: ADD $14,$2,$2 with X.Rd = 14, operands [SUB.Result] and [SUB.Result]. M: OR $13,$6,$2 with M.Rd = 13, M.Out = OR.Result. W: AND $12,$2,$5 with W.Rd = 12, W.Out = AND.Result. Callout: "What value is here?"]
11
New Rules on Forwarding Arrows – Forwarding from W to X
• Forwarding from X or M to X still OK
  – Please draw arrows according to rules in previous lecture
• Forwarding from W to X no longer necessary
  – NO RED ARROWS from W to X
  – Assume W to REGFILE occurs before the read needed by D
  – This effectively accomplishes the forwarding from W to X
  – Example: the ADD $14,$2,$2 case just shown
12
New Forwarding
• Observations
  – RED arrows indicate forwarding
  – Arrow always starts at X, M, or W stage
    • This is the source of the newer data
    • Source is always the direct output of the pipeline register
  – Arrow always ends at X stage
    • This is the destination of the forwarded data
    • Destination must choose the forwarded source: X, M, or W
  – Arrow always spans ONLY 1 clock cycle in time
    • From cycle i to cycle i+1
  – Arrow may span 1, 2, or 3 instructions (stages of pipeline)
New: assume REGFILE does W before R
RED: shows changes from lecture 16
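The choice the X stage has to make can be sketched as a small selection function. This is an illustrative Python sketch under assumptions, not the datapath from the slides: `forward_select` is a made-up name, `None` stands for "the instruction in that stage writes no register", and the labels `M.Out` and `W.Out` follow the forwarding-detail figures earlier in the lecture.

```python
def forward_select(src_reg, m_rd, w_rd):
    """Pick the source of one X-stage operand.

    src_reg: register number the instruction in X wants to read
    m_rd:    destination register of the instruction now in M (None if it writes nothing)
    w_rd:    destination register of the instruction now in W (None if it writes nothing)
    """
    if src_reg == 0:
        return "REGFILE"  # $0 always reads as zero, never forwarded
    if m_rd == src_reg:
        return "M.Out"    # newest value: produced by X one cycle ago
    if w_rd == src_reg:
        return "W.Out"    # produced two instructions earlier
    return "REGFILE"      # value read in D (register file writes before it reads)


# AND $12,$2,$5 in X while SUB $2,$1,$3 is in M
print(forward_select(2, m_rd=2, w_rd=None), forward_select(5, m_rd=2, w_rd=None))
# OR $13,$6,$2 in X while AND is in M and SUB is in W
print(forward_select(6, m_rd=12, w_rd=2), forward_select(2, m_rd=12, w_rd=2))
# ADD $14,$2,$2 in X while OR is in M and AND is in W
print(forward_select(2, m_rd=13, w_rd=12))
```

M is checked before W so that, when both stages hold a write to the same register, the more recent result wins.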
13
Forwarding Again
• Read after Write Hazard
LW  $2,10($1)    <- writes $2
AND $12,$2,$5    <- reads $2
OR  $13,$6,$2    <- reads $2
ADD $14,$2,$2    <- reads $2 (twice!)
SW  $15,10($2)   <- reads $2
• Difference from previous example
  – Prev. example: value of $2 was ready after “X” stage
  – This example: value of $2 is ready after “M” stage
• Let’s see what must happen to operate correctly!
14
Forwarding with “lw”
• Example: Clock cycle 1
Clock cycle:    1
LW $2,10($1)    I  D  X  M  W
AND $12,$2,$5      I  D  X  M  W
OR $13,$6,$2          I  D  X  M  W
ADD $14,$2,$2            I  D  X  M  W
SW $15,10($2)               I  D  X  M  W
(unlabeled rows for later instructions omitted)
Pipeline stages: I D X M W
15
Forwarding with “lw”
• Clock cycle 2
Clock cycle:    1  2
LW $2,10($1)    I  D  X  M  W
AND $12,$2,$5      I  D  X  M  W
OR $13,$6,$2          I  D  X  M  W
ADD $14,$2,$2            I  D  X  M  W
SW $15,10($2)               I  D  X  M  W
(unlabeled rows for later instructions omitted)
Pipeline stages: I D X M W
16
Forwarding with “lw”
• Clock cycle 3: can’t forward, wait for M!
Clock cycle:    1  2  3
LW $2,10($1)    I  D  X  M  W
AND $12,$2,$5      I  D  X  M  W
OR $13,$6,$2          I  D  X  M  W
ADD $14,$2,$2            I  D  X  M  W
SW $15,10($2)               I  D  X  M  W
(unlabeled rows for later instructions omitted)
Pipeline stages: I D X M W
17
Forwarding with “lw”
• Clock cycle 4: insert “bubble” in X, forward M
Clock cycle:    1  2  3  4
LW $2,10($1)    I  D  X  M  W
? (bubble)         I  D  X  M  W
AND $12,$2,$5      I  D  D  X  M  W
OR $13,$6,$2          I  I  D  X  M  W
ADD $14,$2,$2               I  D  X  M  W
SW $15,10($2)                  I  D  X  M  W
(unlabeled rows for later instructions omitted)
Pipeline stages: I D ? M W (bubble in X)
18
Forwarding with “lw”
• Clock cycle 5: forward W (automagic by RegFile)
Clock cycle:    1  2  3  4  5
LW $2,10($1)    I  D  X  M  W
? (bubble)         I  D  ?  ?  W
AND $12,$2,$5      I  D  D  X  M  W
OR $13,$6,$2          I  I  D  X  M  W
ADD $14,$2,$2               I  D  X  M  W
SW $15,10($2)                  I  D  X  M  W
(unlabeled rows for later instructions omitted)
Pipeline stages: I D X ? W (bubble in M)
19
Forwarding with “lw”
• Clock cycle 6
Clock cycle:    1  2  3  4  5  6
LW $2,10($1)    I  D  X  M  W
? (bubble)         I  D  ?  ?  W
AND $12,$2,$5      I  D  D  X  M  W
OR $13,$6,$2          I  I  D  X  M  W
ADD $14,$2,$2               I  D  X  M  W
SW $15,10($2)                  I  D  X  M  W
(unlabeled rows for later instructions omitted)
Pipeline stages: I D X M ? (bubble in W)
20
Forwarding with “lw”
• Clock cycle 7 (forwarding shown by arrows)
Clock cycle:    1  2  3  4  5  6  7
LW $2,10($1)    I  D  X  M  W
(bubble)           I  D  ?  ?  ?
AND $12,$2,$5      I  D  D  X  M  W
OR $13,$6,$2          I  I  D  X  M  W
ADD $14,$2,$2               I  D  X  M  W
SW $15,10($2)                  I  D  X  M  W
(unlabeled rows for later instructions omitted)
Pipeline stages: I D X M W
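The timing in these diagrams can be reproduced with a few lines of Python. This is a rough sketch under stated assumptions, not a model of real hardware: a five-stage in-order pipeline with full forwarding, where the only stall is the one-cycle wait after a load. The tuple encoding and the names `schedule` and `render` are invented for illustration.

```python
# Hypothetical encoding: (text, destination or None, source registers, is_load).
LW_PROGRAM = [
    ("LW $2,10($1)",  2,    (1,),    True),
    ("AND $12,$2,$5", 12,   (2, 5),  False),
    ("OR $13,$6,$2",  13,   (6, 2),  False),
    ("ADD $14,$2,$2", 14,   (2, 2),  False),
    ("SW $15,10($2)", None, (15, 2), False),
]


def schedule(program):
    """Return (text, first_cycle, stage_string) per instruction, e.g. 'IDDXMW'."""
    last_writer = {}                 # register -> index of latest writer
    i_start, d_start, x = [], [], []  # cycle in which each stage is entered
    rows = []
    for idx, (text, dest, srcs, _is_load) in enumerate(program):
        # An instruction moves up as soon as the one ahead of it vacates the stage.
        i_start.append(1 if idx == 0 else d_start[idx - 1])
        d_start.append(2 if idx == 0 else x[idx - 1])
        enter_x = d_start[idx] + 1
        for reg in srcs:
            j = last_writer.get(reg)
            if j is not None and program[j][3]:
                # Load data leaves M one cycle after X, so X can start at x[j] + 2.
                enter_x = max(enter_x, x[j] + 2)
        x.append(enter_x)
        if dest is not None:
            last_writer[dest] = idx
        stages = ("I" * (d_start[idx] - i_start[idx])
                  + "D" * (x[idx] - d_start[idx]) + "XMW")
        rows.append((text, i_start[idx], stages))
    return rows


def render(rows):
    """Format rows as a cycle-aligned pipeline diagram (3 characters per cycle)."""
    return "\n".join(f"{text:<16}" + "   " * (start - 1) + "  ".join(stages)
                     for text, start, stages in rows)


print(render(schedule(LW_PROGRAM)))
```

The AND row comes out with a repeated D and the OR row with a repeated I, matching the stalled diagram above; the bubble itself is not drawn as a separate row.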
21
Summary: Forwarding with “lw”
• Called “delayed load” or “load-use delay” or “load delay slot”
  – Impossible to get result after “X”
  – Must wait for M stage
  – May need to wait >1 cycle in deep pipelines
  – Causes a Data Hazard which cannot be resolved with forwarding
• Hardware solutions (our textbook MIPS does this!)
  – Formally: pipeline stall, sometimes interlock. Informally: bubble
  – Need to detect dependence: “Hazard Detection Unit”
  – Some CPUs do not do this (e.g., very, very early MIPS)
  – Software help:
    • To reduce performance loss, don’t place a dependent instruction (RAW) right after “lw”
• Lazy solution (don’t do this!)
  – No Hazard Detection Unit
  – Require software help (e.g., insert NOP)
  – Sometimes called a “feature”
  – Not considered a “good” solution anymore
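For the “lazy solution”, the software help amounts to a pass over the instruction stream that puts a NOP after any load whose result is used by the very next instruction. A minimal Python sketch, assuming a single load delay slot as in the five-stage pipeline of this lecture; the tuple encoding and the name `insert_load_delay_nops` are hypothetical.

```python
# Hypothetical encoding: (text, destination or None, source registers, is_load).
NOP = ("NOP", None, (), False)

LW_PROGRAM = [
    ("LW $2,10($1)",  2,    (1,),    True),
    ("AND $12,$2,$5", 12,   (2, 5),  False),
    ("OR $13,$6,$2",  13,   (6, 2),  False),
    ("ADD $14,$2,$2", 14,   (2, 2),  False),
    ("SW $15,10($2)", None, (15, 2), False),
]


def insert_load_delay_nops(program):
    """Insert a NOP wherever an instruction reads the result of the load just before it."""
    out = []
    for instr in program:
        if out:
            _text, prev_dest, _srcs, prev_is_load = out[-1]
            if prev_is_load and prev_dest is not None and prev_dest in instr[2]:
                out.append(NOP)  # fill the load delay slot
        out.append(instr)
    return out


for text, *_ in insert_load_delay_nops(LW_PROGRAM):
    print(text)
```

A compiler would rather move an independent instruction into the slot than waste it on a NOP; this sketch only shows the fallback.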
22
Hazard Detection Unit
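The load-use check such a unit performs can be written as one condition. The sketch below uses this lecture's stage names (X.Rd for the destination of the instruction in X, D.Rs and D.Rt for the sources of the instruction in D). The function name `must_stall` is an assumption, and like the usual textbook condition it compares both source fields even for instructions that do not actually read Rt.

```python
def must_stall(x_is_load, x_rd, d_rs, d_rt):
    """Load-use hazard check: stall D (and I) for one cycle and send a bubble into X.

    x_is_load: True if the instruction currently in X is a load
    x_rd:      destination register of the instruction in X
    d_rs,d_rt: source register fields of the instruction in D
    """
    if not x_is_load or x_rd == 0:  # writes to $0 never create a dependence
        return False
    return x_rd in (d_rs, d_rt)


# LW $2,10($1) in X, AND $12,$2,$5 in D: the load data is not ready yet
print(must_stall(True, x_rd=2, d_rs=2, d_rt=5))
# SUB $2,$1,$3 in X, AND $12,$2,$5 in D: plain forwarding is enough
print(must_stall(False, x_rd=2, d_rs=2, d_rt=5))
```

When the check fires, the instruction in D is held for a cycle, the fetch stage is held behind it, and the X stage receives a bubble.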
23
Control Hazards
• Next lecture!