CSE 141 - Carro
A Pipelined CPU
The beauty of parallel operations
Review -- Single Cycle CPU
Review -- Multiple Cycle CPU
Review -- Instruction Latencies
[Figure: Single-Cycle CPU and Multiple Cycle CPU timelines over Cycle 1 to Cycle 5. A Load passes through Ifetch, Reg/Dec, Exec, Mem, Wr; an Add passes through Ifetch, Reg/Dec, Exec, Wr.]
Instruction Latencies and Throughput
[Figure: timelines of Load instructions (Ifetch, Reg/Dec, Exec, Mem, Wr) on the Single-Cycle CPU, the Multiple Cycle CPU, and the Pipelined CPU. The pipelined timeline shows four overlapping Loads across Cycle 1 to Cycle 8.]
Pipelining Advantages
Higher maximum throughput
Higher utilization of CPU resources
Ideal speedup is the number of stages in the pipeline. Do we achieve this?
What makes it easy:
But, more complicated datapath, more complex control
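The ideal-speedup claim can be checked with a little arithmetic. The sketch below is only an illustration under simplifying assumptions that are not on the slide: every instruction uses all stages, both designs run at the same clock, and the pipeline never stalls. The function names are hypothetical.

```python
def multicycle_cycles(n_instructions, stages):
    """Cycles when each instruction finishes every stage before the next starts."""
    return n_instructions * stages


def pipelined_cycles(n_instructions, stages):
    """Cycles when a new instruction enters every cycle (no stalls assumed).

    The first instruction needs `stages` cycles; each later one adds one cycle.
    """
    return stages + (n_instructions - 1)


def speedup(n_instructions, stages):
    """Ratio of multi-cycle time to pipelined time, in clock cycles."""
    return multicycle_cycles(n_instructions, stages) / pipelined_cycles(n_instructions, stages)


if __name__ == "__main__":
    for n in (1, 5, 100, 10_000):
        print(n, pipelined_cycles(n, 5), round(speedup(n, 5), 2))
```

Under these assumptions the speedup approaches the number of stages only for long instruction streams; for a single instruction there is no gain at all.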
Pipelining Advantages
CPU Design Technology    Control Logic          Peak Throughput
Single-Cycle CPU         Combinational Logic    1, but slow clock
Multiple-Cycle CPU       FSM or Microprogram    3, but fast clock
Pipelined CPU            Combinational Logic    1, with fast clock
Pipelining in Modern CPUs
CPU Datapath
Arithmetic Units
System Buses
Software (at multiple levels)
etc...
A Pipelined Datapath
IF: Instruction fetch
ID: Instruction decode and register fetch
EX: Execution and effective address calculation
MEM: Memory access
WB: Write back
Basic Question -> Basic Idea
Each HW resource has a specific task. What do we need to add to actually split the datapath into stages?
Pipelined Datapath
[Figure: pipelined datapath diagram]
A word of advice: there is a hidden problem here! Can you find it?
Corrected datapath
[Figure: corrected pipelined datapath diagram]
Graphically Representing Pipelines
[Figure: pipeline diagram. Horizontal axis: Time (in clock cycles), CC 1 to CC 6. Vertical axis: Program, starting with lw $10, 20($1). Each instruction is drawn as a row of the resources it uses: IM, Reg, DM, Reg.]
The physical components are not there; it is just a representation.
Can help with answering questions like:
how many cycles does it take to execute this code?
what is the ALU doing during cycle 4?
Use this representation to help understand datapaths.
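The two questions above can also be answered mechanically. The following is a minimal sketch, assuming one instruction enters the pipeline per cycle and nothing stalls; the helper names are hypothetical, and the second instruction in the usage example is an assumption, since only the lw survives in the slide text.

```python
STAGES = ["IM", "Reg", "ALU", "DM", "Reg"]  # resources as drawn on the slides


def schedule(program):
    """Map each instruction to {cycle: stage_index}; one new instruction per cycle."""
    return [
        {start + s: s for s in range(len(STAGES))}
        for start, _ in enumerate(program, start=1)
    ]


def total_cycles(program):
    """Cycles until the last instruction leaves the pipeline."""
    return len(program) + len(STAGES) - 1 if program else 0


def using(program, stage_index, cycle):
    """Return the instruction occupying a stage in a given cycle, or None."""
    for instr, slots in zip(program, schedule(program)):
        if slots.get(cycle) == stage_index:
            return instr
    return None


def render(program):
    """Return the diagram as text: one row per instruction, one column per cycle."""
    n = total_cycles(program)
    width = max((len(instr) for instr in program), default=0)
    rows = [" " * width + "".join(f"  CC{c:<2}" for c in range(1, n + 1))]
    for instr, slots in zip(program, schedule(program)):
        cells = ""
        for c in range(1, n + 1):
            name = STAGES[slots[c]] if c in slots else ""
            cells += f"  {name:<4}"
        rows.append(f"{instr:<{width}}{cells}")
    return "\n".join(rows)


if __name__ == "__main__":
    program = ["lw $10, 20($1)", "sub $11, $2, $3"]  # second instruction is assumed
    print(render(program))
    print("cycles:", total_cycles(program))
    print("ALU in CC4:", using(program, STAGES.index("ALU"), 4))
```

With two instructions the diagram spans six cycles, and in cycle 4 the ALU is working on the second instruction.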
Execution in a Pipelined Datapath
[Figure: five consecutive lw instructions, each drawn as IM, Reg, ALU, DM, Reg (stages IF, ID, EX, MEM, WB), starting one cycle apart over CC1 to CC9, with the steady state marked.]
Mixed Instructions in the Pipeline
[Figure: a lw (IM, Reg, ALU, DM, Reg) and an add (IM, Reg, ALU, Reg) overlapped in the pipeline over CC1 to CC6.]
Pipeline Principles
All instructions that share a pipeline must have the same stages in the same order.
[Figure: stage usage of add and sw.]
All intermediate values must be latched each cycle.
There is no functional block reuse (in the same instruction)
[Figure: one instruction drawn as IM, Reg, ALU, DM, Reg under the stages IF, ID, EX, MEM, WB.]
Because of this, the hardware resembles that of the single-cycle CPU.
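One way to see why all instructions should share the same stages in the same order: if an add were allowed to skip the memory stage, it could reach write back in the same cycle as an earlier lw. The sketch below illustrates this under the assumption of one instruction issued per cycle and a single register write port; the four-stage `FOUR` layout and the function name are hypothetical.

```python
from collections import defaultdict

FIVE = ["IF", "ID", "EX", "MEM", "WB"]
FOUR = ["IF", "ID", "EX", "WB"]  # hypothetical: add skips the MEM stage


def write_port_conflicts(program):
    """program: list of (name, stage_list), one instruction issued per cycle.

    Returns {cycle: [names]} for cycles in which more than one
    instruction is in its WB stage.
    """
    writers = defaultdict(list)
    for issue_cycle, (name, stages) in enumerate(program, start=1):
        wb_cycle = issue_cycle + stages.index("WB")
        writers[wb_cycle].append(name)
    return {c: names for c, names in writers.items() if len(names) > 1}


if __name__ == "__main__":
    print(write_port_conflicts([("lw", FIVE), ("add", FOUR)]))  # both write in cycle 5
    print(write_port_conflicts([("lw", FIVE), ("add", FIVE)]))  # no overlap
```

Forcing the add through an idle MEM stage costs it one cycle of latency but keeps the write-back cycles distinct.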
Pipelined Datapath
[Figure: pipelined datapath divided into Instruction Fetch, Instruction Decode/Register Fetch, Execute/Address Calculation, Memory Access, and Write Back. The stages are separated by the pipeline registers IF/ID, ID/EX, EX/MEM, and MEM/WB. Components shown: PC, instruction memory, adders, register file, sign extend (16 to 32 bits), shift left 2, ALU with Zero output, data memory, and multiplexers. Callout: registers!]
The Pipeline in Execution
[Figures: six successive snapshots of the pipelined datapath. A dash marks a stage drawn without an instruction.]
Snapshot  IF                ID                EX                MEM               WB
1         add $10, $1, $2   -                 -                 -                 -
2         lw $12, 1000($4)  add $10, $1, $2   -                 -                 -
3         sub $15, $4, $1   lw $12, 1000($4)  add $10, $1, $2   -                 -
4         -                 sub $15, $4, $1   lw $12, 1000($4)  add $10, $1, $2   -
5         -                 -                 sub $15, $4, $1   lw $12, 1000($4)  add $10, $1, $2
6         -                 -                 -                 sub $15, $4, $1   lw $12, 1000($4)
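The cycle-by-cycle snapshots above can be reproduced with a very small model. This is a sketch, not the actual datapath: it assumes that every pipeline register latches on each clock edge, so the instruction in each stage simply moves one stage to the right, and it tracks only which instruction occupies each stage. The function name `run` is hypothetical.

```python
STAGES = ["IF", "ID", "EX", "MEM", "WB"]


def run(program, cycles):
    """Return, for each cycle, a dict mapping stage name to instruction (or None).

    The list `pipe` stands in for the stage contents; shifting it by one
    position models all pipeline registers latching on the same clock edge.
    """
    pipe = [None] * len(STAGES)
    pending = list(program)
    trace = []
    for _ in range(cycles):
        fetched = pending.pop(0) if pending else None
        pipe = [fetched] + pipe[:-1]  # every stage hands its work to the next
        trace.append(dict(zip(STAGES, pipe)))
    return trace


if __name__ == "__main__":
    program = ["add $10, $1, $2", "lw $12, 1000($4)", "sub $15, $4, $1"]
    for cycle, stages in enumerate(run(program, 6), start=1):
        print(cycle, stages)
```

The sixth cycle of the trace shows the sub in MEM and the lw in WB, matching the last snapshot.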
Pipeline Control
[Figure: pipelined datapath annotated with control signals, including ALUOp, RegDst, ALUSrc, and MemRead.]
Pipelined Control
FSM not really appropriate (many details to remember)
Combinational Logic, at the right time!
[Figure: the instruction feeds a control block placed alongside the pipeline registers IF/ID, ID/EX, EX/MEM, and MEM/WB.]
Pipelined Control Signals
             Execution Stage Control Lines    Memory Stage Control Lines   Write Back Stage Control Lines
Instruction  RegDst  ALUOp1  ALUOp0  ALUSrc   Branch  MemRead  MemWrite    RegWrite  MemtoReg
R-Format     1       1       0       0        0       0        0           1         0
lw           0       0       0       1        0       1        0           1         1
sw           x       0       0       1        0       0        1           0         x
beq          x       0       1       0        1       0        0           0         x
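[Figure: Control derives the EX, M, and WB signal groups from the instruction in IF/ID. ID/EX carries EX, M, and WB; EX/MEM carries M and WB; MEM/WB carries WB.]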
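The control-signal table and the way its groups travel through the pipeline registers can be written down as data. This is a sketch under the assumption that all signals are generated once during decode and that each group is dropped after the stage that consumes it; the names `TABLE`, `decode`, and `carried_in` are hypothetical.

```python
EX_FIELDS = ("RegDst", "ALUOp1", "ALUOp0", "ALUSrc")
M_FIELDS = ("Branch", "MemRead", "MemWrite")
WB_FIELDS = ("RegWrite", "MemtoReg")

# Rows copied from the control-signal table; "x" means don't care.
TABLE = {
    "R-format": "1100" "000" "10",
    "lw":       "0001" "010" "11",
    "sw":       "x001" "001" "0x",
    "beq":      "x010" "100" "0x",
}


def decode(kind):
    """Split one table row into the EX, M, and WB groups produced during decode."""
    bits = TABLE[kind]
    return {
        "EX": dict(zip(EX_FIELDS, bits[0:4])),
        "M": dict(zip(M_FIELDS, bits[4:7])),
        "WB": dict(zip(WB_FIELDS, bits[7:9])),
    }


def carried_in(register, kind):
    """Control groups still stored in the given pipeline register."""
    keep = {"ID/EX": ("EX", "M", "WB"), "EX/MEM": ("M", "WB"), "MEM/WB": ("WB",)}
    groups = decode(kind)
    return {name: groups[name] for name in keep[register]}


if __name__ == "__main__":
    print(carried_in("ID/EX", "lw"))
    print(carried_in("MEM/WB", "lw"))
```

Because the signals ride along with the instruction, the control stays purely combinational: nothing has to remember which instruction is in which stage.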
The Pipeline with Control Logic
[Figure: complete pipelined datapath with control logic; visible signal labels include MemtoReg, RegWrite, and MemWrite.]
Is it really that easy?
What happens when...
add $3, $10, $11
lw $8, 1000($3)
sub $11, $8, $7
Typical problem of starting something without having finished the previous task
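The problem in this sequence is that each instruction reads a register written by the one just before it. A minimal sketch of how such read-after-write pairs could be found is given below; the tuple encoding of the instructions and the function name are assumptions made for the example.

```python
# Each entry: (text, destination register or None, source registers).
PROGRAM = [
    ("add $3, $10, $11", "$3", ("$10", "$11")),
    ("lw $8, 1000($3)", "$8", ("$3",)),
    ("sub $11, $8, $7", "$11", ("$8", "$7")),
]


def raw_dependences(program):
    """Return (producer, consumer, register) triples for read-after-write pairs."""
    found = []
    for j, (consumer, _, sources) in enumerate(program):
        for src in sources:
            # Search backwards for the most recent earlier writer of src.
            for i in range(j - 1, -1, -1):
                producer, dest, _ = program[i]
                if dest == src:
                    found.append((producer, consumer, src))
                    break
    return found


if __name__ == "__main__":
    for producer, consumer, reg in raw_dependences(PROGRAM):
        print(f"{consumer!r} reads {reg} written by {producer!r}")
```

Here the lw depends on the add through $3, and the sub depends on the lw through $8.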
The Pipeline in Execution
[Figures: three successive snapshots of the pipelined datapath. A dash marks a stage drawn without an instruction.]
Snapshot  IF                ID                EX                MEM               WB
1         lw $8, 1000($3)   add $3, $10, $11  -                 -                 -
2         sub $11, $8, $7   lw $8, 1000($3)   add $3, $10, $11  -                 -
3         add $10, $1, $2   sub $11, $8, $7   lw $8, 1000($3)   add $3, $10, $11  -
Data Hazards
When a result is needed in the pipeline before it is available.
sub $2, $1, $3
and $12, $2, $5
or $13, $6, $2
add $14, $2, $2
sw $15, 100($2)
[Figure: pipeline diagram of these five instructions (IM, Reg, ALU, DM, Reg) over CC1 to CC8, marking the cycle where R2 becomes available and the cycles where R2 is needed.]
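The "available" and "needed" cycles for $2 can be computed directly from the stage positions. The sketch below assumes one instruction enters per cycle, registers are read in ID and written in WB, and there is no forwarding or stalling. Whether a read and a write in the same cycle see the new value depends on the register file design, so that case is reported separately rather than decided. All names are hypothetical.

```python
STAGES = ["IF", "ID", "EX", "MEM", "WB"]

# Each entry: (text, destination register or None, source registers).
PROGRAM = [
    ("sub $2, $1, $3", "$2", ("$1", "$3")),
    ("and $12, $2, $5", "$12", ("$2", "$5")),
    ("or $13, $6, $2", "$13", ("$6", "$2")),
    ("add $14, $2, $2", "$14", ("$2",)),
    ("sw $15, 100($2)", None, ("$15", "$2")),
]


def cycle_of(index, stage):
    """Clock cycle (1-based) in which instruction `index` occupies `stage`."""
    return index + 1 + STAGES.index(stage)


def classify(program, producer_index=0):
    """Compare when the producer's result is written with when later readers need it."""
    _, dest, _ = program[producer_index]
    available = cycle_of(producer_index, "WB")
    report = []
    for i in range(producer_index + 1, len(program)):
        text, _, sources = program[i]
        if dest not in sources:
            continue
        needed = cycle_of(i, "ID")
        if needed < available:
            status = "hazard"
        elif needed == available:
            status = "same cycle"
        else:
            status = "ok"
        report.append((text, needed, status))
    return available, report


if __name__ == "__main__":
    available, report = classify(PROGRAM)
    print("$2 written in CC", available)
    for text, needed, status in report:
        print(f"{text:<18} needs $2 in CC{needed}: {status}")
```

Under these assumptions the and and the or read $2 too early, the add reads it in the very cycle it is written, and only the sw is safely late.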