EECC550 - Shaaban, Lec # 5, Spring 2001, 13-28-2001
CPU Design Steps
1. Analyze instruction set operations using independent RTN => datapath requirements.
2. Select set of datapath components & establish clock methodology.
3. Assemble datapath meeting the requirements.
4. Analyze implementation of each instruction to determine setting of control points that effects the register transfer.
5. Assemble the control logic.
CPU Design & Implementation Process
• Bottom-up Design:
– Assemble components in target technology to establish critical timing.
• Top-down Design:
– Specify component behavior from high-level requirements.
• Iterative refinement:
– Establish a partial solution, expand and improve.
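[Figure: design hierarchy. Instruction Set Architecture => processor, split into datapath and control; these are built from Reg. File, Mux, ALU, Reg, Mem, Decoder and Sequencer, which in turn are built from Cells and Gates.]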
Single Cycle MIPS Datapath: CPI = 1, Long Clock Cycle
[Figure: single cycle MIPS datapath. Components: PC, Inst Memory, two Adders, PC Ext, a register file of 32 32-bit registers (ports Rw, Ra, Rb; buses busW, busA, busB), Extender, ALU, Data Memory and several muxes. Instruction<31:0> fields: Rs <21:25>, Rt <16:20>, Rd <11:15>, Imm16 <0:15>. Control signals: nPC_sel, RegDst, RegWr, ExtOp, ALUSrc, ALUctr, MemWr, MemtoReg. Condition signal: Equal.]
Drawback of Single Cycle Processor
• Long cycle time.
• All instructions must take as much time as the slowest:
– Cycle time for load is longer than needed for all other instructions.
• Real memory is not as well-behaved as idealized memory:
– Cannot always complete data access in one (short) cycle.
Abstract View of Single Cycle CPU
[Figure: abstract view of the single cycle CPU. Datapath blocks: PC, Next PC, Instruction Fetch, Register Fetch, Ext, ALU, Mem Access, Data Mem, Reg. Wrt, Result Store. Main Control and ALU control take op and fun as inputs and drive ALUctr, RegDst, ALUSrc, ExtOp, MemWr, MemRd, RegWr and nPC_sel; Equal comes back from the datapath.]
Single Cycle Instruction Timing
Arithmetic & Logical:  PC | Inst Memory | Reg File | mux | ALU | mux | setup
Load (Critical Path):  PC | Inst Memory | Reg File | mux | ALU | Data Mem | mux | setup
Store:                 PC | Inst Memory | Reg File | mux | ALU | Data Mem
Branch:                PC | Inst Memory | Reg File | cmp | mux
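The timing chart can be turned into a small calculation. The sketch below sums the component delays along each instruction's path and picks the longest one. The delay values in `DELAY_NS` are hypothetical (the slide gives no numbers), and the component order in `PATHS` is a reading of the timing chart.

```python
# Hypothetical component delays in ns; these are assumptions, not slide data.
DELAY_NS = {
    "PC": 0.5, "Inst Memory": 2.0, "Reg File": 1.0, "mux": 0.25,
    "ALU": 2.0, "cmp": 1.0, "Data Mem": 2.0, "setup": 0.25,
}

# Components each instruction class passes through in one cycle.
PATHS = {
    "Arithmetic & Logical": ["PC", "Inst Memory", "Reg File", "mux", "ALU", "mux", "setup"],
    "Load": ["PC", "Inst Memory", "Reg File", "mux", "ALU", "Data Mem", "mux", "setup"],
    "Store": ["PC", "Inst Memory", "Reg File", "mux", "ALU", "Data Mem"],
    "Branch": ["PC", "Inst Memory", "Reg File", "cmp", "mux"],
}


def path_delay(path):
    """Total delay of one path: the sum of its component delays."""
    return sum(DELAY_NS[component] for component in path)


def critical_path(paths):
    """Return (instruction class, delay) for the slowest path."""
    name = max(paths, key=lambda n: path_delay(paths[n]))
    return name, path_delay(paths[name])


if __name__ == "__main__":
    for name, path in PATHS.items():
        print(f"{name}: {path_delay(path)} ns")
    print("critical path:", critical_path(PATHS))
```

With these assumed delays the load path is the longest, so it would set the single cycle clock period for every instruction.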
Reducing Cycle Time: Multi-Cycle Design
• Cut combinational dependency graph by inserting registers / latches.
• The same work is done in two or more fast cycles, rather than one slow cycle.
[Figure: storage element, Acyclic Combinational Logic, storage element => storage element, Acyclic Combinational Logic (A), storage element, Acyclic Combinational Logic (B), storage element.]
Clock Cycle Time & Critical Path
• Critical path: the slowest path between any two storage devices.
• Cycle time is a function of the critical path.
• Cycle time must be greater than:
– Clock-to-Q + Longest Path through the Combinational Logic + Setup
[Figure: clocked storage elements (Clk) separated by combinational logic.]
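The cycle time constraint above can be written as a short check. This is a minimal sketch; the function names and the numeric values in the usage section are illustrative assumptions, not figures from the lecture.

```python
def min_cycle_time(clk_to_q, longest_comb_path, setup):
    """Lower bound on the clock period: Clock-to-Q + longest combinational path + Setup."""
    return clk_to_q + longest_comb_path + setup


def meets_timing(cycle_time, clk_to_q, comb_paths, setup):
    """True if cycle_time is strictly greater than the bound set by the slowest path."""
    return cycle_time > min_cycle_time(clk_to_q, max(comb_paths), setup)


if __name__ == "__main__":
    # Assumed delays in ns, chosen only for illustration.
    paths = [3.0, 7.0, 5.5]
    print(min_cycle_time(0.5, max(paths), 0.5))
    print(meets_timing(8.5, 0.5, paths, 0.5))
```

Only the slowest path matters: shortening any other path leaves the bound unchanged.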
Instruction Processing Cycles
• Instruction Fetch: Obtain instruction from program storage.
• Instruction Decode: Determine instruction type. Obtain operands from registers.
• Execute: Compute result value or status.
• Result Store: Store result in register/memory if needed (usually called Write Back).
• Next Instruction: Update program counter to address of next instruction.
(A brace on the slide marks the "Common steps for all instructions".)
Partitioning The Single Cycle Datapath
Add registers between smallest steps.
[Figure: the single cycle datapath partitioned into blocks: PC, Next PC, Instruction Fetch, Operand Fetch, Exec, Mem Access, Data Mem, Reg. File, Result Store. Control signals: nPC_sel, ExtOp, ALUSrc, ALUctr, MemRd, MemWr, RegDst, RegWr.]
Example Multi-cycle Datapath
[Figure: multi-cycle datapath with blocks PC, Next PC, Instruction Fetch, Operand Fetch, Ext, ALU, Mem Access, Data Mem, Reg. File, Result Store, and added registers IR, A, B, R, M. Control signals: nPC_sel, ExtOp, ALUSrc, ALUctr, MemRd, MemWr, MemToReg, RegDst, RegWr. Condition signal: Equal.]
Registers added:
• IR: Instruction register.
• A, B: Two registers to hold operands read from register file.
• R: or ALUOut, holds the output of the ALU.
• M: or Memory data register (MDR), to hold data read from data memory.
Operations In Each Cycle
R-Type
  Instruction Fetch (IF):   IR ← Mem[PC]
  Instruction Decode (ID):  A ← R[rs]; B ← R[rt]
  Execution (EX):           R ← A + B
  Write Back (WB):          R[rd] ← R; PC ← PC + 4
Logic Immediate
  Instruction Fetch (IF):   IR ← Mem[PC]
  Instruction Decode (ID):  A ← R[rs]
  Execution (EX):           R ← A OR ZeroExt[imm16]
  Write Back (WB):          R[rt] ← R; PC ← PC + 4
Load
  Instruction Fetch (IF):   IR ← Mem[PC]
  Instruction Decode (ID):  A ← R[rs]
  Execution (EX):           R ← A + SignExt(imm16)
  Memory (MEM):             M ← Mem[R]
  Write Back (WB):          R[rt] ← M; PC ← PC + 4
Store
  Instruction Fetch (IF):   IR ← Mem[PC]
  Instruction Decode (ID):  A ← R[rs]; B ← R[rt]
  Execution (EX):           R ← A + SignExt(imm16)
  Memory (MEM):             Mem[R] ← B; PC ← PC + 4
Branch
  Instruction Fetch (IF):   IR ← Mem[PC]
  Instruction Decode (ID):  A ← R[rs]; B ← R[rt]
  Execution (EX):           If Equal = 1: PC ← PC + 4 + (SignExt(imm16) x 4), else: PC ← PC + 4
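As an illustration of these register transfers, the sketch below steps a load through its five cycles using plain Python dictionaries for the register file and the memories. The pre-decoded instruction format (a dict with `rs`, `rt`, `imm16`) and the function names are assumptions made for the example; the loaded word is written to `rt`, the destination field of an I-format load.

```python
def sign_extend16(imm16):
    """Interpret the low 16 bits of imm16 as a two's complement value."""
    imm16 &= 0xFFFF
    return imm16 - 0x10000 if imm16 & 0x8000 else imm16


def run_load(pc, regs, imem, dmem):
    """Step one load through IF, ID, EX, MEM, WB; return (new_pc, trace)."""
    trace = []
    ir = imem[pc]                              # IF:  IR <- Mem[PC]
    trace.append("IF")
    a = regs[ir["rs"]]                         # ID:  A <- R[rs]
    trace.append("ID")
    r = a + sign_extend16(ir["imm16"])         # EX:  R <- A + SignExt(imm16)
    trace.append("EX")
    m = dmem[r]                                # MEM: M <- Mem[R]
    trace.append("MEM")
    regs[ir["rt"]] = m                         # WB:  R[rt] <- M
    pc += 4                                    #      PC <- PC + 4
    trace.append("WB")
    return pc, trace


if __name__ == "__main__":
    regs = {1: 100, 2: 0}
    imem = {0: {"rs": 1, "rt": 2, "imm16": 0xFFFC}}   # offset of -4
    dmem = {96: 42}
    print(run_load(0, regs, imem, dmem), regs)
```

Each temporary (`ir`, `a`, `r`, `m`) plays the role of one of the added registers IR, A, R and M, holding a value from one cycle to the next.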
MIPS Multi-Cycle Datapath: Five Cycles of Load
        Cycle 1   Cycle 2   Cycle 3   Cycle 4   Cycle 5
Load    IF        ID        EX        MEM       WB
1- Instruction Fetch (IF): Fetch the instruction from the Instruction Memory.
2- Instruction Decode (ID): Registers Fetch and Instruction Decode.
3- Execute (EX): Calculate the effective memory address.
4- Memory (MEM): Read the data from the Data Memory.
5- Write Back (WB): Write the data back to the register file. Update PC.
Single Cycle Vs. Multi-Cycle CPU
Single Cycle Implementation (8 ns clock cycle):
  Cycle 1: Load
  Cycle 2: Store, with the rest of the cycle marked as Waste
Multiple Cycle Implementation (2 ns clock cycle):
  Cycles 1-5: Load (IF ID EX MEM WB)
  Cycles 6-9: Store (IF ID EX MEM)
  Cycle 10:   R-type (IF)
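The comparison can be checked with a few lines of arithmetic. This sketch assumes the 8 ns and 2 ns clock periods read from the timing diagram, and the per-instruction cycle counts used elsewhere in this lecture (load 5, store 4, R-type 4); the function names are illustrative.

```python
SINGLE_CYCLE_NS = 8   # clock period of the single cycle implementation
MULTI_CYCLE_NS = 2    # clock period of the multiple cycle implementation

# Clock cycles each instruction class needs in the multi-cycle design.
CYCLES = {"Load": 5, "Store": 4, "R-type": 4}


def single_cycle_time(program):
    """Every instruction takes exactly one long cycle."""
    return len(program) * SINGLE_CYCLE_NS


def multi_cycle_time(program):
    """Each instruction takes only as many short cycles as it needs."""
    return sum(CYCLES[instr] for instr in program) * MULTI_CYCLE_NS


if __name__ == "__main__":
    program = ["Load", "Store", "R-type"]
    print("single cycle:", single_cycle_time(program), "ns")
    print("multi-cycle: ", multi_cycle_time(program), "ns")
```

For this three-instruction sequence the totals are 24 ns and 26 ns, so the benefit of the multi-cycle design depends on the instruction mix and on how evenly the work divides across the short cycles.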
Finite State Machine (FSM) Control Model
• State specifies control points for Register Transfer.
• Transfer occurs upon exiting state (same falling edge).
[Figure: State X, annotated with its Register Transfer and Control Points; the next state depends on input. The controller consists of a Control State register, Next State Logic fed by inputs (conditions), and Output Logic producing outputs (control points).]
Control Specification For Multi-cycle CPU: Finite State Machine (FSM)
"instruction fetch":        IR ← MEM[PC]
"decode / operand fetch":   A ← R[rs]; B ← R[rt]
Then, by instruction (Execute, Memory, Write-back):
R-type:        R ← A fun B, then R[rd] ← R; PC ← PC + 4
ORi:           R ← A or ZX, then R[rt] ← R; PC ← PC + 4
LW:            R ← A + SX, then M ← MEM[R], then R[rt] ← M; PC ← PC + 4
SW:            R ← A + SX, then MEM[R] ← B; PC ← PC + 4
BEQ & Equal:   PC ← PC + SX || 00
BEQ & ~Equal:  PC ← PC + 4
Every sequence ends by returning to instruction fetch.
Traditional FSM Controller
[Figure: a State register feeds a Truth or Transition Table with columns state, op, cond, next state, control points. Inputs op and Equal come from the datapath; nextState returns to the State register; the control points go to the datapath. Bus widths shown in the figure: 6, 4, 11.]
Traditional FSM Controller
datapath + state diagram => control
• Translate RTN statements into control points.
• Assign states.
• Implement the controller.
Mapping RTNs To Control Points Examples & State Assignments
State  Instruction    Register transfer              Control points (examples)
0000   (all)          IR ← MEM[PC]                   imem_rd, IRen      "instruction fetch"
0001   (all)          A ← R[rs]; B ← R[rt]           Aen, Ben           "decode / operand fetch"
0100   R-type         R ← A fun B                    ALUfun, Sen
0101   R-type         R[rd] ← R; PC ← PC + 4         RegDst, RegWr, PCen
0110   ORi            R ← A or ZX
0111   ORi            R[rt] ← R; PC ← PC + 4
1000   LW             R ← A + SX
1001   LW             M ← MEM[R]
1010   LW             R[rt] ← M; PC ← PC + 4
1011   SW             R ← A + SX
1100   SW             MEM[R] ← B; PC ← PC + 4
0011   BEQ & ~Equal   PC ← PC + 4
0010   BEQ & Equal    PC ← PC + SX || 00
The last state of every instruction returns to instruction fetch, state 0000.
Detailed Control Specification
Control point column groups: IR, PC, Ops, Exec, Mem, Write-Back
Control point columns: en sel | A B | Ex Sr ALU S | R W M | M-R Wr Dst
The last column lists the non-blank control values of each row from left to right; the blank cells of the original table are not shown.
Instr  State  Op field  Eq  Next  Control values
       0000   ??????    ?   0001  1
       0001   BEQ       0   0011  1 1
       0001   BEQ       1   0010  1 1
       0001   R-type    x   0100  1 1
       0001   orI       x   0110  1 1
       0001   LW        x   1000  1 1
       0001   SW        x   1011  1 1
BEQ    0010   xxxxxx    x   0000  1 1
BEQ    0011   xxxxxx    x   0000  1 0
R      0100   xxxxxx    x   0101  0 1 fun 1
R      0101   xxxxxx    x   0000  1 0 0 1 1
ORI    0110   xxxxxx    x   0111  0 0 or 1
ORI    0111   xxxxxx    x   0000  1 0 0 1 0
LW     1000   xxxxxx    x   1001  1 0 add 1
LW     1001   xxxxxx    x   1010  1 0 0
LW     1010   xxxxxx    x   0000  1 0 1 1 0
SW     1011   xxxxxx    x   1100  1 0 add 1
SW     1100   xxxxxx    x   0000  1 0 0 1
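The State, Op field, Eq and Next columns of this specification amount to a next-state function. The sketch below encodes them in Python as a minimal illustration; the dictionary and function names are assumptions, and the control point outputs are left out.

```python
# Dispatch from the decode state 0001, keyed by instruction type.
DISPATCH = {"R-type": "0100", "ORi": "0110", "LW": "1000", "SW": "1011"}

# States with a single successor regardless of op or Equal.
SEQUENTIAL = {
    "0000": "0001",
    "0100": "0101",
    "0110": "0111",
    "1000": "1001",
    "1001": "1010",
    "1011": "1100",
}


def next_state(state, op=None, equal=False):
    """Next-state logic: only state 0001 looks at op and Equal."""
    if state == "0001":
        if op == "BEQ":
            return "0010" if equal else "0011"
        return DISPATCH[op]
    # Final states of each instruction (and any unlisted state) return to fetch.
    return SEQUENTIAL.get(state, "0000")


def states_for(op, equal=False):
    """List the states one instruction visits, starting at fetch."""
    state, visited = "0000", []
    while True:
        visited.append(state)
        state = next_state(state, op, equal)
        if state == "0000":
            return visited


if __name__ == "__main__":
    for op in ("R-type", "ORi", "LW", "SW", "BEQ"):
        print(op, states_for(op))
```

The length of each list returned by `states_for` is the number of clock cycles that instruction takes under this state assignment.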
Alternative Multiple Cycle Datapath (In Textbook)
• Minimizes Hardware: 1 memory, 1 adder
[Figure: textbook multi-cycle datapath. Components: PC, Ideal Memory (RAdr, WrAdr, Din, Dout), Instruction Reg, Reg File (Ra, Rb, Rw, busA, busB, busW), Extend, << 2, ALU, ALU Control, ALU Out, Target, and several muxes. Control signals: PCWr, PCWrCond, PCSrc, BrWr, IorD, MemWr, IRWr, RegDst, RegWr, ALUSelA, ALUSelB, ALUOp, ExtOp, MemtoReg. Condition signal: Zero.]
Alternative Multiple Cycle Datapath (In Textbook)
• Shared instruction/data memory unit.
• A single ALU shared among instructions.
• Shared units require additional or widened multiplexors.
• Temporary registers to hold data between clock cycles of the instruction:
– Additional registers: Instruction Register (IR), Memory Data Register (MDR), A, B, ALUOut
Operations In Each Cycle
R-Type
  Instruction Fetch (IF):   IR ← Mem[PC]; PC ← PC + 4
  Instruction Decode (ID):  A ← R[rs]; B ← R[rt]; ALUout ← PC + (SignExt(imm16) x 4)
  Execution (EX):           ALUout ← A + B
  Write Back (WB):          R[rd] ← ALUout
Logic Immediate
  Instruction Fetch (IF):   IR ← Mem[PC]; PC ← PC + 4
  Instruction Decode (ID):  A ← R[rs]; B ← R[rt]; ALUout ← PC + (SignExt(imm16) x 4)
  Execution (EX):           ALUout ← A OR ZeroExt[imm16]
  Write Back (WB):          R[rt] ← ALUout
Load
  Instruction Fetch (IF):   IR ← Mem[PC]; PC ← PC + 4
  Instruction Decode (ID):  A ← R[rs]; B ← R[rt]; ALUout ← PC + (SignExt(imm16) x 4)
  Execution (EX):           ALUout ← A + SignExt(imm16)
  Memory (MEM):             M ← Mem[ALUout]
  Write Back (WB):          R[rt] ← M
Store
  Instruction Fetch (IF):   IR ← Mem[PC]; PC ← PC + 4
  Instruction Decode (ID):  A ← R[rs]; B ← R[rt]; ALUout ← PC + (SignExt(imm16) x 4)
  Execution (EX):           ALUout ← A + SignExt(imm16)
  Memory (MEM):             Mem[ALUout] ← B
Branch
  Instruction Fetch (IF):   IR ← Mem[PC]; PC ← PC + 4
  Instruction Decode (ID):  A ← R[rs]; B ← R[rt]; ALUout ← PC + (SignExt(imm16) x 4)
  Execution (EX):           If Equal = 1: PC ← ALUout
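In this datapath the branch target is computed speculatively during decode, after the PC has already been incremented, and is only committed if the comparison succeeds. The sketch below illustrates that arithmetic; the function names and the 32-bit wrap-around are assumptions for the example.

```python
def sign_extend16(imm16):
    """Interpret the low 16 bits of imm16 as a two's complement value."""
    imm16 &= 0xFFFF
    return imm16 - 0x10000 if imm16 & 0x8000 else imm16


def branch_target(pc_plus_4, imm16):
    """Decode step: ALUout <- PC + (SignExt(imm16) x 4), with PC already PC + 4."""
    return (pc_plus_4 + (sign_extend16(imm16) << 2)) & 0xFFFFFFFF


def next_pc(pc_plus_4, imm16, equal):
    """Execute step of a branch: PC <- ALUout only if Equal = 1."""
    aluout = branch_target(pc_plus_4, imm16)
    return aluout if equal else pc_plus_4


if __name__ == "__main__":
    print(hex(next_pc(0x1004, 3, True)))        # forward branch taken
    print(hex(next_pc(0x1004, 0xFFFF, True)))   # backward branch (offset -1) taken
    print(hex(next_pc(0x1004, 3, False)))       # branch not taken
```

Computing the target early costs nothing for non-branch instructions, since ALUout is simply overwritten in their execute cycle.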
High-Level View of Finite State Machine Control
• First steps are independent of the instruction class.
• Then a series of sequences that depend on the instruction opcode.
• Then the control returns to fetch a new instruction.
• Each box in the diagram represents one or several states.
Instruction Fetch and Decode FSM States
[Figure: state diagram.]
Load/Store Instructions FSM States
[Figure: state diagram.]
R-Type Instructions FSM States
[Figure: state diagram.]
Jump Instruction Single State
Branch Instruction Single State
[Figure: state diagrams.]
Finite State Machine (FSM) Specification
State  Instruction  Register transfer
0000   (all)        IR ← MEM[PC]; PC ← PC + 4                   "instruction fetch"
0001   (all)        A ← R[rs]; B ← R[rt]; ALUout ← PC + SX      "decode"
0100   R-type       ALUout ← A fun B
0101   R-type       R[rd] ← ALUout
0110   ORi          ALUout ← A op ZX
0111   ORi          R[rt] ← ALUout
1000   LW           ALUout ← A + SX
1001   LW           M ← MEM[ALUout]
1010   LW           R[rt] ← M
1011   SW           ALUout ← A + SX
1100   SW           MEM[ALUout] ← B
0010   BEQ          If A = B then PC ← ALUout
The last state of every instruction returns to instruction fetch.
MIPS Multi-cycle Datapath Performance Evaluation
• What is the average CPI?
– State diagram gives CPI for each instruction type.
– Workload below gives frequency of each type.
Type          CPIi for type   Frequency   CPIi x Freqi
Arith/Logic   4               40%         1.6
Load          5               30%         1.5
Store         4               10%         0.4
Branch        3               20%         0.6
Average CPI: 4.1
Better than CPI = 5 if all instructions took the same number of clock cycles (5).
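The average CPI is the frequency-weighted sum of the per-type CPIs. A minimal Python sketch of the calculation, using the workload figures from the table (the variable and function names are illustrative):

```python
# (instruction type, CPI for that type, frequency in percent)
WORKLOAD = [
    ("Arith/Logic", 4, 40),
    ("Load", 5, 30),
    ("Store", 4, 10),
    ("Branch", 3, 20),
]


def average_cpi(workload):
    """Weighted average: sum(CPI_i x Freq_i) divided by the total frequency."""
    total_pct = sum(pct for _, _, pct in workload)
    weighted = sum(cpi * pct for _, cpi, pct in workload)
    return weighted / total_pct


if __name__ == "__main__":
    print(f"Average CPI: {average_cpi(WORKLOAD):.1f}")
```

Working in integer percentages and dividing once at the end avoids accumulating floating-point rounding from the individual products.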