EECC550 - Shaaban, Lec # 5, Winter 2000, 12-20-2000

CPU Design Steps
1. Analyze instruction set operations using independent RTN => datapath requirements.
2. Select set of datapath components & establish clock methodology.
3. Assemble datapath meeting the requirements.
4. Analyze implementation of each instruction to determine setting of control points that effects the register transfer.
5. Assemble the control logic.

CPU Design & Implementation Process
• Bottom-up Design:
– Assemble components in target technology to establish critical timing.
• Top-down Design:
– Specify component behavior from high-level requirements.
• Iterative refinement:
– Establish a partial solution, expand and improve.
[Figure: Instruction Set Architecture => processor (datapath, control), built from Reg. File, Mux, ALU, Reg, Mem, Decoder, Sequencer, which are in turn built from Cells and Gates.]

Single Cycle MIPS Datapath: CPI = 1, Long Clock Cycle
[Figure: single cycle MIPS datapath. Components: PC, Inst Memory, 32 32-bit Registers (Rw, Ra, Rb; busA, busB, busW), Extender, ALU, Data Memory, Adders, Muxes. Instruction<31:0> fields: Rs <21:25>, Rt <16:20>, Rd <11:15>, Imm16 <0:15>. Control signals: nPC_sel, RegDst, RegWr, ExtOp, ALUSrc, ALUctr, MemWr, MemtoReg. Status signal: Equal.]

Drawback of Single Cycle Processor
• Long cycle time.
• All instructions must take as much time as the slowest:
– Cycle time for load is longer than needed for all other instructions.
• Real memory is not as well-behaved as idealized memory:
– Cannot always complete data access in one (short) cycle.

Abstract View of Single Cycle CPU
[Figure: abstract view of the single cycle CPU. Blocks: PC, Next PC, Instruction Fetch, Register Fetch, Ext, ALU, Mem Access, Data Mem, Reg. Wrt, Result Store. Main Control and ALU control take op and fun as inputs. Control signals: nPC_sel, ExtOp, ALUSrc, ALUctr, MemRd, MemWr, RegDst, RegWr. Status signal: Equal.]

Single Cycle Instruction Timing
Arithmetic & Logical:  PC | Inst Memory | Reg File | mux | ALU | mux | setup
Load (Critical Path):  PC | Inst Memory | Reg File | mux | ALU | Data Mem | mux | setup
Store:                 PC | Inst Memory | Reg File | mux | ALU | Data Mem
Branch:                PC | Inst Memory | Reg File | cmp | mux
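
The timing comparison above can be sketched in a few lines of Python. This is only an illustration: the component delays are hypothetical (the slides give none), and the component order of each path is read from the timing diagram. The point is that the single clock period must cover the slowest instruction, which is the load.

```python
# Hypothetical component delays in ns (assumed; the slides give no numbers).
DELAY_NS = {
    "PC": 1, "Inst Memory": 10, "Reg File": 5, "mux": 1,
    "ALU": 8, "Data Mem": 10, "cmp": 3, "setup": 1,
}

# Component sequence per instruction class, as read from the timing diagram.
PATHS = {
    "Arithmetic & Logical": ["PC", "Inst Memory", "Reg File", "mux", "ALU", "mux", "setup"],
    "Load": ["PC", "Inst Memory", "Reg File", "mux", "ALU", "Data Mem", "mux", "setup"],
    "Store": ["PC", "Inst Memory", "Reg File", "mux", "ALU", "Data Mem"],
    "Branch": ["PC", "Inst Memory", "Reg File", "cmp", "mux"],
}


def path_delay_ns(path):
    """Sum the delays of the components along one path."""
    return sum(DELAY_NS[name] for name in path)


def single_cycle_time_ns(paths):
    """In a single cycle design the clock period must cover the slowest path."""
    return max(path_delay_ns(path) for path in paths.values())


def critical_instruction(paths):
    """Return the instruction class whose path is the longest."""
    return max(paths, key=lambda kind: path_delay_ns(paths[kind]))


if __name__ == "__main__":
    for kind, path in PATHS.items():
        print(f"{kind:22s}{path_delay_ns(path):4d} ns")
    print("cycle time:", single_cycle_time_ns(PATHS), "ns, set by", critical_instruction(PATHS))
```

With these assumed delays every instruction, including a branch, is charged the full load time.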

Reducing Cycle Time: Multi-Cycle Design
• Cut combinational dependency graph by inserting registers / latches.
• The same work is done in two or more fast cycles, rather than one slow cycle.
[Figure: storage element -> Acyclic Combinational Logic -> storage element  =>  storage element -> Acyclic Combinational Logic (A) -> storage element -> Acyclic Combinational Logic (B) -> storage element]
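
A minimal sketch of the effect of such a cut, assuming hypothetical delays for the logic blocks and a hypothetical fixed overhead per storage element (none of these numbers come from the slides):

```python
def cycle_time_ns(stage_delays_ns, register_overhead_ns):
    """The clock period is set by the slowest stage plus the storage element overhead."""
    return max(stage_delays_ns) + register_overhead_ns


ONE_STAGE = [20.0]        # hypothetical: all logic between two storage elements
TWO_STAGES = [12.0, 8.0]  # hypothetical: the same logic cut into parts (A) and (B)
OVERHEAD_NS = 2.0         # hypothetical storage element overhead

slow_cycle = cycle_time_ns(ONE_STAGE, OVERHEAD_NS)    # one slow cycle
fast_cycle = cycle_time_ns(TWO_STAGES, OVERHEAD_NS)   # each of the two fast cycles
total_two_cycles = len(TWO_STAGES) * fast_cycle       # time for the same work

if __name__ == "__main__":
    print(slow_cycle, fast_cycle, total_two_cycles)
```

In this example the two fast cycles together take longer than the one slow cycle, because the cut is unbalanced and the overhead is paid twice. The multi-cycle design gains when instructions that need fewer steps use fewer cycles.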

Clock Cycle Time & Critical Path
• Critical path: the slowest path between any two storage devices.
• Cycle time is a function of the critical path.
• Cycle time must be greater than:
– Clock-to-Q + Longest Path through the Combinational Logic + Setup
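
The bound above can be written as a small helper. This is a sketch under assumed values: the path names and delays in `EXAMPLE_PATHS_NS` are hypothetical and only illustrate how the critical path picks the term that matters.

```python
def min_cycle_time_ns(clk_to_q_ns, path_delays_ns, setup_ns):
    """Return (critical path name, lower bound on the cycle time).

    path_delays_ns maps a path name to the combinational delay between
    two storage devices; the critical path is the slowest of them.
    """
    critical_path = max(path_delays_ns, key=path_delays_ns.get)
    bound = clk_to_q_ns + path_delays_ns[critical_path] + setup_ns
    return critical_path, bound


# Hypothetical register-to-register paths and delays in ns.
EXAMPLE_PATHS_NS = {"fetch": 10.0, "alu": 8.0, "memory": 12.0}

if __name__ == "__main__":
    name, bound = min_cycle_time_ns(1.0, EXAMPLE_PATHS_NS, 0.5)
    print(f"critical path: {name}, cycle time must exceed {bound} ns")
```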

Instruction Processing Cycles
Instruction Fetch:   Obtain instruction from program storage.
Instruction Decode:  Determine instruction type. Obtain operands from registers.
Execute:             Compute result value or status.
Result Store:        Store result in register/memory if needed (usually called Write Back).
Next Instruction:    Update program counter to address of next instruction.  } Common steps for all instructions

Partitioning The Single Cycle Datapath
Add registers between smallest steps.
[Figure: single cycle datapath partitioned into steps. Blocks: PC, Next PC, Instruction Fetch, Operand Fetch, Exec, Mem Access, Data Mem, Reg. File, Result Store. Control signals: nPC_sel, ExtOp, ALUSrc, ALUctr, MemRd, MemWr, RegDst, RegWr.]

Example Multi-cycle Datapath
[Figure: multi-cycle datapath. Blocks: PC, Next PC, Instruction Fetch, Operand Fetch, Ext, ALU, Mem Access, Data Mem, Reg. File, Result Store, with registers IR, A, B, R, M inserted between steps. Control signals: nPC_sel, ExtOp, ALUSrc, ALUctr, MemRd, MemWr, MemToReg, RegDst, RegWr. Status signal: Equal.]
Registers added:
IR: Instruction register.
A, B: Two registers to hold operands read from register file.
R: or ALUOut, holds the output of the ALU.
M: or Memory data register (MDR), holds data read from data memory.

Operations In Each Cycle
R-Type:
  Instruction Fetch:   IR ← Mem[PC]
  Instruction Decode:  A ← R[rs]; B ← R[rt]
  Execution:           R ← A + B
  Write Back:          R[rd] ← R; PC ← PC + 4
Logic Immediate:
  Instruction Fetch:   IR ← Mem[PC]
  Instruction Decode:  A ← R[rs]
  Execution:           R ← A OR ZeroExt(imm16)
  Write Back:          R[rt] ← R; PC ← PC + 4
Load:
  Instruction Fetch:   IR ← Mem[PC]
  Instruction Decode:  A ← R[rs]
  Execution:           R ← A + SignExt(imm16)
  Memory:              M ← Mem[R]
  Write Back:          R[rt] ← M; PC ← PC + 4
Store:
  Instruction Fetch:   IR ← Mem[PC]
  Instruction Decode:  A ← R[rs]; B ← R[rt]
  Execution:           R ← A + SignExt(imm16)
  Memory:              Mem[R] ← B; PC ← PC + 4
Branch:
  Instruction Fetch:   IR ← Mem[PC]
  Instruction Decode:  A ← R[rs]; B ← R[rt]
  Execution:           If Equal = 1: PC ← PC + 4 + (SignExt(imm16) x 4), else: PC ← PC + 4
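
The register transfers above can be traced with a small behavioral model. This is a sketch, not the course's implementation: instructions are passed in already decoded as Python dicts (a hypothetical format), memory is a dict keyed by address, `add` stands in for the R-type class, and the load writes its result to rt as in the FSM specification. The function returns the number of cycles the instruction used.

```python
MASK32 = 0xFFFFFFFF


def sign_ext16(imm16):
    """Sign-extend a 16-bit immediate to a Python int."""
    imm16 &= 0xFFFF
    return imm16 - 0x10000 if imm16 & 0x8000 else imm16


def run_instruction(cpu, inst):
    """Apply one instruction's register transfers; return the cycles used."""
    reg, mem = cpu["reg"], cpu["mem"]
    kind, imm16 = inst["kind"], inst.get("imm16", 0)
    cycles = 1                                # Instruction Fetch: IR <- Mem[PC]
    a, b = reg[inst["rs"]], reg[inst["rt"]]   # Instruction Decode: A <- R[rs]; B <- R[rt]
    cycles += 1
    if kind == "beq":
        # Execution: the branch updates the PC and finishes here.
        offset = sign_ext16(imm16) * 4 if a == b else 0
        cpu["pc"] = (cpu["pc"] + 4 + offset) & MASK32
        return cycles + 1
    # Execution: R <- ALU result
    if kind == "add":
        r = (a + b) & MASK32
    elif kind == "ori":
        r = a | (imm16 & 0xFFFF)
    elif kind in ("lw", "sw"):
        r = (a + sign_ext16(imm16)) & MASK32
    else:
        raise ValueError(f"unsupported instruction kind: {kind}")
    cycles += 1
    if kind == "lw":
        m = mem.get(r, 0)            # Memory: M <- Mem[R]
        cycles += 1
        reg[inst["rt"]] = m          # Write Back: R[rt] <- M
        cycles += 1
    elif kind == "sw":
        mem[r] = b                   # Memory: Mem[R] <- B
        cycles += 1
    elif kind == "add":
        reg[inst["rd"]] = r          # Write Back: R[rd] <- R
        cycles += 1
    else:
        reg[inst["rt"]] = r          # Write Back (ori): R[rt] <- R
        cycles += 1
    cpu["pc"] = (cpu["pc"] + 4) & MASK32
    return cycles
```

For example, with `cpu = {"pc": 0, "reg": [0] * 32, "mem": {}}`, an `add` takes 4 cycles, a `lw` takes 5, and a `beq` takes 3.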

Finite State Machine (FSM) Control Model
• State specifies control points for Register Transfer.
• Transfer occurs upon exiting state (same falling edge).
[Figure: State X specifies the register transfer control points; the next state depends on input. Controller: inputs (conditions) -> Next State Logic -> Control State -> Output Logic -> outputs (control points).]

Control Specification For Multi-cycle CPU: Finite State Machine (FSM)
“instruction fetch”:        IR ← MEM[PC]
“decode / operand fetch”:   A ← R[rs]; B ← R[rt]
R-type:
  Execute:     R ← A fun B
  Write-back:  R[rd] ← R; PC ← PC + 4; to instruction fetch
ORi:
  Execute:     R ← A or ZX
  Write-back:  R[rt] ← R; PC ← PC + 4; to instruction fetch
LW:
  Execute:     R ← A + SX
  Memory:      M ← MEM[R]
  Write-back:  R[rt] ← M; PC ← PC + 4; to instruction fetch
SW:
  Execute:     R ← A + SX
  Memory:      MEM[R] ← B; PC ← PC + 4; to instruction fetch
BEQ & Equal:
  Execute:     PC ← PC + SX || 00; to instruction fetch
BEQ & ~Equal:
  Execute:     PC ← PC + 4; to instruction fetch

Traditional FSM Controller
[Figure: a State register (4 bits) and the inputs op (6 bits) and Equal, coming from the datapath, index a truth or transition table with columns state, op, cond, next state, control points. The table produces nextState and the control points (11) that go to the datapath.]

Traditional FSM Controller
datapath + state diagram => control
• Translate RTN statements into control points.
• Assign states.
• Implement the controller.
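
Mapping RTNs To Control Points Examples & State Assignments
State 0000, “instruction fetch”:        IR ← MEM[PC]            (control points: imem_rd, IRen)
State 0001, “decode / operand fetch”:   A ← R[rs]; B ← R[rt]    (control points: Aen, Ben)
R-type:
  State 0100, Execute:     R ← A fun B              (control points: ALUfun, Sen; Sen enables the ALU result register R, which the slide also labels S)
  State 0101, Write-back:  R[rd] ← R; PC ← PC + 4   (control points: RegDst, RegWr, PCen); to instruction fetch state 0000
ORi:
  State 0110, Execute:     R ← A or ZX
  State 0111, Write-back:  R[rt] ← R; PC ← PC + 4; to instruction fetch state 0000
LW:
  State 1000, Execute:     R ← A + SX
  State 1001, Memory:      M ← MEM[R]
  State 1010, Write-back:  R[rt] ← M; PC ← PC + 4; to instruction fetch state 0000
SW:
  State 1011, Execute:     R ← A + SX
  State 1100, Memory:      MEM[R] ← B; PC ← PC + 4; to instruction fetch state 0000
BEQ & Equal:
  State 0010, Execute:     PC ← PC + SX || 00; to instruction fetch state 0000
BEQ & ~Equal:
  State 0011, Execute:     PC ← PC + 4; to instruction fetch state 0000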
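
The state assignments above translate directly into next-state logic. The sketch below is one possible software rendering, assuming the opcode is passed as a string label ("R-type", "ORi", "LW", "SW", "BEQ") rather than as the 6-bit op field. The table and function names are illustrative.

```python
# Dispatch out of the decode state 0001, by instruction class.
NEXT_AFTER_DECODE = {"R-type": "0100", "ORi": "0110", "LW": "1000", "SW": "1011"}

# States with a single successor inside an instruction's sequence.
SEQUENTIAL_NEXT = {
    "0000": "0001",
    "0100": "0101",
    "0110": "0111",
    "1000": "1001",
    "1001": "1010",
    "1011": "1100",
}

# Final states of each instruction; all return to instruction fetch.
RETURN_TO_FETCH = {"0010", "0011", "0101", "0111", "1010", "1100"}


def next_state(state, op, equal):
    """Next-state logic: depends on the current state, the opcode, and Equal."""
    if state == "0001":
        if op == "BEQ":
            return "0010" if equal else "0011"
        return NEXT_AFTER_DECODE[op]
    if state in RETURN_TO_FETCH:
        return "0000"
    return SEQUENTIAL_NEXT[state]


def state_sequence(op, equal=False):
    """List the states one instruction visits, starting at instruction fetch."""
    states = ["0000"]
    while True:
        following = next_state(states[-1], op, equal)
        if following == "0000":
            return states
        states.append(following)


if __name__ == "__main__":
    for op in ("R-type", "ORi", "LW", "SW", "BEQ"):
        print(op, state_sequence(op, equal=True))
```

The length of each sequence is the number of cycles the instruction takes in this FSM.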

Detailed Control Specification
Control columns, in order: IR (en); PC (en, sel); Ops (A, B); Exec (Ex, Sr, ALU, S); Mem (R, W, M); Write-Back (M-R, Wr, Dst).
Control values are listed in source order. Columns left blank on the slide are omitted, so values are not aligned to the column key.

Group  State  Op field  Eq  Next   Control values
       0000   ??????    ?   0001   1
       0001   BEQ       0   0011   1 1
       0001   BEQ       1   0010   1 1
       0001   R-type    x   0100   1 1
       0001   orI       x   0110   1 1
       0001   LW        x   1000   1 1
       0001   SW        x   1011   1 1
BEQ    0010   xxxxxx    x   0000   1 1
BEQ    0011   xxxxxx    x   0000   1 0
R      0100   xxxxxx    x   0101   0 1 fun 1
R      0101   xxxxxx    x   0000   1 0 0 1 1
ORI    0110   xxxxxx    x   0111   0 0 or 1
ORI    0111   xxxxxx    x   0000   1 0 0 1 0
LW     1000   xxxxxx    x   1001   1 0 add 1
LW     1001   xxxxxx    x   1010   1 0 0
LW     1010   xxxxxx    x   0000   1 0 1 1 0
SW     1011   xxxxxx    x   1100   1 0 add 1
SW     1100   xxxxxx    x   0000   1 0 0 1

Alternative Multiple Cycle Datapath (In Textbook)
• Minimizes Hardware: 1 memory, 1 adder
[Figure: textbook multi-cycle datapath. Components: PC, Ideal Memory (WrAdr, RAdr, Din, Dout), Instruction Reg, Reg File (Ra, Rb, Rw, busA, busB, busW), Extend, << 2, ALU, ALU Control, ALU Out, Target, Muxes. Control signals: PCWr, PCWrCond, PCSrc, BrWr, IorD, MemWr, IRWr, RegDst, RegWr, MemtoReg, ExtOp, ALUSelA, ALUSelB, ALUOp. Status signal: Zero.]

Alternative Multiple Cycle Datapath (In Textbook)
• Shared instruction/data memory unit.
• A single ALU shared among instructions.
• Shared units require additional or widened multiplexors.
• Temporary registers to hold data between clock cycles of the instruction:
– Additional registers: Instruction Register (IR), Memory Data Register (MDR), A, B, ALUOut

Operations In Each Cycle
R-Type:
  Instruction Fetch:   IR ← Mem[PC]; PC ← PC + 4
  Instruction Decode:  A ← R[rs]; B ← R[rt]; ALUout ← PC + (SignExt(imm16) x 4)
  Execution:           ALUout ← A + B
  Write Back:          R[rd] ← ALUout
Logic Immediate:
  Instruction Fetch:   IR ← Mem[PC]; PC ← PC + 4
  Instruction Decode:  A ← R[rs]; B ← R[rt]; ALUout ← PC + (SignExt(imm16) x 4)
  Execution:           ALUout ← A OR ZeroExt(imm16)
  Write Back:          R[rt] ← ALUout
Load:
  Instruction Fetch:   IR ← Mem[PC]; PC ← PC + 4
  Instruction Decode:  A ← R[rs]; B ← R[rt]; ALUout ← PC + (SignExt(imm16) x 4)
  Execution:           ALUout ← A + SignExt(imm16)
  Memory:              M ← Mem[ALUout]
  Write Back:          R[rt] ← M
Store:
  Instruction Fetch:   IR ← Mem[PC]; PC ← PC + 4
  Instruction Decode:  A ← R[rs]; B ← R[rt]; ALUout ← PC + (SignExt(imm16) x 4)
  Execution:           ALUout ← A + SignExt(imm16)
  Memory:              Mem[ALUout] ← B
Branch:
  Instruction Fetch:   IR ← Mem[PC]; PC ← PC + 4
  Instruction Decode:  A ← R[rs]; B ← R[rt]; ALUout ← PC + (SignExt(imm16) x 4)
  Execution:           If Equal = 1: PC ← ALUout
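
Counting the cycles in the listing above gives the CPI of each instruction type on this datapath. The sketch below encodes the register transfers as plain strings, one list per cycle. The names `COMMON_CYCLES` and `REMAINING_CYCLES` are illustrative, and the load write-back is written with rt as the destination, as in the FSM specification.

```python
# The first two cycles are the same for every instruction.
COMMON_CYCLES = [
    ["IR <- Mem[PC]", "PC <- PC + 4"],
    ["A <- R[rs]", "B <- R[rt]", "ALUout <- PC + (SignExt(imm16) x 4)"],
]

# Cycles that depend on the instruction type.
REMAINING_CYCLES = {
    "R-Type": [["ALUout <- A + B"], ["R[rd] <- ALUout"]],
    "Logic Immediate": [["ALUout <- A OR ZeroExt(imm16)"], ["R[rt] <- ALUout"]],
    "Load": [["ALUout <- A + SignExt(imm16)"], ["M <- Mem[ALUout]"], ["R[rt] <- M"]],
    "Store": [["ALUout <- A + SignExt(imm16)"], ["Mem[ALUout] <- B"]],
    "Branch": [["if Equal = 1: PC <- ALUout"]],
}


def cycles_for(kind):
    """Number of clock cycles an instruction of the given type takes."""
    return len(COMMON_CYCLES) + len(REMAINING_CYCLES[kind])


if __name__ == "__main__":
    for kind in REMAINING_CYCLES:
        print(f"{kind:16s}{cycles_for(kind)} cycles")
```

Because the branch target is already in ALUout after decode, a branch needs only one more cycle.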

High-Level View of Finite State Machine Control
• First steps are independent of the instruction class.
• Then a series of sequences that depend on the instruction opcode.
• Then the control returns to fetch a new instruction.
• Each box in the state diagram represents one or several states.

[Figure: Instruction Fetch and Decode FSM States]
[Figure: Load/Store Instructions FSM States]
[Figure: R-Type Instructions FSM States]
[Figure: Jump Instruction Single State]
[Figure: Branch Instruction Single State]

Finite State Machine (FSM) Specification
State 0000, “instruction fetch”:  IR ← MEM[PC]; PC ← PC + 4
State 0001, “decode”:             A ← R[rs]; B ← R[rt]; ALUout ← PC + SX
R-type:
  State 0100, Execute:     ALUout ← A fun B
  State 0101, Write-back:  R[rd] ← ALUout; to instruction fetch
ORi:
  State 0110, Execute:     ALUout ← A op ZX
  State 0111, Write-back:  R[rt] ← ALUout; to instruction fetch
LW:
  State 1000, Execute:     ALUout ← A + SX
  State 1001, Memory:      M ← MEM[ALUout]
  State 1010, Write-back:  R[rt] ← M; to instruction fetch
SW:
  State 1011, Execute:     ALUout ← A + SX
  State 1100, Memory:      MEM[ALUout] ← B; to instruction fetch
BEQ:
  State 0010, Execute:     If A = B then PC ← ALUout; to instruction fetch

MIPS Multi-cycle Datapath Performance Evaluation
• What is the average CPI?
– State diagram gives CPI for each instruction type.
– Workload below gives frequency of each type.

Type          CPIi for type   Frequency   CPIi x freqi
Arith/Logic   4               40%         1.6
Load          5               30%         1.5
Store         4               10%         0.4
Branch        3               20%         0.6
Average CPI: 4.1

This is better than the CPI = 5 that would result if all instructions took the same number of clock cycles (5).
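
The average is the frequency-weighted sum of the per-type CPI values. A short check of the arithmetic, using the table's numbers (the names `WORKLOAD` and `average_cpi` are illustrative):

```python
# (CPI for the type, frequency in the workload), taken from the table above.
WORKLOAD = {
    "Arith/Logic": (4, 0.40),
    "Load": (5, 0.30),
    "Store": (4, 0.10),
    "Branch": (3, 0.20),
}


def average_cpi(workload):
    """Sum of CPI_i x freq_i over all instruction types."""
    return sum(cpi * freq for cpi, freq in workload.values())


if __name__ == "__main__":
    print(f"Average CPI: {average_cpi(WORKLOAD):.1f}")
```

The frequencies must sum to 1 for the result to be a valid average.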