CPU DESIGN: The Single-Cycle Implementation
Computer Organization (CSE 2021)
Shakil M. Khan
(adapted from Profs. H. Roumani, A. Asif)
Dept of CS & Eng, York University
CSE-2021 June-30-2011 2
Sequential vs. Combinational Circuits
Digital circuits can be classified into two categories:
1. Combinational circuits (output depends only on the current inputs): mux, ALU
2. Sequential circuits (output also depends on stored state): flip-flops, registers, memory
Clocks
[Figure: State element 1, Combinational logic, State element 2, with one clock cycle spanning the transfer between the two state elements]
• Periodic signal oscillating between low and high states with fixed cycle time
• Clock frequency = inverse of clock cycle time
• Clock controls when the state of a memory element changes
[Figure: clock waveform showing the clock period, the rising edge, and the falling edge]
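The relation between clock frequency and cycle time can be sketched in a few lines of Python. The function name and the 500 ps example value are illustrative assumptions, not taken from the slides.

```python
def clock_frequency_hz(cycle_time_s: float) -> float:
    """Clock frequency is the inverse of the clock cycle time."""
    if cycle_time_s <= 0:
        raise ValueError("cycle time must be positive")
    return 1.0 / cycle_time_s


# Hypothetical example: a 500 ps cycle time.
freq = clock_frequency_hz(500e-12)
print(f"{freq / 1e9:.2f} GHz")
```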
CPU DESIGN
• The Datapath
• Single-Cycle Control
• Performance
Focus on the Subset:
addi, add/sub/and/or/slt, lw/sw, beq, j
Building the Datapath
The Basic Datapath Components (1)
1. Program counter (PC)
contains address of next instruction
2. Sign-extension unit
extends a 16-bit integer to a 32-bit integer
3. Adder
adds two 32-bit integers and outputs the Sum
4. ALU
adds/subtracts/ANDs/ORs/compares two 32-bit integers; a 3-bit ALU control input selects the operation, and the outputs are ALU result and Zero
The Basic Datapath Components (2)
5. Instruction memory
input: Instruction address; output: Instruction
6. Data memory unit
inputs: Address, Write data; control signals: MemRead, MemWrite; output: Read data
7. Register file
inputs: Read register 1, Read register 2, Write register (5-bit register numbers), Write data; control signal: RegWrite; outputs: Read data 1, Read data 2
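Two of the components listed above, the sign-extension unit and the register file, can be modeled with a minimal Python sketch. The class and function names are illustrative assumptions. The rule that register 0 always reads as zero is a MIPS convention that the slides do not state.

```python
MASK32 = 0xFFFFFFFF


def sign_extend_16_to_32(value: int) -> int:
    """Extend a 16-bit two's-complement value to 32 bits."""
    value &= 0xFFFF
    if value & 0x8000:          # sign bit set: fill the upper 16 bits with 1s
        value |= 0xFFFF0000
    return value


class RegisterFile:
    """32 registers of 32 bits, with two read ports and one write port."""

    def __init__(self) -> None:
        self.regs = [0] * 32    # 5-bit register numbers address 32 registers

    def read(self, read_register_1: int, read_register_2: int):
        return self.regs[read_register_1], self.regs[read_register_2]

    def write(self, write_register: int, write_data: int, reg_write: bool) -> None:
        # The write happens only when RegWrite is asserted.
        # Register 0 stays zero (MIPS convention, assumed here).
        if reg_write and write_register != 0:
            self.regs[write_register] = write_data & MASK32


rf = RegisterFile()
rf.write(8, sign_extend_16_to_32(0xFFFC), reg_write=True)   # stored
rf.write(9, 123, reg_write=False)                           # ignored
print([hex(v) for v in rf.read(8, 9)])
```

With RegWrite deasserted the second write leaves register 9 unchanged, which is why stores and branches can share the same register file without corrupting it.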
The Basic Datapath Components
[Figure: the PC, IM, RF, and ALU blocks connected by a bus]
The Basic Datapath [computational R-Type]
[Figure: datapath for computational R-type instructions built from the PC, IM, RF, and ALU]
Recall the ML Formats:
R: opCode (6) | rs (5) | rt (5) | rd (5) | sa (5) | funCode (6)
I: opCode (6) | rs (5) | rt (5) | immediate (16)
J: opCode (6) | immediate (26)
Field widths are in bits.
Register rs = source, rt = target, rd = destination.
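As a sketch of how these formats map onto a 32-bit word, the fields can be extracted with shifts and masks. The function name and dictionary keys are illustrative assumptions. The example word encodes add $t0, $s0, $a0 using the usual MIPS register numbers ($t0 = 8, $s0 = 16, $a0 = 4), which the slides do not list.

```python
def decode_fields(instr: int) -> dict:
    """Split a 32-bit machine-language word into the R/I/J format fields."""
    return {
        "opCode":  (instr >> 26) & 0x3F,       # bits 31-26, all formats
        "rs":      (instr >> 21) & 0x1F,       # bits 25-21, R and I
        "rt":      (instr >> 16) & 0x1F,       # bits 20-16, R and I
        "rd":      (instr >> 11) & 0x1F,       # bits 15-11, R only
        "sa":      (instr >> 6) & 0x1F,        # bits 10-6,  R only
        "funCode": instr & 0x3F,               # bits 5-0,   R only
        "imm16":   instr & 0xFFFF,             # bits 15-0,  I only
        "imm26":   instr & 0x03FFFFFF,         # bits 25-0,  J only
    }


# add $t0, $s0, $a0  ->  opCode 0, rs 16, rt 4, rd 8, sa 0, funCode 32
fields = decode_fields(0x02044020)
print(fields["rs"], fields["rt"], fields["rd"], fields["funCode"])
```

All fields are extracted regardless of format; which ones are meaningful depends on the opCode, which is exactly the decision the control unit makes later.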
The Basic Datapath [computational R-Type]
[Figure: the same R-type datapath with the instruction fields labeled; bits 25–21 (rs), 20–16 (rt), and 15–11 (rd) go from the IM to the RF]
The PC Circuitry
[Figure: the R-type datapath with an adder and the constant 4 attached to the PC]
Add support for computational I-Types
[Figure: the datapath with a 2-input mux (inputs 0 and 1) added; instruction bits 25–21, 20–16, and 15–11 still feed the RF]
Add support for computational I-Types
[Figure: a sign-extension unit (SE) and a second 2-input mux are added to the datapath]
Add support for lw
[Figure: a data memory (DM) is added to the datapath]
Add support for lw
[Figure: a third 2-input mux (inputs 1 and 0) is added alongside the DM]
Add support for sw
[Figure: same components as the lw datapath: PC, IM, RF, SE, ALU, DM, and three muxes]
Add Support for branch
[Figure: a shift-left unit (sll) and a fourth 2-input mux (inputs 0 and 1) are added on the PC path]
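The next-PC selection that the branch hardware performs can be sketched as follows. This is a minimal model under stated assumptions: the function names are illustrative, and the `taken` flag stands in for whatever condition drives the PC mux (for beq, the branch being selected and the ALU reporting Zero).

```python
MASK32 = 0xFFFFFFFF


def sign_extend_16_to_32(value: int) -> int:
    """Extend a 16-bit two's-complement value to 32 bits."""
    value &= 0xFFFF
    return value | 0xFFFF0000 if value & 0x8000 else value


def next_pc(pc: int, imm16: int, taken: bool) -> int:
    """Select between PC + 4 and the branch target."""
    pc_plus_4 = (pc + 4) & MASK32
    # The word offset is sign-extended, then shifted left by 2 to get bytes.
    byte_offset = (sign_extend_16_to_32(imm16) << 2) & MASK32
    branch_target = (pc_plus_4 + byte_offset) & MASK32
    return branch_target if taken else pc_plus_4


print(hex(next_pc(0x1000, 3, taken=True)))        # forward branch by 3 words
print(hex(next_pc(0x1000, 0xFFFF, taken=True)))   # backward branch by 1 word
print(hex(next_pc(0x1000, 3, taken=False)))       # fall through
```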
Combined Datapath (w/o Jump)
[Figure: combined datapath without jump. Components: PC, instruction memory (Read address, Instruction), registers (Read register 1, Read register 2, Write register, Write data, Read data 1, Read data 2), sign extend (16 to 32), shift left 2, two adders, ALU (ALU result, Zero), data memory (Address, Write data, Read data), and three muxes. Control signals: RegWrite, ALUSrc, ALU operation (3 bits), MemRead, MemWrite, MemtoReg, PCSrc.]
add/sub/or/and/slt $s1,$s2,$s3
[Figure: the combined datapath (w/o jump) as used by R-type instructions; same components and control signals as the Combined Datapath diagram]
lw $s1, offset($s2)
[Figure: the combined datapath (w/o jump) as used by lw; same components and control signals as the Combined Datapath diagram]
sw $s1, offset($s2)
[Figure: the combined datapath (w/o jump) as used by sw; same components and control signals as the Combined Datapath diagram]
beq $s1, $s2, w_offset
[Figure: the combined datapath (w/o jump) as used by beq; same components and control signals as the Combined Datapath diagram]
[Figure: the complete single-cycle datapath with control and jump support. Instruction [31–26] feeds the Control unit, which generates RegDst, Jump, Branch, MemRead, MemtoReg, ALUOp, MemWrite, ALUSrc, and RegWrite. Instruction [25–21] and Instruction [20–16] select Read register 1 and Read register 2; a mux chooses between Instruction [20–16] and Instruction [15–11] for Write register. Instruction [15–0] is sign-extended from 16 to 32 bits; Instruction [5–0] feeds the ALU control. The Jump address [31–0] is formed from PC+4 [31–28] and Instruction [25–0] shifted left by 2 (26 bits to 28 bits).]
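The jump address labeled in this datapath (PC+4 [31–28] joined with Instruction [25–0] shifted left by 2) can be written out as a short Python sketch. The function name is an illustrative assumption, and the example word uses opcode 2 for j, which is the standard MIPS value rather than something the slides state.

```python
MASK32 = 0xFFFFFFFF


def jump_address(pc: int, instr: int) -> int:
    """Build the 32-bit jump address from PC + 4 and the 26-bit immediate."""
    pc_plus_4 = (pc + 4) & MASK32
    upper4 = pc_plus_4 & 0xF0000000           # PC+4 [31-28]
    lower28 = (instr & 0x03FFFFFF) << 2       # Instruction [25-0], 26 -> 28 bits
    return upper4 | lower28


# Hypothetical example: a jump word with immediate 0x100, fetched at 0x40000000.
print(hex(jump_address(0x40000000, 0x08000100)))
```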
Building the Control
[Figure: the datapath (PC, IM, RF, SE, ALU, DM, muxes, sll, and the +4 adder) with a Control block and clk inputs added]
Exercise: fill in the value of each signal for add $t0, $s0, $a0
SIGNAL VALUE
ALUSrc
MemToReg
RegDst
RegWrite
MemRead
MemWrite
Branch
Jump
Operation (3-bit)
Exercise: fill in the value of each signal for sw $t0, 500($s0)
SIGNAL VALUE
ALUSrc
MemToReg
RegDst
RegWrite
MemRead
MemWrite
Branch
Jump
Operation (3-bit)
Exercise: fill in the value of each signal for beq $t0, $s0, 401
SIGNAL VALUE
ALUSrc
MemToReg
RegDst
RegWrite
MemRead
MemWrite
Branch
Jump
Operation (3-bit)
Generating the Control Signals
All signals depend on the instruction, i.e. on a total of 12 bits (the 6-bit opcode plus the 6-bit function code), which makes a single control unit complex.
Note that the non-ALU signals depend only on the 6-bit opcode, which is simpler.
Hence, split the control into a main control unit that sees only the opcode, and an auxiliary one that sees the function code.
The two communicate via a new signal, ALUop.
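Splitting the Control
[Figure: the Main Control Unit takes instruction bits 31–26 and produces 8 control signals plus the 2-bit ALUop; the ALU Control Unit takes ALUop and instruction bits 5–0 and produces the 3-bit Operation]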
The Operation Signal
A 3-bit signal through which the auxiliary
control unit tells the ALU to:
000 = and
001 = or
010 = add
110 = sub
111 = slt
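A minimal Python sketch of an ALU driven by these five Operation codes follows. The function names are illustrative assumptions. The sketch treats slt as a signed comparison, which is the MIPS convention but is not stated on the slide, and it returns the Zero flag alongside the result.

```python
MASK32 = 0xFFFFFFFF


def to_signed(x: int) -> int:
    """Interpret a 32-bit pattern as a two's-complement integer."""
    x &= MASK32
    return x - (1 << 32) if x & 0x80000000 else x


def alu(operation: int, a: int, b: int):
    """Return (result, zero) for the 3-bit Operation codes on the slide."""
    a &= MASK32
    b &= MASK32
    if operation == 0b000:
        result = a & b                      # and
    elif operation == 0b001:
        result = a | b                      # or
    elif operation == 0b010:
        result = (a + b) & MASK32           # add
    elif operation == 0b110:
        result = (a - b) & MASK32           # sub
    elif operation == 0b111:
        result = 1 if to_signed(a) < to_signed(b) else 0   # slt (signed, assumed)
    else:
        raise ValueError(f"unsupported Operation code {operation:03b}")
    return result, result == 0


print(alu(0b010, 2, 3))
print(alu(0b110, 7, 7))   # equal operands give Zero = True, as beq relies on
```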
The ALUop Signal
A 2-bit signal through which the main control
unit tells the auxiliary to:
00 = add (no matter what the fun_code is)
01 = subtract (no matter what the fun_code is)
10 = R-Type (follow the fun_code)
The Main Control Unit
[Figure: combinational logic with instruction bits 31–26 as input and RegDst, ALUsrc, MemToReg, RegWrite, MemRead, MemWrite, Branch, Jump, ALUop-1, and ALUop-0 as outputs]
The Main Control Unit (1)

Inputs of Control Unit:

Instruction  Opcode (decimal)  Op5  Op4  Op3  Op2  Op1  Op0
R-format     0                 0    0    0    0    0    0
lw           35                1    0    0    0    1    1
sw           43                1    0    1    0    1    1
beq          4                 0    0    0    1    0    0

Outputs of Control Unit:

Instruction  RegDst  ALUSrc  MemtoReg  RegWrite  MemRead  MemWrite  Branch  ALUOp1  ALUOp0
R-format     1       0       0         1         0        0         0       1       0
lw           0       1       1         1         1        0         0       0       0
sw           X       1       X         0         0        1         0       0       0
beq          X       0       X         0         0        0         1       0       1
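Because the main control unit is purely combinational, the two tables above can be modeled as a lookup from opcode to control signals. The sketch below transcribes the four rows; the dictionary layout and function name are illustrative assumptions, `None` marks a don't-care (X), and ALUOp1/ALUOp0 are packed into one 2-bit value.

```python
# Opcode -> control signals, transcribed from the table; None is a don't-care (X).
MAIN_CONTROL = {
    0:  {"RegDst": 1, "ALUSrc": 0, "MemtoReg": 0, "RegWrite": 1,
         "MemRead": 0, "MemWrite": 0, "Branch": 0, "ALUOp": 0b10},      # R-format
    35: {"RegDst": 0, "ALUSrc": 1, "MemtoReg": 1, "RegWrite": 1,
         "MemRead": 1, "MemWrite": 0, "Branch": 0, "ALUOp": 0b00},      # lw
    43: {"RegDst": None, "ALUSrc": 1, "MemtoReg": None, "RegWrite": 0,
         "MemRead": 0, "MemWrite": 1, "Branch": 0, "ALUOp": 0b00},      # sw
    4:  {"RegDst": None, "ALUSrc": 0, "MemtoReg": None, "RegWrite": 0,
         "MemRead": 0, "MemWrite": 0, "Branch": 1, "ALUOp": 0b01},      # beq
}


def main_control(opcode: int) -> dict:
    """Return the control signals for one of the four opcodes in the table."""
    if opcode not in MAIN_CONTROL:
        raise ValueError(f"opcode {opcode} is not in the table")
    return MAIN_CONTROL[opcode]


signals = main_control(43)   # sw
print(signals["MemWrite"], signals["RegWrite"], signals["ALUOp"])
```

A lookup table is used here only for readability; in hardware the same mapping is realized with gates on Op5 through Op0.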
The Main Control Unit (2)
[Figure: gate-level implementation of the main control unit. The inputs Op5–Op0 are decoded into R-format, lw, sw, and beq lines, which drive the outputs RegDst, ALUSrc, MemtoReg, RegWrite, MemRead, MemWrite, Branch, ALUOp1, and ALUOp0.]
ALU Control
[Figure: the Auxiliary Control Unit takes ALUop-1, ALUop-0, and instruction bits 5–0 as inputs and produces Operation-2, Operation-1, and Operation-0]
Aux Controller Implementation (1)

Inputs: ALUOp (ALUOp1 ALUOp0) and the function field (F5–F0). Output: Operation (Op3–Op0). X = don't care; the pair in parentheses is the ALUOp value given in the main control table. The leading Operation bit is 0 in every row, so the low three bits match the 3-bit Operation codes.

Instruction (opcode)  ALUOp      Function field (F5–F0)  Desired ALU action  Operation (Op3–Op0)
lw (I)                0 0 (0 0)  X X X X X X             add                 0 0 1 0
sw (I)                0 0 (0 0)  X X X X X X             add                 0 0 1 0
beq (I)               0 1 (0 1)  X X X X X X             sub                 0 1 1 0
add (32)              1 0 (1 0)  X X 0 0 0 0             add                 0 0 1 0
sub (34)              1 X (1 0)  X X 0 0 1 0             sub                 0 1 1 0
and (36)              1 0 (1 0)  X X 0 1 0 0             and                 0 0 0 0
or (37)               1 0 (1 0)  X X 0 1 0 1             or                  0 0 0 1
slt (42)              1 X (1 0)  X X 1 0 1 0             slt                 0 1 1 1
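Aux Controller Implementation (2)
[Figure: ALU control block. Gate-level logic with inputs ALUOp (ALUOp1, ALUOp0) and F (5–0), of which F3, F2, F1, and F0 are used, and outputs Operation2, Operation1, and Operation0.]
$\text{Operation}_0 = \text{ALUOp}_1 \cdot (F_0 + F_3)$
$\text{Operation}_1 = \overline{\text{ALUOp}_1} + \overline{F_2}$
$\text{Operation}_2 = \text{ALUOp}_0 + \text{ALUOp}_1 \cdot F_1$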
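The three Operation equations can be checked against the truth table with a short Python sketch. It assumes the standard form in which ALUOp1 and F2 appear complemented in Operation1 (Operation1 = NOT ALUOp1 OR NOT F2), since only that form reproduces the table rows; the function name is illustrative.

```python
def alu_control(alu_op: int, funct: int) -> int:
    """Compute the 3-bit Operation from the 2-bit ALUOp and the function field."""
    alu_op1 = (alu_op >> 1) & 1
    alu_op0 = alu_op & 1
    f0 = funct & 1
    f1 = (funct >> 1) & 1
    f2 = (funct >> 2) & 1
    f3 = (funct >> 3) & 1

    operation0 = alu_op1 & (f0 | f3)
    operation1 = (1 - alu_op1) | (1 - f2)    # complemented inputs (assumed form)
    operation2 = alu_op0 | (alu_op1 & f1)
    return (operation2 << 2) | (operation1 << 1) | operation0


# R-type rows: function codes 32 (add), 34 (sub), 36 (and), 37 (or), 42 (slt).
for funct in (32, 34, 36, 37, 42):
    print(funct, format(alu_control(0b10, funct), "03b"))
```

For lw, sw, and beq the function field is a don't-care, so any value of `funct` yields add (010) for ALUOp 00 and sub (110) for ALUOp 01.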
The Single-Cycle Performance

Performance Analysis
• Load = 5 functional units:
instruction fetch, register access, ALU, data memory access, register access
• Store = 4 functional units:
instruction fetch, register access, ALU, data memory access
• R-type = 4 functional units:
instruction fetch, register access, ALU, register access
• Branch = 3 functional units:
instruction fetch, register access, ALU
• Jump = 1 functional unit:
instruction fetch
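Component delays: RF = 50 ps, ALU = 100 ps, and MEM (both IM and DM) = 200 ps.
Compute the CPU time to execute each of these instructions: j, beq, add, sw, lw.
Compute the max GHz for the CPU clock. Answer: about 1.67 GHz (the slowest instruction, lw, needs 200 + 50 + 100 + 200 + 50 = 600 ps).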
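The exercise can be worked through with a small Python sketch that sums the component delays along each instruction's list of functional units. The dictionary names are illustrative assumptions; the delays and the unit lists come from the slides.

```python
# Component delays in picoseconds, from the slide.
DELAY_PS = {"IM": 200, "RF": 50, "ALU": 100, "DM": 200}

# Functional units used by each instruction, in order.
UNITS = {
    "j":   ["IM"],
    "beq": ["IM", "RF", "ALU"],
    "add": ["IM", "RF", "ALU", "RF"],
    "sw":  ["IM", "RF", "ALU", "DM"],
    "lw":  ["IM", "RF", "ALU", "DM", "RF"],
}


def instruction_time_ps(name: str) -> int:
    """Sum the delays of the functional units an instruction passes through."""
    return sum(DELAY_PS[unit] for unit in UNITS[name])


times = {name: instruction_time_ps(name) for name in UNITS}
cycle_ps = max(times.values())       # the clock must fit the slowest instruction
max_ghz = 1000.0 / cycle_ps          # 1 / (cycle in ps) expressed in GHz

print(times)
print(f"cycle = {cycle_ps} ps, max clock = {max_ghz:.2f} GHz")
```

The clock period is set by lw, so every other instruction also takes 600 ps even though, for example, j needs only 200 ps of work.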
Critique of Single-Cycle
+ very simple
- caters to the slowest instruction
- h/w redundancy