66
EECC550 EECC550 - - Shaaban Shaaban #1 Lec # 4 Winter 2011 12-13-2011 CPU Organization (Design) CPU Organization (Design) Datapath Design: Capabilities & performance characteristics of principal Functional Units (FUs) needed by ISA instructions (e.g., Registers, ALU, Shifters, Logic Units, ...) Ways in which these components are interconnected (buses connections, multiplexors, etc.). How information flows between components. Control Unit Design: Logic and means by which such information flow is controlled. Control and coordination of FUs operation to realize the targeted Instruction Set Architecture to be implemented (can either be implemented using a finite state machine or a microprogram). Hardware description with a suitable language, possibly using Register Transfer Notation (RTN). 4 th Edition Chapter 4.1-4.4 -3 rd Edition Chapter 5.1-5.4 Components & their connections needed by ISA instructions Control/sequencing of operations of datapath components to realize ISA instructions Components Connections ISA Requirements

Requirements ISA CPU Organization (Design) Datapath Designmeseec.ce.rit.edu/eecc550-winter2011/550-12-13-2011.pdf · 2011. 12. 13. · EECC550 - Shaaban #2 Lec # 4 Winter 2011 12-13-2011

  • Upload
    others

  • View
    7

  • Download
    0

Embed Size (px)

Citation preview

Page 1: Requirements ISA CPU Organization (Design) Datapath Designmeseec.ce.rit.edu/eecc550-winter2011/550-12-13-2011.pdf · 2011. 12. 13. · EECC550 - Shaaban #2 Lec # 4 Winter 2011 12-13-2011

EECC550 EECC550 -- ShaabanShaaban#1 Lec # 4 Winter 2011 12-13-2011

CPU Organization (Design)CPU Organization (Design)• Datapath Design:

– Capabilities & performance characteristics of principal Functional Units (FUs) needed by ISA instructions

– (e.g., Registers, ALU, Shifters, Logic Units, ...)– Ways in which these components are interconnected (buses

connections, multiplexors, etc.).– How information flows between components.

• Control Unit Design:– Logic and means by which such information flow is controlled.– Control and coordination of FUs operation to realize the targeted

Instruction Set Architecture to be implemented (can either be implemented using a finite state machine or a microprogram).

• Hardware description with a suitable language, possibly using Register Transfer Notation (RTN).

4th Edition Chapter 4.1-4.4 - 3rd Edition Chapter 5.1-5.4

Components & their connections needed by ISA instructions

Control/sequencing of operations of datapath componentsto realize ISA instructions

Components

Connections

ISARequirements

Page 2: Requirements ISA CPU Organization (Design) Datapath Designmeseec.ce.rit.edu/eecc550-winter2011/550-12-13-2011.pdf · 2011. 12. 13. · EECC550 - Shaaban #2 Lec # 4 Winter 2011 12-13-2011

EECC550 EECC550 -- ShaabanShaaban#2 Lec # 4 Winter 2011 12-13-2011

1 Analyze instruction set to get datapath requirements:– Using independent RTN, write the micro-operations required for target ISA

instructions.• This provides the the required datapath components and how they are

connected.

2 Select set of datapath components and establish clocking methodology(defines when storage or state elements can read and when they can be written, e.g clock edge-triggered)

3 Assemble datapath meeting the requirements.4 Identify and define the function of all control points or signals needed by

the datapath.– Analyze implementation of each instruction to determine setting of control

points that affects its operations.5 Control unit design, based on micro-operation timing and control

signals identified:– Combinational logic: For single cycle CPU.– Hard-Wired: Finite-state machine implementation.– Microprogrammed.

Major CPU Design StepsMajor CPU Design Steps

e.g Any instruction completed in one cycle

1 2

i.e CPI = 1

e.g Flip-Flops

ISA Requirements → CPU Design

Page 3: Requirements ISA CPU Organization (Design) Datapath Designmeseec.ce.rit.edu/eecc550-winter2011/550-12-13-2011.pdf · 2011. 12. 13. · EECC550 - Shaaban #2 Lec # 4 Winter 2011 12-13-2011

EECC550 EECC550 -- ShaabanShaaban#3 Lec # 4 Winter 2011 12-13-2011

CPU Design & Implantation ProcessCPU Design & Implantation Process• Top-down Design:

– Specify component behavior from high-level requirements (ISA).• Bottom-up Design:

– Assemble components in target technology to establish critical timing (hardware delays, critical path timing).

• Iterative refinement:– Establish a partial solution, expand and improve.

Datapath Control

ProcessorInstruction SetArchitecture (ISA):ProvidesRequirements

Reg. File Mux ALU Reg Mem Decoder Sequencer

Cells GatesTarget VLSI implementationTechnology

ISA Requirements CPU Design

Page 4: Requirements ISA CPU Organization (Design) Datapath Designmeseec.ce.rit.edu/eecc550-winter2011/550-12-13-2011.pdf · 2011. 12. 13. · EECC550 - Shaaban #2 Lec # 4 Winter 2011 12-13-2011

EECC550 EECC550 -- ShaabanShaaban#4 Lec # 4 Winter 2011 12-13-2011

Datapath Design StepsDatapath Design Steps• Write the micro-operation sequences required for a number of

representative target ISA instructions using independent RTN.• Independent RTN statements specify: the required datapath

components and how they are connected.

• From the above, create an initial datapath by determining possible destinations for each data source (i.e registers, ALU).– This establishes connectivity requirements (data paths, or

connections) for datapath components.– Whenever multiple sources are connected to a single input,

a multiplexor of appropriate size is added.

• Find the worst-time propagation delay (critical path) in the datapath to determine the datapath clock cycle (CPU clock cycle, C).

• Complete the micro-operation sequences for all remaining instructions adding datapath components + connections/multiplexors as needed.

(or destination)

1 2

Page 5: Requirements ISA CPU Organization (Design) Datapath Designmeseec.ce.rit.edu/eecc550-winter2011/550-12-13-2011.pdf · 2011. 12. 13. · EECC550 - Shaaban #2 Lec # 4 Winter 2011 12-13-2011

EECC550 EECC550 -- ShaabanShaaban#5 Lec # 4 Winter 2011 12-13-2011

MIPS Instruction FormatsMIPS Instruction Formats

• op: Opcode, operation of the instruction.• rs, rt, rd: The source and destination register specifiers.• shamt: Shift amount.• funct: Selects the variant of the operation in the “op” field.• address / immediate: Address offset or immediate value.• target address: Target address of the jump instruction.

op target address02631

6 bits 26 bits

op rs rt rd shamt funct061116212631

6 bits 6 bits5 bits5 bits5 bits5 bits

op rs rt Immediate (imm16)016212631

6 bits 16 bits5 bits5 bits

R-Type

I-Type: ALULoad/Store, Branch

J-Type: Jumps

[31:26] [25:21] [20:16] [15:11] [10:6] [5:0]

[31:26] [25:21] [20:16] [15:0]

[31:26] [25:0]

Or address offset

Page 6: Requirements ISA CPU Organization (Design) Datapath Designmeseec.ce.rit.edu/eecc550-winter2011/550-12-13-2011.pdf · 2011. 12. 13. · EECC550 - Shaaban #2 Lec # 4 Winter 2011 12-13-2011

EECC550 EECC550 -- ShaabanShaaban#6 Lec # 4 Winter 2011 12-13-2011

MIPS RMIPS R--Type (ALU) Instruction FieldsType (ALU) Instruction Fields

• op: Opcode, basic operation of the instruction. – For R-Type op = 0

• rs: The first register source operand.• rt: The second register source operand.• rd: The register destination operand.• shamt: Shift amount used in constant shift operations.• funct: Function, selects the specific variant of operation in the op field.

OP rs rt rd shamt funct6 bits 5 bits 5 bits 5 bits 5 bits 6 bits

R-Type: All ALU instructions that use three registers

add $1,$2,$3sub $1,$2,$3

and $1,$2,$3or $1,$2,$3

Examples:

Destination register in rd Operand register in rt

Operand register in rs

[31:26] [25:21] [20:16] [15:11] [10:6] [5:0]

1st operand 2nd operand Destination

Instruction Word ← Mem[PC]R[rd] ← R[rs] funct R[rt]PC ← PC + 4

Rs, rt , rdare register specifier fields

R-Type = Register Type Register Addressing used (Mode 1)

Independent RTN:

Funct field value examples:Add = 32 Sub = 34 AND = 36 OR =37 NOR = 39

Opcode for R-Type= 0

FunctionField

Page 7: Requirements ISA CPU Organization (Design) Datapath Designmeseec.ce.rit.edu/eecc550-winter2011/550-12-13-2011.pdf · 2011. 12. 13. · EECC550 - Shaaban #2 Lec # 4 Winter 2011 12-13-2011

EECC550 EECC550 -- ShaabanShaaban#7 Lec # 4 Winter 2011 12-13-2011

MIPS ALU IMIPS ALU I--Type Instruction FieldsType Instruction FieldsI-Type ALU instructions that use two registers and an immediate value

Loads/stores, conditional branches.

• op: Opcode, operation of the instruction.• rs: The register source operand. • rt: The result destination register.• immediate: Constant second operand for ALU instruction.

OP rs rt Immediate (imm16)

6 bits 5 bits 5 bits 16 bits

add immediate: addi $1,$2,100

and immediate andi $1,$2,10

Examples:

Result register in rtSource operand register in rs

Constant operandin immediate

[31:26] [25:21] [20:16] [15:0]

1st operand 2nd operandDestination

Instruction Word ← Mem[PC]R[rt] ← R[rs] + imm16PC ← PC + 4

Independent RTN for addi:

I-Type = Immediate Type Immediate Addressing used (Mode 2)

imm16

OP = 8

OP = 12

imm16 = 16 bit immediate field

imm16

Page 8: Requirements ISA CPU Organization (Design) Datapath Designmeseec.ce.rit.edu/eecc550-winter2011/550-12-13-2011.pdf · 2011. 12. 13. · EECC550 - Shaaban #2 Lec # 4 Winter 2011 12-13-2011

EECC550 EECC550 -- ShaabanShaaban#8 Lec # 4 Winter 2011 12-13-2011

MIPS Load/Store IMIPS Load/Store I--Type Instruction FieldsType Instruction Fields

• op: Opcode, operation of the instruction.– For load word op = 35, for store word op = 43.

• rs: The register containing memory base address. • rt: For loads, the destination register. For stores, the source

register of value to be stored. • address: 16-bit memory address offset in bytes added to base

register.

OP rs rt address

6 bits 5 bits 5 bits 16 bits

Store word: sw $3, 500($4)

Load word: lw $1, 32($2)

Examples: Offsetbase register in rs

source register in rt

Destination register in rt Offsetbase register in rs

Signed addressoffset in bytes

Base Src./Dest.

[31:26] [25:21] [20:16] [15:0]

(e.g. offset)

Instruction Word ← Mem[PC] R[rt] ← Mem[R[rs] + imm16]PC ← PC + 4

Instruction Word ← Mem[PC]Mem[R[rs] + imm16] ← R[rt] PC ← PC + 4

Base or Displacement Addressing used (Mode 3)

imm16

imm16 = 16 bit immediate field

imm16

SignExtended

Page 9: Requirements ISA CPU Organization (Design) Datapath Designmeseec.ce.rit.edu/eecc550-winter2011/550-12-13-2011.pdf · 2011. 12. 13. · EECC550 - Shaaban #2 Lec # 4 Winter 2011 12-13-2011

EECC550 EECC550 -- ShaabanShaaban#9 Lec # 4 Winter 2011 12-13-2011

MIPS Branch IMIPS Branch I--Type Instruction FieldsType Instruction Fields

• op: Opcode, operation of the instruction.• rs: The first register being compared• rt: The second register being compared.• address: 16-bit memory address branch target offset in words

added to PC to form branch address.

OP rs rt address

6 bits 5 bits 5 bits 16 bits

Branch on equal beq $1,$2,100

Branch on not equal bne $1,$2,100

Examples:

Register in rsRegister in rt offset in bytes equal to

instruction address field x 4

Signed addressoffset in words

Addedto PC+4 to formbranch target

[31:26] [25:21] [20:16] [15:0]

PC-Relative Addressing used (Mode 4)

(e.g. offset)

Instruction Word ← Mem[PC]R[rs] = R[rt] : PC ← PC + 4 + imm16 x 4R[rs] ≠ R[rt] : PC ← PC + 4

Independent RTN for beq:

imm16

imm16 = 16 bit immediate field

OP = 4

OP = 5

Word = 4 bytes

imm16

Imm16 x 4

Sign extended

Page 10: Requirements ISA CPU Organization (Design) Datapath Designmeseec.ce.rit.edu/eecc550-winter2011/550-12-13-2011.pdf · 2011. 12. 13. · EECC550 - Shaaban #2 Lec # 4 Winter 2011 12-13-2011

EECC550 EECC550 -- ShaabanShaaban#10 Lec # 4 Winter 2011 12-13-2011

MIPS JMIPS J--Type Instruction FieldsType Instruction Fields

• op: Opcode, operation of the instruction.– Jump j op = 2– Jump and link jal op = 3

• jump target: jump memory address in words.

J-Type: Include jump j, jump and link jal

OP jump target

6 bits 26 bits

jump target = 25004 bits 26 bits 2 bits

0 0PC(31-28)

Effective 32-bit jump address: PC(31-28),jump_target,00

FromPC+4

Jump j 10000

Jump and link jal 10000

Examples:

Jump memory address in bytes equal toinstruction field jump target x 4

[31:26] [25:0]

J-Type = Jump Type Pseudodirect Addressing used (Mode 5)

Jump targetin words

Instruction Word ← Mem[PC]PC ← PC + 4PC ← PC(31-28),jump_target,00

Independent RTN for j:

Word = 4 bytes

Page 11: Requirements ISA CPU Organization (Design) Datapath Designmeseec.ce.rit.edu/eecc550-winter2011/550-12-13-2011.pdf · 2011. 12. 13. · EECC550 - Shaaban #2 Lec # 4 Winter 2011 12-13-2011

EECC550 EECC550 -- ShaabanShaaban#11 Lec # 4 Winter 2011 12-13-2011

A Subset of MIPS InstructionsA Subset of MIPS InstructionsADD and SUB:

add rd, rs, rtsub rd, rs, rt

OR Immediate:ori rt, rs, imm16

LOAD and STORE Wordlw rt, rs, imm16sw rt, rs, imm16

BRANCH:beq rs, rt, imm16

op rs rt rd shamt funct061116212631

6 bits 6 bits5 bits5 bits5 bits5 bits

op rs rt Immediate (imm16)016212631

6 bits 16 bits5 bits5 bits

op rs rt Immediate (imm16)016212631

6 bits 16 bits5 bits5 bits

op rs rt Immediate (imm16)016212631

6 bits 16 bits5 bits5 bits

[31:26] [25:21] [20:16] [15:0]

[31:26] [25:21] [20:16] [15:0]

[31:26] [25:21] [20:16] [15:0]

[31:26] [25:21] [20:16] [15:11] [10:6] [5:0]

32 = add34 = sub

13

R

I

I

I

4

35 = lw43 = sw

Offset in words

Offset in bytes

0

Page 12: Requirements ISA CPU Organization (Design) Datapath Designmeseec.ce.rit.edu/eecc550-winter2011/550-12-13-2011.pdf · 2011. 12. 13. · EECC550 - Shaaban #2 Lec # 4 Winter 2011 12-13-2011

EECC550 EECC550 -- ShaabanShaaban#12 Lec # 4 Winter 2011 12-13-2011

Basic MIPS Instruction Processing StepsBasic MIPS Instruction Processing Steps

Obtain instruction from program storage

Determine instruction type

Obtain operands from registers

Compute result value or status

Store result in register/memory if needed

(usually called Write Back).

Update program counter to address

of next instructionCommonsteps for all instructions

InstructionFetch

InstructionDecode

Execute

ResultStore

NextInstruction

Instruction ← Mem[PC]

PC ← PC + 4

Done by Control Unit(Based on Opcode)

Instruction/program Memory

T = I x CPI x C

Page 13: Requirements ISA CPU Organization (Design) Datapath Designmeseec.ce.rit.edu/eecc550-winter2011/550-12-13-2011.pdf · 2011. 12. 13. · EECC550 - Shaaban #2 Lec # 4 Winter 2011 12-13-2011

EECC550 EECC550 -- ShaabanShaaban#13 Lec # 4 Winter 2011 12-13-2011

Overview of MIPS Instruction MicroOverview of MIPS Instruction Micro--operationsoperations• All instructions go through these common steps:

– Send program counter to instruction memory and fetch the instruction. (fetch) Instruction ← Mem[PC]

– Update the program counter to point to next instruction PC ← PC + 4– Read one or two registers, using instruction fields. (decode)

• Load reads one register only.• Additional instruction execution actions (execution) depend on the

instruction in question, but similarities exist:– All instruction classes (except J type) use the ALU after reading the

registers:• Memory reference instructions use it for effective address calculation.• Arithmetic and logic instructions (R-Type), use it for the specified

operation.• Branches use it for comparison.

• Additional execution steps where instruction classes differ:– Memory reference instructions: Access memory for a load or store.– Arithmetic and logic instructions: Write ALU result back in register.– Branch instructions: Possibly change next instruction address (update PC)

based on comparison (i.e if branch is taken).

CommonSteps

Page 14: Requirements ISA CPU Organization (Design) Datapath Designmeseec.ce.rit.edu/eecc550-winter2011/550-12-13-2011.pdf · 2011. 12. 13. · EECC550 - Shaaban #2 Lec # 4 Winter 2011 12-13-2011

EECC550 EECC550 -- ShaabanShaaban#14 Lec # 4 Winter 2011 12-13-2011

Data

Register #

Register #

Register #

PC Address Instruction

Inst ruct ionm em ory

Regist ers ALU Address

Data

Dat am em ory

AddAdd

4

A Single Cycle MIPS CPU DesignA Single Cycle MIPS CPU DesignDesign target: A single-cycle per instruction MIPS CPU design

All micro-operations of an instruction are to be carried out in a single CPU clock cycle. Cycles Per Instruction = CPI = 1

CPI = 1

4th Edition Figure 4.1 page 302 - 3rd Edition Figure 5.1 page 287

Abstract view of single cycle MIPSCPU showing major functional units (components) and major connectionsbetween them

T = I x CPI x CCPU Performance Equation:

32

32

323232

ISA

Page 15: Requirements ISA CPU Organization (Design) Datapath Designmeseec.ce.rit.edu/eecc550-winter2011/550-12-13-2011.pdf · 2011. 12. 13. · EECC550 - Shaaban #2 Lec # 4 Winter 2011 12-13-2011

EECC550 EECC550 -- ShaabanShaaban#15 Lec # 4 Winter 2011 12-13-2011

Instruction Word ← Mem[PC] Fetch the instruction

PC ← PC + 4 Increment PC

R[rd] ← R[rs] + R[rt] Add register rs to register rt result in register rd

RR--Type Example:Type Example:MicroMicro--Operation Sequence For ADDOperation Sequence For ADD

OP rs rt rd shamt funct6 bits 5 bits 5 bits 5 bits 5 bits 6 bits

add rd, rs, rt

Independent RTN ?

[31:26] [25:21] [20:16] [15:11] [10:6] [5:0]

CommonSteps

ProgramMemory

32 = add34 = sub

0 0

i.e Funct =add = 32

Page 16: Requirements ISA CPU Organization (Design) Datapath Designmeseec.ce.rit.edu/eecc550-winter2011/550-12-13-2011.pdf · 2011. 12. 13. · EECC550 - Shaaban #2 Lec # 4 Winter 2011 12-13-2011

EECC550 EECC550 -- ShaabanShaaban#16 Lec # 4 Winter 2011 12-13-2011

PC

Instructionaddress

Instruction

Inst ruct ionm em or y

Add Sum

a. Instruction memory b. Program counter c. Adder

32

32 3232

32

32

32

InstructionWord

Initial Initial DatapathDatapath ComponentsComponents

Two state elements (memory) needed to store and access instructions:1 Instruction memory:• Only read access (by user code). No read control signal needed.

2 Program counter (PC): 32-bit register.• Written at end of every clock cycle (edge-triggered) : No write control signal.

3 32-bit Adder: To compute the the next instruction address (PC + 4).

Three components needed by: Instruction Fetch: Instruction ← Mem[PC]Program Counter Update: PC ← PC + 4

+ Basics of logic design/logic building blocks review in Appendix C in Book CD (4th Edition Appendix B)

4th Edition Figure 4.5, page 308 - 3rd Edition Figure 5.5, page 293

32-bit

Page 17: Requirements ISA CPU Organization (Design) Datapath Designmeseec.ce.rit.edu/eecc550-winter2011/550-12-13-2011.pdf · 2011. 12. 13. · EECC550 - Shaaban #2 Lec # 4 Winter 2011 12-13-2011

EECC550 EECC550 -- ShaabanShaaban#17 Lec # 4 Winter 2011 12-13-2011

4th Edition Figure 4.6 page 309 - 3rd Edition Figure 5.6 page 293

PC Readaddress

Instruction

Inst ruct ionm em or y

Add

4

Building The DatapathBuilding The Datapath

Portion of the datapathused for fetching instructionsand incrementing the program counter (PC).

Instruction Fetch& PC Update:

PC ← PC + 4

Instruction ←Mem[PC]

PC write or update is edge triggered at the end of the cycleClock input to PC, memory not shown

Instruction ← Mem[PC]PC ← PC + 4

32

32

32

32

32

1

2

1

2

Page 18: Requirements ISA CPU Organization (Design) Datapath Designmeseec.ce.rit.edu/eecc550-winter2011/550-12-13-2011.pdf · 2011. 12. 13. · EECC550 - Shaaban #2 Lec # 4 Winter 2011 12-13-2011

EECC550 EECC550 -- ShaabanShaaban#18 Lec # 4 Winter 2011 12-13-2011

Readregister 1

Readregister 2

Writeregister

WriteData

Regist ers ALUData

Data

ZeroALU

result

RegWrite

a. Registers b. ALU

5

5

5

Registernumbers

Readdata 1

Readdata 2

ALU operation4

32

32

32

32

32

More More DatapathDatapath ComponentsComponents

Register File:• Contains all ISA registers.• Two read ports and one write port.• Register writes by asserting write control signal• Clocking Methodology: Writes are edge-triggered.

• Thus can read and write to the same register in the same clock cycle.

ISA Register File Main 32-bit ALU

32-bit Arithmetic and Logic Unit (ALU)

(Function)

Zero = Zero flag = 1When ALU result equals zero

R[rs]

R[rt]

4th Edition Figure 4.7, page 310 - 3rd Edition Figure 5.7, page 295

+ Basics of logic design/logic building blocks review in Appendix C in Book CD (4th Edition Appendix B)

e.g add = 0010

Page 19: Requirements ISA CPU Organization (Design) Datapath Designmeseec.ce.rit.edu/eecc550-winter2011/550-12-13-2011.pdf · 2011. 12. 13. · EECC550 - Shaaban #2 Lec # 4 Winter 2011 12-13-2011

EECC550 EECC550 -- ShaabanShaaban#19 Lec # 4 Winter 2011 12-13-2011

• Register File consists of 32 registers:– Two 32-bit output busses:

busA and busB– One 32-bit input bus: busW

• Register is selected by:– RA (number) selects the register to put on busA (data):

busA = R[RA]– RB (number) selects the register to put on busB (data):

busB = R[RB]– RW (number) selects the register to be written

via busW (data) when Write Enable is 1Write Enable: R[RW] ← busW

• Clock input (CLK) – The CLK input is a factor ONLY during write operations.– During read operation, it behaves as a combinational logic block:

• RA or RB valid => busA or busB valid after “access time.”

Register File DetailsRegister File Details

Clk

busW

Write Enable

3232

busA

32busB

5 5 5RW RA RB

32 32-bitRegisters

Write Data

Page 20: Requirements ISA CPU Organization (Design) Datapath Designmeseec.ce.rit.edu/eecc550-winter2011/550-12-13-2011.pdf · 2011. 12. 13. · EECC550 - Shaaban #2 Lec # 4 Winter 2011 12-13-2011

EECC550 EECC550 -- ShaabanShaaban#20 Lec # 4 Winter 2011 12-13-2011

A Possible Register File Implementation

5-to-32Decoder

Register 0Write

Data In

DataOut

Register 1Write

Data In

DataOut

Register 30Write

Data In

DataOut

Register 31Write

Data In

DataOut

... ...

32-to-1MUX

01

3031

32...

5

32-to-1MUX

01

3031

32...

5

.

..

.

32

32

32

32

.

.

.

...

01

3031

5

.

.

.

RegisterRead Data 1(Bus A)

RegisterRead Data 2(Bus B)

Read Register 1(RA)

Read Register 2(RB)

Register Write Data (Bus W)

32

Register Write Enable (RegWrite)

WriteRegisterRW

Clk

busW

Write Enable

3232

busA

32busB

5 5 5RW RA RB

32 32-bitRegisters

Also see Appendix C in Book CD (3rd Edition Appendix B) - The Basics of Logic Design

Each Register contains 32 edge triggered D-Flip Flops

Clock input to registersnot shown in diagram

R[rs]

R[rt]

rs

rt

rd?rt?

Write Port2 Read Ports

Page 21: Requirements ISA CPU Organization (Design) Datapath Designmeseec.ce.rit.edu/eecc550-winter2011/550-12-13-2011.pdf · 2011. 12. 13. · EECC550 - Shaaban #2 Lec # 4 Winter 2011 12-13-2011

EECC550 EECC550 -- ShaabanShaaban#21 Lec # 4 Winter 2011 12-13-2011

Idealized MemoryIdealized Memory• Memory (idealized)

– One input bus: Data In.– One output bus: Data Out.

• Memory word is selected by:– Address selects the word to put on Data Out bus.– Write Enable = 1: address selects the memory

word to be written via the Data In bus.• Clock input (CLK):

– The CLK input is a factor ONLY during write operation,– During read operation, this memory behaves as

a combinational logic block:• Address valid => Data Out valid after “access time.”

• Ideal Memory = Short access time.

Clk

Data In

Write Enable

32 32DataOut

Address

Read Enable

Compared to other components in CPU datapath

Page 22: Requirements ISA CPU Organization (Design) Datapath Designmeseec.ce.rit.edu/eecc550-winter2011/550-12-13-2011.pdf · 2011. 12. 13. · EECC550 - Shaaban #2 Lec # 4 Winter 2011 12-13-2011

EECC550 EECC550 -- ShaabanShaaban#22 Lec # 4 Winter 2011 12-13-2011

Clocking Methodology Used:Edge Triggered Writes

• All storage element (e.g Flip-Flops, Registers, Data Memory) writes are triggered by the same clock edge.

• Cycle Time = CLK-to-Q + Longest Delay Path + Setup + Clock Skew

Clk

Don’t CareSetup Hold

.

.

.

.

.

.

.

.

.

.

.

.

Setup Hold

Here writes are triggered on the rising edge of the clock

Critical Path (Longest delay path)

CLK-to-QCLK-to-Q

ClockClock

Page 23: Requirements ISA CPU Organization (Design) Datapath Designmeseec.ce.rit.edu/eecc550-winter2011/550-12-13-2011.pdf · 2011. 12. 13. · EECC550 - Shaaban #2 Lec # 4 Winter 2011 12-13-2011

EECC550 EECC550 -- ShaabanShaaban#23 Lec # 4 Winter 2011 12-13-2011

Simplified Simplified DatapathDatapath For MIPS For MIPS RR--Type InstructionsType Instructions

InstructionRegisters

Write�register

Read�data 1

Read�data 2

Read�register 1

Read�register 2

Write�data

ALU�result

ALUZero

RegWrite

ALU operation3

Components and connections as specified by RTN statementR[rd] ← R[rs] + R[rt]

32

32

32

32

rs

rt

rd

R[rs]

R[rt]

FromInstructionMemory

Destination register R[rd] write or update isedge triggered at the end of the cycle

4[25:21]

[20:16]

[15:11]

Clock input to register bank not showni.e Funct = function =add

(Function)e.g add = 0010

Page 24: Requirements ISA CPU Organization (Design) Datapath Designmeseec.ce.rit.edu/eecc550-winter2011/550-12-13-2011.pdf · 2011. 12. 13. · EECC550 - Shaaban #2 Lec # 4 Winter 2011 12-13-2011

EECC550 EECC550 -- ShaabanShaaban#24 Lec # 4 Winter 2011 12-13-2011

More Detailed More Detailed DatapathDatapathFor RFor R--Type InstructionsType Instructions

With Control Points IdentifiedWith Control Points Identified

R[rd] ← R[rs] + R[rt]

i.e Funct = function =add

32

Result

ALUctr

Clk

busW

RegWr

32

32

busA

32

busB

5 5 5

Rw Ra Rb

32 32-bitRegisters

Rs RtRdA

LU

R[rs]

R[rt]

Function =Add, Subtract …4

Page 25: Requirements ISA CPU Organization (Design) Datapath Designmeseec.ce.rit.edu/eecc550-winter2011/550-12-13-2011.pdf · 2011. 12. 13. · EECC550 - Shaaban #2 Lec # 4 Winter 2011 12-13-2011

EECC550 EECC550 -- ShaabanShaaban#25 Lec # 4 Winter 2011 12-13-2011

RR--Type RegisterType Register--Register TimingRegister Timing

32Result

ALUctr

Clk

busW

RegWr

3232

busA

32busB

5 5 5

Rw Ra Rb32 32-bitRegisters

Rs RtRdA

LU

Clk

PC

Rs, Rt, Rd,Op, Func

Clk-to-Q

ALUctr

Instruction Memory Access Time

Old Value

New Value

RegWr Old Value

New Value

Delay through Control Logic

busA, B

Register File Access TimeOld Value

New Value

busWALU Delay

Old Value

New Value

Old Value

New Value

New ValueOld Value

Register WriteOccurs Here

PC+4

All register writes occur onfalling edge of clock(clocking methodology)

R[rs]

R[rt]

Page 26: Requirements ISA CPU Organization (Design) Datapath Designmeseec.ce.rit.edu/eecc550-winter2011/550-12-13-2011.pdf · 2011. 12. 13. · EECC550 - Shaaban #2 Lec # 4 Winter 2011 12-13-2011

EECC550 EECC550 -- ShaabanShaaban#26 Lec # 4 Winter 2011 12-13-2011

Instruction Word ← Mem[PC] Fetch the instruction

PC ← PC + 4 Increment PC

R[rt] ← R[rs] OR ZeroExt[imm16] OR register rs with immediate field zero extended to 32 bits, result in register rt

Logical Operations with Immediate ExampleExample::MicroMicro--Operation Sequence For ORIOperation Sequence For ORI

op rs rt Immediate (imm16)016212631

6 bits 16 bits5 bits5 bits

ori rt, rs, imm16

[31:26] [25:21] [20:16] [15:0]

Done by Main ALU

13

Imm16000 ….. 000

CommonSteps

Not in book version

Page 27: Requirements ISA CPU Organization (Design) Datapath Designmeseec.ce.rit.edu/eecc550-winter2011/550-12-13-2011.pdf · 2011. 12. 13. · EECC550 - Shaaban #2 Lec # 4 Winter 2011 12-13-2011

EECC550 EECC550 -- ShaabanShaaban#27 Lec # 4 Winter 2011 12-13-2011

DatapathDatapath For Logical For Logical Instructions With ImmediateInstructions With Immediate

R[rt] ← R[rs] OR ZeroExt[imm16]

32

Result

ALUctr

Clk

busW

RegWr

32

32

busA

32

busB

5 5 5

Rw Ra Rb

32 32-bitRegisters

Rs Rt

RtRdRegDst

ZeroE

xt

Mux

Mux

3216imm16

ALUSrc

AL

U

01

R[rs]

R[rt]

2x1 MUX (width 5 bits)

2x1 MUX(width 32 bits)

Function = OR

0

1

Imm16000 ….. 000

Page 28: Requirements ISA CPU Organization (Design) Datapath Designmeseec.ce.rit.edu/eecc550-winter2011/550-12-13-2011.pdf · 2011. 12. 13. · EECC550 - Shaaban #2 Lec # 4 Winter 2011 12-13-2011

EECC550 EECC550 -- ShaabanShaaban#28 Lec # 4 Winter 2011 12-13-2011

Instruction Word ← Mem[PC] Fetch the instruction

PC ← PC + 4 Increment PC

R[rt] ← Mem[R[rs] + SignExt[imm16]] Immediate field sign extended to 32 bits and added to register rsto form memory load address,write word at load effective address to register rt

Load Operations ExampleExample::MicroMicro--Operation Sequence For LWOperation Sequence For LW

op rs rt Immediate (imm16)016212631

6 bits 16 bits5 bits5 bits

lw rt, rs, imm16

Data Memory

Instruction Memory

Effective Address

SignedAddress offsetin bytes

[31:26] [25:21] [20:16] [15:0]35

To load from

CommonSteps

Page 29: Requirements ISA CPU Organization (Design) Datapath Designmeseec.ce.rit.edu/eecc550-winter2011/550-12-13-2011.pdf · 2011. 12. 13. · EECC550 - Shaaban #2 Lec # 4 Winter 2011 12-13-2011

EECC550 EECC550 -- ShaabanShaaban#29 Lec # 4 Winter 2011 12-13-2011

Additional Additional DatapathDatapath Components For Components For Loads & StoresLoads & Stores

Inputs:for address and write (store) dataOutputfor read (load) data

16-bit input sign-extended into a 32-bit value at the output

Data memory write or update is edge triggered at the end of the cycle (clocking methodology)

Address Readdata

Dat am em ory

a. Data memory unit

Writedata

M emRead

M emWrite

b. Sign-extension unit

Sig ne x t e n d

16 32

For SignExt[imm16]

32

32

32

4th Edition Figure 4.8, page 3113rd Edition Figure 5.8, page 296

Page 30: Requirements ISA CPU Organization (Design) Datapath Designmeseec.ce.rit.edu/eecc550-winter2011/550-12-13-2011.pdf · 2011. 12. 13. · EECC550 - Shaaban #2 Lec # 4 Winter 2011 12-13-2011

EECC550 EECC550 -- ShaabanShaaban#30 Lec # 4 Winter 2011 12-13-2011

DatapathDatapath For LoadsFor Loads

R[rt] ← Mem[R[rs] + SignExt[imm16]]

32

ALUctr

Clk

busW

RegWr

3232

busA

32busB

5 5 5

Rw Ra Rb32 32-bitRegisters

Rs

RtRdRegDst

Extender

Mux

Mux

3216

imm16

ALUSrc

ExtOp

Clk

Data InWrEn

32

Adr

DataMemory

32

AL

U

MemWr Mux

MemtoReg

1 0

0

1

0

1

R[rs]

R[rt]

Base Address register

EffectiveAddress

Offset

MemRd

Function = add

Effective AddressData Memory

LWLW

Page 31: Requirements ISA CPU Organization (Design) Datapath Designmeseec.ce.rit.edu/eecc550-winter2011/550-12-13-2011.pdf · 2011. 12. 13. · EECC550 - Shaaban #2 Lec # 4 Winter 2011 12-13-2011

EECC550 EECC550 -- ShaabanShaaban#31 Lec # 4 Winter 2011 12-13-2011

Instruction Word ← Mem[PC] Fetch the instruction

PC ← PC + 4 Increment PC

Mem[R[rs] + SignExt[imm16]] ← R[rt] Immediate field sign extended to 32 bits and added to register rsto form memory store effective address, register rt written to memory at store effective address.

Store Operations ExampleExample::MicroMicro--Operation Sequence For SWOperation Sequence For SW

op rs rt Immediate (imm16)016212631

6 bits 16 bits5 bits5 bits

sw rt, rs, imm16

Effective Address

SignedAddress offsetin bytes

Data Memory

43[31:26] [25:21] [20:16] [15:0]

To store at

CommonSteps

Page 32: Requirements ISA CPU Organization (Design) Datapath Designmeseec.ce.rit.edu/eecc550-winter2011/550-12-13-2011.pdf · 2011. 12. 13. · EECC550 - Shaaban #2 Lec # 4 Winter 2011 12-13-2011

EECC550 EECC550 -- ShaabanShaaban#32 Lec # 4 Winter 2011 12-13-2011

DatapathDatapath For StoresFor Stores

Mem[R[rs] + SignExt[imm16]] ← R[rt]

ALUSrcExtOp

32

ALUctr

Clk

busW

RegWr

3232

busA

32busB

55 5

Rw Ra Rb32 32-bitRegisters

Rs

Rt

Rt

RdRegDst

Extender

Mux

Mux

3216imm16

Clk

Data InWrEn

32Adr

DataMemory

MemWr

AL

U32

Mux

MemtoReg1 0

0

1

0

1

R[rs]

R[rt]

Base Address register

EffectiveAddress

OffsetR[rt]

MemRd

Effective AddressData Memory

Add =

SWSW

Page 33: Requirements ISA CPU Organization (Design) Datapath Designmeseec.ce.rit.edu/eecc550-winter2011/550-12-13-2011.pdf · 2011. 12. 13. · EECC550 - Shaaban #2 Lec # 4 Winter 2011 12-13-2011

EECC550 EECC550 -- ShaabanShaaban#33 Lec # 4 Winter 2011 12-13-2011

Instruction Word ← Mem[PC] Fetch the instruction

PC ← PC + 4 Increment PC

Zero ← R[rs] - R[rt] Calculate the branch conditionR[rs] == R[rt]

(i.e R[rs] - R[rt] = 0 )

Zero : PC ← PC + ( SignExt(imm16) x 4 ) Calculate the next instruction’s PC address

Conditional Branch ExampleExample::MicroMicro--Operation Sequence For BEQOperation Sequence For BEQ

op rs rt immediate016212631

6 bits 16 bits5 bits5 bits

beq rs, rt, imm16

Branch Target

PC Offset in words[31:26] [25:21] [20:16] [15:0]

“Zero” is zero flag of main ALU

Condition Action

4

Then Zero = 1

CommonSteps

Page 34: Requirements ISA CPU Organization (Design) Datapath Designmeseec.ce.rit.edu/eecc550-winter2011/550-12-13-2011.pdf · 2011. 12. 13. · EECC550 - Shaaban #2 Lec # 4 Winter 2011 12-13-2011

EECC550 EECC550 -- ShaabanShaaban#34 Lec # 4 Winter 2011 12-13-2011

Readregister 1

Readregister 2

Writeregister

Writedata

Re g is t e r s ALU Zero

RegWrite

Readdata 1

Readdata 2

ALU operation4

To branchcontrol logic

Add Sum Branchtarget

PC+4 from instruction datapath

Signext end

16 32

Instruction

Shif tlef t 2

Datapath For Branch Instructions

Main ALU evaluates branch conditionNew adder to compute branch target:• Sum of incremented PC and

sign-extended lower 16-bits on theinstruction.

Main ALU EvaluatesBranch Condition(subtract)

R[rs]

R[rt]Zero flag =1if R[rs] - R[rt] = 0(i.e R[rs] = R[rt])

New 32-bit Adder (Third ALU)for Branch Target

PC + 4 + ( SignExt(imm16) x 4

SignExt(imm16) x 4

[25:21] rs

[20:16] rt

[15:0] imm16SignExt(imm16)

(Main ALU)

4th Edition Figure 4.9, page 312 - 3rd Edition Figure 5.9, page 297

= Subtract

ISA

Zero ← R[rs] - R[rt]

Page 35: Requirements ISA CPU Organization (Design) Datapath Designmeseec.ce.rit.edu/eecc550-winter2011/550-12-13-2011.pdf · 2011. 12. 13. · EECC550 - Shaaban #2 Lec # 4 Winter 2011 12-13-2011

EECC550 EECC550 -- ShaabanShaaban#35 Lec # 4 Winter 2011 12-13-2011

More Detailed More Detailed DatapathDatapathFor Branch OperationsFor Branch Operations

Clk

busW

RegWr

32

busA

32busB

5 5 5

Rw Ra Rb32 32-bitRegisters

Rs Rt

Equ

al?

Zero

32

imm16

PCClk

00

Adder

Mux

Adder

4

PC E

xt

Instruction Address32

BranchZero

PC+4

BranchTarget

0

1 Main ALU(subtract)

Branch Target ALU New 2X1 32-bitMUX to select next PC valueSign extend

shift left 2

PC

New Third ALU(adder)

R[rs]

R[rt]

Zero ← R[rs] - R[rt]

Page 36: Requirements ISA CPU Organization (Design) Datapath Designmeseec.ce.rit.edu/eecc550-winter2011/550-12-13-2011.pdf · 2011. 12. 13. · EECC550 - Shaaban #2 Lec # 4 Winter 2011 12-13-2011

EECC550 EECC550 -- ShaabanShaaban#36 Lec # 4 Winter 2011 12-13-2011

Combining The Datapaths For Memory Combining The Datapaths For Memory Instructions and RInstructions and R--Type InstructionsType Instructions

Highlighted muliplexors and connections added to combine the datapaths of memory and R-Type instructions into one datapath

[25:21] rs

[20:16] rtR[rs]

R[rt]

R[rt]

0

1

1

0

rt/rdMUX not shown

4

[15:0] imm16 SignExt(imm16)

This is book version ORI not supported

4th Edition Figure 4.10 Page 314 - 3rd Edition Figure 5.10 Page 299

32

32

32

32

32

Page 37: Requirements ISA CPU Organization (Design) Datapath Designmeseec.ce.rit.edu/eecc550-winter2011/550-12-13-2011.pdf · 2011. 12. 13. · EECC550 - Shaaban #2 Lec # 4 Winter 2011 12-13-2011

EECC550 EECC550 -- ShaabanShaaban#37 Lec # 4 Winter 2011 12-13-2011

Instruction Fetch Datapath Added toInstruction Fetch Datapath Added toALU RALU R--Type and Memory Instructions DatapathType and Memory Instructions Datapath

This is book version ORI not supported, no zero extend of immediate needed

Readregister 1

Readregister 2

Writeregister

Writedata

Writedata

Regist ers

A LUZero

RegWrite

M emRead

M emWriteM emtoReg

Readdata 1

Readdata 2

ALU operation4

Signext end

16 32

ALUresultM

ux

0

1

Mux

1

0

ALUSrcAddress

Dat am em or y

Readdata

PC Readaddress

Instruction

Inst ruct ionm em or y

Add

4

R[rs]

R[rt]

rs

rt

PC+ 4

PC

rt/rdMUX not shown

32

32

32

32

32R[rt]

Combination of Figure 4.10 (p. 314) and Figure 4.6 (p. 309)[3rd Edition Figure 5.10 (p. 299) and Figure 5.6 (p. 293)]

Page 38: Requirements ISA CPU Organization (Design) Datapath Designmeseec.ce.rit.edu/eecc550-winter2011/550-12-13-2011.pdf · 2011. 12. 13. · EECC550 - Shaaban #2 Lec # 4 Winter 2011 12-13-2011

EECC550 EECC550 -- ShaabanShaaban#38 Lec # 4 Winter 2011 12-13-2011

A Simple Datapath For The MIPS ArchitectureA Simple Datapath For The MIPS ArchitectureDatapath of branches and a program counter multiplexor are added.Resulting datapath can execute in a single cycle the basic MIPS instruction:

- load/store word - ALU operations - Branches

This is book version ORI not supported, no zero extend of immediate needed

4th Edition Figure 4.11 page 315 - 3rd Edition Figure 5.11 page 300

1

0

0

1

PC +4

Branch Target

rt/rdMUX not shown

4

Branch

Zero

32

32

32

32

32

32

32

32

R[rs]

R[rt]

rs

rt

Page 39: Requirements ISA CPU Organization (Design) Datapath Designmeseec.ce.rit.edu/eecc550-winter2011/550-12-13-2011.pdf · 2011. 12. 13. · EECC550 - Shaaban #2 Lec # 4 Winter 2011 12-13-2011

EECC550 EECC550 -- ShaabanShaaban#39 Lec # 4 Winter 2011 12-13-2011

Main ALU Control• The main ALU has four control lines (detailed design in Appendix B)

with the following functions:

• For our current subset of MIPS instructions only the top fivefunctions will be used (thus only three control lines will be used)

• For R-type instruction the ALU function depends on both the opcodeand the 6-bit “funct” function field

• For other instructions the ALU function depends on the opcode only.• A local ALU control unit can be designed to accept 2-bit ALUop

control lines (from main control unit) and the 6-bit function field and generate the correct 4-bit ALU control lines.

ALU Control Lines ALU Function00000001001001100111

ANDORadd

subtractSet-on-less-than

1100 NOR Not Used

Or 3 bits depending on number functions actually used

4th EditionAppendix C

3rd Edition

Page 40: Requirements ISA CPU Organization (Design) Datapath Designmeseec.ce.rit.edu/eecc550-winter2011/550-12-13-2011.pdf · 2011. 12. 13. · EECC550 - Shaaban #2 Lec # 4 Winter 2011 12-13-2011

EECC550 EECC550 -- ShaabanShaaban#40 Lec # 4 Winter 2011 12-13-2011

InstructionOpcode

LWSWBranch EqualR-TypeR-TypeR-TypeR-TypeR-Type

ALUOp

0000011010101010

Funct Field

XXXXXXXXXXXXXXXXXX100000100010100100100101101010

DesiredALU Action

addaddsubtractaddsubtractandorset on less than

ALU ControlLines

00100010011000100110000000010111

InstructionOperation

Load wordStore wordbranch equaladdsubtractANDORset on less than

MainControl

op6

ALUControl(Local)

func

2

6ALUop

ALUctr4

AL

U

Local ALU Decoding of Local ALU Decoding of ““funcfunc”” FieldField

R-Type = 10

Add = 00 Subtract = 01

OpcodeOr3 bits

Page 41: Requirements ISA CPU Organization (Design) Datapath Designmeseec.ce.rit.edu/eecc550-winter2011/550-12-13-2011.pdf · 2011. 12. 13. · EECC550 - Shaaban #2 Lec # 4 Winter 2011 12-13-2011

EECC550 EECC550 -- ShaabanShaaban#41 Lec # 4 Winter 2011 12-13-2011

Local ALU Control Unit

3 ALU Control Lines

FunctionField

(2 lines From main control unit)

More details found in Appendix D in Book CD – (3rd Edition Appendix C)

4th line = 0

AddSubtractAddSubtractANDORSet-On-less-Than

Page 302

R-type=10 {Add = 00

Subtract = 01

2

Page 42: Requirements ISA CPU Organization (Design) Datapath Designmeseec.ce.rit.edu/eecc550-winter2011/550-12-13-2011.pdf · 2011. 12. 13. · EECC550 - Shaaban #2 Lec # 4 Winter 2011 12-13-2011

EECC550 EECC550 -- ShaabanShaaban#42 Lec # 4 Winter 2011 12-13-2011

Readregister 1

Readregister 2

Writeregister

Writedata

Writedata

Registers

ALU

Add

Zero

MemRead

MemWrite

RegWrite

PCSrc

MemtoReg

Readdata 1

Readdata 2

Signextend

16 32

Instruction[31:0] ALU

result

Add

ALUresult

Mux

Mux

Mux

ALUSrc

Address

Datamemory

Readdata

Shiftleft 2

4

Readaddress

Instructionmemory

PC

1

0

0

1

0

1

Mux

0

1

ALUcontrol

ALUOpInstruction [5:0]

Instruction [25:21]

Instruction [15:11]

Instruction [20:16]

Instruction [15:0]

RegDst

Function Field

Branch

Zero

imm16

rs

rt

32

32

32

PC +4

rd

32

R[rs]

R[rt]

3232

Branch Target

PC +4

32

ALUOp (2-bits)00 = add01 = subtract10 = R-Type

R[rt]

32

Single Cycle MIPS DatapathSingle Cycle MIPS DatapathNecessary multiplexors and control linesare identified here and local ALU control added:

This is book version ORI not supported, no zero extend of immediate needed

4th Edition Figure 4.15 page 320 - 3rd Edition Figure 5.15 page 305

Page 43: Requirements ISA CPU Organization (Design) Datapath Designmeseec.ce.rit.edu/eecc550-winter2011/550-12-13-2011.pdf · 2011. 12. 13. · EECC550 - Shaaban #2 Lec # 4 Winter 2011 12-13-2011

EECC550 EECC550 -- ShaabanShaaban#43 Lec # 4 Winter 2011 12-13-2011

Putting It All Together: A Single Cycle Putting It All Together: A Single Cycle DatapathDatapathim

m16

32

ALUop(2-bits)

Clk

busW

RegWr

3232

busA

32busB

55 5

Rw Ra Rb32 32-bitRegisters

Rs

Rt

Rt

RdRegDst

Extender

Mux

3216imm16

ALUSrcExtOp

Mux

MemtoReg

Clk

Data InWrEn32 Adr

DataMemory

MemWrA

LU

Zero

Instruction<31:0>

0

1

0

1

01

<21:25>

<16:20>

<11:15>

<0:15>

Imm16RdRtRs

=

Adder

Adder

PC

Clk

00Mux

4

PCSrc

PC E

xt

Adr

InstMemory

BranchZero

0

1

PC+4

BranchTarget

R[rs]

R[rt]

MainALU

(Includes ORInot in book version)

ALUControlFunction

Field

MemRd

00 = add01 = subtract10 = R-Type

e.g Sign Extend + Shift Left 2

Page 44: Requirements ISA CPU Organization (Design) Datapath Designmeseec.ce.rit.edu/eecc550-winter2011/550-12-13-2011.pdf · 2011. 12. 13. · EECC550 - Shaaban #2 Lec # 4 Winter 2011 12-13-2011

EECC550 EECC550 -- ShaabanShaaban#44 Lec # 4 Winter 2011 12-13-2011

RegDst

Instruction<31:0>

<21:25>

<16:20>

<11:15>

<0:15>

Imm16RdRsRt

Adr

InstructionMemory

DATA PATHDATA PATH

ALUSrcALOp(2-bits)

MemReadMemtoReg

Control UnitControl Unit

Op

<21:25>

Fun

BranchRegWrite

<0:25>

Jump_target

Control LinesMemWrite

Page 45: Requirements ISA CPU Organization (Design) Datapath Designmeseec.ce.rit.edu/eecc550-winter2011/550-12-13-2011.pdf · 2011. 12. 13. · EECC550 - Shaaban #2 Lec # 4 Winter 2011 12-13-2011

EECC550 EECC550 -- ShaabanShaaban#45 Lec # 4 Winter 2011 12-13-2011

The Effect of The Control SignalsSignalName

RegDst

RegWrite

ALUSrc

Branch

MemRead

MemWrite

MemtoReg

Effect when deasserted (=0)

The register destination number for thewrite register comes from the rt field(instruction bits 20:16).

None

The second main ALU operand comes from the second register file output(Read data 2) R[rt]

The PC is replaced by the output of the adder that computes PC + 4

None

None

The value fed to the register write data input comes from the main ALU.

Effect when asserted (=1)

The register destination number for thewrite register comes from the rd field(instruction bits 15:11).

The register on the write register inputis written with the value on the Writedata input.

The second main ALU operand is thesign-extended lower 16 bits on the instruction (imm16)

If Zero =1 The PC is replaced by the output of the adder that computes the branch target.

Data memory contents designated by the address input are put on the Read dataoutput.

Data memory contents designated by the address input are replaced by the value on the Write data input.

The value fed to the register write data input comes from data memory.

(BEQ)

Page 46: Requirements ISA CPU Organization (Design) Datapath Designmeseec.ce.rit.edu/eecc550-winter2011/550-12-13-2011.pdf · 2011. 12. 13. · EECC550 - Shaaban #2 Lec # 4 Winter 2011 12-13-2011

EECC550 EECC550 -- ShaabanShaaban#46 Lec # 4 Winter 2011 12-13-2011

Control Line Settings

Instruction

R-Format

lw

sw

beq

RegDst

1

0

X

X

ALUSrc

0

1

1

0

Memto-Reg

0

1

X

X

RegWrite

1

1

0

0

MemRead

0

1

0

0

MemWrite

0

0

1

0

Branch

0

0

0

1

ALUOp1

1

0

0

0

ALUOp0

0

0

0

1

4th Edition Figure 4.18 page 3233rd Edition Figure 5.18 page 308

ALUOp (2-bits)00 = add01 = subtract10 = R-Type

ControlLines

Page 47: Requirements ISA CPU Organization (Design) Datapath Designmeseec.ce.rit.edu/eecc550-winter2011/550-12-13-2011.pdf · 2011. 12. 13. · EECC550 - Shaaban #2 Lec # 4 Winter 2011 12-13-2011

EECC550 EECC550 -- ShaabanShaaban#47 Lec # 4 Winter 2011 12-13-2011

The Truth Table For The Main ControlThe Truth Table For The Main Control

(Opcode)

Similar to Figure 4.22 Page 327 (3rd Edition Figure 5.22 Page 312)

Page 48: Requirements ISA CPU Organization (Design) Datapath Designmeseec.ce.rit.edu/eecc550-winter2011/550-12-13-2011.pdf · 2011. 12. 13. · EECC550 - Shaaban #2 Lec # 4 Winter 2011 12-13-2011

EECC550 EECC550 -- ShaabanShaaban#48 Lec # 4 Winter 2011 12-13-2011

PLA Implementation of the Main ControlPLA Implementation of the Main Control

PLA = Programmable Logic Array - Appendix C (3rd Edition Appendix B)

Figure D.2.5 in Appendix D (3rd Edition Figure C.2.5 in Appendix C)

ControlLines

To Datapath

Page 49: Requirements ISA CPU Organization (Design) Datapath Designmeseec.ce.rit.edu/eecc550-winter2011/550-12-13-2011.pdf · 2011. 12. 13. · EECC550 - Shaaban #2 Lec # 4 Winter 2011 12-13-2011

EECC550 EECC550 -- ShaabanShaaban#49 Lec # 4 Winter 2011 12-13-2011

Readregister 1

Readregister 2

Writeregister

Writedata

Writedata

Registers

ALU

Add

Zero

Readdata 1

Readdata 2

Signextend

16 32

Instruction[31–0] ALU

result

Add

ALUresult

Mux

Mux

Mux

Address

Datamemory

Readdata

Shiftleft 2

4

Readaddress

Instructionmemory

PC

1

0

0

1

0

1

Mux

0

1

ALUcontrol

Instruction [5–0]

Instruction [25–21]

Instruction [31–26]

Instruction [15–11]

Instruction [20–16]

Instruction [15–0]

RegDstBranchMemReadMemtoRegALUOpMemWriteALUSrcRegWrite

Control

Single Cycle MIPS Datapath Control Unit AddedSingle Cycle MIPS Datapath Control Unit Added

In this book version, ORI is not supported—no zero extend of immediate needed.

4th Edition Figure 4.21, page 3263rd Edition Figure 5.17, page 307

Function Field

rs

rt

PC +4

rd

R[rs]

R[rt]

Branch Target

PC +4

32

3232

32

32

32PC +4

ALUOp (2-bits)00 = add01 = subtract10 = R-Type

imm16

Opcode

Page 50: Requirements ISA CPU Organization (Design) Datapath Designmeseec.ce.rit.edu/eecc550-winter2011/550-12-13-2011.pdf · 2011. 12. 13. · EECC550 - Shaaban #2 Lec # 4 Winter 2011 12-13-2011

EECC550 EECC550 -- ShaabanShaaban#50 Lec # 4 Winter 2011 12-13-2011

Instruction Word ← Mem[PC] Fetch the instruction

PC ← PC + 4 Increment PC

PC ← PC(31-28),jump_target,00 Update PC with jump address

Adding Support For Jump::MicroMicro--Operation Sequence For Jump: JOperation Sequence For Jump: J

OP Jump_target

6 bits 26 bits

j jump_target

jump target4 bits 26 bits 2 bits

0 0PC(31-28)

JumpAddress

Jump addressin words

[31:26] [25:0]

4 highest bits from PC + 4

2

CommonSteps

Page 51: Requirements ISA CPU Organization (Design) Datapath Designmeseec.ce.rit.edu/eecc550-winter2011/550-12-13-2011.pdf · 2011. 12. 13. · EECC550 - Shaaban #2 Lec # 4 Winter 2011 12-13-2011

EECC550 EECC550 -- ShaabanShaaban#51 Lec # 4 Winter 2011 12-13-2011

Datapath For JumpDatapath For Jump

32PC

Clk

00Mux

PCSrc

imm16

Adder

Adder

4

PC E

xt

Next Instruction Address

Mux

JUMP

Shift left 2jump_target

Instruction(15-0)

Instruction(25-0)

32

26

PC+4(31-28)

28 32

4

32 0

1

PC+4

BranchZero

BranchTarget

JumpAddress

PC

PC(31-28),jump_target,00

e.g Sign Extend + Shift Left 2

jump target4 bits 26 bits 2 bits

0 0PC(31-28)

Jump Address

Page 52: Requirements ISA CPU Organization (Design) Datapath Designmeseec.ce.rit.edu/eecc550-winter2011/550-12-13-2011.pdf · 2011. 12. 13. · EECC550 - Shaaban #2 Lec # 4 Winter 2011 12-13-2011

EECC550 EECC550 -- ShaabanShaaban#52 Lec # 4 Winter 2011 12-13-2011

Readregister 1

Readregister 2

Writeregister

Writedata

Writedata

Registers

Add

Readdata 1

Readdata 2

Signextend

16 32

Instruction[31–0]

Add

ALUresult

Mux

Mux

Mux

Address

Datamemory

Readdata

Shiftleft 2

Shiftleft 2

4

Readaddress

Instructionmemory

PC

1

0

0

1

1

0

Mux

0

1

Mux

0

1

ALUcontrol

Instruction [5–0]

Instruction [25–21]

Instruction [31–26]

Instruction [15–11]

Instruction [20–16]

Instruction [15–0]

RegDstJumpBranchMemReadMemtoRegALUOpMemWriteALUSrcRegWrite

Control

Instruction [25–0] Jump address [31–0]

26 28

PC + 4 [31–28]

ALUZero

ALUresult

4

Single Cycle MIPS Datapath Extended To Handle Jump with Single Cycle MIPS Datapath Extended To Handle Jump with Control Unit AddedControl Unit Added

In this book version, ORI is not supported—no zero extend of immediate needed.

4th Edition Figure 4.24 page 3293rd Edition Figure 5.24 page 314

Function Field

rs

rt

PC +4

rd

R[rs]

R[rt]

Branch Target

PC +4

32

32

32

32

32

32 PC +4

ALUOp (2-bits)00 = add01 = subtract10 = R-Type

imm16

Opcode

R[rt]

Page 53: Requirements ISA CPU Organization (Design) Datapath Designmeseec.ce.rit.edu/eecc550-winter2011/550-12-13-2011.pdf · 2011. 12. 13. · EECC550 - Shaaban #2 Lec # 4 Winter 2011 12-13-2011

EECC550 EECC550 -- ShaabanShaaban#53 Lec # 4 Winter 2011 12-13-2011

Control Line Settings(with jump instruction, j added)

Instruction

R-Format

lw

sw

beq

j

RegDst

1

0

X

X

X

ALUSrc

0

1

1

0

X

Memto-Reg

0

1

X

X

X

RegWrite

1

1

0

0

0

MemRead

0

1

0

0

0

MemWrite

0

0

1

0

0

Branch

0

0

0

1

X

ALUOp1

1

0

0

0

X

ALUOp0

0

0

0

1

X

Figure 4.18 page 323 (3rd Edition Figure 5.18 page 308) modified to include j

Jump

0

0

0

0

1

Page 54: Requirements ISA CPU Organization (Design) Datapath Designmeseec.ce.rit.edu/eecc550-winter2011/550-12-13-2011.pdf · 2011. 12. 13. · EECC550 - Shaaban #2 Lec # 4 Winter 2011 12-13-2011

EECC550 EECC550 -- ShaabanShaaban#54 Lec # 4 Winter 2011 12-13-2011

Clk

PC

Rs, Rt, Rd,Op, Func

Clk-to-Q

ALUctr

Instruction Memoey Access Time

Old Value New Value

RegWr Old Value New Value

Delay through Control Logic

busARegister File Access Time

Old Value New Value

busBALU Delay

Old Value New Value

Old Value New Value

New ValueOld Value

ExtOp Old Value New Value

ALUSrc Old Value New Value

MemtoReg Old Value New Value

Address Old Value New Value

busW Old Value New

Delay through Extender & Mux

RegisterWrite Occurs

Data Memory Access Time

Worst Case Timing (Load)Worst Case Timing (Load)

Page 55: Requirements ISA CPU Organization (Design) Datapath Designmeseec.ce.rit.edu/eecc550-winter2011/550-12-13-2011.pdf · 2011. 12. 13. · EECC550 - Shaaban #2 Lec # 4 Winter 2011 12-13-2011

EECC550 EECC550 -- ShaabanShaaban#55 Lec # 4 Winter 2011 12-13-2011

Instruction Timing ComparisonInstruction Timing Comparison

PC Inst Memory mux ALU Data Mem mux

PC Reg FileInst Memory mux ALU mux

PC Inst Memory mux ALU Data Mem

PC Inst Memory cmp mux

Reg File

Reg File

Reg File

Arithmetic & Logical

Load

Store

Branch

Critical Path

setup

setup

PC Inst Memory mux

Jump

Page 56: Requirements ISA CPU Organization (Design) Datapath Designmeseec.ce.rit.edu/eecc550-winter2011/550-12-13-2011.pdf · 2011. 12. 13. · EECC550 - Shaaban #2 Lec # 4 Winter 2011 12-13-2011

EECC550 EECC550 -- ShaabanShaaban#56 Lec # 4 Winter 2011 12-13-2011

Simplified Single Cycle Datapath Timing• Assuming the following datapath/control hardware components delays:

– Memory Units: 2 ns– ALU and adders: 2 ns– Register File: 1 ns – Control Unit < 1 ns

• Ignoring Mux and clk-to-Q delays, critical path analysis:

InstructionMemory

Register Read

MainALU

DataMemory

Register Write

PC + 4ALU

Branch TargetALU

ControlUnit

Time

0 2ns 3ns 4ns 5ns 7ns 8ns

Critical Path(Load)

Obtained from low-level target VLSI implementation technology of components

ns = nanosecond = 10-9 second

}

2 ns 2 ns 2 ns

1 ns

1 ns

2 ns

CriticalPath = 8 ns (LW)

Page 57: Requirements ISA CPU Organization (Design) Datapath Designmeseec.ce.rit.edu/eecc550-winter2011/550-12-13-2011.pdf · 2011. 12. 13. · EECC550 - Shaaban #2 Lec # 4 Winter 2011 12-13-2011

EECC550 EECC550 -- ShaabanShaaban#57 Lec # 4 Winter 2011 12-13-2011

Performance of SinglePerformance of Single--Cycle (CPI=1) CPUCycle (CPI=1) CPU• Assuming the following datapath hardware components delays:

– Memory Units: 2 ns– ALU and adders: 2 ns– Register File: 1 ns

• The delays needed for each instruction type can be found :

• The clock cycle is determined by the instruction with longest delay: The load in this case which is 8 ns. Clock rate = 1 / 8 ns = 125 MHz

• A program with I = 1,000,000 instructions executed takes:Execution Time = T = I x CPI x C = 106 x 1 x 8x10-9 = 0.008 s = 8 msec

Instruction Instruction Register ALU Data Register TotalClass Memory Read Operation Memory Write Delay

ALU 2 ns 1 ns 2 ns 1 ns 6 ns

Load 2 ns 1 ns 2 ns 2 ns 1 ns 8 ns

Store 2 ns 1 ns 2 ns 2 ns 7 ns

Branch 2 ns 1 ns 2 ns 5 ns

Jump 2 ns 2 ns

Load has longest delay of 8 nsthus determining the clock cycle of the CPU to be 8ns

Nanosecond, ns = 10-9 second

C = 8 ns

T = I x CPI x C

Page 58: Requirements ISA CPU Organization (Design) Datapath Designmeseec.ce.rit.edu/eecc550-winter2011/550-12-13-2011.pdf · 2011. 12. 13. · EECC550 - Shaaban #2 Lec # 4 Winter 2011 12-13-2011

EECC550 EECC550 -- ShaabanShaaban#58 Lec # 4 Winter 2011 12-13-2011

• The MIPS jump and link instruction, jal is used to support procedure calls by jumping to jump address (similar to j ) and saving the address of the following instruction PC+4 in register $ra ($31)

jal Address • jal uses the j instruction format:

• We wish to add jal to the single cycle datapath in Figure 4.24 page 329 (3rd Edition Figure 5.24 page 314) . Add any necessary datapaths and control signals to the single-clock datapath and justify the need for the modifications, if any.

• Specify control line values for this instruction.

Adding Support for jal to Single Cycle Datapath

op (6 bits) Target address (26 bits)

R[31] ← PC + 4

PC ← Jump Address

i.e. Return Address

Page 59: Requirements ISA CPU Organization (Design) Datapath Designmeseec.ce.rit.edu/eecc550-winter2011/550-12-13-2011.pdf · 2011. 12. 13. · EECC550 - Shaaban #2 Lec # 4 Winter 2011 12-13-2011

EECC550 EECC550 -- ShaabanShaaban#59 Lec # 4 Winter 2011 12-13-2011

31 22

Instruction Word ← Mem[PC]R[31] ← PC + 4PC ← Jump Address

1. Expand the multiplexor controlled by RegDst to include the value 31 as a new input 2. 2. Expand the multiplexor controlled by MemtoReg to have PC+4 as new input 2.

jump and link, jal support to Single Cycle Datapath

PC + 4

Jump Address

PC + 4

rs

rt

rd

imm16

R[rs]

R[rt]

Branch Target

PC + 4

Page 60: Requirements ISA CPU Organization (Design) Datapath Designmeseec.ce.rit.edu/eecc550-winter2011/550-12-13-2011.pdf · 2011. 12. 13. · EECC550 - Shaaban #2 Lec # 4 Winter 2011 12-13-2011

EECC550 EECC550 -- ShaabanShaaban#60 Lec # 4 Winter 2011 12-13-2011

Memto- Reg Mem MemRegDst ALUSrc Reg Write Read Write Branch ALUOp1 ALUOp0 Jump

R-format 01 0 00 1 0 0 0 1 0 0

lw 00 1 01 1 1 0 0 0 0 0

sw xx 1 xx 0 0 1 0 0 0 0

beq xx 0 xx 0 0 0 1 0 1 0

J xx x xx 0 0 0 x x x 1

JAL 10 x 10 1 0 0 x x x 1

Adding Control Lines Settings for jal(For Textbook Single Cycle Datapath including Jump)

jump and link, jal support to Single Cycle Datapath

PC ← Jump AddressPC+ 4R[31]

MemtoRegIs now 2 bits

RegDstIs now 2 bits

Instruction Word ← Mem[PC]R[31] ← PC + 4PC ← Jump Address

Page 61: Requirements ISA CPU Organization (Design) Datapath Designmeseec.ce.rit.edu/eecc550-winter2011/550-12-13-2011.pdf · 2011. 12. 13. · EECC550 - Shaaban #2 Lec # 4 Winter 2011 12-13-2011

EECC550 EECC550 -- ShaabanShaaban#61 Lec # 4 Winter 2011 12-13-2011

• We wish to add a variant of lw (load word) let’s call it LWR to the single cycle datapath in Figure 4.24 page 329 (3rd Edition Figure 5.24 page 314).

LWR $rd, $rs, $rt

• The LWR instruction is similar to lw but it sums two registers (specified by $rs, $rt) to obtain the effective load address and uses the R-Type format

• Add any necessary datapaths and control signals to the single cycle datapath and justify the need for the modifications, if any.

• Specify control line values for this instruction.

Adding Support for LWR to Single Cycle Datapath

Loaded word from memory written to register rd

Load Word Register

Page 62: Requirements ISA CPU Organization (Design) Datapath Designmeseec.ce.rit.edu/eecc550-winter2011/550-12-13-2011.pdf · 2011. 12. 13. · EECC550 - Shaaban #2 Lec # 4 Winter 2011 12-13-2011

EECC550 EECC550 -- ShaabanShaaban#62 Lec # 4 Winter 2011 12-13-2011

Memto- Reg Mem MemRegDst ALUSrc Reg Write Read Write Branch ALUOp1 ALUOp0 Jump

R-format 1 0 0 1 0 0 0 1 0 0

lw 0 1 1 1 1 0 0 0 0 0

sw x 1 x 0 0 1 0 0 0 0

beq x 0 x 0 0 0 1 0 1 0

J x x x 0 0 0 x x x 1

LWR 1 0 1 1 1 0 0 0 0 0

Adding Control Lines Settings for LWR(For Textbook Single Cycle Datapath including Jump)

LWR (R-format LW) support to Single Cycle Datapath

rd

Instruction Word ← Mem[PC]PC ← PC + 4R[rd] ← Mem[ R[rs] + R[rt] ]

No new components or connections are needed for the datapathjust the proper control line settings

R[rt]Add

Page 63: Requirements ISA CPU Organization (Design) Datapath Designmeseec.ce.rit.edu/eecc550-winter2011/550-12-13-2011.pdf · 2011. 12. 13. · EECC550 - Shaaban #2 Lec # 4 Winter 2011 12-13-2011

EECC550 EECC550 -- ShaabanShaaban#63 Lec # 4 Winter 2011 12-13-2011

• We wish to add a new instruction jm (jump memory) to the single cycle datapath in Figure 4.24 page 329 (3rd Edition Figure 5.24 page 314).

jm offset($rs)

• The jm instruction loads a word from effective address (R[rs] + offset), this is similar to lw except the loaded word is put in the PC instead of register $rt.

• Jm used the I-format with field rt not used.

• Add any necessary datapaths and control signals to the single cycle datapath and justify the need for the modifications, if any.

• Specify control line values for this instruction.

Adding Support for jm to Single Cycle Datapath

OP rs rt address (imm16)

6 bits 5 bits 5 bits 16 bitsNot Used

Jump Memory

Page 64: Requirements ISA CPU Organization (Design) Datapath Designmeseec.ce.rit.edu/eecc550-winter2011/550-12-13-2011.pdf · 2011. 12. 13. · EECC550 - Shaaban #2 Lec # 4 Winter 2011 12-13-2011

EECC550 EECC550 -- ShaabanShaaban#64 Lec # 4 Winter 2011 12-13-2011

Instruction Word ← Mem[PC]PC ← Mem[R[rs] + SignExt[imm16]]

1. Expand the multiplexor controlled by Jump to include the Read Data (data memory output) as new input 2. The Jump control signal is now 2 bits

Adding jump memory, jm support to Single Cycle Datapath

2Jump

2

2Jump

rs

rt

rd

imm16

PC + 4

Branch Target

R[rs]

R[rt]

Page 65: Requirements ISA CPU Organization (Design) Datapath Designmeseec.ce.rit.edu/eecc550-winter2011/550-12-13-2011.pdf · 2011. 12. 13. · EECC550 - Shaaban #2 Lec # 4 Winter 2011 12-13-2011

EECC550 EECC550 -- ShaabanShaaban#65 Lec # 4 Winter 2011 12-13-2011

Memto- Reg Mem MemRegDst ALUSrc Reg Write Read Write Branch ALUOp1 ALUOp0 Jump

R-format 1 0 0 1 0 0 0 1 0 00

lw 0 1 1 1 1 0 0 0 0 00

sw x 1 x 0 0 1 0 0 0 00

beq x 0 x 0 0 0 1 0 1 00

J x x x 0 0 0 x x x 01

Jm x 1 x 0 1 0 x 0 0 10

Adding Control Lines Settings for jm(For Textbook Single Cycle Datapath including Jump)

Adding jm support to Single Cycle Datapath

PC ← Mem[R[rs] + SignExt[imm16]]

Jumpis now 2 bits

add

Page 66: Requirements ISA CPU Organization (Design) Datapath Designmeseec.ce.rit.edu/eecc550-winter2011/550-12-13-2011.pdf · 2011. 12. 13. · EECC550 - Shaaban #2 Lec # 4 Winter 2011 12-13-2011

EECC550 EECC550 -- ShaabanShaaban#66 Lec # 4 Winter 2011 12-13-2011

Drawbacks of Single Cycle ProcessorDrawbacks of Single Cycle Processor1. Long cycle time:

– All instructions must take as much time as the slowest• Here, cycle time for load is longer than needed for all other instructions.

– Cycle time must be long enough for the load instruction:PC’s Clock -to-Q + Instruction Memory Access Time +Register File Access Time + ALU Delay (address calculation) +Data Memory Access Time + Register File Setup Time + Clock Skew

– Real memory is not as well-behaved as idealized memory• Cannot always complete data access in one (short) cycle.

2. Impossible to implement complex, variable-length instructions and complex addressing modes in a single cycle.– e.g indirect memory addressing.

3. High and duplicate hardware resource requirements– Any hardware functional unit cannot be used more than once in

a single cycle (e.g. ALUs).

4. Does not allow overlap of instruction processing (instruction pipelining, chapter 6).

e.g R[$1] ← Mem[ Mem[$2] ]