
CS141-L4-1 Tarun Soni, Summer’03

Single Cycle CPU

Previously: built an ALU.
Today: actually build a CPU.

Questions on CS140? Computer Arithmetic?

• Attend office hours with TAs or me.
• Do the exercises in the text.


The Story so far:

• Instruction Set Architectures
• Performance issues
• 2s complement, Addition, Subtraction
• Multiplication, Division, Floating Point numbers

Basically ISA & ALU stuff


CPU: Building blocks

• Adder

• MUX

• ALU

[Figure: block symbols for the three building blocks. Adder: 32-bit inputs A and B plus CarryIn, outputs 32-bit Sum and Carry. MUX: 32-bit inputs A and B plus Select, output 32-bit Y. ALU: 32-bit inputs A and B plus OP, output 32-bit Result.]

CPU: Building blocks

[Figure: two 32-bit adders side by side. The low adder takes A[31..0], B[31..0] and CarryIn and produces Sum[31..0] and Carry. The high adder takes A[63..32], B[63..32] and CarryIn and produces Sum[63..32] and Carry.]

• Building a 64-bit adder from 2x32-bit adders


CPU: Building blocks

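[Figure: 64-bit adder built from three 32-bit adders. The low adder computes Sum[31..0] and Carry from A[31..0], B[31..0] and CarryIn. Two high adders both add A[63..32] and B[63..32], one with Cin=0 and one with Cin=1. A MUX, driven by Select, picks which high result becomes Sum[63..32], and a second 1-bit MUX picks the matching Cout.]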

• Silicon is cheap – sort-of
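The trade of extra adders for speed shown on this slide can be sketched in a few lines of Python. This is only an illustrative model, not part of the lecture: the function names `add32`, `ripple_add64` and `carry_select_add64` are made up here, and the sketch assumes the low adder's carry is what drives the MUX select.

```python
MASK32 = (1 << 32) - 1

def add32(a, b, carry_in):
    """Model of one 32-bit adder block: returns (sum, carry_out)."""
    total = (a & MASK32) + (b & MASK32) + (carry_in & 1)
    return total & MASK32, total >> 32

def ripple_add64(a, b, carry_in=0):
    """Two adders in series: the high half waits for the low carry."""
    lo, carry = add32(a, b, carry_in)
    hi, carry_out = add32(a >> 32, b >> 32, carry)
    return (hi << 32) | lo, carry_out

def carry_select_add64(a, b, carry_in=0):
    """Three adders: both possible high halves are computed up front,
    then the low carry selects one of them (the MUX in the figure)."""
    lo, carry = add32(a, b, carry_in)
    hi0, cout0 = add32(a >> 32, b >> 32, 0)  # assumes Cin = 0
    hi1, cout1 = add32(a >> 32, b >> 32, 1)  # assumes Cin = 1
    hi, carry_out = (hi1, cout1) if carry else (hi0, cout0)
    return (hi << 32) | lo, carry_out
```

Both functions return the same result; in hardware the second form uses more silicon but lets the two high adders work in parallel with the low one.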


CPU

Single Cycle CPU


CPU

The Big Picture: Where are We Now?

• The Five Classic Components of a Computer

• Datapath Design, then Control Design

[Figure: the five components: a Processor containing Control and Datapath, plus Memory, Input and Output.]

CPU: The big picture

Execute an entire instruction:

• Instruction Fetch
• Instruction Decode
• Operand Fetch
• Execute
• Result Store
• Next Instruction

° Design hardware for each of these steps!!!

CPU: Clocking

[Figure: Clk waveform with the Setup and Hold windows marked around each clock edge; signal values are "Don’t Care" outside those windows.]

• All storage elements are clocked by the same clock edge


CPU

The Big Picture: The Performance Perspective

• Execution Time = Insts * CPI * Cycle Time
• Processor design (datapath and control) will determine:
  – Clock cycle time
  – Clock cycles per instruction
• Starting today:
  – Single cycle processor:
    • Advantage: One clock cycle per instruction
    • Disadvantage: long cycle time

[Figure: one long clock cycle labelled "Execute an entire instruction".]
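As a quick worked instance of the execution-time equation, here is a minimal sketch. The instruction count and cycle time are invented numbers for illustration only; the slides give no concrete figures.

```python
def execution_time(instruction_count, cpi, cycle_time_ns):
    """Execution Time = Insts * CPI * Cycle Time (result in nanoseconds)."""
    return instruction_count * cpi * cycle_time_ns

# Hypothetical single-cycle machine: CPI is 1 by construction, but the
# cycle must be long enough for the slowest instruction to finish.
single_cycle_ns = execution_time(1_000_000, 1, 10.0)
```

With these assumed values the program takes 10 ms; the only levers left on a single-cycle design are the instruction count and the cycle time.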


CPU

• We're ready to look at an implementation of the MIPS
• Simplified to contain only:
  – memory-reference instructions: lw, sw
  – arithmetic-logical instructions: add, sub, and, or, slt
  – control flow instructions: beq
• Generic Implementation:
  – use the program counter (PC) to supply instruction address
  – get the instruction from memory
  – read registers
  – use the instruction to decide exactly what to do
• All instructions use the ALU after reading the registers: memory-reference? arithmetic? control flow?

[Figure: diagram relating Inst. Count, CPI and Cycle Time.]

CPU

Review: The MIPS Instruction Formats

J-type: op (6 bits, 31..26) | target address (26 bits, 25..0)

R-type: op (6 bits, 31..26) | rs (5 bits, 25..21) | rt (5 bits, 20..16) | rd (5 bits, 15..11) | shamt (5 bits, 10..6) | funct (6 bits, 5..0)

I-type: op (6 bits, 31..26) | rs (5 bits, 25..21) | rt (5 bits, 20..16) | immediate (16 bits, 15..0)

° The different fields are:
• op: operation of the instruction
• rs, rt, rd: the source and destination register specifiers
• shamt: shift amount
• funct: selects the variant of the operation in the “op” field
• address / immediate: address offset or immediate value
• target address: target address of the jump instruction
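The field boundaries above translate directly into shifts and masks. The following is a small sketch; the function name `decode` and the dictionary layout are choices made here for illustration, not something the slides define.

```python
def decode(instr):
    """Split a 32-bit MIPS instruction word into every field it might
    contain. Which fields are meaningful depends on the format."""
    return {
        "op":     (instr >> 26) & 0x3F,     # bits 31..26
        "rs":     (instr >> 21) & 0x1F,     # bits 25..21
        "rt":     (instr >> 16) & 0x1F,     # bits 20..16
        "rd":     (instr >> 11) & 0x1F,     # bits 15..11 (R-type)
        "shamt":  (instr >> 6) & 0x1F,      # bits 10..6  (R-type)
        "funct":  instr & 0x3F,             # bits 5..0   (R-type)
        "imm16":  instr & 0xFFFF,           # bits 15..0  (I-type)
        "target": instr & 0x3FFFFFF,        # bits 25..0  (J-type)
    }

# add rd=3, rs=5, rt=7: op = 000000, funct = 100000
add_word = (5 << 21) | (7 << 16) | (3 << 11) | 0b100000
fields = decode(add_word)
```

Extracting all fields unconditionally mirrors the hardware, where the wires for rs, rt, rd and imm16 are always connected and control decides which ones matter.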


CPU

• R-type
  – add rd, rs, rt
  – sub, and, or, slt
  – format: op (6 bits) | rs (5 bits) | rt (5 bits) | rd (5 bits) | shamt (5 bits) | funct (6 bits)

• LOAD and STORE
  – lw rt, rs, imm16
  – sw rt, rs, imm16
  – format: op (6 bits) | rs (5 bits) | rt (5 bits) | immediate (16 bits)

• BRANCH:
  – beq rs, rt, imm16
  – format: op (6 bits) | rs (5 bits) | rt (5 bits) | displacement (16 bits)

CPU

Requirements to implement the ISA

• Memory
  – instruction & data
• Registers (32 x 32)
  – read RS
  – read RT
  – Write RT or RD
• PC
• Extender
• Add and Sub register or extended immediate
• Add 4 or extended immediate to PC

CPU

• Combinational Elements
• Storage Elements
  – Clocking methodology

[Figure: a Combinational Logic block with inputs A, B and output C = f(A,B); a State Element with inputs A, B and clk whose output is C = f(A,B,state), where State[n] = f(A,B,state[n-1]).]

CPU: Storage unit

• The set-reset latch
  – output depends on present inputs and also on past inputs

CPU: D-flip flop

• Two inputs:
  – the data value to be stored (D)
  – the clock signal (C) indicating when to read & store D

• Two outputs:
  – the value of the internal state (Q) and its complement (_Q)

[Figure: D latch with inputs D, C and outputs Q, _Q, and a timing diagram of D, C and Q.]

• Output changes only on the clock edge

[Figure: D flip-flop built from two D latches in series, with inputs D and C and outputs Q and _Q, and a timing diagram of D, C and Q.]
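The difference between a level-sensitive latch and an edge-triggered flip-flop can be sketched with a tiny behavioral model. The class names are made up for illustration, and the sketch assumes the common master-slave arrangement in which the second latch sees the inverted clock, so the output changes on the falling edge; the slide's figure may use a different polarity.

```python
class DLatch:
    """Level-sensitive: Q follows D while C is 1, holds while C is 0."""
    def __init__(self):
        self.q = 0

    def update(self, d, c):
        if c:
            self.q = d
        return self.q

class DFlipFlop:
    """Two latches in series. Assumption: the second (slave) latch gets
    the inverted clock, so Q changes only when C falls from 1 to 0."""
    def __init__(self):
        self.master = DLatch()
        self.slave = DLatch()

    def update(self, d, c):
        m = self.master.update(d, c)
        return self.slave.update(m, 1 - c)

ff = DFlipFlop()
# (D, C) pairs: D is captured while C is 1 and appears on Q when C falls.
outputs = [ff.update(d, c) for d, c in [(1, 1), (1, 0), (0, 0), (0, 1), (0, 0)]]
```

Note that changing D while C is 0 (third step) has no effect on Q, which is the "output changes only on the clock edge" property.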


CPU: Clocking Methodology

• An edge triggered methodology
• Typical execution:
  – read contents of some state elements,
  – send values through some combinational logic
  – write results to one or more state elements

[Figure: State element 1 feeds Combinational logic, which feeds State element 2, all within one clock cycle.]

CPU: Storage block

• Register
  – Similar to the D Flip Flop except
    • N-bit input and output
    • Write Enable input
  – Write Enable:
    • 0: Data Out will not change
    • 1: Data Out will become Data In (on the clock edge)

[Figure: register block with N-bit Data In, N-bit Data Out, Write Enable and Clk.]

CPU: Register Files

• Register File consists of 32 registers:
  – Two 32-bit output buses: busA and busB
  – One 32-bit input bus: busW
• Register is selected by:
  – RA selects the register to put on busA
  – RB selects the register to put on busB
  – RW selects the register to be written via busW when Write Enable is 1
• Clock input (CLK)
  – The CLK input is a factor only during a write (Write Enable = 1)
  – Otherwise, this unit acts just like combinational logic.

[Figure: block of 32 32-bit registers with 5-bit selects RW, RA, RB, 32-bit input busW, 32-bit outputs busA and busB, Write Enable and Clk.]
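A behavioral sketch of this register file follows. The class and method names are illustrative assumptions; the split between `read` (combinational) and `clock_edge` (clocked write) is meant to mirror the two behaviors described above.

```python
class RegisterFile:
    """32 registers of 32 bits, two read ports and one write port."""
    def __init__(self):
        self.regs = [0] * 32

    def read(self, ra, rb):
        """Combinational read: busA and busB follow RA and RB at once."""
        return self.regs[ra], self.regs[rb]

    def clock_edge(self, rw, bus_w, write_enable):
        """Clocked write: the register selected by RW takes busW only
        when Write Enable is 1."""
        if write_enable:
            self.regs[rw] = bus_w & 0xFFFFFFFF

rf = RegisterFile()
rf.clock_edge(rw=3, bus_w=99, write_enable=0)   # ignored
before = rf.read(3, 0)
rf.clock_edge(rw=3, bus_w=99, write_enable=1)   # stored
after = rf.read(3, 0)
```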


CPU: Register Files

[Figure: read ports. Read register number 1 and Read register number 2 each control a Mux that selects among Register 0, Register 1, …, Register n – 1, Register n to produce Read data 1 and Read data 2.]

[Figure: write port. The Register number feeds an n-to-1 decoder; each decoder output, combined with Write, drives the C input of one register (Register 0 … Register n), and Register data feeds every D input.]

• Built using D flip-flops
• Still use the real clock (not shown here) to do the actual write

CPU: Memory

• Memory (idealized)
  – One input bus: Data In
  – One output bus: Data Out
• Memory word is selected by:
  – Address selects the word to put on Data Out
  – Write Enable = 1: address selects the memory word to be written via the Data In bus
• Clock input (CLK)
  – The CLK input is a factor ONLY during write operation
  – During read operation, behaves as a combinational logic block:
    • Address valid => Data Out valid after “access time.”

[Figure: memory block with 32-bit Data In, 32-bit Data Out, Address, Write Enable and Clk.]

CPU: RTL

• Register Transfer Language (RTL) is a mechanism for describing the movement and manipulation of data between storage elements:

    R[3] <- R[5] + R[7]
    PC <- PC + 4 + R[5]
    R[rd] <- R[rs] + R[rt]
    R[rt] <- Mem[R[rs] + immed]
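Each RTL statement reads almost directly as an assignment in a programming language. The sketch below models the four examples with a list for the registers and a dictionary for memory; the initial values, register numbers and the address 100 are arbitrary choices made for illustration.

```python
R = [0] * 32          # register file
Mem = {100: 42}       # sparse memory: address -> word
PC = 0

R[5], R[7] = 10, 20

R[3] = R[5] + R[7]             # R[3] <- R[5] + R[7]
PC = PC + 4 + R[5]             # PC <- PC + 4 + R[5]

rd, rs, rt = 4, 5, 7
R[rd] = R[rs] + R[rt]          # R[rd] <- R[rs] + R[rt]

rs, rt, immed = 5, 8, 90
R[rt] = Mem[R[rs] + immed]     # R[rt] <- Mem[R[rs] + immed]
```

The right-hand side of each line is what the combinational logic computes; the assignment itself corresponds to the write at the clock edge.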


CPU: More building blocks

[Figure: datapath building blocks.
 • a. Instruction memory (Instruction address in, Instruction out); b. Program counter (PC); c. Adder (Add, Sum)
 • a. Registers (5-bit Read register 1, Read register 2 and Write register numbers; Write data in; Read data 1 and Read data 2 out; RegWrite); b. ALU (3-bit ALU control; outputs Zero and ALU result)
 • a. Data memory unit (Address, Write data, Read data, MemRead, MemWrite); b. Sign-extension unit (16 bits in, 32 bits out)]

CPU: The big picture

Execute an entire instruction:

• Instruction Fetch
• Instruction Decode
• Operand Fetch
• Execute
• Result Store
• Next Instruction

° Design hardware for each of these steps!!!

CPU: Instruction Fetch

• RTL version of the instruction fetch step:
  – Fetch the Instruction: mem[PC]
  – Update the program counter:
    • Sequential Code: PC <- PC + 4
    • Branch and Jump: PC <- “something else”

[Figure: the PC (clocked by Clk) supplies the Address to the Instruction Memory, which outputs the 32-bit Instruction Word; Next Address Logic computes the new PC.]

CPU: Binary arithmetic for PC

• In theory, the PC is a 32-bit byte address into the instruction memory:
  – Sequential operation: PC<31:0> = PC<31:0> + 4
  – Branch operation: PC<31:0> = PC<31:0> + 4 + SignExt[Imm16] * 4

• The magic number “4” always comes up because:
  – The 32-bit PC is a byte address
  – And all our instructions are 4 bytes (32 bits) long

• In other words:
  – The 2 LSBs of the 32-bit PC are always zeros
  – There is no reason to have hardware to keep the 2 LSBs

• In practice, we can simplify the hardware by using a 30-bit PC<31:2>:
  – Sequential operation: PC<31:2> = PC<31:2> + 1
  – Branch operation: PC<31:2> = PC<31:2> + 1 + SignExt[Imm16]
  – In either case: Instruction Memory Address = PC<31:2> concat “00”
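The equivalence between the 32-bit and 30-bit forms can be checked numerically. The sketch below is illustrative: the function names and the sample address 0x00400010 are assumptions made here, not values from the slides.

```python
def sign_ext16(imm16):
    """Interpret a 16-bit field as a signed integer."""
    imm16 &= 0xFFFF
    return imm16 - 0x10000 if imm16 & 0x8000 else imm16

def next_pc32(pc, imm16, take_branch):
    """Full byte-address form: PC + 4 (+ SignExt[Imm16] * 4)."""
    nxt = pc + 4
    if take_branch:
        nxt += sign_ext16(imm16) * 4
    return nxt & 0xFFFFFFFF

def next_pc30(pc30, imm16, take_branch):
    """Word-address form on PC<31:2>: PC + 1 (+ SignExt[Imm16])."""
    nxt = pc30 + 1
    if take_branch:
        nxt += sign_ext16(imm16)
    return nxt & 0x3FFFFFFF

pc = 0x00400010                      # hypothetical word-aligned address
back4 = 0xFFFC                       # imm16 encoding of -4
byte_form = next_pc32(pc, back4, True)
word_form = next_pc30(pc >> 2, back4, True) << 2   # concat "00"
```

Both forms land on the same instruction address, which is why the two low bits never need to be stored or added.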


CPU: Instruction Fetch unit

• The common RTL operations
  – Fetch the Instruction: inst <- mem[PC]
  – Update the program counter:
    • Sequential Code: PC <- PC + 4
    • Branch and Jump: PC <- “something else”

CPU: Register-Register Operations (Add, Subtract etc.)

• R[rd] <- R[rs] op R[rt] Example: addU rd, rs, rt
  – Ra, Rb, and Rw come from instruction’s rs, rt, and rd fields
  – ALUctr and RegWr: control logic after decoding the instruction

[Figure: the register file (Rw = Rd, Ra = Rs, Rb = Rt, with RegWr and Clk) drives 32-bit busA and busB into the ALU (controlled by ALUctr); the 32-bit Result returns to the register file on busW. R-type format: op (6 bits) | rs (5 bits) | rt (5 bits) | rd (5 bits) | shamt (5 bits) | funct (6 bits).]

° Worry about instruction decode to generate ALUctr and RegWr later.

CPU: Register - Register Timing

[Figure: register-register datapath with its timing diagram. Each signal goes from Old Value to New Value in turn: PC after Clk-to-Q; Rs, Rt, Rd, Op, Func after the Instruction Memory Access Time; ALUctr and RegWr after the Delay through Control Logic; busA, busB after the Register File Access Time; busW after the ALU Delay. The point where the register write occurs is marked on the clock.]

CPU: Logical Immediate Op.

• R[rt] <- R[rs] op ZeroExt[imm16]

[Figure: the register-register datapath plus two muxes. RegDst selects Rt or Rd as Rw (handle Rt as destination). ALUSrc selects busB or the zero-extended imm16 as the second ALU input (handle immediate as operand). I-type format: op (6 bits) | rs (5 bits) | rt (5 bits) | immediate (16 bits), with no rd field. ZeroExt places sixteen 0 bits in positions 31..16 above the 16-bit immediate.]
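The extender is small enough to model in one function. The sketch below also includes the sign-extending mode that the later load and store slides select with ExtOp; the function name is an illustrative choice, and the encoding (0 for zero extension, 1 for sign extension) follows the ExtOp values shown on the later datapath slides.

```python
def extend(imm16, ext_op):
    """16-bit to 32-bit extender.
    ext_op = 0: zero-extend (upper 16 bits are all 0).
    ext_op = 1: sign-extend (upper 16 bits copy bit 15)."""
    imm16 &= 0xFFFF
    if ext_op == 1 and (imm16 & 0x8000):
        return imm16 | 0xFFFF0000
    return imm16

zero_extended = extend(0x8001, 0)
sign_extended = extend(0x8001, 1)
```

The two modes differ only when bit 15 of the immediate is set, which is why a logical operation such as ori and an address calculation such as lw need different settings.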


CPU: Load Operations

• R[rt] <- Mem[R[rs] + SignExt[imm16]] Example: lw rt, rs, imm16

[Figure: datapath for load. The Extender (controlled by ExtOp) feeds imm16 to the ALU through the ALUSrc mux; the ALU output is the Adr input of the Data Memory (Data In, WrEn = MemWr, Clk); a mux controlled by W_Src chooses between the ALU result and the Data Memory output for busW. RegDst selects Rt as the destination. I-type format: op (6 bits) | rs (5 bits) | rt (5 bits) | immediate (16 bits).]

• Need Data Memory!
• Reg-Write could be from result or data memory

CPU: Store Operations

• Mem[ R[rs] + SignExt[imm16] ] <- R[rt] Example: sw rt, rs, imm16

[Figure: datapath for store. Same as the load datapath, with busB (the Rt register) also wired to Data In of the Data Memory and MemWr enabling the write. I-type format: op (6 bits) | rs (5 bits) | rt (5 bits) | immediate (16 bits).]

• Reg can write to Data Memory

CPU: Branching

• beq rs, rt, imm16
  – mem[PC]: Fetch the instruction from memory
  – Equal <- R[rs] == R[rt]: Calculate the branch condition
  – if (Equal == 1): Calculate the next instruction’s address
    • PC <- PC + 4 + ( SignExt(imm16) x 4 )
  – else
    • PC <- PC + 4

I-type format: op (6 bits) | rs (5 bits) | rt (5 bits) | immediate (16 bits)

CPU: Datapath for Branching

• beq rs, rt, imm16 Datapath generates condition (equal)

I-type format: op (6 bits) | rs (5 bits) | rt (5 bits) | immediate (16 bits)

[Figure: branch datapath. The register file reads Rs and Rt onto busA and busB, and an "Equal?" comparator produces Cond. In the fetch unit one Adder computes PC + 4, a second Adder adds the PC Ext output (extended imm16), and a Mux controlled by nPC_sel picks the next PC. The PC, with "00" appended, is the Inst Address.]

• Calculate (PC+4) as well as (imm16+PC+4) and choose one
• Calculate the “condition” part of the branch op.

CPU: The Aggregate Datapath

[Figure: the aggregate single-cycle datapath. The fetch unit (PC, two Adders, PC Ext, nPC_sel Mux, Inst Memory) produces Instruction<31:0>, which is split into Rs <21:25>, Rt <16:20>, Rd <11:15> and Imm16 <0:15>. The execution side contains the RegDst mux, the register file (Rw, Ra, Rb, RegWr, busA, busB, busW), the Extender (ExtOp), the ALUSrc mux, the ALU (ALUctr) with an "=" comparator producing Equal, the Data Memory (Data In, Adr, WrEn = MemWr) and the MemtoReg mux.]

Still need to worry about Instruction Decode

CPU: Datapath: High-level view

• Register file and ideal memory:
  – The CLK input is a factor ONLY during write operation
  – During read operation, behave as combinational logic:
    • Address valid => Output valid after “access time.”

Critical Path (Load Operation) =
    PC’s Clk-to-Q
  + Instruction Memory’s Access Time
  + Register File’s Access Time
  + ALU to Perform a 32-bit Add
  + Data Memory Access Time
  + Setup Time for Register File Write
  + Clock Skew
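To make the critical-path sum concrete, here is a small sketch that adds up the components for a load. Every delay value below is invented purely for illustration (in picoseconds); the slides give no numbers, and real values depend entirely on the technology.

```python
# Hypothetical component delays in picoseconds (assumed, not from the slides).
LOAD_PATH_PS = {
    "PC clk-to-Q": 50,
    "instruction memory access": 2000,
    "register file access": 1000,
    "ALU 32-bit add": 2000,
    "data memory access": 2000,
    "register file setup": 100,
    "clock skew": 50,
}

def min_cycle_time(path_delays):
    """The clock period must cover every delay on the longest path."""
    return sum(path_delays.values())

cycle_ps = min_cycle_time(LOAD_PATH_PS)
```

Because the single-cycle design uses one clock period for all instructions, this load path sets the cycle time even for instructions that never touch data memory.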

[Figure: high-level datapath. The PC and Next Address logic feed the Instruction Address of the Ideal Instruction Memory; the Instruction supplies Rd, Rs, Rt (5 bits each) and Imm (16 bits) to the register file (Rw, Ra, Rb); 32-bit buses A and B feed the ALU; the ALU drives the Data Address of the Ideal Data Memory, which also has Data In and Clk.]

CPU: Control Signals

[Figure: Inst Memory (Adr) outputs Instruction<31:0>. The Control block takes Op and Fun and drives nPC_sel, RegWr, RegDst, ExtOp, ALUSrc, ALUctr, MemWr and MemtoReg into the DATA PATH, which returns Equal. Rs <21:25>, Rt <16:20>, Rd <11:15> and Imm16 <0:15> go directly to the datapath.]

CPU: Control Signals: Meaning

• Rs, Rt, Rd and Imm16 hardwired into datapath
• nPC_sel: 0 => PC <– PC + 4; 1 => PC <– PC + 4 + SignExt(Im16) || 00

[Figure: instruction fetch unit: PC (Clk), Inst Memory (Adr), two Adders, PC Ext of imm16, the constant 4, the "00" low bits and the Mux controlled by nPC_sel.]

CPU: Control Signals: Meaning

• ExtOp: “zero”, “sign”
• ALUsrc: 0 => regB; 1 => immed
• ALUctr: “add”, “sub”, “or”

° MemWr: write memory
° MemtoReg: 1 => Mem
° RegDst: 0 => “rt”; 1 => “rd”
° RegWr: write dest register

[Figure: execution datapath with the control points labelled: RegDst mux (0 = Rt, 1 = Rd), register file (RegWr), Extender (ExtOp), ALUSrc mux, ALU (ALUctr, Equal), Data Memory (MemWr) and MemtoReg mux.]

CPU: Control Signals for various operations

inst    Register Transfer

ADD     R[rd] <– R[rs] + R[rt]; PC <– PC + 4
        ALUsrc = RegB, ALUctr = “add”, RegDst = rd, RegWr, nPC_sel = “+4”

SUB     R[rd] <– R[rs] – R[rt]; PC <– PC + 4
        ALUsrc = RegB, ALUctr = “sub”, RegDst = rd, RegWr, nPC_sel = “+4”

ORi     R[rt] <– R[rs] or zero_ext(Imm16); PC <– PC + 4
        ALUsrc = Im, Extop = “Z”, ALUctr = “or”, RegDst = rt, RegWr, nPC_sel = “+4”

LOAD    R[rt] <– MEM[ R[rs] + sign_ext(Imm16) ]; PC <– PC + 4
        ALUsrc = Im, Extop = “Sn”, ALUctr = “add”, MemtoReg, RegDst = rt, RegWr, nPC_sel = “+4”

STORE   MEM[ R[rs] + sign_ext(Imm16) ] <– R[rt]; PC <– PC + 4
        ALUsrc = Im, Extop = “Sn”, ALUctr = “add”, MemWr, nPC_sel = “+4”

BEQ     if ( R[rs] == R[rt] ) then PC <– PC + 4 + ( sign_ext(Imm16) || 00 ) else PC <– PC + 4
        nPC_sel = EQUAL, ALUctr = “sub”

CPU: Control Signals: Logic Design

• nPC_sel <= if (OP == BEQ) then EQUAL else 0
• ALUsrc <= if (OP == “000000”) then “regB” else “immed”
• ALUctr <= if (OP == “000000”) then funct
            elseif (OP == ORi) then “OR”
            elseif (OP == BEQ) then “sub”
            else “add”
• ExtOp <= _____________
• MemWr <= _____________
• MemtoReg <= _____________
• RegWr: <= _____________
• RegDst: <= _____________

CPU: Control Signals: Logic Design

• nPC_sel <= if (OP == BEQ) then EQUAL else 0
• ALUsrc <= if (OP == “000000”) then “regB” else “immed”
• ALUctr <= if (OP == “000000”) then funct
            elseif (OP == ORi) then “OR”
            elseif (OP == BEQ) then “sub”
            else “add”
• ExtOp <= if (OP == ORi) then “zero” else “sign”
• MemWr <= (OP == Store)
• MemtoReg <= (OP == Load)
• RegWr: <= if ((OP == Store) || (OP == BEQ)) then 0 else 1
• RegDst: <= if ((OP == Load) || (OP == ORi)) then 0 else 1
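These equations can be transcribed almost literally into a function. The sketch below is illustrative: the function name and dictionary keys are choices made here, and the opcode constants use the encodings listed in the later control-summary slide. It follows the slide's equations as written, so ALUsrc comes out as "immed" for beq even though the later summary table lists ALUSrc = 0 for that instruction.

```python
# Opcodes as listed in the control-summary table of this lecture.
R_TYPE, ORI, LW, SW, BEQ = 0b000000, 0b001101, 0b100011, 0b101011, 0b000100

def control(op, funct, equal):
    """Control signals for one instruction, transcribed from the slide."""
    if op == R_TYPE:
        alu_ctr = funct          # R-type: the funct field picks the operation
    elif op == ORI:
        alu_ctr = "or"
    elif op == BEQ:
        alu_ctr = "sub"
    else:
        alu_ctr = "add"
    return {
        "nPC_sel":  equal if op == BEQ else 0,
        "ALUsrc":   "regB" if op == R_TYPE else "immed",
        "ALUctr":   alu_ctr,
        "ExtOp":    "zero" if op == ORI else "sign",
        "MemWr":    int(op == SW),
        "MemtoReg": int(op == LW),
        "RegWr":    0 if op in (SW, BEQ) else 1,
        "RegDst":   0 if op in (LW, ORI) else 1,
    }

lw_signals = control(LW, funct=0, equal=0)
```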


CPU: Example: Load

[Figure: the aggregate datapath annotated for a load: the Extender set to sign ext, the ALU set to add, the destination register set to rt, and the fetch unit selecting +4.]

R[rt] <- Mem[R[rs] + SignExt[imm16]], viz., lw rt, rs, imm16

CPU: The abstract version

• Logical vs. Physical Structure

[Figure: abstract view. The Control block sends Control Signals to the Datapath and receives Conditions from it. The datapath contains the PC and Next Address logic, the Ideal Instruction Memory (Instruction Address, Instruction), the register file (Rw, Ra, Rb from Rd, Rs, Rt), the ALU (inputs A and B), and the Ideal Data Memory (Data Address, Data In, Data Out).]

CPU: The real thing


CPU: 5 steps to design

• 5 steps to design a processor
  – 1. Analyze instruction set => datapath requirements
  – 2. Select set of datapath components & establish clock methodology
  – 3. Assemble datapath meeting the requirements
  – 4. Analyze implementation of each instruction to determine setting of control points that effects the register transfer.
  – 5. Assemble the control logic

• MIPS makes it easier
  – Instructions same size
  – Source registers always in same place
  – Immediates same size, location
  – Operations always on registers/immediates

• Single cycle datapath => CPI=1, CCT => long

CPU: Control Section

• The Five Classic Components of a Computer

[Figure: the five components: a Processor containing Control and Datapath, plus Memory, Input and Output.]

CPU: Add Instruction

• add rd, rs, rt
  – mem[PC]: Fetch the instruction from memory
  – R[rd] <- R[rs] + R[rt]: The actual operation
  – PC <- PC + 4: Calculate the next instruction’s address

R-type format: op (6 bits) | rs (5 bits) | rt (5 bits) | rd (5 bits) | shamt (5 bits) | funct (6 bits)

CPU: The Add Instruction

Instruction Fetch Unit at the Beginning of Add

• Fetch the instruction from Instruction memory: Instruction <- mem[PC]
  – This is the same for all instructions

[Figure: instruction fetch unit: PC (Clk), Inst Memory (Adr) producing Instruction<31:0>, two Adders, PC Ext of imm16, the constant 4, the "00" low bits and the Mux controlled by nPC_sel.]

CPU: The Add Instruction

The Single Cycle Datapath during Add

• R[rd] <- R[rs] + R[rt]

[Figure: single-cycle datapath with the control values for add: nPC_sel = +4, RegDst = 1, RegWr = 1, ALUSrc = 0, ALUctr = Add, MemWr = 0, MemtoReg = 0. R-type format: op | rs | rt | rd | shamt | funct.]

CPU: The Add Instruction

Instruction Fetch Unit at the End of Add

• PC <- PC + 4
  – This is the same for all instructions except: Branch and Jump

[Figure: instruction fetch unit: PC (Clk), Inst Memory (Adr) producing Instruction<31:0>, two Adders, the constant 4, imm16, the "00" low bits and the Mux controlled by nPC_sel.]

CPU: The Or Immediate Instruction

• R[rt] <- R[rs] or ZeroExt[Imm16]

I-type format: op | rs | rt | immediate

[Figure: single-cycle datapath with the control values left blank to fill in: nPC_sel, RegDst, RegWr, ExtOp, ALUSrc, ALUctr, MemWr, MemtoReg.]

CPU: The Or Immediate Instruction

The Single Cycle Datapath during Or Immediate

• R[rt] <- R[rs] or ZeroExt[Imm16]

[Figure: single-cycle datapath with the control values for or immediate: nPC_sel = +4, RegDst = 0, RegWr = 1, ExtOp = 0, ALUSrc = 1, ALUctr = Or, MemWr = 0, MemtoReg = 0. I-type format: op | rs | rt | immediate.]

CPU: The Load Instruction

The Single Cycle Datapath during Load

• R[rt] <- Data Memory {R[rs] + SignExt[imm16]}

[Figure: single-cycle datapath with the control values for load: nPC_sel = +4, RegDst = 0, RegWr = 1, ExtOp = 1, ALUSrc = 1, ALUctr = Add, MemWr = 0, MemtoReg = 1. I-type format: op | rs | rt | immediate.]

CPU: The Store Instruction

The Single Cycle Datapath during Store

• Data Memory {R[rs] + SignExt[imm16]} <- R[rt]

[Figure: single-cycle datapath with the control values left blank to fill in: nPC_sel, RegDst, RegWr, ExtOp, ALUSrc, ALUctr, MemWr, MemtoReg. I-type format: op | rs | rt | immediate.]

CPU: The Store Instruction

The Single Cycle Datapath during Store

• Data Memory {R[rs] + SignExt[imm16]} <- R[rt]

[Figure: single-cycle datapath with the control values for store: nPC_sel = +4, RegDst = x, RegWr = 0, ExtOp = 1, ALUSrc = 1, ALUctr = Add, MemWr = 1, MemtoReg = x. I-type format: op | rs | rt | immediate.]

CPU: Datapath during branch

• if (R[rs] - R[rt] == 0) then Zero <- 1 ; else Zero <- 0

[Figure: single-cycle datapath with the control values for branch: nPC_sel = “Br”, RegDst = x, RegWr = 0, ExtOp = x, ALUSrc = 0, ALUctr = Subtract, MemWr = 0, MemtoReg = x. I-type format: op | rs | rt | immediate.]

CPU: Datapath during branch

Instruction Fetch Unit at the End of Branch

• if (Zero == 1) then PC = PC + 4 + SignExt[imm16]*4 ; else PC = PC + 4

I-type format: op | rs | rt | immediate

[Figure: instruction fetch unit: PC (Clk), Inst Memory (Adr) producing Instruction<31:0>, two Adders, the constant 4, imm16, the "00" low bits and the Mux controlled by nPC_sel.]

CPU: Creating control from Datapath

[Figure: Inst Memory (Adr) outputs the instruction. The Control block takes Op and Fun and drives nPC_sel, RegWr, RegDst, ExtOp, ALUSrc, ALUctr, MemWr and MemtoReg into the DATA PATH, which returns Equal. Rs <21:25>, Rt <16:20>, Rd <11:15> and Imm16 <0:15> go directly to the datapath.]

CPU: Control Signals

inst    Register Transfer

ADD     R[rd] <– R[rs] + R[rt]; PC <– PC + 4
        ALUsrc = RegB, ALUctr = “add”, RegDst = rd, RegWr, nPC_sel = “+4”

SUB     R[rd] <– R[rs] – R[rt]; PC <– PC + 4
        ALUsrc = RegB, ALUctr = “sub”, RegDst = rd, RegWr, nPC_sel = “+4”

ORi     R[rt] <– R[rs] or zero_ext(Imm16); PC <– PC + 4
        ALUsrc = Im, Extop = “Z”, ALUctr = “or”, RegDst = rt, RegWr, nPC_sel = “+4”

LOAD    R[rt] <– MEM[ R[rs] + sign_ext(Imm16) ]; PC <– PC + 4
        ALUsrc = Im, Extop = “Sn”, ALUctr = “add”, MemtoReg, RegDst = rt, RegWr, nPC_sel = “+4”

STORE   MEM[ R[rs] + sign_ext(Imm16) ] <– R[rt]; PC <– PC + 4
        ALUsrc = Im, Extop = “Sn”, ALUctr = “add”, MemWr, nPC_sel = “+4”

BEQ     if ( R[rs] == R[rt] ) then PC <– PC + 4 + ( sign_ext(Imm16) || 00 ) else PC <– PC + 4
        nPC_sel = “Br”, ALUctr = “sub”

CPU: Summary of Control Signals

              add       sub       ori       lw        sw        beq       jump
func          10 0000   10 0010   We Don’t Care :-)
op            00 0000   00 0000   00 1101   10 0011   10 1011   00 0100   00 0010
RegDst        1         1         0         0         x         x         x
ALUSrc        0         0         1         1         1         0         x
MemtoReg      0         0         0         1         x         x         x
RegWrite      1         1         1         1         0         0         0
MemWrite      0         0         0         0         1         0         0
nPCsel        0         0         0         0         0         1         0
Jump          0         0         0         0         0         0         1
ExtOp         x         x         0         1         1         x         x
ALUctr<2:0>   Add       Subtract  Or        Add       Add       Subtract  xxx

(func and op encodings: see Appendix A)

R-type (add, sub): op | rs | rt | rd | shamt | funct
I-type (ori, lw, sw, beq): op | rs | rt | immediate
J-type (jump): op | target address

CPU: Summary of Control Signals

The Concept of Local Decoding

               R-type     ori       lw        sw        beq       jump
op             00 0000    00 1101   10 0011   10 1011   00 0100   00 0010
RegDst         1          0         0         x         x         x
ALUSrc         0          1         1         1         0         x
MemtoReg       0          0         1         x         x         x
RegWrite       1          1         1         0         0         0
MemWrite       0          0         0         1         0         0
Branch         0          0         0         0         1         0
Jump           0          0         0         0         0         1
ExtOp          x          0         1         1         x         x
ALUop<N:0>     “R-type”   Or        Add       Add       Subtract  xxx

[Diagram: op (6 bits) -> Main Control -> ALUop (N bits) -> ALU Control (Local), which also takes func (6 bits) -> ALUctr (3 bits) -> ALU]

CS141-L4-64 Tarun Soni, Summer’03

CPU: Encoding of ALUop

• In this exercise, ALUop has to be 2 bits wide to represent:
  – (1) “R-type” instructions
  – “I-type” instructions that require the ALU to perform:
    • (2) Or, (3) Add, and (4) Subtract
• To implement the full MIPS ISA, ALUop has to be 3 bits to represent:
  – (1) “R-type” instructions
  – “I-type” instructions that require the ALU to perform:
    • (2) Or, (3) Add, (4) Subtract, and (5) And (Example: andi)

[Diagram: op (6 bits) -> Main Control -> ALUop (N bits) -> ALU Control (Local), which also takes func (6 bits) -> ALUctr (3 bits)]

                     R-type     ori    lw     sw     beq        jump
ALUop (Symbolic)     “R-type”   Or     Add    Add    Subtract   xxx
ALUop<2:0>           1 00       0 10   0 00   0 00   0 01       xxx

CS141-L4-65 Tarun Soni, Summer’03

CPU: Decoding of the ‘func’ field

                     R-type     ori    lw     sw     beq        jump
ALUop (Symbolic)     “R-type”   Or     Add    Add    Subtract   xxx
ALUop<2:0>           1 00       0 10   0 00   0 00   0 01       xxx

[Diagram: op (6 bits) -> Main Control -> ALUop (N bits) -> ALU Control (Local), which also takes func (6 bits) -> ALUctr (3 bits) -> ALU]

R-type format:   op [31:26]   rs [25:21]   rt [20:16]   rd [15:11]   shamt [10:6]   funct [5:0]

funct<5:0>    Instruction Operation
10 0000       add
10 0010       subtract
10 0100       and
10 0101       or
10 1010       set-on-less-than

Recall:

ALUctr<2:0>   ALU Operation
000           And
001           Or
010           Add
110           Subtract
111           Set-on-less-than

CS141-L4-66 Tarun Soni, Summer’03

CPU: Truth table for ALUctr

                     R-type     ori    lw     sw     beq
ALUop (Symbolic)     “R-type”   Or     Add    Add    Subtract
ALUop<2:0>           1 00       0 10   0 00   0 00   0 01

ALUop                     func                               ALU         ALUctr
bit<2> bit<1> bit<0>      bit<3> bit<2> bit<1> bit<0>        Operation   bit<2> bit<1> bit<0>
0      0      0           x      x      x      x             Add         0      1      0
0      x      1           x      x      x      x             Subtract    1      1      0
0      1      x           x      x      x      x             Or          0      0      1
1      x      x           0      0      0      0             Add         0      1      0
1      x      x           0      0      1      0             Subtract    1      1      0
1      x      x           0      1      0      0             And         0      0      0
1      x      x           0      1      0      1             Or          0      0      1
1      x      x           1      0      1      0             Set on <    1      1      1

funct<3:0>    Instruction Op.
0000          add
0010          subtract
0100          and
0101          or
1010          set-on-less-than

CS141-L4-67 Tarun Soni, Summer’03

CPU: Logic Equation ALUctr[2]

The Logic Equation for ALUctr<2>

ALUop                     func
bit<2> bit<1> bit<0>      bit<3> bit<2> bit<1> bit<0>        ALUctr<2>
0      x      1           x      x      x      x             1
1      x      x           0      0      1      0             1
1      x      x           1      0      1      0             1

• ALUctr<2> = !ALUop<2> & ALUop<0> + ALUop<2> & !func<2> & func<1> & !func<0>

This makes func<3> a don’t care

CS141-L4-68 Tarun Soni, Summer’03

CPU: Logic Equation ALUctr[1]

The Logic Equation for ALUctr<1>

ALUop                     func
bit<2> bit<1> bit<0>      bit<3> bit<2> bit<1> bit<0>        ALUctr<1>
0      0      0           x      x      x      x             1
0      x      1           x      x      x      x             1
1      x      x           0      0      0      0             1
1      x      x           0      0      1      0             1
1      x      x           1      0      1      0             1

• ALUctr<1> = !ALUop<2> & !ALUop<1> + ALUop<2> & !func<2> & !func<0>

CS141-L4-69 Tarun Soni, Summer’03

CPU: Logic Equation ALUctr[0]

The Logic Equation for ALUctr<0>

ALUop                     func
bit<2> bit<1> bit<0>      bit<3> bit<2> bit<1> bit<0>        ALUctr<0>
0      1      x           x      x      x      x             1
1      x      x           0      1      0      1             1
1      x      x           1      0      1      0             1

• ALUctr<0> = !ALUop<2> & ALUop<1>
            + ALUop<2> & !func<3> & func<2> & !func<1> & func<0>
            + ALUop<2> & func<3> & !func<2> & func<1> & !func<0>

CS141-L4-70 Tarun Soni, Summer’03

CPU: ALU Control block

The ALU Control Block

[Diagram: ALU Control (Local) takes func (6 bits) and ALUop (3 bits) and produces ALUctr (3 bits)]

• ALUctr<2> = !ALUop<2> & ALUop<0> + ALUop<2> & !func<2> & func<1> & !func<0>

• ALUctr<1> = !ALUop<2> & !ALUop<1> + ALUop<2> & !func<2> & !func<0>

• ALUctr<0> = !ALUop<2> & ALUop<1>
            + ALUop<2> & !func<3> & func<2> & !func<1> & func<0>
            + ALUop<2> & func<3> & !func<2> & func<1> & !func<0>
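One way to sanity-check the ALU control block is to evaluate sum-of-products equations against the ALUctr truth table. The sketch below is a hedged illustration: `alu_ctr`, `inv` and `EXPECTED` are names invented here, and the product terms are read directly off the truth table rows, so the Or row (`0 1 x`) uses ALUop<1> and the Add/Subtract rows (`0 0 0`, `0 x 1`) use !ALUop<1>.

```python
def inv(bit):
    """Logical NOT of a single bit (0 or 1)."""
    return bit ^ 1


def alu_ctr(aluop, func):
    """Return ALUctr<2:0> as an int, given ALUop<2:0> and func<3:0>."""
    a2, a1, a0 = (aluop >> 2) & 1, (aluop >> 1) & 1, aluop & 1
    f3, f2, f1, f0 = (func >> 3) & 1, (func >> 2) & 1, (func >> 1) & 1, func & 1
    c2 = (inv(a2) & a0) | (a2 & inv(f2) & f1 & inv(f0))
    c1 = (inv(a2) & inv(a1)) | (a2 & inv(f2) & inv(f0))
    c0 = ((inv(a2) & a1)
          | (a2 & inv(f3) & f2 & inv(f1) & f0)
          | (a2 & f3 & inv(f2) & f1 & inv(f0)))
    return (c2 << 2) | (c1 << 1) | c0


# (ALUop, func<3:0>) -> ALUctr, taken from the truth table.
EXPECTED = {
    (0b000, 0b0000): 0b010,  # lw / sw: Add
    (0b001, 0b0000): 0b110,  # beq: Subtract
    (0b010, 0b0000): 0b001,  # ori: Or
    (0b100, 0b0000): 0b010,  # R-type add
    (0b100, 0b0010): 0b110,  # R-type subtract
    (0b100, 0b0100): 0b000,  # R-type and
    (0b100, 0b0101): 0b001,  # R-type or
    (0b100, 0b1010): 0b111,  # R-type set-on-less-than
}

mismatches = [key for key, want in EXPECTED.items() if alu_ctr(*key) != want]
```

An empty `mismatches` list means the three equations reproduce every row of the truth table for the encodings used in this exercise.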

CS141-L4-71 Tarun Soni, Summer’03

CPU: Main Control

                   R-type     ori       lw        sw        beq       jump
op                 00 0000    00 1101   10 0011   10 1011   00 0100   00 0010
RegDst             1          0         0         x         x         x
ALUSrc             0          1         1         1         0         x
MemtoReg           0          0         1         x         x         x
RegWrite           1          1         1         0         0         0
MemWrite           0          0         0         1         0         0
Branch             0          0         0         0         1         0
Jump               0          0         0         0         0         1
ExtOp              x          0         1         1         x         x
ALUop (Symbolic)   “R-type”   Or        Add       Add       Subtract  xxx
ALUop<2>           1          0         0         0         0         x
ALUop<1>           0          1         0         0         0         x
ALUop<0>           0          0         0         0         1         x

[Diagram: op (6 bits) -> Main Control -> RegDst, ALUSrc, ..., and ALUop (3 bits) -> ALU Control (Local), which also takes func (6 bits) -> ALUctr (3 bits)]

CS141-L4-72 Tarun Soni, Summer’03

CPU: Main Control The “Truth Table” for RegWrite

            R-type     ori       lw        sw        beq       jump
op          00 0000    00 1101   10 0011   10 1011   00 0100   00 0010
RegWrite    1          1         1         0         0         0

• RegWrite = R-type + ori + lw
    = !op<5> & !op<4> & !op<3> & !op<2> & !op<1> & !op<0>   (R-type)
    + !op<5> & !op<4> &  op<3> &  op<2> & !op<1> &  op<0>   (ori)
    +  op<5> & !op<4> & !op<3> & !op<2> &  op<1> &  op<0>   (lw)
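The RegWrite equation maps directly onto bit operations on the 6-bit opcode. A minimal sketch, assuming the opcode is passed as an integer; the function name `reg_write` and the helper lists `b` and `nb` are illustrative only.

```python
def reg_write(op):
    """RegWrite as a sum of three product terms over op<5:0>."""
    b = [(op >> i) & 1 for i in range(6)]   # b[i] is op<i>
    nb = [bit ^ 1 for bit in b]             # nb[i] is !op<i>
    r_type = nb[5] & nb[4] & nb[3] & nb[2] & nb[1] & nb[0]
    ori = nb[5] & nb[4] & b[3] & b[2] & nb[1] & b[0]
    lw = b[5] & nb[4] & nb[3] & nb[2] & b[1] & b[0]
    return r_type | ori | lw


# Opcodes from the table: R-type, ori, lw, sw, beq, jump.
OPCODES = (0b000000, 0b001101, 0b100011, 0b101011, 0b000100, 0b000010)
row = [reg_write(op) for op in OPCODES]
```

`row` evaluates to `[1, 1, 1, 0, 0, 0]`, which is the RegWrite row of the truth table. Each product term corresponds to one AND gate of the decoder, and the final `|` to the OR gate that drives RegWrite.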

[Figure: AND gates decode op<5..0> into the R-type, ori, lw, sw, beq and jump lines; an OR gate combines the R-type, ori and lw lines to produce RegWrite]

CS141-L4-73 Tarun Soni, Summer’03

CPU: Main Control PLA Implementation of the Main Control

[Figure: PLA implementation. The AND plane decodes op<5..0> into the R-type, ori, lw, sw, beq and jump lines; the OR plane produces RegWrite, ALUSrc, MemtoReg, MemWrite, Branch, Jump, RegDst, ExtOp, ALUop<2>, ALUop<1> and ALUop<0>]

CS141-L4-74 Tarun Soni, Summer’03

CPU Putting it All Together: A Single Cycle Processor

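[Figure: single-cycle datapath with control.
 Instruction Fetch Unit (Clk, nPC_sel, Zero) outputs Instruction<31:0>, split into Rs <21:25>, Rt <16:20>, Rd <11:15> and Imm16 <0:15>.
 Register file: 32 32-bit registers; Rw, Ra, Rb (5 bits each); busW, busA, busB (32 bits each); RegWr; Clk. A mux controlled by RegDst selects Rd or Rt as Rw.
 Extender (ExtOp) widens imm16 from 16 to 32 bits; a mux controlled by ALUSrc selects busB or the extended immediate as the second ALU input.
 ALU (ALUctr) produces Zero and a 32-bit result, which drives Adr of the Data Memory (Data In, WrEn = MemWr, Clk).
 A mux controlled by MemtoReg selects the ALU result or the Data Memory output for busW.
 Main Control takes op = Instr<31:26> (6 bits) and produces RegDst, ALUSrc, ..., and ALUop (3 bits).
 ALU Control takes func = Instr<5:0> (6 bits) and ALUop, and produces ALUctr (3 bits). Imm16 = Instr<15:0>.]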

CS141-L4-75 Tarun Soni, Summer’03

CPU Worst Case Timing (Load)

[Timing diagram for a load. Signals shown, each changing from Old Value to New Value: Clk; PC; Rs, Rt, Rd, Op, Func; ALUctr; RegWr; ExtOp; ALUSrc; MemtoReg; busA; busB; Address; busW.
 Delay annotations, in order: Clk-to-Q; Instruction Memory Access Time; Delay through Control Logic; Register File Access Time; Delay through Extender & Mux; ALU Delay; Data Memory Access Time; Register Write Occurs.]

CS141-L4-76 Tarun Soni, Summer’03

CPU: Single Cycle Solution

• Long cycle time:
  – Cycle time must be long enough for the load instruction:
      PC’s Clock-to-Q +
      Instruction Memory Access Time +
      Register File Access Time +
      ALU Delay (address calculation) +
      Data Memory Access Time +
      Register File Setup Time +
      Clock Skew

• Cycle time for load is much longer than needed for all other instructions
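The cycle-time sum can be made concrete with a small calculation. The delay values below are hypothetical placeholders chosen only to illustrate the arithmetic; they are not from the lecture. The R-type path is likewise an assumption: it is taken to be the load path without the data memory access.

```python
# Hypothetical component delays in nanoseconds (illustrative values only).
DELAYS_NS = {
    "pc_clk_to_q": 1,
    "inst_mem": 10,
    "reg_read": 5,
    "alu": 8,
    "data_mem": 10,
    "reg_setup": 1,
    "clock_skew": 1,
}

# Components on the critical path of each instruction class.
LOAD_PATH = ("pc_clk_to_q", "inst_mem", "reg_read", "alu",
             "data_mem", "reg_setup", "clock_skew")
R_TYPE_PATH = ("pc_clk_to_q", "inst_mem", "reg_read", "alu",
               "reg_setup", "clock_skew")  # assumed: no data memory access


def path_delay(path, delays=DELAYS_NS):
    """Sum the delays of the components along one path."""
    return sum(delays[name] for name in path)


cycle_time = path_delay(LOAD_PATH)                      # slowest instruction sets the clock
r_type_slack = cycle_time - path_delay(R_TYPE_PATH)     # time an R-type leaves unused
```

With these made-up numbers the clock period is 36 ns and every R-type instruction idles for 10 ns of it, which is the cost the slide refers to.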

CS141-L4-77 Tarun Soni, Summer’03

CPU: Single Cycle Solution

° Single cycle datapath => CPI = 1, CCT (clock cycle time) => long

° 5 steps to design a processor
  • 1. Analyze instruction set => datapath requirements
  • 2. Select set of datapath components & establish clock methodology
  • 3. Assemble datapath meeting the requirements
  • 4. Analyze implementation of each instruction to determine setting of control points that effects the register transfer
  • 5. Assemble the control logic

° Control is the hard part

° MIPS makes control easier
  • Instructions same size
  • Source registers always in same place
  • Immediates same size, location
  • Operations always on registers/immediates

[Figure: Processor (Control, Datapath), Memory, Input, Output]

CS141-L4-78 Tarun Soni, Summer’03

CPU: Interrupts

° Datapath for interrupts

° Interrupt: basically a hardware line requesting an immediate jump

° PC = Int[I] if Int[I] = 1;

° May or may not save registers

° May or may not be maskable.

° Useful for multitasking control & real-time processing

° Signal Processing

° Harder to implement in a multi-cycle or pipelined system!