Lecture 5. MIPS Processor Design Single-Cycle MIPS #2 Prof. Taeweon Suh Computer Science Education Korea University ECM534 Advanced Computer Architecture

Page 1: Lecture 5. MIPS Processor Design Single-Cycle MIPS #2

Lecture 5. MIPS Processor Design
Single-Cycle MIPS #2

Prof. Taeweon Suh
Computer Science Education
Korea University

ECM534 Advanced Computer Architecture

Page 2: Lecture 5. MIPS Processor Design Single-Cycle MIPS #2

Single-Cycle MIPS

• Again, keep in mind that a microarchitecture is composed of two interacting parts:
    Datapath
    Control

• Let’s execute some example instructions on what we have designed so far

• Then, we are going to design the control logic in detail

Page 3: Lecture 5. MIPS Processor Design Single-Cycle MIPS #2

Single-Cycle MIPS - lw

• Let’s start with a memory access instruction - lw

• STEP 1: Instruction Fetch

Example: lw $2, 80($0)

I-Type: op (6 bits) | rs (5 bits) | rt (5 bits) | imm (16 bits)

[Figure: datapath so far (PC, Instruction Memory, Register File, Data Memory). PC drives the instruction memory address A, and the read data RD is the fetched instruction Instr.]

Page 4: Lecture 5. MIPS Processor Design Single-Cycle MIPS #2

Single-Cycle MIPS - lw

• STEP 2: Decoding
    Read source operands from the register file

Example: lw $2, 80($0)

I-Type: op (6 bits) | rs (5 bits) | rt (5 bits) | imm (16 bits)

[Figure: datapath with Instr[25:21] (rs) connected to register file address port A1]

Page 5: Lecture 5. MIPS Processor Design Single-Cycle MIPS #2

Single-Cycle MIPS - lw

• STEP 2: Decoding
    Sign-extend the immediate

Example: lw $2, 80($0)

I-Type: op (6 bits) | rs (5 bits) | rt (5 bits) | imm (16 bits)

    module signext(input  [15:0] a,
                   output [31:0] y);
      // replicate the sign bit a[15] into the upper 16 bits
      assign y = {{16{a[15]}}, a};
    endmodule

[Figure: datapath with a Sign Extend unit that takes Instr[15:0] and produces SignImm]

Page 6: Lecture 5. MIPS Processor Design Single-Cycle MIPS #2

Single-Cycle MIPS - lw

• STEP 3: Execution
    Compute the memory address

Example: lw $2, 80($0)

I-Type: op (6 bits) | rs (5 bits) | rt (5 bits) | imm (16 bits)

[Figure: datapath with the ALU added. SrcA and SrcB are the ALU inputs, ALUControl2:0 = 010 (add), and the outputs are ALUResult and Zero.]

Page 7: Lecture 5. MIPS Processor Design Single-Cycle MIPS #2

Single-Cycle MIPS - lw

• STEP 4: Execution
    Read data from memory and write it to the register file

Example: lw $2, 80($0)

I-Type: op (6 bits) | rs (5 bits) | rt (5 bits) | imm (16 bits)

[Figure: datapath with ALUResult driving the data memory address A, ReadData returned to register file port WD3, and Instr[20:16] (rt) connected to A3. Control values: ALUControl2:0 = 010, RegWrite = 1.]
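The register file ports used in the lw walkthrough (A1/RD1, A2/RD2, A3/WD3/WE3) can be sketched in Verilog roughly as follows. This is a minimal sketch and not code from the lecture: the module name `regfile`, the lowercase port names, and the choice to make register $0 read as constant zero are assumptions made here for illustration.

```verilog
// Hypothetical three-ported register file (names are illustrative).
// Two combinational read ports, one write port clocked on the rising edge.
module regfile(input         clk,
               input         we3,          // RegWrite
               input  [4:0]  a1, a2, a3,   // rs, rt, write address
               input  [31:0] wd3,          // write data
               output [31:0] rd1, rd2);

  reg [31:0] rf [31:0];

  // Write port: state changes only on the clock edge when we3 is high.
  always @(posedge clk)
    if (we3) rf[a3] <= wd3;

  // Read ports: combinational; register 0 always reads as zero.
  assign rd1 = (a1 != 5'b0) ? rf[a1] : 32'b0;
  assign rd2 = (a2 != 5'b0) ? rf[a2] : 32'b0;
endmodule
```

For `lw $2, 80($0)`, a1 would be 0 (so rd1 is 0), a3 would be 2, and wd3 would carry ReadData, which is written at the end of the cycle because we3 (RegWrite) is 1.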

Page 8: Lecture 5. MIPS Processor Design Single-Cycle MIPS #2

Single-Cycle MIPS - PC

• The CPU fetches the next instruction from PC+4

    module adder(input  [31:0] a, b,
                 output [31:0] y);
      assign y = a + b;
    endmodule

    // PC + 4: address of the next sequential instruction
    adder pcadd1(.a (pc),
                 .b (32'b100),
                 .y (pcplus4));

[Figure: datapath with an adder that computes PCPlus4 = PC + 4 and feeds it back as PC'. The ALU output now passes to the register file as Result.]
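The PC itself is a 32-bit register that loads PC' on every rising clock edge. A minimal sketch is shown below; the module name `pcreg`, the asynchronous `reset` input, and the reset value of 0 are assumptions made here, since the slides only show the PC box and its clock.

```verilog
// Hypothetical resettable PC register (names and reset value are illustrative).
module pcreg(input             clk, reset,
             input      [31:0] pcnext,   // PC' in the figure
             output reg [31:0] pc);

  // Asynchronous reset to address 0; otherwise load PC' each cycle.
  always @(posedge clk or posedge reset)
    if (reset) pc <= 32'b0;
    else       pc <= pcnext;
endmodule
```

Before branches and jumps are added, pcnext would simply be wired to the pcplus4 output of the adder instance above.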

Page 9: Lecture 5. MIPS Processor Design Single-Cycle MIPS #2

Single-Cycle MIPS - sw

• Let’s execute another memory access instruction - sw
    The sw instruction needs to write data to memory

Example: sw $2, 84($0)

I-Type: op (6 bits) | rs (5 bits) | rt (5 bits) | imm (16 bits)

[Figure: datapath for sw. Instr[20:16] (rt) is connected to A2, and RD2 is passed as WriteData to the data memory WD port. Control values: RegWrite = 0, ALUControl2:0 = 010, MemWrite = 1.]
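The data memory that sw writes can be sketched as below. This is only an illustration under stated assumptions: the module name `dmem`, the 64-word size, and word-aligned addressing are choices made here and are not specified on the slides.

```verilog
// Hypothetical 64-word data memory: combinational read, clocked write.
module dmem(input         clk,
            input         we,    // MemWrite
            input  [31:0] a,     // byte address (ALUResult)
            input  [31:0] wd,    // WriteData (RD2)
            output [31:0] rd);   // ReadData

  reg [31:0] ram [63:0];

  // Word-aligned access: a[1:0] is the byte offset and is ignored.
  assign rd = ram[a[7:2]];

  // Memory is written only when MemWrite is asserted.
  always @(posedge clk)
    if (we) ram[a[7:2]] <= wd;
endmodule
```

For `sw $2, 84($0)`, the ALU produces byte address 84, which selects word 21 in this sketch, and the value of $2 arrives on wd.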

Page 10: Lecture 5. MIPS Processor Design Single-Cycle MIPS #2

Single-Cycle MIPS - add, sub, and, or

• Let’s consider arithmetic and logical instructions - add, sub, and, or
    Write ALUResult to the register file
    Note that R-type instructions write to the rd field of the instruction (instead of rt)

R-Type: op (6 bits) | rs (5 bits) | rt (5 bits) | rd (5 bits) | shamt (5 bits) | funct (6 bits)

[Figure: datapath with three multiplexers added. RegDst selects WriteReg4:0 from Instr[20:16] or Instr[15:11], ALUSrc selects SrcB from RD2 or SignImm, and MemtoReg selects Result from ALUResult or ReadData. Control values: RegWrite = 1, RegDst = 1, ALUSrc = 0, ALUControl2:0 = varies, MemWrite = 0, MemtoReg = 0.]
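The three multiplexers that R-type support adds to the datapath can be sketched as follows. The module names `mux2` and `rtype_muxes` and the wrapper itself are hypothetical and exist only to make the example self-contained; the select polarity follows the 0/1 labels used in the main decoder table (RegDst = 1 selects rd, ALUSrc = 1 selects SignImm, MemtoReg = 1 selects ReadData).

```verilog
// Hypothetical parameterized 2:1 multiplexer.
module mux2 #(parameter WIDTH = 32)
             (input  [WIDTH-1:0] d0, d1,
              input              s,
              output [WIDTH-1:0] y);
  assign y = s ? d1 : d0;
endmodule

// Hypothetical wrapper showing how the three muxes are wired.
module rtype_muxes(input  [31:0] instr,
                   input  [31:0] rd2, signimm, aluresult, readdata,
                   input         regdst, alusrc, memtoreg,
                   output [4:0]  writereg,
                   output [31:0] srcb, result);

  // Destination register: rt (20:16) for lw, rd (15:11) for R-type.
  mux2 #(5)  wrmux  (.d0(instr[20:16]), .d1(instr[15:11]), .s(regdst),   .y(writereg));
  // Second ALU operand: register value or sign-extended immediate.
  mux2 #(32) srcbmux(.d0(rd2),          .d1(signimm),      .s(alusrc),   .y(srcb));
  // Value written back: ALU result or data read from memory.
  mux2 #(32) resmux (.d0(aluresult),    .d1(readdata),     .s(memtoreg), .y(result));
endmodule
```

A single parameterized mux keeps the three instances identical apart from width, which is why WIDTH is a parameter in this sketch.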

Page 11: Lecture 5. MIPS Processor Design Single-Cycle MIPS #2

Single-Cycle MIPS - beq

• Let’s consider a branch instruction - beq
    Determine whether register values are equal
    Calculate the branch target address (BTA) from the sign-extended immediate and PC+4

Example: beq $4,$0,around

I-Type: op (6 bits) | rs (5 bits) | rt (5 bits) | imm (16 bits)

[Figure: datapath with branch hardware added. SignImm is shifted left by 2 and added to PCPlus4 to form PCBranch, and a PC mux selected by PCSrc (driven by Branch and Zero) chooses between PCPlus4 and PCBranch. Control values: RegWrite = 0, RegDst = x, ALUSrc = 0, Branch = 1, MemWrite = 0, MemtoReg = x, ALUControl2:0 = 110.]
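The branch hardware described above can be sketched in Verilog as below. The module name `branchlogic` and its port names are hypothetical, and the sketch assumes PCSrc is the AND of Branch and Zero, which is what the signal names in the figure suggest but is not spelled out in the slide text.

```verilog
// Hypothetical next-PC logic for beq (names are illustrative).
module branchlogic(input  [31:0] pcplus4,
                   input  [31:0] signimm,
                   input         branch, zero,
                   output        pcsrc,
                   output [31:0] pcbranch,
                   output [31:0] pcnext);

  // Shift left by 2: the immediate counts words, addresses count bytes.
  wire [31:0] signimmsh = {signimm[29:0], 2'b00};

  assign pcbranch = pcplus4 + signimmsh;         // branch target address (BTA)
  assign pcsrc    = branch & zero;               // assumed: taken when operands are equal
  assign pcnext   = pcsrc ? pcbranch : pcplus4;  // PC' mux
endmodule
```

The ALU performs a subtraction for beq (ALUControl = 110), so Zero is 1 exactly when the two register values are equal.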

Page 12: Lecture 5. MIPS Processor Design Single-Cycle MIPS #2

Single-Cycle MIPS - or

• Let’s see how the or instruction works out in the implementation with control signals

R-Type: op (6 bits) | rs (5 bits) | rt (5 bits) | rd (5 bits) | shamt (5 bits) | funct (6 bits)

[Figure: complete datapath with the Control Unit, which takes Op (Instr[31:26]) and Funct (Instr[5:0]). Control values for or: RegWrite = 1, RegDst = 1, ALUSrc = 0, Branch = 0, MemWrite = 0, MemtoReg = 0, ALUControl2:0 = 001, PCSrc = 0.]

Page 13: Lecture 5. MIPS Processor Design Single-Cycle MIPS #2

Single-Cycle MIPS

• As mentioned, the CPU is designed with datapath and control
• Now, let’s delve into the ALU and control part design

[Figure: complete single-cycle MIPS datapath with the Control Unit]

Page 14: Lecture 5. MIPS Processor Design Single-Cycle MIPS #2

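ALU (Arithmetic Logic Unit)

N = 32 in a 32-bit processor

F2:0   Function
000    A & B
001    A | B
010    A + B
011    not used
100    A & ~B
101    A | ~B
110    A - B
111    SLT

// slt (set less than)
// $t0 = 1 if $t1 < $t2
slt $t0, $t1, $t2

[Figure: ALU symbol with N-bit inputs A and B, 3-bit control F, and N-bit output Y, shown next to its internal structure. F2 selects B or ~B and is the adder carry-in; F1:0 selects among AND, OR, the adder sum S, and the zero-extended sign bit S[N-1] used for SLT.]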

Page 15: Lecture 5. MIPS Processor Design Single-Cycle MIPS #2

Verilog Code - ALU

    module alu(input      [31:0] a, b,
               input      [2:0]  alucont,
               output reg [31:0] result,
               output            zero);

      wire [31:0] b2, sum, slt;

      // alucont[2] inverts b and supplies the carry-in, so sum = a - b
      assign b2  = alucont[2] ? ~b : b;
      assign sum = a + b2 + alucont[2];   // addition (sub)
      assign slt = sum[31];               // SLT: sign bit, zero-extended

      // combinational logic, so blocking assignments are used
      always @(*)
        begin
          case (alucont[1:0])
            2'b00: result = a & b2;   // A & B, A & ~B
            2'b01: result = a | b2;   // A | B, A | ~B
            2'b10: result = sum;      // A + B, A - B
            2'b11: result = slt;      // SLT
          endcase
        end

      // for branch
      assign zero = (result == 32'b0);

    endmodule

[Figure: ALU symbol, internal structure, and F2:0 function table, repeated from the previous slide]

Page 16: Lecture 5. MIPS Processor Design Single-Cycle MIPS #2

Control Unit

Opcode and funct fields come from the fetched instruction

R-Type: op (6 bits) | rs (5 bits) | rt (5 bits) | rd (5 bits) | shamt (5 bits) | funct (6 bits)
I-Type: op (6 bits) | rs (5 bits) | rt (5 bits) | imm (16 bits)
J-Type: op (6 bits) | addr (26 bits)

[Figure: Control Unit made of a Main Decoder and an ALU Decoder. The Main Decoder takes Opcode5:0 and produces MemtoReg, MemWrite, Branch, ALUSrc, RegDst, RegWrite, and ALUOp1:0. The ALU Decoder takes ALUOp1:0 and Funct5:0 and produces ALUControl2:0.]

Page 17: Lecture 5. MIPS Processor Design Single-Cycle MIPS #2

Control Unit - ALU Control

ALUOp1:0   Meaning
00         Add
01         Subtract
10         Look at Funct
11         Not Used

ALUOp1:0   Funct          ALUControl2:0
00         X              010 (add)
X1         X              110 (subtract)
1X         100000 (add)   010 (add)
1X         100010 (sub)   110 (subtract)
1X         100100 (and)   000 (and)
1X         100101 (or)    001 (or)
1X         101010 (slt)   111 (slt)

• Memory access instructions (lw, sw) need to use the ALU to calculate the memory target address (addition)

• Branch instructions (beq, bne) need to use the ALU for the equality check (subtraction)

• Implementation is completely dependent on hardware designers
• But the designers should make sure the implementation is reasonable enough

[Figure: Control Unit block diagram with Main Decoder and ALU Decoder]

Page 18: Lecture 5. MIPS Processor Design Single-Cycle MIPS #2

Control Unit - Main Decoder

Instruction  Op5:0   RegWrite  RegDst  ALUSrc  Branch  MemWrite  MemtoReg  ALUOp1:0
R-type       000000  1         1       0       0       0         0         10
lw           100011  1         0       1       0       0         1         00
sw           101011  0         X       1       0       1         X         00
beq          000100  0         X       0       1       0         X         01

ALUOp1:0   Meaning
00         Add
01         Subtract
10         Look at Funct field
11         Not Used

[Figure: Control Unit block diagram with Main Decoder and ALU Decoder]

Page 19: Lecture 5. MIPS Processor Design Single-Cycle MIPS #2

How about Other Instructions?

• Now, we are done with the control part design
• Let’s examine if the design is able to execute other instructions

Example: addi $t0, $t1, -14

[Figure: complete single-cycle MIPS datapath with the Control Unit]

Page 20: Lecture 5. MIPS Processor Design Single-Cycle MIPS #2

Control Unit - Main Decoder

Instruction  Op5:0   RegWrite  RegDst  ALUSrc  Branch  MemWrite  MemtoReg  ALUOp1:0
R-type       000000  1         1       0       0       0         0         10
lw           100011  1         0       1       0       0         1         00
sw           101011  0         X       1       0       1         X         00
beq          000100  0         X       0       1       0         X         01
addi         001000  1         0       1       0       0         0         00
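As a worked check of the addi row, the example `addi $t0, $t1, -14` can be traced through the existing datapath without any new hardware. The hexadecimal encodings below are computed here for illustration and do not appear on the slides.

```latex
\begin{aligned}
\text{imm}_{15:0} &= -14 \bmod 2^{16} = 65522 = \texttt{0xFFF2} \\
\text{SignImm} &= \{16\{\text{imm}_{15}\},\ \text{imm}_{15:0}\} = \texttt{0xFFFFFFF2} \\
\text{SrcB} &= \text{SignImm} \quad (\text{ALUSrc} = 1) \\
\text{ALUResult} &= \text{SrcA} + \text{SrcB} = \texttt{\$t1} - 14
  \quad (\text{ALUOp} = 00 \Rightarrow \text{ALUControl} = 010) \\
\text{WriteReg} &= \text{Instr}_{20:16} = \texttt{\$t0}
  \quad (\text{RegDst} = 0,\ \text{MemtoReg} = 0,\ \text{RegWrite} = 1)
\end{aligned}
```

The control values are the same as for lw except MemtoReg, which is 0 so that ALUResult rather than ReadData is written back.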

Page 21: Lecture 5. MIPS Processor Design Single-Cycle MIPS #2

Control Unit - Main Decoder

• How about jump instructions? j

J-Type: op (6 bits) | addr (26 bits)

[Figure: complete single-cycle MIPS datapath with the Control Unit, before jump support is added]

Page 22: Lecture 5. MIPS Processor Design Single-Cycle MIPS #2

Control Unit - Main Decoder

• We added new hardware to support the j instruction
    Logic to compute the target address
    Mux and control signal

J-Type: op (6 bits) | addr (26 bits)

[Figure: datapath extended for j. Instr[25:0] is shifted left by 2 to form bits 27:0 and combined with bits 31:28 of PCPlus4 to form PCJump. A new mux selected by Jump chooses PCJump as PC'.]
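The jump target logic and the extra PC mux can be sketched as follows. The module name `jumplogic` and the port name `pcbranchmux` (standing for the output of the existing PCSrc mux) are hypothetical; the bit ranges follow the labels in the figure (25:0, <<2, 27:0, 31:28).

```verilog
// Hypothetical jump target logic and final PC' mux (names are illustrative).
module jumplogic(input  [31:0] pcplus4,
                 input  [25:0] addr,         // Instr[25:0]
                 input  [31:0] pcbranchmux,  // output of the PCSrc mux
                 input         jump,
                 output [31:0] pcjump,
                 output [31:0] pcnext);

  // Upper 4 bits come from PC+4; the 26-bit field is shifted left by 2.
  assign pcjump = {pcplus4[31:28], addr, 2'b00};

  // Jump overrides both the sequential and the branch path.
  assign pcnext = jump ? pcjump : pcbranchmux;
endmodule
```

Concatenating two zero bits is equivalent to the <<2 block in the figure and costs no logic, only wiring.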

Page 23: Lecture 5. MIPS Processor Design Single-Cycle MIPS #2

Control Unit - Main Decoder

• There should be one more output (jump) in the main decoder to support the jump instructions

Instruction  Op5:0   RegWrite  RegDst  ALUSrc  Branch  MemWrite  MemtoReg  ALUOp1:0  Jump
R-type       000000  1         1       0       0       0         0         10        0
lw           100011  1         0       1       0       0         1         00        0
sw           101011  0         X       1       0       1         X         00        0
beq          000100  0         X       0       1       0         X         01        0
addi         001000  1         0       1       0       0         0         00        0
j            000010  0         X       X       X       0         X         XX        1

Page 24: Lecture 5. MIPS Processor Design Single-Cycle MIPS #2

Verilog Code - Main Decoder and ALU Control

    module maindec(input  [5:0] op,
                   output       memtoreg, memwrite,
                   output       branch, alusrc,
                   output       regdst, regwrite,
                   output       jump,
                   output [1:0] aluop);

      reg [8:0] controls;

      assign {regwrite, regdst, alusrc, branch, memwrite,
              memtoreg, jump, aluop} = controls;

      // combinational logic, so blocking assignments are used
      always @(*)
        begin
          case (op)
            6'b000000: controls = 9'b110000010; // R-type
            6'b100011: controls = 9'b101001000; // lw
            6'b101011: controls = 9'b001010000; // sw
            6'b000100: controls = 9'b000100001; // beq
            6'b001000: controls = 9'b101000000; // addi
            6'b000010: controls = 9'b000000100; // j
            default:   controls = 9'bxxxxxxxxx; // unsupported opcode
          endcase
        end

    endmodule

    module aludec(input      [5:0] funct,
                  input      [1:0] aluop,
                  output reg [2:0] alucontrol);

      always @(*)
        begin
          case (aluop)
            2'b00: alucontrol = 3'b010; // add
            2'b01: alucontrol = 3'b110; // sub
            default:
              case (funct) // R-type
                6'b100000: alucontrol = 3'b010; // ADD
                6'b100010: alucontrol = 3'b110; // SUB
                6'b100100: alucontrol = 3'b000; // AND
                6'b100101: alucontrol = 3'b001; // OR
                6'b101010: alucontrol = 3'b111; // SLT
                default:   alucontrol = 3'bxxx; // unsupported funct
              endcase
          endcase
        end

    endmodule

[Figure: Control Unit block diagram with Main Decoder and ALU Decoder, and the ALUOp1:0 meaning table (00 Add, 01 Subtract, 10 Look at Funct, 11 Not Used)]

Page 25: Lecture 5. MIPS Processor Design Single-Cycle MIPS #2

Single-Cycle MIPS Performance

• How fast is the single-cycle processor?
• Clock cycle time (frequency) is limited by the critical path
    The critical path is the path that takes the longest time
    What do you think the critical path is?
      • The path that the lw instruction goes through

[Figure: complete datapath annotated with the control values for lw: RegWrite = 1, RegDst = 0, ALUSrc = 1, Branch = 0, MemWrite = 0, MemtoReg = 1, ALUControl2:0 = 010, PCSrc = 0]

Page 26: Lecture 5. MIPS Processor Design Single-Cycle MIPS #2

Single-Cycle MIPS Performance

• Critical path of single-cycle MIPS
    $T_c = t_{pcq\_PC} + t_{mem} + \max(t_{RFread}, t_{sext}) + t_{mux} + t_{ALU} + t_{mem} + t_{mux} + t_{RFsetup}$

• In most implementations, the limiting paths are: memory (instruction and data), ALU, register file. Taking the register file read to be slower than sign extension, the max term reduces to $t_{RFread}$:
    $T_c = t_{pcq\_PC} + 2t_{mem} + t_{RFread} + 2t_{mux} + t_{ALU} + t_{RFsetup}$

Elements              Parameter
Register clock-to-Q   $t_{pcq\_PC}$
Multiplexer           $t_{mux}$
ALU                   $t_{ALU}$
Memory read           $t_{mem}$
Register file read    $t_{RFread}$
Register file setup   $t_{RFsetup}$

[Figure: complete datapath annotated with the control values for lw, showing the critical path]

Page 27: Lecture 5. MIPS Processor Design Single-Cycle MIPS #2

Example

Elements              Parameter        Delay (ps)
Register clock-to-Q   $t_{pcq\_PC}$    30
Multiplexer           $t_{mux}$        25
ALU                   $t_{ALU}$        200
Memory read           $t_{mem}$        250
Register file read    $t_{RFread}$     150
Register file setup   $t_{RFsetup}$    20

$T_c = t_{pcq\_PC} + 2t_{mem} + t_{RFread} + 2t_{mux} + t_{ALU} + t_{RFsetup}$
    $= [30 + 2(250) + 150 + 2(25) + 200 + 20]\ \text{ps} = 950\ \text{ps}$

$f_c = 1/T_c = 1/(950\ \text{ps}) \approx 1.053\ \text{GHz}$

• Assuming that the CPU should execute 100 billion instructions to run your program, what is the execution time of the program on a single-cycle MIPS processor?
    $\text{Execution Time} = (\#\text{instructions}) \times (\text{cycles/instruction}) \times (\text{seconds/cycle})$
    $= (100 \times 10^{9}) \times (1) \times (950 \times 10^{-12}\ \text{s})$
    $= 95\ \text{seconds}$