The Processor Data Path & Control Chapter 5 Part 1 - Introduction and Single Clock Cycle Design N. Guydosh 2/29/04


Page 1: The Processor Data Path & Control Chapter 5 Part 1  -  Introduction and Single Clock Cycle Design

The Processor Data Path & Control

Chapter 5 Part 1 - Introduction and Single Clock Cycle Design

N. Guydosh, 2/29/04

Page 2

Introduction

• Starting point:
  – The specification of the MIPS instruction set drives the design of the hardware.
  – Will restrict the design to integer-type instructions.
  – Arithmetic element design comes from chapter 4.

• Identify functions common to all instructions, and within instruction classes (easy to do in a RISC architecture):
  – Instruction fetch
  – Access one or more registers
  – Use the ALU

• Asserted signals: a high or low level of a signal which implies a logically “true” condition, i.e., an “action” level. The text will only assert a logically high level, i.e., a “1”.

• Clocking
  – Assume “edge triggered” clocking (as opposed to level sensitive).
  – A storage circuit or flip-flop stores a value on the clock transition edge.
  – The model is flip-flops with combinational logic between them.
  – Propagation delay through the combinational logic between storage elements determines the clock cycle length.
  – Single clock cycle vs. multi-clock cycle design approach

Page 3

Example of Edge Triggering

Page 4

Example of Edge Triggering

Setting and sampling the same state element in the same clock cycle:

• This is allowable if the delay through the combinational logic is long enough that the new value of A cannot reach B on the same clock edge, while still settling within the clock cycle.
• In this example, state element B captures a value based on the original value of A, and then A is modified to a new value.

Based on Fig. 5.3
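
The edge-triggered behavior described here can be sketched in a few lines of Python. This is only an illustration: the two operations applied to A (`+ 1` and `* 2`) and the function name `clock_edge` are made up for the example and do not come from the slides.

```python
def clock_edge(state):
    """Return the state after one clock edge.

    Every next value is computed from the *current* state first and then
    committed together, which mimics edge-triggered state elements.
    """
    next_b = state["A"] + 1   # combinational logic fed by the old A (assumed: +1)
    next_a = state["A"] * 2   # A is rewritten in the same cycle (assumed: *2)
    return {"A": next_a, "B": next_b}


state = clock_edge({"A": 5, "B": 0})
print(state)  # B was computed from the original A (5), not from the new A (10)
```

Because both next values are derived before either is stored, B never sees the new value of A within the same cycle.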

Page 5

Single vs Multi-clock Cycle Design

• Start out with a single “long” clock cycle for each instruction.
  – The entire instruction gets executed in a single clock pulse.
  – The controller is pure combinational logic.
  – The design is simple.
  – You would think that a single clock cycle per instruction would give us super high performance, but not so: the slowest instruction determines the speed of all instructions.
  – Ultimately we will go with a multi-clock cycle design and let each instruction run as fast as it can. The bottleneck is then not the slowest instruction but the slowest “phase of execution” within an instruction. Many instructions never have this phase, so only the instructions employing the “slow phases” are penalized.

• Because the various phases of an instruction need the same kind of hardware resource, and all of it is needed at the same time (in the same clock pulse):
  – Some hardware is duplicated, another disadvantage of the single clock cycle design.
  – Examples: 2 memories (instruction memory and data memory); 2 adders and an ALU.

Page 6

Single Clock Cycle Design Summary

• Has a performance bottleneck
  – The clock cycle time is determined by the longest path in the machine.
  – The simple jump (j) instruction will take as long as the load word (lw).
  – The instruction which uses the longest data path dictates the time for all others.

• What about a variable time clock design?
  – Still a single clock
  – Clock pulse interval is a function of the opcode
  – Average time per instruction theoretically improves
  – But it is difficult to implement: lots of overhead to overcome

• But what the hey! Let’s start with a single clock cycle design for simplicity and later convert to multi-clock cycle.
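
To make the bottleneck concrete, here is a small sketch comparing a fixed single-cycle clock with an idealized variable clock. The unit delays (2 ns memory, 2 ns ALU, 1 ns register file access) and the instruction mix are illustrative assumptions, not values given in these slides.

```python
# Assumed, illustrative unit delays in picoseconds (not taken from the slides).
MEM, ALU, REG = 2000, 2000, 1000

# Functional units on the critical path of each instruction class.
path_delay_ps = {
    "lw":     MEM + REG + ALU + MEM + REG,  # fetch, read regs, address, data memory, write back
    "sw":     MEM + REG + ALU + MEM,        # fetch, read regs, address, data memory
    "R-type": MEM + REG + ALU + REG,        # fetch, read regs, ALU, write back
    "beq":    MEM + REG + ALU,              # fetch, read regs, compare
    "j":      MEM,                          # fetch only
}

# Assumed instruction mix in percent (hypothetical workload).
mix_percent = {"lw": 24, "sw": 12, "R-type": 44, "beq": 18, "j": 2}

# Fixed clock: every instruction pays for the slowest path.
single_cycle_period_ps = max(path_delay_ps.values())

# Idealized variable clock: each instruction pays only for its own path.
variable_clock_avg_ps = sum(
    path_delay_ps[name] * mix_percent[name] for name in path_delay_ps
) // 100

print(single_cycle_period_ps, variable_clock_avg_ps)
```

Under these assumptions the fixed clock is set by lw, while the variable clock averages noticeably less per instruction. This matches the point above that the improvement is theoretical, since the sketch ignores the overhead of generating such a clock.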

Page 7

Basic Abstract View of the Data Path

• Shows common functions for most instructions
• Note state vs combinational elements

[Figure (Fig. 5.1): block diagram connecting the PC, instruction memory (Address, Instruction), registers (three Register # inputs, Data), ALU, and data memory (Address, Data)]

Page 8

Data Path for Instruction Fetching Single Clock Cycle

[Figure (Fig. 5.5): the PC drives the Read address input of the instruction memory, which outputs the Instruction; an adder (Add) increments the PC by 4]
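
A minimal sketch of the fetch step follows, assuming a byte-addressed instruction memory modeled as a Python dictionary. The function name `fetch` and the memory contents are hypothetical.

```python
def fetch(pc, instruction_memory):
    """Return (instruction, pc_plus_4) for a byte-addressed instruction memory."""
    instruction = instruction_memory[pc]     # instruction memory read
    pc_plus_4 = (pc + 4) & 0xFFFFFFFF        # separate adder, wraps at 32 bits
    return instruction, pc_plus_4


# Hypothetical memory contents: two instruction words at addresses 0 and 4.
imem = {
    0: 0x012A4020,  # add $t0, $t1, $t2
    4: 0x8D490008,  # lw  $t1, 8($t2)
}
instr, next_pc = fetch(0, imem)
print(hex(instr), next_pc)
```

The adder is independent of the ALU, which is why the single-cycle design needs it as extra hardware.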

Page 9

Basic Data Path for R-type Instruction Single Clock Cycle

[Figure (Fig. 5.7): the Instruction supplies Read register 1, Read register 2, and Write register to the register file (Registers); Read data 1 and Read data 2 feed the ALU, which outputs ALU result and Zero; the result returns to Write data. Control inputs: RegWrite and a 3-bit ALU operation.]

Orange lines are for control; the controls will be designed later.

Page 10

Adding the Data Path for lw & sw Instruction Single Clock Cycle

[Figure (Fig. 5.9): register file (Read register 1, Read register 2, Write register, Write data, Read data 1, Read data 2), sign extend unit (16 to 32 bits), ALU (ALU result, Zero), and data memory (Address, Write data, Read data). Control inputs: RegWrite, MemRead, MemWrite, and a 3-bit ALU operation. The immediate offset data enters through the sign extend unit.]

Implements:
lw $t1, offset_value($t2)
sw $t1, offset_value($t2)

The offset_value is a 16-bit signed immediate field and must be sign extended to 32 bits.
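
The sign extension and address calculation for lw and sw can be sketched as below. The function names are hypothetical, and the sign-extended value is represented as a signed Python integer rather than a 32-bit pattern.

```python
def sign_extend16(imm16):
    """Interpret the low 16 bits of imm16 as a two's-complement value."""
    imm16 &= 0xFFFF
    return imm16 - 0x10000 if imm16 & 0x8000 else imm16


def effective_address(base, imm16):
    """Memory address used by lw/sw: base register plus sign-extended offset."""
    return (base + sign_extend16(imm16)) & 0xFFFFFFFF


print(sign_extend16(0xFFFC))                    # negative offset
print(hex(effective_address(0x1000, 0xFFFC)))   # base 0x1000 with that offset
```

The addition itself is done by the ALU, which is why the ALU operation lines select "add" for both lw and sw.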

Page 11

Adding the Data Path for beq Instruction Single Clock Cycle

[Figure (Fig. 5.10): register file, sign extend (16 to 32 bits), Shift left 2, an adder (Add) that sums PC + 4 from the instruction datapath with the shifted offset to produce the Branch target (to PC), and the ALU, whose Zero output goes to the branch control logic. Control inputs: RegWrite and a 3-bit ALU operation.]

Implements beq $t1, $t2, offset

The offset is a signed 16-bit immediate field, and thus must be sign extended. In addition, we shift it left by 2 (making the low bits 00) so that it addresses a word boundary.
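
The branch target calculation can be sketched as follows. The helper names are hypothetical, and `zero` stands for the ALU Zero output produced by subtracting the two registers.

```python
def sign_extend16(imm16):
    """Interpret the low 16 bits of imm16 as a two's-complement value."""
    imm16 &= 0xFFFF
    return imm16 - 0x10000 if imm16 & 0x8000 else imm16


def branch_target(pc_plus_4, imm16):
    """PC + 4 plus the sign-extended offset shifted left by 2 (a word offset)."""
    return (pc_plus_4 + (sign_extend16(imm16) << 2)) & 0xFFFFFFFF


def next_pc(pc, imm16, zero):
    """Select the branch target when the ALU Zero output is asserted."""
    pc_plus_4 = (pc + 4) & 0xFFFFFFFF
    return branch_target(pc_plus_4, imm16) if zero else pc_plus_4


print(hex(next_pc(0x100, 3, True)))    # taken: 0x104 + 3 words
print(hex(next_pc(0x100, 3, False)))   # not taken: PC + 4
```

Note that the offset is relative to PC + 4, not to the address of the beq itself.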

Page 12

Putting It All Together Single Clock Cycle

[Figure (Fig. 5.13): combined datapath with PC, instruction memory, register file, sign extend (16 to 32 bits), Shift left 2, ALU, data memory, two adders (PC + 4 and branch target), and three multiplexers. Control lines: RegWrite, MemRead, MemWrite, PCSrc, ALUSrc, MemtoReg, and a 3-bit ALU operation. The PCSrc mux selects the incremented PC (unsuccessful branch) or the beq branch address (successful branch).]

• The j instruction is to be added later.
• Control circuits are needed to drive the control lines in orange.
• Two control units will be designed: ALU Control & “Main Control”.

Page 13

ALU Control Unit Single Clock Cycle

Desired outputs of ALU control unit (inputs to ALU):

ALU control input    ALU function
000                  and
001                  or
010                  add
110                  subtract
111                  set on less than

See ALU design from chapter 4, pp. 238-239.
• The most significant bit of the ALU control input is Bnegate of fig. 4.19.
• The two least significant bits are the “ALU operation” MUX input in fig. 4.17: 00 is “and”, 01 is “or”, 10 is “add”, 11 is “set on less than”.
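
As a behavioral sketch, the five control codes can be modeled like this. It is not the gate-level design from chapter 4: set on less than is written as a direct signed comparison, and overflow is ignored. The names `alu` and `to_signed` are hypothetical.

```python
MASK32 = 0xFFFFFFFF


def to_signed(x):
    """Interpret a 32-bit pattern as a two's-complement integer."""
    x &= MASK32
    return x - (1 << 32) if x & 0x80000000 else x


def alu(control, a, b):
    """Return (result, zero) for the 3-bit ALU control codes in the table."""
    a &= MASK32
    b &= MASK32
    if control == 0b000:
        result = a & b                                   # and
    elif control == 0b001:
        result = a | b                                   # or
    elif control == 0b010:
        result = (a + b) & MASK32                        # add
    elif control == 0b110:
        result = (a - b) & MASK32                        # subtract (Bnegate set)
    elif control == 0b111:
        result = 1 if to_signed(a) < to_signed(b) else 0  # set on less than
    else:
        raise ValueError(f"unsupported ALU control code: {control:03b}")
    return result, result == 0


print(alu(0b110, 7, 7))   # equal operands assert Zero, as beq relies on
```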

Page 14

ALU Control Unit (continued) Single Clock Cycle

• Define an intermediate pair of control lines called ALUop, which partially associates instruction opcodes with ALU control inputs.
• ALUop will be generated by the main controller as an input to the ALU controller.
• The ALU controller will also need the instruction function field as input to do the job. Remember that the instruction function is completely determined by the opcode and function field.
• Theoretically, we could have fed the opcode directly to the ALU control unit rather than ALUop, but the opcode is already decoded in the main controller, so we simply use this result.

Page 15

ALU Control Unit (continued) Single Clock Cycle

Truth table which implements the ALU controller. It completely specifies the ALU controller.
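
The truth table itself is not reproduced in this transcript. The sketch below assumes the usual textbook encoding: ALUop 00 for lw/sw (add), 01 for beq (subtract), and 10 for R-type, where the function field decides. The names `alu_control` and `FUNCT_TO_ALU` are hypothetical, and the don't-care entries of the real table are not modeled.

```python
# MIPS function-field values for the R-type instructions covered here.
FUNCT_TO_ALU = {
    0b100000: 0b010,  # add
    0b100010: 0b110,  # sub
    0b100100: 0b000,  # and
    0b100101: 0b001,  # or
    0b101010: 0b111,  # slt
}


def alu_control(alu_op, funct):
    """Map (ALUop, function field) to the 3-bit ALU control input."""
    if alu_op == 0b00:          # lw / sw: address calculation
        return 0b010
    if alu_op == 0b01:          # beq: compare by subtracting
        return 0b110
    if alu_op == 0b10:          # R-type: the function field decides
        return FUNCT_TO_ALU[funct & 0x3F]
    raise ValueError(f"unsupported ALUop: {alu_op:02b}")


print(format(alu_control(0b10, 0b101010), "03b"))  # slt
```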

Page 16

ALU Control Unit Implementation Single Clock Cycle

Figure from 1st ed of book

Page 17

What We Have So Far Single Clock Cycle

[Figure (Fig. 5.17): the combined datapath with the instruction fields labeled. Instruction [31–0] is split into Instruction [25–21] (Read register 1), Instruction [20–16] (Read register 2, and input 0 of the Write register mux), Instruction [15–11] (input 1 of the Write register mux), Instruction [15–0] (sign extend, 16 to 32 bits), and Instruction [5–0] (ALU control). Control lines: RegDst, RegWrite, ALUSrc, ALUOp, MemRead, MemWrite, MemtoReg, PCSrc. One part of the figure is annotated “just added in”.]
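
The instruction bit fields named in the figure can be extracted with shifts and masks. This is a sketch with hypothetical function names; the field positions are the standard MIPS ones.

```python
def decode(instr):
    """Split a 32-bit MIPS instruction word into the fields used by the datapath."""
    return {
        "opcode": (instr >> 26) & 0x3F,  # Instruction [31-26]
        "rs":     (instr >> 21) & 0x1F,  # Instruction [25-21], Read register 1
        "rt":     (instr >> 16) & 0x1F,  # Instruction [20-16], Read register 2
        "rd":     (instr >> 11) & 0x1F,  # Instruction [15-11]
        "imm16":  instr & 0xFFFF,        # Instruction [15-0], to sign extend
        "funct":  instr & 0x3F,          # Instruction [5-0], to ALU control
    }


def write_register(fields, reg_dst):
    """RegDst mux: rd for R-type (RegDst = 1), rt for lw (RegDst = 0)."""
    return fields["rd"] if reg_dst else fields["rt"]


fields = decode(0x012A4020)   # add $t0, $t1, $t2
print(fields, write_register(fields, 1))
```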

Page 18

Designing the Main Control Unit Single Clock Cycle

Page 19

Designing the Main Control Unit (continued) Single Clock Cycle

Page 20

Designing the Main Control Unit (continued) Single Clock Cycle

Page 21

Main Control Unit Implementation Single Clock Cycle

Figure from 1st ed of book

Page 22

Putting It All Together Again Single Clock Cycle

[Figure (Fig 5.19): the datapath of Fig. 5.17 with the main Control unit added. Instruction [31–26] feeds Control, which generates RegDst, Branch, MemRead, MemtoReg, ALUOp, MemWrite, ALUSrc, and RegWrite. The remaining fields are Instruction [25–21], Instruction [20–16], Instruction [15–11], Instruction [15–0] (sign extend, 16 to 32 bits), and Instruction [5–0] (ALU control). PCSrc is derived from Branch and the ALU Zero output.]

Use this for R-type, memory, & beq instruction scenarios.
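
The main control tables on the preceding slides are not reproduced in this transcript. The sketch below assumes the standard textbook settings for the four instruction classes; don't-care outputs are arbitrarily set to 0, and the names `Control`, `MAIN_CONTROL`, and `pc_src` are hypothetical.

```python
from collections import namedtuple

Control = namedtuple(
    "Control",
    "reg_dst alu_src mem_to_reg reg_write mem_read mem_write branch alu_op",
)

# Keyed by opcode, Instruction [31-26]. Assumed standard settings.
MAIN_CONTROL = {
    0b000000: Control(1, 0, 0, 1, 0, 0, 0, 0b10),  # R-type
    0b100011: Control(0, 1, 1, 1, 1, 0, 0, 0b00),  # lw
    0b101011: Control(0, 1, 0, 0, 0, 1, 0, 0b00),  # sw  (reg_dst, mem_to_reg are don't-cares)
    0b000100: Control(0, 0, 0, 0, 0, 0, 1, 0b01),  # beq (reg_dst, mem_to_reg are don't-cares)
}


def main_control(opcode):
    """Purely combinational: the outputs depend only on the opcode."""
    return MAIN_CONTROL[opcode]


def pc_src(control, zero):
    """PCSrc is Branch ANDed with the ALU Zero output."""
    return bool(control.branch and zero)


print(main_control(0b100011))
```

Since the controller is a pure lookup on the opcode, it needs no state, which is consistent with the earlier point that a single clock cycle controller is pure combinational logic.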

Page 23

Addition of the Unconditional Jump Single Clock Cycle

• We now add one more opcode to our single cycle design:
  – Opcode 2: “j”
  – The format: the op field (bits 31-26) is a “2”
  – The remaining 26 low bits are the immediate target address

• The full 32-bit target address is computed by concatenating:
  – Upper 4 bits of PC+4
  – 26-bit immediate field of the jump instruction
  – Bits 00 in the lowest positions (word boundary)
  – See text chapter 3, p. 150

• An additional control line from the main controller will have to be generated to select this “new” instruction.

• A two-bit shifter is also added to get the two low-order zeros.
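
The concatenation can be sketched with masks and a shift. The function name `jump_target` and the example addresses are hypothetical.

```python
def jump_target(pc_plus_4, instr):
    """Build the 32-bit jump address from PC + 4 and the jump instruction word."""
    upper4 = pc_plus_4 & 0xF0000000        # upper 4 bits of PC + 4
    field26 = instr & 0x03FFFFFF           # 26-bit immediate field
    return upper4 | (field26 << 2)         # shift left 2 supplies the low 00 bits


j_instr = (2 << 26) | 0x0100000            # opcode 2 with a hypothetical target field
print(hex(jump_target(0x00400004, j_instr)))
```

Because only the lower 28 bits come from the instruction, a jump can only reach addresses that share the upper 4 bits of PC + 4.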

Page 24

Final Design with jump Instruction Single Clock Cycle

[Figure (Fig. 5.29): the datapath of Fig 5.19 extended for jump. Instruction [25–0] passes through Shift left 2 (26 to 28 bits) and is combined with PC+4 [31–28] to form Jump address [31–0]; an additional multiplexer selects the jump address for the PC. Control lines from Control (Instruction [31–26]): RegDst, Jump, Branch, MemRead, MemtoReg, ALUOp, MemWrite, ALUSrc, RegWrite. The other fields are Instruction [25–21], Instruction [20–16], Instruction [15–11], Instruction [15–0] (sign extend, 16 to 32 bits), and Instruction [5–0] (ALU control).]