ENG241 Digital Design Week #9 Register Transfer and Data Paths


ENG241 Digital Design

Week #9 Register Transfer and Data Paths


Fall 2014 ENG241/Digital Design 2

Week #9 Topics

Data Paths and Operations
The Arithmetic/Logic Unit
Register Transfer Operations
Micro-Operations
Multiplexer-Based Transfer
Bus-Based Transfer
Complete Data Path Design
Pipelining


Resources

Chapter #7, Mano
Sections:
7.2 Register Transfers
7.3 Register Transfer Operations
7.4 VHDL and RTL
7.5 Micro Operations
7.6 Multiplexer Based Transfers
7.8 Bus Based Transfers


Parts of CPUs

Datapath: registers, multiplexers, adders, subtractors, and the logic to perform operations on them (combinational logic)

Control unit: generates signals to control the datapath and accepts status signals to perform sequencing

[Figure: control unit connected to the data path]


Memory and I/O

Control Unit + Data Path + Memory + Input/Output = Microcomputer System

[Figure: control unit and data path connected to memory and to input and output]


Arithmetic/Logic Unit (ALU)

The ALU is a combinational circuit that performs a set of basic arithmetic and logic operations.
An adder can perform addition, subtraction, …
Select lines are used to determine the operation to be performed.


ALU Design using Hierarchy

This ALU has:
2 control lines, S0 and S1, for arithmetic
S2 selects logical ops

Start designing in parts


One Stage ALU
Design a 1-bit arithmetic unit
Design a 1-bit logic unit
Combine the two units to form a 1-bit arithmetic/logic unit
Replicate as many times as needed to form an n-bit ALU


Arithmetic Circuit

The basic component of an arithmetic circuit is an n-bit ripple carry adder (parallel adder).
By controlling the data inputs to the parallel adder, it is possible to obtain different types of arithmetic operations (Cin is also an input).

Select lines S0, S1 can be used to control input Y. Why?


Looking Inside

Table: functionality.
How to design the B input logic?

What possible functionality can I achieve if I control the ‘Y’ Value to the n-bit Adder?


Design of B Select Logic
Use an 8-to-1 mux (straightforward solution).
Or … use a 4-to-1 mux!
Can we do better? YES: simplify the expression from the truth table using a K-map.


1-bit (Single Stage) Arithmetic Circuit

The B logic is nothing but a 2-to-1 Mux instead of the 4-to-1 Mux


4-Bit Circuit

Duplicating the one stage four times will produce a 4-bit circuit
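The slides' function table is not reproduced in this text, so the following is only a sketch under an assumption: the select lines choose Y = 0, B, B' or all ones for S1S0 = 00, 01, 10, 11, which is the arrangement commonly used with this kind of circuit. Under that assumption Y_i = B_i·S0 + B_i'·S1, which is exactly a 2-to-1 mux with B_i as its select input. The function names are illustrative.

```python
def b_input_logic(b_bit: int, s1: int, s0: int) -> int:
    """Assumed B input logic: Y_i = B_i*S0 + B_i'*S1 (a 2-to-1 mux selected by B_i)."""
    return s0 if b_bit else s1


def arithmetic_circuit(a: int, b: int, s1: int, s0: int, cin: int, n: int = 4):
    """Chain n one-bit stages (ripple carry) and return (G, carry_out)."""
    carry = cin
    g = 0
    for i in range(n):
        a_i = (a >> i) & 1
        y_i = b_input_logic((b >> i) & 1, s1, s0)
        total = a_i + y_i + carry          # full adder of stage i
        g |= (total & 1) << i              # sum bit G_i
        carry = total >> 1                 # carry into stage i + 1
    return g, carry


# With A = 5 and B = 3 on a 4-bit circuit:
print(arithmetic_circuit(5, 3, 0, 1, 0))   # A + B
print(arithmetic_circuit(5, 3, 1, 0, 1))   # A + B' + 1, i.e. A - B
```

With two select lines plus Cin, the same adder yields eight operations: transfer, increment, add, add with carry, subtract with borrow, subtract, decrement, and transfer again.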


Logic Section Design

Generous number of operations


Arithmetic/Logic Unit

The logic circuit can be combined with the arithmetic circuit to produce an ALU.

I. Selection variables S1 and S0 can be common to both circuits,

II. A third selection variable S2 can be used to differentiate between the logic and arithmetic operations.


One Stage Arithmetic Circuit


One Stage Logic Circuit


One Stage ALU

Mux to choose Arithmetic or Logic


n-bit ALU

Duplicate the one stage n times!!


Resulting Control

The one stage ALU can provide:
I. 8 arithmetic, and
II. 4 logic operations.
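A rough sketch of how one stage and the final mux might fit together. The slides do not list the four logic operations, so AND, OR, XOR and NOT A are assumed here, and S2 = 1 is assumed to pick the logic unit; the arithmetic part uses the assumed Y = 0, B, B', all-ones input logic. All names are illustrative.

```python
def alu_stage(a: int, b: int, cin: int, s2: int, s1: int, s0: int):
    """One ALU bit slice: returns (G_i, carry_out)."""
    # Arithmetic unit: full adder fed by the assumed B input logic.
    y = s0 if b else s1
    total = a + y + cin
    arith, cout = total & 1, total >> 1
    # Logic unit: 4-to-1 mux over the assumed operations AND, OR, XOR, NOT A.
    logic = (a & b, a | b, a ^ b, 1 - a)[2 * s1 + s0]
    # Output mux: S2 chooses between the two units.
    return (logic if s2 else arith), cout


def alu(a: int, b: int, cin: int, s2: int, s1: int, s0: int, n: int = 4) -> int:
    """Duplicate the one-bit stage n times, chaining the carries."""
    g, carry = 0, cin
    for i in range(n):
        bit, carry = alu_stage((a >> i) & 1, (b >> i) & 1, carry, s2, s1, s0)
        g |= bit << i
    return g


print(bin(alu(0b1100, 0b1010, 0, 1, 0, 0)))  # logic mode, S1S0 = 00
print(alu(5, 3, 0, 0, 0, 1))                 # arithmetic mode, A + B
```

In logic mode the carry chain still ripples, but its value never reaches the output, which is why Cin adds no extra logic operations.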


Register Transfer Language (RTL)

Register Transfer Language (RTL): used to describe CPU organization in high-level terms.

RTL expressions are made up of elements which describe the registers being manipulated and the micro-ops being performed on them.

Here are the basic components of RTL expressions:


Register Transfer Language (RTL)

Registers are named in uppercase: PC, IR (instruction register), R3

The operations on the data in registers are called microoperations


Micro-Operations

Basic operations of the datapath. Examples:

1. Moving data from one register to another
2. Adding the contents of two registers
3. Incrementing the contents of a register

The control unit provides the signals that sequence the micro-operations in a prescribed manner

The results of a currently executing micro-operation may determine both the sequence of control signals and the sequence of future micro-operations to be executed (e.g. BNE)

A micro-operation is expected to complete in one clock cycle


RTL

Transfer from R1 to R2: R2 ← R1

1. R2 is the destination
2. R1 is the source

Conditional: if (K1 = 1) then (R2 ← R1)

K1: R2 ← R1 as a shorter form


Transfer

K1: R2 ← R1
Transfer occurs at the clock edge when K1 is high
Registers are n bits wide
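As a behavioural sketch of K1: R2 ← R1, a register with a load enable can be modelled as below. The `Register` class and its `prepare`/`clock` methods are made up for illustration and do not correspond to any particular HDL construct.

```python
class Register:
    """n-bit register with a load enable, updated only on the clock edge."""

    def __init__(self, width: int, value: int = 0):
        self.mask = (1 << width) - 1
        self.q = value & self.mask      # current output
        self._d = self.q                # value that will be stored at the next edge

    def prepare(self, d: int, load: int) -> None:
        """Present input d; it is captured only if load is high."""
        self._d = (d & self.mask) if load else self.q

    def clock(self) -> None:
        """Positive clock edge: the prepared value becomes the output."""
        self.q = self._d


r1 = Register(8, 0x3C)
r2 = Register(8)

for k1 in (0, 1):
    r2.prepare(r1.q, load=k1)   # K1 drives the load input of R2
    r2.clock()
    print(k1, hex(r2.q))
```

R1 is only read, so the source keeps its contents; the destination changes only on an edge where K1 is 1.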


Symbols

Note memory transfers: DR ← M[AR] (contents of memory)


Syntax not VHDL (similar)


Types of Microoperations

1. Transfer (just covered)
2. Arithmetic
3. Logic
4. Shift


Arithmetic

Basic ops (addition, subtraction, ...): R0 ← R1 + R2

Subtraction by 2’s complement


Notation is Shorthand for Hardware

Consider
X'K1: R1 ← R1 + R2
and
XK1: R1 ← R1 + R2' + 1
(a prime denotes the complement)

Note the overflow and carry registers


Logic Microoperations

The OR notation is a little confusing.

(K1 + K2): R1 ← R1 ∨ R2 shows two types of syntax for OR: + in the control condition and ∨ in the micro-operation.


Shift Microoperations

Here just the basic one-bit shifts

Bit falls off the end, zero shifted in
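A small sketch of the two one-bit shifts on an n-bit register value. The function names are illustrative; the register width n is passed explicitly because Python integers are unbounded.

```python
def shift_left(r: int, n: int) -> int:
    """One-bit left shift of an n-bit value: the MSB falls off, 0 enters at the LSB."""
    return (r << 1) & ((1 << n) - 1)


def shift_right(r: int, n: int) -> int:
    """One-bit right shift of an n-bit value: the LSB falls off, 0 enters at the MSB."""
    return (r & ((1 << n) - 1)) >> 1


r = 0b10010110
print(format(shift_left(r, 8), "08b"))
print(format(shift_right(r, 8), "08b"))
```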


Multiplexer-Based Transfers

There are occasions when a register receives data from two or more different sources at different times.

Recall that multiplexers are used to conditionally transfer values from the input to the output.


Multiplexer-Based Transfers

Consider
if (K1 = 1) then (R0 ← R1) else if (K2 = 1) then (R0 ← R2)

Which can also be expressed as
K1: R0 ← R1, K1'K2: R0 ← R2

Block diagram?


Multiplexer Block Diagram

K1: R0 ← R1, K1'K2: R0 ← R2
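One possible way to realise these two transfers with a 2-to-1 mux in front of R0 is sketched below. The wiring is an assumption, not taken from the figure: K1 drives the mux select (1 picks R1, 0 picks R2) and K1 OR K2 drives the load input of R0.

```python
def next_r0(r0: int, r1: int, r2: int, k1: int, k2: int) -> int:
    """Value held by R0 after the next clock edge."""
    load = k1 | k2                   # R0 loads if either condition is active
    mux_out = r1 if k1 else r2       # K1 has priority, matching the K1'K2 condition
    return mux_out if load else r0   # with load low, R0 keeps its contents


for k1, k2 in ((0, 0), (0, 1), (1, 0), (1, 1)):
    print(k1, k2, next_r0(r0=7, r1=1, r2=2, k1=k1, k2=k2))
```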


Detailed


Bus-Based Transfers

How about when there are lots of registers?

We can use buses and send data over a common set of wires.
Buses are a more efficient scheme for transferring data between registers!


Bus-Based Transfers

A bus is a shared transfer path.
It is characterized by a set of common lines: (i) data, (ii) control, (iii) status.
The control signals for the logic select a single source and one or more destinations on any clock cycle.

[Figure: sources SRC1 and SRC2 and destinations DEST1 and DEST2 attached to a shared bus]


Simple Case: using Muxes!

Signals S1, S0 select the source

Signals L0, L1, L2 enable loading of the registers.

The single bus (on the right) can achieve more transfers than the system on the left!
One mux
One output bus


Transfers

Only a single source
About ½ the hardware
Select/Load signals (table)
Limitations!
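A sketch of one clock cycle on a single mux-based bus with three registers. The select encoding (S1S0 = 00, 01, 10 for R0, R1, R2) is an assumption, since the slide's table is not reproduced here; `bus_cycle` is an illustrative name.

```python
def bus_cycle(regs, s1: int, s0: int, loads):
    """regs = [R0, R1, R2]; loads = (L0, L1, L2). Returns the register values after the edge."""
    source = 2 * s1 + s0
    if source >= len(regs):
        raise ValueError("select code 11 is unused in this sketch")
    bus = regs[source]                      # the one mux places a single source on the bus
    return [bus if load else value for value, load in zip(regs, loads)]


regs = [1, 2, 3]
print(bus_cycle(regs, 0, 1, (1, 0, 1)))     # R0 <- R1 and R2 <- R1 in the same cycle
print(bus_cycle(regs, 1, 0, (1, 0, 0)))     # R0 <- R2
```

The limitation shows up as soon as two different sources are needed in the same cycle: exchanging R0 and R1, for instance, cannot be done in one clock on this bus.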


Three-State Bus

Remember that three-state drivers allow multiple outputs to share a wire.
Note that the small inverted triangle denotes the 3-state output of the register.

A bus can be constructed with three-state buffers.

Many three-state buffer outputs can be connected together to form a bit line of a bus, with less delay than multiplexer-based systems.


Same Example with 3-State

Notice that both systems in the figure have the same capability in terms of transfers.

However, the 3-state bus has:

1. Fewer wires
2. Easier to expand!


Memory Transfers

Usually one or more buses are associated with memory:
Address
Data

Note that memory can be slower, so complex timing may be needed:
Address on one clock cycle
Data latched at a later clock cycle


Properties of Memory

1. Volatile: memory contents disappear if power goes out
Typical computer RAM: Static RAM (SRAM), Dynamic RAM (DRAM)

2. Nonvolatile
ROM
Flash memories
Magnetic memories like disk, tape


Simple View of RAM

Of some word size n
Some capacity 2^k
k bits of address lines
A read line
A write line


Memory Transfer

Read: DR ← M[AR], where M denotes Memory, DR denotes the Data Register, and AR denotes the Address Register

Write: M[AR] ← DR
Write: M[A1] ← D2
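A toy model of the simple RAM view (2^k words of n bits) and of the two memory transfers. The `RAM` class is illustrative only and ignores the timing issues mentioned earlier.

```python
class RAM:
    """2**k words of n bits each."""

    def __init__(self, k: int, n: int):
        self.words = [0] * (2 ** k)
        self.mask = (1 << n) - 1

    def read(self, ar: int) -> int:
        """DR <- M[AR]"""
        return self.words[ar]

    def write(self, ar: int, dr: int) -> None:
        """M[AR] <- DR (only the low n bits fit in a word)"""
        self.words[ar] = dr & self.mask


ram = RAM(k=4, n=8)          # 16 words of 8 bits
ar, dr = 3, 0xAB
ram.write(ar, dr)            # M[AR] <- DR
dr = ram.read(ar)            # DR <- M[AR]
print(len(ram.words), hex(dr))
```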


Memory Transfer


Data Paths --> ALU + Storage

Computer systems often employ a number of storage elements in conjunction with a shared operation unit called an Arithmetic/Logic Unit (ALU) to form a data path.

To perform a micro-operation, the contents of the specified source registers are applied to the inputs of the shared ALU.

The ALU performs an operation, and the result of this operation is transferred to a destination register.


Data Paths, single clock cycle

Since the ALU is designed as a pure combinational circuit, the entire register transfer operation from the source registers, through the ALU, and into the destination register is performed in one clock cycle.


Datapath

A Simple bus-based data path: four registers, an ALU, and a shifter.

Each register is connected to two multiplexers to form ALU input buses A and B (Register File)

Another Mux is used to choose between Registers and a constant.

Functional Unit: ALU and a shifter


Datapath

Blue signals are generated by control

Decoder along with the Load-enable signal determines the destination Register (R0,R1,R2,R3)


Datapath

MB Select determines if the source B is a Register or Constant.

G Select determines the operation to be performed by ALU.

MF Select determines if the output is the ALU or Shifter

MD Select determines if the input to the Register File is the Function Unit or external Data.


Datapath

Four status bits are shown (V,C,N,Z) that can be used by the control unit

It is useful to have certain information based on the results of an ALU operation available for use by the control unit to make decisions:

Make corrections
Skip an instruction
Loops
If/else statements
…
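The slide names the four status bits without defining them; the sketch below assumes the usual meanings (V overflow, C carry out, N sign bit of the result, Z result is zero) for an n-bit addition. The function name is illustrative.

```python
def add_with_status(a: int, b: int, cin: int = 0, n: int = 8):
    """n-bit add returning (result, status dict) under the assumed flag meanings."""
    mask = (1 << n) - 1
    a &= mask
    b &= mask
    total = a + b + cin
    result = total & mask
    c = total >> n                                    # carry out of the MSB
    low = (a & (mask >> 1)) + (b & (mask >> 1)) + cin
    carry_into_msb = low >> (n - 1)
    v = c ^ carry_into_msb                            # signed overflow
    status = {"V": v, "C": c, "N": result >> (n - 1), "Z": int(result == 0)}
    return result, status


print(add_with_status(0x7F, 0x01))          # signed overflow: V = 1, N = 1
print(add_with_status(5, ~5 & 0xFF, 1))     # 5 - 5 via A + B' + 1: Z = 1
```

A branch such as BNE could then be decided by the control unit from Z alone after a subtraction of the two operands.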


Example: R1 ← R2 + R3

Signals?
A, B select
MB Select
G Select
MF Select
MD Select
Destination (D)
Load enable

What about timing?


Timing

All can occur in one clock cycle, but signals must be available in time to:
propagate through the muxes and the ALU, and
be at the register inputs by the next positive edge


Datapath

Higher-level view for hierarchical design

Can replace modules with same interface but different implementation


Performance Improvement

In addition to providing a data path that performs the necessary register transfer micro operations, we need to be concerned about the speed or rate at which the micro operations are performed. How?

I. First we need to know the maximum speed by which our data path can be run.

II. Then we will explore how we can make it faster. (Pipelining)


Pipelining

Pipelining exploits parallelism at the instruction level.

Pipelining is an implementation technique in which multiple instructions are overlapped in execution.

Today pipelining is key to making processors fast.


Pipelining: Example

Laundry: Ann, Brian, Cathy, and Dave each have one load of clothes to wash, dry, and fold

Washer takes 30 minutes

Dryer takes 40 minutes

“Folder” takes 20 minutes

[Figure: four loads A, B, C, D]


Sequential Laundry

Sequential laundry takes 6 hours for 4 loads (90 x 4 = 360 minutes).
If they learned pipelining, how long would laundry take?

[Figure: timeline from 6 PM to midnight; tasks A, B, C, D run in order, each taking 30 + 40 + 20 minutes back to back]


Pipelining Lessons

Total time: 210 minutes, versus 360 with no pipelining

Potential speedup = number of pipe stages

Unbalanced lengths of pipe stages reduce speedup

Time to “fill” the pipeline and time to “drain” it reduce speedup

Pipelining doesn’t help the latency of a single task; it helps the throughput of the entire workload
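The laundry numbers can be checked with a short scheduling sketch. It assumes each machine handles one load at a time and that a finished load may wait for the next machine to become free; the function names are illustrative.

```python
def sequential_time(stages, loads: int) -> int:
    """Each load runs through every stage before the next load starts."""
    return sum(stages) * loads


def pipelined_time(stages, loads: int) -> int:
    """Finish time when a load enters a stage as soon as both the load and the stage are ready."""
    stage_free_at = [0] * len(stages)
    t = 0
    for _ in range(loads):
        t = 0                                   # every load is available at time 0
        for i, duration in enumerate(stages):
            start = max(t, stage_free_at[i])    # wait for the machine if it is busy
            t = start + duration
            stage_free_at[i] = t
    return t


stages = [30, 40, 20]                           # wash, dry, fold (minutes)
seq = sequential_time(stages, 4)
pipe = pipelined_time(stages, 4)
print(seq, pipe, round(seq / pipe, 2))
```

The speedup stays well below the stage count of 3 because the 40-minute dryer sets the pace and only four loads are available to amortise the fill and drain time.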

[Figure: pipelined timeline starting at 6 PM; tasks A, B, C, D overlap, with successive intervals of 30, 40, 40, 40, 40 and 20 minutes]


Assembly Line Analogy to Data Path Pipeline

A custom product being built may pass the assembly line many times before it is completed.

A conveyor belt moves components from stage to stage

This technique increases throughput


Conventional Data Path Timing

The figure shows the maximum delay values for each of the components of a typical data path:

1. 4ns (3ns + 1ns) to read two operands from the register file.
2. 4ns to perform an operation.
3. 4ns (1ns + 3ns) to write the result back.

Total: 12ns to perform a single micro-operation.
The rate of execution is then set at 1/12ns = 83.3MHz.
Can we make it faster?


Pipelined Data Path Timing

We can break the delay of 12ns by inserting registers between the different components of the system.

A register is inserted between the function unit and the register file (OF)

Another register can be inserted between the function unit and MUX D. (EX + WB)

3 stage pipeline: OF / EX / WB

The maximum delay now is 5ns allowing a maximum clock frequency of 200 MHz


Pipelining

3 stages:
Operand Fetch
Execute
Write Back


Pipelining

Conventional data path: 7 x 12ns = 84ns
Pipelined data path: 9 x 5ns = 45ns
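These figures are consistent with running 7 micro-operations: the pipelined count of 9 clock periods matches 7 + 3 - 1 for a 3-stage pipeline. The count of 7 is inferred from the arithmetic rather than stated in the text, so the sketch below treats it as an assumption; the function names are illustrative.

```python
def clock_mhz(period_ns: float) -> float:
    """Clock frequency in MHz for a given clock period in ns."""
    return 1000.0 / period_ns


def conventional_ns(n_ops: int, period_ns: float = 12) -> float:
    """Every micro-operation takes one full (long) clock period."""
    return n_ops * period_ns


def pipelined_ns(n_ops: int, stages: int = 3, period_ns: float = 5) -> float:
    """The first result appears after `stages` periods, then one completes per period."""
    return (n_ops + stages - 1) * period_ns


print(round(clock_mhz(12), 1), clock_mhz(5))
print(conventional_ns(7), pipelined_ns(7))
```

For long runs the fill and drain overhead becomes negligible, so the speedup approaches 12/5 = 2.4 rather than 3, because each 5ns stage is longer than a third of the original 12ns path.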


Summary

Data paths are an essential part of any CPU.
ALUs (Arithmetic Logic Units) are at the heart of any data path.
Multiplexers and tri-state buffers are used extensively in data paths (data movement).
Pipelining is a technique to improve throughput by overlapping instruction execution.
