EECS 427 Lecture 6: Project architecture and intro logic styles

EECS 427 F09 Lecture 6 1

Reading: handout, 6.2

Reminders

• CAD3 is due next Wednesday
– You have until Thursday noon to submit your design
• Looking ahead:
– HW3 – Project initial proposal
  • Due Wednesday 10/7
  • Based on answering a series of questions. Template is posted
– Quiz1 Wednesday 10/14, 2.5 weeks away!

Last Time – Logical Effort

• Path effort: $H = GFB$
• Optimal stage effort: $\hat{h} = H^{1/N}$
• Optimal path delay: $\hat{t}_p = \sum_{i=1}^{N} p_i + N H^{1/N}$
• Stage sizing: $C_{out,i} = f_i C_{in,i} = \frac{\hat{h}}{g_i} C_{in,i}$

1. Compute path effort
2. Compute optimal stage effort
3. Add buffers (determine optimal number of stages)
4. Compute fan-out f of each stage
5. Size individual gates (working backward or forward)
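
The sizing steps can be sketched in a few lines of Python. This is a minimal illustration, assuming no branching (B = 1) and capacitances expressed in arbitrary consistent units; the function name `size_path` and the example gate chain are hypothetical and not taken from the lecture.

```python
from math import prod

def size_path(g, c_in, c_load):
    """Size a chain of gates for minimum delay with logical effort.

    g      -- logical effort of each stage, first stage first
    c_in   -- input capacitance of the first stage
    c_load -- load capacitance on the last stage
    Assumes no branching (B = 1). Returns (h, caps), where h is the
    optimal stage effort and caps[i] is the input capacitance of stage i.
    """
    n = len(g)
    path_effort = prod(g) * (c_load / c_in)   # H = G * F, with B = 1
    h = path_effort ** (1.0 / n)              # optimal stage effort
    caps = [0.0] * n
    c_out = c_load
    # Work backward from the load: C_in,i = g_i * C_out,i / h
    for i in range(n - 1, -1, -1):
        caps[i] = g[i] * c_out / h
        c_out = caps[i]
    return h, caps

# Example chain (assumed values): inverter, NAND3, NOR2, inverter, F = 5
h, caps = size_path([1.0, 5 / 3, 5 / 3, 1.0], c_in=1.0, c_load=5.0)
```

Working backward from the load should reproduce the given input capacitance of the first stage, which is a quick sanity check on the computed stage effort.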

Last Time – Logical Effort

• Limitations
– Assumption of P/N = 2
– Ignores internal capacitances
– Simplistic view of stack effect
– Branched path sizes up proportionally
– Does not account for input slope, nor interconnect capacitance effect
– Both R and C scale linearly

Lecture Overview

• Project architecture description (handout)

• Static and dynamic logic styles

Project architecture

• 2-stage pipeline, 1 word per instruction
– 1st stage of pipe: instruction fetch (IF)
– 2nd stage: instruction decode (ID), execute (EX)
• 16-bit words, with four 4-bit components
– Most significant 4 bits are the operation code (opcode)
– Tells which instruction (e.g., ADD, MOV, STOR) is to be performed
– Next 4 bits give the register address to which the result of the instruction should be written (with a few exceptions)
– Next 8 bits can contain several pieces of information:
  • Immediate data to be acted upon (rather than accessing this data from a register location)
  • Opcode extensions (since there are more than 2^4 or 16 ops)
  • Address of source register to draw data from
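
The field layout can be illustrated with a small decoder. This is a sketch only: the opcode and destination-register positions follow the description above, but the placement of the opcode extension in bits 7..4 and the source register in bits 3..0 is an assumption, as are the function and field names.

```python
def decode(word):
    """Split a 16-bit instruction word into its 4-bit fields."""
    if not 0 <= word <= 0xFFFF:
        raise ValueError("instruction must fit in 16 bits")
    return {
        "opcode": (word >> 12) & 0xF,  # most significant 4 bits
        "rdest": (word >> 8) & 0xF,    # destination register address
        "imm8": word & 0xFF,           # low 8 bits read as immediate data
        "ext": (word >> 4) & 0xF,      # assumed position of opcode extension
        "rsrc": word & 0xF,            # assumed position of source register
    }

fields = decode(0x5A3C)
```

Whether the low 8 bits are read as `imm8` or as `ext` plus `rsrc` would depend on the opcode, which the decode stage of the pipeline resolves.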

Example instructions

• Direct vs. immediate instructions
• Add Rsrc Rdest
– Rdest ← Rdest Add Rsrc
– Where Rdest and Rsrc are register addresses
• Add Imm Rdest
– Rdest ← Rdest Add Imm
– Where Imm is 8 bits of data (not an address)
• Typical instructions:
– MOV moves data from 1 reg location to another
– LOAD loads data from memory to the RF
– STOR writes data to memory
– Control flow instructions (conditional branches, jumps, jump and link)
• Look over baseline instructions and extra instructions, think about target application
• Weste 2nd edition handout is useful as overview of a processor architecture (note it does not exactly reflect our own architecture)

Building Blocks for Digital Architectures

• Arithmetic unit
– Bit-sliced datapath (adder, multiplier, shifter, comparator, etc.)
• Memory
– RAM, ROM, buffers, shift registers
• Control
– Finite state machine (PLA, random logic)
– Counters
• Interconnect
– Switches
– Arbiters
– Bus

A Generic Digital Processor

[Figure: block diagram, not recovered]

Bit-Sliced Design

[Figure: four identical bit slices (Bit 3 to Bit 0) between Data-In and Data-Out, each containing a register, adder, shifter, and multiplexer, with a shared control block]

• Tile identical processing elements

Project Ideas from the Past

• A Low-Power Dual-VDD Microprocessor for General Purpose Correlation Applications
– 143 MHz
– Reconfigurable multiplier, customized for correlation algorithms
– Low-power techniques such as dual-Vdd (2.5/1.8V) and clock gating reduced power by 39% without compromising performance

Project Ideas from the Past

• A 200 MHz 16-bit RISC Floating Point DSP for Electrocardiogram Systems
– Floating-point DSP intended for medical instrumentation applications, such as electrocardiogram (ECG)
– Dedicated floating point unit (FPU)
– The test program performs FIR filtering on a sample electrocardiogram signal

Project Ideas

• Memory
– SRAM design, sense amplifier, 6T variants
– Pulse register, sense-amp-based register
• ALU
– Carry look-ahead adder: Kogge-Stone radix 2, radix 4, Brent-Kung, Ling
– Logic styles: PTL, domino, OPL
– Multiplier
• Low power
– Sleep mode, low-VDD, body biasing
• Dedicated processing
– FFT, CORDIC, FIR

Ratioed Logic

[Figure: resistive load RL from VDD to output F, above an NMOS pull-down network (PDN) with inputs In1, In2, In3 connected to VSS]

• N transistors + load
• $V_{OH} = V_{DD}$
• $V_{OL} = \frac{R_{PN}}{R_{PN} + R_L} V_{DD}$
• Asymmetrical response
• Static power consumption
• $t_{pL} = 0.69 R_L C_L$
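
The low output level of a resistive-load gate is a resistive divider between the load and the conducting pull-down network, and the same series path draws static current. A minimal numeric sketch follows; the resistor values and the function name are assumptions chosen only for illustration.

```python
def ratioed_low_output(vdd, r_pdn, r_load):
    """Return (V_OL, static power) for a resistive-load ratioed gate
    while its pull-down network is conducting."""
    if r_pdn < 0 or r_load <= 0:
        raise ValueError("resistances must be positive")
    v_ol = vdd * r_pdn / (r_pdn + r_load)   # resistive divider
    p_static = vdd ** 2 / (r_pdn + r_load)  # VDD across the series path
    return v_ol, p_static

# Assumed example values: VDD = 2.5 V, R_PN = 5 kOhm, R_L = 45 kOhm
v_ol, p_static = ratioed_low_output(vdd=2.5, r_pdn=5e3, r_load=45e3)
```

A larger load resistance lowers both V_OL and the static power, but it also slows the low-to-high transition, since that delay scales with R_L C_L.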

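Pseudo-NMOS

[Figure: PMOS load connected to VDD above NMOS pull-down devices with inputs A, B, C, D; output F with load CL]

$V_{OH} = V_{DD}$ (similar to complementary CMOS)

$k_n \left( (V_{DD} - V_{Tn}) V_{OL} - \frac{V_{OL}^2}{2} \right) = \frac{k_p}{2} \left( V_{DD} - |V_{Tp}| \right)^2$

$V_{OL} = (V_{DD} - V_T) \left[ 1 - \sqrt{1 - \frac{k_p}{k_n}} \right]$ (assuming that $V_T = V_{Tn} = |V_{Tp}|$)

SMALLER AREA & LOAD BUT STATIC POWER DISSIPATION!!!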
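
As a quick check of the pseudo-NMOS result, the closed-form V_OL = (V_DD - V_T)[1 - sqrt(1 - k_p/k_n)] can be evaluated and substituted back into the current-balance equation. The sketch below assumes V_T = V_Tn = |V_Tp|; the threshold value of 0.5 V and the function names are illustrative assumptions.

```python
from math import sqrt

def pseudo_nmos_vol(vdd, vt, kp_over_kn):
    """Closed-form V_OL of a pseudo-NMOS gate, assuming equal thresholds."""
    if not 0 <= kp_over_kn <= 1:
        raise ValueError("k_p/k_n must lie in [0, 1] for a real solution")
    return (vdd - vt) * (1 - sqrt(1 - kp_over_kn))

def current_mismatch(vdd, vt, kp_over_kn, v_ol):
    """NMOS (linear) current minus PMOS (saturated) current, over k_n."""
    i_n = (vdd - vt) * v_ol - v_ol ** 2 / 2
    i_p = kp_over_kn / 2 * (vdd - vt) ** 2
    return i_n - i_p

# Assumed example: VDD = 2.5 V, V_T = 0.5 V, k_p/k_n = 0.25
v_ol = pseudo_nmos_vol(2.5, 0.5, 0.25)
residual = current_mismatch(2.5, 0.5, 0.25, v_ol)
```

A residual near zero indicates that the closed form and the current equation agree; a weaker PMOS (smaller k_p/k_n) gives a lower V_OL at the cost of a slower pull-up.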

Pseudo-NMOS VTC

[Figure: Vout [V] versus Vin [V], with Vin from 0 to 2.5 V, for W/Lp = 4, 2, 1, 0.5, and 0.25]

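DCVSL

[Figure: transistors M1 and M2 connected to VDD, above two complementary NMOS pull-down networks PDN1 and PDN2 connected to VSS; the networks are driven by A, B and their complements and produce Out and its complement]

Differential Cascode Voltage Switch Logic (DCVSL)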

DCVSL Example

[Figure: DCVSL gate with inputs A, B and their complements, producing A.B and its complement; an animation across several slides annotates the node values (0/1) as the inputs switch]

• PMOS stack with NMOS

DCVSL

• Advantages:
– No PMOS duality
  • Lower input cap.
  • Use only NMOS
– Faster than CMOS
– Can evaluate complex logic trees in 1 stage

DCVSL

• Disadvantages:
– Need complementary inputs (dual rail)
– Cross-bar current
  • Sensitive to input timing
– Sizing of PMOS is hard
  • Too large: PDN does not switch the output
  • Too small: slow rise time

Pass-Transistor Logic

[Figure: inputs A, B, and the complement of B drive a switch network that produces Out]

• N transistors
• No static consumption

AND Gate

[Figure: pass-transistor AND gate with inputs A, B, the complement of B, and a constant 0; F = AB]

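NMOS-Only Logic

[Figure: NMOS pass transistor passing In to node x, followed by an inverter producing Out, powered from VDD; device sizes 0.5 µm/0.25 µm, 0.5 µm/0.25 µm, and 1.5 µm/0.25 µm. Transient plot of In, x, and Out: voltage [V] from 0 to 3.0 versus time [ns] from 0 to 2]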

NMOS-Only Switch

[Figure: NMOS pass transistor Mn with gate C = 2.5 V passes A = 2.5 V to node B (load CL); node B drives the inverter formed by M2 and M1]

Threshold voltage loss causes static power consumption

VB does not pull up to 2.5 V, but to 2.5 V - VTN

NMOS has higher threshold than PMOS (body effect)
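
Because the threshold itself rises with the source voltage (body effect), the passed high level V_B = V_DD - V_Tn(V_B) has to be solved self-consistently. The sketch below does this by bisection using the common body-effect model V_Tn = V_T0 + gamma (sqrt(2 phi_F + V_SB) - sqrt(2 phi_F)). The model choice, the parameter values (V_T0 = 0.43 V, gamma = 0.4 V^0.5, 2 phi_F = 0.6 V), and the function name are assumptions for illustration and do not come from the lecture.

```python
from math import sqrt

def passed_high_level(vdd, vt0, gamma, two_phi_f, tol=1e-9):
    """Solve V_B = VDD - V_Tn(V_B) for an NMOS pass transistor."""
    def vtn(v_sb):
        # Threshold voltage including body effect (source at v_sb).
        return vt0 + gamma * (sqrt(two_phi_f + v_sb) - sqrt(two_phi_f))

    lo, hi = 0.0, vdd  # VDD - vtn(v) - v is positive at lo, negative at hi
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if vdd - vtn(mid) - mid > 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

# Assumed device parameters, VDD = 2.5 V
v_b = passed_high_level(vdd=2.5, vt0=0.43, gamma=0.4, two_phi_f=0.6)
```

With these assumed numbers the passed level settles noticeably below V_DD - V_T0, which is why the PMOS of the following inverter may not turn off fully and static current can flow.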

Level Restoration

[Figure: pass transistor Mn passes A, gated by B, to node X; X drives the inverter M2/M1 producing Out; the level restorer Mr connects X to VDD]

• Advantage: Full Swing
• Restorer adds capacitance, takes away pull down current at X
• Ratio problem

Restorer Sizing

[Figure: voltage [V] from 0 to 3.0 versus time [ps] from 0 to 500, for restorer sizes W/Lr = 1.0/0.25, 1.25/0.25, 1.50/0.25, and 1.75/0.25]

• Upper limit on restorer size
• Pass-transistor pull-down can have several transistors in stack

CPL

[Figure: complementary pass-transistor network with inputs A, B and their complements, producing Q and its complement; two animation slides annotate the node values (0, 1, Vdd) for different input combinations]

• Dual rail pass gate logic with differential cascode voltage switch logic
– Combination of DCVS and PTL

CPL

• Difference with DCVSL (advantages):
– Inputs drive both trees
– Better power consumption
– Very fast

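Transmission Gate

[Figure: transmission gate between A and B, controlled by C and its complement, shown with its symbol; example with A = 2.5 V, C = 2.5 V, complement of C = 0 V, charging load CL at B]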

Equivalent Resistance

[Figure: resistance (ohms) versus Vout (V) from 0 to 2.0 for Rn, Rp, and Rn || Rp of a transmission gate with 2.5 V at its input and on the NMOS gate, and 0 V on the PMOS gate]

Transmission Gate XOR

[Figure: transmission-gate XOR with inputs A, B, and the complement of B, output F, built from transistors M1, M2 and the transmission gate M3/M4]

Transmission Gate Network

[Figure: (a) chain of transmission gates from In through nodes V1, ..., Vi-1, Vi, Vi+1, ..., Vn-1, Vn, each gate enabled with 2.5 V and 0 V and each node loaded by C; (b) equivalent RC network with Req per gate and C per node; (c) the RC chain with groups of m sections marked]
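
The equivalent RC network lends itself to an Elmore delay estimate. The recovered slide text does not state a formula, so the sketch below applies the standard Elmore model to a chain of n identical sections, which gives 0.69 Req C n(n+1)/2; the element values and function names are assumptions for illustration.

```python
def chain_delay(n, r_eq, c):
    """Elmore-based propagation delay of n identical RC sections."""
    # Node k (1-based) sees k * r_eq of resistance back to the input.
    elmore = sum(k * r_eq * c for k in range(1, n + 1))
    return 0.69 * elmore

def chain_delay_closed_form(n, r_eq, c):
    """Same delay using the closed-form sum n(n+1)/2."""
    return 0.69 * r_eq * c * n * (n + 1) / 2

# Assumed example values: 16 gates, Req = 8 kOhm, C = 3.6 fF
t16 = chain_delay(16, 8e3, 3.6e-15)
t4 = chain_delay(4, 8e3, 3.6e-15)
```

The delay grows roughly with the square of the chain length, which is the usual motivation for breaking a long chain into shorter buffered segments.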

Dynamic CMOS

• In static circuits at every point in time (except when switching) the output is connected to either GND or VDD via a low resistance path.
– fan-in of n requires 2n (n N-type + n P-type) devices
• Dynamic circuits rely on the temporary storage of signal values on the capacitance of high impedance nodes.
– requires only n + 2 (n+1 N-type + 1 P-type) transistors

Dynamic Gate

[Figure: general dynamic gate with precharge transistor Mp and evaluate transistor Me, both clocked by Clk, around a PDN with inputs In1, In2, In3 and output Out on load CL; example gate with inputs A, B, C and Out = !((AB)+C). A second slide annotates the two phases: Mp on and Me off with Out = 1, then Mp off and Me on]

Two phase operation
• Precharge (Clk = 0)
• Evaluate (Clk = 1)
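
The two-phase behavior can be mimicked with a small behavioral model. This is only a logic-level sketch of the example gate Out = !((AB)+C): it ignores leakage and charge sharing, and the class name and the initial stored value are assumptions.

```python
class DynamicGate:
    """Behavioral model of a dynamic gate computing Out = !((A.B) + C)."""

    def __init__(self):
        self.out = 1  # assumed initial value stored on CL

    def step(self, clk, a, b, c):
        if clk == 0:
            # Precharge: Mp on, Me off, CL is charged to VDD.
            self.out = 1
        elif (a and b) or c:
            # Evaluate: Me on and the PDN conducts, CL is discharged.
            self.out = 0
        # Otherwise the PDN is off and Out keeps its stored value.
        return self.out

gate = DynamicGate()
trace = [
    gate.step(0, 1, 1, 0),  # precharge
    gate.step(1, 0, 0, 0),  # evaluate, PDN off: Out stays high
    gate.step(1, 1, 1, 0),  # inputs rise: Out discharges
    gate.step(1, 0, 0, 0),  # inputs fall again: Out cannot recover
    gate.step(0, 0, 0, 0),  # next precharge restores Out
]
```

The fourth step shows why inputs may make at most one transition during evaluation: once the output has been discharged, only the next precharge can restore it.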

Conditions on Output

• Once the output of a dynamic gate is discharged, it cannot be charged again until the next precharge operation.
• Inputs to the gate can make at most one transition during evaluation.
• Output can be in the high impedance state during and after evaluation (PDN off), state is stored on CL

Properties of Dynamic Gates

• Logic function is implemented by the PDN only
– number of transistors is N + 2 (versus 2N for static complementary CMOS)
• Full swing outputs (VOL = GND and VOH = VDD)
• Non-ratioed - sizing of the devices does not affect the logic levels
• Faster switching speeds
– reduced load capacitance due to lower input capacitance (Cin)
– reduced load capacitance due to smaller output loading (Cout)
– no Isc, so all the current provided by PDN goes into discharging CL

Properties of Dynamic Gates

• Overall power dissipation usually higher than static CMOS
– no static current path ever exists between VDD and GND (including Psc)
– no glitching
– higher transition probabilities
– extra load on Clk
• PDN starts to work as soon as the input signals exceed VTn, so VM, VIH and VIL equal to VTn
– low noise margin (NML)
• Needs a precharge/evaluate clock
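
The "higher transition probabilities" point can be made concrete with a small calculation. The sketch assumes independent, uniformly distributed inputs: a static gate makes a 0 to 1 output transition with probability P0 * P1, while a dynamic gate must be recharged whenever the previous evaluation discharged it, which happens with probability P0. The function name and the choice of a 2-input NOR as the example are illustrative.

```python
from itertools import product

def transition_probabilities(func, n_inputs):
    """Return (static, dynamic) probabilities of a power-consuming
    0 -> 1 output transition, for uniformly random independent inputs."""
    outputs = [func(*bits) for bits in product((0, 1), repeat=n_inputs)]
    p1 = sum(outputs) / len(outputs)  # probability the output is 1
    p0 = 1 - p1                       # probability the output is 0
    static = p0 * p1                  # was 0 last cycle, is 1 now
    dynamic = p0                      # was discharged, must be precharged
    return static, dynamic

def nor2(a, b):
    return int(not (a or b))

static_p, dynamic_p = transition_probabilities(nor2, 2)
```

For the NOR example the dynamic gate switches four times as often as its static counterpart, which illustrates why total power is usually higher even though glitching and short-circuit current are absent.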

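Charge Leakage

[Figure: dynamic gate with precharge transistor Mp, evaluate transistor Me, input A, and load CL, with the leakage sources marked; waveforms of CLK and VOut over the precharge and evaluate phases]

• Dominant component is subthreshold current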

Keeper

[Figure: dynamic gate with precharge transistor Mp, evaluate transistor Me, inputs A and B, load CL, output Out, and a keeper transistor Mkp]

• Same approach as level restorer for pass-transistor logic

Charge Sharing

[Figure: dynamic gate with precharge transistor Mp, evaluate transistor Me, inputs A and B = 0, output load CL, and internal node capacitances CA and CB]

• Charge stored originally on CL is redistributed (shared) over CL and CA leading to reduced robustness

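Charge Sharing

[Figure: dynamic gate with precharge transistor Mp, evaluate transistor Me, stacked transistors Ma (input A) and Mb (input B = 0), internal node X, and capacitances CL, Ca, Cb]

case 1) if $\Delta V_{out} < V_{Tn}$

$C_L V_{DD} = C_L V_{out}(t) + C_a \left( V_{DD} - V_{Tn}(V_X) \right)$

or

$\Delta V_{out} = V_{out}(t) - V_{DD} = -\frac{C_a}{C_L} \left( V_{DD} - V_{Tn}(V_X) \right)$

case 2) if $\Delta V_{out} > V_{Tn}$

$\Delta V_{out} = -V_{DD} \left( \frac{C_a}{C_a + C_L} \right)$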
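
The two charge-sharing cases can be evaluated numerically. The sketch below is a simplification: it treats V_Tn as a constant rather than a function of V_X, compares the magnitude of the case 1 drop against V_Tn to choose the case, and uses made-up capacitance values; the function name is hypothetical.

```python
def charge_sharing_drop(vdd, vtn, c_l, c_a):
    """Return the change in V_out (negative) caused by charge sharing
    between the output load c_l and the internal capacitance c_a."""
    # Case 1: the internal node only charges up to VDD - VTn.
    case1 = -(c_a / c_l) * (vdd - vtn)
    if abs(case1) < vtn:
        return case1
    # Case 2: the two nodes equalize completely.
    return -vdd * c_a / (c_a + c_l)

# Assumed example values: VDD = 2.5 V, VTn = 0.5 V
small_drop = charge_sharing_drop(2.5, 0.5, c_l=50e-15, c_a=10e-15)
large_drop = charge_sharing_drop(2.5, 0.5, c_l=50e-15, c_a=50e-15)
```

Keeping the internal capacitance small relative to the load keeps the drop in case 1, where it scales with the ratio Ca/CL.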

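Domino Logic

[Figure: two cascaded domino stages, each with precharge transistor Mp and evaluate transistor Me clocked by Clk around a PDN; the first has inputs In1, In2, In3 and output Out1, the second has inputs In4, In5, output Out2, and keeper Mkp. Annotated transitions: 1 → 1 or 1 → 0, and 0 → 0 or 0 → 1]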

Cascading Dominos

[Figure: chain of domino stages, each with a PDN driven by inputs Ini and Inj and clocked by Clk]

• Like falling dominos!

Properties of Domino Logic

• Only non-inverting logic can be implemented
• Very high speed
– static inverter can be skewed, only L-H transition
– Input capacitance reduced – smaller logical effort

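Design with Domino Logic

[Figure: two domino stages powered from VDD, each with precharge transistor Mp, a PDN, and evaluate transistor Me clocked by Clk; the first has inputs In1, In2, In3 and output Out1, the second has input In4 and output Out2; transistor Mr is also shown. Annotations: "Inputs = 0 during precharge" and "Can be eliminated!"]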

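Footless Domino

[Figure: chain of n footless domino stages powered from VDD, each with a precharge transistor Mp clocked by Clk, inputs In1, In2, In3, ..., Inn and outputs Out1, Out2, ..., Outn; annotated with alternating 1 → 0 and 0 → 1 transitions along the chain]

• The first gate in the chain needs a foot switch
• Precharge is rippling – short-circuit current
• A solution is to delay the clock for each stage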

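Dual-Rail Domino

[Figure: dual-rail domino gate with inputs A, B and !A, !B, precharge transistors Mp, keepers Mkp, and evaluate transistor Me clocked by Clk, producing Out = AB and Out = !(AB); annotated with 1 → 0 transitions and on/off states]

• Solves the problem of non-inverting logic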

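np-CMOS

[Figure: a dynamic stage with a PDN (inputs In1, In2, In3, transistors Mp and Me, clocked by Clk) produces Out1, which feeds a dynamic stage with a PUN (inputs In4, In5, transistors Mp and Me) producing Out2 (to PDN). Annotated transitions: 1 → 1 or 1 → 0, and 0 → 0 or 0 → 1]

• Only 0 → 1 transitions allowed at inputs of PDN
• Only 1 → 0 transitions allowed at inputs of PUN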

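NORA Logic

[Figure: a dynamic stage with a PDN (inputs In1, In2, In3, transistors Mp and Me, clocked by Clk) produces Out1, which feeds a dynamic stage with a PUN (inputs In4, In5) producing Out2 (to PDN); fan-out annotations "to other PDN's" and "to other PUN's". Annotated transitions: 1 → 1 or 1 → 0, and 0 → 0 or 0 → 1]

WARNING: Very sensitive to noise!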

Summary

• Ratioed logic – improved loads
– Pseudo-NMOS – static current, low noise margin
– DCVSL – no static current but cross-over current
• Pass-transistor circuits – simplified logic
– PTL – threshold drop, causing static current in following gate
– Transmission gate – no threshold drop
– CPL – one side pulls up and the other pulls down
• Dynamic circuits – high performance
– Dynamic logic – non-ratioed, dynamic power only, no static current, higher activity, low noise margin
– Domino logic – can be safely cascaded, only non-inverting logic
– Footless domino – ripple precharge, delayed clock, extra power
– Dual-rail domino
– NP CMOS