Assume array size is 256 (mult: 4ns, add: 2ns) 1 * A[i] B[i] + temp sum Path delay 6ns Cycle time: 6ns Clock rate: 166.6MHz 166.6x10 6 MAC/sec Option-1


Page 1: Option-1

Assume array size is 256 (mult: 4 ns, add: 2 ns)

[Figure: one multiplier (*) with inputs A[i] and B[i] feeds one adder (+); the labels temp and sum mark the accumulator path.]

Path delay: 6 ns
Cycle time: 6 ns
Clock rate: 166.6 MHz
Throughput: 166.6 x 10^6 MAC/sec

Page 2: Option-2

Assume array size is 256 (mult: 4 ns, add: 2 ns)

[Figure, top: the Option-1 datapath, repeated for comparison. Path delay: 6 ns; cycle time: 6 ns; clock rate: 166.6 MHz; 166.6 x 10^6 MAC/sec.]

[Figure, bottom: four multipliers (*) with inputs A[i] B[i] through A[i+3] B[i+3] feed a two-level adder tree, followed by one more adder (+) on the temp/sum accumulator path.]

Path delay: 10 ns
Cycle time: 10 ns
Clock rate: 100 MHz
Throughput: 400 x 10^6 MAC/sec

Page 3: Option-3

Assume array size is 256 (mult: 4 ns, add: 2 ns)

[Figure: 256 multipliers (*) with inputs A[0] B[0] through A[255] B[255] feed an adder tree (8 levels), followed by one more adder (+) on the temp/sum accumulator path.]

Path delay: 4 + 8*2 + 2 = 22 ns
Cycle time: 22 ns
Clock rate: 45.5 MHz
Throughput: 11.64 x 10^9 MAC/sec (256 MACs per 22 ns cycle)
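The path-delay arithmetic for Options 1 to 3 follows one pattern: one multiply, a balanced adder tree, then one accumulate add. The sketch below reproduces that arithmetic under the assumption that the number of parallel multipliers is a power of two; the function name `unpipelined_mac` is illustrative and not from the slides.

```python
import math

MULT_NS = 4.0  # multiplier delay given on the slides
ADD_NS = 2.0   # adder delay given on the slides


def unpipelined_mac(width):
    """Return (path_delay_ns, clock_mhz, macs_per_sec) for `width` parallel multipliers.

    Assumes `width` is a power of two, so the adder tree has log2(width) levels.
    """
    tree_levels = int(math.log2(width))                # 0 levels for a single multiplier
    path_ns = MULT_NS + tree_levels * ADD_NS + ADD_NS  # multiply, reduce, accumulate
    clock_mhz = 1e3 / path_ns                          # 1/ns expressed in MHz
    macs_per_sec = width * 1e9 / path_ns               # `width` MACs complete per cycle
    return path_ns, clock_mhz, macs_per_sec


for name, width in [("Option-1", 1), ("Option-2", 4), ("Option-3", 256)]:
    path_ns, clock_mhz, rate = unpipelined_mac(width)
    print(f"{name}: path {path_ns:.0f} ns, clock {clock_mhz:.1f} MHz, {rate:.3e} MAC/sec")
```

Widening the datapath lowers the clock rate but raises throughput, because the number of MACs per cycle grows faster than the cycle time.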

Page 4: Option-4

Assume array size is 256 (mult: 4 ns, add: 2 ns)

[Figure, top: the Option-1 datapath, repeated for comparison. Path delay: 6 ns; cycle time: 6 ns; clock rate: 166.6 MHz; 166.6 x 10^6 MAC/sec.]

[Figure, bottom: four multipliers (*) with inputs A[i] B[i] through A[i+3] B[i+3], an adder tree, and the sum/temp accumulator path. The 4 ns critical path (one multiplier) implies the datapath is pipelined so that no stage is longer than the multiplier.]

Critical path delay: 4 ns
Cycle time: 4 ns
Clock rate: 250 MHz
Throughput: 10^9 MAC/sec
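For a pipelined datapath the cycle time is set by the slowest stage rather than by the sum of all stages. The sketch below illustrates this for Option-4. The stage split (one multiplier stage followed by three adder stages) is an assumption, since the slide only gives the 4 ns critical path, and register overhead (clock-to-Q and setup time) is ignored.

```python
def pipelined_mac(stage_delays_ns, macs_per_cycle):
    """Return (cycle_ns, clock_mhz, macs_per_sec) for an ideal pipeline.

    The cycle time equals the longest stage delay; register overhead is ignored.
    """
    cycle_ns = max(stage_delays_ns)
    clock_mhz = 1e3 / cycle_ns
    macs_per_sec = macs_per_cycle * 1e9 / cycle_ns
    return cycle_ns, clock_mhz, macs_per_sec


# Assumed stage split: multiply (4 ns), two adder-tree levels and one accumulate (2 ns each).
stages = [4.0, 2.0, 2.0, 2.0]
cycle_ns, clock_mhz, rate = pipelined_mac(stages, macs_per_cycle=4)
print(cycle_ns, clock_mhz, rate)  # 4.0 250.0 1000000000.0
```

Under these assumptions the multiplier is the bottleneck stage, which matches the 4 ns cycle time and 10^9 MAC/sec on the slide.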

Page 5: Exercise: Serial vs. Pipelined

Assume array size is N (add: 2 ns)

[Figure, shown once for each case: four adders (+) with inputs A[i] B[i] through A[i+3] B[i+3] feed a two-level adder tree whose output goes to temp.]

N = 48
• Serial: 12 cycles, cycle time: 6 ns
• Pipelined: 3 + 11 = 14 cycles, cycle time: 2 ns
• Speedup: ~2.57x

N = 100,000
• Serial: 25,000 cycles, cycle time: 6 ns
• Pipelined: 3 + 24,999 = 25,002 cycles, cycle time: 2 ns
• Speedup: ~2.99x
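The cycle counts in this exercise can be reproduced with a few lines of arithmetic. The sketch below assumes N is a multiple of 4 (four element pairs enter per cycle) and three adder levels of 2 ns each; the function names are illustrative.

```python
def serial_run(n, width=4, levels=3, add_ns=2.0):
    """Unpipelined: one group of `width` pairs per cycle; a cycle spans all adder levels."""
    cycles = n // width
    return cycles, cycles * levels * add_ns


def pipelined_run(n, width=4, levels=3, add_ns=2.0):
    """Pipelined: `levels` cycles for the first result, then one more result per cycle."""
    cycles = levels + (n // width - 1)
    return cycles, cycles * add_ns


def speedup(n):
    """Ratio of serial to pipelined total time."""
    return serial_run(n)[1] / pipelined_run(n)[1]


for n in (48, 100_000):
    print(n, serial_run(n), pipelined_run(n), speedup(n))
```

As N grows, the pipeline fill time becomes negligible and the speedup approaches the number of pipeline stages (3), which is why the large-N case lands just under 3x.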

Page 6: Sequential Logic Design: Non-Ideal Flip-Flop Behavior

A flip-flop samples D at the clock edge, so D must be stable when it is sampled. As with a photograph, the subject (D) must hold still around the moment of capture (the clock edge). If it does not, metastability can occur.

• Setup time: tsetup = time before the clock edge during which data must be stable (i.e., not changing)
• Hold time: thold = time after the clock edge during which data must be stable
• Aperture time: ta = time around the clock edge during which data must be stable (ta = tsetup + thold)

[Figure: timing diagram of CLK and D marking tsetup, thold, and ta around the clock edge.]
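The aperture definition can be turned into a simple check: a transition on D is a problem if it falls inside the window that opens tsetup before the clock edge and closes thold after it. The sketch below is illustrative only; the function name and the example times are assumptions (the 60 ps and 70 ps values are borrowed from the timing exercise later in the deck).

```python
def violates_aperture(edge_ps, d_change_ps, tsetup_ps, thold_ps):
    """Return True if D changes strictly inside (edge - tsetup, edge + thold)."""
    return edge_ps - tsetup_ps < d_change_ps < edge_ps + thold_ps


# Clock edge at 1000 ps, tsetup = 60 ps, thold = 70 ps: the window is 940 ps to 1070 ps.
print(violates_aperture(1000, 950, 60, 70))   # True: D changes 50 ps before the edge
print(violates_aperture(1000, 900, 60, 70))   # False: D settles 100 ps before the edge
print(violates_aperture(1000, 1100, 60, 70))  # False: D changes 100 ps after the edge
```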

Page 7: Sequential Logic Design: Non-Ideal Flip-Flop Behavior

• Propagation delay: tpcq = time after the clock edge by which the output Q is guaranteed to be stable (i.e., to have stopped changing)
• Contamination delay: tccq = time after the clock edge at which Q might become unstable (i.e., start changing)

[Figure: timing diagram of CLK and Q marking tccq and tpcq after the clock edge.]

Page 8: Sequential Logic Design: Non-Ideal Flip-Flop Behavior

The delay between registers has a minimum and a maximum value, which depend on the delays of the circuit elements.

[Figure (a): register R1 (output Q1) drives combinational logic CL, which drives register R2 (input D2); both registers are clocked by CLK. Figure (b): timing diagram of CLK, Q1, and D2 over one clock period Tc.]


Page 11: Sequential Logic Design: Non-Ideal Flip-Flop Behavior

The setup time constraint depends on the maximum delay from register R1 through the combinational logic to R2. The input to register R2 must be stable at least tsetup before the clock edge.

[Figure: register R1 (output Q1) drives combinational logic CL, which drives register R2 (input D2); both registers are clocked by CLK. Timing diagram of CLK, Q1, and D2 with the clock period Tc divided into tpcq, tpd, and tsetup.]

Tc ≥ tpcq + tpd + tsetup

tpd ≤ Tc – (tpcq + tsetup)
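Rearranged this way, the setup constraint gives a delay budget for the combinational logic. The sketch below computes that budget; the function name and the example numbers (a 500 ps clock period, 50 ps clock-to-Q delay, 60 ps setup time) are assumptions chosen only for illustration.

```python
def max_logic_delay_ps(tc_ps, tpcq_ps, tsetup_ps):
    """Largest combinational delay tpd that still satisfies Tc >= tpcq + tpd + tsetup."""
    budget = tc_ps - (tpcq_ps + tsetup_ps)
    if budget < 0:
        raise ValueError("clock period is shorter than the register overhead alone")
    return budget


# Hypothetical numbers: Tc = 500 ps, tpcq = 50 ps, tsetup = 60 ps.
print(max_logic_delay_ps(500, 50, 60))  # 390
```

The term tpcq + tsetup is overhead paid on every cycle regardless of how much logic sits between the registers.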

Page 12: Sequential Logic Design: Non-Ideal Flip-Flop Behavior (exercise)

[Figure: circuit with registered inputs A, B, C, and D, internal nodes X' and Y', and registered outputs X and Y; both register banks are clocked by CLK.]

Timing Characteristics
tccq = 30 ps
tpcq = 50 ps
tsetup = 60 ps
thold = 70 ps
tpd = 35 ps (per gate)
tcd = 25 ps (per gate)

Find:
tpd =
tcd =
Setup time constraint: Tc ≥
fc =

Page 13: Sequential Logic Design: Non-Ideal Flip-Flop Behavior (solution)

[Figure: circuit with registered inputs A, B, C, and D, internal nodes X' and Y', and registered outputs X and Y; both register banks are clocked by CLK.]

Timing Characteristics
tccq = 30 ps
tpcq = 50 ps
tsetup = 60 ps
thold = 70 ps
tpd = 35 ps (per gate)
tcd = 25 ps (per gate)

The critical path contains three gate delays:

tpd = 3 x 35 ps = 105 ps

Setup time constraint:

Tc ≥ (50 + 105 + 60) ps = 215 ps

fc = 1/Tc = 4.65 GHz
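The same calculation can be checked numerically. This is a minimal sketch using the timing values from the slide; the helper name `min_clock_period_ps` is illustrative.

```python
def min_clock_period_ps(tpcq_ps, tpd_ps, tsetup_ps):
    """Setup time constraint: Tc >= tpcq + tpd + tsetup."""
    return tpcq_ps + tpd_ps + tsetup_ps


GATE_TPD_PS = 35   # per-gate propagation delay from the slide
CRITICAL_GATES = 3  # gate delays on the critical path, as used on the slide

tpd_ps = CRITICAL_GATES * GATE_TPD_PS                           # 105 ps
tc_ps = min_clock_period_ps(tpcq_ps=50, tpd_ps=tpd_ps, tsetup_ps=60)  # 215 ps
fc_ghz = 1e3 / tc_ps                                            # 1/ps converted to GHz

print(tpd_ps, tc_ps, round(fc_ghz, 2))  # 105 215 4.65
```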

Page 14: Clock Skew

• The clock doesn't arrive at all registers at the same time.
• Skew: the difference between two clock edges.
• Perform worst-case analysis to guarantee that the dynamic discipline is not violated for any register (there are many registers in a system!).

[Figure: register R1 (clocked by CLK1, output Q1) drives combinational logic CL, which drives register R2 (clocked by CLK2, input D2); CLK1 and CLK2 derive from a common CLK through a delay. Timing diagram of CLK1 and CLK2 offset by tskew.]


Page 17: Clock Skew

In the worst case, CLK2 is earlier than CLK1.

[Figure: register R1 (clocked by CLK1, output Q1) drives combinational logic CL, which drives register R2 (clocked by CLK2, input D2). Timing diagram of CLK1, CLK2, Q1, and D2 with the clock period Tc divided into tpcq, tpd, tsetup, and tskew.]

Tc ≥ tpcq + tpd + tsetup + tskew

tpd ≤ Tc – (tpcq + tsetup + tskew)
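Skew simply adds to the per-cycle overhead in the setup constraint. The sketch below shows both directions of the calculation: the minimum period for a given skew, and the skew that a given period can absorb. The function names are illustrative, and the 20 ps skew and 250 ps period are hypothetical values; the other numbers reuse the earlier timing example (tpcq = 50 ps, tpd = 105 ps, tsetup = 60 ps).

```python
def min_period_with_skew_ps(tpcq_ps, tpd_ps, tsetup_ps, tskew_ps):
    """Setup constraint with skew: Tc >= tpcq + tpd + tsetup + tskew."""
    return tpcq_ps + tpd_ps + tsetup_ps + tskew_ps


def max_setup_skew_ps(tc_ps, tpcq_ps, tpd_ps, tsetup_ps):
    """Largest skew the setup constraint tolerates at clock period Tc."""
    return tc_ps - (tpcq_ps + tpd_ps + tsetup_ps)


print(min_period_with_skew_ps(50, 105, 60, 20))  # 235: hypothetical 20 ps skew
print(max_setup_skew_ps(250, 50, 105, 60))       # 35: hypothetical 250 ps period
```

Only the setup-side limit is modeled here, since that is the constraint derived on this slide.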

Page 18: Sequential Logic Design: Metastability

• Violating setup/hold time can lead to a bad situation known as a metastable state.
• Metastable state: any flip-flop state other than a stable 1 or 0.
  • It eventually settles to one or the other, but we don't know which.
• For internal circuits, we can make sure to observe the setup time.
• But what if an input comes from an external (asynchronous) source, e.g., a button press?
• Partial solution: insert a synchronizer flip-flop for the asynchronous input.
  • A special flip-flop with very small setup/hold time.
  • Doesn't completely prevent metastability.

[Figure: timing diagram of clk, D, and Q showing a setup time violation followed by a metastable state; block diagram of asynchronous input ai passing through a synchronizer to produce a.]

Page 19: Sequential Logic Design: Metastability

• One flip-flop doesn't completely solve the problem.
• How about adding more synchronizer flip-flops?
  • Helps, but only decreases the probability of metastability.
• So how do we solve it completely? We can't! This may be unsettling to new designers, but we cannot guarantee a design that will never be metastable. We can only maximize the mean time between failures (MTBF), a number often given along with a circuit.

[Figure: asynchronous input ai passes through a chain of synchronizer flip-flops. The probability of a flip-flop being metastable is low, very low, very very low, and incredibly low at successive flip-flops.]
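The slides do not give a formula for MTBF. A commonly used textbook model, shown here only as a sketch, estimates the probability that a synchronizer output is still metastable after one clock period as (T0/Tc) * exp(-(Tc - tsetup)/tau), where T0 and tau are device-dependent constants. All parameter values below are hypothetical and chosen only to show the trend.

```python
import math


def failure_probability(tc, tsetup, t0, tau):
    """Textbook estimate of P(output still metastable after one clock period).

    All arguments are in seconds; t0 and tau are device-dependent constants.
    """
    return (t0 / tc) * math.exp(-(tc - tsetup) / tau)


def mtbf_seconds(tc, tsetup, t0, tau, changes_per_sec):
    """Mean time between failures for an input that changes `changes_per_sec` times per second."""
    return 1.0 / (failure_probability(tc, tsetup, t0, tau) * changes_per_sec)


# Hypothetical device constants: tsetup = 100 ps, T0 = 150 ps, tau = 200 ps.
for tc in (1e-9, 2e-9, 4e-9):
    print(tc, mtbf_seconds(tc, 1e-10, 1.5e-10, 2e-10, changes_per_sec=1e6))
```

Because the resolution time appears in an exponent, giving the signal more time to settle (a longer clock period or additional synchronizer stages) raises the MTBF very quickly, but the failure probability never reaches zero.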

Page 20: Exercise

The circuit (figure not reproduced) computes the 4-input AND function using 2-input AND gates. Each 2-input AND gate has a propagation delay of 100 ns and a contamination delay of 55 ns. Each flip-flop has a setup time of 30 ns, a hold time of 20 ns, a clock-to-Q maximum delay of 70 ns, and a clock-to-Q minimum delay of 50 ns.

a) If there is no clock skew, what is the maximum operating frequency of the circuit?

b) How much clock skew can the circuit tolerate if it must operate at 2 MHz?
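Because the circuit figure is not available, the number of gates on the critical path is unknown: a balanced tree of 2-input AND gates has two gate levels, while a chain has three. The sketch below therefore takes the gate-level count as a parameter (an assumption, not part of the exercise) and applies only the setup-time constraints derived on the earlier slides. Hold-time limits on skew, which depend on the shortest path, are not modeled.

```python
GATE_TPD_NS = 100.0  # AND-gate propagation delay from the exercise
TPCQ_NS = 70.0       # clock-to-Q maximum delay
TSETUP_NS = 30.0     # flip-flop setup time


def solve_exercise(critical_gate_levels, target_mhz=2.0):
    """Return (tc_min_ns, fmax_mhz, max_setup_skew_ns) for an assumed critical path."""
    tpd_ns = critical_gate_levels * GATE_TPD_NS
    tc_min_ns = TPCQ_NS + tpd_ns + TSETUP_NS   # Tc >= tpcq + tpd + tsetup
    fmax_mhz = 1e3 / tc_min_ns
    tc_target_ns = 1e3 / target_mhz            # 500 ns at 2 MHz
    max_skew_ns = tc_target_ns - tc_min_ns     # tskew <= Tc - (tpcq + tpd + tsetup)
    return tc_min_ns, fmax_mhz, max_skew_ns


for levels, topology in [(2, "balanced tree"), (3, "chain")]:
    print(topology, solve_exercise(levels))
```

For the assumed two-level tree this gives a 300 ns minimum period, and for the assumed three-gate chain a 400 ns minimum period; the actual answer depends on the topology in the original figure.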

Page 21: Exercise-2

Determine the critical path and clock frequency of the following design.

• Assume the setup time of a D flip-flop is 10 ns.
• Assume the delay of a gate is estimated as 1 ns times the number of gate inputs.
• Mux delay = 5 ns
• Adder delay = 20 ns

[Figure: datapath and controller. Labels: 4-bit Adder with inputs A and B; mux inputs 1/0 and -1/1; 4-bit buses; Cnt Reg with clr and ld inputs and output Cnt; control signals cnt_clr, cnt_ld, and sel; State Reg with n1, n0, s1, and s0; inputs up and en.]