Redesign control FSM of a multicycle MIPS processor with low power state encoding


Introduction: Mealy and Moore FSMs

Introduction: The multicycle MIPS controller is a Moore FSM

Random Encoding

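[Figure: state diagram of the control FSM with 11 states under random encoding. Labels: Fetch, Decode, lw/sw, lw, sw, R, B, J, Exception. State codes: 0000 (0), 1010 (10), 1001 (9), 1000 (8), 0010 (2), 0011 (3), 0100 (4), 0101 (5), 0110 (6), 0001 (1), 0111 (7). Edges carry numeric labels between 1 and 3.]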

Optimized encoding

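[Figure: state diagram of the control FSM with 11 states under the optimized encoding. Labels: Fetch, Decode, lw/sw, lw, sw, R, B, J, Exception. State codes: 0000 (0), 0010 (2), 0100 (4), 0110 (6), 0111 (7), 0101 (5), 0011 (3), 1000 (8), 1001 (9), 0001 (1), 1010 (10). Edges carry numeric labels between 1 and 3.]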

Encoding Algorithm

Finding the absolute optimal state encoding is NP-hard, but there are some heuristics:

• One Level Tree (OLT) [1]
• POW3 [2]
• Spanning Tree Based (STB)
• WCEC (Weakly Crossed Edge Cuts Encoding)

[1] P. Bacchetta, L. Daldoss, D. Sciuto and C. Silvano, "Low-Power State Assignment Techniques for Finite State Machines", IEEE International Symposium on Circuits and Systems (ISCAS'00), 2000, pp. II-641-II-644.
[2] L. Benini and G. De Micheli, "State Assignment for Low Power Dissipation", IEEE Journal of Solid-State Circuits, vol. 30, no. 3, 1995, pp. 258-268.
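To make the search problem concrete, the sketch below scores an encoding by the weighted sum of Hamming distances over its transitions and then finds the best assignment by brute force. It is not OLT, POW3, STB, or WCEC; the 4-state ring FSM, the unit weights, and all function names are assumptions chosen only for illustration.

```python
from itertools import permutations


def hamming(a: int, b: int) -> int:
    """Number of bit positions in which two state codes differ."""
    return bin(a ^ b).count("1")


def switching_cost(encoding, transitions):
    """Weighted sum of Hamming distances over (src, dst, weight) transitions."""
    return sum(w * hamming(encoding[s], encoding[d]) for s, d, w in transitions)


def best_encoding(states, transitions, n_bits):
    """Try every assignment of distinct codes to states and keep the cheapest."""
    best = None
    for codes in permutations(range(2 ** n_bits), len(states)):
        enc = dict(zip(states, codes))
        c = switching_cost(enc, transitions)
        if best is None or c < best[0]:
            best = (c, enc)
    return best


# Hypothetical FSM (not the MIPS controller): a ring A -> B -> C -> D -> A.
STATES = ["A", "B", "C", "D"]
TRANSITIONS = [("A", "B", 1), ("B", "C", 1), ("C", "D", 1), ("D", "A", 1)]

binary = {"A": 0, "B": 1, "C": 2, "D": 3}
print("binary counting cost:", switching_cost(binary, TRANSITIONS))

cost, enc = best_encoding(STATES, TRANSITIONS, 2)
print("best cost:", cost, "encoding:", enc)
```

On this toy FSM plain binary counting costs 6 bit flips per loop, while the best assignment (a Gray-like order) costs 4. For 11 states in 4 bits the same loop would have to visit 16!/5! (about 1.7 × 10^11) assignments, which is why heuristics are used in practice.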

Verilog code

• States must be explicitly specified by the user, either by writing the bit pattern directly (e.g., 3'b101) or by defining a parameter (e.g., parameter s3 = 3'b101) and using the parameter as the case item.
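When the encoding comes out of a script, the parameter declarations can be generated rather than typed by hand. The sketch below is one possible helper; the function name and the state names FETCH, DECODE, and EXCEPTION with their codes are hypothetical and are not taken from the original Verilog source.

```python
def verilog_parameters(encoding, width):
    """Return one Verilog 'parameter' declaration per state, one per line."""
    lines = []
    for name, code in encoding.items():
        # Format the code as a zero-padded binary literal, e.g. 3'b101.
        lines.append(f"parameter {name} = {width}'b{code:0{width}b};")
    return "\n".join(lines)


# Hypothetical subset of states and codes, for illustration only.
ENCODING = {"FETCH": 0b0000, "DECODE": 0b0001, "EXCEPTION": 0b1010}
print(verilog_parameters(ENCODING, 4))
```

Each generated name can then be used as a case item in the state register's case statement.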

Part of Verilog code

Ways to further reduce Power

State duplication: switching activity reduced from 1.27 to 1.17

Duplicate Fetch (12 states)

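[Figure: state diagram with the Fetch state duplicated, 12 states in total. Labels: Fetch1, Fetch2, Decode, lw/sw, lw, sw, R, B, J, Exception. State codes: 0000 (0), 0010 (2), 0100 (4), 0110 (6), 0111 (7), 0101 (5), 0011 (3), 1101 (13), 1011 (11), 0001 (1), 1000 (8), 1001 (9). Edges carry numeric labels of 1 or 2.]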

11 states

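[Figure: the original 11-state diagram for comparison. Labels: Fetch, Decode, lw/sw, lw, sw, R, B, J, Exception. State codes: 0000 (0), 0010 (2), 0100 (4), 0110 (6), 0111 (7), 0101 (5), 0011 (3), 1000 (8), 1001 (9), 0001 (1), 1010 (10).]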

One Hot Encoding

• E.g., 4 states need 4 flip-flops.
• Only one flip-flop is active ("hot") in any state.

[Figure: four-state example with one-hot codes 1000 (8), 0001 (1), 0010 (2), 0100 (4)]

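Advantages

• Low power: every state change toggles exactly two flip-flops (one clears, one sets), regardless of how many states there are.
• Simple decoding logic, which greatly reduces the complexity of the combinational logic (CL).
• One-hot is a popular encoding method.
• FPGAs have many more registers than CPLDs, so one-hot is better suited to FPGAs.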
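The toggle count for one-hot codes can be checked with a few lines. This is a minimal sketch with assumed helper names; it generates the one-hot codes for 4 states and collects the Hamming distance between every pair of distinct codes.

```python
def one_hot_codes(n_states):
    """One code per state, each with a single bit set."""
    return [1 << i for i in range(n_states)]


def hamming(a, b):
    """Number of bit positions in which two codes differ."""
    return bin(a ^ b).count("1")


codes = one_hot_codes(4)
print([format(c, "04b") for c in codes])

# Distance between every pair of different one-hot codes.
distances = {hamming(a, b) for a in codes for b in codes if a != b}
print(distances)
```

The set of distances contains only the value 2, so the number of flip-flops that toggle per transition does not depend on which states are involved.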

Clock gating

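[Figure: the 11-state control FSM diagram. Labels: Fetch, Decode, lw/sw, lw, sw, R, B, J, Exception. State codes: 0000 (0), 0010 (2), 0100 (4), 0110 (6), 0111 (7), 0101 (5), 0011 (3), 1000 (8), 1001 (9), 0001 (1), 1010 (10).]

[Figure: D flip-flop (inputs D and clk, output Q) whose clock is controlled by a gating signal]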
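The effect of gating can be illustrated with a small behavioral model. The sketch below is an assumption-laden simplification, not the circuit from the slides: it treats each list element as one clock cycle, lets the flip-flop capture D only when the gating signal is 1, and counts how many clock edges actually reach the flip-flop.

```python
def run_gated_flop(d_values, gate_values, q_init=0):
    """Simulate a D flip-flop whose clock is enabled by a gating signal.

    Returns the Q value after each cycle and the number of clock edges
    that reached the flip-flop.
    """
    q = q_init
    trace = []
    edges_seen = 0
    for d, gate in zip(d_values, gate_values):
        if gate:
            # Clock edge passes through the gate: the flop captures D.
            edges_seen += 1
            q = d
        # With the gate closed the flop sees no edge and holds its value.
        trace.append(q)
    return trace, edges_seen


trace, edges = run_gated_flop([1, 0, 0, 1], [1, 0, 0, 1])
print(trace, edges)
```

In this run only 2 of the 4 clock edges reach the flip-flop, and Q holds its value during the gated cycles.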

[Figures: Binary (optimized), Duplicated, Onehot]

Experiment results (powersim)

Percentages in parentheses are reductions relative to the random encoding.

Metric                        Random    Optimized         Onehot            State duplicate
Average dynamic power (µW)    6.8949    4.6450 (32.6%)    0.5913 (91.4%)    4.7089 (31.7%)
Average glitch power (µW)     6.8711    4.6303 (32.6%)    -0.0053 (??)      5.0878 (26%)
Average leakage power (µW)    2.3153    2.2786            2.2918            2.3993
Total (µW)                    9.2103    6.9236 (24.8%)    2.8831 (68.7%)    7.0083 (23.9%)
Worst delay                   10000     10000             71                10000
Number of gates               93        97                82                98
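The percentages attached to the total power figures can be reproduced from the raw numbers. The sketch below uses the total power values reported above and an assumed helper name; it computes each reduction relative to the random encoding.

```python
RANDOM_TOTAL_UW = 9.2103  # total power of the random encoding, in µW

# Total power of the other three designs, in µW, as reported above.
totals_uw = {"optimized": 6.9236, "onehot": 2.8831, "duplicate": 7.0083}


def reduction_percent(baseline, value):
    """Percentage reduction of value relative to baseline, to one decimal."""
    return round(100.0 * (baseline - value) / baseline, 1)


reductions = {name: reduction_percent(RANDOM_TOTAL_UW, v)
              for name, v in totals_uw.items()}
print(reductions)
```

The computed values match the reported 24.8%, 68.7%, and 23.9%.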

Conclusion: Comparison between Different Encoding Styles

Feature                        Binary                      Gray              One-Hot                Almost One-Hot
Speed                          Slow                        Medium            Fast                   Fast
Hardware                       n bits for 2^n states       Same as binary    n bits for n states    Same as One-Hot
Decoding                       Difficult                   Medium            Easy                   Easy
Unused bits                    Less                        Medium            More                   More
Suitable for devices having    More combinatorial logic    Both              More registers         More registers

Example devices: Max 5000, Altera Flex

Thank You!

Questions?
