Design and Layout of a 128-bit Static Random Access Memory

Chirag Agrawal, Benjamin Chai, Abhinav Dubey, Greg Slovin

University of Florida

EEE5322 - VLSI Circuits and Technology

Abstract- This paper presents the design of a 16x8 static random access memory (SRAM) block in a 0.24 um technology. The memory block consists of 128 six-transistor (6T) SRAM cells, a row decoder, a column multiplexer, precharge circuitry, and read and write circuitry. The SRAM cells are organized into 16 rows by 8 columns. The paper explains how each portion of the memory block was designed and presents an SRAM with minimized area and power dissipation. The final SRAM dimensions are 66.84 um x 123.48 um, an area of about 8253 um^2 (8.25 x 10^-3 mm^2). The maximum operating frequency of the SRAM is 300 MHz and the maximum power dissipation of the memory block is 1.37 mW.

I. INTRODUCTION

This SRAM design uses a TSMC 0.24 um deep-submicron technology. The supply voltage is 2.5 V and the SRAM operates at 27 °C. Two input clocks are used in the design. The first, φ2, drives the precharge circuitry and has a 40% duty cycle. The second, φ1, is the main clock; it also has a 40% duty cycle, with 100 ps rise and fall times, and is delayed by 50% with respect to φ2. There are also seven address bits, A7:1, an input data bit, an output data bit, and read and write strobes.
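The clock relationship described above can be checked with a small behavioral sketch. It assumes that "delayed by 50%" means a delay of half a clock period, and it ignores the 100 ps rise and fall times; the function and constant names are illustrative and do not come from the original design files. Time is counted in integer percent of one period so the comparisons are exact.

```python
# Behavioral sketch of the two clock phases (not a circuit simulation).
DUTY_PERCENT = 40         # both clocks have a 40% duty cycle (from the text)
PHI1_DELAY_PERCENT = 50   # assumption: phi1 lags phi2 by half a period


def phi2_high(t_percent: int) -> bool:
    """Precharge clock: high for the first 40% of each period."""
    return t_percent % 100 < DUTY_PERCENT


def phi1_high(t_percent: int) -> bool:
    """Main clock: same duty cycle, shifted by the assumed delay."""
    return (t_percent - PHI1_DELAY_PERCENT) % 100 < DUTY_PERCENT


# Collect every tick of one period where both phases are high together.
overlap = [t for t in range(100) if phi1_high(t) and phi2_high(t)]
print("overlapping ticks:", overlap)
```

Under these assumptions the two phases never overlap, which would fit a scheme in which precharging finishes before a word line is raised.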

The Design Process

The design process began with the six-transistor SRAM bit cell. Precharge and read circuitry were then added to make sure data could be read from the cell. The next step was to add the write driver circuitry. After simulating a write to the cell followed by a read, the column and row decoders were built. The SRAM cell was then replicated 128 times to create the final SRAM memory block, and a simulation was run to confirm that the block performed correctly.

After the final schematic was completed, the layout was started. The layout was created in the same order as the circuitry. It was verified by passing the design rule check (DRC) and matching all parameters in the layout-versus-schematic (LVS) check.

II. ARCHITECTURE

A. SRAM Cell

The SRAM cell uses six transistors: four NMOS and two PMOS. The two PMOS transistors are paired with two NMOS transistors to form two cross-coupled inverters that latch the data value. The other two NMOS transistors are access (pass) transistors that connect the inverters to the bit lines. The transistor sizes were chosen carefully. To minimize area and capacitance, minimum-size transistors were used wherever possible. The NMOS transistors in the inverters had to be the largest in the cell so that their resistance is small, which avoids read upset; their width is 0.42 um. Taking the mobility difference between PMOS and NMOS into account, the PMOS transistors were designed with the minimum width of 0.36 um. To minimize the capacitance on the bit lines and word lines, the two NMOS access transistors also have the minimum width of 0.36 um. All transistors have the minimum gate length of 0.24 um. The 6T SRAM circuitry is shown below in Fig. 1.
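The sizing argument can be summarized with the two width ratios that textbooks commonly use to discuss read and write stability. This is a minimal sketch that assumes all devices share the same gate length, so each ratio reduces to a ratio of widths; the names `cell_ratio` and `pull_up_ratio` are illustrative labels, not terms used in the report.

```python
# Widths quoted in the text, in micrometers.
W_PULL_DOWN_UM = 0.42   # NMOS in the inverters
W_PULL_UP_UM = 0.36     # PMOS in the inverters
W_ACCESS_UM = 0.36      # NMOS access (pass) transistors

# Assumption: equal gate lengths, so (W/L) ratios equal width ratios.
cell_ratio = W_PULL_DOWN_UM / W_ACCESS_UM      # relevant to read stability
pull_up_ratio = W_PULL_UP_UM / W_ACCESS_UM     # relevant to writability

print(f"cell ratio    = {cell_ratio:.2f}")
print(f"pull-up ratio = {pull_up_ratio:.2f}")
```

A cell ratio above 1 means the pull-down device is stronger than the access device, which is the condition the text relies on to avoid read upset.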

Fig. 1. SRAM Cell

B. Precharge Circuitry

The precharge circuitry pulls the bit and bit bar lines high before a read. It is controlled by the secondary clock, φ2. When the secondary clock is high, the bit and bit bar lines must charge to a high value. Three PMOS transistors are used to accomplish this: two pass the high voltage onto the bit lines, and the third equalizes the voltages of the two bit lines for faster clocking and faster reading by the sense amplifier. PMOS transistors are used instead of NMOS transistors because they pass a high voltage without a threshold drop. The two PMOS transistors that charge the bit lines need a large width in order to minimize the voltage drop across them. Eight precharge circuits were made, one for each column. The precharge circuitry is shown below in Fig. 2.

Fig. 2. Precharge Circuit

C. Sense Circuitry

The sense circuitry reads the value stored in the SRAM cell. A clocked sense amplifier was used because it saves power; it detects the voltage difference between the bit and bit bar lines when the SRAM cell is accessed for a read. It is implemented using two PMOS isolation transistors and regenerative feedback circuitry. When the sense enable is high and in phase with the clock, two isolation NMOS transistors turn on to connect the bit and bit bar lines to the sense amplifier. An isolation PMOS transistor and an isolation NMOS transistor in the sense circuitry turn on at the same time, through an inverter, to connect VDD and ground to the amplifier. These isolation transistors save power by keeping the sense amplifier off when it is not in use. All NMOS and PMOS transistors have the minimum width of 0.36 um in order to conserve space in the layout. The sensing circuitry is shown below in Fig. 3.

Fig. 3. Sense Circuitry

D. Write Driver

The write driver circuitry consists of four NMOS transistors and an inverter. When the write strobe is high, the top two NMOS transistors are on, so the bit and bit bar lines are connected to the data line transistors. As a result, one of the two lines is pulled down, depending on the data line bit. All of the transistors in the write driver circuitry have the minimum width of 0.36 um in order to conserve space in the layout. The write driver circuitry is shown below in Fig. 4.
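The write driver behavior can be sketched at the logic level as follows. The polarity is an assumption (writing a 1 pulls bit bar low and leaves bit at its precharged value), and the function name `write_driver` is hypothetical; the report does not state which line is pulled down for which data value.

```python
from typing import Optional, Tuple


def write_driver(write_strobe: int, data: int) -> Tuple[Optional[int], Optional[int]]:
    """Return (bit, bit_bar) as driven by the write driver.

    None means the line is not driven and keeps its precharged value.
    """
    if not write_strobe:
        # Top two NMOS transistors are off: both lines are left alone.
        return (None, None)
    if data:
        # Assumed polarity: the inverted data turns on the bit_bar pull-down.
        return (None, 0)
    # data == 0: the bit line is pulled down instead.
    return (0, None)


print(write_driver(1, 1))
print(write_driver(1, 0))
print(write_driver(0, 1))
```

Only one line is ever pulled low at a time, which matches the description of a driver built purely from NMOS pull-down devices.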

Fig. 4. Write Driver

E. Column Multiplexer

The column multiplexer is built from a 3:8 decoder and pass gates. Using address lines A5:7, it chooses which pair of bit and bit bar lines is connected to the sense circuitry and the write driver circuitry. The decoder logic consists of two-input NOR gates and inverters. Two-input NOR gates are used instead of three-input NOR gates to limit the fan-in capacitance. This may result in more logic circuitry, but the propagation delays are much shorter. Pass gates connect the selected bit and bit bar lines to the sense circuitry and write driver, implementing the multiplexer switch. The inputs were reordered to minimize the propagation delay, taking the probability of each input into account. The column decoder circuitry is shown below in Fig. 5.
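A logic-level sketch of the column selection is given below. It assumes A5 is the least significant column address bit and models the pass gates as a simple selection from a list; both the bit ordering and the function names are illustrative assumptions.

```python
from typing import List, Tuple


def column_select(a5: int, a6: int, a7: int) -> List[int]:
    """3:8 decode: return a one-hot list of eight column enables."""
    index = (a7 << 2) | (a6 << 1) | a5   # assumption: A5 is the LSB
    return [1 if col == index else 0 for col in range(8)]


def column_mux(select: List[int], bit_lines: List[Tuple[int, int]]) -> Tuple[int, int]:
    """Pass-gate model: forward the (bit, bit_bar) pair of the enabled column."""
    return next(pair for sel, pair in zip(select, bit_lines) if sel)


# Eight columns, each holding a (bit, bit_bar) pair; column 5 stores a 1.
lines = [(0, 1)] * 8
lines[5] = (1, 0)
print(column_select(1, 0, 1))
print(column_mux(column_select(1, 0, 1), lines))
```

Exactly one enable is high for any address, so only one pair of bit lines reaches the shared sense and write circuitry.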

Fig. 5. Column Decoder

F. Row Decoder

The row decoder is a 4:16 static decoder implemented with a predecoder stage. The predecoder stage minimizes the transistor count and reduces the propagation delay by a factor of 4. Using address lines A1:4, the decoder chooses which word line to turn on. Each decoder output is ANDed with the primary clock, φ1, which prevents any word line from being high while the clock is low. As a result, no two rows are activated at the same time. Two-input NOR gates are used instead of wider NOR gates in the predecoder stage to limit the fan-in capacitance, and two-input NAND gates are used in the output stage, followed by inverters that act as buffers to drive the large loads. Because PMOS devices have lower mobility, stacking them in series should be avoided as much as possible, so NAND logic is used in the output stage. The row decoder circuitry is shown below in Fig. 6.
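The predecoded structure can be sketched at the logic level as follows. The grouping into two 2:4 predecoders (A1, A2 and A3, A4), the bit ordering, and the exact point at which the clock is gated in are assumptions made for illustration; the report only states that a predecoder stage, two-input NAND gates, inverters, and clock gating are used.

```python
from typing import List


def predecode_2to4(a_low: int, a_high: int) -> List[int]:
    """2:4 predecoder: one-hot over the four combinations of two address bits."""
    index = (a_high << 1) | a_low
    return [1 if i == index else 0 for i in range(4)]


def row_decoder(a1: int, a2: int, a3: int, a4: int, phi1: int) -> List[int]:
    """Return the 16 word lines for address A1:4, gated by the main clock."""
    group_low = predecode_2to4(a1, a2)    # assumed grouping: A1, A2
    group_high = predecode_2to4(a3, a4)   # assumed grouping: A3, A4
    word_lines = []
    for row in range(16):
        # Output stage: two-input NAND followed by an inverting buffer.
        nand_out = 1 - (group_low[row % 4] & group_high[row // 4])
        buffered = 1 - nand_out
        # Clock gating keeps every word line low while phi1 is low.
        word_lines.append(buffered & phi1)
    return word_lines


print(row_decoder(1, 0, 1, 1, phi1=1))
print(row_decoder(1, 0, 1, 1, phi1=0))
```

With the clock high exactly one word line is asserted, and with the clock low none are, which is the behavior the text attributes to the clock gating.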

Fig. 6. Row Decoder

III. PERFORMANCE

A. Overall SRAM Area

The entire SRAM block has a width of 66.84 um and a height of 123.48 um. The entire SRAM therefore has an area of about 8253 um^2 (8.25 x 10^-3 mm^2).
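As a unit check on the area, the product of the two quoted dimensions can be computed directly; the variable names below are illustrative.

```python
WIDTH_UM = 66.84     # block width from the text, in micrometers
HEIGHT_UM = 123.48   # block height from the text, in micrometers

area_um2 = WIDTH_UM * HEIGHT_UM
area_mm2 = area_um2 * 1e-6   # 1 mm^2 = 1e6 um^2

print(f"area = {area_um2:.1f} um^2 = {area_mm2:.5f} mm^2")
```

The result is roughly 8.25 thousand square micrometers, which corresponds to about 0.00825 mm^2.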

B. Read Access Time

The 50% delay from the rising edge of clock phase 1 to the output data transition from 0 to 1 is 492 ps. The 50% delay from the rising edge of clock phase 1 to the output data transition from 1 to 0 is 312 ps.

C. Write Access Time

The 50% delay from the rising edge of clock phase 1 to the final writing of the input data into the memory, from 0 to 1, is 920.9 ps. The 50% delay from the rising edge of clock phase 1 to the final writing of the input data into the memory, from 1 to 0, is 848.4 ps.

D. Power Dissipation

The maximum operating frequency of the SRAM device is 300 MHz. The average current drawn from the voltage source by the SRAM block over 20 clock cycles is 540 uA. With a 2.5 V supply, the power dissipated by the SRAM block over 20 clock cycles is approximately 1.37 mW.
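The power figure follows from the product of supply voltage and average current. The sketch below uses the rounded values quoted in the text; the variable names are illustrative.

```python
SUPPLY_V = 2.5            # supply voltage from the text
AVG_CURRENT_A = 540e-6    # average current over 20 cycles, as quoted

power_w = SUPPLY_V * AVG_CURRENT_A
print(f"average power = {power_w * 1e3:.2f} mW")
```

With the rounded 540 uA figure this evaluates to 1.35 mW, close to the reported 1.37 mW; the small gap may come from rounding of the quoted average current.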

E. Energy-Delay Product

The energy-delay product is the product of the average power dissipated by the SRAM and the square of the shortest clock cycle. The shortest clock cycle is 3.33 ns, the period at the 300 MHz maximum operating frequency. The energy-delay product of the SRAM is therefore 1.4 x 10^-20 W·s^2.
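The energy-delay product can be estimated from the reported figures. This sketch assumes the shortest clock cycle is the reciprocal of the 300 MHz maximum operating frequency (about 3.33 ns) and uses the reported 1.37 mW; the variable names are illustrative.

```python
POWER_W = 1.37e-3    # reported average power dissipation
F_MAX_HZ = 300e6     # reported maximum operating frequency

t_cycle_s = 1.0 / F_MAX_HZ        # assumed shortest cycle, about 3.33 ns
energy_j = POWER_W * t_cycle_s    # energy per cycle
edp = energy_j * t_cycle_s        # energy-delay product, in W*s^2

print(f"cycle  = {t_cycle_s * 1e9:.2f} ns")
print(f"energy = {energy_j * 1e12:.2f} pJ")
print(f"EDP    = {edp:.2e} W*s^2")
```

Under these assumptions the product comes out near 1.5 x 10^-20 W·s^2, the same order of magnitude as the reported 1.4 x 10^-20 W·s^2.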

IV. LAYOUT

The capacitance added to the word lines and bit lines is made up of the gate and diffusion capacitance of the access transistors. The gate capacitance of each access transistor is 0.74 fF and its diffusion capacitance is 0.80 fF. The total capacitance added to the word and bit lines by the two access transistors of a cell is 3.08 fF. The block diagram of the layout is shown below in Fig. 7.
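The quoted total appears to be the per-cell sum over both access transistors, which can be checked as follows. The interpretation that the total counts two transistors per cell is inferred from the arithmetic and is not stated explicitly in the report.

```python
C_GATE_FF = 0.74        # gate capacitance per access transistor
C_DIFFUSION_FF = 0.80   # diffusion capacitance per access transistor
ACCESS_TRANSISTORS_PER_CELL = 2   # a 6T cell has two access transistors

total_ff = ACCESS_TRANSISTORS_PER_CELL * (C_GATE_FF + C_DIFFUSION_FF)
print(f"total added capacitance per cell = {total_ff:.2f} fF")
```

The gate capacitance loads the word line and the diffusion capacitance loads the bit lines, so in a line-by-line loading estimate the two contributions would be accumulated separately along each row and column.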

Fig. 7. SRAM Block Diagram

Layout of the 128 Bit SRAM Memory Block (labeled regions: column mux, SRAM memory block, predecoder, row decoder, sense amplifier and write driver, bit line conditioning)

V. STATIC NOISE MARGINS

The static noise margins measure the hold, read, and write stability of the SRAM cell. The hold static noise margin for the SRAM cell is 1.032. The read static noise margin is 0.298. The write static noise margin is 0.5.

(Read Cycle output)

(Write Cycle output)

VI. CONCLUSION

This project gave us a valuable opportunity to learn the tools of custom IC design and to work through the challenges that arose during implementation. Designing and implementing the SRAM proved to be a demanding and valuable learning experience. Given more time, we would have tried to reduce the area and delay further and to optimize the trade-off between power dissipation, delay, and area.

