VLSI IMPLEMENTATION OF 32KB SLEEPY SRAM THESIS

VLSI Implementation of 32 K-Bit Sleepy SRAM

ACKNOWLEDGEMENT

First, I would like to gratefully acknowledge the enthusiastic supervision of our Principal, Dr. U. P. Waghe sir, for his continuous guidance and inspiration. Then, I would like to express my sincere gratitude and appreciation to our Head of Department, Dr. P. K. Dakhole sir, for his tremendous support, invaluable guidance and constant encouragement during the course of my studies. The completion of this project and thesis would not have been possible without the exceptional supervision and everlasting support of our Project Guide, Mrs. Pradnya P. Zode madam. I am also grateful to her for providing me with various opportunities to pursue a dynamic and fascinating area of electronics as well as to explore opportunities outside the laboratory.

I also wish to thank all the members of YCCE’s electronics department; working with them made my time during graduate study a wonderful experience. My countless and sincere thanks also go to my parents, my family members and my friends for their continuous support and encouragement throughout my studies. Without their companionship, life would not have been the same.

ABSTRACT

Most research on the power consumption of circuits has concentrated on switching power, while the power dissipated by leakage current has received relatively little attention. However, in current VLSI processes, the sub-threshold current has become one of the major contributors to power consumption, especially in high-end memory. To reduce leakage power in SRAM, power gating can be applied; a major power-gating technique uses sleep transistors to control the sub-threshold current.

In this project, dual threshold voltages are adopted: the normal SRAM cells use the lower threshold voltage and the sleep transistors use the higher threshold voltage. The sleep transistors are sized according to the worst-case current and are applied to every block.

We extend this discussion and present results on the advantages of using sleep transistors in terms of delay, area and power reduction.

LIST OF FIGURES

Fig 1. Trend of Leakage Power vs. Technology

Fig 2. Trend of Area Percentage vs. Technology

Fig 3. Basic 6-Transistor SRAM Memory Cell

Fig 4. Arrangement of 32 K-Bit Sleepy SRAM

Fig 5. Diagram of Sleepy SRAM cell

Fig 6. Conceptual Diagram of SRAM Column

Fig 7. Schematic of 6T SRAM cell

Fig 8. Diagram of Sleepy SRAM cell

Fig 9. Implementation of Basic 6-Transistor SRAM Memory Cell

Fig 10. Implementation of Sleepy SRAM Memory Cell

Fig 11. Simulation Result of 6T SRAM Cell

Fig 12. Simulation Result of Sleepy SRAM Cell

LIST OF TABLES

Table 1. Major Controllable Parameters for 32 K-bit Sleepy SRAM

Table 2. Target Values for 32K-bit Sleepy SRAM

Table 3. Tools & Models for 32K-bit Sleepy SRAM

Table 4. Transistor Sizing of Sleepy SRAM Cell

Table 5. Transistor Sizing Data for n-type Sleep Transistor

Table 6. Transistor Sizing Data for p-type Sleep Transistor

Table 7. Transistor Sizing of Sleepy SRAM Cell

Table 8. Transistor Sizing of 6-T SRAM Cell

Table 9. Sleepy SRAM Partition Mode

CONTENTS

CHAPTER 1 - INTRODUCTION
CHAPTER 2 - LITERATURE SURVEY
    2.1 - Why SRAM?
    2.2 - Basic 6-T Memory Cell
    2.3 - 32 K-Bit Sleepy SRAM
CHAPTER 3 - DESIGN ISSUES
CHAPTER 4 - TOOLS
    4.1 - Introduction
    4.2 - S-EDIT (Schematic Edit)
    4.3 - T-EDIT (Simulation Edit)
        4.3.1 - DC Operating point Analysis
        4.3.2 - DC Transfer Analysis
        4.3.3 - Transient Analysis
        4.3.4 - AC Analysis
        4.3.5 - Noise Analysis
    4.4 - W-EDIT (Waveform Edit)
    4.5 - L-EDIT (Layout Edit)
        4.5.1 - L-Edit: An Integrated Circuit Layout Tool
CHAPTER 5 - METHODOLOGY
    5.1 - Basic 6-T Memory Cell
        5.1.1 - Read Operation
        5.1.2 - Write Operation
        5.1.3 - Transistor Sizing
    5.2 - Sleepy SRAM Memory Cell
        5.2.1 - Read Operation
        5.2.2 - Write Operation
        5.2.3 - Transistor Sizing
CHAPTER 6 - IMPLEMENTATION
CHAPTER 7 - RESULTS
CHAPTER 8 - CONCLUSION
CHAPTER 9 - FUTURE SCOPE
APPENDIX

CHAPTER - 1

INTRODUCTION

Modern digital systems require the capability of storing and retrieving large amounts of

information at high speeds. Memories are circuits or systems that store digital information in

large quantity. This chapter addresses the analysis and design of VLSI memories, commonly

known as semiconductor memories. Today, memory circuits come in different forms including

SRAM, DRAM, ROM, EPROM, E2PROM, Flash, and FRAM. While each form has a different

cell design, the basic structure, organization, and access mechanisms are largely the same. In this

chapter, we classify the types of memory, and focus on the static RAM design. This topic is

particularly suitable for our study of CMOS digital design as it allows us to apply many of the

concepts presented earlier.

Recent surveys indicate that roughly 30% of the worldwide semiconductor business is due to memory chips. Over the years, technology advances have been driven by memory designs of higher and higher density. Electronic memory capacity in digital systems ranges from fewer than 100 bits for a simple function to standalone chips containing 256 Mb (1 Mb = 2^20 bits) or more. Circuit designers usually speak of memory capacities in terms of bits, since a separate flip-flop or other similar circuit is used to store each bit. On the other hand, system designers usually state memory capacities in terms of bytes (8 bits); each byte represents a single alphanumeric character. Very large scientific computing systems often have memory capacity stated in terms of words (32 to 128 bits). Each byte or word is stored in a particular location that is identified by a unique numeric address. Memory storage capacity is usually stated in units of kilobytes (K bytes) or megabytes (M bytes). Because memory addressing is based on binary codes, capacities that are integral powers of 2 are most common. Thus the convention is that, for example, 1K byte = 1,024 bytes and 64K bytes = 65,536 bytes. In most memory systems, only a single byte or word at a single address is stored or retrieved during each cycle of memory operation. Dual-port memories are also available that have the ability to read/write two words in one cycle.
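The capacity conventions above can be checked with a few lines of arithmetic. The following Python sketch is only an illustration: the helper name `address_bits` and the bit-addressed organisation assumed for the 32 K-bit example are hypothetical and are not taken from the design described in this thesis.

```python
K = 2 ** 10   # 1K = 1,024 (binary convention used in the text)
M = 2 ** 20   # 1M = 1,048,576


def address_bits(locations: int) -> int:
    """Number of address bits needed to select one of `locations` entries."""
    if locations < 1:
        raise ValueError("locations must be positive")
    return (locations - 1).bit_length()


print(64 * K)                # bytes in 64K bytes
print(256 * M)               # bits in a 256 Mb chip
# Assumed organisation: 32K locations of one bit each.
print(address_bits(32 * K))  # address bits needed for that organisation
```

A word-organised memory of the same capacity would need fewer address bits, since each address would then select several bits at once.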

Power consumption is one of the top concerns of Very Large Scale Integration (VLSI) circuit

design, for which Complementary Metal Oxide Semiconductor (CMOS) is the primary

technology. Today’s focus on low power is not only due to the recent growing demands of mobile applications; even before the mobile era, power consumption was a fundamental problem. To solve the power dissipation problem, many researchers have proposed different

ideas from the device level to the architectural level and above. However, there is no universal

way to avoid tradeoffs between power, delay and area, and thus designers are required to choose

appropriate techniques that satisfy application and product needs. Power consumption of CMOS

consists of dynamic and static components.

Like most sequencing elements in digital systems, memory cells used in on-chip memories can be divided into static and dynamic structures. While a dynamic structure uses a capacitor, a static structure employs cross-coupled inverters to keep the data. Static memories are faster and more stable, but require more area per bit. Dynamic power is consumed when transistors are switching, and static power is consumed regardless of transistor switching. Dynamic power consumption was previously (at 0.18 µm technology and above) the single largest concern for low-power chip designers, since dynamic power accounted for 90% or more of the total chip power. Therefore, many previously proposed techniques, such as voltage and frequency scaling, focused on dynamic power reduction. However, as the feature size shrinks, e.g., to 0.09 µm and 0.065 µm, static power has become a great challenge for current and future technologies. One of the main causes of the increase in leakage power is the increase in sub-threshold leakage. When the technology feature size scales down, the supply voltage and threshold voltage also scale down, and sub-threshold leakage power increases exponentially as the threshold voltage decreases. Furthermore, the structure of short-channel devices lowers the threshold voltage even further.

Developments in embedded memory technology have made large Dynamic Random Access Memories (DRAMs) and Static Random Access Memories (SRAMs) commonplace in today's Systems on Chip (SoCs). Tradeoffs between large and small memories have made all sizes practical, enabling SoCs to resemble board-level systems more than ever. Large embedded memories give an SoC a number of benefits, such as improved bandwidth and considerable performance, that can only be achieved through the use of embedded technologies.

The possibility and success of including embedded DRAM and/or large SRAM blocks in a SoC

depends mainly on manufacturability.

Implementation of embedded DRAMs with a unit cell as small as a single minimum size

transistor and a single trench capacitor offers substantial benefits to a standard-CMOS based

SoC. However, to be effectively beneficial in terms of area and power, a DRAM unit demands a

manufacturing process with high-Vt low-leakage transistors as well as trench capacitors. This

requirement increases the cost of this approach and limits the application of the embedded

DRAMs to specialized SoCs requiring large embedded memory and operating at relatively low

to medium speed. On the other hand, embedded SRAMs are the prominent embedded memories used in today's SoCs. SRAM's integrability with standard CMOS technology gives it ample opportunity to become the largest area consumer in many SoCs, ranging from a high-performance server processor to an HDTV video processor. Unlike DRAMs, SRAMs do not require a data refresh mechanism, because an SRAM cell can store its data indefinitely as long as it is powered. This feature eliminates the complex, area-consuming refresh circuitry in the periphery and makes medium-size SRAM units a feasible choice for implementation in the standard CMOS process.

Complementary metal-oxide semiconductor (CMOS) technology development brings performance enhancement as well as new challenges in VLSI circuit design, such as process variation and increasing transistor leakage. The leakage current is expressed as

$$ I_{leakage} = I_0 \, e^{(V_{gs} - V_{th}) / (\eta V_t)} $$

where

$$ I_0 = \mu_0 C_{ox} (W/L) \, V_t^2 \, e^{1.8} $$

Here V_th is the threshold voltage, V_t is the thermal voltage and η is the sub-threshold swing coefficient. Leakage takes up a growing proportion of the total power in modern VLSI processes as semiconductor devices get smaller and smaller. The following figures show the trend of leakage power across fabrication processes. High-performance VLSI design is steadily required with the development of CMOS technology.
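To give a feel for the exponential dependence on threshold voltage, the following Python sketch evaluates the leakage expression above for the two nMOS threshold voltages listed in Table 1 (0.38 V and 0.5 V). The values of η and the thermal voltage are assumptions chosen only for illustration, and I0 is normalised to 1 because only the ratio is of interest.

```python
import math


def subthreshold_leakage(i0, v_gs, v_th, eta, v_thermal):
    """I_leakage = I0 * exp((Vgs - Vth) / (eta * Vt)); all voltages in volts."""
    return i0 * math.exp((v_gs - v_th) / (eta * v_thermal))


ETA = 1.5           # assumed sub-threshold swing coefficient (not from the text)
V_THERMAL = 0.0259  # assumed thermal voltage near room temperature, in volts
I0 = 1.0            # normalised; only the ratio of the two currents matters

# Transistor turned off (Vgs = 0) with the low and the high threshold voltage.
low_vt = subthreshold_leakage(I0, 0.0, 0.38, ETA, V_THERMAL)
high_vt = subthreshold_leakage(I0, 0.0, 0.50, ETA, V_THERMAL)

print(f"leakage ratio (low Vt / high Vt): {low_vt / high_vt:.1f}")
```

Under these assumptions the low-threshold device leaks more than an order of magnitude more than the high-threshold one, which is the motivation for building the sleep transistors from high-threshold devices.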

Fig 1. Trend of Leakage Power vs. Technology

Fig 2. Trend of Area Percentage vs. Technology

The demand for static random-access memory (SRAM) is increasing with large use of SRAM in

mobile products, System On-Chip (SoC) and high-performance VLSI circuits. As the density of

SRAM increases, the leakage power has become a significant component in chip design.

Various methods have been adopted to reduce the leakage power. In this project, multiple threshold voltages are used, and the sleep transistors are built with the higher threshold voltage. However, the choice of threshold voltages must reflect the characteristics of SRAM. That is, memory is generally a huge cluster of cells, so the performance and cost may depend on how cells are clustered for the higher threshold voltage and on the overall layout. Additionally, the analysis should include the wire model and transistor sizing as well.

Finally we look at the scaling trends in the speed and power of SRAMs with size and technology

and find that the SRAM delay scales as the logarithm of its size as long as the interconnect delay

is negligible. Because threshold mismatches do not scale with the process, the signal swings in the bit lines and data lines do not scale either, which increases the relative delay of an SRAM across technology generations. The wire delay starts becoming important for SRAMs

beyond the 1Mb generations. Across process shrinks, the wire delay becomes worse, and wire

redesign has to be done to keep the wire delay in the same proportion to the gate delay.

Hierarchical SRAM structures have enough space over the array for using fat wires, and these

can be used to control the wire delay for 4Mb and smaller designs across process shrinks.

CHAPTER - 2

LITERATURE SURVEY

Read-write random-access memories (RAM) may store information in flip-flop style circuits or

simply as charge on capacitors. Approximately equal delays are encountered in reading or

writing data. Because read-write memories store data in active circuits, they are volatile; that is,

stored information is lost if the power supply is interrupted. The natural abbreviation for read-

write memory would be RWM. However, pronunciation of this acronym is difficult. Instead, the

term RAM is commonly used to refer to read-write random-access memories. If the terms were

consistent, both read-only and read-write memories would be called RAMs.

The two most common types of RAMs are the static RAM (SRAM) and the dynamic RAM

(DRAM). The static and dynamic definitions are based on the same concepts as those introduced

in earlier chapters. Static RAMs hold the stored value in flip-flop circuits as long as the power is

on. SRAMs tend to be high-speed memories with clock cycles in the range of 5 to 50 ns.

Dynamic RAMs store values on capacitors. They are prone to noise and leakage problems, and

are slower than SRAMs, clocking at 50 ns to 200 ns. However, DRAMs are much denser than SRAMs: up to four times denser in a given generation of technology.

Read-only memories (ROMs) store information according to the presence or absence of

transistors joining rows to columns. ROMs employ the same array organization and have read speeds

comparable to those for read-write memories. All ROMs are nonvolatile, but they vary in the

method used to enter (write) stored data. The simplest form of ROM is programmed when it is

manufactured by formation of physical patterns on the chip; subsequent changes of stored data

are impossible. These are termed mask-programmed ROMs. In contrast, programmable read-

only memories (PROMs) have a data path present between every row and column when

manufactured, corresponding to a stored 1 in every data position. Storage cells are selectively

switched to the 0 state once after manufacture by applying appropriate electrical pulses to

selectively open (blow out) row-column data paths. Once programmed, or blown, a 0 cannot be

changed back to a 1.

Erasable programmable read-only memories (EPROMs) also have all bits initially in one binary

state. They are programmed electrically (similar to the PROM), but all bits may be erased

(returned to the initial state) by exposure to ultraviolet (UV) light. The packages for these

components have transparent windows over the chip to permit the UV irradiation. Electrically

erasable programmable read-only memories (EEPROMs, E2PROM, or E-squared PROMs) may

be written and erased by electrical means. These are the most advanced and most expensive form

of PROM. Unlike EPROMs, which must be totally erased and rewritten to change even a single

bit, E2PROMs may be selectively erased. Writing and erasing operations for all PROMs require

times ranging from microseconds to milliseconds. However, all PROMs retain stored data when

power is turned off; thus they are termed nonvolatile.

A recent form of EPROM and E2PROM is termed Flash memory, a name derived from the fact that blocks of memory may be erased simultaneously. Flash memory of the EPROM form is written using the hot-electron effect¹, whereas E2PROM Flash is written using Fowler-Nordheim (FN) tunneling². Both types are erased using FN tunneling. Their large storage capacity has made this an emerging mass storage medium. In addition, these types of memories are beginning to replace the role of ROMs on many chips, although additional processing is required to manufacture Flash memories in a standard CMOS technology.

Memories based on ferroelectric materials, so-called FRAMs or FeRAMs, can also be designed to retain stored information when power is off. The Perovskite crystal material used in the memory cells of this type of RAM can be polarized in one direction or the other to store the desired value. The polarization is retained even when the power supply is removed, thereby creating a nonvolatile memory. However, semiconductor memories are preferred over ferroelectric memories for most applications because of their advantages in cost, operating speed, and physical size. Recently, FRAMs have been shown to be useful nonvolatile memory in certain applications such as smart cards and may be more attractive in the future due to their extremely high storage density.

¹ Hot electrons are created by applying a high field in the channel region. These electrons enter the oxide and raise the threshold voltage of a device. Devices with this higher threshold voltage are viewed as a stored “1”. Devices with the lower threshold voltage represent a stored “0”.

² Fowler-Nordheim tunneling occurs through thin insulating material such as thin-oxide associated with the gate. Current flows through the oxide by tunneling through the energy barrier.

As microprocessors and other electronics applications get faster and faster, the need for large

quantities of data at very high speeds increases, while providing the data at such high speeds gets

more difficult to accomplish. As microprocessor speeds increase from 25 MHz to 100 MHz, to

250 MHz and beyond, systems designers have become more creative in their use of cache

memory, interleaving, burst mode and other high-speed methods for accessing memory. The old

systems sporting just an on-chip instruction cache, a moderate amount of DRAM and a hard

drive have given way to sophisticated designs using multilevel memory architectures. One of the

primary building blocks of the multi-level memory architecture is the data cache.

Features of SRAM are:

Data is stored as long as supply is applied

Fast – so used where speed is important (e.g., caches)

Differential outputs

Low power consumption

Compatible with CMOS technology

2.1 - Why SRAM ?

SRAM cells are usually used to implement memories that require short access times, low power

dissipation, and tolerance to environmental conditions. There are many reasons to use an SRAM

or a DRAM in a system design. Design tradeoffs include density, speed, volatility, cost, and

features. All of these factors should be considered before you select a RAM for your system

design.

Speed - The primary advantage of an SRAM over a DRAM is its speed. The fastest

DRAMs on the market still require five to ten processor clock cycles to access the first bit

of data. Although features such as EDO and Fast Page Mode have improved the speed

with which subsequent bits of data can be accessed, bus performance and other

limitations mean the processor must wait for data coming from DRAM. Fast,

synchronous SRAMs can operate at processor speeds of 250 MHz and beyond, with

access and cycle times equal to the clock cycle used by the microprocessor. With a well

designed cache using ultra-fast SRAMs, conditions in which the processor has to wait for

a DRAM access become rare.

Density - Because of the way DRAM and SRAM memory cells are designed, readily

available DRAMs have significantly higher densities than the largest SRAMs. Thus,

when 64 Mb DRAMs are rolling off the production lines, the largest SRAMs are

expected to be only 16 Mb.

Volatility - While SRAM memory cells require more space on the silicon chip, they have

other advantages that translate directly into improved performance. Unlike DRAMs,

SRAM cells do not need to be refreshed. This means they are available for reading and

writing data 100% of the time.

Cost - If cost is the primary factor in a memory design, then DRAMs win hands down. If,

on the other hand, performance is a critical factor, then a well-designed SRAM is an

effective cost performance solution.

Custom features - Most DRAMs come in only one or two flavors. This keeps the cost

down, but doesn't help when you need a particular kind of addressing sequence, or some

other custom feature. SRAMs are tailored, via metal and substrate, for the processor or

application that will be using them. Features are connected or disconnected according to

the requirements of the user. Likewise, interface levels are selected to match the

processor levels.

2.2 - Basic 6-T Memory Cell

The memory cell shown here forms the basis for most static random-access memories in

CMOS technology. It uses six transistors to store and access one bit.

The four transistors in the center form two cross-coupled inverters.

Due to the feedback structure, a low input value on the first inverter generates a high input value on the second inverter, whose low output reinforces (and stores) the low value at the input of the first inverter. Similarly, a high input value on the first inverter generates a low input value on the second inverter, which feeds a high value back onto the input of the first inverter. Therefore, the two inverters will store their current logical value, whatever value that is.

The two storage nodes between the inverters are connected to two separate bit lines (BL and BLbar) via two n-channel pass transistors.

SRAM cells conventionally have six transistors, as seen in Fig. 3. Transistors P1, N1, P2 and N2 comprise a pair of cross-coupled CMOS inverters that use positive feedback to store a value.

Transistors N3 and N4 are two pass transistors that allow access to the storage nodes for reading

and writing. To write a value into an SRAM cell, the new value and its complement are driven

on the bit lines, and then the word line is raised. The new value will overwrite the old value,

since the bit lines are actively driven by write circuitry. To read a value from an SRAM cell, the

bit lines are pre-charged high, and then the word line is raised turning on the pass transistors.

Because one of the internal storage nodes is low, one of the bit lines starts discharging. A sense

amplifier, which is connected to the bit lines, senses which of the bit lines is discharging and

reads the stored value.
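The write and read sequences described above can be sketched as a purely logic-level model in Python. This is an illustrative abstraction only: the class name `SixTCell` and its methods are hypothetical, and the model ignores all electrical behaviour (transistor sizing, bit-line capacitance, sense-amplifier thresholds).

```python
class SixTCell:
    """Logic-level sketch of a 6T SRAM cell (not a circuit simulation)."""

    def __init__(self, value: int = 0):
        # The cross-coupled inverters hold complementary storage nodes.
        self.q = value
        self.q_bar = 1 - value

    def write(self, value: int) -> None:
        # Drive the new value and its complement on the bit lines,
        # then raise the word line: the bit lines overwrite the cell.
        bl, bl_bar = value, 1 - value
        self.q, self.q_bar = bl, bl_bar

    def read(self) -> int:
        # Pre-charge both bit lines high, then raise the word line.
        bl, bl_bar = 1, 1
        # The bit line connected to the low storage node discharges.
        if self.q == 0:
            bl = 0
        else:
            bl_bar = 0
        # The sense amplifier detects which bit line discharged.
        return 1 if bl > bl_bar else 0


cell = SixTCell()
cell.write(1)
print(cell.read())
cell.write(0)
print(cell.read())
```

Reading does not change the stored value in this model; in a real cell, keeping the read non-destructive depends on the transistor sizing discussed in Chapter 3.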

The demand for power reduction in SRAM units has driven many researchers toward innovative low-power circuits. The six-transistor cell has been widely recognized as a suitable choice for low-power applications. An SRAM memory cell is a bi-stable flip-flop made up of four to six transistors. The flip-flop may be in either of two states that can be interpreted by the support circuitry to be a 1 or a 0. Many of the SRAMs on the market use a four-transistor cell with a polysilicon load. Suitable for medium to high performance, this design has a relatively high leakage current and consequently a high standby current. Four-transistor designs may also be more susceptible to various types of radiation-induced soft errors. The six-transistor memory cell (also called a six-device cell), by contrast, is highly stable, relatively impervious to soft errors, and has low leakage and standby currents. Although they recognize the superiority of the six-device cell, many industry SRAM producers are migrating to it only slowly, trying to avoid the extra chip real estate it requires.

Fig 3. Basic 6-Transistor SRAM Memory Cell

2.3 – 32 K-Bit Sleepy SRAM

First, the type of SRAM must be determined. SRAMs are roughly divided into two groups: SRAMs that use sense amplifiers and normal SRAMs without them. An SRAM with sense amplifiers handles small signals better and has advantages over the normal type, but it makes handling the threshold voltages more difficult. This project therefore uses a normal 6T SRAM, since our main interest is leakage power reduction using multiple threshold voltages. Many factors affect a 32 K-bit SRAM, but this project focuses on the major parameters that directly affect the metrics we are interested in. The key parameters are listed below.

| Parameter | Value |
|---|---|
| Supply Voltage | 5 V |
| nMOS Threshold Voltage | Vt,HI = 0.5 V, Vt,LO = 0.38 V |
| pMOS Threshold Voltage | Vt,HI = -0.5 V, Vt,LO = -0.38 V |

Table 1. Major Controllable Parameters for 32 K-bit Sleepy SRAM

Fig 4. Arrangement of 32 K-Bit Sleepy SRAM

Another fact that must be considered is that memory is considerably slower than the processor unit. In addition, because memory is a size-critical device, the overall area overhead should be kept at a reasonable level. This is why sleep transistors are applied only partially, not to the whole system. The target values for the 32K-bit SRAM are listed below.

| Gain/Overhead | Target |
|---|---|
| Power Reduction | 40~50% |
| Area Overhead (Leakage Control Transistor) | 10~15% |
| Delay Overhead (Worst-case Delay) | 0% Increased |
| Delay Overhead (Best-case Delay) | 20% Increased |

Table 2. Target Values for 32K-bit Sleepy SRAM

The delay overhead figure might misleadingly suggest that the overall delay is increased by 20%. The targets, however, separate the best and worst cases: the maximum latency remains the same, while the fastest latency may not be preserved. Since performance is generally determined by the worst-case delay, the target can be interpreted as zero delay increase together with a large leakage power reduction. The tools and transistor models used for design, implementation and testing are listed below.

| Item | Value |
|---|---|
| Tool/Simulator | S-Edit |
| Technology | 0.20 µm |
| Transistor Model | tsmc20N, tsmc20P |

Table 3. Tools & Models for 32K-bit Sleepy SRAM

CHAPTER – 3

DESIGN ISSUES

Transistor sizing for the SRAM can be approached in two parts. The first is the basic 6T transistor sizing. For the SRAM cell to function, read and write stability must be guaranteed. For read stability, transistor N1 must be much larger than transistor N5 so that the node between N1 and N5 does not flip. In write mode, the bit lines (BL or BL_b) must overpower the cell with the new value. However, high bit lines must not overpower the inverters during a read operation. This leads to sizing transistor P3 weaker than transistor N5.

Fig 5. Diagram of Sleepy SRAM cell

| Transistor | W/L Ratio |
|---|---|
| N1 | 600nm/200nm |
| N2 | 600nm/200nm |
| N3 | 200nm/200nm |
| N4 | 200nm/200nm |
| N5 | 300nm/200nm |
| N6 | 300nm/200nm |
| P1 | 300nm/200nm |
| P2 | 300nm/200nm |
| P3 | 600nm/400nm |
| P4 | 600nm/400nm |

Table 4. Transistor Sizing of Sleepy SRAM Cell
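The sizing constraints stated earlier in this chapter (N1 larger than N5, P3 weaker than N5) can be compared against the Table 4 values with a short Python sketch. The helper names `aspect_ratio` and `relative_strength` are hypothetical, and the comparison is purely geometric: it ignores mobility and threshold-voltage differences between devices.

```python
# W and L in nanometres, copied from Table 4.
SIZES_NM = {
    "N1": (600, 200), "N2": (600, 200),
    "N3": (200, 200), "N4": (200, 200),
    "N5": (300, 200), "N6": (300, 200),
    "P1": (300, 200), "P2": (300, 200),
    "P3": (600, 400), "P4": (600, 400),
}


def aspect_ratio(name: str) -> float:
    """Return W/L for the named transistor."""
    width, length = SIZES_NM[name]
    return width / length


def relative_strength(a: str, b: str) -> float:
    """Ratio of the W/L values of two transistors (geometry only)."""
    return aspect_ratio(a) / aspect_ratio(b)


print(relative_strength("N1", "N5"))  # read-stability pair named in the text
print(relative_strength("P3", "N5"))  # write-related pair named in the text
```

In W/L terms N1 is twice as large as N5, in line with the read-stability requirement. P3 and N5 have equal W/L, so P3 being the weaker device presumably relies on the lower hole mobility of pMOS transistors, which this geometric ratio does not capture.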

The sleep transistors for pull-up and pull-down network are used to 6T SRAM cell for the

purpose of reducing the leakage current. Once the 6T SRAM sizing is determined, we are able to

start to size the sleep transistors in heuristic way. In sizing sleep transistors, we need to approach

with the following mathematical equations that state SRAM performance with existence of sleep

transistors and leakage current. For n-type MOSFET, when the sleep transistor is used, delay is

increased with VX, the voltage at the node between N1 & N3.

For n-type MOSFET, N1 should be in saturation mode when conducting the maximum current.

$$\tau_{d,n} \propto \frac{C_L V_{DD}}{(V_{DD} - V_{tL,n})^{\alpha}}$$

$$\tau_{d,n}^{sleep} \propto \frac{C_L V_{DD}}{(V_{DD} - V_{X,n} - V_{tL,n})^{\alpha}}$$

If $\Delta_{p,n}$ is the tolerated delay penalty, then

$$\frac{\tau_{d,n}}{\tau_{d,n}^{sleep}} = 1 - \Delta_{p,n}$$

Setting the scaling factor $\alpha = 1$ gives

$$V_{X,n} = \Delta_{p,n}\,(V_{DD} - V_{tL,n})$$

The current flowing through the sleep transistor, which operates in the linear region, is then

$$I_{sleep,n} = \mu_n C_{ox} \left(\frac{W}{L}\right)_{sleep,n} \left[ (V_{DD} - V_{tH,n})\,V_{X,n} - \frac{V_{X,n}^{2}}{2} \right]$$

In a similar fashion, the current through the p-type sleep transistor is found from the following relations, written with voltage magnitudes:

$$\tau_{d,p} \propto \frac{C_L V_{DD}}{(V_{DD} - |V_{tL,p}|)^{\alpha}}$$

$$\tau_{d,p}^{sleep} \propto \frac{C_L V_{DD}}{(V_{X,p} - |V_{tL,p}|)^{\alpha}}$$

$$V_{DD} - V_{X,p} = \Delta_{p,p}\,(V_{DD} - |V_{tL,p}|)$$

$$I_{sleep,p} = \mu_p C_{ox} \left(\frac{W}{L}\right)_{sleep,p} \left[ (-V_{DD} + |V_{tH,p}|)\,(V_{X,p} - V_{DD}) - \frac{(V_{X,p} - V_{DD})^{2}}{2} \right]$$
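The two n-type relations (the virtual-node voltage for α = 1 and the linear-region sleep current) can be evaluated numerically. The sketch below is illustrative only: the function names are hypothetical, the value 2.92 V for VDD - VtL is an assumption inferred from the ratio VX / Δp in the sizing tables that follow, and the values of µCox and VDD - VtH in the last call are placeholders because the text does not state them. The p-type case uses the same expressions with voltage magnitudes.

```python
def virtual_node_voltage(delay_penalty, overdrive_low_vt):
    """V_X = delta_p * (V_DD - V_tL), valid for alpha = 1."""
    return delay_penalty * overdrive_low_vt


def sleep_current(k_prime, wl_sleep, overdrive_high_vt, v_x):
    """Linear-region current: k' (W/L) [ (V_DD - V_tH) V_X - V_X^2 / 2 ].

    k_prime stands for mu * C_ox and overdrive_high_vt for V_DD - V_tH.
    """
    return k_prime * wl_sleep * (overdrive_high_vt * v_x - v_x ** 2 / 2.0)


# ASSUMPTION: V_DD - V_tL = 2.92 V, inferred from V_X / delta_p in the tables.
OVERDRIVE_LOW_VT = 2.92

if __name__ == "__main__":
    for delta_p in (0.197, 0.130, 0.100):
        v_x = virtual_node_voltage(delta_p, OVERDRIVE_LOW_VT)
        print(f"delta_p = {delta_p:.3f} -> V_X = {v_x:.4f} V")
    # Placeholder k' and V_DD - V_tH, chosen only to exercise the formula.
    print(sleep_current(k_prime=1e-4, wl_sleep=1.0,
                        overdrive_high_vt=2.0, v_x=0.5))
```

Under the stated assumption, the computed VX values agree with the VX column of the sizing tables to the printed precision.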

The resulting sizing data for the n-type sleep transistor are as follows.

Type   ∆penalty rate   (W/L)sleep   Icalculated    Imeasured      VX
nMOS   0.197           0.50         6.394E-05 A    6.403E-05 A    5.752E-01 V
nMOS   0.130           1.00         9.181E-05 A    9.206E-05 A    3.796E-01 V
nMOS   0.100           1.50         1.098E-04 A    1.100E-04 A    2.920E-01 V
nMOS   0.081           2.00         1.212E-04 A    1.231E-04 A    2.365E-01 V
nMOS   0.063           3.00         1.443E-04 A    1.420E-04 A    1.840E-01 V
nMOS   0.050           4.00         1.549E-04 A    1.530E-04 A    1.460E-01 V
nMOS   0.042           5.00         1.641E-04 A    1.612E-04 A    1.226E-01 V

Table 5. Transistor Sizing Data for n-type Sleep Transistor


And for the p-type sleep transistor,

Type   ∆penalty rate   (W/L)sleep   Icalculated    Imeasured      VX
pMOS   0.171           0.50         1.356E-04 A    1.337E-04 A    4.993E-01 V
pMOS   0.123           1.00         2.070E-04 A    2.079E-04 A    3.592E-01 V
pMOS   0.089           1.50         2.338E-04 A    2.316E-04 A    2.599E-01 V
pMOS   0.070           2.00         2.506E-04 A    2.513E-04 A    2.044E-01 V
pMOS   0.051           3.00         2.797E-04 A    2.766E-04 A    1.489E-01 V
pMOS   0.039           4.00         2.889E-04 A    2.919E-04 A    1.139E-01 V
pMOS   0.033           5.00         3.076E-04 A    3.002E-04 A    9.636E-02 V

Table 6. Transistor Sizing Data for p-type Sleep Transistor
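One way to judge how well the calculated currents track the measured ones is to compute the relative deviation row by row. The sketch below copies the Icalculated and Imeasured columns from Tables 5 and 6; the function names are illustrative and the choice of relative error against the measured value is an assumption about how the comparison might be done.

```python
# (delta_p, W/L, I_calculated [A], I_measured [A]) copied from Tables 5 and 6.
NMOS_ROWS = [
    (0.197, 0.5, 6.394e-05, 6.403e-05),
    (0.130, 1.0, 9.181e-05, 9.206e-05),
    (0.100, 1.5, 1.098e-04, 1.100e-04),
    (0.081, 2.0, 1.212e-04, 1.231e-04),
    (0.063, 3.0, 1.443e-04, 1.420e-04),
    (0.050, 4.0, 1.549e-04, 1.530e-04),
    (0.042, 5.0, 1.641e-04, 1.612e-04),
]
PMOS_ROWS = [
    (0.171, 0.5, 1.356e-04, 1.337e-04),
    (0.123, 1.0, 2.070e-04, 2.079e-04),
    (0.089, 1.5, 2.338e-04, 2.316e-04),
    (0.070, 2.0, 2.506e-04, 2.513e-04),
    (0.051, 3.0, 2.797e-04, 2.766e-04),
    (0.039, 4.0, 2.889e-04, 2.919e-04),
    (0.033, 5.0, 3.076e-04, 3.002e-04),
]


def relative_error(calculated, measured):
    """Absolute deviation of the calculated value, relative to the measured one."""
    return abs(calculated - measured) / measured


def worst_case_error(rows):
    """Largest relative error over all rows of one table."""
    return max(relative_error(calc, meas) for _, _, calc, meas in rows)


if __name__ == "__main__":
    print(f"nMOS worst case: {worst_case_error(NMOS_ROWS):.2%}")
    print(f"pMOS worst case: {worst_case_error(PMOS_ROWS):.2%}")
```

For the tabulated data the deviation stays within a few percent for both device types.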

For both the n-type and p-type sleep transistors, (W/L) = 1 was selected, because memory is a size-critical device and only SRAM cells capable of tolerating up to a 50% delay penalty will have sleep transistors. In other words, none of the sizings listed in the tables above increases the worst-case delay, so once the delay requirement is met, the transistor size should satisfy the other key requirements, such as area.

The last step is sizing the peripheral transistors of the SRAM. The operation of the SRAM consists essentially of precharging and evaluating. Because each bit line has a large capacitance, the discharging transistors should be large enough to evaluate the signal quickly, and the precharge transistors should be weak so that the write operation works efficiently.


Fig 6. Conceptual Diagram of SRAM Column

Transistor W/L

N1 400nm/200nm

N2 400nm/200nm

N3 400nm/200nm

N4 400nm/200nm

P1 300nm/200nm

P2 300nm/200nm

Table 7. Sizing of SRAM Peripheral Transistor


CHAPTER – 4

TOOLS

4.1 - Introduction

Tanner Tools is a SPICE-based computer analysis package for analog integrated circuits. It consists of the following tools:

1. S-EDIT (Schematic Edit)

2. T-EDIT (Simulation Edit)

3. W-EDIT (Waveforms Edit)

4. L-EDIT (Layout Edit)

Using these tools, the user can design and simulate new ideas in analog integrated circuits before committing to the time-consuming and costly process of chip fabrication.

4.2 - S-EDIT (Schematic Edit)

S-Edit organizes a design as a hierarchy of files, modules and pages. It offers symbol and schematic modes. S-Edit provides facilities for:

1. Beginning a design

2. Viewing, drawing and editing of objects

3. Design connectivity

4. Properties, net lists and simulations

5. Instance and browser schematic and symbol mode

In S-Edit, components available in the library can be selected to build the schematic of the desired circuit. The design process is described in detail, in terms of files, modules and module operations, in [23]. Effective schematic design requires a working knowledge of the S-Edit design hierarchy. S-Edit design files consist of modules.


A module is a functional unit of design, such as a transistor, a gate or an amplifier. Modules contain two kinds of components:

a) Primitives – Geometrical objects created with drawing tools.

b) Instances – References to other modules in file. The instanced module is the original.

Two viewing modes of the S-Edit are-

a) Schematic mode – This mode helps in creating or viewing a schematic.

b) Symbol mode – It represents symbol of a larger functional unit such as operational

amplifier.

4.3 - T-EDIT (Simulation Edit)

The heart of T-Spice operation is the input file (also known as the circuit description, the netlist or the input deck). This is a plain text file that contains the device statements and simulation commands, drawn from the SPICE circuit description language, with which T-Spice constructs a model of the circuit to be simulated. Input files can be created and modified with any text editor.

T-Spice is a tool used for simulation of the circuit. It provides the facility of

a) Design Simulation

b) Simulation commands

c) Device Statements

d) User-defined External Models

e) Small Signal and noise models

T-Spice uses Kirchhoff's Current Law (KCL) to solve circuit problems. To T-Spice, a circuit is a set of devices attached to nodes. The voltages at all nodes represent the circuit state. T-Spice solves for a set of node voltages that satisfies KCL (implying that the sum of the currents flowing into each node is zero).

To evaluate whether a set of node voltages is a solution, T-Spice computes and sums all the currents flowing out of each device into the nodes connected to it (its terminals). The relationship between the voltages at the device terminals and the currents through the terminals is determined by the device model. For example, the device model for a resistor of resistance R is

I = ∆V/R

where ∆V represents the voltage difference across the device.
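The idea of testing a candidate node voltage against KCL can be illustrated on a two-resistor divider. The sketch below is not T-Spice's actual solver: the circuit, the function names and the use of bisection are assumptions chosen to keep the example short. It only shows that the solution is the node voltage at which the summed device currents vanish.

```python
def resistor_current(v_from, v_to, resistance):
    """Resistor device model, I = dV / R, for current flowing from v_from to v_to."""
    return (v_from - v_to) / resistance


def kcl_residual(v_node, v_source, r1, r2):
    """Net current flowing into the middle node of a resistive divider.

    R1 connects the source to the node and R2 connects the node to ground.
    A residual of zero means the candidate voltage satisfies KCL.
    """
    i_in = resistor_current(v_source, v_node, r1)
    i_out = resistor_current(v_node, 0.0, r2)
    return i_in - i_out


def solve_node(v_source, r1, r2, tol=1e-12):
    """Find the node voltage by bisection on the KCL residual.

    The residual decreases monotonically with v_node, so the root lies
    between ground and the source voltage.
    """
    lo, hi = 0.0, v_source
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        if kcl_residual(mid, v_source, r1, r2) > 0.0:
            lo = mid  # net current still flows in: the node voltage is too low
        else:
            hi = mid
    return (lo + hi) / 2.0


if __name__ == "__main__":
    v = solve_node(v_source=3.0, r1=1e3, r2=2e3)
    print(f"node voltage = {v:.6f} V")
```

For a 3 V source with R1 = 1 kΩ and R2 = 2 kΩ the residual vanishes at 2 V, the value given by the voltage-divider formula.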

4.3.1 - DC Operating point Analysis

DC operating point analysis finds a circuit's steady-state condition, obtained (in principle) after the input voltages have been applied for an infinite amount of time. In the inverter example input file invert1.sp, the .include command causes T-Spice to read in the contents of the model file ml2_125.md for the evaluation of transistors m1n and m1p. This file (which must be in the same directory as invert1.sp) consists of two .model commands, describing two MOSFET models called nmos and pmos:

.model nmos nmos

+ Level=2 Ld=0.0u Tox=225.00E-10

+ Nsub=1.066E+16 Vto=0.622490 Kp=6.326640E-05

+ Gamma=0.639243 Phi=0.31 Uo=1215.74

+ Uexp=4.612355E-2 Ucrit=174667 Delta=0.0

+ Vmax=177269 Xj=0.9u Lambda=0.0

+ Nfs=4.55168E+12 Neff=4.68830 Nss=3.00E+10

+ Tpg=1.000 Rsh=60 Cgso=2.89E-10

+ Cgdo=2.89E-10 Cj=3.27E-04 Mj=1.067

+ Cjsw=1.74E-10 Mjsw=0.195

.model pmos pmos

+ Level=2 Ld=0.03000u Tox=225.000E-10

+ Nsub=6.575441E+16 Vto=-0.63025 Kp=2.635440E-05

+ Gamma=0.618101 Phi=0.541111 Uo=361.941

+ Uexp=8.886957E-02 Ucrit=637449 Delta=0.0

+ Vmax=63253.3 Xj=0.112799u Lambda=0.0

+ Nfs=1.668437E+11 Neff=0.64354 Nss=3.00E+10

+ Tpg=-1.000 Rsh=150 Cgso=3.35E-10

+ Cgdo=3.35E-10 Cj=4.75E-04 Mj=0.341

+ Cjsw=2.23E-10 Mjsw=0.307


ml2_125.md assigns values to various Level 2 MOSFET model parameters for both n-type and

p-type devices. When read by the input file, these parameters are used to evaluate Level 2

MOSFET model equations, and the results are used to construct internal tables of current and

charge values. Values read or interpolated from these tables are used in the computations called

for by the simulation. Two transistors, m1n and m1p, are defined in invert1.sp. These are

MOSFETs, as indicated by the key letter m, which begins their names. Following each transistor

name are the names of its terminals. The required order of terminal names is: drain-gate-source-

bulk. Then the model name (nmos or pmos in this example), and physical characteristics such as

length and width, is specified. The .op command performs a DC operating point calculation and

writes the results to the file specified in the Simulate > Start Simulation dialog. The output file

lists the DC operating point information for the circuit described by the input file.

4.3.2 - DC Transfer Analysis

DC transfer analysis is used to study the voltage or current at one set of points in a circuit as a

function of the voltage or current at another set of points. This is done by sweeping the source

variables over specified ranges, and recording the output. The .dc command, indicating transfer

analysis, is followed by a list of sources to be swept, and the voltage ranges across which the

sweeps are to take place.

For example, for an inverter with DC input Vin and output out, Vin is swept from 0 to 3 volts in 0.02 volt increments, and Vdd is swept from 2 to 4 volts in 0.5 volt increments. The transfer analysis is performed as follows: Vdd is set to 2 volts and Vin is swept over its specified range; Vdd is then incremented to 2.5 volts and Vin is swept over its range again; and so on, until Vdd reaches the upper limit of its range. The .dc command ignores the values assigned to the voltage sources Vdd and Vin in the voltage source statements, but the sources must still be declared in those statements. The results for nodes in and out are reported by the .print dc command to the specified destination.
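The order in which the two sources are stepped can be sketched as a nested loop, with Vdd as the outer sweep and Vin as the inner sweep. This is only an illustration of the sweep order described above; the function names are hypothetical and the code does not simulate the inverter.

```python
def sweep(start, stop, step):
    """Evenly spaced sweep values from start to stop inclusive."""
    count = int(round((stop - start) / step)) + 1
    # Rounding removes floating-point noise such as 0.060000000000000005.
    return [round(start + i * step, 10) for i in range(count)]


def dc_sweep_points(vin_range=(0.0, 3.0, 0.02), vdd_range=(2.0, 4.0, 0.5)):
    """Return (vdd, vin) pairs in the order the sweep visits them."""
    points = []
    for vdd in sweep(*vdd_range):        # outer sweep: supply voltage
        for vin in sweep(*vin_range):    # inner sweep: input voltage
            points.append((vdd, vin))
    return points


if __name__ == "__main__":
    points = dc_sweep_points()
    print("total points:", len(points))
    print("first:", points[0], "last:", points[-1])
```

With the ranges quoted in the text, Vin takes 151 values for each of the 5 values of Vdd, so the sweep visits 755 points in total.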


4.3.3 - Transient Analysis

Transient analysis provides information on how circuit elements vary with time. The basic T-

Spice command for transient analysis has three modes. In the default mode, the DC operating

point is computed, and T-Spice uses this as the starting point for the transient simulation.

.tran 2n 600n

.print tran in out

For the commands shown above, the .tran command specifies the characteristics of the transient analysis to be performed: it lasts for 600 nanoseconds, with time steps no larger than 2 nanoseconds.

4.3.4 - AC Analysis

AC analysis characterizes how the circuit's behavior depends on the frequency of a small-signal input. It involves three steps:

(1) Calculating the DC operating point,

(2) Linearizing the circuit and

(3) Solving the Linearized circuit for each frequency.

Vin1 in1 GND 2

Vdd Vdd GND 5.0

vbias vbias GND 0.8

vdiff in2 in1 -0.0007 AC 1 90

.ac DEC 5 1 100MEG

.print ac vdb(out)

.print ac vp(out)

.acmodel opamp1m.out {*}


For the commands shown above, three voltage sources (besides Vdd) are defined. vdiff sets the DC voltage difference between nodes in2 and in1 to –0.0007 volts; its AC magnitude is 1 volt and its AC phase is 90 degrees.

Vin1 sets node in1 to 2 volts, relative to GND.

vbias sets node vbias to 0.8 volts, relative to GND.

The .ac command performs an AC analysis. Following the .ac keyword is information

concerning the frequencies to be swept during the analysis. In this case, the frequency is swept

logarithmically, by decades (DEC); 5 data points are to be included per decade; the starting

frequency is 1 Hz and the ending frequency is 100 MHz.
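A logarithmic sweep by decades can be pictured as a list of frequencies equally spaced on a log axis. The sketch below generates such a list for the sweep parameters quoted above; the function name is illustrative, and the exact placement of points inside T-Spice is assumed to follow this common convention.

```python
import math


def decade_sweep(points_per_decade, f_start, f_stop):
    """Frequencies spaced logarithmically with a fixed number of points per decade."""
    decades = math.log10(f_stop / f_start)
    count = int(round(decades * points_per_decade)) + 1
    return [f_start * 10 ** (i / points_per_decade) for i in range(count)]


if __name__ == "__main__":
    freqs = decade_sweep(points_per_decade=5, f_start=1.0, f_stop=100e6)
    print("number of points:", len(freqs))
    print("first decade:", [round(f, 3) for f in freqs[:6]])
```

From 1 Hz to 100 MHz there are 8 decades, so 5 points per decade gives 41 frequencies including both end points.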

The two .print commands write the voltage magnitude (in decibels) and phase (in degrees),

respectively, for the node out to the specified file. The .acmodel command writes the small-

signal model parameters and operating point voltages and currents for all circuit devices

(indicated by the wildcard symbol *) to the file opamp1m.out.

This example will generate two output files: opamp1.out, specified by the Simulate > Start

Simulation command, and opamp1m.out, specified by the .acmodel command.
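The two printed quantities, magnitude in decibels and phase in degrees, follow directly from the complex phasor of a node voltage. The sketch below uses the conventional definitions (20 log10 of the magnitude, and the argument converted to degrees); the Python function names `vdb` and `vp` simply mirror the output names in the listing and are not T-Spice calls.

```python
import cmath
import math


def vdb(phasor):
    """Voltage magnitude in decibels: 20 * log10(|V|)."""
    return 20.0 * math.log10(abs(phasor))


def vp(phasor):
    """Voltage phase in degrees."""
    return math.degrees(cmath.phase(phasor))


if __name__ == "__main__":
    # The AC source in the listing has magnitude 1 V and phase 90 degrees.
    source = cmath.rect(1.0, math.radians(90.0))
    print(f"source: {vdb(source):.3f} dB, {vp(source):.1f} degrees")
    # A hypothetical output phasor with a gain of 10 and no phase shift.
    print(f"gain of 10: {vdb(10 + 0j):.1f} dB")
```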

4.3.5 - Noise Analysis

Real circuits, of course, are never immune from small, “random” fluctuations in voltage and

current levels. In T-Spice, the influence of noise in a circuit can be simulated and reported in

conjunction with AC analysis. The purpose of noise analysis is to compute the effect of the noise

associated with various circuit devices on an output voltage or voltages as a function of

frequency. Noise analysis is performed in conjunction with AC analysis; if the .ac command is

missing, then the .noise command is ignored. With the .ac command present, the .noise

command causes noise analysis to be performed at the same frequencies: starting at 1 Hz, ending

at 100 MHz, 5 data points per decade. The .noise command takes two arguments: the output at

which the effects of noise are to be computed, and the input at which the noise can be considered

to be concentrated for the purposes of estimating the equivalent noise spectral density.


4.4 - W-EDIT (Waveform Edit)

The ability to visualize the complex numerical data resulting from VLSI circuit simulation is critical to testing, understanding and improving these circuits. W-Edit is a waveform viewer that provides ease of use, power and speed in a flexible environment designed for graphical data representation. The advantages of W-Edit include:

a. Tight integration with T-Spice, Tanner EDA’s circuit level simulator. W-Edit can

chart data generated by T-Spice directly, without modification of the output text

data files. The data can also be charted dynamically as it is produced during the

simulation.

b. Charts can be configured automatically for the type of data being presented.

c. A data set is treated by W-Edit as a unit called a trace. Multiple traces from different output files can be viewed simultaneously in one or several windows; traces can be copied and moved between charts and windows. Trace arithmetic can be performed on existing traces to create new ones.

d. Chart views can be panned back and forth and zoomed in and out, including specifying the exact X-Y coordinate range.

e. Properties of axes, traces, grids, charts, text and colors can be customized.

Numerical data is input to W-Edit in the form of plain text or binary files. Header and comment information supplied by T-Spice is used for automatic chart configuration. Run-time update of results is made possible by linking W-Edit to a running simulation in T-Spice. W-Edit saves data with chart, trace, axis and environment settings in files with the WDB (W-Edit Database) extension.

4.5 - L-EDIT (Layout Edit)

L-Edit is a tool for drawing the masks that are used to fabricate an integrated circuit. It describes the layout design in terms of files, cells and mask primitives. At the layout level, the component parameters differ from those at the schematic level, so L-Edit lets the user analyze the response of the circuit before forwarding it to the time-consuming and costly process of fabrication. The layout of a schematic circuit must follow design rules, and the user can compare the output response of the layout with the expected one.

4.5.1 - L-Edit : An Integrated Circuit Layout Tool

In L-Edit, layers are associated with the masks used in the fabrication process. Different layers can be conveniently represented by different colors and patterns. L-Edit describes a layout design in terms of files, cells, instances and mask primitives. One may load as many files as desired into memory. A file may be composed of any number of cells. These cells may be hierarchically related, as in a typical design, or they may be independent, as in a “library” file. Cells may contain any number or combination of mask primitives and instances of other cells.

Cells : The Basic Building Blocks

The basic building block of the integrated circuit design in L-edit is a cell. Design layout occurs

within cells. A cell can:

Contain part or all of entire design.

Be referenced in other cells as a sub-cell, or instance.

Be made up entirely of instances of other cells.

Contain original drawn objects, or primitives.

Be made up entirely of primitives or a combination of primitives and instances of other

cells.

Hierarchy

L-Edit supports fully hierarchical mask design. Cells may contain instances of other cells. An instance is a reference to a cell; if the instanced cell is edited, the change is reflected in all the instances of that cell. Instances simplify the process of updating a design and also reduce data storage, because an instance does not store the data within the instanced cell. Instead, only a reference to the instanced cell is stored, along with information on the position of the instance and on how the instance is rotated or mirrored.

There is no preset limit to the size or complexity of the hierarchy. Cells may contain instances of other cells that in turn contain instances of other cells, to an arbitrary number of levels (subject only to hardware constraints). L-Edit does not use a “separated” hierarchy: instances and primitives may coexist in the same cell at any level in the hierarchy. Design files are self-contained. The “pointer” to a cell contained in an instance always points to a cell within the same design file. When cells are copied from one file to another, L-Edit automatically copies across any cells that are instanced by the copied cell, to maintain the self-contained nature of the destination file.
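The relationship between cells, primitives and instances can be sketched with a small data structure. This is a conceptual illustration only: the class and field names are hypothetical and do not correspond to L-Edit's file format or programming interface.

```python
from dataclasses import dataclass, field


@dataclass
class Instance:
    """A placed reference to a cell; the cell's geometry is not copied."""
    cell: "Cell"
    x: float = 0.0
    y: float = 0.0
    rotation_deg: int = 0
    mirrored: bool = False


@dataclass
class Cell:
    """A cell may hold drawn primitives and instances of other cells together."""
    name: str
    primitives: list = field(default_factory=list)
    instances: list = field(default_factory=list)

    def flat_primitive_count(self):
        """Number of primitives after expanding every level of the hierarchy."""
        return len(self.primitives) + sum(
            inst.cell.flat_primitive_count() for inst in self.instances
        )


def build_row(columns=4):
    """Build a hypothetical row cell made of instances of one bit cell."""
    bit_cell = Cell("bit_cell", primitives=["poly", "active", "metal1"])
    row = Cell("row")
    for i in range(columns):
        row.instances.append(Instance(bit_cell, x=10.0 * i))
    return bit_cell, row


if __name__ == "__main__":
    bit_cell, row = build_row()
    print(row.flat_primitive_count())
    bit_cell.primitives.append("contact")  # edit the instanced cell once
    print(row.flat_primitive_count())      # every instance reflects the edit
```

Because each instance stores only a reference and a placement, editing the bit cell once changes what every instance in the row represents.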

Design Rules

Manufacturing constraints can be defined in L-Edit as design rules. Layout can be checked

against these design rules.

Design Features

L-Edit is a full-custom mask editor. Manual layout can be accomplished more quickly because of L-Edit’s intuitive user interface. In addition, one can construct special structures to utilize a technology without worrying about problems caused by automatic transformations. Phototransistors, guard bars, vertical and horizontal bipolar transistors, static structures and Schottky diodes, for example, are as easy to design in bulk CMOS technology as conventional MOS transistors.

Floor plans

L-Edit is a manual floor-planning tool. Instances can be displayed either in outline, identified only by name, or as fully fleshed-out mask geometry. When the design is displayed in outline, the arrangement of the cells can be manipulated quickly and easily to achieve the desired floor plan. Instances can be manipulated at any level in the hierarchy, with insides hidden or displayed, using the same graphical move/select operations and rotation/mirror commands that are used on primitive mask geometry.


Memory Limits

In L-Edit, design files can be as large as desired, given available RAM and disk space.

Hard Copy

L-Edit provides the capability to print hard copy of the design. A multipage option allows very large plots to be printed to a specific scale on multiple 8 ½ x 11 inch pages. An L-Edit macro is available to support large-format, high-resolution, color plotting on inkjet plotters.

Variable Grid

L-Edit’s grid options support lambda-based design as well as micron-based and mil-based design.

Error Recovery

L-Edit’s error-trapping mechanism catches system errors and in most cases provides a means to recover without losing or damaging data.

L-Edit Module

L-Edit™: a layout editor

L-Edit/Extract™: a layout extractor

L-Edit/DRC™: a design rule checker

L-Edit is a full-featured, high-performance, interactive, graphical mask layout editor. L-Edit generates layouts quickly and easily, supports fully hierarchical design, and allows an unlimited number of layers, cells, and levels of hierarchy. It includes all major drawing primitives and supports 90°, 45°, and all-angle drawing modes.

L-Edit/Extract creates SPICE-compatible circuit netlists from L-Edit layouts. It can recognize active and passive devices, subcircuits and the most common device parameters, including resistance, capacitance, device length, width, and area, and device source and drain area.

L-Edit/DRC features user-programmable rules and handles minimum width, exact width, minimum space, minimum surround, non-exist, overlap, and extension rules. It can handle full-chip and region-only DRC. DRC offers Error Browser and Object Browser functions for quickly and easily cycling through rule-checking errors.


CHAPTER – 5

METHODOLOGY

5.1 - Basic 6-T Memory Cell

Memory cells are the key components of any SRAM unit. An SRAM cell can store one bit of

data. An SRAM cell comprises two back-to-back connected inverters forming a latch and two

access transistors. Access transistors serve for read and write access to the cell. An SRAM cell

offers the following basic properties.

Retention - An SRAM cell is able to retain the data indefinitely as long as it is powered.

Read - An SRAM cell is able to communicate its data. This operation does not affect the

data i.e., Read operation is non-destructive.

Write - The data of an SRAM cell can be set to any binary value regardless of its original

data.

A number of SRAM cell topologies have been reported in the past decade. Among these topologies, the resistive-load four-transistor (4T) cell, the loadless 4T cell and the six-transistor (6T) SRAM cell have received attention in practice, owing to their symmetry in storing logic `one' and logic `zero'. Data retention in the 4T SRAM cells is ensured by the leakage current of the access transistors. Hence, they are not suitable candidates for low-power applications. On the other hand, data stability in a 6T SRAM cell is independent of the leakage current. Moreover, the 6T configuration exhibits a significantly higher tolerance to noise, which is an important benefit especially in scaled technologies where the noise margins are shrinking. That is the main reason for the popularity of the 6T SRAM cell over the 4T configurations in low-power SRAM units.

A 6T SRAM cell consists of two cross-coupled CMOS inverters and two access transistors. The outputs (inputs) of the inverters form the internal nodes of the cell. Once active, the access transistors allow the internal nodes of the cell to communicate with its input/output ports. The input/output ports of the cell are called bitlines (BL and BLbar). Bitlines are a shared data communication medium among the cells in the same column of an array of cells. Consequently, they have high capacitive loading. The read and write operations are conducted through the bitlines.

Fig 7. Schematic of 6T SRAM cell

A single-bit SRAM memory cell is shown in Figure 7. Static latches are used in the SRAM cell. The SRAM cell is made up of a flip-flop comprising two cross-coupled inverters. Two access transistors are used to access the data stored in the cell. These transistors are turned ON/OFF by a control line called the word line (WL). Generally, this word line is connected to the output of the row decoder circuits. When WL = VDD, the SRAM cell is connected to the bit line (BL) and the complement of the bit line (BLbar), allowing both read and write operations. Read and write operations are carried out with the help of the access transistors.

5.1.1 - Read Operation

Consider node Y as the reference node of the SRAM cell. The cell is said to be storing 1 if node Y is high at VDD and node Ybar is at 0 V. For the reverse voltage conditions, the cell is said to be storing 0. Assume that the cell is storing 1. Before the read operation starts, the BL and BLbar lines are precharged to VDD/2. When WL is activated, current flows through M5 and M6. Current from VDD flows through M1 and M5, charging the bit line capacitance, CBL. The capacitance on the BLbar line, CBLbar, discharges through transistors M6 and M4. This process develops a voltage difference between BL and BLbar, which is sensed by the sense amplifier and detected as 1. Similarly, a 0 in the cell is also detected by the sense amplifier.

5.1.2 - Write Operation

Consider writing 0 to a cell that is storing 1. For this, the sense amplifiers and precharge circuits are disabled. The cell is selected by activating the corresponding WL signal. To write 0 to the cell, the BL line is held low and the BLbar line is raised to VDD by the write circuit. Node Ybar is thus pulled up towards VDD/2 while node Y is pulled down towards VDD/2. When the voltages on the two nodes cross this level, the feedback action starts. The parasitic capacitances associated with M3, M5 and M4, M6 are charged and discharged respectively. Ultimately node Y stabilizes at 0. Since the parasitic capacitances of the transistors are much smaller than the bit line capacitances, the write operation is faster than the read operation.

5.1.3 - Transistor Sizing

Transistor W/L Ratio

M1 / P1 300nm/200nm

M2 / P2 300nm/200nm

M3 / N1 300nm/200nm

M4 / N2 300nm/200nm

M5 / N3 600nm/200nm

M6 / N4 600nm/200nm

Table 8. Transistor Sizing of 6-T SRAM Cell


The W/L ratio of each transistor is selected to give the gate a current-driving capability in both directions equal to that of the basic inverter. In the basic inverter design, (W/L)n is usually 1.5 to 2 and, for a matched design, (W/L)p = (µn/µp)(W/L)n. The SRAM cell must be designed in such a way that, during a read operation, the changes in Y and Ybar are small enough to prevent the cell from changing its state. Generally, the two back-to-back coupled inverters of the SRAM cell are designed so that Kn and Kp are matched. This design places the inverter threshold at VDD/2. The access transistors are usually made 2 to 3 times wider than the NMOS transistors (Kn) of the inverters.
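The matched-design relation (W/L)p = (µn/µp)(W/L)n can be evaluated once a mobility ratio is chosen. The sketch below borrows the low-field mobilities (Uo) from the Level 2 model listing in Section 4.3.1 purely as an illustration; those models belong to the T-Spice inverter example and are an assumption here, not the parameters of the process used for the SRAM.

```python
# Low-field mobilities (Uo) copied from the nmos and pmos .model listing in
# Section 4.3.1. ASSUMPTION: used only as an example mobility ratio.
U0_N = 1215.74
U0_P = 361.941


def matched_pmos_ratio(wl_n, mu_n=U0_N, mu_p=U0_P):
    """(W/L)p that matches the drive of an NMOS device with ratio wl_n."""
    return (mu_n / mu_p) * wl_n


if __name__ == "__main__":
    for wl_n in (1.5, 2.0):
        print(f"(W/L)n = {wl_n} -> (W/L)p = {matched_pmos_ratio(wl_n):.2f}")
```

With this mobility ratio of roughly 3.4, a matched PMOS device needs a W/L a little over three times that of the NMOS device.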

To achieve optimum operation of the cell, the following (W/L) ratios are chosen for the different transistors. A minimum ratio of 2 is required for the NMOS transistors of the inverters and 4 for the PMOS transistors. The access transistors must be made at least twice as wide, giving a W/L ratio of more than 4. However, this set of ratios does not match the design rules of the Cadence Virtuoso layout editor for 0.18 µm technology. For 0.18 µm technology, the minimum width of an NMOS transistor comes out to be 0.6 µm, so the (W/L) ratio is 3.33. For the PMOS transistor the ratio becomes 6.67, which implies a width of 1.2 µm. Based on the SPICE simulation results and their analysis, the W/L ratio of the access transistors is kept at 10, which corresponds to a gate width of 1.8 µm.


5.2 – Sleepy SRAM Memory Cell

Fig 8. Diagram of Sleepy SRAM cell

From design specification to the generation of the mask layout, the layout design of an integrated circuit involves several processing steps that have to be carried out carefully. These steps include the design of the transistor-level schematic, SPICE simulation of the circuit according to the designed W/L ratios of the individual transistors, drawing of the layout using a layout editor, design rule check, parasitic extraction, and final simulation and verification. All of these steps are essential for error-free operation of the chip, and the same methodology is followed for the design of the 32 K-bit Sleepy SRAM IC. The basic building block of the Sleepy SRAM is the SRAM cell, which stores one bit of data. Data can be read from and written to the SRAM cell using common bit lines. SPICE, an industry-standard tool for circuit simulation and analysis, is used for the simulation and analysis of the Sleepy SRAM cell and subsequently of the whole design. The precharge circuit, sense amplifier and read-write circuits complete the Sleepy SRAM memory. The memory is arranged in a row-column matrix, which facilitates easy addressing of memory bits and also provides design flexibility. Once the functionality of one memory cell array is proved, it can be duplicated several times with minor design changes in the I/O control circuitry.

A single-bit Sleepy SRAM memory cell is shown in Figure 8. Static latches are used in the SRAM cell. The SRAM cell is made up of a flip-flop comprising two cross-coupled inverters. Two access transistors are used to access the data stored in the cell. These transistors are turned ON/OFF by a control line called the word line (WL). Generally, this word line is connected to the output of the row decoder circuits. When WL = VDD, the SRAM cell is connected to the bit line (BL) and the complement of the bit line (BLbar), allowing both read and write operations. Read and write operations are carried out with the help of the access transistors.

5.2.1 - Read Operation

Consider node Y as the reference node of the SRAM cell. The cell is said to be storing 1 if node Y is high at VDD and node Ybar is at 0 V. For the reverse voltage conditions, the cell is said to be storing 0. Assume that the cell is storing 1. Before the read operation starts, the BL and BLbar lines are precharged to VDD/2. When WL is activated, current flows through M5 and M6. Current from VDD flows through M1 and M5, charging the bit line capacitance, CBL. The capacitance on the BLbar line, CBLbar, discharges through transistors M6 and M4. This process develops a voltage difference between BL and BLbar, which is sensed by the sense amplifier and detected as 1. Similarly, a 0 in the cell is also detected by the sense amplifier.

5.2.2 - Write Operation

Consider writing 0 to a cell that is storing 1. For this, the sense amplifiers and precharge circuits are disabled. The cell is selected by activating the corresponding WL signal. To write 0 to the cell, the BL line is held low and the BLbar line is raised to VDD by the write circuit. Node Ybar is thus pulled up towards VDD/2 while node Y is pulled down towards VDD/2. When the voltages on the two nodes cross this level, the feedback action starts. The parasitic capacitances associated with M3, M5 and M4, M6 are charged and discharged respectively. Ultimately node Y stabilizes at 0. Since the parasitic capacitances of the transistors are much smaller than the bit line capacitances, the write operation is faster than the read operation.

5.2.3 - Transistor Sizing

Transistor sizing remains the same as given in Chapter – 3, Table 4, Page no. 21.


CHAPTER - 6

IMPLEMENTATION

Fig 9. Implementation of Basic 6-Transistor SRAM Memory Cell


Fig 10. Implementation of Sleepy SRAM Memory Cell

CHAPTER – 7

RESULTS

Fig 11. Simulation Result of 6T SRAM Cell

Fig 12. Simulation Result of Sleepy SRAM Cell

CHAPTER – 8

CONCLUSION

The leakage power of the non-sleepy SRAM is much larger than that of the sleepy SRAM. The

simulation of the sleepy 32K-bit SRAM in Tanner 13 shows a 40% power saving without increasing the

worst-case delay. The sleepy partition is placed nearer to the output and the non-sleepy partition

farther from the output. As expected, this placement keeps the worst-case delay of the 32K-bit

SRAM unchanged. The ratio of cell dimensions gives an additional area of 40% per sleepy SRAM cell:

$$\frac{\text{Dimension of sleepy SRAM cell}}{\text{Dimension of non-sleepy SRAM cell}} = \frac{1170 \times \left(\frac{1400}{1000}\right)\lambda}{1170\,\lambda} = 1.40$$

Mode            # Sleepy Cells    # Non-Sleepy Cells
100% Sleepy     32768             0
75% Sleepy      24576             8192
50% Sleepy      16384             16384
25% Sleepy      8192              24576
Non-Sleepy      0                 32768

Table 9. Sleepy SRAM Partition Mode

If power reduction were the only factor, the 100% sleepy mode would be the best choice;

however, delay and area constraints can lead to a different decision.
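
The partition counts in Table 9 follow directly from the 32K-bit capacity, as the short Python sketch below shows. It also gives a rough estimate of the cell-array area for each mode from the 1.40 cell-dimension ratio stated in this chapter. The helper names and the linear area model (which ignores peripheral circuits) are assumptions made for illustration only.

```python
TOTAL_CELLS = 32 * 1024   # 32K-bit array, one bit per cell
SLEEPY_AREA_RATIO = 1.40  # sleepy / non-sleepy cell dimension ratio from the conclusion


def partition(sleepy_fraction, total_cells=TOTAL_CELLS):
    """Return (sleepy_cells, non_sleepy_cells) for a given sleepy fraction."""
    sleepy = int(total_cells * sleepy_fraction)
    return sleepy, total_cells - sleepy


def relative_array_area(sleepy_fraction, ratio=SLEEPY_AREA_RATIO):
    """Cell-array area relative to an all non-sleepy array.

    Assumption: area scales linearly with cell count and peripheral
    circuits are ignored, so this is only a rough estimate.
    """
    return sleepy_fraction * ratio + (1.0 - sleepy_fraction)


for fraction in (1.0, 0.75, 0.5, 0.25, 0.0):
    sleepy, non_sleepy = partition(fraction)
    print(f"{fraction:>4.0%} sleepy: {sleepy:>5} sleepy, {non_sleepy:>5} non-sleepy, "
          f"relative area {relative_array_area(fraction):.2f}")
```

Under this simple model the area penalty grows in proportion to the sleepy fraction, which is one way to see why the 100% sleepy mode is not automatically the best choice.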

CHAPTER – 9

FUTURE SCOPE

Since the fabricated chips failed to work as expected from simulation, the SRAM design will be

re-fabricated in the future. Before it is sent for fabrication again, the parts of the chip

suspected of failing will be modified, and additional test IO pins will be added to observe the

internal signals. Once the functionality of the designed SRAM is fixed, more practical research on

applying this sub-threshold memory chip should be carried out.

The targeted applications of this ultra-low-power SRAM design are low-power biomedical and space

applications. The designed sub-threshold SRAM is therefore intended to be integrated into digital

systems, which inevitably involve sub-threshold microprocessors. Hence the design and fabrication

of a sub-threshold microprocessor is one of the main research topics following the sub-threshold

SRAM design.

This ultra-low-power SRAM chip can also be used in a standalone manner. In this case, a low-power

level shifter with reasonable speed is needed to form the interface between the sub-threshold SRAM

and normal digital processing circuits. Some level shifter designs already exist; however, they

can rarely handle level shifting between a sub-threshold voltage and the standard IO voltages,

i.e. 2.5 V or 3.3 V for CMOS technology. One potential candidate is a comparator-based shifter,

which compares the logic output of the sub-threshold SRAM with a reference voltage of half Vdd and

drives its output at the standard 2.5 V or 3.3 V IO voltage.
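
The behaviour expected from such a comparator-based shifter can be sketched in a few lines of Python. This is an idealized behavioural model only, not a circuit design: the function name and the 0.3 V sub-threshold supply used in the example are hypothetical.

```python
def comparator_level_shift(v_in, vdd_sub, v_io=2.5):
    """Ideal comparator-based level shifter (behavioural model only).

    v_in    -- logic output of the sub-threshold SRAM, in volts
    vdd_sub -- sub-threshold supply; the reference is half of it
    v_io    -- standard IO voltage to drive, 2.5 V or 3.3 V
    """
    v_ref = vdd_sub / 2
    return v_io if v_in > v_ref else 0.0


# Hypothetical 0.3 V sub-threshold supply: a low and a high SRAM output.
for v_in in (0.02, 0.28):
    v_out = comparator_level_shift(v_in, vdd_sub=0.3)
    print(f"{v_in:.2f} V in -> {v_out:.1f} V out")
```

A real comparator would add offset, delay and power cost, all of which matter at sub-threshold input levels and are not captured here.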

The 32K-bit capacity in this project is not enough for modern digital systems, especially for

some signal processing or data collection systems. Therefore, the design of a sub-threshold SRAM

macro with larger memory capacity is also interesting future work.

APPENDIX

ACHIEVEMENTS

Participated in and presented a paper titled “VLSI Implementation of 32 K-Bit Sleepy SRAM” at

SPANDAN-2011, a national-level conference on Recent Trends in Engineering and Technology,

organized by the Department of Electronics Engineering, Yeshwantrao Chavan College of

Engineering, Nagpur.

COST TABLE

The software used for the project, i.e. S-EDIT (Tanner Tools v13), was readily available in the

project laboratory of the college.

CONTACT DETAILS

Mr. Nikhil M. Ghoradkar

Contact No :- (+91) 9970546932

Mr. Gaurav V. Jadhav

Contact No :- (+91) 9975661305
