1

Semiconductor Memories

2

Semiconductor Memory Classification

Read-Write Memory
  Random Access: SRAM, DRAM
  Non-Random Access: FIFO, LIFO, Shift Register, CAM

Non-Volatile Read-Write Memory: EPROM, E2PROM, FLASH

Read-Only Memory: Mask-Programmed, Programmable (PROM)

3

Memory Timing: Definitions

[Figure: timing diagram of the READ, WRITE, and DATA signals, marking the read access, read cycle, write access, and write cycle times, and the points where data is valid and data is written.]

4

Memory Architecture: Decoders

[Figure: two arrays of storage cells holding N words of M bits each (Word 0 to Word N-1) with an M-bit input-output. In the first, each word has its own select signal, S0 to SN-1. In the second, a decoder driven by address bits A0 to AK-1 generates the select signals.]

Intuitive architecture for an N x M memory: one select signal per word. This needs too many select signals, since N words require N select signals.

A decoder reduces the number of select signals: only $K = \log_2 N$ address bits are needed.
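The relation K = log2 N can be checked with a small behavioral sketch. The function names `address_bits` and `decode` are illustrative and do not come from the slides; the model only mimics the logical behavior of a decoder, not its circuit.

```python
def address_bits(num_words: int) -> int:
    """Number of address bits K needed to select one of num_words words."""
    if num_words < 1:
        raise ValueError("a memory needs at least one word")
    # (N - 1).bit_length() equals ceil(log2(N)) for N >= 1, without float error.
    return (num_words - 1).bit_length()


def decode(address: int, num_words: int) -> list[int]:
    """Return the one-hot select signals S0..S(N-1) for a given address."""
    if not 0 <= address < num_words:
        raise ValueError("address out of range")
    return [1 if i == address else 0 for i in range(num_words)]


if __name__ == "__main__":
    print(address_bits(1024))   # address pins for a 1024-word memory
    print(decode(2, 4))         # select signals for address 2 of 4 words
```

With the decoder, a 1024-word memory needs 10 address inputs instead of 1024 externally supplied select signals.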

5

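2D Memory Architecture

[Figure: array of storage (RAM) cells at the crossings of word lines and bit lines, with dimension labels 2^(k-j) and m·2^j. The address bits A0 to Ak-1 are split into a row address, which feeds the row decoder, and a column address, which feeds the column decoder. Sense amplifiers and read/write circuits sit between the array and the column decoder, which connects to the input/output (m bits).]

Sense amplifiers: amplify the bit line swing.

Column decoder: selects the appropriate word from the memory row.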
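A minimal sketch of the address split in a 2D array follows. It assumes, for illustration only, that the j low-order address bits form the column address and the remaining k - j bits form the row address, which matches an array of 2^(k-j) rows by m·2^j columns; the slide text alone does not fix this assignment, and the function names are hypothetical.

```python
def split_address(address: int, k: int, j: int) -> tuple[int, int]:
    """Split a k-bit word address into (row, column).

    Assumption: the j low-order bits select the word within a row
    (column address); the k - j high-order bits select the row.
    """
    if not 0 <= address < (1 << k):
        raise ValueError("address does not fit in k bits")
    column = address & ((1 << j) - 1)
    row = address >> j
    return row, column


def array_shape(k: int, j: int, m: int) -> tuple[int, int]:
    """Rows and columns of the cell array for m-bit words."""
    return 1 << (k - j), m * (1 << j)


if __name__ == "__main__":
    print(split_address(0b101101, k=6, j=2))
    print(array_shape(k=10, j=3, m=8))
```

Choosing j so that the two dimensions are roughly equal keeps word lines and bit lines at comparable lengths.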

6

Hierarchical Memory Architecture

[Figure: memory divided into P blocks (Block 0, ..., Block i, ..., Block P-1) that share the row address and column address; the block address drives a block selector, and control circuitry and a global amplifier/driver connect the blocks to the I/O over a global data bus.]

Advantages:
1. Shorter wires within blocks
2. Block address activates only 1 block => power savings

7

Memory Timing: Approaches

DRAM timing: multiplexed addressing. The row address and then the column address are presented on the same address bus, qualified by RAS and CAS (RAS-CAS timing).

SRAM timing: self-timed. An address transition on the address bus initiates the memory operation.

8

Read-Write Memories (RAMs)

Static (SRAM):
  data is stored as long as supply is applied
  large cells (6 FETs/cell), so fewer bits/chip
  fast, so used where speed is important (e.g., caches)
  differential outputs (output BL and !BL)
  use sense amps for performance
  compatible with CMOS technology

Dynamic (DRAM):
  periodic refresh required
  small cells (1 to 3 FETs/cell), so more bits/chip
  slower, so used for main memories
  single-ended output (output BL only)
  need sense amps for correct operation
  not typically compatible with CMOS technology

9

Read-Only Memory Cells

[Figure: cells storing 1 and 0 on a word line (WL) and bit line (BL) for three ROM types: diode ROM, MOS ROM 1 (default output = 0), and MOS ROM 2 (default output = 1).]

BL is pulled down to GND on activation of WL.

10

MOS OR ROM

[Figure: 4 x 4 MOS OR ROM array with word lines WL[0] to WL[3], bit lines BL[0] to BL[3], cell transistors connected to VDD, and pull-down loads biased by Vbias.]

11

MOS NOR ROM

[Figure: 4 x 4 MOS NOR ROM array with word lines WL[0] to WL[3], bit lines BL[0] to BL[3], cell transistors connected to GND, and pull-up devices connected to VDD.]

12

MOS NOR ROM Layout

Programming using the active layer only (the active mask in the fabrication process), using diffusion for GND.

Cell (9.5 x 7)

[Figure: layout with word lines WL[0] to WL[3] and GND lines. Legend: polysilicon, metal1, diffusion, metal1 on diffusion.]

13

MOS NOR ROM Layout

Programming using the contact layer only: selective addition of metal-to-diffusion contacts.

Cell (11 x 7)

[Figure: layout with word lines WL[0] to WL[3] and GND lines. Legend: polysilicon, metal1, diffusion, metal1 on diffusion.]

14

MOS NAND ROM

All word lines are high by default, with the exception of the selected row.

[Figure: 4 x 4 MOS NAND ROM array with word lines WL[0] to WL[3], bit lines BL[0] to BL[3], and pull-up devices connected to VDD.]

15

Equivalent Transient Model for MOS NOR ROM

Word line parasitics:
  wire capacitance and gate capacitance
  wire resistance (polysilicon)

Bit line parasitics:
  resistance not dominant (metal)
  drain and gate-drain capacitance

[Figure: model for NOR ROM, with distributed resistance rword and capacitance cword along WL and capacitance Cbit on BL, which is pulled up to VDD.]

16

Equivalent Transient Model for MOS NAND ROM

Word line parasitics:
  similar to NOR ROM

Bit line parasitics:
  resistance of cascaded transistors dominates
  drain/source and complete gate capacitance

[Figure: model for NAND ROM, with distributed rword and cword along WL, distributed rbit and cbit along BL, and load capacitance CL, pulled up to VDD.]

17

Decreasing Word Line Delay

(a) Driving the polysilicon word line from both sides
(b) Using a metal bypass: a metal word line runs in parallel and connects to the polysilicon word line every K cells
(c) Use silicides

Driving from both sides reduces the worst-case delay of the word line by a factor of 4, because the delay of a distributed RC line grows with the square of its length and each driver now serves only half the line (like buffer insertion to reduce RC delay).
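The factor of 4 follows from the quadratic length dependence of a distributed RC line. The sketch below uses the common approximation t ≈ 0.38·r·c·L² for the 50% delay of a distributed RC wire; the per-cell resistance and capacitance values are hypothetical and only serve to illustrate the ratio.

```python
def distributed_rc_delay(r_per_cell: float, c_per_cell: float, num_cells: int) -> float:
    """Approximate 50% delay of a distributed RC line spanning num_cells cells."""
    return 0.38 * r_per_cell * c_per_cell * num_cells ** 2


if __name__ == "__main__":
    r, c = 20.0, 2e-15      # hypothetical ohms and farads per cell
    cells = 512             # hypothetical word-line length in cells

    single_sided = distributed_rc_delay(r, c, cells)
    # Driven from both ends, the worst-case point is the middle of the line,
    # so each driver effectively sees a line of half the length.
    double_sided = distributed_rc_delay(r, c, cells // 2)

    print(single_sided / double_sided)
```

The same reasoning explains the metal bypass: the slow polysilicon segment between two contacts is only K cells long, so its delay scales with K² rather than with the square of the full row length.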

18

Precharged MOS NOR ROM

[Figure: 4 x 4 MOS NOR ROM array with word lines WL[0] to WL[3], bit lines BL[0] to BL[3], cell transistors connected to GND, and precharge devices connected to VDD and clocked by φpre.]

The PMOS precharge device can be made as large as necessary, but the clock driver becomes harder to design.

19

Non-Volatile Memories
The Floating-Gate Transistor (FAMOS)

[Figure: device cross-section (gate above a floating gate, each separated by oxide of thickness tox, over a p-substrate with n+ source and drain) and schematic symbol (G, S, D).]

20

Floating-Gate Transistor Programming

[Figure: three bias conditions of the floating-gate transistor.]

1. Avalanche injection (figure labels: 20 V, 10 V, 5 V, 20 V).
2. Removing the programming voltage leaves charge trapped (figure labels: 0 V, -5 V, 0 V).
3. Programming results in a higher VT (figure labels: 5 V, -2.5 V, 5 V).

21

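A “Programmable-Threshold” Transistor

[Figure: drain current versus VGS for the “0” state and the “1” state, whose threshold is shifted by ΔVT; at VGS = VWL the “0”-state device is “ON” and the “1”-state device is “OFF”.]

Applying VWL either causes current to flow (“0” state) or not (“1” state).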

22

FLOTOX EEPROM

[Figure: FLOTOX (floating-gate tunneling oxide) transistor cross-section, with gate, floating gate, and n+ source and drain in a p-substrate, and oxide thicknesses of 20–30 nm and 10 nm; Fowler-Nordheim I-V characteristic, current I versus VGD, with marks at -10 V and 10 V.]

23

EEPROM Cell

[Figure: 2-transistor cell connected to WL, BL, and VDD.]

EEPROM cell as configured during a read operation. When programmed, the threshold voltage of the FLOTOX is higher than VDD, disabling it. If not, it acts as a closed switch.

24

Flash EEPROM

[Figure: cross-section with control gate, floating gate, and thin tunneling oxide over a p-substrate with n+ source and n+ drain; arrows mark erasure and programming.]

Many other options …

25

Basic Operations in a NOR Flash Memory: Erase

[Figure: cell with G at 0 V and S at 12 V; cell array with WL0 and WL1 at 0 V and BL0 and BL1 open.]

Electrons at the floating gate are ejected to the source by tunneling. The gate is at GND and the source is at 12 V.

26

Basic Operations in a NOR Flash Memory: Write

[Figure: cell with G at 12 V and D at 6 V; cell array with WL0 at 12 V, WL1 at 0 V, BL0 at 6 V, and BL1 at 0 V.]

For the write operation, a high voltage of 12 V is applied at the gate and 6 V is applied at the drain (i.e., BL) to write a “1”. Hot electrons are injected into the floating gate, raising the threshold and effectively turning off the device.

The floating gate has no electrons for the “0” state.

The floating gate has electrons for the “1” state.

27

Basic Operations in a NOR Flash Memory: Read

[Figure: cell with G at 5 V and D at 1 V; cell array with WL0 at 5 V, WL1 at 0 V, BL0 at 1 V, and BL1 at 0 V.]

For a read operation, the selected word line is raised to 5 V. The transistor turns on if the “0” state is stored and remains off if the “1” state is stored.
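The erase, write, and read operations can be summarized in a small behavioral sketch. The class `NorFlashCell` and its two threshold values are hypothetical; only the 5 V read word-line level and the mapping of conducting to “0” and non-conducting to “1” come from the slides.

```python
class NorFlashCell:
    """Behavioral model of one NOR flash cell as a programmable-threshold transistor."""

    LOW_VT = 1.0    # hypothetical threshold with no electrons on the floating gate
    HIGH_VT = 7.0   # hypothetical threshold with electrons on the floating gate
    READ_WL = 5.0   # word-line voltage applied during a read

    def __init__(self) -> None:
        self.vt = self.LOW_VT  # start in the erased ("0") state

    def erase(self) -> None:
        """Tunneling removes electrons from the floating gate: low threshold."""
        self.vt = self.LOW_VT

    def write_one(self) -> None:
        """Hot-electron injection charges the floating gate: high threshold."""
        self.vt = self.HIGH_VT

    def read(self) -> int:
        """Return 0 if the transistor conducts at the read voltage, else 1."""
        conducts = self.READ_WL > self.vt
        return 0 if conducts else 1


if __name__ == "__main__":
    cell = NorFlashCell()
    print(cell.read())
    cell.write_one()
    print(cell.read())
    cell.erase()
    print(cell.read())
```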

28

6-transistor CMOS SRAM Cell

[Figure: 6T cell with cross-coupled inverters (M1 to M4) between VDD and ground storing Q and !Q, and pass transistors M5 and M6, controlled by WL, connecting the cell to BL and !BL.]

Consumes power only when switching; no standby power (other than leakage) is consumed.

The major job of the pull-ups is to replenish loss due to leakage.

Sizing of the transistors is critical.

29

CMOS SRAM Analysis (Read)

[Figure: read of a cell storing Q = 1 and !Q = 0, showing M1, M4, M5, and M6, with WL at VDD and both bit lines (capacitance Cbit each) precharged to VDD.]

First precharge both bit lines, BL and !BL, to 1.

Then discharge !BL through M5 and M1.

Cell ratio: CR = (W/L)1 / (W/L)5

M5 with a bigger L is desired.

30

SRAM Cell Analysis (Read)

[Figure: cell with WL = 1, BL = 1, !BL = 1, Q = 1, and !Q = 0, showing M1, M4, M5, M6, and the bit-line capacitances Cbit.]

Read-disturb (read-upset): the voltage rise on !Q must be limited to a value that prevents the read-upset condition while still meeting the speed and area constraints of the circuit.

The resistance of M5 must be larger than that of M1 (M5 with a bigger L, so that M1 is stronger than M5).

31

CMOS SRAM Analysis (Read)

[Figure: voltage rise (V), 0 to 1.2, versus cell ratio (CR), 0 to 3, for Vdd = 2.5 V and VTn = 0.5 V.]

The voltage does not rise above the threshold for CR > 1.2.

32

CMOS SRAM Analysis (Write)

[Figure: cell storing Q = 1 and !Q = 0, showing M1, M4, M5, and M6, with WL high, !BL = 1, and BL = 0.]

A 1 is stored; we are trying to write a 0.

In order to write the cell, the pass gate M6 must be more conductive than M4, so that node Q can be pulled to a value low enough for the inverter pair (M2/M1) to begin amplifying the new data.

Pullup Ratio (PR) = (WM4/LM4)/(WM6/LM6)

33

CMOS SRAM Analysis (Write)

Pullup Ratio (PR) = (WM4/LM4)/(WM6/LM6)

[Figure: write-analysis plot (contents not recoverable) for Vdd = 2.5 V, |VTp| = 0.5 V, and μp/μn = 0.5.]

34

Cell Sizing

Keeping cell size minimized is critical for large caches.

Minimum-sized pull-down FETs (M1 and M3):
  requires minimum-width and longer-than-minimum channel length pass transistors (M5 and M6) to ensure proper CR
  but this sizing of the pass transistors increases the capacitive load on the word lines and limits the current discharged on the bit lines, both of which can adversely affect the speed of the read cycle

Minimum width and length pass transistors:
  boost the width of the pull-downs (M1 and M3)
  reduces the loading on the word lines and increases the storage capacitance in the cell (both are good!), but the cell size may be slightly larger

35

6T-SRAM: Layout

[Figure: cell layout showing the VDD and GND rails, WL, BL and !BL, storage nodes Q and !Q, and transistors M1 to M6.]

36

Resistance-load SRAM Cell

[Figure: cell with two load resistors RL connected to VDD, transistors M1 to M4, storage nodes Q and !Q, word line WL, and bit lines BL and !BL.]

Static power dissipation: want RL large.

Bit lines are precharged to VDD.

Also known as the 4-transistor SRAM cell; it reduces the SRAM cell size by one-third.

37

SRAM Characteristics

38

Multiple Read/Write Port Cell

[Figure: two-port cell with storage nodes Q and !Q (M1 to M4), pass transistors M5 and M6 on WL1 connecting to BL1 and !BL1, and pass transistors M7 and M8 on WL2 connecting to BL2 and !BL2.]

For reads with more than one pass gate open, the voltage rise in the cell is larger. The size of the pull-down must therefore be increased, by a factor equal to the number of simultaneously open read ports, to keep the rise at an acceptably low level.

39

3-Transistor DRAM Cell

[Figure: cell schematic with M1 (controlled by WWL) between BL1 and storage node X with capacitance CS, and M2 and M3 (with RWL) on the BL2 side; waveforms of WWL, RWL, X, BL1, and BL2, with levels VDD and VDD - VT and a swing ΔV on BL2.]

No constraints on device ratios.

Reads are non-destructive.

Value stored at node X when writing a “1” = VWWL - VTn.

Core of the first popular MOS memories (e.g., the first 1 Kbit memory from Intel). CS is the data storage (internal capacitance of wiring, M2 gate, and M1 diffusion capacitances).

Write: uses WWL and BL1.

40

[Figure: 3-transistor DRAM cell schematic and waveforms, repeated from the previous slide.]

Read: uses RWL and BL2. Assume BL2 is precharged to Vdd (or Vdd - Vt). If the cell is holding a 1, then BL2 goes low, so reads are inverting.

Write: uses WWL and BL1.

Refresh: read the stored data, put its inverse on BL1, and assert WWL (this must be done every 1 to 4 ms).

41

3T-DRAM: Layout

[Figure: cell layout showing BL2, BL1, GND, RWL, WWL, and transistors M1, M2, and M3.]

42

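1-Transistor DRAM Cell

[Figure: cell schematic with M1 (controlled by WL) between BL (capacitance CBL) and storage node X (capacitance CS); waveforms of WL, X, and BL for write “1” and read “1”, with X reaching Vdd - Vt, BL at Vdd during the write, and sensing around Vdd/2 during the read.]

Write: CS is charged (or discharged) by asserting WL and BL.

Read: charge redistribution occurs between CBL and CS.

Read is destructive, so the cell must be refreshed after a read.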
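The size of the read signal follows from charge conservation between the storage capacitor and the bit line: ΔV = (V_X - V_PRE)·CS / (CS + CBL). The sketch below evaluates this expression; the capacitance and voltage values are hypothetical and only indicate the typical order of magnitude.

```python
def bitline_swing(v_cell: float, v_pre: float, c_s: float, c_bl: float) -> float:
    """Bit-line voltage change after charge redistribution with the cell."""
    return (v_cell - v_pre) * c_s / (c_s + c_bl)


if __name__ == "__main__":
    vdd, vt = 2.5, 0.6            # hypothetical supply and threshold voltages
    c_s, c_bl = 30e-15, 300e-15   # hypothetical storage and bit-line capacitances
    v_pre = vdd / 2               # bit line precharged to Vdd/2

    dv_one = bitline_swing(vdd - vt, v_pre, c_s, c_bl)   # cell stored a "1"
    dv_zero = bitline_swing(0.0, v_pre, c_s, c_bl)       # cell stored a "0"
    print(f"reading 1: {dv_one * 1e3:.1f} mV, reading 0: {dv_zero * 1e3:.1f} mV")
```

Because CBL is much larger than CS, the swing is only tens of millivolts, which is why every bit line needs a sense amplifier and why the read disturbs the stored value.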

43

1-T DRAM Cell

Uses polysilicon-diffusion capacitance.

Expensive in area.

[Figure: cross-section (metal word line, polysilicon gate and polysilicon plate over SiO2, n+ diffusions, field oxide, and an inversion layer induced by the plate bias) and layout (M1 word line, diffused bit line, polysilicon gate, polysilicon plate, capacitor).]

44

DRAM Cell Observations

1. 1T DRAM requires a sense amplifier for each bit line, due to charge redistribution read-out.

2. DRAM memory cells are single ended in contrast to SRAM cells.

3. The read-out of the 1T DRAM cell is destructive; read and refresh operations are necessary for correct operation.

4. Unlike the 3T cell, the 1T cell requires the presence of an extra capacitance that must be explicitly included in the design.

5. When writing a “1” into a DRAM cell, a threshold voltage is lost. This charge loss can be circumvented by bootstrapping the word lines to a higher value than VDD.

45

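Sense Amp Operation

[Figure: bit-line voltage VBL versus t, starting at VPRE; after the word line is activated the bit line moves by ΔV(1), and after the sense amp is activated it is driven to V(1) or V(0).]

VPRE: precharged voltage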

46

Static CAM Memory Cell

[Figure: array of CAM cells sharing Bit and !Bit lines, Word lines, and a wired-NOR match line; cell schematic with transistors M1 to M9, storage nodes S and !S, internal node int, and the Match line.]

Match is precharged to VDD.

If Bit = VDD and S = 0, then int charges up through M3 and Match discharges to “0”.
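The wired-NOR match line can be illustrated with a behavioral sketch: the line starts precharged high and any single mismatching bit pulls it low. The function names and the example tag values are hypothetical and model only the logic, not the transistor-level cell.

```python
def match_line(stored_word: list[int], search_word: list[int]) -> int:
    """Return 1 if the precharged match line stays high, 0 if any cell discharges it."""
    if len(stored_word) != len(search_word):
        raise ValueError("stored and search words must have the same width")
    match = 1  # precharged to VDD
    for stored_bit, search_bit in zip(stored_word, search_word):
        if stored_bit != search_bit:
            match = 0  # a mismatching cell pulls the shared line low
    return match


def cam_lookup(entries: list[list[int]], key: list[int]) -> list[int]:
    """Indices of all stored words whose match line stays high for the key."""
    return [i for i, word in enumerate(entries) if match_line(word, key)]


if __name__ == "__main__":
    tags = [[1, 0, 1, 1], [0, 0, 1, 0], [1, 1, 1, 1]]
    print(cam_lookup(tags, [0, 0, 1, 0]))
    print(cam_lookup(tags, [0, 1, 0, 1]))
```

In hardware all rows are compared in parallel, which is what makes a CAM attractive for cache tag lookup.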

47

CAM in Cache Memory

[Figure: a CAM array with input drivers receives the address tag and signals a hit to an SRAM array with sense amps / input drivers, which handles the data; R/W controls the access.]

48

Periphery Memory Circuit

Decoders
Sense Amplifiers
Input/Output Buffers
Control / Timing Circuitry

49

Row Decoders

Collection of 2^M complex logic gates, organized in a regular and dense fashion.

(N)AND Decoder

NOR Decoder

[The logic equations for the two decoder types are not recoverable from the source.]

50

Hierarchical Decoders

Multi-stage implementation improves performance.

[Figure: NAND decoder using 2-input pre-decoders. The address pairs (A0, A1) and (A2, A3) are pre-decoded into their four true/complement combinations, which are then combined to drive the word lines WL0, WL1, ...]
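A behavioral sketch of predecoding follows. Address bits are grouped in pairs, each pair is decoded into four one-hot lines, and each word line is the AND of one line from every group. The function names are illustrative, and the model assumes an even number of address bits.

```python
def predecode_pair(value: int) -> list[int]:
    """Decode a 2-bit value into four one-hot predecoded lines."""
    return [1 if value == i else 0 for i in range(4)]


def row_decode(address: int, num_bits: int = 4) -> list[int]:
    """Two-stage decoder: 2-input predecoders followed by one AND per word line."""
    if num_bits % 2 != 0:
        raise ValueError("this sketch assumes an even number of address bits")
    groups = num_bits // 2
    # First stage: one predecoder per address pair (A1 A0), (A3 A2), ...
    predecoded = [predecode_pair((address >> (2 * g)) & 0b11) for g in range(groups)]
    # Second stage: each word line combines one predecoded line from each group.
    word_lines = []
    for wl in range(1 << num_bits):
        selected = 1
        for g in range(groups):
            selected &= predecoded[g][(wl >> (2 * g)) & 0b11]
        word_lines.append(selected)
    return word_lines


if __name__ == "__main__":
    print(row_decode(5).index(1))
```

For a 4-bit address, the final gates need only 2 inputs each instead of 4, and the predecoder outputs are shared by all word lines, which reduces both fan-in and loading on the address lines.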

51

Dynamic Decoders

[Figure: 2-input NOR decoder and 2-input NAND decoder, each driving WL0 to WL3 from A0, !A0, A1, and !A1, with precharge devices connected to VDD.]

All address signals are low during precharge.

Precharge all outputs high, then pull the inactive outputs to GND (active-“high” output signals).

52

4-input pass-transistor based column decoder

[Figure: a 2-input NOR decoder driven by A0 and A1 generates select signals S0 to S3, which control pass transistors connecting BL0 to BL3 to the data line D.]

Advantages: speed (tpd does not add to overall memory access time); only one extra transistor in the signal path.

Disadvantage: large transistor count.

53

Pass Transistor Based Column Decoder

[Figure: a 2-input NOR decoder driven by A1 and A0 generates select signals S3 to S0, which control pass transistors connecting BL3 to BL0 to Data and !BL3 to !BL0 to !Data.]

Advantage: speed, since there is only one extra transistor in the signal path.

Disadvantage: large transistor count.

54

4-to-1 tree based column decoder

[Figure: binary tree of pass transistors controlled by A0, !A0, A1, and !A1, connecting BL0 to BL3 to the data line D.]

Number of devices drastically reduced.

Delay increases quadratically with the number of sections; prohibitive for large decoders.

Solutions:
  buffers
  progressive sizing
  combination of tree and pass transistor approaches
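The trade-off of the tree decoder can be made concrete with a small calculation. A K-level binary tree over 2^K bit lines uses 2 + 4 + ... + 2^K pass transistors but places K of them in series on every path. The delay figure below assumes each section contributes the same resistance and capacitance, so the Elmore delay of the chain is proportional to K(K+1)/2; this uniform-section model and the function names are illustrative assumptions.

```python
def tree_decoder_devices(k: int) -> int:
    """Pass transistors in a binary tree decoder selecting 1 of 2**k bit lines."""
    return sum(2 ** level for level in range(1, k + 1))


def relative_chain_delay(k: int) -> float:
    """Elmore delay of k identical RC sections in series, in units of R*C."""
    return k * (k + 1) / 2


if __name__ == "__main__":
    for k in (2, 4, 10):
        print(k, tree_decoder_devices(k), relative_chain_delay(k))
```

For the 4-to-1 case (k = 2) this gives 6 transistors, matching the figure. No separate decoder logic is needed, but the quadratic growth of the series delay is what motivates buffers or progressive sizing for large k.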

55

Decoder for circular shift-register

[Figure: chain of shift-register stages clocked by φ and !φ, each with a reset input R, driving the word lines WL0, WL1, WL2, ...]

The R signal resets the pointer to the first position.

56

Sense Amplifiers

$t_p = \dfrac{C \cdot \Delta V}{I_{av}}$

C is large and Iav is small, so make ΔV as small as possible.

Idea: use a sense amplifier (s.a.), which turns a small transition at its input into a full swing at its output.
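The benefit of a reduced swing can be seen by evaluating tp = C·ΔV / Iav directly. The capacitance and current values in this sketch are hypothetical; only the formula comes from the slide.

```python
def bitline_delay(c_bitline: float, delta_v: float, i_avg: float) -> float:
    """Time for a current i_avg to move a capacitance c_bitline by delta_v."""
    return c_bitline * delta_v / i_avg


if __name__ == "__main__":
    c_bl = 1e-12     # hypothetical 1 pF bit line
    i_cell = 50e-6   # hypothetical 50 uA average cell current

    full_swing = bitline_delay(c_bl, 2.5, i_cell)
    small_swing = bitline_delay(c_bl, 0.25, i_cell)
    print(f"full swing: {full_swing * 1e9:.1f} ns, small swing: {small_swing * 1e9:.1f} ns")
```

Since C and Iav are largely fixed by the array size and the cell, reducing ΔV by a factor of ten reduces the bit-line delay by the same factor, leaving the sense amplifier to restore the full logic level.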

57

Differential Sense Amplifier

[Figure: differential amplifier with transistors M1 to M5 supplied from VDD, inputs bit and !bit, enable SE, internal node y, and output Out.]

Directly applicable to SRAMs.

58

Differential Sensing: SRAM

[Figure: (a) SRAM sensing scheme, with precharge (PC) and equalization (EQ) devices on BL and !BL, SRAM cell i on WL i, and a differential sense amp producing Output; (b) two-stage differential amplifier with transistors M1 to M5, internal nodes x and !x, enable SE, and Output.]

Precharge the bit lines to Vdd by pulling PC low.

59

Latch-Based Sense Amplifier (DRAM)

Initialized in its meta-stable point with EQ.

Once an adequate voltage gap has been created, the sense amp is enabled with SE.

Positive feedback quickly forces the output to a stable operating point.

[Figure: cross-coupled latch between BL and !BL with equalization device EQ and enable signals SE and !SE, supplied from VDD.]

60

Charge-Redistribution Amplifier

[Figure: (a) concept, with transistor M1 controlled by Vref between node VL (capacitance Clarge) and node VS (capacitance Csmall), plus transistors M2 and M3; (b) transient response of Vin, VL, and VS, 0 to 2.5 V over 0 to 3 ns, for Vref = 3 V.]

61

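Charge-Redistribution Amplifier: EPROM

[Figure: read path from the EPROM array (WL, BL, capacitance CBL) through the column decoder (WLC, capacitance Ccol) and a cascode device biased by Vcasc to a load enabled by SE and supplied from VDD; output Out with capacitance Cout; transistors M1 to M4.]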

62

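Single-to-Differential Conversion

[Figure: a single-ended cell on WL and BL drives one input of a differential sense amp (Diff. S.A.); the other input is a reference voltage Vref; internal nodes x and !x; Output.]

How to make a good Vref?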

63

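Open bitline architecture with dummy cells

[Figure: latch sense amplifier (EQ, SE, !SE, VDD) between the left bit line BLL and the right bit line BLR; each side has cells with capacitance CS on word lines L0, L1, ... and R0, R1, ..., plus a dummy cell on its own word line (L, R).]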

64

DRAM Read Process with Dummy Cell

[Figure: three waveform plots, 0 to 3 V over t = 0 to 3 ns: BL and !BL when reading 0; BL and !BL when reading 1; and the control signals EQ, WL, and SE.]

65

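Voltage Regulator

[Figure: regulator circuit in which an amplifier with inputs VREF and VDL, biased by Vbias, controls Mdrive between VDD and the output VDL; an equivalent model is shown alongside.]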

66

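Charge Pump

[Figure: charge pump with CLK driving node A, pump capacitor Cpump between A and B, transistors M1 and M2, supply VDD, and output Vload on Cload; waveforms: VB swings between VDD - VT and 2VDD - VT, and Vload rises from 0 V.]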

67

DRAM Timing

68

RDRAM Architecture

[Figure: memory array with row and column paths, each made of a demux and a packet decoder, connected to the Bus, Clocks, and Data bus; bus widths labeled k and k x l.]

69

Address Transition Detection

[Figure: each address bit A0, A1, ..., AN-1 feeds a delay element td and a transition detector; the detector outputs share a node pulled up to VDD that produces ATD.]

70

Reliability and Yield

71

Sensing Parameters in DRAM

[Figure: CD (fF), CS (fF), QS (fC), Vsmax (mV), and VDD (V) versus memory capacity (bits/chip) from 4K to 64G, on a log scale from 10 to 1000.]

$Q_S = C_S V_{DD} / 2$

$V_{smax} = Q_S / (C_S + C_D)$

From [Itoh01]

72

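Noise Sources in 1T DRAM

[Figure: 1T cell with WL, BL, storage capacitor CS, and electrode, annotated with noise sources: leakage, α-particles, the substrate, word-line to bit-line coupling CWBL, and cross coupling Ccross to the adjacent BL.]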

73

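Open Bit-line Architecture: Cross Coupling

[Figure: sense amplifier with EQ between BL and !BL, which run on opposite sides; each side has cells (C) on WL0 and WL1 and a dummy cell on WLD; each bit line has capacitance CBL and word-line to bit-line coupling CWBL.]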

74

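Folded-Bitline Architecture

[Figure: sense amplifier with EQ connected to BL and !BL, which run side by side (nodes x, !x, y); cells (C) on WL0 and WL1 and dummy cells on WLD; each bit line has capacitance CBL and word-line to bit-line coupling CWBL.]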

75

Transposed-Bitline Architecture

[Figure: (a) straightforward bit-line routing and (b) transposed bit-line architecture; each shows BL and !BL of a sense amplifier (SA) coupled through Ccross to the neighbouring bit lines BL' and BL''.]

76

Alpha-particles (or Neutrons)

[Figure: α-particle striking the n+ storage node of a cell (WL, BL, VDD, SiO2) and generating positive and negative carriers along its track.]

1 particle ~ 1 million carriers

77

Yield

Yield curves at different stages of process maturity (from [Veendrick92])

78

Redundancy

[Figure: memory array with redundant rows and redundant columns; a fuse bank sits in the path of the row address and the column address (column decoder).]

79

Error-Correcting Codes

Example: Hamming Codes

[The parity-check equations are not recoverable from the source.]

e.g., B3 wrong: the checks evaluate to 1, 1, 0, which gives 3.
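Since the parity equations of the slide did not survive, the sketch below uses the standard Hamming(7,4) layout as an assumption: parity bits at positions 1, 2, and 4 and data bits B3, B5, B6, and B7. Under that assumption an error in B3 fails the first two checks and passes the third, so the syndrome reads 3, which is consistent with the example on the slide. The function names are illustrative.

```python
def encode(b3: int, b5: int, b6: int, b7: int) -> list[int]:
    """Build a 7-bit Hamming code word; list index i holds bit position i + 1."""
    p1 = b3 ^ b5 ^ b7
    p2 = b3 ^ b6 ^ b7
    p4 = b5 ^ b6 ^ b7
    return [p1, p2, b3, p4, b5, b6, b7]


def syndrome(word: list[int]) -> int:
    """Position (1..7) of a single-bit error, or 0 if all checks pass."""
    s1 = word[0] ^ word[2] ^ word[4] ^ word[6]
    s2 = word[1] ^ word[2] ^ word[5] ^ word[6]
    s4 = word[3] ^ word[4] ^ word[5] ^ word[6]
    return s1 + 2 * s2 + 4 * s4


def correct(word: list[int]) -> list[int]:
    """Return a copy of word with a single-bit error, if any, flipped back."""
    fixed = list(word)
    position = syndrome(word)
    if position:
        fixed[position - 1] ^= 1
    return fixed


if __name__ == "__main__":
    good = encode(1, 0, 1, 1)
    bad = list(good)
    bad[2] ^= 1                      # corrupt B3
    print(syndrome(bad), correct(bad) == good)
```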

80

Redundancy and Error Correction

81

Sources of Power Dissipation in Memories

[Figure: chip between VDD and VSS containing an array of m x n cells (m selected cells drawing iact each, m(n - 1) non-selected cells drawing ihld each), a row decoder (n CDE VINT f), a column decoder (m CDE VINT f), and the periphery (CPT VINT f and IDCP).]

$I_{DD} = \sum C_i \Delta V_i f + \sum I_{DCP}$

From [Itoh00]

82

Data Retention in SRAM

[Figure: leakage current Ileakage, from 100n to 1.30u, versus VDD, from 0 to 1.8 V, for 0.13 μm CMOS and 0.18 μm CMOS; the two curves differ by a factor of 7.]

SRAM leakage increases with technology scaling.

83

Suppressing Leakage in SRAM

[Figure: two schemes. Inserting extra resistance: sleep-controlled transistors (including a low-threshold transistor) separate the SRAM cells' internal rails VDD,int and VSS,int from the supplies. Reducing the supply voltage: the sleep signal switches the cells' VDD,int between VDD and a lower VDDL.]

84

Data Retention in DRAM

[Figure: current (A), 10^-6 to 10^1 on a log scale, versus capacity (bit) from 16M to 64G in steps of 4x, with the corresponding operating voltage (3.3, 2.5, 2.0, 1.5, 1.2, 1.0, 0.8 V) and extrapolated threshold voltage at 25 C (0.53, 0.40, 0.32, 0.24, 0.19, 0.16, 0.13 V); curves IACT, IAC, and IDC. Cycle time: 150 ns, T = 75 C.]

From [Itoh00]

85

Case Studies

Programmable Logic Array
SRAM
Flash Memory

86

PLA versus ROM

Programmable Logic Array: structured approach to random logic (“two level logic implementation”)
  NOR-NOR (product of sums)
  NAND-NAND (sum of products)

IDENTICAL TO ROM!

Main difference:
  ROM: fully populated
  PLA: one element per minterm

Note: the importance of PLAs has drastically reduced:
1. slow
2. better software techniques (multi-level logic synthesis)
But …

87

Programmable Logic Array

[Figure: pseudo-NMOS PLA with inputs X0, !X0, X1, !X1, X2, and !X2 feeding the AND-plane and outputs f0 and f1 from the OR-plane, with VDD and GND connections.]

88

Dynamic PLA

[Figure: dynamic PLA with inputs X0, !X0, X1, !X1, X2, and !X2, an AND-plane clocked by φAND, and an OR-plane clocked by φOR producing f0 and f1.]

89

Clock Signal Generation for self-timed dynamic PLA

[Figure: (a) clock signals φ, φAND, and φOR, with precharge time tpre and evaluation time teval; (b) timing generation circuitry built from dummy AND rows.]

90

PLA Layout

[Figure: layout with VDD and GND rails, an and-plane with inputs x0, !x0, x1, !x1, x2, and !x2, and an or-plane with outputs f0 and f1, each plane with its pull-up devices.]

91

4 Mbit SRAM
Hierarchical Word-line Architecture

[Figure: a global word line, a sub-global word line gated by the block group select, and local word lines gated by the block selects, driving the memory cells of Block 0, Block 1, Block 2, ...]

92

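Bit-line Circuitry

[Figure: bit-line pair (B/T) with bit-line load, equalization BEQ derived from block select and ATD, a memory cell on the local WL, column decoder devices (CD) connecting the bit lines to the I/O and !I/O lines, and the sense amplifier.]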

93

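Sense Amplifier (and Waveforms)

[Figure: sense amplifier on the I/O and !I/O lines with block select (BS), equalization SEQ derived from block select and ATD, outputs SA and !SA, Dei, and DATA; waveforms of Address, ATD, BEQ, SEQ, the I/O lines, SA and !SA, data-cut, and DATA between GND and Vdd.]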

94

1 Gbit Flash Memory

[Figure: two 512 Mb memory arrays, each with Block 0 to Block 1023, word line drivers, sense latches ((1024 + 32) x 8), and data caches ((1024 + 32) x 8); bit lines BL0 to BL16895 and BL16896 to BL33791; within a block, select lines SGD and SGS and word lines WL0 to WL31; a bit line control circuit (BLT0, BLT1) and I/O.]

From [Nakamura02]

95

Writing Flash Memory

[Figure: number of memory cells (log scale, 10^0 to 10^8) versus Vt of memory cells (0 V to 4 V). One panel shows the evolution of thresholds, including the result of 4 times program; the other shows the final distribution. Verify level = 0.8 V, word-line level = 4.5 V.]

From [Nakamura02]

96

125 mm2 1 Gbit NAND Flash Memory

[Figure: die photo, 10.7 mm x 11.7 mm, showing the 2 kB page buffer & cache, the charge pump, 16896 bit lines, and 32 word lines x 1024 blocks.]

From [Nakamura02]

97

125 mm2 1 Gbit NAND Flash Memory

Technology: 0.13 μm p-sub CMOS triple-well; 1 poly, 1 polycide, 1 W, 2 Al
Cell size: 0.077 μm2
Chip size: 125.2 mm2
Organization: 2112 x 8 b x 64 page x 1k block
Power supply: 2.7 V to 3.6 V
Cycle time: 50 ns
Read time: 25 μs
Program time: 200 μs / page
Erase time: 2 ms / block

From [Nakamura02]
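The organization figure can be checked against the stated 1 Gbit capacity with a short calculation. The split of each 2112-byte page into 2048 data bytes plus 64 spare bytes is an assumption made here for illustration; it is suggested by the 2 kB page buffer but is not stated in the table.

```python
BYTES_PER_PAGE = 2112     # from the organization line
BITS_PER_BYTE = 8
PAGES_PER_BLOCK = 64
BLOCKS = 1024

DATA_BYTES_PER_PAGE = 2048                               # assumed main-data area
SPARE_BYTES_PER_PAGE = BYTES_PER_PAGE - DATA_BYTES_PER_PAGE

total_bits = BYTES_PER_PAGE * BITS_PER_BYTE * PAGES_PER_BLOCK * BLOCKS
data_bits = DATA_BYTES_PER_PAGE * BITS_PER_BYTE * PAGES_PER_BLOCK * BLOCKS
bit_lines_per_array = BYTES_PER_PAGE * BITS_PER_BYTE

if __name__ == "__main__":
    print(total_bits)            # all cells, including the assumed spare area
    print(data_bits == 2 ** 30)  # main data area equals exactly 1 Gbit
    print(bit_lines_per_array)   # matches the 16896 bit lines per array
```

Under this assumption the main data area is exactly 2^30 bits, and the page width in bits equals the 16896 bit lines shown for each array.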

98

Semiconductor Memory Trends (up to the 90’s)

Memory size as a function of time: x 4 every three years

99

Semiconductor Memory Trends (updated)

From [Itoh01]

100

Trends in Memory Cell Area

From [Itoh01]

101

Semiconductor Memory Trends

Technology feature size for different SRAM generations
