Integrated Electronic Systems Lab
Advanced Digital Integrated Circuit
Design
Summer Term 2011
Prof. Dr.-Ing. Klaus Hofmann, M. Tech. Ashok Jaiswal
www.ies.tu-darmstadt.de
Contents
1. Introduction
2. Repetition MOS Transistors
3. Short Channel MOS
4. MOS Spice Model
5. CMOS Inverter
6. CMOS Technology
7. CMOS Logic
8. Passtransistor Logic
9. Memory Elements and Dynamic Logic
10. Performance
11. CAD and Design Flow
12. Digital Subsystem Design
13. FSM
14. ASIC Design Concepts
15. Programmable Logic Devices
16. Arithmetic Units
17. Microarchitectures
18. Semiconductor Memory
19. ASIC Design Guidelines
20. Testing
21. Future Trends
Exercises
Organisational (I)
• This lecture is intended for students of the following subjects:
– Industrial Engineering with Electrical Engineering (Wirtschaftsingenieurwesen Elektrotechnik; FB1, 5th semester or later)
– Electrical Engineering and Information Technology (Elektrotechnik und Informationstechnik; FB18, 5th semester or later)
– Computer Science (Informatik; FB20, after the Vordiplom)
– International Master Program Information & Communication Engineering
• Requirements: Electronics, Logic Design (i.e. the lecture "Elektronik" or "Analog Integrated Circuit Design")
• Courses that complement this lecture:
– Integrated Electronic Systems Lab. (SS)
– HDL-course and HDL-laboratory (2 weeks, full day course, SS)
– Computer Aided Design for Integrated Circuits (SS)
Organisational (II)
Lecture:
Tuesday 8:00 - 9:40 in room S3|06/052
Friday 8:00 - 9:40 in room S3|06/052
Practice: The exercises will take place within the lecture hours (Tue. or Fri., to be announced depending on progress).
Attending Staff:
Prof. Dr.-Ing. Klaus Hofmann
M.Tech. Ashok Jaiswal
Merckstrasse 25, 3rd floor
Information: You must register for this lecture and the exam using TUCAN. We will use the TUCAN messaging system to communicate.
Consultation hours: Directly after the lecture/exercise, or upon request
Exam
Examination:
Type: written exam
Date: will be announced by FB18 examination office
Duration: 90 minutes
Allowed materials to use: tbd
Relevant topics: Topics of lectures and exercises
You must register for this exam using TUCAN! (Some exceptions may apply, e.g. diploma students, or students from non FB-18/20 departments).
Overview
• Introduction
• Repetition MOS Devices
• CMOS Inverter
• CMOS Technology
• Static CMOS Logic
• Synchronous Logic
• Basic Sequential Circuits
• Performance
• CAD - Design Flow
• Digital Subsystem Design
• ASIC Design Concepts
• Arithmetic Units
• Micro Architectures
• Memories
• ASIC Design Guidelines
• Design for Testability
• VLSI in Signal Processing
• VLSI in Communications
• Digital Baseband Design
• Future Nanoscale CMOS
Literature
[1] John P. Uyemura: Fundamentals of MOS Digital Integrated Circuits, Addison Wesley, 1988
[2] John P. Uyemura: Circuit Design for CMOS VLSI, Kluwer Academic Publishers, 1992
[3] Neil Weste and Kamran Eshraghian: Principles of CMOS VLSI Design, Addison Wesley
[4] W. Maly: Atlas of IC Technologies: An Introduction to VLSI Processes, The Benjamin/Cummings Publishing Company, 1987
[5] Jan M. Rabaey: Digital Integrated Circuits - A Design Perspective, Prentice Hall, http://bwrc.eecs.berkeley.edu/Classes/IcBook/index.html
[6] Richard C. Jaeger: Microelectronic Circuit Design, McGraw-Hill
1. Introduction
SoC: Silicon Components Categories
Silicon components
• Integrated circuits
  – Analog and mixed signal
  – Logic: logic, gate arrays, cell based, FPLDs, SoC, other
  – Memory: DRAMs, SRAMs, Flash, other
  – Microcomponents: microprocessors, microcontrollers, microperipherals
• Discrete devices and optoelectronics
Modern SoCs can integrate different components
WW Semiconductor Sales 2008
Rank  Company             Origin         Revenue (Mio US$)  Market Share (%)
1     Intel Corp.         U.S.A.         33767              13.1
2     Samsung             South Korea    16902              6.5
3     Toshiba             Japan          11081              4.3
4     Texas Instrum.      U.S.A.         11068              4.3
5     STMicroelectronics  France/Italy   10325              4.0
6     Renesas             Japan          7017               2.7
7     Sony                Japan          6950               2.7
8     Qualcomm            U.S.A.         6477               2.5
9     Hynix               South Korea    6023               2.3
10    Infineon            Germany        5954               2.3
11    NEC Semi            Japan          5826               2.3
12    AMD                 U.S.A.         5455               2.1
13    Freescale           U.S.A.         4933               1.9
14    Broadcom            U.S.A.         4643               1.8
15    Panasonic           Japan          4473               1.7
16    Micron Tech         U.S.A.         4435               1.7
17    NXP                 Netherlands    4055               1.6
18    Sharp               Japan          3682               1.4
19    Elpida              Japan          3599               1.4
21    NVIDIA              U.S.A.         3241               1.3
24    Fujitsu Microelec   Japan          2757               1.1
      Top 25                             174464             67.5
      TOTAL                              258304             100.0
Foundries excluded (Revenue: TSMC: 10000 Mio US$, UMC: 3500 Mio US$)
Example 1: Commodity Microprocessor
Intel Core Duo (Penryn core), 2008
Application area: Mobile Computing, Desktop PC
Technology: 45nm
• Hafnium based High-k, Metal Gate
• lots of (> 6-9) levels of interconnect (Al, Cu)
• IP Block based design
800 Mio Transistors
Area: about 140 mm²
Selling price: at launch time about 150 US$
Example 2: Graphics DRAM
Qimonda 512Mbit GDDR5, 2008
Die size: 11326.74 µm × 9898 µm
Application area: high end graphic cards (ATI HD4870), up to 6 Gbit/s per pin (HD4870: 115 GB/s)
Technology: 75nm
• 3 metal layer interconnect (Al, W)
Area: 112 mm²
750 Mio Transistors
Selling price: at launch time about 8 US$
Example 3: Analog/Mixed Signal RF
Infineon E-Gold Radio, 2005
Application area: BB+RF part of entry-level mobile phone
• GSM/GPRS Quadband
• Support of Camera, Keyboard, 2 Displays, MP3 ...
• 1st chip that combines logic + RF
Technology: 130nm
Example 4: AMB
Infineon/Qimonda: Advanced Memory Buffer, 2006
[Figure: die photo with labelled blocks: High Speed Lanes, DDR2 interface (DQs), DDR2 interface (CAs), Digital Core Logic, PLL]
Application area: High bandwidth server memory buffer (DDR2)
Max transfer rate per digital pair: 4.8 Gb/s; overall: max 115 Gb/s
Technology: 130nm, 6 Cu + 1 Al layer
Area: 30.5 mm²
Power: 4-6W
Example 4: Power / Area
Infineon/Qimonda: Advanced Memory Buffer, 2006
                 Comparison object (∅ 180 mm)   AMB die
Power            1500 W                         6 W
Area             25400 mm²                      30.5 mm²
Power density    0.059 W/mm²                    0.196 W/mm²
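The power densities on this slide follow directly from power divided by area. The sketch below recomputes them from the slide's numbers; treating the 180 mm comparison object as a flat disc is an assumption, and the function name is illustrative only.

```python
import math

def power_density(power_w, area_mm2):
    """Return the power density in W/mm^2."""
    return power_w / area_mm2

# Comparison object: 1500 W over a disc of 180 mm diameter (slide values,
# disc shape assumed).
disc_area_mm2 = math.pi * (180.0 / 2) ** 2
disc_density = power_density(1500.0, disc_area_mm2)

# AMB die: 6 W over 30.5 mm^2 (slide values).
amb_density = power_density(6.0, 30.5)

print(f"disc area    : {disc_area_mm2:.0f} mm^2")
print(f"disc density : {disc_density:.3f} W/mm^2")
print(f"AMB density  : {amb_density:.3f} W/mm^2")
print(f"AMB / disc   : {amb_density / disc_density:.1f}x")
```

Under these assumptions the small die dissipates roughly three times more power per unit area than the 1500 W comparison object, which is the point of the slide.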
Status of Microelectronics Technology
[Figure: gate oxide thickness tOX (nm), threshold voltage Vt (V) and supply voltage Vdd (V) versus MOSFET channel length (µm)]
Future VLSI chip           2008           2011
CMOS feature size          0.035 µm       0.022 µm
Core voltage (V)           1.1 - 1.2 V    0.6 - 0.7 V
Transistors/cm²            100 M          40 M
DRAM bits/chip             4 G            8-16 G
Number of wiring levels    9              12-15
(Source: International Technology Roadmap for Semiconductors 2008 update)
Interconnect
Technology Requirements:
• Inductive effects will become increasingly important
• Additional metal patterns or ground planes for inductive shielding
• Thinner metallization
• Lower line-to-line capacitance
• Increasing pitch and thickness at each conductor level to alleviate the impact of interconnect delay
[Figure: cross-section of the interconnect stack with local, intermediate and global wiring levels; labelled layers: passivation, dielectric, etch stop layer, dielectric diffusion barrier, copper conductor with metal barrier liner, pre-metal dielectric, tungsten contact plug]
Source: SIA Roadmap 1999
Productivity Gap: Technology vs. CAD
Need to increase designers' productivity in order to make use of new technologies
[Figure: ITRS Roadmap for the Design Technology Requirements (today / near term)]
Productivity Gap: Beyond 2012
[Figure: ITRS Roadmap for the Design Technology Requirements (far term)]
ITRS roadmap
[Figure: ITRS Feature Size Projections. Feature size (nanometers, log scale from 0.1 to 1000000) versus year of first product shipment (1955 to 2050). Curves: µP channel length, DRAM half pitch, min Tox, max Tox. Size references: human hair thickness, eukaryotic cell, bacterium, virus, protein molecule, DNA molecule thickness, atom. Marker: "We are here".]
(Sources: 1994-2009 SIA/ITRS roadmaps, 1997 lecture by Gordon Moore)
ITRS roadmap
[Figure: roadmap chart with a "NOW" marker. Annotations: ".08 µm already available"; "Intel has verified 20 nm transistors in the lab".]
(Source: ITRS '08 roadmap)
Technology Scaling: Notation
• Historically, device feature length scales have decreased by ~12%/year.
  – So: feature length l ∝ 0.88^year ≡: ⇓
  – 1/l ∝ (1/0.88)^year ≈ 1.14^year ≡: ⇑
    • up 14%/year
• Meanwhile, typical CPU die diameters have increased by ~2.3%/year. (Less stable trend.)
  – Diameter ∝ 1.023^year ≡: ↑
  – 1/Diameter ∝ 0.978^year ≡: ↓
• Quantities that are constant over time are written as ∝ 1 ≡: •
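The arrow notation is shorthand for yearly growth factors, so it can be handled as plain multiplication. The following sketch, using only the two rates quoted above (0.88 and 1.023), derives the inverse rates and the time it takes the feature length to halve; the helper name is illustrative.

```python
import math

FEATURE_SHRINK = 0.88   # feature length factor per year (double down arrow)
DIE_GROWTH = 1.023      # die diameter factor per year (single up arrow)

def years_to_change(rate, factor):
    """Years until a quantity scaling by `rate` per year has changed by `factor`."""
    return math.log(factor) / math.log(rate)

inverse_length_growth = 1.0 / FEATURE_SHRINK   # double up arrow, about 1.14
inverse_die_growth = 1.0 / DIE_GROWTH          # single down arrow, about 0.978
halving_time = years_to_change(FEATURE_SHRINK, 0.5)

print(f"1/l grows by       {inverse_length_growth:.3f} per year")
print(f"1/diameter scales  {inverse_die_growth:.3f} per year")
print(f"feature length halves every {halving_time:.1f} years")
```

Products of arrows on the following slides are simply products of these per-year factors.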
Resistance Scaling
• Fixed-shape wire (any shape): R ∝ l/wt ∝ ⇓/⇓⇓ = ⇑
  – All dimensions scaling equally.
  – E.g. a local interconnect in a small scaled logic block / functional unit
• Constant-length thin wire: R ∝ •/⇓⇓ = ⇑⇑
• Thin cross-chip wire: R ∝ ↑/⇓⇓ = ⇑⇑↑ !
  – Up 33%/year!
  – Long-distance wires have to be extra thick to be fast
    • But, fewer thick wires can fit!
[Figure: wire segment of length l, width w and thickness t, with current flow along l]
Capacitance Scaling
• Fixed-shape structure (any): C ∝ lw/s ∝ ⇓⇓/⇓ = ⇓
  – E.g. scaled devices/wires
• Per unit wire length:
  – C ∝ •w/s ∝ •⇓/⇓ ∝ • (constant)
• Cross-chip thin wire: C ∝ ↑
• Per unit area: C ∝ ••/s ∝ ⇑
  – E.g., total on-chip cap./cm²
[Figure: capacitor geometry with width w and separation s]
Some 1st-order Semiconductor Scaling Laws
• Voltages V ∝ ⇓ (due to e.g. punch-through)
• Long-term: temperature T ∝ ⇓ (prevents leakage)
• Resistance:
  – Fixed-shape wire: R ∝ l/wt ∝ ⇓/⇓⇓ = ⇑
  – Thin cross-chip wire: R ∝ ↑/⇓⇓ = ⇑⇑↑
• Capacitance:
  – Fixed-shape structure: C ∝ lw/s ∝ ⇓⇓/⇓ = ⇓
  – Per unit wire length: C ∝ • (constant)
  – Cross-chip wire: C ∝ ↑
  – Per unit area: C ∝ 1/s ∝ ⇑
Why Voltage Scaling?
• For many years, logic voltages were maintained at fairly constant levels as transistors shrank
  – TTL 5V logic was standard for many years
  – later 3.3 V, now: ~1V within leading-edge CPUs
• Further shrinkage w/o voltage scaling is no longer possible, due to various effects:
  – Punch-through
  – Device degradation from hot carriers
  – Gate-insulator failure
  – Carrier velocity saturation
• In general, things break down at high field strengths
  – constant-field voltage scaling may be preferred
Punch-Through
[Figure: n-p-n structure with gate electrode and bias Vbias, showing the electron distribution at zero bias, moderate bias, strong bias and very strong bias]
Need for Voltage Scaling
[Figure: the same structure at two sizes under the same Vbias]
Smaller size & same voltage → higher electric field strengths → easier punch-through
Long-term Temperature Scaling?
• Sub-threshold power dissipation across “off” transistors is based on the leakage current density ∝ exp(−Vt / φT)
  – Vt is the threshold voltage
    • Must scale down with Vdd, or else the transistor can’t turn on!
  – φT is the thermal voltage at temperature T
    • Equal to kBT/q, where q is the electron charge magnitude
    • Voltage spread of individual electrons from thermal noise
• As voltages decrease,
  – leakage power will dominate, devices will become unable to store charge
• Unless (eventually), T ∝ V ∝ l ∝ ⇓
  – Unfortunately, lower T → fewer charge carriers!
• Only alternative to low T: Scaling halts!
  – Probably what must happen, because low temps. imply slow rate of quantum evolution.
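To get a feeling for the exponential leakage term, the sketch below evaluates the thermal voltage kBT/q and the factor exp(−Vt/φT). The threshold voltages and temperatures are illustrative assumptions, not values from the lecture, and the function names are hypothetical.

```python
import math

K_B = 1.380649e-23      # Boltzmann constant in J/K
Q_E = 1.602176634e-19   # elementary charge in C

def thermal_voltage(temp_k):
    """Thermal voltage phi_T = kB*T/q in volts."""
    return K_B * temp_k / Q_E

def leakage_factor(v_t, temp_k):
    """Relative sub-threshold leakage exp(-Vt/phi_T) (dimensionless)."""
    return math.exp(-v_t / thermal_voltage(temp_k))

# Assumed example values: Vt = 0.4 V and 0.2 V at 300 K, and 0.2 V at 150 K.
for v_t, temp in [(0.4, 300.0), (0.2, 300.0), (0.2, 150.0)]:
    print(f"Vt = {v_t} V, T = {temp} K: "
          f"phi_T = {thermal_voltage(temp) * 1e3:.2f} mV, "
          f"factor = {leakage_factor(v_t, temp):.2e}")
```

Halving Vt at constant temperature raises the factor by orders of magnitude, while halving T together with Vt leaves it unchanged, which is the T ∝ V argument made above.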
Delay Scaling
• Charging time delay t ∝ RC:
  – Through fixed shape conductor: RC ∝ ⇑⇓ = •
  – Thin constant-length wire: RC ∝ ⇑⇑
  – Via cross-die thin wire: RC ∝ ⇑⇑↑·↑ = up 36%/yr!
  – Through a transistor: RC ∝ •·⇓ = ⇓
• Implications:
  – Transistors increasingly faster than long thin wires.
  – Even becoming faster than fixed-shape wires!
  – Local communication among chip elements is becoming increasingly favored!
Performance scaling
• Performance characteristics:
  – Clock frequency for small, transistor-delay-dominated local structures: f ∝ 1/t ∝ ⇑ (up 14%/yr)
  – Transistor density (per area): d = 1/⇓⇓ = ⇑⇑
  – Perf. density RA = fd = ⇑⇑⇑; chip area: A ∝ ↑↑
  – Total raw performance (local transitions / chip / time): R = fdA = ⇑⇑⇑↑↑ = 1.55^year
    • Increases 55% each year!
    • Nearly doubles every 18 months (like Moore’s Law).
• Raw performance has (in the past) been harnessed for improvements in serial microprocessor performance.
• Future architectures will need to move to more parallel programming models to fully use further improvements.
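The quoted yearly rates can be cross-checked by multiplying the per-year factors behind the arrows. This sketch uses the rounded values from the notation slide (1.14 for ⇑ and 1.023 for ↑); the variable names are illustrative.

```python
UP2 = 1.14    # double up arrow: 1/feature length, per year (rounded slide value)
UP1 = 1.023   # single up arrow: die diameter, per year

# Thin cross-chip wire resistance: two double-up arrows times one single-up arrow.
cross_chip_resistance = UP2 * UP2 * UP1
# Cross-die wire delay: the resistance above times capacitance (single-up arrow).
cross_die_delay = cross_chip_resistance * UP1
# Raw performance R = f * d * A: three double-up and two single-up arrows.
raw_performance = UP2 ** 3 * UP1 ** 2
# Growth over 18 months, i.e. 1.5 years.
growth_18_months = raw_performance ** 1.5

print(f"cross-chip R  : +{(cross_chip_resistance - 1) * 100:.0f} %/year")
print(f"cross-die RC  : +{(cross_die_delay - 1) * 100:.0f} %/year")
print(f"raw perf.     : +{(raw_performance - 1) * 100:.0f} %/year")
print(f"18-month gain : x{growth_18_months:.2f}")
```

With these inputs the products reproduce the 33 %, 36 % and 55 % figures quoted on the slides, and the 18-month factor comes out just below 2.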
Charges & Currents
• Charges & fields:
  – Charge on a structure: Q = CV ∝ ⇓⇓
  – Surface charge density: Q/A ∝ •
  – Electric field strengths: E = V/l ∝ •
• Currents:
  – Peak current densities: J = E/ρ ∝ • (resistivity ρ: constant)
  – Peak current in a wire: I = JA ∝ ⇓⇓
  – Channel-crossing times: t = l/v ∝ ⇓
    • Due to constant e− saturation velocity v, on the order of 10^5 m/s (100 km/s)
  – Current in an on-transistor: I = Q/t ∝ ⇓⇓/⇓ = ⇓
  – Effective trans. on-resistance: R = V/I ∝ ⇓/⇓ = •
    • ~4-20 kΩ is typical for a min-sized transistor
Interconnect Scaling
• Since transistor delay dt scales as ⇓,
• And wire delay dw (w. scaled cross-section size) for a wire of length l scales as RC ∝ (l/wt)(lw/s) = l²/st ∝ l²/⇓⇓ = l²⇑⇑,
• Then to keep dw < dt (1-cycle access) requires:
  l²⇑⇑ < ⇓
  l² < ⇓/⇑⇑ = ⇓⇓⇓
  l < ⇓^(3/2)
• So wire length in units of transistor length lt is l/lt < ⇓^(3/2)/⇓ = ⇓^(1/2) (down 6%/year)
• So number of devices accessible within a constant × dt in 2-D goes as (⇓^(1/2))² = ⇓, in 3-D as (⇓^(1/2))³ = ⇓^(3/2).
– Circuits must be increasingly local.
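The locality argument can be put into numbers with the 0.88 yearly shrink factor. The sketch below evaluates the one-cycle reach in units of transistor length and the resulting count of reachable devices in 2-D and 3-D; the variable names are illustrative.

```python
import math

SHRINK = 0.88   # feature length factor per year (double down arrow)

# One-cycle wire reach in units of transistor length scales as SHRINK^(1/2).
reach = math.sqrt(SHRINK)
# Devices reachable within a constant number of gate delays.
devices_2d = reach ** 2
devices_3d = reach ** 3

print(f"reach (in transistor lengths): {reach:.3f} per year "
      f"({(1 - reach) * 100:.0f} % down)")
print(f"reachable devices, 2-D: {devices_2d:.3f} per year")
print(f"reachable devices, 3-D: {devices_3d:.3f} per year")
```

The reach shrinks by about 6 % per year, so the number of devices reachable in one cycle falls by about 12 % per year in a planar layout and faster still in a stacked one.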
Energy and Power
• Energy:
  – Energy on a structure: E ∝ QV ∝ CV² ∝ ⇓⇓² = ⇓³
  – Energy per-area: EA ∝ CV²/A ∝ ⇓³/⇓² = ⇓
  – Energy densities: E/l³ ∝ ⇓³/⇓³ ∝ • (not a problem)
• Power levels:
  – Per-area power: PA = EA·f ∝ ⇓⇑ = • (not a problem)
  – Power per die: P = PA·A ∝ ↑↑ (up ~5%/year)
• Power-per-performance: PA/RA = •/⇑⇑⇑ = ⇓⇓⇓
• But, if constant-field scaling is not used (and it has not been, very much, and cannot be much further) all the above scaling rates get increased by the square of the field strength (F) scaling rate.
– Because V ∝ F·l, and E and P scale with V2.
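A short numeric sketch of these rates follows. The constant-field part uses only the 0.88 and 1.023 factors from the notation slide; the constant-voltage case at the end is an added illustration, assuming the field then grows as 1/l.

```python
DN2 = 0.88          # double down arrow: feature length factor per year
UP2 = 1.0 / DN2     # double up arrow
UP1 = 1.023         # single up arrow: die diameter factor per year

# Constant-field scaling (V proportional to l).
energy_per_structure = DN2 ** 3               # E ~ C V^2
energy_per_area = energy_per_structure / DN2 ** 2
per_area_power = energy_per_area * UP2        # P_A = E_A * f
die_power = per_area_power * UP1 ** 2         # P = P_A * A

def with_field_scaling(rate, field_rate):
    """Rate of a V^2-proportional quantity when the field scales by field_rate."""
    return rate * field_rate ** 2

# Assumed illustration: constant supply voltage, so the field grows as 1/l.
per_area_power_const_v = with_field_scaling(per_area_power, UP2)

print(f"energy per structure : {energy_per_structure:.3f} per year")
print(f"per-area power       : {per_area_power:.3f} per year")
print(f"power per die        : {die_power:.4f} per year")
print(f"per-area power, constant V: {per_area_power_const_v:.3f} per year")
```

Under constant-field scaling the per-area power stays flat and die power rises by roughly 5 % per year; in the assumed constant-voltage case the per-area power would instead grow by about 29 % per year.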
3-D Scalability?
• Consider stacking circuits in 3-D within a constant volume.
• # of layers n: •/thickness ∝ •/⇓ ∝ ⇑
• Total power: PT = P(flat chip) × n ∝ •⇑ = ⇑
• Enclosing surface area AE: •
• Power flux (if not recycled): PT/AE = ⇑/• = ⇑
– For this to be possible, coolant velocity &/or thermal conductivity must also increase as ⇑!
• Probably not feasible.
• Power recycling is needed to scale in 3-D!
Types of Limits
• Meindl ‘95 identifies several kinds of limits on VLSI (from most to least fundamental):
  – Theoretical limits (focus on energy & delay)
    • Fundamental limits (such as we already discussed)
    • Material limits (dependent on materials used)
    • Device limits (dependent on structure & geometry)
    • Circuit limits (dependent on circuit styles used)
    • System limits (dependent on architecture & packaging)
  – Practical limits
    • Design limits
    • Manufacturing limits
Dielectric Constants
• Dielectric constants κ = ε/ε0 = C/C0. κSiO2 ≈ 4
  – Want high κ in thin gate dielectrics,
    • To maximize channel surface-charge density, & thus on-current, for given VG,on,
    • But avoid very low thickness w. high tunneling leakage.
    • But, material must also be an insulator! (κSrTi = 310!)
  – Want low κ for thick interconnect (“field”) insulators
    • To minimize parasitic C and delay of interconnects
    • Lowest κ possible is that of vacuum (1). Air is close.
  – High-κ dielectrics under development, used in recent Intel processes
Some Device Limits
• MOSFET channel length
  – Generally, the lower, the better!
    • Reduces load capacitance & thus load charging time.
  – But, lengths are lower-bounded by the following:
    • Manufacturing limits, such as lithography wavelengths.
    • Supply voltage lower-limits to keep a decent Ion/Ioff.
    • Depletion region thickness due to dopant density limits.
    • Yield, in the face of threshold variation due to statistical fluctuation in dopant concentrations.
    • Source-to-drain tunneling.
• Distributed RC network response time
  – Limited by:
    • ρ of wires (e.g. the recent shift from Al to Cu)
    • κ of insulators (at most, 4x less than SiO2 is possible)
    • Widths, lengths of wires: limited by basic geometry
Circuit Limits
• Power supply voltage limits
• Switching energy limits
• Gate delays:
  – Fundamentally limited by transistor characteristics, RC network charging times
    • each of which is limited as per previous slide
  – There is a fastest possible logic gate in any given device technology
    • esp. considering it has to be switched by similar gates
  – Static CMOS & its close relatives (precharged domino, NORA) are probably close to the fastest-possible gates using CMOS transistors in a given tech. generation.
System Limits
• Architectural limits
• Power dissipation
• Heat removal capability of packaging
• Cycle time requirements
• Physical size
Design & Design-Verification Limits
• Increasing complexity (# of devices/chip) leads to continual new challenges in:
  – Design organization
    • modularity vs. efficiency
  – Automatic circuit synthesis & layout
    • circuit optimization
  – Design verification
    • layout-vs-schematic
    • logic-level simulation
    • analog (e.g. SPICE) modeling
  – Testing and design-for-testability
    • test coverage
Manufacturing Limits
See the ITRS ‘10 roadmap for these.
• Lithography resolution, tools
• Dopant implantation techniques
• Process changes for new device structures
• Assembly & packaging
• Yield enhancement
• Environmental / safety / health considerations
• Metrology (measurement)
• Product cost & factory cost
Possible Endpoints for Electronics
• Merkle’s minimal “quantum FET”
• Mesoscale nanoelectronic devices based on metal or semiconductor “islands”
– E.g. Single-electron transistors, quantum dots, resonant tunneling transistors.
• Various organic molecular electronic devices
– diodes, transistors
• Inorganic atomic-scale devices
– 1-atom-wide chains of conductor/semiconductor atoms precisely positioned on/in substrates
• Superconducting devices
Energy Limits in Electronics
• Origin of CV²/2 switching energy dissipation
• Thermal reliability bounds on CV² scaling
– Voltage limits
– Capacitance limits
• Leakage trends in MOSFETs
Challenge: System-on-a-Chip Design ?
[Figure: design complexity and design productivity versus year (1975 to 2000). Design abstractions: polygons and masks, transistors, gates, RTL, system on a chip. Productivity steps: place & route, synthesis, reuse and IP cores.]
Chasing the design gap
Traditional ASIC market
• ASICs are customer specific ICs
• If application-specific processor: ASIP
• The product is made only once an application is found
Non-standard IC
• Semicustom: one or more customised layers
• Custom: all layers customised
• Programmable: circuit with fuse, antifuse or memory that can be programmed
ASIP: application specific; ASIC: customer specific
Market for Systems-on-a-Chip
Area Examples:
• Multimedia
• Mobile Communication
• Automotive
• ...
-> Domain Specific Computing
[Figure: a configurable, multi-standard "Info Plug" SoC connected to WWW, LAN and broadband network services. Labels: Java; MPEG 4-7, 100 Gop/s, 5 Gtr/s, 10 Watt; 100 Mb/s WLAN, < 1 Watt; RF, 20 Gop/s.]
Source: Hugo De Man, EIS'99, Darmstadt
2. Repetition Transistor Models
Structure of MOSFET
[Figure: NMOS cross-section on a p-type substrate with n+ source (S) and drain (D) regions, gate (G) over the channel region of length L, and body (B) terminal; terminal voltages vS, vG, vD, vB and currents iS, iG, iD, iB; circuit symbol with terminals D, G, S, B]
MOSFET - Current through the channel region is controlled by the gate voltage vG
Inversion
• The bulk has to have the lowest potential to ensure reverse biased pn-junctions (no current must flow between drain/source and bulk!)
• VSB = 0 → in the following we relate all voltages to the source voltage
• VGS > VT → n-channel is induced (blue area between drain and source).
• White area → depletion region
• A current can flow between drain and source, if VDS > 0
• Because the MOSFET is a symmetrical device, source and drain have to be defined: for an n-channel FET, the source is always the terminal with the lower potential!
Ohmic region
• Increasing VDS to a value VDS > 0 leads to a current ID.
• Near the drain the voltage responsible for the inversion is (VGS - VT) - VDS and thus smaller than near the source.
• The channel acts like a linear resistor - that’s why this region of operation is called ohmic.
[Figure: drain-source current (A, 0 to 8.00e-4) versus drain-source voltage (V, 0.0 to 0.8) for VGS = 2 V, 3 V, 4 V and 5 V]
In this region: iDS ∼ vDS ⇒ Ron
0.5kΩ < Ron < 10kΩ
Pinch-off
• If VDS rises to the point where it equals VGS - VT, there is no voltage near the drain to induce an inversion layer - the channel is pinched off at the drain.
Saturation
• Further increasing VDS causes the pinch-off point to move in the direction of the source.
• The voltage at the pinch-off point is always VGS - VT.
• When the electrons coming from the source reach the pinch-off point, they are injected into the depleted region, and the electric field in this region sweeps the electrons from the pinch-off point to the drain.
Output Characteristics
[Figure: drain-source current (A, 0 to 2.20e-4) versus drain-source voltage (V, 0 to 12) for VGS = 2 V, 3 V, 4 V, 5 V and VGS < 1 V; the pinch-off locus separates the linear region from the saturation region]
• VT = 1V
Channel Length Modulation
Transfer Characteristics and Depletion Mode MOSFET
• Transfer characteristics: plot of drain current versus gate-source voltage for a fixed drain-source voltage
• If the threshold voltage of an NMOS transistor is negative → depletion mode MOSFET (there exists an implanted n-type channel region)
[Figure: drain-source current (µA, 0 to 250) versus gate-source voltage (V, -4 to 6) for a depletion-mode device (VTN = -2 V) and an enhancement-mode device (VTN = +2 V)]
[Figure: depletion-mode NMOS cross-section on a p-type substrate with n+ source (S) and drain (D), gate (G), body (B) and implanted n-type channel region of length L]
P-channel MOSFET (PMOS)
[Figure: PMOS cross-section on an n-type substrate with p+ source and drain regions, gate over the channel region of length L, and body terminal; bias polarities vB > 0, vG < 0, vD < 0]
[Figure: source-drain current (A) versus source-drain voltage (V) for VSG = 2 V, 3 V, 4 V, 5 V (VGS = -2 V, -3 V, -4 V, -5 V) and VSG < 1 V (VGS > -1 V)]
                    NMOS Device    PMOS Device
Enhancement-mode    VTN > 0        VTP < 0
Depletion-mode      VTN < 0        VTP > 0
IEEE Standard MOS Transistor Circuit Symbols
[Figure: circuit symbols with terminals D, G, S and (where shown) B]
(a) NMOS enhancement-mode device
(b) PMOS enhancement-mode device
(c) NMOS depletion-mode device
(d) PMOS depletion-mode device
(e) Three-terminal NMOS transistor
(f) Three-terminal PMOS transistor
Summary of MOS Equations
From NMOS to PMOS: Signs of all voltages change
[Figure: NMOS symbol with drain current iDS and PMOS symbol with current iSD]
MOS Capacitances - Linear Region
[Figure: NMOS device in the linear region: p-type substrate, n+ source and drain, n-type channel; capacitances C'OL (gate overlap at source and drain), C"OX (gate to channel, split between source and drain), CSB and CDB]
The channel shields the bulk electrode from the gate since the inversion layer acts as a conductor between drain and source.
MOS Capacitances - Saturation
[Figure: NMOS device in saturation: p-type substrate, n+ source and drain, n-type channel; capacitances C'OL, C"OX, CSB and CDB]
The channel shields the bulk electrode from the gate since the inversion layer acts as a conductor between drain and source. The channel is pinched off and does not contact the drain n+ region.
MOS Capacitances - Cutoff
[Figure: NMOS device in cutoff: p-type substrate, n+ source and drain, depletion region under the gate; capacitances C'OL, CGB, CSB and CDB]
The gate-bulk capacitance consists of the gate capacitance in series with the depletion capacitance of the depletion region.
Small-Signal Models for Field-Effect Transistors (I)
[Figure: the MOSFET represented as a two-port network with input vgs, ig and output vds, id]
- Considering the MOSFET as a three-terminal device.
- Small-signal model of the MOSFET is based on the y-parameter two-port network.
Small-Signal Models for Field-Effect Transistors
[Figure: small-signal model for the three-terminal MOSFET: controlled current source gm·vgs in parallel with output resistance ro between D and S; the gate G is open]
Body Effect in the Four-Terminal MOSFET
[Figure: small-signal model for the four-terminal MOSFET with terminals G, D, S, B: current sources gm·vgs and gmb·vbs in parallel with ro]
A second voltage-controlled current source has been added to model the back-gate transconductance gmb.
High-Frequency MOSFET Small Signal Model
[Figure: four-terminal small-signal model (gm·vgs, gmb·vbs, ro) extended by the capacitances CGD, CGS, CGB, CBD, CBS and the series resistances RD, RS between the internal nodes D*, S* and the terminals D, S]
        Cutoff      Ohmic                        Saturation
CGD     COX·W·LD    COX·W·LD + (1/2)·W·L·COX     COX·W·LD
CGS     COX·W·LD    COX·W·LD + (1/2)·W·L·COX     COX·W·LD + (2/3)·W·L·COX
CBG     W·L·COX     0                            0
CBD     CBD1        CBD1 + CBC1/2                CBD1
CBS     CBS1        CBS1 + CBC1/2                CBS1 + (2/3)·CBC1
LD: gate-to-drain or gate-to-source overlap due to underdiffusion
3. Short Channel Effects on MOS Transistors
Overview
• Short Channel Devices
• Velocity Saturation Effect
• Threshold Voltage Variations
• Hot Carrier Effects
• Process Variations
(Source: Jan M. Rabaey, Digital Integrated Circuits)
Short Channel Devices
• As technology scaling reaches channel lengths below one micron (L < 1 µm), second-order effects that were ignored in devices with long channels (L > 1 µm) become very important.
• MOSFETs with such dimensions are called "short channel devices".
• The main second-order effects are: Velocity Saturation, Threshold Voltage Variations and Hot Carrier Effects.
[Figure: cross-section of a short-channel NMOS transistor (L < 1 µm) on a p-substrate: n+ source and drain, gate oxide, polysilicon gate, field oxide (SiO2), p+ stopper]
Velocity Saturation Effect (I)
• Review of the classical derivation of the drain current, for VGS > VT and VDS << VGS:
• Induced channel charge at V(x):
  Qi(x) = −COX[VGS − V(x) − VT] (1)
• The current is given as a product of the drift velocity of the carriers vn and the available charge:
  ID = −vn(x)·Qi(x)·W (2)
[Figure: MOS transistor and its bias conditions: n+ source (S) and drain (D) in a p-substrate, gate (G), bulk (B), applied VGS and VDS, drain current ID, channel voltage V(x) along the position x from 0 to L]
Velocity Saturation Effect (II)
• The electron velocity is related to the electric field through the mobility:
  vn = −µn·Ε(x) = µn·dV/dx (3)
• Combining (1) and (3) in (2):
  ID·dx = µn·COX·W·(VGS − V(x) − VT)·dV (4)
• Integrating (4) from 0 to L yields the voltage-current relation of the transistor:
  ID = µn·COX·(W/L)·[(VGS − VT)·VDS − VDS²/2] (5)
• The behavior of short channel devices deviates considerably from this model.
• Eq. (3) assumes the mobility µn to be a constant, independent of the value of the electric field Ε.
• At high electric fields, carriers fail to follow this linear model.
• This is due to the velocity saturation effect.
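As a quick plausibility check of Eq. (5), the sketch below compares the closed-form expression with a numerical integration of Eq. (4). The parameter values (k_n = µn·COX·W/L = 100 µA/V², VT = 1 V) are assumed for illustration only, and the function names are hypothetical.

```python
import math

def drain_current_linear(v_gs, v_ds, v_t, k_n):
    """Eq. (5): ID = k_n * [(VGS - VT)*VDS - VDS^2/2], with k_n = µn*COX*W/L."""
    return k_n * ((v_gs - v_t) * v_ds - v_ds ** 2 / 2.0)

def drain_current_integrated(v_gs, v_ds, v_t, k_n, steps=1000):
    """Midpoint-rule integration of Eq. (4) over the channel voltage 0..VDS."""
    dv = v_ds / steps
    total = 0.0
    for i in range(steps):
        v = (i + 0.5) * dv              # channel voltage at the step midpoint
        total += (v_gs - v - v_t) * dv
    return k_n * total

# Assumed example operating point.
closed_form = drain_current_linear(2.5, 0.5, 1.0, 100e-6)
numeric = drain_current_integrated(2.5, 0.5, 1.0, 100e-6)
print(f"closed form: {closed_form * 1e6:.2f} µA, integrated: {numeric * 1e6:.2f} µA")
```

Both routes give the same current because the integrand of Eq. (4) is linear in V; the expression only applies while VDS stays below VGS − VT.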
Velocity Saturation Effect (III)
• When the electric field reaches a critical value ΕC (1.5×10^6 V/m for p-type silicon), the velocity of the carriers tends to saturate (10^5 m/s for silicon) due to scattering effects.
[Figure: carrier velocity vn (m/s) versus electric field Ε (V/µm): constant mobility (slope = µ) below ΕC = 1.5 V/µm, constant velocity vsat = 10^5 m/s above]
Velocity Saturation Effect (IV)
• The velocity as a function of the electric field, plotted in the last figure, can be approximated by:
  v = µn·Ε / (1 + Ε/ΕC) for Ε ≤ ΕC (6)
  v = vsat for Ε ≥ ΕC
• The impact of this effect on the drain current of a MOSFET operating in the linear region is obtained by reevaluating (1) and (2) using (6):
  ID = κ(VDS)·µn·COX·(W/L)·[(VGS − VT)·VDS − VDS²/2] (7)
  with: κ(VDS) = 1 / (1 + VDS/(ΕC·L))
• For large values of L or small values of VDS, κ approaches 1 and (7) reduces to (5).
• For short channel devices κ < 1 and the current is smaller than what would be expected.
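The degradation factor κ is easy to evaluate for concrete channel lengths. The sketch below uses the critical field of 1.5×10^6 V/m quoted earlier; the channel lengths, VDS, and the gain factor k_n = µn·COX·W/L are assumed example values, and the function names are hypothetical.

```python
E_C = 1.5e6   # critical field in V/m (value quoted on the previous slide)

def kappa(v_ds, e_c, length):
    """Degradation factor kappa(VDS) = 1 / (1 + VDS/(EC*L))."""
    return 1.0 / (1.0 + v_ds / (e_c * length))

def drain_current_velsat(v_gs, v_ds, v_t, k_n, e_c, length):
    """Eq. (7): Eq. (5) scaled by kappa(VDS); k_n = µn*COX*W/L."""
    return kappa(v_ds, e_c, length) * k_n * ((v_gs - v_t) * v_ds - v_ds ** 2 / 2.0)

# Assumed example: same VDS = 0.5 V for a 10 µm and a 0.25 µm channel.
kappa_long = kappa(0.5, E_C, 10e-6)
kappa_short = kappa(0.5, E_C, 0.25e-6)
print(f"kappa, L = 10 µm  : {kappa_long:.3f}")
print(f"kappa, L = 0.25 µm: {kappa_short:.3f}")
print(f"ID, L = 0.25 µm   : "
      f"{drain_current_velsat(2.5, 0.5, 1.0, 100e-6, E_C, 0.25e-6) * 1e6:.1f} µA")
```

For the long channel κ stays close to 1, so Eq. (7) is practically Eq. (5); for the short channel the same bias already costs more than half of the current.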
Velocity Saturation Effect (V)
• When increasing the drain-source voltage, the electric field reaches the value ΕC, and the carriers at the drain become velocity saturated. Assuming that the drift velocity is saturated, from (4) with µn·dV/dx = vsat the drain current is:
  IDSAT = vsat·COX·W·(VGS − VT − VDSAT) (8)
• Evaluating (7) with VDS = VDSAT:
  IDSAT = κ(VDSAT)·µn·COX·(W/L)·[VGT·VDSAT − VDSAT²/2] (9)
• Where VGT is a short notation for VGS − VT.
• Equating (8) and (9) and solving for VDSAT:
  VDSAT = κ(VGT)·VGT (10)
• For a short channel device and large enough values of VGT, κ(VGT) is smaller than 1, hence the device enters saturation before VDS reaches VGS − VT.
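The result (10) can be checked numerically by inserting VDSAT into (8) and (9). The sketch below does so under one explicit assumption that the slides do not state: vsat = µn·ΕC/2, which makes the two branches of (6) meet at Ε = ΕC. All device parameters are illustrative values, and the function names are hypothetical.

```python
import math

def kappa(v, e_c, length):
    """kappa(V) = 1 / (1 + V/(EC*L))."""
    return 1.0 / (1.0 + v / (e_c * length))

def v_dsat(v_gt, e_c, length):
    """Eq. (10): VDSAT = kappa(VGT) * VGT."""
    return kappa(v_gt, e_c, length) * v_gt

def i_dsat_eq8(v_gt, v_ds, v_sat, c_ox, width):
    """Eq. (8): IDSAT = vsat*COX*W*(VGT - VDSAT)."""
    return v_sat * c_ox * width * (v_gt - v_ds)

def i_dsat_eq9(v_gt, v_ds, mu_n, c_ox, width, length, e_c):
    """Eq. (9): Eq. (7) evaluated at VDS = VDSAT."""
    return (kappa(v_ds, e_c, length) * mu_n * c_ox * width / length
            * (v_gt * v_ds - v_ds ** 2 / 2.0))

# Assumed example parameters (SI units), not taken from the lecture.
MU_N, E_C = 0.05, 1.5e6           # mobility in m^2/Vs, critical field in V/m
V_SAT = MU_N * E_C / 2.0          # assumption: continuity of Eq. (6) at EC
C_OX, W, L = 6e-3, 0.375e-6, 0.25e-6
V_GT = 2.0

vd = v_dsat(V_GT, E_C, L)
current_8 = i_dsat_eq8(V_GT, vd, V_SAT, C_OX, W)
current_9 = i_dsat_eq9(V_GT, vd, MU_N, C_OX, W, L, E_C)
print(f"VDSAT = {vd:.3f} V (VGT = {V_GT} V)")
print(f"Eq. (8): {current_8 * 1e6:.1f} µA, Eq. (9): {current_9 * 1e6:.1f} µA")
```

With this assumption both expressions agree at VDSAT = κ(VGT)·VGT, and VDSAT lies far below VGT for the short channel, so the device saturates early.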
Velocity Saturation Effect (VI)
[Figure: ID versus VDS at VGS = VDD for a long-channel device (saturating at VGS − VT) and a short-channel device (saturating at VDSAT)]
Short channel devices display an extended saturation region due to velocity saturation
Simplified model for hand calculations (I)
A substantially simpler model can be obtained by making two assumptions:
• Velocity saturates abruptly at ΕC and is approximated by:
  v = µn·Ε for Ε ≤ ΕC
  v = vsat = µn·ΕC for Ε ≥ ΕC
• VDSAT, at which ΕC is reached, is constant and has the value:
  VDSAT = L·ΕC = L·vsat/µn (11)
Under these conditions the equation for the current in the linear region remains unchanged from the long channel model. The value for IDSAT is found by substituting eq. (11) in (5).
Simplified model for hand calculations (II)
  IDSAT = µn·COX·(W/L)·[(VGS − VT)·VDSAT − VDSAT²/2]
  IDSAT = vsat·COX·W·[(VGS − VT) − VDSAT/2] (12)
This model is truly first order and empirical, and it causes substantial deviations in the transition zone between the linear and velocity-saturated regions. However, it shows a linear dependence of the saturation current on VGS for short channel devices.
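The linear dependence on VGS can be seen by evaluating Eq. (12) for a few gate voltages. The sketch below uses assumed device parameters (not values from the lecture) and a hypothetical function name; it is only valid while VGS − VT exceeds VDSAT.

```python
import math

def i_dsat_simple(v_gs, v_t, v_sat, mu_n, c_ox, width, length):
    """Eq. (12) with VDSAT from Eq. (11): IDSAT = vsat*COX*W*((VGS - VT) - VDSAT/2)."""
    v_dsat = length * v_sat / mu_n            # Eq. (11)
    return v_sat * c_ox * width * ((v_gs - v_t) - v_dsat / 2.0)

# Assumed example parameters (SI units).
PARAMS = dict(v_t=0.5, v_sat=1e5, mu_n=0.05, c_ox=6e-3,
              width=0.375e-6, length=0.25e-6)

currents = [i_dsat_simple(v_gs, **PARAMS) for v_gs in (2.0, 2.5, 3.0)]
steps = [currents[1] - currents[0], currents[2] - currents[1]]
for v_gs, current in zip((2.0, 2.5, 3.0), currents):
    print(f"VGS = {v_gs} V: IDSAT = {current * 1e6:.1f} µA")
print(f"increase per 0.5 V step: {steps[0] * 1e6:.1f} µA and {steps[1] * 1e6:.1f} µA")
```

Equal steps in VGS give equal steps in IDSAT, in contrast to the quadratic dependence of the long-channel model.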
I-V characteristics of long- and short-channel MOS transistors both with W/L=1.5
ID-VGS characteristic for long- and short channel devices both with W/L=1.5
Threshold Voltage Variations (I)
• For a long channel NMOS transistor the threshold voltage is given by:
$V_T = V_{T0} + \gamma \left( \sqrt{|-2\phi_F + V_{SB}|} - \sqrt{|-2\phi_F|} \right)$ (11)
• Eq. (11) states that the threshold voltage is only a function of the technology and the applied body bias VSB.
• For short channel devices this model becomes inaccurate and the threshold voltage becomes a function of L, W and VDS.
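The body-bias dependence of the long channel threshold can be sketched as follows. VT0 = 0.43 V and γ = 0.4 V^0.5 are the NMOS values of the 0.25 µm parameter table of this chapter; the Fermi potential φF = -0.3 V is an assumed, illustrative value that is not given in the slides.

```python
import math


def threshold_voltage(v_sb, v_t0, gamma, phi_f):
    """Long-channel V_T = V_T0 + gamma (sqrt(|-2 phi_F + V_SB|) - sqrt(|-2 phi_F|))."""
    return v_t0 + gamma * (math.sqrt(abs(-2.0 * phi_f + v_sb))
                           - math.sqrt(abs(-2.0 * phi_f)))


if __name__ == "__main__":
    v_t0, gamma = 0.43, 0.4  # NMOS, 0.25 um parameter table
    phi_f = -0.3             # V, assumed value for illustration only
    for v_sb in (0.0, 0.5, 1.0, 2.0):
        v_t = threshold_voltage(v_sb, v_t0, gamma, phi_f)
        print(f"V_SB = {v_sb:.1f} V -> V_T = {v_t:.3f} V")
```

With VSB = 0 the expression returns VT0; a positive source-bulk voltage raises the threshold.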
Threshold Voltage Variations (II)
[Figure, left panel: threshold as a function of the length (for low VDS); VT versus L, approaching the long-channel threshold.]
[Figure, right panel: drain-induced barrier lowering (for low L); VT versus VDS, starting from the low-VDS threshold.]
Hot Carrier Effects (I)
• Over the last decades transistor dimensions were scaled down, but the power supply voltage was not.
• The resulting increase in electric field strength raises the energy of the electrons.
• Some electrons are able to leave the silicon and tunnel into the gate oxide.
• Such electrons are called "hot carriers".
• Electrons trapped in the oxide change the VT of the transistors.
• This leads to a long-term reliability problem.
• For an electron to become hot, an electric field of 10^4 V/cm is necessary.
• This condition is easily met with channel lengths below 1 µm.
Hot Carrier Effects (II)
Hot carrier effects cause the I-V characteristics of an NMOS transistor to degrade from extensive usage.
Process Variations.
Device parameters vary between runs and even on the same die!
• Variations in the process parameters, such as impurity concentration densities, oxide thicknesses and diffusion depths. These are caused by non-uniform conditions during the deposition and/or the diffusion of the impurities. They introduce variations in the sheet resistances and in transistor parameters such as the threshold voltage.
• Variations in the dimensions of the devices, mainly resulting from the limited resolution of the photolithographic process. These cause (W/L) variations in MOS transistors and mismatches in the emitter areas of bipolar devices.
Impact of Device Variations.
[Figure: two plots of the adder delay (1.50 to 2.10 ns), one versus Leff (1.10 to 1.60) and one versus VTp (-0.90 to -0.50 V).]
Delay of an adder circuit as a function of variations in L and VT
Parameter values for a 0.25 µm CMOS process (minimum length devices).
Device | VTO (V) | γ (V^0.5) | VDSAT (V) | K' (A/V²) | λ (V⁻¹)
NMOS | 0.43 | 0.4 | 0.63 | 115 × 10⁻⁶ | 0.06
PMOS | -0.4 | -0.4 | -1 | -30 × 10⁻⁶ | -0.1
4. SPICE LEVEL 1 MOSFET MODEL
Four-mask layout and cross section of an N-channel MOS transistor.
Layout and cross section of an n-well CMOS technology.
Equations for the different operation regions
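Cutoff:
$I_{DS} = 0$ for $V_{GS} \le V_{TH}$
Nonsaturation:
$I_{DS} = \dfrac{KP}{2} \dfrac{W}{L_{eff}}\, V_{DS} \left[ 2 (V_{GS} - V_{TH}) - V_{DS} \right] (1 + LAMBDA \cdot V_{DS})$ for $0 \le V_{DS} \le V_{GS} - V_{TH}$
Saturation:
$I_{DS} = \dfrac{KP}{2} \dfrac{W}{L_{eff}}\, (V_{GS} - V_{TH})^2 (1 + LAMBDA \cdot V_{DS})$ for $0 \le V_{GS} - V_{TH} \le V_{DS}$
Where the threshold voltage is given by:
$V_{TH} = V_{T0} + GAMMA \left( \sqrt{2 \cdot PHI - V_{BS}} - \sqrt{2 \cdot PHI} \right)$
and the channel length:
$L_{eff} = L - 2 \cdot LD$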
Where L is the length of the polysilicon gate and LD is the gate overlap of the source and drain.
The elements in the large signal MOSFET model are shown in the following figure.
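The three operating regions can be put into a small function. This is a sketch of the equations exactly as written on the slide above (including the 2·PHI term in the threshold expression); it is not the code of any SPICE simulator, and the function and argument names are assumptions. The numbers in the usage part are the N-channel values of the example parameter table of this chapter, with an illustrative geometry W = 10 µm, L = 2 µm, LD = 0.

```python
import math


def level1_ids(v_gs, v_ds, v_bs, w, l, kp, vto, gamma, phi, lam, ld=0.0):
    """Drain current following the LEVEL 1 equations given on the slide."""
    l_eff = l - 2.0 * ld
    v_th = vto + gamma * (math.sqrt(2.0 * phi - v_bs) - math.sqrt(2.0 * phi))
    v_ov = v_gs - v_th                      # gate overdrive
    if v_ov <= 0.0:
        return 0.0                          # cutoff
    beta = kp * w / l_eff
    clm = 1.0 + lam * v_ds                  # channel-length modulation factor
    if v_ds <= v_ov:                        # nonsaturation
        return beta / 2.0 * v_ds * (2.0 * v_ov - v_ds) * clm
    return beta / 2.0 * v_ov ** 2 * clm     # saturation


if __name__ == "__main__":
    # N-channel example values; LAMBDA = 0.1/L with L = 2 um gives 0.05 1/V
    params = dict(w=10e-6, l=2e-6, kp=50e-6, vto=1.0, gamma=0.6, phi=0.8, lam=0.05)
    for v_ds in (0.5, 1.0, 2.0, 5.0):
        i = level1_ids(3.0, v_ds, 0.0, **params)
        print(f"V_GS = 3.0 V, V_DS = {v_ds:.1f} V -> I_DS = {i * 1e6:.1f} uA")
```

The two expressions meet at VDS = VGS - VTH, so the characteristic is continuous at the border between nonsaturation and saturation.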
MOSFET SPICE PARAMETERS.
Parameter Name | SPICE Symbol | Analytical Symbol | Units
Channel length | Leff | L | m
Poly gate length | L | Lgate | m
Lateral diffusion / Gate-source overlap | LD | LD | m
Transconductance parameter | KP | µnCOX | A/V²
Threshold voltage / Zero-bias threshold | VTO | VTO | V
Channel-length modulation parameter | LAMBDA | λn | V⁻¹
Bulk threshold / Backgate effect parameter | GAMMA | γn | V^1/2
Surface potential / Depletion drop in inversion | PHI | -φP | V
Specifying MOSFET Geometry in SPICE.
Mname D G S B MODname L= W= AD= AS= PD= PS= NRD= NRS=
LEVEL 1 MOSFET MODEL PARAMETERS.
.MODEL MODname NMOS/PMOS VTO= KP= GAMMA= PHI= LAMBDA= RD= RS= RSH= CBD= CBS= CJ= MJ= CJSW= MJSW= PB= IS= CGDO= CGSO= CGBO= TOX= LD=
where:
NMOS/PMOS - MOSFET type
VTO - Threshold voltage (V)
KP - Transconductance parameter (A/V²)
GAMMA - Bulk threshold parameter (V^1/2)
PHI - Surface potential (V)
LAMBDA - Channel length modulation parameter (V⁻¹)
RD - Drain resistance (Ω)
RS - Source resistance (Ω)
RSH - Sheet resistance of the drain/source diffusions (Ω/□)
CBD - Zero bias drain-bulk junction capacitance (F)
CBS - Zero bias source-bulk junction capacitance (F)
MJ - Bulk junction grading coefficient (dimensionless)
PB - Built-in potential for the bulk junction (V)
• With CBD, CBS, MJ and PB, SPICE computes the voltage dependence of the drain-bulk and source-bulk capacitances:
$C_{BD}(V_{BD}) = \dfrac{CBD}{(1 - V_{BD}/PB)^{MJ}}$
$C_{BS}(V_{BS}) = \dfrac{CBS}{(1 - V_{BS}/PB)^{MJ}}$
Large-signal, charge-storage capacitors of the MOS device.
CJ - Zero bias planar bulk junction capacitance (F/m²)
CJSW - Zero bias sidewall bulk junction capacitance (F/m)
MJSW - Sidewall junction grading coefficient (dimensionless)
• If CJ, CJSW and MJSW are given, a more accurate simulation of these capacitances is performed using the following equations:
$C_{BD}(V_{BD}) = \dfrac{CJ \cdot AD}{(1 - V_{BD}/PB)^{MJ}} + \dfrac{CJSW \cdot PD}{(1 - V_{BD}/PB)^{MJSW}}$
$C_{BS}(V_{BS}) = \dfrac{CJ \cdot AS}{(1 - V_{BS}/PB)^{MJ}} + \dfrac{CJSW \cdot PS}{(1 - V_{BS}/PB)^{MJSW}}$
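The area/perimeter form of the junction capacitance is easy to evaluate by hand or with a few lines of code. The sketch below uses the N-channel values of the example parameter table of this chapter (CJ = 10⁻⁴ F/m², CJSW = 5 × 10⁻¹⁰ F/m, MJ = 0.5, MJSW = 0.33, PB = 0.95 V); the drain geometry (10 µm × 5 µm) and the function name are assumptions made for illustration.

```python
def junction_cap(v, area, perimeter, cj, cjsw, mj, mjsw, pb):
    """Bulk junction capacitance in F at junction voltage v (v < PB, negative = reverse bias)."""
    bottom = cj * area / (1.0 - v / pb) ** mj              # planar (bottom) component
    sidewall = cjsw * perimeter / (1.0 - v / pb) ** mjsw   # sidewall component
    return bottom + sidewall


if __name__ == "__main__":
    ad = 10e-6 * 5e-6          # m^2, assumed drain area
    pd = 2 * (10e-6 + 5e-6)    # m, assumed drain perimeter
    model = dict(cj=1e-4, cjsw=5e-10, mj=0.5, mjsw=0.33, pb=0.95)
    for v_bd in (0.0, -1.0, -2.5, -5.0):
        c = junction_cap(v_bd, ad, pd, **model)
        print(f"V_BD = {v_bd:5.1f} V -> C_BD = {c * 1e15:.2f} fF")
```

At zero bias the result is simply CJ·AD + CJSW·PD; increasing reverse bias lowers the capacitance.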
Bottom and Sidewall components of the bulk junction capacitors.
Bottom=ABCD
Sidewall=ABEF+BCFG+DCGH+ADEH
IS - Saturation current of the junction diode (A)
CGDO - Overlap capacitance of the gate with drain, per unit channel width (F/m)
CGSO - Overlap capacitance of the gate with source, per unit channel width (F/m)
CGBO - Overlap capacitance of the gate with bulk, per unit channel length (F/m)
TOX - Gate oxide thickness (m)
LD - Lateral diffusion (m)
Overlap capacitances of an MOS transistor. (a) Top view showing the overlap between the source or drain and the gate. (b) Side view.
Example of MOSFET model parameter values.
Parameter Name | Symbol | N Channel MOSFET | P Channel MOSFET | Units
Gate oxide thickness | TOX | 150 | 150 | Angstroms
Transconductance parameter | KP | 50 × 10⁻⁶ | 25 × 10⁻⁶ | A/V²
Threshold voltage | VTO | 1.0 | -1.0 | V
Channel-length modulation parameter | LAMBDA | 0.1/L (L in µm) | 0.1/L (L in µm) | V⁻¹
Bulk threshold parameter | GAMMA | 0.6 | 0.6 | V^1/2
Surface potential | PHI | 0.8 | 0.8 | V
Gate-drain overlap capacitance | CGDO | 5 × 10⁻¹⁰ | 5 × 10⁻¹⁰ | F/m
Gate-source overlap capacitance | CGSO | 5 × 10⁻¹⁰ | 5 × 10⁻¹⁰ | F/m
Zero-bias planar bulk depletion capacitance | CJ | 10⁻⁴ | 3 × 10⁻⁴ | F/m²
Zero-bias sidewall bulk depletion capacitance | CJSW | 5 × 10⁻¹⁰ | 3.5 × 10⁻¹⁰ | F/m
Bulk junction potential | PB | 0.95 | 0.95 | V
Planar bulk junction grading coefficient | MJ | 0.5 | 0.5 |
Sidewall bulk junction grading coefficient | MJSW | 0.33 | 0.33 |
5. CMOS Inverter
Inverter as simplest logic gate
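[Figure: inverter symbol (input vI, output vO) and three realisations with a load resistor R: an ideal switch between the supplies V+ and V-, an NMOS switching transistor MS (supply VDD, current iD) and a bipolar switching transistor QS (supply VCC, current iC).]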
Logic Voltage Levels
VOL: Nominal voltage corresponding to a low logic state at the output of a logic gate for vI = VOH.
Generally V- ≤ VOL.
VOH: Nominal voltage corresponding to a high logic state at the output of a logic gate for vI = VOL.
Generally VOH ≤ V+.
VIL: Maximum input voltage that will be recognised as a low input logic level.
VIH: Minimum input voltage that will be recognised as a high input logic level.
[Figure: voltage transfer characteristic vO versus vI between V- and V+, with VOL, VOH, VIL and VIH marked. VIL and VIH are the points where the slope equals -1; the noise margins NML and NMH are indicated.]
Noise Margins
NML: Noise margin associated with a low input level
NML = VIL - VOL
NMH: Noise margin associated with a high input level
NMH = VOH - VIH
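[Figure: output levels (vO) of a driving gate next to the input levels (vI) of the following gate, between V- and V+. The "1" range extends above VOH / VIH and the "0" range below VOL / VIL; NMH and NML are the gaps between the output and input levels, and the input range between VIL and VIH is the undefined logic state.]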
Dynamic Response of Logic Gates
• Rise time tr: time required for the transition from V10% to V90%.
• Fall time tf: time required for the transition from V90% to V10%.
V10% = VOL + 0.1(VOH - VOL)
V90% = VOL + 0.9(VOH - VOL)
• Propagation delay τP: difference in time between the input and output signals reaching V50%.
V50% = (VOH + VOL)/2
• Average propagation delay:
$\tau_P = \dfrac{\tau_{PHL} + \tau_{PLH}}{2}$
[Figure: Switching waveforms for an idealised inverter. (a) Input voltage signal, (b) output voltage waveform. The 10%, 50% and 90% levels between VOL and VOH, the rise and fall times tr and tf, the delays τPHL and τPLH and the times t1 to t4 are marked.]
MOS Inverter with Resistive Load
• NMOS switching device MS designed to force vO to VOL
• Resistor load R to pull the output up toward the power supply VDD
• VOH = VDD (driver in cut off ⇒ iD = 0)
• VOL determined by W/L ratio of MS
[Figure: NMOS transistor MS with load resistor R connected to VDD = 5 V; input vI at the gate, output vO = vDS at the drain, drain current iD.]
Example
[Figure: resistive-load inverter with VDD = 5 V, R = 95 kΩ and MS sized W/L = 2.06/1.
(a) vI = VOL < VTH: MS is off and vO = VOH = 5 V.
(b) vI = VOH = 5 V: MS conducts iDD = 50 µA and vO = VOL = vDS = 0.25 V.]
On - Resistance
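[Figure: equivalent circuits of the resistive-load inverter. (a) vI = VOL: the switch is open and the output is VOH. (b) vI = VOH: the transistor is replaced by its on-resistance Ron and the output is VOL.]
$V_{OL} = V_{DD} \dfrac{R_{on}}{R_{on} + R} = V_{DD} \dfrac{1}{1 + R/R_{on}}$
$R_{on} = \dfrac{v_{DS}}{i_D} = \dfrac{1}{K'_n \dfrac{W}{L} \left( v_{GS} - V_{TN} - \dfrac{v_{DS}}{2} \right)}$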
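The numbers of the resistive-load example (95 kΩ, 50 µA, W/L = 2.06/1, VOL = 0.25 V) can be reproduced with the on-resistance expression. The sketch below assumes K'n = 25 µA/V² and VTN = 1 V, the NMOS values of the CMOS transistor parameter table later in this chapter; the function names are illustrative.

```python
def load_resistor(v_dd, v_ol, i_dd):
    """Load R that carries i_dd with the output held at V_OL."""
    return (v_dd - v_ol) / i_dd


def required_w_over_l(i_d, k_n_prime, v_gs, v_tn, v_ds):
    """W/L from the nonsaturation current i_D = K'n (W/L) (v_GS - V_TN - v_DS/2) v_DS."""
    return i_d / (k_n_prime * (v_gs - v_tn - v_ds / 2.0) * v_ds)


def r_on(k_n_prime, w_over_l, v_gs, v_tn, v_ds):
    """On-resistance R_on = v_DS / i_D of the switching transistor."""
    return 1.0 / (k_n_prime * w_over_l * (v_gs - v_tn - v_ds / 2.0))


def v_ol(v_dd, r, ron):
    """Output low level from the resistive divider formed by R and R_on."""
    return v_dd * ron / (ron + r)


if __name__ == "__main__":
    v_dd, vol, i_dd = 5.0, 0.25, 50e-6
    k_n_prime, v_tn = 25e-6, 1.0  # assumed NMOS parameters (see lead-in)
    r = load_resistor(v_dd, vol, i_dd)
    wl = required_w_over_l(i_dd, k_n_prime, v_dd, v_tn, vol)
    ron = r_on(k_n_prime, wl, v_dd, v_tn, vol)
    print(f"R = {r / 1e3:.0f} kOhm, W/L = {wl:.2f}, R_on = {ron / 1e3:.1f} kOhm")
    print(f"V_OL = {v_ol(v_dd, r, ron):.3f} V")
```

Under these assumptions the divider returns the same VOL that was used to size the transistor, which is a useful consistency check.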
Transistor Alternatives to the Load Resistor
[Figure: four NMOS inverters in which the load resistor is replaced by a load transistor ML above the switching transistor MS (input vI, output vO, supply VDD):
(a) NMOS inverter with gate of the load device connected to its source
(b) NMOS inverter with gate of the load device grounded
(c) Saturated load inverter
(d) Linear load inverter (gate of the load device connected to VGG)]
CMOS Inverter Technology
[Figure: cross section of a CMOS inverter in n-well technology. The NMOS transistor sits in the p-type substrate, the PMOS transistor in the n-well; the terminals B, S, D are marked for both devices. vI drives both gates, vO is taken from both drains, VDD = 5 V, VSS = 0 V. Ohmic contacts connect the substrate and the n-well.]
CMOS Transistor Parameters
Parameter | NMOS Device | PMOS Device
VTO | 1 V | -1 V
γ | 0.50 √V | 0.75 √V
2φF | 0.60 V | 0.70 V
K' | 25 µA/V² | 10 µA/V²
Complementary MOS (CMOS) Logic Design
• Inverter with resistive load ⇒ power dissipation when the input is high.
• If an NMOS and a PMOS transistor are combined ⇒ CMOS.
• One transistor is always off while the other is on ⇒ no static power consumption.
[Figure: CMOS inverter with PMOS transistor MP (source at VDD = 5 V) and NMOS transistor MN (source at ground), common gate input vI and common drain output vO, next to its switch model with the on-resistances Ronp and Ronn.]
CMOS voltage transfer Characteristic
[Figure: voltage transfer characteristic vO versus vI (0 to 5 V) with VIL and VIH marked, the lines vO = vI - VTP and vO = vI - VTN, and five regions:
1: MN off
2: MN saturated, MP linear
3: MN and MP saturated
4: MN linear, MP saturated
5: MP off]
Regions of Operation of Transistors in a Symmetrical Inverter
Region | Input Voltage vI | Output Voltage vO | NMOS Transistor | PMOS Transistor
1 | vI ≤ VTN | VOH = VDD | Cutoff | Linear
2 | VTN < vI ≤ vO + VTP | High | Saturation | Linear
3 | vI ≈ VDD/2 | VDD/2 | Saturation | Saturation
4 | vO + VTN < vI ≤ (VDD + VTP) | Low | Linear | Saturation
5 | vI ≥ (VDD + VTP) | VOL = 0 | Linear | Cutoff
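The table can be turned into a small classifier that reports the operating region of each transistor for a given point (vI, vO). This is a sketch under the table's sign convention (VTP negative); the function name and the default values VDD = 5 V, VTN = 1 V, VTP = -1 V are assumptions.

```python
def transistor_regions(v_i, v_o, v_dd=5.0, v_tn=1.0, v_tp=-1.0):
    """Return (NMOS region, PMOS region) for an inverter operating point."""
    # NMOS: V_GS = v_i, V_DS = v_o
    if v_i <= v_tn:
        nmos = "cutoff"
    elif v_o >= v_i - v_tn:
        nmos = "saturation"
    else:
        nmos = "linear"
    # PMOS: V_GS = v_i - V_DD, V_DS = v_o - V_DD, threshold V_TP < 0
    if v_i >= v_dd + v_tp:
        pmos = "cutoff"
    elif v_o <= v_i - v_tp:
        pmos = "saturation"
    else:
        pmos = "linear"
    return nmos, pmos


if __name__ == "__main__":
    for v_i, v_o in [(0.5, 5.0), (2.0, 4.5), (2.5, 2.5), (3.5, 0.3), (4.5, 0.0)]:
        print(f"vI = {v_i} V, vO = {v_o} V -> {transistor_regions(v_i, v_o)}")
```

The five sample points correspond to regions 1 to 5 of the table for a symmetrical 5 V inverter.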
What happens, if the inverter is not symmetrical?
[Figure, left panel: Symmetrical inverter (Kn = Kp). Voltage transfer characteristics for VDD = 5 V, 4 V, 3 V and 2 V together with the line vO = vI.]
[Figure, right panel: Asymmetrical inverter (KR = Kn / Kp). Voltage transfer characteristics for KR = 0.2, KR = 1 and KR = 5 together with the line vO = vI.]
Calculation of VIL
Equating the currents of the saturated nMOS and the nonsaturated pMOS device (Region 2):
$\dfrac{K_n}{2} (V_{in} - V_{Tn})^2 = \dfrac{K_p}{2} \left[ 2 (V_{DD} - V_{in} - |V_{Tp}|)(V_{DD} - V_{out}) - (V_{DD} - V_{out})^2 \right]$
The derivative condition (dVout / dVin) = -1 has to be evaluated for IDn(Vin) = IDp(Vin , Vout):
$\dfrac{dV_{out}}{dV_{in}} = \dfrac{dI_{Dn}/dV_{in} - \partial I_{Dp}/\partial V_{in}}{\partial I_{Dp}/\partial V_{out}} = -1$
Evaluating the derivatives gives:
$V_{IL} \left( 1 + \dfrac{K_n}{K_p} \right) = 2 V_{out} + \dfrac{K_n}{K_p} V_{Tn} - V_{DD} - |V_{Tp}|$
This equation has to be solved together with the first equation ⇒ VIL
Calculation of VIH
At the point VIH the NMOS device is nonsaturated and the PMOS transistor is saturated (region 4):
$\dfrac{K_n}{2} \left[ 2 (V_{IH} - V_{Tn}) V_{out} - V_{out}^2 \right] = \dfrac{K_p}{2} (V_{DD} - V_{IH} - |V_{Tp}|)^2$
The derivative condition (dVout / dVin) = -1 has to be evaluated for IDn(Vin, Vout) = IDp(Vin):
$\dfrac{dV_{out}}{dV_{in}} = \dfrac{dI_{Dp}/dV_{in} - \partial I_{Dn}/\partial V_{in}}{\partial I_{Dn}/\partial V_{out}} = -1$
which gives:
$V_{IH} \left( 1 + \dfrac{K_p}{K_n} \right) = 2 V_{out} + V_{Tn} + \dfrac{K_p}{K_n} (V_{DD} - |V_{Tp}|)$
Together with the first equation, this forms a quadratic in VIH which has to be solved.
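Instead of solving the two quadratics by hand, VIL and VIH can be found numerically. The sketch below eliminates Vout with the slope condition and then solves the current equation by bisection; it assumes that VTp enters the equations as a magnitude, and all function names are illustrative. The bisection helper is a plain textbook implementation, not a library routine.

```python
def _bisect(f, lo, hi, iters=200):
    """Plain bisection; assumes f(lo) and f(hi) have opposite signs."""
    f_lo = f(lo)
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        f_mid = f(mid)
        if (f_mid > 0.0) == (f_lo > 0.0):
            lo, f_lo = mid, f_mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def v_il(kn, kp, v_dd, v_tn, v_tp_abs):
    """Input low level: nMOS saturated, pMOS nonsaturated, slope = -1."""
    kr = kn / kp

    def f(v_in):
        # V_out from the slope condition
        v_out = 0.5 * ((1.0 + kr) * v_in + v_dd + v_tp_abs - kr * v_tn)
        i_n = 0.5 * kn * (v_in - v_tn) ** 2
        i_p = 0.5 * kp * (2.0 * (v_dd - v_in - v_tp_abs) * (v_dd - v_out)
                          - (v_dd - v_out) ** 2)
        return i_n - i_p

    return _bisect(f, v_tn, v_dd - v_tp_abs)


def v_ih(kn, kp, v_dd, v_tn, v_tp_abs):
    """Input high level: nMOS nonsaturated, pMOS saturated, slope = -1."""
    r = kp / kn

    def g(v_in):
        # V_out from the slope condition
        v_out = 0.5 * ((1.0 + r) * v_in - v_tn - r * (v_dd - v_tp_abs))
        i_n = 0.5 * kn * (2.0 * (v_in - v_tn) * v_out - v_out ** 2)
        i_p = 0.5 * kp * (v_dd - v_in - v_tp_abs) ** 2
        return i_n - i_p

    return _bisect(g, v_tn, v_dd - v_tp_abs)


if __name__ == "__main__":
    kn = kp = 50e-6  # symmetrical inverter, illustrative value
    vil = v_il(kn, kp, 5.0, 1.0, 1.0)
    vih = v_ih(kn, kp, 5.0, 1.0, 1.0)
    print(f"V_IL = {vil:.3f} V, V_IH = {vih:.3f} V")
    print(f"NM_L = {vil - 0.0:.3f} V, NM_H = {5.0 - vih:.3f} V")
```

For a symmetrical 5 V inverter with 1 V thresholds both noise margins come out equal, as expected from the symmetry of the transfer curve.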
Calculation of Vth
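For Vth = Vin = Vout both transistors are saturated (λ is assumed to be 0):
$\dfrac{K_n}{2} (V_{th} - V_{Tn})^2 = \dfrac{K_p}{2} (V_{DD} - V_{th} - |V_{Tp}|)^2$
Solving for Vth yields:
$V_{th} = \dfrac{V_{Tn} + \sqrt{K_p/K_n}\, (V_{DD} - |V_{Tp}|)}{1 + \sqrt{K_p/K_n}}$
[Figure: voltage transfer characteristic vO versus vI (0 to 5 V) with regions 1 to 5, VIL and VIH. Vth is the intersection of the curve with the line Vin = Vout, inside region 3 where MN and MP are saturated.]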
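The closed-form result for Vth can be evaluated directly. A minimal sketch, assuming VTp is used as a magnitude; the function name is illustrative.

```python
import math


def inverter_threshold(kn, kp, v_dd, v_tn, v_tp_abs):
    """V_th = (V_Tn + sqrt(Kp/Kn) (V_DD - |V_Tp|)) / (1 + sqrt(Kp/Kn))."""
    r = math.sqrt(kp / kn)
    return (v_tn + r * (v_dd - v_tp_abs)) / (1.0 + r)


if __name__ == "__main__":
    # Symmetrical case and a case with a 2.5 times stronger NMOS device
    print(f"Kn = Kp:      V_th = {inverter_threshold(1.0, 1.0, 5.0, 1.0, 1.0):.3f} V")
    print(f"Kn = 2.5 Kp:  V_th = {inverter_threshold(2.5, 1.0, 5.0, 1.0, 1.0):.3f} V")
```

Only the ratio Kp/Kn matters; a stronger NMOS device pulls the threshold below VDD/2.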
Design of CMOS inverter (I)
• NMH = VOH - VIH = VDD - VIH
• NML = VIL - VOL = VIL - 0 = VIL
• KR = Kn / Kp
• Remember:
$K_n = K'_n \left( \dfrac{W}{L} \right)_n$
$K_p = K'_p \left( \dfrac{W}{L} \right)_p$
⇒ Influence of the symmetry via W/L of transistors!
[Figure: noise margins NMH and NML (0.5 to 4.0 V) plotted versus KR from 0 to 11.]
Design of CMOS inverter (II)
The ratio (W/L) in CMOS design is used to set the level of Vth:
$\dfrac{K_n}{K_p} = \dfrac{\mu_n (W/L)_n}{\mu_p (W/L)_p}$
The ratio required to establish a given inverter threshold voltage is:
$\dfrac{K_n}{K_p} = \left( \dfrac{V_{DD} - V_{th} - |V_{Tp}|}{V_{th} - V_{Tn}} \right)^2$
To get a symmetrical voltage transfer curve, Vth is set to VDD/2:
$\dfrac{K_n}{K_p} = \left( \dfrac{\frac{1}{2} V_{DD} - |V_{Tp}|}{\frac{1}{2} V_{DD} - V_{Tn}} \right)^2$
If in a process |VTp| = VTn, the device aspect ratios for a symmetrical inverter are related by:
$\dfrac{(W/L)_p}{(W/L)_n} = \dfrac{\mu_n}{\mu_p}$
Since µn / µp ≈ 2.5, a minimum area CMOS inverter will have (W/L)n ≈ 1 and (W/L)p ≈ 2.5. In this case the voltage transfer function is completely symmetric.
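The design procedure can be sketched in two steps: first the ratio Kn/Kp required for a target Vth, then the PMOS aspect ratio that realises it. The squared form of the ratio follows from equating the two saturation currents at Vin = Vout = Vth; the function names and the default mobility ratio of 2.5 are assumptions for illustration.

```python
def required_kn_over_kp(v_th, v_dd, v_tn, v_tp_abs):
    """Kn/Kp that places the inverter threshold at v_th (both devices saturated)."""
    return ((v_dd - v_th - v_tp_abs) / (v_th - v_tn)) ** 2


def pmos_w_over_l(nmos_w_over_l, kn_over_kp, mobility_ratio=2.5):
    """(W/L)_p from Kn/Kp = mu_n (W/L)_n / (mu_p (W/L)_p), mobility_ratio = mu_n / mu_p."""
    return mobility_ratio * nmos_w_over_l / kn_over_kp


if __name__ == "__main__":
    ratio = required_kn_over_kp(2.5, 5.0, 1.0, 1.0)
    print(f"Kn/Kp for V_th = V_DD/2: {ratio:.2f}")
    print(f"(W/L)_p for (W/L)_n = 1: {pmos_w_over_l(1.0, ratio):.2f}")
```

For Vth = VDD/2 and equal threshold magnitudes the required ratio is 1, which gives the 1 : 2.5 sizing quoted above.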
Summary
So what did we accomplish until now?
• We know how a CMOS inverter works.
• VOL, VOH - do you still know it?
• We know how to set the W/L ratios of the transistors to get optimal noise margins.
• So we make every inverter the same, that is to say minimal, or do we?
[Figure: voltage transfer characteristic vO versus vI (0 to 5 V) with regions 1 to 5, VIL and VIH.]
Dynamic Behavior of the CMOS Inverter: High to Low Output Transition (I)
[Figure: CMOS inverter (MN, MP) with load capacitor C at the output, VDD = 5 V. The input vI steps from 0 V to +5 V at t = 0, with vO(0+) = 5 V. The output falls from VOH = 5 V towards VOL = 0 V; MN is saturated between t1 and tx, where vO reaches (Vin - VTn), and nonsaturated between tx and t2.]
MN goes from cutoff through saturation into the nonsaturation region for the given input. The border between saturation and nonsaturation is reached at the time tx, when the output voltage is Vout = VOH - VTn.
High to Low Output Transition (II)
$i = \dfrac{dQ}{dt} = C \dfrac{dV_{OUT}}{dt} \;\Rightarrow\; \int dt = C \int \dfrac{dV_{OUT}}{i}$
In order to simplify the final expressions, the integrations for computing tHL are done with the limits from VDD to V0 (V1 = 0.9 VDD, V0 = 0.1 VDD).
Saturation:
$t_x - t_1 = -C \int_{V_{DD}}^{V_{DD} - V_{Tn}} \dfrac{dV_{OUT}}{\frac{K_n}{2} (V_{DD} - V_{Tn})^2} = \dfrac{2\, C\, V_{Tn}}{K_n (V_{DD} - V_{Tn})^2}$
Nonsaturation:
$t_2 - t_x = -C \int_{V_{DD} - V_{Tn}}^{V_0} \dfrac{dV_{OUT}}{\frac{K_n}{2} \left[ 2 (V_{DD} - V_{Tn}) V_{OUT} - V_{OUT}^2 \right]} = \dfrac{C}{K_n (V_{DD} - V_{Tn})} \ln\left( \dfrac{2 (V_{DD} - V_{Tn}) - V_0}{V_0} \right) = \dfrac{C}{K_n (V_{DD} - V_{Tn})} \ln\left( \dfrac{2 (V_{DD} - V_{Tn})}{V_0} - 1 \right)$
High to Low Output Transition (III)
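$t_{HL} = (t_x - t_1) + (t_2 - t_x)$
therefore:
$t_{HL} = \tau \left[ \dfrac{2 V_{Tn}}{V_{DD} - V_{Tn}} + \ln\left( \dfrac{2 (V_{DD} - V_{Tn})}{V_0} - 1 \right) \right]$
where
$\tau = \dfrac{C}{K_n (V_{DD} - V_{Tn})}$
We have used the following integral:
$\int \dfrac{dx}{x (a + b x^n)} = \dfrac{1}{a n} \ln\left( \dfrac{x^n}{a + b x^n} \right)$
In our case n = 1, b = -1:
$\int \dfrac{dx}{a x - x^2} = \dfrac{1}{a} \ln\left( \dfrac{x}{a - x} \right)$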
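The closed-form fall time can be cross-checked against a direct numerical integration of C·dV/i over the two operating regions of MN. The sketch below assumes an ideal input step to VDD; the component values (C = 1 pF, Kn = 50 µA/V², VDD = 5 V, VTn = 1 V, V0 = 0.1 VDD) and the function names are illustrative assumptions.

```python
import math


def t_hl(c, kn, v_dd, v_tn, v0):
    """Closed-form high-to-low time from V_DD down to v0."""
    tau = c / (kn * (v_dd - v_tn))
    return tau * (2.0 * v_tn / (v_dd - v_tn)
                  + math.log(2.0 * (v_dd - v_tn) / v0 - 1.0))


def t_hl_numeric(c, kn, v_dd, v_tn, v0, steps=20000):
    """Midpoint-rule integration of C dV / i_Dn from V_DD down to v0."""
    v_ov = v_dd - v_tn                 # gate overdrive with the input at V_DD
    dv = (v_dd - v0) / steps
    total = 0.0
    for k in range(steps):
        v = v_dd - (k + 0.5) * dv      # output voltage at the interval midpoint
        if v >= v_ov:                  # MN saturated
            i = 0.5 * kn * v_ov ** 2
        else:                          # MN nonsaturated
            i = 0.5 * kn * (2.0 * v_ov * v - v * v)
        total += c * dv / i
    return total


if __name__ == "__main__":
    args = (1e-12, 50e-6, 5.0, 1.0, 0.5)
    print(f"closed form: {t_hl(*args) * 1e9:.2f} ns")
    print(f"numerical:   {t_hl_numeric(*args) * 1e9:.2f} ns")
```

Both values agree closely, which confirms that the two integrals were combined correctly.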
Low to high output transition
From symmetry (VTn → |VTp|; Kn → Kp), the low to high transition time follows as:
$t_{LH} = \dfrac{C}{K_p (V_{DD} - |V_{Tp}|)} \left[ \dfrac{2 |V_{Tp}|}{V_{DD} - |V_{Tp}|} + \ln\left( \dfrac{2 (V_{DD} - |V_{Tp}|)}{V_0} - 1 \right) \right]$
[Figure: CMOS inverter (MN, MP) with load capacitor C, VDD = 5 V. The input vI steps from +5 V to 0 V at t = 0, with vO(0+) = 0 V; the output rises from 0 V to +5 V.]
Dynamic Behavior of the CMOS Inverter (cont’d)
• The choice of size of the NMOS and PMOS transistors can be dictated by the desired average propagation delay τP
• For a symmetrical inverter:
$\tau_P = \dfrac{t_{PHL} + t_{PLH}}{2} = t_{PHL} = t_{PLH}$, with $K'_n \approx 2.5\, K'_p$
Example:
Symmetrical reference inverter: (W/L)n = 2/1, (W/L)p = 5/1, VDD = 5 V, |VTP| = VTN = 1 V, C = 1 pF, τP = 6.4 ns, tr = tf = 2τP = 12.8 ns
Scaled inverters:
a) τP = 1 ns with C = 1 pF: (W/L)n = 13/1, (W/L)p = 32.5/1
b) τP = 3.2 ns with C = 2 pF: (W/L)n = 8/1, (W/L)p = 20/1
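The scaled inverters follow from the reference design because the delay is proportional to C/K and K is proportional to W/L. A minimal sketch of this scaling rule, with an illustrative function name:

```python
def scale_w_over_l(ref_w_over_l, tau_ref, c_ref, tau_target, c_target):
    """Scale an aspect ratio so that the delay changes from tau_ref to tau_target.

    Uses delay ~ C / K and K ~ W/L, so W/L scales with (tau_ref / tau_target)
    and with (c_target / c_ref).
    """
    return ref_w_over_l * (tau_ref / tau_target) * (c_target / c_ref)


if __name__ == "__main__":
    tau_ref, c_ref = 6.4e-9, 1e-12   # reference inverter: (W/L)n = 2, (W/L)p = 5
    for label, tau, c in (("a", 1e-9, 1e-12), ("b", 3.2e-9, 2e-12)):
        wn = scale_w_over_l(2.0, tau_ref, c_ref, tau, c)
        wp = scale_w_over_l(5.0, tau_ref, c_ref, tau, c)
        print(f"{label}) (W/L)n = {wn:.1f}/1, (W/L)p = {wp:.1f}/1")
```

Case b) reproduces 8/1 and 20/1 exactly; case a) gives 12.8/1 and 32/1, which the example rounds up to 13/1 and 32.5/1.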
Power Dissipation
• Two kinds of power dissipation in digital electronics:
– static power dissipation (logic gate output is stable)
– dynamic power dissipation (during switching of logic gate)
• With CMOS nearly no static power dissipation!
[Figure: output voltage (0 to 6 V) and drain current (0 to 40 µA) of a CMOS inverter plotted versus the input voltage vI (0 to 6 V).]
Dynamic Power Dissipation (I)
Power dissipation due to charge and discharge of capacitances
[Figure: (a) capacitor C with vC(0) = 0 charged from VDD through a non-linear resistor R; the switch closes at t = 0, i(t) is the charging current and vC(t) the capacitor voltage.]
The total energy ED delivered by the source is given by
$E_D = \int_0^\infty P(t)\, dt$
The power P(t) = VDDi(t), and because VDD is a constant,
$E_D = \int_0^\infty V_{DD}\, i(t)\, dt = V_{DD} \int_0^\infty i(t)\, dt$
The current supplied by source VDD is also equal to the current in capacitor C, and so
$E_D = V_{DD} \int_0^\infty C \dfrac{dv_C}{dt}\, dt = C V_{DD} \int_{v_C(0)}^{v_C(\infty)} dv_C$
Dynamic Power Dissipation (II)
Integrating from t = 0 to t = ∞, with VC(0) = 0 and VC (∞) = VDD results in
$E_D = C V_{DD}^2$
We know that the energy ES stored in capacitor C is given by
$E_S = \dfrac{C V_{DD}^2}{2}$
and thus the energy EL lost in the resistive element must be
$E_L = E_D - E_S = \dfrac{C V_{DD}^2}{2}$
The total energy ETD dissipated in the process of first charging and then discharging the capacitor is equal to
$E_{TD} = \left( \dfrac{C V_{DD}^2}{2} \right)_{Charge} + \left( \dfrac{C V_{DD}^2}{2} \right)_{Discharge} = C V_{DD}^2$
Dynamic Power Dissipation (III)
Thus, every time a logic gate goes through a complete switching cycle, the transistors within the gate dissipate an energy equal to ETD. Logic gates normally switch states at some relatively high frequency (switching events/second), and the dynamic power PD dissipated by the logic gate is then
$P_D = C V_{DD}^2 f$
In effect, an average current equal to (CVDDf) is supplied from the source VDD.
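The result that the source delivers C·VDD² per charging event, independent of the resistive element, can be checked with a simple time-stepping simulation. The sketch below uses a linear resistor and forward Euler steps purely for illustration; the function names and the values C = 1 pF, VDD = 5 V and f = 10 MHz are assumptions.

```python
def dynamic_power(c, v_dd, f):
    """P_D = C * V_DD^2 * f."""
    return c * v_dd ** 2 * f


def average_supply_current(c, v_dd, f):
    """Average current drawn from V_DD: C * V_DD * f."""
    return c * v_dd * f


def charge_energy_numeric(c, v_dd, r, steps=20000, t_end_in_tau=30.0):
    """Energy drawn from V_DD while charging C from 0 V through a linear R."""
    dt = t_end_in_tau * r * c / steps
    v = 0.0
    energy = 0.0
    for _ in range(steps):
        i = (v_dd - v) / r          # charging current
        energy += v_dd * i * dt     # energy delivered by the source in this step
        v += i * dt / c             # forward Euler update of the capacitor voltage
    return energy


if __name__ == "__main__":
    c, v_dd, f = 1e-12, 5.0, 10e6
    for r in (1e3, 100e3):
        e = charge_energy_numeric(c, v_dd, r)
        print(f"R = {r:8.0f} Ohm -> E_D = {e * 1e12:.3f} pJ")
    print(f"P_D = {dynamic_power(c, v_dd, f) * 1e6:.1f} uW")
    print(f"I_avg = {average_supply_current(c, v_dd, f) * 1e6:.1f} uA")
```

Both resistor values give the same delivered energy, matching ED = C·VDD²; only the charging time depends on R.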
Dynamic Power Dissipation (IV)
• Power dissipation due to the “short circuit current” (when both transistors are on during transition)
• The short circuit current reaches a peak for Vin = Vout = VDD/2
[Figure: switch model of the CMOS inverter (Ronp to VDD = 5 V, Ronn to ground, output vout) and simulated waveforms over 0 to 16 ns: input vI and output vO between 0.0 V and 5.0 V, and the supply current iDD (0 to 30 µA) peaking at Vin = Vout = VDD/2.]
Summary
Let’s repeat:
• What is the dynamic behaviour of the inverter?
• What do we need it for?
• What kind of power dissipation is there?
• What kind of power dissipation is dominant with CMOS logic?
[Figure: output voltage (0 to 6 V) and drain current (0 to 40 µA) of a CMOS inverter plotted versus the input voltage vI (0 to 6 V).]
$P_D = C V_{DD}^2 f$
6. CMOS Technology
CMOS Technology
• Basic Fabrication Operations
• Steps for Fabricating a NMOS Transistor
• LOCOS Process
• n-Well CMOS Technology
• Layout Design Rules
• CMOS Inverter Layout Design
• Circuit Extraction, Electrical Process Parameters
• Layout Tool Demonstration
• Appendix: MOSIS, EUROPRACTICE
Wafer Terminology
1. Chip = Die = Microchip = Bar
2. Scribe Lines
3. Engineering Test Die
4. Edge Die
5. Crystal Planes
6. Wafer Flats
Basic Wafer Fabrication Operations
The number of steps in the IC fabrication flow depends upon the technology process and the complexity of the circuit.
Example:
CMOS n-Well process - 30 major steps, and each major step may involve up to 15 substeps
Only three basic operations are performed on the wafer:
• Layering
• Patterning
• Doping
Layering
Grow or deposit thin layers of different materials on the wafer surface
Techniques: thermal oxidation, chemical vapor deposition (CVD), evaporation, sputtering
Layers:
• Insulators: silicon dioxide (SiO2), silicon nitrides (Si3N4), silicon monoxide (SiO)
• Semiconductors: epitaxial silicon, polysilicon
• Conductors: doped polysilicon, metals, Al/Si alloys, silicides, alloys
Layering - Thermal Oxidation
SiO2 functions: surface passivation, diffusion barrier, field oxide, MOS gate oxide
Dry oxidation:
• Si + O2 → SiO2 (900-1200°C)
• 700 nm oxide: 10 hours (1200°C)
• Good oxide quality: gate oxide
Wet oxidation (water vapor or steam):
• Si + 2H2O → SiO2 + 2H2 (900-1200°C)
• 700 nm oxide: 0.65 hours (1200°C)
• Poor oxide quality: field oxide
Natural oxide: silicon will readily grow an oxide (5-10 nm) if exposed to oxygen in the air!
The range of useful oxide thickness: 25 nm (MOS gates) - 1500 nm (field oxide)
[Figure: O2 reaching a silicon wafer and forming an SiO2 layer on the surface.]
Layering - Chemical Vapor Deposition (CVD)
Deposited materials:
• Insulators & Dielectrics: SiO2, Si3N4, Phosphorus Silicate Glass (PSG), Doped Oxide
• Semiconductors: Si
• Conductors: Al, Cu, Ni, Au, Pt, Ti, W, Mo, Cr, Silicides (WSi2, MoSi2), doped polysilicon
Basic CVD processing:
• a gas containing an atom(s) of the material to be deposited reacts with another gas liberating the desired material
• the freed material (atom or molecular form) “deposits” on the substrate
• the unwanted products of the chemical reaction leave the reaction chamber
Example: CVD of silicon from silicon tetrachloride
SiCl4 + 2H2 → Si + 4HCl↑
Layering - Evaporation
Used to deposit conductive layers (metallization): Al, Al/Si, Al/Cu, Au, Mo, Pt
When the temperature is raised high enough, atoms of the solid material (Al) melt and "evaporate" into the atmosphere and deposit onto the wafer.
The evaporation takes place in an evacuated chamber (high vacuum, 10^-5 to 10^-7 torr); otherwise Al would combine with oxygen in the air to form Al2O3.
The external energy needed to evaporate the metal is provided by:
1. A current flowing through a filament
2. Flash system (Al/Si alloy)
3. Electron beam
[Figure: evaporation system with the Al charge in a crucible, magnet, evaporation source, wafer, heater and vacuum pump.]
Layering - Sputtering
Used to deposit thin metal/alloys films and insulators: Al, Ti, Mo, Al/Si, Al/Cu, SiO2
Sputtering process:
• ionized argon atoms (+) are introduced into an evacuated chamber
• the target (Al) is maintained at negative potential
• the argon ions are accelerated towards the negative charge
• following the impact, some of the target material atoms are torn off
• the liberated material settles on everything in the chamber, including the wafers
The material to be sputtered does not have to be heated
Patterning
• Patterning = Lithography = Masking
• Selective removal of the top layer(s) on the wafers
• Ex.: Process steps required for patterning SiO2
1. Initial structure: SiO2 on the Si substrate (wafer)
2. Photoresist deposition
3. UV exposure through the mask, leaving soluble and insoluble photoresist regions
4. Soluble photoresist etching
5. SiO2 etching (chemical/dry etch)
6. Photoresist etching
Doping
• Change conductivity type and resistivity on selected regions of wafer
• Doping of the wafer takes place through the holes patterned in the surface layer
• Two techniques are used:
• Thermal diffusion
• Ion implantation
Thermal diffusion:
- heat the wafer to the vicinity of 1000°C
- expose the wafer to vapors containing the desired dopant
- the dopant atoms diffuse into the wafer surface creating a p/n region
Ion implantation:
- room temperature
- dopant atoms are accelerated to a high speed and "shot" into the wafer surface
- an annealing (heating) step is necessary to reorder the crystal structure damaged by the implant
NMOS Transistor Fabrication - process flow (1)
• Starting material: Si substrate (p)
• Oxidation (Layering): SiO2 field oxide (thick oxide)
• Oxide etching (Patterning)
NMOS Transistor Fabrication - process flow (2)
• Oxidation (Layering): SiO2 gate oxide (thin oxide)
• Polysilicon deposition (Layering)
• Polysilicon etching (Patterning)
NMOS Transistor Fabrication - process flow (3)
• Oxide etching (Patterning)
• Ion implantation (Doping): n-type, forming the n+ regions
• Oxidation (Layering): SiO2 insulating oxide
NMOS Transistor Fabrication - process flow (4)
• Oxide etching (Patterning): contact windows
• Metal deposition (Layering): Al evaporation
• Metal etching (Patterning): source (S), gate (G) and drain (D) terminals on the p-type Si substrate
Device Isolation Techniques
MOS transistors must be electrically isolated from each other in order to:
• prevent unwanted conduction paths between devices
• avoid creation of inversion layers outside the channel regions
• reduce the leakage currents
Each device is created in dedicated regions - active areas
Each active area is surrounded by a field oxide barrier, created with one of the following techniques:
A) Etched field-oxide isolation
1) grow a field oxide over the entire surface of the chip
2) pattern the oxide and define active areas
Drawbacks:
- large oxide steps at the boundaries between active areas and field regions!
- cracking of subsequently deposited polysilicon/metal layers!
Not used!
B) Local Oxidation of Silicon (LOCOS)
Local Oxidation of Silicon (LOCOS) (1)
More planar surface topology
Selectively growing the field oxide in certain regions - process flow:
1) grow a thin pad oxide (SiO2) on the silicon surface; the thin pad oxide protects the silicon surface from stress caused by the nitride
2) define the active area: deposit and pattern a silicon nitride (Si3N4) layer
3) channel stop implant: p-type (p+) regions that surround the transistors
Local Oxidation of Silicon (LOCOS) (2)
4) Grow a thick field oxide
• The field oxide is partially recessed into the surface (oxidation consumes some of the silicon)
• The field oxide forms a lateral extension under the nitride layer: the bird's beak region
• The bird's beak region limits device scaling and device density in VLSI circuits!
5) Etch the nitride layer and the thin pad oxide layer, exposing the active areas
n-Well CMOS Technology - simplified process sequence
Creating n-well regions (PMOS transistors) and channel stop regions
Grow field oxide and gate oxide
Deposit and pattern polysilicon layer
Implant source and drain regions, substrate contacts
Create contact windows, deposit and pattern metal layer
n-Well CMOS Technology - Inverter Example
• The process starts with a moderately doped (10^15 cm⁻³) p-type substrate (wafer)
• An initial oxide layer (SiO2) is grown on the entire surface (barrier oxide)
1. n-Well mask - defines the n-Well regions
• Pattern the oxide
• Implant n-type impurity atoms (phosphorus) - 10^16 cm⁻³
• Drive-in the impurities (vertical but also lateral redistribution - limits the density)
2. Active area mask - define the regions in which MOS devices will be created
• LOCOS process to isolate NMOS and PMOS transistors
• lateral penetration of bird’s beak region ~ oxide thickness
• channel stop p+ implants (boron)
• Grow gate oxide (dry oxidation) - only in the open area of active region
3. Polysilicon mask - define the gates of the MOS transistors
• Polysilicon is deposited over the entire wafer (CVD process) and doped (typically n-type)
• Pattern the polysilicon in the dry (plasma) etching process
• Etch the gate oxide
4. n-Select mask - define the n+ source/drain regions of NMOS transistors
• Define an ohmic contact to the n-well
• Implant n-type impurity atoms (arsenic)
• The polysilicon layer protects the transistor channel regions from the arsenic dopant
5. Complement of the n-select mask - define the p+ source/drain regions of PMOS transistors
• Define the ohmic contacts to the substrate
• Implant p-type impurity atoms (boron)
• The polysilicon layer protects the transistor channel regions from the boron dopant
• In the n-well two p+ and one n+ regions are created
• After source/drain implantation a short thermal process is performed (annealing):
• moderate temperature
• drive the impurities deeper into the substrate
• repair some of the crystal structure damage
• lateral diffusion under the gate: overlap capacitances
• Next the SiO2 insulated layer is deposited over the entire wafer area using a CVD technique
• The surface becomes nonplanar: impact on the metal deposition step
6. Contact mask - define the contact cuts in the insulating layer
• Contacts to polysilicon must be made outside the gate region (avoid metal spikes through the poly and the thin gate oxide)
7. Metallization mask - define the interconnection pattern
• Aluminum is deposited over the entire wafer (evaporation) and selectively etched
• The step coverage in this process is most critical (nonplanarity of the wafer surface)
• The final step: the entire surface is passivated (overglass layer)
• Protect the surface from contaminants and scratches
• Then, openings are etched to the bond pads to allow for wire bonding
[Figure: layout (top view) and cross section of the finished n-well CMOS inverter with the terminals In, Out, GND and VDD, showing the N-channel transistor in the p-type substrate, the P-channel transistor in the n-well, the polysilicon gates on gate oxide, the SiO2 layers and the metal interconnect.]
Design Rules
• Interface between designer and process engineer
• Guidelines for constructing process masks
• Unit dimension: minimum line width
• Scalable design rules - lambda (λ) parameter:
– all rules are defined as a function of a single parameter λ
– scaling of the minimum dimension: change the value of λ (linear scaling)
– linear scaling is only possible over a limited range of dimensions (1-3 µm)
– the rules are conservative: they have to represent the worst-case rules for the whole set
– for small projects they offer a flexible and versatile design methodology
• Micron rules - absolute dimensions:
– can exploit the features of a given process to a maximum degree
– scaling and porting designs between technologies is more demanding: manually or using advanced CAD tools!
• Ex.: Scalable CMOS design rules
Integrated Electronic Systems Lab 167
6: CMOS Technology
CMOS Process Layers
Layer                     Color Representation
Well (p,n)                Yellow
Active Area (n+,p+)       Green
Select (p+,n+)            Green
Polysilicon               Red
Metal1                    Blue
Metal2                    Magenta
Contact To Poly           Black
Contact To Diffusion      Black
Via                       Black
Integrated Electronic Systems Lab 168
6: CMOS Technology
Intra-Layer Design Rules (λ)
[Figure: minimum dimensions and distances (in λ) for the well, active, select, polysilicon, metal1, metal2 and contact/via-hole layers; well spacing is given for same potential and different potential]
Integrated Electronic Systems Lab 169
6: CMOS Technology
Inter-Layer Design Rules - Transistor Layout (λ)
[Figure: transistor layout rules for poly over active (n+), including the distance to the well boundary]
Integrated Electronic Systems Lab 170
6: CMOS Technology
Inter-Layer Design Rules - Contact and Via (λ)
[Figure: contact and via rules (in λ) for the metal-to-active contact, the metal-to-poly contact and the metal1-to-metal2 via, showing the layers n+, poly, m1 and m2]
Integrated Electronic Systems Lab 171
6: CMOS Technology
Select Layer (λ)
[Figure: select-layer rules (in λ) for the contact to the well and the contact to the substrate]
Integrated Electronic Systems Lab 172
6: CMOS Technology
CMOS Inverter Layout
Poly
n+ n+ n+p+ p+
SiO2
n-well
SiO2
Metal
D
Gate oxide
N-channel transistor P-channel transistor
D SSp+
Si (p)
p+
InGND VDD
Out
Integrated Electronic Systems Lab 173
6: CMOS Technology
CMOS Latchup
n+
p-type substrate
n+ p+ p+
V (5 V)DD
n+p+
V (0 V)SS
BSDDSB
n-well
Rp
Rn
vO
pnp transistor
npn transistor
• The parasitic bipolar transistors can destroy the CMOS circuitry
• The bipolar devices are normally inactive
• The collector of each bipolar transistor is connected to the base of the other in a positive feedback structure
• The latchup effect can occur when:
1. Both bipolar transistors conduct
2. The product of the gains of the two transistors in the feedback loop exceeds unity (βP βN > 1)
Integrated Electronic Systems Lab
7. Complementary MOS (CMOS) Logic Design
Integrated Electronic Systems Lab7: CMOS Logic 175
Basic CMOS Logic Gate Structure
• PMOS and NMOS switching networks are complementary
⇒Either the PMOS or the NMOS network is on while the other is off
⇒No static power dissipation
VDD
Logic Inputs
PMOS SwitchingNetwork
NMOS SwitchingNetwork
Y
Integrated Electronic Systems Lab7: CMOS Logic 176
CMOS NOR Gate
M N
v I
M P
V = 5 VDD
vo
21
51
V = 5 VDD
A B
Z
101
101
21
21
NOR Gate Truth Table
A B | Z
0 0 | 1
0 1 | 0
1 0 | 0
1 1 | 0
$Z = \overline{A + B}$
Integrated Electronic Systems Lab7: CMOS Logic 177
Transistor Sizing for CMOS Gates: Review
Goal: Keep the delay times equal to those of the reference inverter design under worst-case input conditions
Example: 2-input CMOS NOR gate
- Each transistor of the NMOS network can discharge the load capacitance C on its own ⇒ same size as the NMOS transistor of the reference inverter
- The PMOS network conducts only when AB = 00 (transistors in series) ⇒ each PMOS must be twice as wide (on-resistance is proportional to (W/L)^-1)
Integrated Electronic Systems Lab7: CMOS Logic 178
CMOS NAND Gate
M N
v I v
O
M P
V = 5 VDD
21
51
Z
V = 5 VDD
41
A
B
51
51
41
NAND Gate Truth Table
A B | Z
0 0 | 1
0 1 | 1
1 0 | 1
1 1 | 0
$Z = \overline{AB}$
Integrated Electronic Systems Lab7: CMOS Logic 179
Multi-Input NAND Gate
C
V = 5 VDD
A
B
C
Y
D
E
Y
15
15
15
15
15
110
110
110
110
110
$Y = \overline{ABCDE}$
Why should one prefer a NAND gate rather than a NOR gate?
Integrated Electronic Systems Lab7: CMOS Logic 180
Steps in Constructing Graphs for NMOS and PMOS Networks (I)
$Y = \overline{A + B(C + D)}$
+5 V
M AA
Y
PMOS Switch Network
ABCD
M CC MDD
MBB
C + D
B (C + D)
A + B (C + D)
Integrated Electronic Systems Lab7: CMOS Logic 181
Steps in Constructing Graphs for NMOS and PMOS Networks (II)
0
1
2
A
B
C
D
(b) NMOS Graph
1
+5 V
M AA21
Y
PMOS Switch Network
ABCD
2
3
M CC MDD
MBB
41
41
41
0
1
(a)
3
4
2
0
1
2
A
B
CD
23
4
5
(c) NMOS Graph with
New Nodes Added
0
1
2
A
B
C
D
23
4
5
(d) Graph with
PMOS Arcs Added
Integrated Electronic Systems Lab7: CMOS Logic 182
Steps in Constructing Graphs for NMOS and PMOS Networks (III)
+5 V
M CC MDDMAA
MBB
41
21
Y
41
41
4
B
A
C
D
151
151
151
7.51
Final CMOS Circuit
3
4
5
2
1
0
1
2
A
B
C
D
23
4
5
Graph with
PMOS Arcs Added
Integrated Electronic Systems Lab7: CMOS Logic 183
Summary
• AND - serially connected FET
• OR - parallel connected FET
• NMOS network implements “zeros”
• PMOS network implements “ones”
• W/L ratio has to be determined as a design parameter
+5 V
M CC MDDMAA
MBB
41
21
Y
41
41
B
A
C
D
151
151
151
7.51
Integrated Electronic Systems Lab7: CMOS Logic 184
CMOS Gate Design: Minimum Size Vs. Performance (I)
CMOS circuit with only minimum size transistors
Considerable savings in chip area, but increased logic delay
Example:
Integrated Electronic Systems Lab7: CMOS Logic 185
CMOS Gate Design: Minimum Size Vs. Performance (II)
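(W/L) for the PMOS network = 2/3:
$$\tau_{PLH} = \frac{(5/1)}{(2/3)}\,\tau_{PLHI} = 7.5\,\tau_{PLHI}$$
where $\tau_{PLHI}$ is the $\tau_{PLH}$ of the reference inverter.
For the NMOS network:
$$\tau_{PHL} = 2\,\tau_{PHLI}$$
With $\tau_{PHLI} = \tau_{PLHI}$ for the symmetrical reference inverter, the average propagation delay of the minimum size logic gate is:
$$\tau_P = \frac{\tau_{PHL} + \tau_{PLH}}{2} = \frac{2\,\tau_{PHLI} + 7.5\,\tau_{PLHI}}{2} = \frac{9.5}{2}\,\tau_{PLHI} = 4.75\,\tau_{PLHI}$$
The minimum size gate will be 4.75 times slower than the reference inverter when driving the same load capacitance.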
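The factor 4.75 can be re-derived numerically. The sketch below is only an illustration: it assumes that every transistor of the gate has the minimum ratio W/L = 2/1, that the worst-case pull-up path has three PMOS devices in series and the worst-case pull-down path has two NMOS devices in series, and that delay scales with 1/(W/L). The function names are hypothetical.

```python
def series_wl(ratios):
    """Equivalent W/L of transistors in series (their on-resistances add)."""
    return 1.0 / sum(1.0 / r for r in ratios)


def delay_factor(ref_wl, path_ratios):
    """Delay of a series path relative to a reference device with ratio ref_wl."""
    return ref_wl / series_wl(path_ratios)


# Reference inverter from the slides: NMOS 2/1, PMOS 5/1.
REF_N, REF_P = 2.0, 5.0
MIN_WL = 2.0  # assumed minimum-size device, W/L = 2/1

plh = delay_factor(REF_P, [MIN_WL] * 3)  # assumed: three PMOS in series
phl = delay_factor(REF_N, [MIN_WL] * 2)  # assumed: two NMOS in series
average = (plh + phl) / 2.0

print(f"tau_PLH = {plh:.2f} x reference")
print(f"tau_PHL = {phl:.2f} x reference")
print(f"tau_P   = {average:.2f} x reference")
```

The same helper can be used to size a gate the other way round: choose device ratios until both delay factors return to 1.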
Integrated Electronic Systems Lab7: CMOS Logic 186
Power-Delay Product (PDP)
The PDP is an important figure of merit for a logic technology
$$PDP = P_{AV}\,\tau_P$$
For CMOS: $P_{AV} = C V_{DD}^2 f$ with $f = 1/T$
CMOS switching waveform
Integrated Electronic Systems Lab7: CMOS Logic 187
Power-Delay Product (cont’d)
• The period T must satisfy: $T \ge t_r + t_a + t_f + t_b$
• Assumptions: at high frequencies $t_a \to 0$ and $t_b \to 0$; $t_r$ and $t_f$ account for approximately 80 % of the total transition time
For a symmetrical inverter:
$$T \ge \frac{2 t_r}{0.8} = \frac{2(2\tau_P)}{0.8} = 5\,\tau_P$$
$$PDP \le \frac{C V_{DD}^2}{5\,\tau_P}\,\tau_P = \frac{C V_{DD}^2}{5}$$
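As a quick numerical illustration of this bound, the following sketch evaluates the shortest period T = 5 τ_P and the resulting PDP limit C·VDD²/5. The load capacitance, supply voltage and delay values are hypothetical and only chosen to make the arithmetic concrete.

```python
def min_period(tau_p):
    """Shortest clock period allowed by the bound T >= 5 * tau_P."""
    return 5.0 * tau_p


def pdp_upper_bound(c_load, v_dd):
    """Upper bound of the power-delay product: C * VDD^2 / 5 (in joules)."""
    return c_load * v_dd ** 2 / 5.0


# Hypothetical example values (not from the slides).
c_load = 100e-15   # 100 fF load
v_dd = 5.0         # 5 V supply
tau_p = 1e-9       # 1 ns average propagation delay

t_min = min_period(tau_p)
p_av_max = c_load * v_dd ** 2 / t_min   # P_AV = C * VDD^2 * f at f = 1 / T_min
pdp_max = pdp_upper_bound(c_load, v_dd)

print(f"T_min = {t_min:.2e} s, P_AV,max = {p_av_max:.2e} W, PDP <= {pdp_max:.2e} J")
```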
Integrated Electronic Systems Lab
8. Passtransistor and Transmission Gate Logic
Integrated Electronic Systems Lab8: Transmission Gate Logic 189
Passtransistor Logic: Basic Principle
control
inV outV
inV outV
control
Idea:
Implementation:
Vin control Vout
1 0 x
1 1 1
0 0 x
0 1 0
0=open1=closed
Integrated Electronic Systems Lab7b: Transmission Gate Logic 190
Passtransistor Logic: NEXOR Realisation
B
A
OUT
A B OUT
0 0 1
0 1 0
1 0 0
1 1 1
A
B
Integrated Electronic Systems Lab7b: Transmission Gate Logic 191
Passtransistor: Charging Characteristics
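NMOS:
$V_{in} = V_{DD}$, $V_{out}(t = 0) = 0$
$V_{ctrl}(t < 0) = 0$, $V_{ctrl}(t \ge 0) = V_{DD}$
[Figure: NMOS passtransistor with gate voltage V_ctrl(t) and V_GS charging C_out; V_out(t) rises towards V_DD - V_T(V_SB)]
The transistor is in saturation during the charging process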
Integrated Electronic Systems Lab7b: Transmission Gate Logic 192
Passtransistor Cascades
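Passtransistors in series, all gates at V_DD, V_in = V_DD: every node along the chain reaches
$$V_{max} = V_{DD} - V_T(V_{max})$$
Passtransistor output driving the gate of the next passtransistor, V_in = V_DD:
$$V_{max,1} = V_{DD} - V_T(V_{max,1})$$
$$V_{max,2} = V_{max,1} - V_T(V_{max,2}) \approx V_{DD} - 2V_T$$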
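Because V_T depends on the output voltage through the body effect, V_max is defined implicitly. The sketch below solves it by fixed-point iteration. It assumes the usual body-effect expression V_T = V_T0 + γ(√(2φ_F + V_SB) − √(2φ_F)) with V_SB equal to the output voltage, and example parameters V_T0 = 0.75 V, γ = 0.5 V^0.5, 2φ_F = 0.6 V and V_DD = 5 V; the function names are hypothetical.

```python
import math


def threshold(v_sb, v_t0=0.75, gamma=0.5, two_phi_f=0.6):
    """NMOS threshold voltage including body effect (assumed model)."""
    return v_t0 + gamma * (math.sqrt(two_phi_f + v_sb) - math.sqrt(two_phi_f))


def pass_high(v_gate, iterations=60):
    """Solve V = v_gate - V_T(V) by fixed-point iteration.

    V is the highest voltage an NMOS passtransistor can pass when its
    gate is held at v_gate (source = output node, so V_SB = V).
    """
    v = v_gate
    for _ in range(iterations):
        v = v_gate - threshold(v)
    return v


v_dd = 5.0
v_max_1 = pass_high(v_dd)      # first stage, gate at VDD
v_max_2 = pass_high(v_max_1)   # second stage, gate driven by the first output

print(f"V_max,1 = {v_max_1:.3f} V")
print(f"V_max,2 = {v_max_2:.3f} V")
```

The second value illustrates why a passtransistor output should not drive the gate of another passtransistor: each such stage loses a further threshold voltage.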
Integrated Electronic Systems Lab7b: Transmission Gate Logic 193
Passtransistor: Discharging Characteristics
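NMOS:
$V_{in} = 0$, $V_{out}(t = 0) = V_{DD} - V_T(V_{SB})$
$V_{ctrl}(t < 0) = 0$, $V_{ctrl}(t \ge 0) = V_{DD}$
[Figure: NMOS passtransistor with gate voltage V_ctrl(t) and V_GS discharging C_out; V_out(t) falls from V_DD - V_T(V_SB)]
The transistor is always in nonsaturation during the discharging process
NMOS passtransistor: discharging is faster than charging, since the device impedance is lower in nonsaturation than in saturation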
Integrated Electronic Systems Lab7b: Transmission Gate Logic 194
Passtransistor: Charging Characteristics
PMOS Charging Process:
$V_{in} = V_{DD}$, $V_{out}(t = 0) = 0$
$V_{ctrl}(t < 0) = V_{DD}$, $V_{ctrl}(t \ge 0) = 0$
The output is charged to VDD (the transistor is initially saturated and then goes into nonsaturated mode)
PMOS Discharging Process:
$V_{in} = 0$, $V_{out}(t = 0) = V_{DD}$
$V_{ctrl}(t < 0) = V_{DD}$, $V_{ctrl}(t \ge 0) = 0$
The output is discharged to VT (the transistor is saturated and finally goes into cut-off mode)
Integrated Electronic Systems Lab7b: Transmission Gate Logic 195
From Passtransistors to Transmission Gates
Logic Level   NMOS            PMOS    CMOS
Logic 0       0               V_TP    0
Logic 1       V_DD - V_TN     V_DD    V_DD
ctrlV
DDV
outV
outC
inV
ctrlV
CMOS Transmission Gate
ctrlV
ctrlV
outVinV
Symbol: CMOS Transmission Gate
$$I_{DN} + I_{DP} = C_{out}\,\frac{dV_{out}}{dt}$$
• Bidirectional resistive connection between the input and output terminals
• Useful in both analog design (e.g. for relay contacts) and digital design (e.g. for multiplexers)
Integrated Electronic Systems Lab7b: Transmission Gate Logic 196
Transmission Gate: Operation States
Operating states the transistors pass through while the output is charged from 0 (initial voltage) to VDD (final voltage):
Range of Vout               Mn           Mp
0 ... V_TP                  saturated    saturated
V_TP ... V_DD - V_TN        saturated    nonsaturated
V_DD - V_TN ... V_DD        cut-off      nonsaturated
Integrated Electronic Systems Lab7b: Transmission Gate Logic 197
CMOS Transmission Gate: On-Resistance
On-resistance of a transmission gate, including body effect
Parameters: $V_{T0N} = 0.75\,V$, $V_{T0P} = -0.75\,V$, $\gamma = 0.5\,V^{0.5}$, $2\phi_F = 0.6\,V$, $K_p = 20\,\mu A/V^2$, $K_n = 50\,\mu A/V^2$
$$R_{EQ} = \frac{R_{onN}\,R_{onP}}{R_{onN} + R_{onP}}$$
Integrated Electronic Systems Lab7b: Transmission Gate Logic 198
CMOS Transmission Gate (III)
• Charge sharing problem
$$V_F = \frac{C_{BIG} V_{BIG} + C_{SMALL} V_{SMALL}}{C_{BIG} + C_{SMALL}}$$
Example: CSMALL = 0.02 pF, VSMALL = 5 V, VBIG = 0 V, CBIG = 0.2 pF (about 10 standard loads in a 0.5 µm CMOS process)
VF = 0.45 V ⇒ The 'big' capacitor has forced node A to a voltage close to a '0'
Node A has to be isolated from node Z by a buffer (e.g. an inverter) between the two nodes if node A is not strong enough to overcome the 'big' capacitor
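The example value can be checked with a few lines of code. This is a minimal sketch of the charge-conservation formula with the numbers from the example; the function name is hypothetical.

```python
def shared_voltage(c_big, v_big, c_small, v_small):
    """Final voltage after two charged capacitors are connected together."""
    return (c_big * v_big + c_small * v_small) / (c_big + c_small)


# Values from the example: C_SMALL = 0.02 pF at 5 V, C_BIG = 0.2 pF at 0 V.
v_f = shared_voltage(c_big=0.2e-12, v_big=0.0, c_small=0.02e-12, v_small=5.0)
print(f"V_F = {v_f:.2f} V")
```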
Integrated Electronic Systems Lab7b: Transmission Gate Logic 199
Transmission Gate Logic
S
S
B
A
S
F
Multiplexer:
SBASF +=
B
B
A
A
B
F
Equivalence (NEXOR):
$$F = AB + \bar{A}\bar{B} = \overline{A \oplus B}$$
F
B
B
A
Alternate equivalence logic circuit:
Integrated Electronic Systems Lab7b: Transmission Gate Logic 200
Function Implementation with Passtransistor Logic
dcbdabdbadbF +++=
Karnaugh Map of F:
1 0 0 1
0 0 1 0
1 0 1 1
1 1 1 1a
b
cd
F
Step 1: Find a minimum decomposition such that each selected field depends on one variable, constant 0 or constant 1 only (in our case: decompose with combinations of the literals b and d)
Integrated Electronic Systems Lab7b: Transmission Gate Logic 201
Function Implementation with Passtransistor Logic
F
VDD
Step 2: Attach the decomposition variables to the selection lines
Step 3: Determine the line input signals (implement the inverted function to compensate for the output inverter)
b b d d
Sustainer transistor
c
a
a
0
Integrated Electronic Systems Lab
9. Memory Elements and Dynamic Logic
Integrated Electronic Systems Lab 203
9: Memory Elements & Dynamic Logic
RS Flipflop
The RS-flipflop is a bistable element with two inputs:
• Reset (R), resets the output Q to 0
• Set (S), sets the output Q to 1
Integrated Electronic Systems Lab 204
9: Memory Elements & Dynamic Logic
RS-Flipflops
There are two ways to implement a RS-flipflop:
• based on NOR-gates: positive logic
• based on NAND-gates: negative logic
Integrated Electronic Systems Lab 205
9: Memory Elements & Dynamic Logic
Clocked RS-Latch
To achieve a synchronous operation, we can add a clock signal
• Clock= 0: R and S have no influence upon the state of the circuit
• Clock= 1: R and S can change the state of the circuit
Integrated Electronic Systems Lab 206
9: Memory Elements & Dynamic Logic
D-Latch
For storing data it is more convenient to have a data input. This is realized by using the data input as set signal and the inverted data input as reset signal.
• Clock= 0: Q unchanged
• Clock= 1: Q= D
Integrated Electronic Systems Lab 207
9: Memory Elements & Dynamic Logic
Transmission Gate D-Latch
An alternative way to build a D-latch is to use transmission gates thus reducing the complexity (transistor count) of the circuit.
• Load= 0: Latch stores data
• Load= 1: Latch is transparent (output= input)
Integrated Electronic Systems Lab 208
9: Memory Elements & Dynamic Logic
Clocked JK-Latch
Another extension of a simple RS-flipflop is the JK-latch
• J: enables/disables the low to high transition of the latch
• K: enables/disables the high to low transition of the latch
Integrated Electronic Systems Lab 209
9: Memory Elements & Dynamic Logic
Edge Triggered Logic
If the D-latch presented previously were used in a synchronous circuit, e.g. a counter, it would malfunction:
While the clock is low the latches hold the state Q(n), and the feedback network applies the state Q(n+1) to the latch inputs. When the clock goes high the latches change to the new state Q(n+1), and the feedback logic now calculates the state Q(n+2). Since the clock is still high, the latches falsely change to the state Q(n+2).
What we need is a storage element that changes only once per clock cycle: this is edge-triggered logic.
Integrated Electronic Systems Lab 210
9: Memory Elements & Dynamic Logic
Edge Triggered JK-Flipflop
A straightforward way to implement an edge-triggered JK-flipflop is to use a master-slave flipflop.
• Clock= 1: The master (left latch) is changeable, the slave (right latch) is locked and holds the output at the current state
• Clock= 0: The master is locked and the slave changes its state if necessary
The output value is the state of the master at the falling edge of the clock signal
Integrated Electronic Systems Lab 211
9: Memory Elements & Dynamic Logic
Edge Triggered TG D-Flipflop
Circuitry of an edge-triggered flipflop
• Clk= 0: First stage is loaded, second stage is locked and stores data
• Clk= 1: First stage is locked, second stage is loaded
With the rising edge (low to high transition) the new value is available at the output
Integrated Electronic Systems Lab 212
9: Memory Elements & Dynamic Logic
Transmission Gate JK- Flipflop
It is also possible to build a JK-flipflop with transmission gates as an edge-triggered flipflop.
This ensures that the output state can only change at the rising edge of the clock signal
Integrated Electronic Systems Lab 213
9: Memory Elements & Dynamic Logic
Dynamic D-Flipflop
Dynamic logic utilizes the parasitic capacitances of transistors and interconnect to store the current state. This reduces the transistor count but forbids a static operation. An application of dynamic circuits is the dynamic D-flipflop.
Integrated Electronic Systems Lab 214
9: Memory Elements & Dynamic Logic
Dynamic Shift Register
Another application is the dynamic shift register. It also has a lower transistor count but requires a non-overlapping two-phase clock, which is expensive to generate.
Integrated Electronic Systems Lab 215
9: Memory Elements & Dynamic Logic
Dynamic Chain Latch
Integrated Electronic Systems Lab 216
9: Memory Elements & Dynamic Logic
Dynamic RAM
A special kind of memory is dynamic RAM. Its major advantage is the low transistor count: DRAM requires only one transistor and one (small) capacitor per bit.
The first disadvantage is the destructive read. After reading a cell, the read value must be written back to keep the data in the RAM.
The second disadvantage is the limited duration of storage. After some milliseconds the cell must be refreshed (read and written back).
Integrated Electronic Systems Lab 217
9: Memory Elements & Dynamic Logic
Dynamic RAM
Integrated Electronic Systems Lab 218
9: Memory Elements & Dynamic Logic
Clocking
Clock Signal:
• used to synchronize data flow through a digital network
⇒ clocked static or dynamic circuits
• problem: clock skew (delay caused by clock distribution wires)
Condition for nonoverlapping clock signals $\phi_1(t)$ and $\phi_2(t)$:
$$\phi_1(t)\,\phi_2(t) = 0 \quad \forall t$$
Ideal nonoverlapping 2-phase clocks
Integrated Electronic Systems Lab 219
9: Memory Elements & Dynamic Logic
Basic 2-phase clocking
Integrated Electronic Systems Lab 220
9: Memory Elements & Dynamic Logic
Single and Multiple Clock Signals
⇒ For nonoverlapping clock phases, fine-tuned and well-designed delay lines (realized as transmission gates) have to be inserted in order to avoid overlapping of $\phi$ and $\bar{\phi}$.
Single clock 2-phase timing
Integrated Electronic Systems Lab 221
9: Memory Elements & Dynamic Logic
Generation of inverted clock phase
TG delay circuit
Integrated Electronic Systems Lab 222
9: Memory Elements & Dynamic Logic
Pseudo 2-φ clocking
Integrated Electronic Systems Lab 223
9: Memory Elements & Dynamic Logic
Clocked Dynamic Logic⇒ Synchronized data transfer
Shift register
1) Upper Frequency Limitation: Charging and Discharging Times
Clocked shift register circuit
Integrated Electronic Systems Lab 224
9: Memory Elements & Dynamic Logic
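Time constant for charging and discharging:
$$\tau_{TG} = R_{TG}\,C_L$$
where
$$C_L = C_{TG} + C_{in} + C_{line}, \qquad C_{in} = C_{ox}\left[(WL)_n + (WL)_p\right]$$
VA = VDD (Vin(0) = 0):
$$V_{in}(t) \cong V_{DD}\left[1 - e^{-t/\tau_{TG}}\right]$$
The inverter switches when Vin = VIH, which occurs after
$$t_1 \cong -\tau_{TG}\,\ln\left[1 - \frac{V_{IH}}{V_{DD}}\right]$$
VA = 0 (Vin(0) = VDD):
$$V_{in}(t) \cong V_{DD}\,e^{-t/\tau_{TG}}$$
The time until Vin reaches VIL is given by
$$t_0 \cong -\tau_{TG}\,\ln\left[\frac{V_{IL}}{V_{DD}}\right] = \tau_{TG}\,\ln\left[\frac{V_{DD}}{V_{IL}}\right]$$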
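The two switching times follow directly from the exponential charging and discharging curves. The sketch below evaluates them for hypothetical values of R_TG, C_L, V_IH and V_IL (none of these numbers come from the slides); the function names are illustrative.

```python
import math


def time_to_rise(v_ih, v_dd, tau):
    """Time for V_in = VDD * (1 - exp(-t / tau)) to reach V_IH."""
    return -tau * math.log(1.0 - v_ih / v_dd)


def time_to_fall(v_il, v_dd, tau):
    """Time for V_in = VDD * exp(-t / tau) to reach V_IL."""
    return -tau * math.log(v_il / v_dd)


# Hypothetical example values.
r_tg = 5e3        # transmission gate resistance, 5 kOhm
c_load = 50e-15   # C_L = C_TG + C_line + C_in, 50 fF
v_dd, v_ih, v_il = 5.0, 3.0, 2.0

tau_tg = r_tg * c_load
t_1 = time_to_rise(v_ih, v_dd, tau_tg)
t_0 = time_to_fall(v_il, v_dd, tau_tg)
print(f"tau_TG = {tau_tg:.2e} s, t_1 = {t_1:.2e} s, t_0 = {t_0:.2e} s")
```

The clock phase must stay active at least as long as the larger of the two times, which sets the upper frequency limit.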
Integrated Electronic Systems Lab 225
9: Memory Elements & Dynamic Logic
2) Lower Frequency Limitation: Charge Leakage
Leakage path in a CMOS TG
The load capacitance seen by the transmission gate (TG) is
$$C_L = C_{TG} + C_{line} + C_{in}$$
The depletion capacitance contributions to CL are due to the reverse-biased pn junctions in the MOS transistors. As shown in the figure above, a leakage current flows across the reverse-biased pn junctions. The influence of this leakage current on the charge stored in CL depends on the values of ILp and ILn.
Integrated Electronic Systems Lab 226
9: Memory Elements & Dynamic Logic
Charge leakage problem in CMOS TG
Integrated Electronic Systems Lab 227
9: Memory Elements & Dynamic Logic
With
$$I_L = I_{Ln} - I_{Lp}$$
the influence of the leakage current on Vin is given by
$$C_L\,\frac{dV_{in}}{dt} = -I_L$$
If ILp > ILn the capacitance is charged by IL; otherwise it is discharged, or the voltage remains constant in the ideal case ILp = ILn.
$$\frac{dQ_{store}}{dt} = I_{Lp} - I_{Ln}, \qquad C_{store} = \frac{dQ_{store}}{dV}$$
Assuming that the leakage currents ILp and ILn are constant and that the node charge-voltage relation is linear, of the form
$$Q_{store} = C_{store}\,V$$
Integrated Electronic Systems Lab 228
9: Memory Elements & Dynamic Logic
it follows (because Cstore is constant) that
$$C_{store}\,\frac{dV}{dt} = I_{Lp} - I_{Ln}$$
The solution of this equation is
$$V(t) = \frac{(I_{Lp} - I_{Ln})}{C_{store}}\,t + V(0)$$
If ∆V is the maximum allowed voltage change:
$$t_{max} = \frac{C_{store}\,\Delta V}{I_L}$$
Charge leakage circuit
Integrated Electronic Systems Lab 229
9: Memory Elements & Dynamic Logic
With Tmax = 2 tmax (the longest allowed clock period), the minimum frequency is
$$f_{min} \cong \frac{1}{2\,t_{max}} \cong \frac{I_L}{2\,C_{store}\,\Delta V}$$
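To get a feeling for the order of magnitude, the sketch below evaluates the hold time and the minimum clock frequency for a constant net leakage current. The capacitance, allowed voltage change and leakage current are hypothetical values, and the magnitude of the leakage current is used so that the result is positive regardless of its direction.

```python
def max_hold_time(c_store, delta_v, i_leak):
    """Longest time the node may float: t_max = C_store * dV / |I_L|."""
    return c_store * delta_v / abs(i_leak)


def min_clock_frequency(c_store, delta_v, i_leak):
    """Lowest usable clock frequency: f_min = 1 / (2 * t_max)."""
    return 1.0 / (2.0 * max_hold_time(c_store, delta_v, i_leak))


# Hypothetical example values.
c_store = 50e-15   # 50 fF storage capacitance
delta_v = 1.0      # 1 V allowed voltage change
i_leak = 0.1e-9    # 0.1 nA net leakage current

t_max = max_hold_time(c_store, delta_v, i_leak)
f_min = min_clock_frequency(c_store, delta_v, i_leak)
print(f"t_max = {t_max:.2e} s, f_min = {f_min:.1f} Hz")
```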
The transmission gate capacitance is
$$C_T \cong C_G + C_{line} + C_{ols} + C_{old} + C_{SBp}(V) + C_{DBn}(V)$$
Transmission gate capacitance
Integrated Electronic Systems Lab 230
9: Memory Elements & Dynamic Logic
So the storage capacitance can be estimated by voltage averaging of this expression:
$$C_{store} \cong C_G + C_{line} + C_{ols} + C_{old} + K(0, V_{DD})\left[C_{SBp} + C_{DBn}\right]$$
For a realistic analysis of the charge leakage problem, the dependence of the leakage currents on the reverse bias voltage has to be taken into consideration.
Integrated Electronic Systems Lab 231
9: Memory Elements & Dynamic Logic
Charge Sharing
Basic charge sharing circuit
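t<0: (TG switched off)
$$V_1(t<0) = V_{DD}, \qquad V_2(t<0) = 0, \qquad Q_T = C_1 V_{DD}$$
t>0: (TG switched on)
$$V_1(t>0) = V_2(t>0) = V_f, \qquad Q_T = (C_1 + C_2)\,V_f$$
$$V_f = \frac{C_1}{C_1 + C_2}\,V_{DD} = \frac{1}{1 + (C_2/C_1)}\,V_{DD}$$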
Integrated Electronic Systems Lab 232
9: Memory Elements & Dynamic Logic
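If we design a circuit with C1 = C2, then Vf = VDD/2, indicating a drop in voltage. A reliable forward transfer of a logic 1 state from C1 to C2 requires C1 >> C2 to ensure that Vf ≈ VDD.
Let us specify arbitrary initial conditions V1(0) and V2(0) on the capacitors, giving the system a total charge of
$$Q_T = C_1 V_1(0) + C_2 V_2(0)$$
Applying basic circuit analysis gives the time-dependent voltages as
$$V_1(t) = V_f + \left[V_1(0) - V_f\right] e^{-t/\tau}, \qquad V_2(t) = V_f + \left[V_2(0) - V_f\right] e^{-t/\tau}$$
where the time constant is given by
$$\tau = R_{TG}\,C_{eq} \quad \text{with} \quad C_{eq} = \frac{C_1 C_2}{C_1 + C_2}$$
In the limit t → ∞, V1 = V2 = Vf:
$$V_f = \frac{C_1 V_1(0) + C_2 V_2(0)}{C_1 + C_2}$$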
Integrated Electronic Systems Lab 233
9: Memory Elements & Dynamic Logic
This agrees with the result from simple charge conservation by noting that the final charge distributes according to
f21T V)CC(Q +=
Transient voltage behavior for initial conditions of V1(0)=VDD and V2(0)=0
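The transient can be reproduced with the closed-form solution of the two-capacitor circuit, in which both voltages approach V_f exponentially with time constant τ = R_TG·C_eq. The sketch below assumes a constant TG resistance and uses hypothetical component values; the function name is illustrative.

```python
import math


def charge_sharing(t, c1, c2, v1_0, v2_0, r_tg):
    """Voltages on C1 and C2 at time t after the TG (resistance r_tg) closes."""
    c_eq = c1 * c2 / (c1 + c2)                    # series combination
    tau = r_tg * c_eq
    v_f = (c1 * v1_0 + c2 * v2_0) / (c1 + c2)     # charge conservation
    decay = math.exp(-t / tau)
    v1 = v_f + (v1_0 - v_f) * decay
    v2 = v_f + (v2_0 - v_f) * decay
    return v1, v2


# Hypothetical values: equal capacitors, V1(0) = VDD = 5 V, V2(0) = 0.
c1 = c2 = 50e-15
r_tg = 5e3
tau = r_tg * (c1 * c2 / (c1 + c2))
for multiple in (0.0, 1.0, 3.0, 10.0):
    v1, v2 = charge_sharing(multiple * tau, c1, c2, 5.0, 0.0, r_tg)
    print(f"t = {multiple:4.1f} tau: V1 = {v1:.3f} V, V2 = {v2:.3f} V")
```

At every instant C1·V1 + C2·V2 stays equal to the initial charge, which is the same conservation argument used for the final value.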
Integrated Electronic Systems Lab 234
9: Memory Elements & Dynamic Logic
Charge sharing among N TG-connected capacitors
Initial charge: $Q_T = \sum_{i=1}^{N} C_i V_i(0)$
After connecting nodes: $Q_T = \left(\sum_{i=1}^{N} C_i\right) V_f$
Final voltage: $V_f = \dfrac{\sum_{i=1}^{N} C_i V_i(0)}{\sum_{i=1}^{N} C_i}$
Integrated Electronic Systems Lab 235
9: Memory Elements & Dynamic Logic
Dynamic Logic
• The pull-up (pull-down) network of static CMOS is replaced by a single precharge (discharge) transistor. The remaining network then conditionally discharges (charges up) the output in a second operation phase
• One logic level is held by dynamic charge storage
• Transistor count is reduced from 2n (static CMOS) to n+2 for dynamic precharged CMOS (but now: 2 phases of operation)
Dynamic nMOS Inverter (Single clock, 2 phases)
Basic dynamic nMOS inverter
Integrated Electronic Systems Lab 236
9: Memory Elements & Dynamic Logic
Precharge Phase
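If Vin = 0 then
$$\tau_{ch} = \frac{C_{out}}{\beta_p\,(V_{DD} - |V_{Tp}|)} = R_p\,C_{out}$$
Worst case (Vin = VDD):
$$\tau_{ch,max} = R_p\,(C_{out} + C_n)$$
$$t_{ch,max} = \tau_{ch,max}\left[\frac{2|V_{Tp}|}{V_{DD} - |V_{Tp}|} + \ln\left(\frac{2(V_{DD} - |V_{Tp}|)}{V_0} - 1\right)\right]$$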
Dynamic nMOS inverter: precharge and evaluate
Integrated Electronic Systems Lab 237
9: Memory Elements & Dynamic Logic
Evaluation Phase
If M1 is switched on and M1 and Mn are designed with the same channel width W, the discharge time constant is given by
$$\tau_{dis} = \frac{(L_1 + L_n)\,C_{out}}{k'_n\,W\,(V_{DD} - V_{Tn})}$$
Precharge network for worst case
Integrated Electronic Systems Lab 238
9: Memory Elements & Dynamic Logic
Evaluation discharge network
$$t_{dis} = \tau_{dis}\left[\frac{2V_{Tn}}{V_{DD} - V_{Tn}} + \ln\left(\frac{2(V_{DD} - V_{Tn})}{V_0} - 1\right)\right]$$
Maximum clock frequency
$$t_M = \max(t_{ch,max},\, t_{dis})$$
$$f_{max} \cong \frac{1}{2\,t_M}$$
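The precharge and discharge times share the same functional form, so one helper covers both. The sketch below is an illustration under stated assumptions: threshold voltages are entered as magnitudes, V_0 is taken as 10 % of V_DD, and the two time constants are hypothetical numbers rather than values from the slides.

```python
import math


def transition_time(tau, v_dd, v_t, v_0):
    """t = tau * [2*VT/(VDD - VT) + ln(2*(VDD - VT)/V0 - 1)], VT as magnitude."""
    overdrive = v_dd - v_t
    return tau * (2.0 * v_t / overdrive + math.log(2.0 * overdrive / v_0 - 1.0))


def max_clock_frequency(t_ch_max, t_dis):
    """f_max = 1 / (2 * t_M) with t_M = max(t_ch_max, t_dis)."""
    return 1.0 / (2.0 * max(t_ch_max, t_dis))


# Hypothetical example values.
v_dd = 5.0
v_0 = 0.1 * v_dd        # assumed 10 % level
tau_ch_max = 0.2e-9     # R_p * (C_out + C_n)
tau_dis = 0.3e-9        # discharge time constant

t_ch_max = transition_time(tau_ch_max, v_dd, 0.75, v_0)
t_dis = transition_time(tau_dis, v_dd, 0.75, v_0)
f_max = max_clock_frequency(t_ch_max, t_dis)
print(f"t_ch,max = {t_ch_max:.2e} s, t_dis = {t_dis:.2e} s, f_max = {f_max:.2e} Hz")
```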
Integrated Electronic Systems Lab 239
9: Memory Elements & Dynamic Logic
Dynamic pMOS Inverter
Basic dynamic pMOS inverter
Dynamic CMOS Properties and Conditions
• single-phase clock
• inputs should change during precharge only
• inputs must be stable at the end of the precharge phase
• in the evaluation phase the output remains HIGH (LOW) or is conditionally discharged (charged)
φ=1 Precharge
φ=0 Evaluate
Integrated Electronic Systems Lab 240
9: Memory Elements & Dynamic Logic
Complex Logic
Complex dynamic logic
Integrated Electronic Systems Lab 241
9: Memory Elements & Dynamic Logic
Dynamic Cascades
pMOS blocks and nMOS blocks have to be cascaded alternately in order to avoid glitches
Cascaded nMOS-nMOS glitch problem
Dynamic cascades
Wrongly coupled stages: while the first one is in precharge, the second is in evaluation. The result of the second stage will be influenced by the precharge process of the first stage
Integrated Electronic Systems Lab 242
9: Memory Elements & Dynamic Logic
Domino CMOS Logic
Basic domino logic circuit
Integrated Electronic Systems Lab 243
9: Memory Elements & Dynamic Logic
• Domino logic: design method for glitch-free cascading of nMOS logic blocks
• Each stage is driven by φ
- Precharge during φ = 0
- Evaluation when φ = 1
• A domino logic block consists of a precharge/evaluation block and an output inverter
Precharge Phase: The gate output is precharged to logic 1 and the inverter output goes to logic 0. Logic transmission errors are avoided because the logic 0 at the inverter output prevents the next logic stage from discharging.
Evaluation Phase: Depending on the actual input values, the inverter output stays at logic 0 or is set to logic 1. The correct result signal is available at the end of the domino cascade after all stages have stabilized.
Integrated Electronic Systems Lab 244
9: Memory Elements & Dynamic Logic
Domino AND gate
Cascaded domino logic
Integrated Electronic Systems Lab 245
9: Memory Elements & Dynamic Logic
Visualization of domino effect
Domino timing
Integrated Electronic Systems Lab 246
9: Memory Elements & Dynamic Logic
Cascaded domino circuit with fanout = 2
Integrated Electronic Systems Lab 247
9: Memory Elements & Dynamic Logic
Domino Logic Properties
• Domino logic consists of either n-type or p-type blocks
• small load capacitance to be driven by the logic (one inverter only) ⇒ small transistor dimensions
• only one clock signal required
• only positive logic realizations are possible because of the input inverters ⇒ domino logic is noninverting
Functions as
cannot be directly realized in a domino chain
Integrated Electronic Systems Lab 248
9: Memory Elements & Dynamic Logic
Analysis
Domino AND4 gate
CX=C0+CT. C0 represents the capacitance due to M0, while CT is the total of all other contributions.
Integrated Electronic Systems Lab 249
9: Memory Elements & Dynamic Logic
Precharge (φ=0: Mp1 conducting, Mn1 in cutoff)
$$C_X = C_0 + C_T \cong (C_{GDn1} + C_{BDn1}) + (C_{GDp1} + C_{BDp1}) + C_G + C_{line}$$
Evaluate
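If all inputs Ai are set to logic 1, the worst case delay time can be estimated by
$$t_D \cong R_n C_n + (R_n + R_3)\,C_3 + (R_n + R_3 + R_2)\,C_2 + (R_n + R_3 + R_2 + R_1)\,C_1 + (R_n + R_3 + R_2 + R_1 + R_0)\,C_X$$
with
$$R_j = \frac{1}{k'_n\,(W/L)_j\,(V_{DD} - V_{Tn})}$$
Mp1 conducting → CX → VX ≥ VIH (logic 1)
Minimum precharge time (VX(0) = 0):
$$t_{ch} \cong \tau_{ch}\left[\frac{2|V_{Tp}|}{V_{DD} - |V_{Tp}|} + \ln\left(\frac{2(V_{DD} - |V_{Tp}|)}{V_{DD} - V_{IH}} - 1\right)\right]$$
$$\tau_{ch} = \frac{C_X}{\beta_p\,(V_{DD} - |V_{Tp}|)}$$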
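The worst-case discharge delay has the structure of an RC ladder sum: each node capacitance is multiplied by the total resistance between that node and ground. The sketch below computes this sum for the AND4 chain, ordered from the ground side (R_n, C_n) up to the output node (R_0, C_X). All device and capacitance values are hypothetical, and equal transistors are assumed for simplicity.

```python
def on_resistance(k_n, w_over_l, v_dd, v_tn):
    """R_j = 1 / (k'_n * (W/L)_j * (VDD - VTn))."""
    return 1.0 / (k_n * w_over_l * (v_dd - v_tn))


def ladder_delay(resistances, capacitances):
    """Sum over nodes of (resistance to ground) * (node capacitance).

    Both lists start at the ground side: resistances[i] connects node i
    to the node below it, capacitances[i] is attached to node i.
    """
    total, r_to_ground = 0.0, 0.0
    for r, c in zip(resistances, capacitances):
        r_to_ground += r
        total += r_to_ground * c
    return total


# Hypothetical values: five equal NMOS devices (Mn, M3, M2, M1, M0).
r = on_resistance(k_n=50e-6, w_over_l=4.0, v_dd=5.0, v_tn=0.75)
resistances = [r] * 5                                  # R_n, R_3, R_2, R_1, R_0
capacitances = [10e-15, 10e-15, 10e-15, 10e-15, 40e-15]  # C_n, C_3, C_2, C_1, C_X
t_d = ladder_delay(resistances, capacitances)
print(f"R_j = {r:.0f} Ohm, t_D = {t_d:.2e} s")
```

Because C_X is multiplied by the full chain resistance, the output node usually dominates the delay, which is one reason to keep the precharged node small.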
Integrated Electronic Systems Lab 250
9: Memory Elements & Dynamic Logic
Charge Leakage and Charge Sharing
Domino stage with pull-up MOSFET
Integrated Electronic Systems Lab 251
9: Memory Elements & Dynamic Logic
Charge sharing in a domino chain
Cout,1>>Cx1+Cx2
Integrated Electronic Systems Lab 252
9: Memory Elements & Dynamic Logic
Use of feedback to control a pull-up MOSFET for charge sharing problem
Integrated Electronic Systems Lab 253
9: Memory Elements & Dynamic Logic
NORA Logic
(NORA = NO RAce)
NORA Properties
• NORA is very insensitive to clock delay
• one clock signal and the inverted clock signal with steep slopes (short rise times) are sufficient
• no inverter is needed between the logic stages, because of the alternate use of n-type and p-type blocks
• the last stage is a clocked inverter, a C2MOS latch
• ideal for clocking pipelined logic systems
Integrated Electronic Systems Lab 254
9: Memory Elements & Dynamic Logic
The Signal Race Problem
Signal race problem
The signal race problem can be seen: a signal race can arise, when both transmission gates conduct at the same time. If the new input from TG1 reaches the input of TG2 while TG2 is still transmitting the output, the output information will be lost. Imperfect TG synchronization occurs because of normal transmission intervals or clock skew.
Integrated Electronic Systems Lab 255
9: Memory Elements & Dynamic Logic
Clock skew
tp>>tr,tf → no problems
Tskew=tp → race result critical
Integrated Electronic Systems Lab 256
9: Memory Elements & Dynamic Logic
Dynamic latch operation
φ=0 Precharge
φ=1 Evaluate
Accept data when φ=0, hold data when φ=1
Integrated Electronic Systems Lab 257
9: Memory Elements & Dynamic Logic
NORA Structuring
Integrated Electronic Systems Lab 258
9: Memory Elements & Dynamic Logic
NORA $\phi$ and $\bar{\phi}$ sections
Integrated Electronic Systems Lab 259
9: Memory Elements & Dynamic Logic
C2MOS latch
NORA pipelined logic
φ=1 Precharge
φ=0 Evaluate
Integrated Electronic Systems Lab 260
9: Memory Elements & Dynamic Logic
NORA $\phi$ and $\bar{\phi}$ sections
            $\phi$ section       $\bar{\phi}$ section
φ = 0:      P  P  locked         E  E  transp.
φ = 1:      E  E  transp.        P  P  locked
Integrated Electronic Systems Lab 261
9: Memory Elements & Dynamic Logic
NORA $\phi$ and $\bar{\phi}$ sections
            $\phi$ section       $\bar{\phi}$ section
φ = 0:      P  P  locked         E  E  transp.
φ = 1:      E  E  transp.        P  P  locked
?
0V
?
?
Integrated Electronic Systems Lab 262
9: Memory Elements & Dynamic Logic
NORA $\phi$ and $\bar{\phi}$ sections
            $\phi$ section       $\bar{\phi}$ section
φ = 0:      P  P  locked         E  E  transp.
φ = 1:      E  E  transp.        P  P  locked
?
0V
C²MOS latch locked during clock skew period!
Integrated Electronic Systems Lab 263
9: Memory Elements & Dynamic Logic
NORA $\phi$ and $\bar{\phi}$ sections
            $\phi$ section       $\bar{\phi}$ section
φ = 0:      P  P  locked         E  E  transp.
φ = 1:      E  E  transp.        P  P  locked
Precharged to 0V
The duration of the initial value of the evaluation phase (VDD) will be extended
?
?
The duration for which the logical output value is provided to the next stage will possibly be extended
And the other way round:
?
Integrated Electronic Systems Lab
Summary Dynamic Logic
Advantages of dynamic logic:
• Smaller area than static logic
• Smaller parasitic capacitances, therefore higher speed
• Reliable operation if designed correctly
Concerns / Disadvantages:
• Capacitive coupling to dynamic nodes
• Charge sharing with dynamic nodes
• Subthreshold leakage in evaluation logic
• Minority carrier injection and latchup
• Alpha particle immunity
• Vdd/gnd noise vulnerability / IR-drop
Integrated Electronic Systems Lab
10. Performance, interconnect and packaging
10: PerformanceIntegrated Electronic Systems Lab 266
Summary
Interconnect Parameters: Capacitance, Resistance, Inductance
Electrical Wire Models
• Lumped C model
• Lumped RC model
• RC chain model
• Distributed RC line model
• Transmission line model
Technology Scaling
Power and Clock Distribution
Input Protection Circuits
Static Gate Sizing
Off-Chip Driver Circuits
Packaging Technology
10: PerformanceIntegrated Electronic Systems Lab 267
Interconnect Parameters
Interconnection choices in an actual CMOS process:
• multiple layers of Aluminum (up to 7)
• polysilicon layer (at least one)
• possibility of using the heavily doped n+ and p+ layers
The wiring forms a complex geometry that introduces parasitics:
• capacitive
• resistive
• inductive
Parasitic effects reduce the performance and the reliability by:
• increasing the propagation delay
• affecting the energy dissipation and the power distribution
• introducing extra noise source
10: PerformanceIntegrated Electronic Systems Lab 268
Modern Interconnect
10: PerformanceIntegrated Electronic Systems Lab 269
Full Wire Model
Assume that all wires in a bus network are implemented in a single interconnect layer (Al), isolated from the silicon substrate and from each other by a layer of dielectric material (SiO2):
Schematic view
Physical view
Full wire circuit model:
• Consider parasitic capacitance, resistance and inductance
• Parasitics are distributed over the length of the wire
• Inter-wire parasitics: coupling effects
10: PerformanceIntegrated Electronic Systems Lab 270
Simplified (Only Capacitance) Wire Model
A simplified capacitance-only model can be used if:
• the wires are short
• the wires cross-section is large or the wire material has a low resistivity (small resistance)
Other simplified models can be obtained
1) Neglecting the inductive effects, valid when:
• the resistance of the wire is large (long Al wires with a small cross-section)
• trise and tfall of the signals are large (slow signals)
2) Neglecting the inter-wire capacitance, valid when:
• the separation between neighboring wires is large
• the wires run together for a short distance
10: PerformanceIntegrated Electronic Systems Lab 271
Wire Parallel-Plate Capacitance
Simple model - the parallel-plate capacitance:
[Figure: wire of length L, width W and height H on SiO2 of thickness tox above the substrate, showing the current flow and the electrical-field lines]
The capacitance of a wire is a function of:
• shape of the wire
• environment
• distance to substrate
• distance to surrounding wires
$$C_{wire} = C_{pp} = \frac{\varepsilon_{ox}}{t_{ox}}\,W L$$
True for W >> tox ⇒ electric field lines are orthogonal to the capacitor plates
Cwire is the total capacitance of the wire (pF)
10: PerformanceIntegrated Electronic Systems Lab 272
Wire Fringing Capacitance
• Advanced processes have a reduced W/H ratio (<1)
• The capacitance between side-wall of the wires and the substrate (fringing capacitance) must be considered!
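$$c_{wire} = c_{pp} + c_{fringe} \approx \frac{(W - H/2)\,\varepsilon_{ox}}{t_{ox}} + \frac{2\pi\,\varepsilon_{ox}}{\log(t_{ox}/H)}$$
cwire is the wire capacitance per unit length (pF/cm)
For large W/H: cfringe < cpp, cwire ≈ cpp
For W/H < 1.5 ⇒ cfringe > cpp
[Figure: wire cross-section (W, H, tOX, SiO2, substrate) modeled as a parallel-plate part of width W - H/2 plus a fringe part of diameter H; curve labels cwire, cpp and cfringe]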
10: PerformanceIntegrated Electronic Systems Lab 273
Interwire Capacitance
In multilevel interconnects technologies the wires are not completely isolated
Each wire is coupled to the:
• substrate (grounded capacitor)
• neighboring wires on the same layer (floating capacitor)
• neighboring wires on adjacent layers (floating capacitor)
Level1
Level2
CparallelCfringe
Assuming that the oxide thickness (tox = 1 µm) and the metal thickness (H = 1 µm) are held constant while the other dimensions are scaled ⇒ for W < 1.75H, Cinterwire dominates!
10: PerformanceIntegrated Electronic Systems Lab 274
Wiring Capacitances
Plate and fringe capacitance values for a typical 0.25 µm CMOS process
(rows: wire layer; columns: layer underneath)
Layer  Quantity           Field  Active  Poly  Al1  Al2  Al3  Al4
Poly   Cplate (aF/µm2)    88
       Cfringe (aF/µm)    54
Al1    Cplate (aF/µm2)    30     41      57
       Cfringe (aF/µm)    40     47      54
Al2    Cplate (aF/µm2)    13     15      17    36
       Cfringe (aF/µm)    25     27      29    45
Al3    Cplate (aF/µm2)    8.9    9.4     10    15   41
       Cfringe (aF/µm)    18     19      20    27   49
Al4    Cplate (aF/µm2)    6.5    6.8     7     8.9  15   35
       Cfringe (aF/µm)    14     15      15    18   27   45
Al5    Cplate (aF/µm2)    5.2    5.4     5.4   6.6  9.1  14   38
       Cfringe (aF/µm)    12     12      12    14   19   27   52
10: PerformanceIntegrated Electronic Systems Lab 275
Wire Resistance
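[Figure: wire of length L, width W and thickness H; two square wire segments of different size have the same resistance, R1 ≡ R2]
$$R = \rho \frac{L}{H W} = R_{\square} \frac{L}{W}$$
$R_{\square} = \rho / H$ - sheet resistance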
10: PerformanceIntegrated Electronic Systems Lab 276
Dealing With Resistance
Polycide gate MOSFET
Silicides: WSi2, TiSi2, PtSi2, TaSi
Conductivity: 8-10 times better than Poly
• Selective technology scaling
• Use better interconnect materials (silicides, bypasses)
• More interconnect layers (reduce average wire length)
10: PerformanceIntegrated Electronic Systems Lab 277
Other Resistive Effects
(1) Contact resistance
• Extra resistance added by the transition between routing layers
• Can be reduced by making the contact holes larger
• Current crowding puts an upper limit on the useful size of the contact
(2) Skin effect
• High-frequency (GHz) currents tend to flow on the surface of a conductor
• Resistance becomes frequency-dependent (it increases as the frequency increases)
• Affects only wider wires
(3) Electromigration
• Limits the DC current to 1 mA/µm
10: PerformanceIntegrated Electronic Systems Lab 278
Wire inductance
At switching frequencies in the GHz range the wire inductance must be considered
A changing current passing through an inductor generates a voltage drop:
$$\Delta v = L \frac{di}{dt}$$
On-chip inductance effects are:
• reflection of signals due to impedance mismatch
• inductive coupling between lines
• ringing effects
• switching noise due to Ldi/dt voltage drops
It is possible to compute the wire inductance directly from its geometry and its environment
A simpler approximation is given by the following relation:
cl = εµ
where c is capacitance per unit length, l inductance per unit length, ε electric permittivity and µ magnetic permeability of the surrounding dielectric
Ex.: in a 0.25 µm technology, a 0.4 µm wide Al wire routed on top of the field oxide (SiO2) has
c = 92 aF/µm, l = 0.47 pH/µm
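The example can be checked numerically from cl = εµ. The sketch below assumes a relative permittivity of 3.9 for SiO2 and a relative permeability of 1, neither of which is stated on the slide; the function name is illustrative only.

```python
import math

EPS0 = 8.854e-12           # vacuum permittivity in F/m
MU0 = 4 * math.pi * 1e-7   # vacuum permeability in H/m


def inductance_per_length(c_per_m, eps_r, mu_r=1.0):
    """Return l in H/m from the relation c * l = eps * mu."""
    return (eps_r * EPS0) * (mu_r * MU0) / c_per_m


# 92 aF/um converted to F/m; eps_r = 3.9 for SiO2 is an assumption
c_wire = 92e-18 / 1e-6
l_wire = inductance_per_length(c_wire, eps_r=3.9)

# Convert H/m to pH/um for comparison with the slide value
l_pH_per_um = l_wire * 1e12 * 1e-6
print(f"l = {l_pH_per_um:.2f} pH/um")
```

Under these assumptions the result agrees with the quoted 0.47 pH/µm.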
10: PerformanceIntegrated Electronic Systems Lab 279
Example: Intel 0.25 micron Process
10: PerformanceIntegrated Electronic Systems Lab 280
The Lumped C Model
Conditions:
• resistive component of the wire is small
• consider only the capacitive component
• switching frequencies are in medium range
The wire still represents an equipotential region and does not introduce any delay
The distributed capacitance is lumped into a single capacitor
The only impact on performance:
• loading effect of Clumped on the driving gate
10: PerformanceIntegrated Electronic Systems Lab 281
The Lumped RC Model
Metal wires a few mm long have a significant resistance, and the equipotential assumption is no longer adequate!
New model:
• Lumps the total resistance of the wire into a single resistor R
• Combines the global capacitance of the wire into a single capacitor C
The estimated wire delay: τ = RC
This model is pessimistic and inaccurate for long interconnect wires!
10: PerformanceIntegrated Electronic Systems Lab 282
The Elmore Delay
Consider the following RC-tree network:
• the network has a single input node (s)
• all capacitors are between a node and the ground
• the network does not contain any resistive loops
The shared path resistance Rik is the resistance shared among the paths from the source node s to the nodes k and i:
$$R_{ik} = \sum R_j, \quad \text{where } R_j \in \left[\mathrm{path}(s \to i) \cap \mathrm{path}(s \to k)\right]$$
Ex.: Ri4 = R1 + R3; Ri2 = R1
Assume that each node of the network is initially discharged and a step input is applied at t=0
The Elmore delay at node i, for a network with N nodes, is given by:
$$\tau_{Di} = \sum_{k=1}^{N} C_k R_{ik}$$
Ex.: τDi = R1C1 + R1C2 + (R1 + R3)C3 + (R1 + R3)C4 + (R1 + R3 + Ri)Ci
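The Elmore sum can be evaluated mechanically once the tree is written down. The following is a minimal sketch, assuming the tree topology implied by the example (R1 from s to node 1, R2 and R3 branching from node 1, R4 and Ri branching from node 3); the dictionary layout and the component values are hypothetical.

```python
def path_to_source(node, parent):
    """Return the set of nodes on the path from the source to `node` (inclusive)."""
    path = set()
    while node is not None:
        path.add(node)
        node = parent[node]
    return path


def elmore_delay(target, parent, R, C):
    """Elmore delay at `target`: sum over all nodes k of C[k] * R_ik.

    R[n] is the resistor between node n and its parent; R_ik is the sum of
    the resistors shared by the paths source->target and source->k.
    """
    target_path = path_to_source(target, parent)
    delay = 0.0
    for k in C:
        shared = target_path & path_to_source(k, parent)
        delay += C[k] * sum(R[n] for n in shared)
    return delay


# Assumed topology; parent None means "connected to the source s"
parent = {1: None, 2: 1, 3: 1, 4: 3, "i": 3}
R = {1: 1, 2: 2, 3: 3, 4: 4, "i": 5}   # hypothetical values, arbitrary units
C = {1: 1, 2: 1, 3: 1, 4: 1, "i": 1}   # hypothetical values, arbitrary units

tau_i = elmore_delay("i", parent, R, C)
# Same result written out as R1C1 + R1C2 + (R1+R3)C3 + (R1+R3)C4 + (R1+R3+Ri)Ci
print(tau_i)
```

With all capacitors set to 1, the result is simply the sum of the shared path resistances 1 + 1 + 4 + 4 + 9.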
10: PerformanceIntegrated Electronic Systems Lab 283
The RC Chain Model
RC chain - a special case of the RC-tree network:
$$\tau_{DN} = \sum_{i=1}^{N} C_i \sum_{j=1}^{i} R_j = \sum_{i=1}^{N} C_i R_{ii}$$
Ex.: τDi = C1R1 + C2(R1 + R2) + ... + Ci(R1 + ... + Ri)
Assume that a wire of length L is modeled by N equal-length segments, each having Ri = rL/N, and Ci = cL/N (r, c are resistance and capacitance per unit length)
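$$\tau_{DN} = \left(\frac{L}{N}\right)^2 (rc + 2rc + \dots + Nrc) = rcL^2 \frac{N(N+1)}{2N^2} = RC \frac{N+1}{2N}$$
with R = rL and C = cL the total resistance and capacitance of the wire
For N large, the RC chain model approaches the distributed RC line model:
$$\tau_{DN} = \frac{RC}{2} = \frac{rcL^2}{2}$$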
(1) The delay of a wire is a quadratic function of its length
(2) The delay of the RC chain model is 1/2 of the delay predicted by the lumped RC model!
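[Figure: RC chain from Vin to VN with nodes 1, 2, ..., i-1, i, ..., N, series resistors R1, R2, ..., Ri-1, Ri and grounded capacitors C1, C2, ..., Ci-1, Ci]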
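The convergence of the N-segment chain towards RC/2 can be seen numerically. This is a small sketch using normalized values (r = c = L = 1); the function name is illustrative.

```python
def rc_chain_delay(r, c, length, n):
    """Elmore delay at the end of a wire modeled by n equal RC segments."""
    seg_r = r * length / n
    seg_c = c * length / n
    # Capacitor i sees the resistance of the first i segments
    return sum(seg_c * (i * seg_r) for i in range(1, n + 1))


for n in (1, 2, 10, 100, 1000):
    # Normalized wire: total R = 1 and total C = 1, so the lumped delay is 1
    print(n, rc_chain_delay(1.0, 1.0, 1.0, n))
```

For n = 1 the function returns the lumped RC value, and for growing n it follows RC(N+1)/(2N) towards one half of it.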
10: PerformanceIntegrated Electronic Systems Lab 284
The Distributed RC Line Model (1)
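The voltage at node i is given by the following partial differential equation:
$$c \Delta L \frac{\partial V_i}{\partial t} = \frac{(V_{i+1} - V_i) - (V_i - V_{i-1})}{r \Delta L}$$
For ∆L -> 0, we obtain the diffusion equation:
$$rc \frac{\partial V}{\partial t} = \frac{\partial^2 V}{\partial x^2}$$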
The diffusion equation is difficult to use for circuit analysis
However, the distributed RC line can be approximated by a lumped RC chain network, and:
$$\tau(V_{out}) = \frac{rcL^2}{2}$$
V - the voltage at a particular point in the wire
x - the distance between this point and the signal source
L - total length of the wire
r - resistance per unit length
c - capacitance per unit length
10: PerformanceIntegrated Electronic Systems Lab 285
The Distributed RC Line Model (2)
• The step input waveform diffuses from the start to the end of the wire
• The waveform rapidly degrades: delay for long wires
Voltage range Lumped RC network Distributed RC network
0 → 50%(tp) 0.69RC 0.38RC
0 → 63%(τ) RC 0.5RC
10% → 90%(tr) 2.2RC 0.9RC
0 → 90% 2.3RC RC
Step response of lumped and distributed RC networks: points of interests
10: PerformanceIntegrated Electronic Systems Lab 286
Transmission Lines
When the inductance of the wire dominates the delay behavior - transmission line effects!
Model: a distributed RLC wire
Signals propagate as a wave, alternately transferring energy between the electric and the magnetic field
The wave propagation equation:
$$\frac{\partial^2 v}{\partial x^2} = rc \frac{\partial v}{\partial t} + lc \frac{\partial^2 v}{\partial t^2}$$
r, c, l - resistance, capacitance and inductance per unit length
g ~ 0 - the leakage conductance
The ideal wave propagation equation (for lossless transmission line, r=0) :
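$$\frac{\partial^2 v}{\partial x^2} = lc \frac{\partial^2 v}{\partial t^2} = \frac{1}{\nu^2} \frac{\partial^2 v}{\partial t^2}$$
$\nu = 1/\sqrt{lc}$ - propagation speed along the line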
10: PerformanceIntegrated Electronic Systems Lab 287
Lossless Transmission Lines Parameters (1)
$$\nu = \frac{1}{\sqrt{lc}} = \frac{1}{\sqrt{\varepsilon\mu}} = \frac{c_0}{\sqrt{\varepsilon_r \mu_r}}$$
Dielectric constant and wave-propagation speed for various materials used in IC technology
c0 - speed of light in vacuum
ε - electric permittivity of insulator
µ - magnetic permeability of insulator
εr - relative permittivity with respect to vacuum
µr - relative permeability with respect to vacuum
Propagation speed: only a function of surrounding medium
tflight = L/v - the time it takes for the wave to propagate from one to the other end of the wire
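As a numeric illustration of the propagation speed and the time of flight, the sketch below assumes SiO2 with a relative permittivity of 3.9 and a relative permeability of 1, plus a 1 cm wire; these values are assumptions and are not taken from the slide.

```python
import math

C0 = 299_792_458.0  # speed of light in vacuum, m/s


def propagation_speed(eps_r, mu_r=1.0):
    """Wave propagation speed in a medium: c0 / sqrt(eps_r * mu_r)."""
    return C0 / math.sqrt(eps_r * mu_r)


def time_of_flight(length_m, eps_r, mu_r=1.0):
    """Time for the wave to travel from one end of the wire to the other."""
    return length_m / propagation_speed(eps_r, mu_r)


v_sio2 = propagation_speed(3.9)      # assumed eps_r of SiO2
t_1cm = time_of_flight(0.01, 3.9)    # assumed wire length of 1 cm

print(f"v = {v_sio2 / 1e7:.1f} cm/ns, t_flight = {t_1cm * 1e12:.0f} ps")
```

The speed depends only on the dielectric, so a wire of the same length in a lower-permittivity insulator has a proportionally shorter time of flight.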
10: PerformanceIntegrated Electronic Systems Lab 288
Lossless Transmission Lines Parameters (2)
Characteristic impedance: impedance presented by wire
$$Z_0 = \sqrt{\frac{l}{c}} = l\nu = \frac{1}{c\nu}$$
100 to 500 Ω for typical wires
The behavior of the transmission line is influenced by the termination of the line
The termination determines how much of the wave is reflected upon arrival at the wire end
$$\rho = \frac{V_{refl}}{V_{inc}} = \frac{I_{refl}}{I_{inc}} = \frac{R - Z_0}{R + Z_0}$$
ρ - Reflection coefficient
R - the termination resistance
R = Z0 ⇒ ρ = 0
R = ∞ ⇒ ρ = 1
R = 0 ⇒ ρ = -1
10: PerformanceIntegrated Electronic Systems Lab 289
Transmission Lines with Terminating Impedances Zs and ZL
Consider the case: ZL = ∞, ρ = 1
Zs
ZL
Z0 VDestVSource
VSource = (Z0/(Z0+Zs))Vin
Vin
ρs = (Zs-Z0)/(Zs+Z0)
10: PerformanceIntegrated Electronic Systems Lab 290
Lattice Diagram
Vin = 5 V, RS = 5 Z0, RL = ∞
ρS = (Zs - Z0)/(Zs + Z0) = 0.66, ρD = 1
t = 0 ... tflight
$V_1^S = \frac{Z_0}{Z_0 + Z_s} V_{in} = 0.83\,\mathrm{V}$
$V_1^D = V_1^S + V_{r,1}^D$; $V_{r,1}^D = \rho_D V_1^S = 0.83\,\mathrm{V}$
$V_1^D = 0.83\,\mathrm{V} + 0.83\,\mathrm{V} = 1.66\,\mathrm{V}$
t = tflight ... 2tflight
$V_2^S = V_1^S + V_{r,1}^D + V_{r,1}^S$; $V_{r,1}^S = \rho_S V_{r,1}^D = 0.55\,\mathrm{V}$
$V_2^S = 2.22\,\mathrm{V}$
$V_2^D = V_1^D + V_{r,1}^S + V_{r,2}^D$; $V_{r,2}^D = \rho_D V_{r,1}^S = 0.55\,\mathrm{V}$
$V_2^D = 2.77\,\mathrm{V}$
....
Conclusion: in order to avoid ringing or slow propagation delay, the transmission line should be terminated both at the source (series termination) and at the destination (parallel termination) with a resistance equal to Z0
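The bookkeeping of the lattice diagram can be reproduced with a few lines of code. The sketch below tracks one wave bouncing between source and destination for the case above (Vin = 5 V, RS = 5 Z0, open end); the function and variable names are illustrative.

```python
def lattice_voltages(v_in, z0, r_s, rho_d, n_round_trips):
    """Return the successive voltage levels at the source and at the destination."""
    rho_s = (r_s - z0) / (r_s + z0)
    wave = v_in * z0 / (z0 + r_s)      # first wave launched into the line
    v_source, v_dest = wave, 0.0
    source_steps, dest_steps = [v_source], []
    for _ in range(n_round_trips):
        # Wave arrives at the destination and is reflected with rho_d
        reflected = rho_d * wave
        v_dest += wave + reflected
        dest_steps.append(v_dest)
        # Reflected wave arrives back at the source and is reflected with rho_s
        wave = rho_s * reflected
        v_source += reflected + wave
        source_steps.append(v_source)
    return source_steps, dest_steps


# Z0 normalized to 1, so RS = 5 Z0 becomes r_s = 5; open end gives rho_d = 1
src, dst = lattice_voltages(v_in=5.0, z0=1.0, r_s=5.0, rho_d=1.0, n_round_trips=2)
print([round(v, 2) for v in src])
print([round(v, 2) for v in dst])
```

The first entries of the two lists correspond to V1S, V2S and V1D, V2D of the example (the slide truncates 2.78 V to 2.77 V). With more round trips both lists approach Vin, which shows the slow staircase response of a line that is not matched at the source.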
10: PerformanceIntegrated Electronic Systems Lab 291
Figures of Merit for RLC Interconnect
Criteria:
• Distributed versus lumped model: a distributed model is required when the rise (fall) time of the input signal, tr, is smaller than twice the propagation delay (time of flight) through a wire of length lw. (Otherwise, a lumped model suffices.)
$$t_r < 2 t_{flight} = 2 l_w \sqrt{lc} \iff l_w > \frac{t_r}{2\sqrt{lc}}$$
• Consideration of inductance required: the wire resistance R (or the damping factor ξ) must not be too large, otherwise a distributed RC model is sufficient
$$R = r l_w < 2 Z_0 = 2\sqrt{\frac{l}{c}} \iff l_w < \frac{2}{r}\sqrt{\frac{l}{c}} \quad \text{or} \quad \xi = \frac{r l_w}{2}\sqrt{\frac{c}{l}} < 1$$
• In conclusion: distributed RLC model required if
$$\frac{t_r}{2\sqrt{lc}} < l_w < \frac{2}{r}\sqrt{\frac{l}{c}}$$
[Figure: log-log plot with the axes length (cm, 0.01 to 10) and transition time (ns, 0.01 to 10) of the line driver / input signal. The region where inductance is important ("With Induct.") is bounded by region 1, large input rise time, $l_w < t_r / (2\sqrt{lc})$, and region 2, high attenuation, $l_w > (2/r)\sqrt{l/c}$ ("No Induct.")]
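The two length bounds are easy to evaluate for a given wire. The sketch below reuses c = 92 aF/µm and l = 0.47 pH/µm from the earlier inductance example; the resistance per unit length (0.1875 Ω/µm) and the rise time (50 ps) are assumed values chosen only for illustration.

```python
import math


def rlc_length_window(t_r, r, l, c):
    """Return (lower, upper) wire-length bounds in metres.

    A distributed RLC model is needed only for lower < length < upper.
    r, l, c are per-unit-length values in Ohm/m, H/m and F/m.
    """
    lower = t_r / (2.0 * math.sqrt(l * c))   # rise-time criterion
    upper = (2.0 / r) * math.sqrt(l / c)     # attenuation criterion
    return lower, upper


c_wire = 92e-12     # F/m  (92 aF/um)
l_wire = 0.47e-6    # H/m  (0.47 pH/um)
r_wire = 0.1875e6   # Ohm/m, assumed
t_rise = 50e-12     # s, assumed

lower, upper = rlc_length_window(t_rise, r_wire, l_wire, c_wire)
window_exists = lower < upper
print(f"lower = {lower * 1e3:.2f} mm, upper = {upper * 1e3:.2f} mm, "
      f"RLC window exists: {window_exists}")
```

For these assumed numbers the lower bound exceeds the upper bound, so there is no length for which inductance matters: the thin wire is too resistive, and a distributed RC model would suffice. A wider (less resistive) wire or a faster edge opens the window.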
10: PerformanceIntegrated Electronic Systems Lab 292
Scaling (1)
VLSI integration depends on the smallest-size feature permitted by the technology
The size of the transistors has to be as small as possible!
The internal operating physics of the down-scaled MOS transistor changes
First order scaling theory:
• Estimates the improvements that can be expected as technology is scaled
• Scaled MOS device is obtained by applying a dimensionless scaling factor α to:
• all dimensions (L, W, junction depth, oxide thickness, etc.)
• device voltages
• impurities concentration densities
• The characteristics of the scaled MOS device are similar to that of the original one
• A number of parameters such as voltage drop, line propagation delay, current density, contact resistance exhibit significant degradation with scaling!
10: PerformanceIntegrated Electronic Systems Lab 293
Scaling (2)
Influence of first-order scaling on MOS device (scaling factor α > 1)
Parameter                                 Scaling Factor
Device parameters:
Length; L                                 1/α
Width; W                                  1/α
Gate oxide thickness; tox                 1/α
Junction depth; Xj                        1/α
Substrate doping; Na or Nd                α
Supply voltage; VDD                       1/α
Electric field across gate oxide; E       1
Depletion layer thickness; d              1/α
Resultant influence:
Parasitic capacitance; WL/tox             1/α
Gate delay; VC/I                          1/α
DC power dissipation; Ps                  1/α2
Dynamic power dissipation; Pd             1/α2
Power delay product                       1/α3
Gate area                                 1/α2
Power density; VI/A                       1
Current density; I/A                      α
Transconductance; gm                      1
10: PerformanceIntegrated Electronic Systems Lab 294
Scaling (3)
Interconnect layer scaling
Parameter Scaling Factor
Conductor line width; W 1/α
Conductor line length; L 1/α
Conductor line thickness; t 1/α
Line cross-section; A 1/α2
Line resistance; r α
Line response time; rc 1
Normalized line response time α
Line voltage drop; Vd 1
Normalized line voltage drop α
Current density; J α
Normalized contact voltage drop; Vc/V α2
The scaled line resistance is:
$$r' = \frac{\rho\,(L/\alpha)}{(W/\alpha)(t/\alpha)} = \alpha r$$
The voltage drop along the scaled line is:
$$V'_d = (I/\alpha)(\alpha r) = I r = const$$
The scaled line response time is:
$$\tau'_s = (\alpha r)(C/\alpha) = rC = const$$
For a constant chip size many of the signal paths do not scale down! Therefore, for a line of the same length:
• Voltage drops along the lines are larger by a factor of α than the scaled line voltage drop
• The line response time is larger by a factor of α than the scaled line response (see table)
Problems: distribution and organization of clocking signals, electromigration, the increase of the wire capacitance (affects the gate delay)
10: PerformanceIntegrated Electronic Systems Lab 295
Power Distribution
Process with 1 Level of metal :
• VDD and ground (VSS) are routed in interdigitated trees
• Crossunders are very difficult (low resistance interconnect)
Power distribution is much easier for technologies with 2 (or more) levels of metal
Cautions:
• Parts of the chip that are likely to switch simultaneously are routed separately!
• Separate power pins might be used for the output driver!
10: PerformanceIntegrated Electronic Systems Lab 296
Clock and Timing Circles (1)
The clock
• synchronizes machine operations and data transfer
• global control technique that provides the “glue” for system operation
System level timing can be described using circular timing charts
Ideal pseudo 2-phase clocking chart:
• φ1(t)φ2(t) = 0, ∀t
• φ1=1 during first half-period
• φ2=1 during the last half-period
• time increases in a counter-clockwise direction
• one full rotation corresponds to a clock period T
10: PerformanceIntegrated Electronic Systems Lab 297
Clock and Timing Circles (2)
Overlapping pseudo 2-phase clocking chart:
• φ1(t)φ2(t) = 0, except during the transition times
• mutually-exclusive clock periods provide timing intervals for logical operations
• overlapped segments must be avoided
• transition times can be made small by proper clock generator design
Clock skew is represented by rotating one of the clocks!
• φ1(t)φ2(t) = 1 defines the skew time, ts
• ts indicates the possibility of unwanted simultaneous bit transfer
• skew is caused by the clock driving circuit or by the distribution arrangement
10: PerformanceIntegrated Electronic Systems Lab 298
Clock Generation Circuits (1)
2-phase clock generator with transmission gate delay
• Mp1, Mn1 inverter acts as the first driver for the chain
• Transmission gate (TG) is used as delay element to minimize clock skew
• TG is modeled as an equivalent resistance RTG and introduces a delay tD = RTGCin
• tP - the propagation delay through an inverter
• Choosing tD ~ tP the delay between the two branches is the same
• Thus clocking skew can be controlled by adjusting the size of the TG transistors (β)
$$R_{TG} = \frac{1}{\beta_n (V_{DD} - V_{Tn}) + \beta_p (V_{DD} - |V_{Tp}|)}$$
10: PerformanceIntegrated Electronic Systems Lab 299
Clock Generation Circuits (2)
2-phase clock generator with RS latch
To ensure proper operation of the circuit two items should be checked:
• tP through the inverter must be small compared to the clock period (CLK has time to enter the latch)
• the output capacitance in both branches should be equal for equal switching delays; but capacitances are sensitive to the layout and interconnect geometry!
10: PerformanceIntegrated Electronic Systems Lab 300
Clock Drivers and Distribution Techniques (1)
The clock driver must be able to handle large capacitive loads at the required clock frequency
Clock skew originates mostly from:
• unbalanced loads at the driver
• unequal distribution line delays (RC) - see figure
Distribution network approaches:
• cascaded chain of inverting buffers that matches the clock generator to the distribution line
• balanced tree network with multiple fanouts
• symmetrical geometries (like H-tree) for the clock distribution lines
10: PerformanceIntegrated Electronic Systems Lab 301
Clock Drivers and Distribution Techniques (2)
Balanced tree network with multiple fanouts:
• identical drivers can be used within a given stage
• the drive requirements of the output circuits are reduced from the single inverter design since the fanout has been split into groups
H-tree network:
• each clock distribution point O is at the same distance from the driver D, giving equal delay times
10: PerformanceIntegrated Electronic Systems Lab 302
Input Protection Circuits (1)
Excessive electrical charge on the gate of the MOS transistor can destroy the device!
Protection circuits drain this excessive charge and avoid static burnout!
$$C_g = C_{ox} W L, \qquad E_{ox} \approx \frac{V_G}{x_{ox}}, \qquad E_{BD} \sim 7.5 \cdot 10^{6}\ \mathrm{V/cm}$$
If Eox>EBD, the oxide insulating properties break down and charge is transported through the material - destruction of the device!
The max gate voltage VGmax is a relatively small number
Static electricity during handling could easily reach a few kV
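$$V_{Gmax} \cong E_{BD} \cdot x_{ox} = 7.5 \cdot 10^{6}\ \mathrm{V/cm} \cdot 35 \cdot 10^{-7}\ \mathrm{cm} = 26.25\ \mathrm{V} \quad (x_{ox} = 35\ \mathrm{nm})$$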
Protection circuits allow for alternate charge flow paths when the input voltage is too large
Diode structures are very useful in this application because:
• have relatively low breakdown voltages which can be controlled
• reverse breakdown in a pn junction is non-destructive
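The maximum gate voltage follows directly from the breakdown field and the oxide thickness. The short sketch below repeats that arithmetic with an explicit unit conversion (nm to cm); the function name is illustrative, and the 10 nm case is an assumed value added for comparison.

```python
def max_gate_voltage(e_bd_v_per_cm, x_ox_nm):
    """Maximum gate voltage before oxide breakdown: V = E_BD * x_ox."""
    x_ox_cm = x_ox_nm * 1e-7   # 1 nm = 1e-7 cm
    return e_bd_v_per_cm * x_ox_cm


v_gmax = max_gate_voltage(7.5e6, 35.0)        # values used on the slide
v_gmax_thin = max_gate_voltage(7.5e6, 10.0)   # assumed thinner oxide

print(f"VGmax = {v_gmax:.2f} V for 35 nm, {v_gmax_thin:.2f} V for 10 nm")
```

Either value is far below the kV range of static discharge during handling, which is why the protection circuits are needed.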
10: PerformanceIntegrated Electronic Systems Lab 303
Input Protection Circuits (2)
Diode input protection circuit:
• D1...4 are reverse biased
• R reduces the voltage that reaches D3, D4 and increases the level of protection
• D1, D2 and D3, D4 undergo breakdown for positive or negative voltage sources
Thick oxide MOSFET protection circuit:
• the transistor has the threshold voltage > VDD
and is in cutoff during normal operation
• If Vin > VT,f the transistor conducts providing a path to ground to drain off the excessive charge
Input protection circuits introduce parasitic RC time constants into the network!
10: PerformanceIntegrated Electronic Systems Lab 304
Static Gate Sizing (1)
Problem - determine the values of Sj for j = 2,... which minimizes the total propagation delay through the inverter chain
• Sj - sizing factor, S1 = 1; Sj >1 for j>1
• βj - conduction factor, β1=k’(W/L)1; βj=Sjβ1
• Cw - wiring contribution of gate 1
• Ci, Co - in/out capacitances of gate 1
• Co,j = SjCo - output capacitance from gate j
• Ci,j = SjCi - input capacitance to gate j
• Cw,j = SjCw - wiring capacitance of gate j
The time delay through gate j, tD,j, is:
$$t_{D,j} = \frac{R}{S_j}\left(C_{o,j} + C_{i,j+1} + C_{w,j+1}\right) = \frac{R}{S_j}\left[S_j C_o + S_{j+1}(C_i + C_w)\right]$$
10: PerformanceIntegrated Electronic Systems Lab 305
Static Gate Sizing (2)
Suppose that there are N stages in the chain; the total time delay is given by:
$$T_D = R \sum_{j=1}^{N} \frac{S_j C_o + S_{j+1}(C_i + C_w)}{S_j}$$
To minimize TD we differentiate with respect to Sj and look for zero slope points:
$$\frac{\partial T_D}{\partial S_j} = 0$$
This results in the recursion relation:
$$\frac{S_{j+1}}{S_j} = \frac{S_j}{S_{j-1}} \quad \text{for } j = 2, 3, \dots, N$$
If this is to hold for arbitrary values of j, then:
$$\frac{S_{j+1}}{S_j} = K = const$$
The boundary conditions of the problem are: S1 = 1, SN+1 = CL/Ci
Forming the product:
$$\frac{S_2}{S_1} \cdot \frac{S_3}{S_2} \cdot \frac{S_4}{S_3} \cdots \frac{S_{N+1}}{S_N} = K^N = \frac{C_L}{C_i}$$
We obtain the scaling ratio in the form:
$$K = \left(\frac{C_L}{C_i}\right)^{1/N}$$
10: PerformanceIntegrated Electronic Systems Lab 306
Static Gate Sizing (3)
Explicitly, the scaling factors are given by:
$$S_1 = 1,\ S_2 = K,\ S_3 = K^2,\ \dots,\ S_N = K^{N-1}$$
The minimum delay is then:
$$T_{D,min} = \sum_{j=1}^{N} R\left[C_o + K(C_i + C_w)\right] = N R\left[C_o + K(C_i + C_w)\right]$$
The number of stages that optimizes the delay is obtained by differentiating TD (replacing K with its N-dependent equation) with respect to N and setting the result to 0:
$$R C_o + R (C_i + C_w)\left(\frac{C_L}{C_i}\right)^{1/N}\left[1 - \frac{\ln(C_L/C_i)}{N}\right] = 0$$
If Co is small:
$$N = \ln\left(\frac{C_L}{C_i}\right)$$
N is chosen as the nearest integer for given values of Ci and CL
The equation K = Sj+1/Sj says that the minimum delay occurs when every stage has the same individual delay time tD
$$K^N = \frac{C_L}{C_i} \iff N \ln K = \ln\left(\frac{C_L}{C_i}\right)$$
$$\text{with } N = \ln\left(\frac{C_L}{C_i}\right) \Rightarrow N \ln K = N \iff \ln K = 1 \iff K = e^1 = e$$
The optimum scaling ratio equals e!
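The sizing result can be turned into a small calculation. This sketch uses the simplified case in which Co is small, so N = ln(CL/Ci) rounded to the nearest integer and K = (CL/Ci)^(1/N); the load and input capacitances are hypothetical values.

```python
import math


def size_inverter_chain(c_load, c_in):
    """Return (N, K, sizes) for an inverter chain driving c_load.

    N is ln(c_load / c_in) rounded to the nearest integer (at least 1),
    K is the stage-to-stage scaling ratio, sizes are S_1 ... S_N.
    """
    ratio = c_load / c_in
    n = max(1, round(math.log(ratio)))
    k = ratio ** (1.0 / n)
    sizes = [k ** j for j in range(n)]   # S_1 = 1, S_2 = K, ..., S_N = K^(N-1)
    return n, k, sizes


# Hypothetical example: 50 fF input capacitance driving a 50 pF load
n_stages, k_ratio, sizes = size_inverter_chain(c_load=50e-12, c_in=50e-15)
print(n_stages, round(k_ratio, 3))
print([round(s, 1) for s in sizes])
```

Because N must be an integer, K comes out close to e rather than exactly e; the boundary condition S(N+1) = CL/Ci is still met exactly.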
10: PerformanceIntegrated Electronic Systems Lab 307
Off-Chip Driver Circuits
Off-chip driver circuits are critical to the overall chip design
Some important problems must be addressed:
• efficient buffer circuitry between internal and off-chip drivers
• minimization of transmission line effects
• fast switching
• static charge protection
• interface specific items, such as CMOS-TTL level converter, etc.
An inverter circuit can be used as a basic off-chip driver
Performance factors are :
• the transient switching times tLH and tHL
• transmission line effects
10: PerformanceIntegrated Electronic Systems Lab 308
Double-Inverter Off-Chip Driver Circuit
The simplest off-chip driver circuit: an inverter chain designed to handle a large capacitive load
The sizes of Mn2 and Mp2 can be estimated using the high-to-low time constant τn and the low-to-high time constant τp:
$$\left(\frac{W}{L}\right)_{n2} = \frac{C_{out}}{k'_n \tau_n (V_{DD} - V_{Tn})}, \qquad \left(\frac{W}{L}\right)_{p2} = \frac{C_{out}}{k'_p \tau_p (V_{DD} - |V_{Tp}|)}$$
The actual values of the fall and rise time can be estimated from:
$$t_{HL} = \tau_n \left[\frac{2 V_{Tn}}{V_{DD} - V_{Tn}} + \ln\left(\frac{2 (V_{DD} - V_{Tn})}{V_0} - 1\right)\right]$$
$$t_{LH} = \tau_p \left[\frac{2 |V_{Tp}|}{V_{DD} - |V_{Tp}|} + \ln\left(\frac{2 (V_{DD} - |V_{Tp}|)}{V_0} - 1\right)\right]$$
where V0 is the 10% voltage point
Cout is large ⇒ Mn2 and Mp2 are large! ⇒ obtained using parallel connected transistors to aid in layout and parasitic control
Mn1 and Mp1 can be sized using the previously presented sizing theory
10: PerformanceIntegrated Electronic Systems Lab 309
Example
Consider a process characterized by the nominal values:
k’n = 55[µA/V2] VT0n = 0.9[V]
k’p = 25[µA/V2] VT0p = -0.75[V]
and VDD = 5[V]
The requirements for off-chip driver circuits are tLH = tHL = 20[ns] with a maximum load of Cout = 50[pF]
Using the previous equations we can compute the time constants
τn = 6.45[ns]
τp = 6.58[ns]
the aspect ratios are:
$$\left(\frac{W}{L}\right)_{n2} \cong 35, \qquad \left(\frac{W}{L}\right)_{p2} \cong 72$$
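The aspect ratios can be recomputed from the stated time constants. The sketch below reads the sizing relation as W/L = Cout / (k' τ (VDD - |VT|)), which is a reconstruction of the slide formula, and takes τn and τp as given rather than deriving them; the function name is illustrative.

```python
def driver_aspect_ratio(c_out, k_prime, tau, v_dd, v_t_abs):
    """W/L of an output transistor: C_out / (k' * tau * (V_DD - |V_T|))."""
    return c_out / (k_prime * tau * (v_dd - v_t_abs))


C_OUT = 50e-12   # F, maximum load from the example
V_DD = 5.0       # V

# NMOS: k'n = 55 uA/V^2, VT0n = 0.9 V, tau_n = 6.45 ns
wl_n = driver_aspect_ratio(C_OUT, 55e-6, 6.45e-9, V_DD, 0.9)
# PMOS: k'p = 25 uA/V^2, |VT0p| = 0.75 V, tau_p = 6.58 ns
wl_p = driver_aspect_ratio(C_OUT, 25e-6, 6.58e-9, V_DD, 0.75)

print(f"(W/L)n2 = {wl_n:.1f}, (W/L)p2 = {wl_p:.1f}")
```

The results are about 34.4 and 71.5, in line with the rounded values 35 and 72 of the example. The PMOS device comes out roughly twice as wide because of its lower k'.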
10: PerformanceIntegrated Electronic Systems Lab 310
Tri-State Off-Chip Driver Circuit
The input signal is split so that it controls each output transistor individually
The high-impedance state is obtained by driving both NMOS and PMOS output devices into cutoff
Normal operation:
Z = 1 ⇒ Mp1 and Mp2 off, Mn on
High-impedance state:
Z = 0 ⇒ Mp1 and Mp2 on, Mn off
⇒ Vp = VDD, Vn = 0
⇒ the output transistors are in cutoff
10: PerformanceIntegrated Electronic Systems Lab 311
Bidirectional Off-Chip Driver Circuit
The tri-state section is a non-inverting buffer with an enable control E
E = 0 gives the high-Z state
10: PerformanceIntegrated Electronic Systems Lab 312
Packaging Technology (1)
Package types
1. Bare die
2. Dual-In-line Package (DIP)
3. Pin Grid Array (PGA)
4. Small-outline IC
5. Quad flat pack
6. Plastic Leaded Chip Carrier (PLCC)
7. Leadless carrier
[Figure: photographs of the package types, labeled 1 to 7 as in the list above]
10: PerformanceIntegrated Electronic Systems Lab 313
Packaging Technology (2)
The package has important functions in IC technology:
• provides a means of bringing signal and supply wires in/out of the circuit
• removes the heat generated by the circuit
• protects the die against environmental conditions such as humidity
• provides mechanical support
At the same time, packaging technology has a tremendous impact on performance ⇒ up to 50% of the delay of a high-performance computer is due to packaging delays!
Packages generate parasitic inductance and capacitance:
Package Type Capacitance (pF) Inductance (nH)
68-pin plastic DIP 4 35
68-pin ceramic DIP 7 20
256-PGA 1-5 2-15
Wire bond 0.5-1 1-2
Solder bump 0.1-0.5 0.01-0.1
10: PerformanceIntegrated Electronic Systems Lab 314
Packaging Technology (3)
Example: parasitic effects of the bond-wire inductance
[Figure: inverter (Vin, Vout) with load CL; the supply current i(t) flows through bond-wire inductances L between the external supply VDDext and the internal supply VDDint]
Inductive coupling between external (VDDext) and internal (VDDint) supply voltage (bonding wires)
A transient current is sourced/sunk from/into the supply rails to charge/discharge CL
A changing current passing through an inductor generates a voltage drop:
$$\Delta v = L \frac{di}{dt}$$
∆v - the difference between VDDext and VDDint:
• affects the logic levels
• reduces the noise margin
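To get a feeling for the size of ∆v = L di/dt, the sketch below uses a bond-wire inductance of 2 nH (the upper end of the wire-bond range in the package table) together with an assumed current step of 10 mA in 1 ns per output driver; the current step, the edge time and the number of drivers are hypothetical.

```python
def supply_bounce(l_henry, delta_i, delta_t, n_drivers=1):
    """Voltage drop across the supply inductance: n * L * di/dt.

    Assumes all n drivers switch at the same time and share one supply pin.
    """
    return n_drivers * l_henry * delta_i / delta_t


L_BOND = 2e-9   # H, wire bond (upper end of the 1-2 nH range)

dv_single = supply_bounce(L_BOND, delta_i=10e-3, delta_t=1e-9)
dv_bus = supply_bounce(L_BOND, delta_i=10e-3, delta_t=1e-9, n_drivers=16)

print(f"one driver: {dv_single * 1e3:.0f} mV, 16 drivers: {dv_bus * 1e3:.0f} mV")
```

A single driver causes only a few tens of mV, but a 16-bit bus switching at once on one supply pin reaches several hundred mV, which directly eats into the noise margin.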
10: PerformanceIntegrated Electronic Systems Lab 315
Packaging Technology (4)
Design techniques:
• Separate power pins for I/O pads and chip core
• Multiple power and ground pins
• Careful selection of the position of the power and ground pins on the package
• Adding decoupling capacitance on the board
• Increase the rise and fall times
• Use advanced packaging technologies
[Figure: supply connected to the chip through board wiring and bonding wire, with a decoupling capacitor Cd between the supply rails]
10: PerformanceIntegrated Electronic Systems Lab 316
Packaging Technology (5)
Packaging Technology Requirements:
• Electrical: low parasitics (L, C, R)
• Mechanical: reliable and robust
• Thermal: efficient heat removal
• Economical: inexpensive
Two interconnect levels:
(1) Die-to-Package-Substrate
(2) Package substrate to PCB
10: PerformanceIntegrated Electronic Systems Lab 317
Packaging Technology (6)
1-a: Wire bonding
[Figure: die on a substrate, with bonding wires from the die pads to the lead frame]
• Wires must be attached serially
• Bonding wires have inferior electrical properties (L, C)
• Difficult to predict the exact value of parasitics (irregular)
10: PerformanceIntegrated Electronic Systems Lab 318
Packaging Technology (7)
1-b: Tape-automated bonding (TAB)
[Figure: polymer film with sprocket holes, lead frame and test pads (film + pattern); die attached to the film via solder bumps and mounted on the substrate]
• The die is attached to a metal lead frame that is printed on a polymer film
• The connection between chip pads and polymer film wires is made using solder bumps
• Highly automated process
• Improve electrical performance (L ~ 0.5nH, C~0.3pF)
10: PerformanceIntegrated Electronic Systems Lab 319
Packaging Technology (8)
1-c: Flip-chip mounting
[Figure: die flipped onto the substrate and connected through solder bumps to the interconnect layers]
• Flip the die upside-down and attach it directly to the substrate using solder bumps
• Superior electrical performance
• Pads can be placed at any position on the chip (not only on the die boundary)
• A possible solution for power and clock distribution problems
10: PerformanceIntegrated Electronic Systems Lab 320
Packaging Technology (9)
2-a: Through-hole mounting
• mechanically reliable connections
• limits packaging density
2-b: Surface mounting
• increase package density:
• through holes are eliminated
• the lead pitch is reduced
• both sides of the board can be used
• the on-the-surface connection is weaker
• more expensive equipment needed
• testing on board is more complex
10: PerformanceIntegrated Electronic Systems Lab 321
Packaging Technology (10)
Multi-Chip-Modules (MCM) - Die-to-Board
(avionics processor module - Rabaey96)
Mount the die directly on the substrate
• increase the packaging density
• increase the performance
• reduce power consumption
• expensive technology
10: PerformanceIntegrated Electronic Systems Lab 322
Semiconductor Packaging Process
How to come from wafer to final application ?
10: PerformanceIntegrated Electronic Systems Lab 323
Semiconductor Packaging Process
Finally, the packaging processes on component and application board level make the product working and successful.
10: PerformanceIntegrated Electronic Systems Lab 324
Semiconductor Packaging Process – Pre-Assembly
Advanced Pre-Assembly Process (Dicing before Grinding - DBG)
[Figure: process flow with the steps Half Cut Dicing, Tape Lamination, Back Side Grinding, Stress Relief (Plasma), Wafer Mounting, Grinding Tape Removal; tools and materials: Dicing Blades, Grinding Tape, Grinding Wheels, Gas/Energy, Mounting Tape, Peeling Tape]
Standard Pre-Assembly Process (Grinding before Dicing - GBD)
[Figure: process flow with the steps Tape Lamination, Back Side Grinding, Stress Relief (Plasma/Dry Polish), Wafer Mounting, Grinding Tape Removal, Full Cut Dicing; tools and materials: Grinding Tape, Grinding Wheels, Gas/Energy, Mounting Tape, Peeling Tape, Dicing Blades]
Source: S. Mimietz/QD: Pre-Assembly Process Flow
10: PerformanceIntegrated Electronic Systems Lab 325
Semiconductor Packaging Process
Face-Down Assembly Process (Ball Grid Arrays w/ Bond Channel)
[Figure: process flow with the steps Printing/Taping, Post Print Curing, Die Attach/Lamination, Adhesive Curing, Wire Bonding; tools and materials: Adhesive/Tape, Pick-up Tooling, Temperature/Time, Capillary & Wire]
Face-Up Assembly Process (Ball Grid Arrays w/o Bond Channel)
[Figure: process flow with the steps Adhesive Dispense, Die Attach, Adhesive Curing, Wire Bonding; tools and materials: Adhesive, Dispense & Pick-up Tooling, Temperature/Time, Capillary & Wire]
10: PerformanceIntegrated Electronic Systems Lab 326
Semiconductor Packaging Process
End of Line Process (Ball Grid Arrays)
[Figure: process flow with the steps Plasma Activation, Molding, Post Mold Curing, S/B Attach Reflow, Package Singulation; tools and materials: Gas/Energy, Compound, Temperature/Time, Solder Ball & Flux, Dicing Blades]
End of Line Process (Leaded Packages)
[Figure: process flow with the steps Plasma Activation, Molding, Post Mold Curing, Dedam/Dejunk, Sn-Plating Leads, Trim & Form; tools and materials: Gas/Energy, Compound, Temperature/Time, Plating Bath, Cutting Tool, Forming Tool]
10: PerformanceIntegrated Electronic Systems Lab 327
Packaging Key Enabler
Cost
• Cost per function decreases 25% per year
Form Factor (Package Density)
• Feature size reduction by factor 0.7X linear each node (every 2...3 years)
• Doubling devices/cm² each node (every 2...3 years)
Integration Level
• Moore's law: bits per chip grow by factor of 4x every 3 years
• In future slowing down to 4x every 4...5 years
Speed
• Clock frequency/data rate is increasing (5x growth every 10 years, slowing down to 3x)
Power
• Laptops and cell phones require extended battery life
• Heat dissipation has to be more effective
Functionality
• Logic: Digital CMOS - Analog / Mixed Signal - CMOS RF
• Memory: SRAM - DRAM - eDRAM - EEPROM/Flash - FRAM - MRAM
• Actuators / Sensors: Electro-optical - MEMS - chemical sensors - electro-biological
10: PerformanceIntegrated Electronic Systems Lab 328
Typical Memory Package Types
Basic Packaging Concepts
The actual package concepts in use are:
TSOP (Thin Small Outline Package) – since about 1995
FBGA (Fine Pitch Ball Grid Array) – since about 2003
FLGA (Fine Pitch Land Grid Array) – since about 2005
F2BGA (Fine Pitch Flip Chip Ball Grid Array)
MCP (Multi Chip Package)
10: PerformanceIntegrated Electronic Systems Lab 329
Packaging Key Enabler – Form Factor Dimension
[Figure: package form factors compared by interconnect, size and cost: TSOP (wire bond & lead frame), FBGA (wire bond & substrate), LGA (wire bond & substrate, w/o balls), F2BGA (bump & substrate). Roadmap from standard packages (TSOP, FBGA, LGA, FCiP; 2D package, silicon size) via MCP to MCP/SiP customized solutions (3D package; function, performance)]
Source: H. Hedler/QAG: Current and future packaging challenges
Smaller package sizes allow increased package density on board.
Better electrical package performance supports higher speed.
10: PerformanceIntegrated Electronic Systems Lab 330
Packaging Key Enabler – Form Factor Chip Density
Higher package and/or chip density support increased storage density on module level.
- packages get stacked to better utilize placement area
- substrates get thinner to enable thin packages
- chips get thinner to enable die stacking
- balls get smaller to maintain total package height
- bonding wires get replaced by RDL and vias
Stacked BGA (Folded)
Stacked BGA (PoP)
Stacked TSOP
Stacked Die FBGA
Wafer Level Package
10: PerformanceIntegrated Electronic Systems Lab 331
Typical Memory Package Types - TSOP
1. Thin Small Outline Package (TSOPII)
Package type w/ “Z- leads” on 2 opposite package sides
TSOPII is typically a single die package
SMT compliant
Typical pin count : 54/66
Package height : 1.2 mm
10: PerformanceIntegrated Electronic Systems Lab 332
Typical Memory Package Types - TSOP
Chip face-down assembly
Chip face-up assembly
Principle Package Constructions for TSOPII
10: PerformanceIntegrated Electronic Systems Lab 333
Technical Challenges – TSOP Challenges
TSOP Challenges
One big challenge for TSOP packages is whisker growth related to the Pb-free plating applied for green packages. The whisker growth rate strongly depends on the existing stress level inside the plated layer on the leads. The stress conditions can be affected by the plating technology and by SMT reflow.
10: PerformanceIntegrated Electronic Systems Lab 334
Typical Memory Package Types - FBGA
2. Fine Pitch Ball Grid Array (FBGA)
Package type w/ ball interconnects on bottom side only
The FBGA package concept is flexible and can carry more than 1 chip.
SMT compliant package
Ball count range : 54 – 144
Package height : 0.55/0.80/1.00/1.20/1.40 mm
10: PerformanceIntegrated Electronic Systems Lab 335
Typical Memory Package Types - FBGA
Chip face-down assembly
Chip face-up assembly
Principle Package Constructions for FBGA
10: PerformanceIntegrated Electronic Systems Lab 336
Typical Memory Package Types - FLGA
3. Fine Pitch Land Grid Array (FLGA)
Package type w/o solder spheres on bottom side, which results in a lower total package height
Typically contains a single die in flip chip or wire bond technology, but the FLGA package concept is also flexible enough to carry more than one die.
SMT compliant package
Ball count range : 8 – 300+
Package height : 0.4/0.48/1.0/1.1/1.2/1.3/1.4/1.92/2.2 mm
10: PerformanceIntegrated Electronic Systems Lab 337
Typical Memory Package Types - FLGA
Chip face-up assembly
Principle Package Construction for FLGA
10: PerformanceIntegrated Electronic Systems Lab 338
Typical Memory Package Types - F2BGA
4. Fine Pitch Flip Chip BGA (F2BGA)
F2BGA is a low- or thin-profile plastic BGA that carries a flip chip mounted on a polymer substrate inside, but from the outside looks like an FBGA w/o bond channel
This package contains typically one die in flip chip technology
SMT compliant package
Package height : 1.2/1.4 mm
Package ball count range : 136 - 240
Typical Memory Package Types – F2BGA
Chip face-down assembly
Principle Package Construction for F2BGA
Typical Memory Package Types - MCP
5. Multi Chip Package (MCP)
MCPs are low- or thin-profile plastic TQFP, LQFP or FBGA packages that today contain 2 to 8 stacked functional chips and up to 7 spacers in the same package.
Memory MCPs follow very different package concepts, depending on the individual chip sizes to be packaged and the required position of each die within the chip stack.
SMT compliant package
Ball count range : 54 – 149 (2007)
Package height : 0.8/1.0/1.2/1.3/1.4/1.6 mm
Typical Memory Package Types - MCP
“Chinese Tower” “Chinese Reverse Tower”
“Mixed Die Stack” “Quad Die Stack”
Principle Package Constructions for MCP
Typical Memory Package Types - MCP
… continued MCP
MCPs have gained tremendous importance as memory packages during the last 2 years, since they are the most effective way to combine different functionalities and/or to increase storage density per package footprint.
Mainstream memory packages use stacked chips. The package concepts can generally be structured into:
- Chip stack of same die size (Dual Die or Quad Die Stack)
- Chip stack starting with the largest and finishing with the smallest die (Chinese Tower)
- Chip stack starting with the smallest and finishing with the largest die (Chinese Reverse Tower)
- Mixed die sizes in all stack positions (Mixed Die Stack)
To manufacture MCPs, a broad range of wafer thinning, die attach and wire bond technologies needs to be mastered. Besides the process technologies, the materials used also play a major role for success.
MCP technology is considered a key packaging technology of the near future.
Technical Challenges – MCP Challenges
MCP Challenges
The most crucial task for MCPs is to develop and establish robust processes for thin die stacking and wire bonding.
• Die pick-up capability for 75 µm, 50 µm or less thickness
• Full range of material sets to stack different chips for different stack configurations
• Advanced die attach and wire bond loop capability
Future Technical Challenges – Where are we?
Wins / Features
• Small footprint
• Very high scale integration
• Very high storage density
• High speed and data rate
• Less energy consumption
• New DRAM architecture
Phase 1: Single Die Package
Phase 2: Multi Chip Package
Phase 3: 3D Chip Integration
Future Technical Challenges – New Concepts
Future packaging technology will focus on 3D chip integration, which requires very strong cooperation between Frontend and Backend Development.
Challenges
• DRAM architecture different
• Wire bonds replaced by Si through-hole electrodes
• DRAM design to consider space for micro vias
• Redistribution layer and micro vias to be a Frontend process
• Chip thickness extremely low
• New interconnect technology to be developed
• Balancing of CTE mismatch inside the package to be managed
Multi Chip Package
3D Chip Stack Package
11. CAD & Design Flow
Integrated Electronic Systems Lab 34711: CAD & Design Flow
Motivation: Microelectronics Design Efficiency
Achieving required productivity by system-level design methodologies
[Figure: design efficiency over time (1970 to 2010) compared with Moore's Law. Successive methodologies: Layout Editor, Schematic Entry, Logic and Architectural Synthesis, Platform-based Design, followed by "???"]
Example for Complex Systems: Embedded SoC
Properties
• Potentially consisting of a large number of components
• Specialised to an application domain
• Reactive
• Real-time capability
Design Tasks
• Definition of communication architecture which is adequate to the application‘s structure
• Mapping of the system specification on available implementation components
Constraints
• Costs
• Power consumption
• Latency
• Required flexibility
[Figure: Embedded "System-on-Chip" consisting of microcontroller, DSP, memory, I/O module, ASIC and RF transceiver, connected to sensors and actuators]
Platform-Based System Design: Platform Life-Cycle
[Figure: platform life-cycle. A generic platform (DSP core, CPU core, bus, memory, OS, API, drivers) plus application-specific additions (specific blocks, applications) allows easy implementation of multiple devices with similar basic functions. Experiences and new requirements are fed back into future platform generations.]
Project Management: System Design: V Model
[Figure: V model of system design]
Descending branch (specification):
• Analysis of System Requirements
• Design of System Architecture
• Analysis of HW/SW Component Requirements
• HW/SW Co-Design
• HW and SW Component Implementation
Ascending branch (integration):
• HW/SW Integration
• System Integration
• System Delivery
Intermediate results: System Properties and Constraints, Cost Analysis, Abstract Interfaces, Implemented HW/SW Modules, Prototype Generation and/or Manufacturing, Product, Customer Application
Quality Assurance and Validation link both branches on each level:
• System Level
• HW/SW Component Level
• HW/SW IP Database and Implementation Level
• Product Level
Hardware/Software Co-Design
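[Figure: HW/SW co-design flow. Specification → HW/SW-Partitioning → HW-Specification and SW-Specification, accompanied by Co-Simulation. HW path: Synthesis, Placement/Routing. SW path: Compilation, Real-Time OS. Communication Synthesis connects both paths. Result: Heterogeneous HW-/SW-System.]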
O.k., let's go bottom-up now
Classes of CAD Tools
• Design Entry:
  – Graphical Editor (drawing schematic diagrams, physical layout, stick layout diagrams, ...)
  – Language based circuit capture tools (for hardware description languages like VHDL, Verilog, EDIF)
• Design Validation:
  – Physical design verification tools (design rule checker, extractor, LVS, schematic and electrical rule checker)
  – Design Simulation:
    • analog simulation: circuit level, behavioural level
    • digital simulation: circuit level, switch level, logic level, register transfer level, architectural level, behavioural level
    • thermal simulation: displaying heat dissipation on chip
  – Formal Verification Methods
Classes of CAD Tools
• Design Implementation:
  – Layout Compilers (stick2layout, macrocell generators, datapath compilers)
  – Layout Structuring & Optimization:
    • Layout Compaction
    • Placement and Routing
  – Logic Synthesis
  – Finite State Machine (FSM) Synthesis
  – Architectural Synthesis
• Management of Design Projects:
  – Design Databases:
    • keep different versions (current, backup 1, ..., backup n) and views of a design object (schematic, simulation netlist, stick diagram, physical layout, ...) in a database
Full Custom Design: Design Entry
Full Custom Design
With full custom design techniques, the designer can individually specify the geometrical layout of the integrated circuit (transistor size [channel length, channel width, shape, ...], transistor placement, wire width, ...). The designer has the option to optimize the layout manually.
The most dense/area-efficient layouts can be generated using the full custom design style.
[Figure: Layout Editor and Design Rule Check (www.tanner.com)]
Hand-Crafted Layout:
• The layout is drawn in the form of rectangles and polygons on different layers using a graphics editor.
• The designer has to know a large set of process dependent design rules.
• The mask layout is generated as drawn on the screen: direct influence on component placement and on important parameters such as W and L of transistors, wire widths, ...
Full Custom Design: Design Entry
Tool internal Design Representation: Geometrical Specification Language
• The layout is specified in textual form, giving either the position and layer of rectangles (similar to hand-crafted layout) or lines (as in stick diagrams).
• Since programming language constructs like
  – parameterized macros (to be used for layout segments such as cells, ...),
  – loops (while, repeat, for, ...), and
  – conditional statements (if, case, ...)
  may be available, parameterized layouts (e.g. a generic transistor with W and L as parameters, cells for different bit widths, ...) can be described using geometrical specification languages.
• Used in a large number of macrocell compilers.
Full Custom Design: Design Entry
Example for a simplified geometrical specification language:

Command       Description
B x y dx dy   Box with length dx, width dy, and lower left-hand corner placed at (x,y)
L n           Layout level (layer) for the box definitions that follow
M n           Start of macro definition n
E             End of macro definition
C n x y m     Call for macro number n with translation x,y and orientation m
Q             End of layout file

MOS Layer definitions:

Layer   CMOS          NMOS
1       n-diffusion   n-diffusion
2       p-diffusion   ion implant
3       polysilicon   polysilicon
4       metal         metal
5       contact       contact
8       n-well        --
9       overglass     overglass
Full Custom Design: Design Entry
Cell Orientations:
Orientation   Description
1             no rotation
2             rotate 90° counterclockwise
3             rotate 180° counterclockwise
4             rotate 270° counterclockwise
5             mirror about y-axis
6             rotate 90° counterclockwise and mirror about y-axis
7             rotate 180° counterclockwise and mirror about y-axis
8             rotate 270° counterclockwise and mirror about y-axis
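The command set and orientation codes above can be tried out with a small interpreter. The following is a minimal sketch, not a tool from the lecture. Assumptions: one command per line, integer coordinates, rotation and mirroring are applied about the macro origin before the translation, and the function names (`orient`, `parse_layout`, `flatten`) as well as the sample layout are hypothetical.

```python
# Sketch of an interpreter for the simplified geometrical specification
# language. Assumed (not stated in the lecture): integer coordinates, one
# command per line, rotate/mirror about the macro origin, then translate.

def orient(box, m):
    """Apply orientation code m (1..8) to a box (x, y, dx, dy)."""
    x, y, dx, dy = box
    corners = []
    for px, py in ((x, y), (x + dx, y + dy)):
        for _ in range((m - 1) % 4):   # 0..3 rotations by 90 degrees CCW
            px, py = -py, px
        if m > 4:                      # codes 5..8: mirror about the y-axis
            px = -px
        corners.append((px, py))
    (x0, y0), (x1, y1) = corners
    return (min(x0, x1), min(y0, y1), abs(x1 - x0), abs(y1 - y0))

def parse_layout(text):
    """Return (macros, top): lists of box and call items."""
    macros, top = {}, []
    current, layer = top, None
    for line in text.splitlines():
        tokens = line.split()
        if not tokens:
            continue
        cmd, args = tokens[0], [int(t) for t in tokens[1:]]
        if cmd == "Q":                 # end of layout file
            break
        elif cmd == "L":               # layer for the boxes that follow
            layer = args[0]
        elif cmd == "B":
            current.append(("B", layer, tuple(args)))
        elif cmd == "M":               # start of macro definition
            current = macros.setdefault(args[0], [])
        elif cmd == "E":               # end of macro definition
            current = top
        elif cmd == "C":               # macro call: n, x, y, m
            current.append(("C", *args))
        else:
            raise ValueError(f"unknown command: {cmd}")
    return macros, top

def flatten(items, macros):
    """Expand all macro calls into a flat list of (layer, x, y, dx, dy)."""
    rects = []
    for item in items:
        if item[0] == "B":
            rects.append((item[1], *item[2]))
        else:
            _, n, tx, ty, m = item
            for layer, *box in flatten(macros[n], macros):
                x, y, dx, dy = orient(box, m)
                rects.append((layer, x + tx, y + ty, dx, dy))
    return rects

SAMPLE = """
M 1
L 3
B 0 0 2 1
E
L 4
B 5 5 1 1
C 1 10 0 1
C 1 0 10 2
Q
"""

macros, top = parse_layout(SAMPLE)
rects = flatten(top, macros)
for r in rects:
    print(r)
```

The sample defines macro 1 as a single polysilicon box and calls it twice, once unrotated and once rotated by 90° counterclockwise.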
Full Custom Design: Design Entry
Full custom layout (hand-crafted, or generated from a stick diagram or a layout description)
Corresponding geometrical specification file and schematic diagram
Full Custom Design: Design Entry
Stick Diagram:
• The layout is drawn in the form of lines and polygons on different layers using a graphics editor.
• A stick-to-layout converter, together with a compactor and a description of the process design rules, is then used to generate the rectangle-based layout.
• The designer can draw symbolic layouts that are almost independent of process and design rules. Process adaptation is done by the converter/compactor.
• Converter constraints (cell dimensions, channel widths / lengths of transistors, ...) can be specified.
• Stick Diagram Conventions:
  – Diffusion Areas: green (b/w: dotted line)
  – Polysilicon Lines: red (b/w: dashed line)
  – Metal Lines: blue (b/w: solid line)
  – Contacts: black
Example: Stick Diagram of a Transistor:
Full Custom Design: Stick Diagrams
Memory cell schematic and corresponding stick diagram
Full Custom Design: Design Flow
[Figure: full custom design flow. Cells are created either with the Stick Diagram Editor (followed by the stick2layout Converter and Compactor) or with the Layout Editor. Further steps shown: Symbol Generation; Schematic Entry; Simulation Netlist Extraction and Simulation (SPICE); Block Layout (Floorplanning, Placement & Routing); Design Analysis (DRC, ERC); Circuit Extraction (LVS); Circuit Simulation (SPICE) and Timing Analysis; Test Pattern Generation; Mask Layout Data and Fabrication Test Pattern; Fabrication.]
Cell Based Design
Cell based design approaches rely on layout components predefined and provided by a silicon foundry. Several implementation styles can be distinguished:
• Standard Cells:
  – layout blocks predefined by the silicon foundry
  – full process sequence (all mask layers) required for chip fabrication
• Gate Arrays:
  – Linear Gate Arrays:
    • pre-fabricated diffusion and poly layers (regular structures, e.g. transistors)
    • customized interconnect structures (wires in metal 1 and metal 2)
    • fixed size interconnect areas (channels)
  – Sea-of-Gates Arrays:
    • pre-fabricated diffusion and poly layers (regular structures, e.g. transistors)
    • customized interconnect structures (wires in metal 1 and metal 2)
    • variable size interconnect areas (channels) over unused transistors
(discussed later in this lecture)
Cell based Full Custom Design: Design Flow
[Figure: cell based design flow. Inputs: Cell Library (graphical data, simulation models, layout data) and Macrocell Specification/Compilation. Steps shown: Symbol Generation; Schematic Entry; Simulation Netlist Extraction; Logic Simulation, Fault Simulation, Timing Analysis; Test Pattern Generation; Placement (Standard Cells, Macro Cells, I/O Cells); Routing (Channel Generation, Global Routing, Detailed Routing); Place & Route Optimization; Parasitic Wire Capacitances / Delay Backannotation; Design Analysis (DRC, ERC); Circuit Extraction (LVS); Mask Layout Data and Fabrication Test Pattern; Fabrication.]
Standard Cell Full Custom Design
Design Verification
Physical Design Rule Check:
Physical design rule checks (DRCs) are performed to guarantee that a layout conforms to the silicon vendor's set of design rules. Design rules are defined between objects on the same layer (minimum width, minimum spacing) as well as between objects on different layers (minimum spacing, overlapping, extension).
[Figure: illustration of the rule types minimum width, minimum spacing, overlapping and extension]
Design rule violations are usually reported in the physical layout using a graphics editor. Sometimes a table listing the location and type of each design rule violation can also be generated.
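The two same-layer rules (minimum width, minimum spacing) can be illustrated with a few lines of code. This is a minimal sketch and not how a production DRC tool works: rectangles are assumed to be axis-aligned tuples `(x, y, dx, dy)` on one layer, touching or overlapping rectangles are assumed to be electrically connected and therefore exempt from the spacing rule, and the names `gap` and `check_layer` and the rule values are hypothetical.

```python
from itertools import combinations
from math import hypot

def gap(a, b):
    """Euclidean distance between two axis-aligned boxes (0 if they touch or overlap)."""
    ax, ay, aw, ah = a
    bx, by, bw, bh = b
    gx = max(bx - (ax + aw), ax - (bx + bw), 0)
    gy = max(by - (ay + ah), ay - (by + bh), 0)
    return hypot(gx, gy)

def check_layer(rects, min_width, min_spacing):
    """Return a tabular list of width and spacing violations for one layer."""
    violations = []
    for r in rects:
        if min(r[2], r[3]) < min_width:
            violations.append(("width", r))
    for a, b in combinations(rects, 2):
        # Assumption: gap == 0 means the shapes are connected, not a violation.
        if 0 < gap(a, b) < min_spacing:
            violations.append(("spacing", a, b))
    return violations

# Hypothetical metal shapes and rule values (arbitrary grid units).
metal = [(0, 0, 4, 2), (5, 0, 4, 2), (0, 10, 1, 5)]
report = check_layer(metal, min_width=2, min_spacing=2)
for entry in report:
    print(entry)
```

The returned list corresponds to the tabular violation report mentioned above; a graphical tool would highlight the same shapes in the layout.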
Design Verification
Extraction:
• Circuit Level Extraction can be used to create a netlist for circuit level simulations (e.g. SPICE, ...). The netlist consists of MOS transistors (including geometrical parameters as W / L, parasitic capacitances), resistors, capacitances, diodes, ...
• Switch Level Extraction: can be used to create a netlist which can be processed by a switch level simulator. The resulting netlist consists of MOS transistors and parasitic capacitances (to model storage effects in MOS circuits).
• Parasitics Extraction: is used in conjunction with cell based design techniques. Since wire delay is dependent on the parasitic capacitance of a wire, parasitic capacitances of nets and input capacitances of other gates connected to an output can be used to estimate the extrinsic delays (Note: intrinsic delays [i.e. the delay of unloaded gates] are fetched from the cell library's simulation model data).
• Schematic Extraction: is executed to generate the connectivity data out of a graphical representation (schematic diagram) of a circuit module. The connectivity data is forwarded to a netlister which provides the information required e.g. by simulation tools (the simulators cannot operate on graphical data, they require netlists in a textual format). This kind of extraction is usually required in pre-layout design specification phases.
Design Verification
LVS:
The layout-versus-schematic (LVS) comparison tool checks the equivalence of a layout and its schematic. The tool can be used to find wrong connections or parameter mismatches (such as W/L of transistors, ...) between a schematic and its physical layout representation.
Schematic / Electrical Rule Check (SRC / ERC):
To verify schematics used e.g. in cell based designs, a schematic rule checker can find schematic rule violations such as the following:
• Warnings:
  – unconnected (floating) wire segments
  – open outputs
  – exceeded fanout
• Errors:
  – open inputs (undefined input value!)
  – number of bits differs for two buses connected together
  – number of input/output pins in a schematic differs from its symbol representation (pins are not accessible / not present at higher levels of the schematic hierarchy)
  – more than one active driver connected to a net at the same time
Simulation: Models
Circuit and Delay Modelling:
• Circuit is built up by simulator primitives
• Modelling of the timing/delay behaviour:
  – $\Delta$: basic time unit
  – $\tau(n) = n \cdot \Delta$: delay of the gate
  – $t_1, t_2, t_3, \ldots$: clock times of a synchronous circuit
  – $(t_{\nu+1} - t_\nu)$: $\Delta t = m \cdot \Delta$
Timing Models:
• Zero Delay: $\Delta = 0$
• Unit Delay: $\tau(n)$ = constant
• Nominal delay: $\tau(n)$ = user-specified
Logic simulation (1/8)
• Simulation only in the time domain
• Typical Questions:
– How do my output signals behave based on a certain input pattern?
– Is my design still functioning at a given frequency?
• Algorithms:
– Signal values are discrete
– Signal changes are discrete events (where an event characterizes the transition from one signal level to another)
– Events are held and processed using a so-called “event-queue”
• Dynamic, linked list
• Sorted based on time (appearance of event)
• Processed based on current simulation time
• Models (gate primitives) are triggered by events at input signals
Logic simulation (2/8)
• Logic Systems
– Signal values representing (logic) level and strength
– Resolving multiple drivers via so-called resolution functions (e.g. ‘0’ and ‘1’ at the same node result in an ‘X’); example later
– 2-valued logic system (e.g. VHDL: type bit)
  • '0' ("low", e.g. Vout < 2.5 V) and '1' ("high", Vout > 2.5 V)
– 3-valued logic system
  • To describe circuit problems (signal conflicts)
  • '0' ("low"), '1' ("high")
  • 'X' ("unknown", may be '0' or '1')
– 4-valued logic system
  • To describe bus structures
  • '0', '1', 'X' (see above)
  • 'Z' ("high impedance")
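[Figure: gate-level examples for the 2-, 3- and 4-valued logic systems, including a signal conflict resulting in 'X' and drivers with enable input EN producing 'Z']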
Logic simulation (3/8)
– 9-valued logic system (VHDL: type std_logic from package std_logic_1164)
  • 'U' ("uninitialized")
  • 'X' ("forcing unknown")
  • '0' ("forcing low"), '1' ("forcing high")
  • 'Z' ("high impedance")
  • 'W' ("weak unknown")
  • 'L' ("weak low"), 'H' ("weak high")
  • '-' ("don't care")
CONSTANT resolution_table : stdlogic_table := (
--  ---------------------------------------------------------
--  |  U    X    0    1    Z    W    L    H    -        |   |
--  ---------------------------------------------------------
    ( 'U', 'U', 'U', 'U', 'U', 'U', 'U', 'U', 'U' ), -- | U |
    ( 'U', 'X', 'X', 'X', 'X', 'X', 'X', 'X', 'X' ), -- | X |
    ( 'U', 'X', '0', 'X', '0', '0', '0', '0', 'X' ), -- | 0 |
    ( 'U', 'X', 'X', '1', '1', '1', '1', '1', 'X' ), -- | 1 |
    ( 'U', 'X', '0', '1', 'Z', 'W', 'L', 'H', 'X' ), -- | Z |
    ( 'U', 'X', '0', '1', 'W', 'W', 'W', 'W', 'X' ), -- | W |
    ( 'U', 'X', '0', '1', 'L', 'W', 'L', 'W', 'X' ), -- | L |
    ( 'U', 'X', '0', '1', 'H', 'W', 'W', 'H', 'X' ), -- | H |
    ( 'U', 'X', 'X', 'X', 'X', 'X', 'X', 'X', 'X' )  -- | - |
);
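How a simulator uses this table to resolve several drivers on one net can be sketched in a few lines. The table rows below are copied from the VHDL constant above; the folding scheme (start from 'Z', combine the drivers one by one, return a single driver unchanged) follows the `resolved` function of the std_logic_1164 package. The Python names `STATES`, `TABLE`, `resolve_pair` and `resolved` are chosen for this sketch only.

```python
# Rows and columns are ordered U X 0 1 Z W L H - as in the VHDL table.
STATES = "UX01ZWLH-"
TABLE = [
    "UUUUUUUUU",  # U
    "UXXXXXXXX",  # X
    "UX0X0000X",  # 0
    "UXX11111X",  # 1
    "UX01ZWLHX",  # Z
    "UX01WWWWX",  # W
    "UX01LWLWX",  # L
    "UX01HWWHX",  # H
    "UXXXXXXXX",  # -
]

def resolve_pair(a, b):
    """Look up the resolved value of two driver values."""
    return TABLE[STATES.index(a)][STATES.index(b)]

def resolved(drivers):
    """Resolve any number of drivers connected to the same net."""
    if len(drivers) == 1:
        return drivers[0]          # a single driver is passed through unchanged
    result = "Z"                   # weakest state is the neutral start value
    for value in drivers:
        result = resolve_pair(result, value)
    return result

print(resolved(["0", "1"]))   # two forcing drivers in conflict
print(resolved(["Z", "1"]))   # tri-stated driver next to an active one
print(resolved(["L", "H"]))   # two weak drivers in conflict
```

Because the table is symmetric, the order in which the drivers are combined does not matter.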
Logic simulation (4/8)
• Timing Behavior Models
– Zero delay ("Non-Delay"): all gates have the same delay of zero
  • Advantage: simple, fast
  • Drawback: no accuracy and no timing behavior; not suitable for asynchronous circuits
– Unit delay: all gates have the same delay $t_{pd} > 0$
  • Advantage: simple, fast
  • Drawback: may cause oscillations in feedback loops that do not occur in reality
– Nominal delay: every gate has an individual, but nominal delay
  • Advantage: more detailed timing behavior
  • Drawback: tolerances are still not modeled
– Delay model including the dependency on load ($C_{load}$) and on environment conditions (temperature, voltage, process), with $K_T = K_V = K_P = 1$ in the nominal case:
  $t_{pd,actual} = (t_0 + K_L \cdot C_{load}) \cdot K_T \cdot K_V \cdot K_P$
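A short numerical sketch of the load- and environment-dependent delay model. All numbers below (intrinsic delay, load factor, load capacitance, derating factors) are made-up illustration values, not data from a real cell library, and the function name `actual_delay` is hypothetical.

```python
def actual_delay(t0, k_load, c_load, k_temp=1.0, k_volt=1.0, k_proc=1.0):
    """t_pd,actual = (t0 + K_L * C_load) * K_T * K_V * K_P."""
    return (t0 + k_load * c_load) * k_temp * k_volt * k_proc

# Assumed example values: t0 = 0.1 ns, K_L = 2 ns/pF, C_load = 0.05 pF.
nominal = actual_delay(t0=0.1, k_load=2.0, c_load=0.05)

# Assumed worst-case derating: hot, low supply voltage, slow process.
worst = actual_delay(t0=0.1, k_load=2.0, c_load=0.05,
                     k_temp=1.2, k_volt=1.1, k_proc=1.3)

print(f"nominal: {nominal:.4f} ns, worst case: {worst:.4f} ns")
```

With all three derating factors equal to 1 the model reduces to the nominal delay $t_0 + K_L \cdot C_{load}$.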
Logic simulation (5/8)
• Timing Behavior Models (cont.)
– Min-Max-Delay
  • Models delay tolerances
  • Advantage: timing behavior under worst-case conditions
  • Drawback: complex, higher runtime; undefined signal states are mostly very pessimistic as they propagate
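[Figure: example circuit with signals A, B, C, D and gates with (min, max) delays of (10, 20), (10, 20) and (15, 40); timing diagram from 0 ns to 140 ns showing the min. and max. transition times of A, B, C and D]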
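The growth of the uncertainty window in a min-max simulation can be sketched as follows. The sketch only handles a simple chain of gates, which is an assumption made here; the exact topology of the example circuit is not recoverable from the slide, and the function name `output_window` is hypothetical.

```python
def output_window(t_in, gate_delays):
    """Earliest and latest time at which an input change at t_in
    can appear at the output of a chain of gates.

    gate_delays: list of (min_delay, max_delay) per gate, in ns.
    """
    earliest = t_in + sum(d_min for d_min, _ in gate_delays)
    latest = t_in + sum(d_max for _, d_max in gate_delays)
    return earliest, latest

# Assumed chain: a (10, 20) gate followed by a (15, 40) gate, input edge at 20 ns.
window = output_window(20, [(10, 20), (15, 40)])
print(window)
```

Between the earliest and the latest time the output value is undefined, which is why the undefined intervals widen with every gate a signal passes.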
Logic simulation (6/8)
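• Event Queue Example
[Figure: multiplexer circuit with inputs i1, i2 and sel, internal signals selbar, s1 and s2, and output result, built from the gates in_gate1, in_gate2, sel_inverter and out_gate; gate delays of 10 ns, 15 ns, 12 ns and 8 ns; input waveforms of i1, i2 and sel from 0 ns to 140 ns. Each event queue entry holds time, signal and value.]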
Logic simulation (7/8)
• Event Queue Example (cont.)
– Event queue before initialization:
    Time     Signal   Value
    0 ns     i1       0
    0 ns     i2       1
    0 ns     sel      0
    10 ns    i1       1
    30 ns    i2       0
    70 ns    sel      1
    100 ns   i1       0
– Event queue for t = 0 ns:
    Time     Signal   Value
    10 ns    selbar   1
    10 ns    i1       1
    12 ns    s2       0
    15 ns    s1       0
    30 ns    i2       0
    70 ns    sel      1
    100 ns   i1       0
– Event queue for t = 10 ns:
    Time     Signal   Value
    12 ns    s2       0
    15 ns    s1       0
    22 ns    s2       1
    30 ns    i2       0
    70 ns    sel      1
    100 ns   i1       0
[Figure: timing diagram of s1, s2, selbar and result from 0 ns to 140 ns; all internal signals start as 'U']
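The event-queue mechanism can be reproduced with a small event-driven simulator. This is a sketch under stated assumptions, not the simulator used in the lecture: the queue is a binary heap instead of a sorted linked list, the assignment of delays to gates (inverter 10 ns, gate for s1 15 ns, gate for s2 12 ns, output gate 8 ns) is inferred from the queue entries, the output gate is assumed to be an OR, and unknown inputs yield 'U' unless a controlling value is present. Intermediate queue contents can therefore differ in detail from the slide.

```python
import heapq
from itertools import count

def and_gate(a, b):
    if a == "0" or b == "0":
        return "0"                      # controlling value decides
    return "1" if a == b == "1" else "U"

def or_gate(a, b):
    if a == "1" or b == "1":
        return "1"
    return "0" if a == b == "0" else "U"

def inverter(a):
    return {"0": "1", "1": "0"}.get(a, "U")

def simulate(gates, stimuli, t_end):
    """gates: output -> (function, input names, delay); stimuli: (time, signal, value)."""
    fanout = {}
    for out, (_, inputs, _) in gates.items():
        for name in inputs:
            fanout.setdefault(name, []).append(out)
    order = count()                     # tie-breaker keeps insertion order
    queue = [(t, next(order), s, v) for t, s, v in stimuli]
    heapq.heapify(queue)
    values, trace = {}, []
    while queue and queue[0][0] <= t_end:
        t, _, signal, value = heapq.heappop(queue)
        if values.get(signal, "U") == value:
            continue                    # no transition, hence no event
        values[signal] = value
        trace.append((t, signal, value))
        for out in fanout.get(signal, []):   # trigger the connected gate models
            func, inputs, delay = gates[out]
            new = func(*(values.get(i, "U") for i in inputs))
            heapq.heappush(queue, (t + delay, next(order), out, new))
    return trace, values

# Multiplexer from the example; delay assignment is an assumption (see above).
GATES = {
    "selbar": (inverter, ["sel"], 10),
    "s1": (and_gate, ["i1", "sel"], 15),
    "s2": (and_gate, ["i2", "selbar"], 12),
    "result": (or_gate, ["s1", "s2"], 8),
}
STIMULI = [(0, "i1", "0"), (0, "i2", "1"), (0, "sel", "0"), (10, "i1", "1"),
           (30, "i2", "0"), (70, "sel", "1"), (100, "i1", "0")]

trace, final = simulate(GATES, STIMULI, t_end=140)
result_changes = [(t, v) for t, s, v in trace if s == "result"]
print(result_changes)
```

Only gates whose inputs change are evaluated, which is the key difference to a compiler-driven simulation that re-evaluates the entire circuit.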
Logic simulation (8/8)
• Simulation on logic level
– Netlist of gates (structural modeling)
– Gate model defined in standard cell library macro
– Strictly using signals of the selected logic system
• Simulation on register-transfer-level
– Netlist of larger components
– Modeling of the component behavior using a hardware description language (VHDL/Verilog)
– Logic signals or even more abstract data types (e.g. state machine states)
[Figure: gate-level netlist example with inputs a, b, c and s, and state diagram of a state machine with the states init, state_1, state_2 and state_3]
Simulation: Models
Advanced Logic Simulators:
• Introduction of signal strength in addition to logic values for driver and bus modelling:
  – A: active, e.g. low impedance driver
  – P: passive, e.g. high impedance driver (depletion load)
  – S: storing, e.g. capacitive stored state
  – X: active indeterminate (e.g. active or storing)
  – Y: passive indeterminate (e.g. passive or storing)
  – Z: high impedance
• Instead of simple logical values, signals are used for simulation. A signal consists of a logical value and a strength.
• Logical values = 0, 1, X
• 16 states: A0 A1 AX P0 P1 PX S0 S1 SX X0 X1 XX Y0 Y1 YX ZZ

Overview of signal combinations (upper triangle of the table):

     A0 A1 AX P0 P1 PX S0 S1 SX X0 X1 XX Y0 Y1 YX ZZ
A0   A0 AX AX A0 A0 A0 A0 A0 A0 A0 AX AX A0 A0 A0 A0
A1      A1 A1 A1 A1 A1 A1 A1 A1 AX A1 AX A1 A1 A1 A1
AX         AX AX AX AX AX AX AX AX AX AX AX AX AX AX
P0            P0 PX PX P0 P0 P0 X0 XX XX P0 PX PX P0
P1               P1 PX P1 P1 P1 XX X1 XX PX P1 PX P1
PX                  PX PX PX PX XX XX XX PX PX PX PX
S0                     S0 SX SX X0 XX XX Y0 YX YX S0
S1                        S1 SX XX X1 XX YX Y1 YX S1
SX                           SX XX XX XX YX YX YX SX
X0                              X0 XX XX X0 X0 XX X0
X1                                 X1 XX X1 XX XX X1
XX                                    XX XX XX XX XX
Y0                                       Y0 YX YX Y0
Y1                                          Y1 YX Y1
YX                                             YX YX
ZZ                                                ZZ
Simulation: Models
Example: Driver Modelling:
Competing Drivers at a Bus
Simulation
www.modelsim.com
Simulation: Techniques
Simulation Techniques:
• Compiler-driven technique:
  – Problems:
    • Feedbacks
    • Sorting of gate netlist
    • Zero delay model
    • Entire circuit is simulated
• Event-driven simulation ...
Switch-Level Simulation:
• well-suited to simulate digital MOS circuits
• no fixed direction of signal flow
• transistor modeled as a switch with three states: open, closed, unknown
• algebraic or RC models
Executable Specifications: VHDL
architecture structural of first_tap is
  signal x_q, red : std_logic_vector(bitwidth-1 downto 0);
  signal mult     : std_logic_vector(2*bitwidth-1 downto 0);
begin
  -- input delay register with asynchronous reset
  delay_register: process(reset, clk)
  begin
    if reset = '1' then
      x_q <= (others => '0');
    elsif (clk'event and clk = '1') then
      x_q <= x_in;
    end if;
  end process;

  -- signed multiplication; the product is converted back to std_logic_vector
  mult <= std_logic_vector(signed(coef) * signed(x_q));

Different types of modeling:
• Data Flow
• Behaviour
• Structure
VHDL is used for:
• Modelling
• Simulation
• Hardware Synthesis
VHDL: Very High Speed Integrated Circuit Hardware Description Language
Design Flow: IC Design with High-Level-Entry
[Figure: VHDL description (the first_tap architecture of the previous slide) → RTL synthesis (Synopsys) → gate-level netlist → placement & routing (Cadence/Mentor) → layout → production → ASIC]
Future Outlook: Networks-on-Chip
[Figure: network-on-chip with routers and high-speed interconnect; subsystems (µP, FPGA, MEM, ASIC) are attached through generic interfaces]
– Regular platform integrating independent subsystems
• combine structures of today‘s SoC complexity
– Separation between Communication and Computation
NoC-based design flow: Hardware/Software Co-Design
[Figure: comparison of two flows. Classical flow: Specification → HW/SW-Partitioning → HW-Specification and SW-Specification (with Co-Simulation) → Synthesis and Placement/Routing on the HW side, Compilation and Real-Time OS on the SW side, Communication Synthesis in between → Heterogeneous HW-/SW-System. NoC-based flow: Specification → Implementation using a HW Library and a SW Library → NoC Mapping → NoC Placement, with Dynamic Allocation/Re-Mapping during Operation.]
Application Scenario: Mobile Video Terminal
[Figure: single-chip mobile terminal with RF part, central control (Centr. CTRL), display control (Displ. CTRL) and display, communicating with mobile service base station(s)]
Different Configurations for:
• High Quality (Resolution) Downstreaming
• Low-Power Mode (Quality Reduction)
• Image Compression and Upstreaming
• Multi-Stream Modes
12. Digital Subsystem Design
Integrated Electronic Systems Lab12: Digital Design 387
Weinberger Structuring
Is a structured approach that simplifies structural layout and improves layout density. Method presented by Weinberger in 1967.
Weinberger Arrays:
• Are created by placing transistors on the chip in a geometrically regular manner. Horizontal and vertical interconnect patterns are used to wire the devices together.
• Using one type of gate (e.g. NOR), complex NMOS circuits can be realized.
• Regularity of Weinberger Arrays is very suitable for automatic layout generation.
Weinberger Structuring (2)
Example of NOR gate reduction for Weinberger structuring:
• Empty squares = input connections
• Filled squares = output connections
$F = \overline{(A + B + C)}$
Example: 3-to-8 decoder
Weinberger structuring:
3-to-8 decoder (2)
3-to-8 decoder (3)
Example 2
Weinberger NOR array representation
Random logic implementation
$F = U + V + W + X + Y$
Example 2 (2)
Weinberger stick diagram
Example 2 (3)
Weinberger array structure: (a) schematic (b) layout
Gate matrix layout
Gate matrix layout is a character based layout style for custom CMOS circuitry. It is a regular design style employing a matrix of intersecting transistor diffusion rows and poly-silicon columns, such that intersections are potential transistor sites.
Creating a gate matrix: a representational line drawing (stick figure) is made using the available levels of interconnection, e.g. poly-silicon, metal and diffusion in a poly-silicon gate technology.
– First, draw a series of parallel poly lines corresponding to the number of inputs to the circuit (there may be more if an output is chosen to be poly-silicon)
– Subsequent transistor placements are determined by two factors: the input column and the serial or parallel association among transistors.
– After row definition, further interconnections may be made with horizontal and vertical metal interconnection tracks
– Final improvements
Gate matrix layout (2)
Gate matrix layout:
(a) Schematic
(b) Layout
(c) Optimized layout of N part
Example: half adder
$C = AB = \overline{\overline{AB}}$
$S = A\overline{B} + \overline{A}B = (A + B) \cdot \overline{AB}$
Half adder realizations
(a) Standard cell
(b) Gate matrix
Character definitions for symbolic layout
N n-channel transistor
P p-channel transistor
+ metal-poly or metal-diffusion crossover
* contact
| poly-silicon or n-diffusion wire
! p-diffusion wire
: vertical metal
- horizontal metal
Character definitions (cont.)
Rules
The following rules summarize the gate-matrix technique:
– Poly-silicon runs in one direction only and is of constant width and pitch
– Diffusion wires (of constant width) may run vertically between poly-silicon columns.
– Metal may run horizontally and vertically. Any pitch departures from a minimum (e.g. power rails) are manually specified.
– Transistors can only exist on poly-silicon columns.
Wide transistors may be specified by abutting two or more N or P symbols.
Summary of gate matrix properties
regular design style
technology updateable
modularity is encouraged by the block nature of the layout style
circuit extraction may be done at the symbolic level or at the mask level by conventional circuit extraction
the character symbolic description is not hierarchical: modules must be assembled in their entirety and "pasted" together at the mask level
no freedom to locally optimize geometry, e.g. transistor size
Optimal CMOS complex gate layout
In MOS circuit design, complex functional cells can be applied to achieve better performance. In this section, the implementation of a random logic function on an array of CMOS transistors is discussed. The method was presented by Uehara and van Cleemput in 1981. A graph theoretical approach for systematic and efficient layout generation minimizes the required chip area.
EXOR: NAND implementation
(a) Logic diagram
(b) Circuit
(c) Layout
CMOS Functional cells (Complex gates)
Advantages of the complex-gate approach:
– better performance
– smaller size
Complex gates (2)
In the following, the consideration is limited to AND/OR networks realized in complex gate CMOS by means of series/parallel connections of transistors. The topologies of the NMOS network and the PMOS network are assumed to be dual.
The delay of a complex CMOS cell mainly depends on the maximum number of series transistors between VDD or VSS and the cell output, which is called the level of the complex cell. This quantity has a direct influence on the charging or discharging resistance of the cell. Generally, cells with less than four levels are desirable. The number of cells with parallel/serial topology is given by the following table:
It is reasonable to use mainly cells with three levels and only sometimes cells with four levels in order to get a sufficient performance.
Alternative EXOR implementation
Basic layout strategy
Layout strategy (2)
Layout properties:
– two rows of transistors, for the PMOS and NMOS parts of the circuit
– equal number of transistors in both rows
Optimizations: If the metal connections between adjacent transistors are replaced by diffusion (designer should be careful in doing this for high-speed circuits) the following layout (a) is achieved.
Optimized layout
An even more sophisticated layout arrangement which reduces the required area is shown in (b)
area = width * height
with
  height = const.
  width = basic grid size * (#inputs + #separations + 1)
A separation is required when there is no connection between physically adjacent transistors.
An optimal layout is obtained by reducing the number of separations.
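The width formula above can be evaluated directly. The following is a minimal sketch; the function names, the unit grid size and the five-input cell are illustrative assumptions, not values from the lecture.

```python
def layout_width(num_inputs: int, num_separations: int, grid: float = 1.0) -> float:
    """Width of a functional cell: grid * (#inputs + #separations + 1)."""
    return grid * (num_inputs + num_separations + 1)


def layout_area(num_inputs: int, num_separations: int,
                height: float = 1.0, grid: float = 1.0) -> float:
    """Area = width * height, with a constant cell height."""
    return layout_width(num_inputs, num_separations, grid) * height


# Hypothetical five-input cell: every separation that is removed
# saves exactly one grid column of width.
for separations in (2, 1, 0):
    print(separations, layout_width(5, separations))
```

Since the height is constant, minimizing the area is equivalent to minimizing the number of separations.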
Optimal layout
The best layout is achieved by the following transistor arrangement, logically equivalent to the previous figures:
Graph theoretical algorithm
The p-side and the n-side of the circuit can be formulated as graphs, defined as:
$G_P = (V_P, E_P)$: p-side network
$G_N = (V_N, E_N)$: n-side network
Graph properties:
– the graphs are series/parallel graphs (CMOS complex gate property/assumption)
– every source/drain potential is represented by a vertex V
– every transistor is represented by an edge E, connecting the vertices representing source and drain
– edges are labeled by the corresponding transistor gate input signal
– GP and GN are dual
Graph theoretical algorithm (2)
If two edges Ei and Ej are adjacent in the graph model, then it is possible to place the corresponding gates in a physically adjacent position of an array and hence, connect them by a diffusion area. In order to minimize the number of separations a set of minimum size paths has to be found, which corresponds to chains of transistors in the array.
Definition 1: An Euler path is a single (uninterrupted) path on a graph, that covers every edge of the graph exactly once.
If there exist Euler paths for GN and GP, then all transistors can be chained by diffusion areas. Otherwise the graphs have to be partitioned into sub-graphs, each of which has an Euler path.
It's necessary to find a pair of paths for GP and GN with the same sequence of labels, because p- and n-type transistors corresponding to the same input have to be positioned at the same horizontal position (poly line).
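Whether a single graph has an Euler path at all can be tested with the classical degree criterion: a connected undirected multigraph has an Euler path if and only if it has zero or two vertices of odd degree. The sketch below applies this test to one graph (GN or GP) at a time; the edge-list representation and the vertex names are assumptions made for illustration, and the additional requirement of a common label sequence for GP and GN is not checked here.

```python
from collections import defaultdict


def has_euler_path(edges):
    """edges: list of (vertex_u, vertex_v, gate_label) tuples of an undirected multigraph."""
    if not edges:
        return True
    degree = defaultdict(int)
    neighbours = defaultdict(set)
    for u, v, _label in edges:
        degree[u] += 1
        degree[v] += 1
        neighbours[u].add(v)
        neighbours[v].add(u)

    # All vertices that carry edges must be connected.
    start = edges[0][0]
    seen, stack = {start}, [start]
    while stack:
        node = stack.pop()
        for nxt in neighbours[node]:
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    if seen != set(degree):
        return False

    # Degree criterion: zero or two vertices of odd degree.
    odd = sum(1 for d in degree.values() if d % 2 == 1)
    return odd in (0, 2)


# Pull-down network of a 2-input NAND: transistors a and b in series.
nand2_n = [("out", "n1", "a"), ("n1", "gnd", "b")]
# Three transistors meeting in one internal node with three different far ends.
star = [("m", "p", "a"), ("m", "q", "b"), ("m", "r", "c")]
print(has_euler_path(nand2_n), has_euler_path(star))
```

For the series chain the test succeeds, while the star-shaped graph has four odd-degree vertices and would need at least one separation.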
Graph theoretical algorithm (3)
General algorithm:
– enumerate all possible decompositions of the graph model to find the minimum number of Euler paths that cover the graph
– chain the gates by means of a diffusion area according to the order of the edges in each Euler path
– if more than one Euler path is necessary to cover the graph model, provide a separation area between each pair of adjacent chains
Result: The search for the minimal number of Euler paths is NP-complete.
Problem reduction: An odd number of series or parallel edges can be reduced to a single edge:
Problem reduction
Definition 2: The reduced graph is obtained by iteratively replacing an odd number of series (parallel) edges by a single edge, until no further reduction is possible.
Theorem 1: If there is an Euler path in the reduced graph, then there exists an Euler path in the original graph.
Proof: It is possible to reconstruct an Euler path in the original graph by replacing each edge of the Euler path in the reduced graph by a sequence of the original odd number of edges.
Theorem 2: If the number of inputs to every AND/OR element is odd, then:
– the corresponding graph model has a single Euler path
– there exists a graph model such that the sequence of edges on an Euler path corresponds to the vertical order of inputs on a planar representation of the logic diagram.
Problem reduction (2)
If there are gates in the logic diagram with an even number of inputs, additional “pseudo” inputs have to be introduced in order to guarantee an odd number of inputs. Theorem 2 guarantees that an Euler path exists for this modified problem. However, the pseudo edges in the Euler path have to be removed afterwards, and they can then cause diffusion separations. An algorithm for minimizing the separations caused by pseudo edges is given in the next section (minimal interlace of normal and pseudo inputs).
Problem reduction (3)
The heuristic algorithm for generating an Euler path is given by:
1. To every gate with an even number of inputs a “pseudo” input is added.
2. Add this new input to the gate such that the planar representation of the logic diagram shows a minimal interlace of “pseudo” and real inputs. It should be noted that a “pseudo” input at the top or at the bottom of the logic diagram does not contribute to the separation areas.
3. Construct the graph model such that the sequence of edges corresponds to the vertical order of inputs on the planar logic diagram.
4. Chain together the gates by means of diffusion areas, as indicated by the sequence of edges on the Euler path. “Pseudo” edges indicate separation areas.
5. The final circuit topology can be derived by deleting “pseudo” edges in parallel with other edges and by contracting “pseudo” edges in series with other edges.
Application of reduction rule
(a) Logic diagram
(b) Graph model and its reduction
(c) Reconstruction of an Euler path
Application of heuristic algorithm
This heuristic algorithm does not necessarily give the optimal layout, but if the resulting sequence has no separation areas, it is in fact the optimal solution.
(a) New inputs p1 and p2 are added
(b) Optimal sequence of inputs without the interlace of p1 and p2
(c) Circuit with the dual path p1,2,3,1,4,5,p2
Algorithm for calculating minimal interlace
[Flowchart: minimal interlace algorithm, running from "start" to "stop". Decision nodes (Yes/No): "Any white triangle left?", "Any black-white triangle left?", "Any black triangle left?". Actions: "Put it in the line.", "Put it in the line, and set the white part on top.", "Put it in the line, and set the black part on top." Inset: an example of a line.]
Application example for minimal interlace algorithm
Example: carry look-ahead
This implementation has no Euler path!
Alternative carry look-ahead topology
This topology does have an Euler path!
Comparison of space
(a) Functional cell realization
(b) Conventional NAND realization
Standard cell layout
Example: synchronous counter
Programmable Logic Arrays (1)
• Map a set of Boolean functions in canonical, two-level sum-of-product form into a geometrical structure
• Consist of an AND-plane and an OR-plane
• For every input variable in the Boolean equations, there is an input signal to the AND-plane
• The AND plane produces a set of product terms by performing an AND operation
• The OR plane generates output signals by performing an OR operation on the product terms fed by the AND plane
Programmable Logic Arrays (2)
Programmable Logic Arrays (3)
• PLA (Programmable Logic Array):
– AND and OR arrays are programmable
– every product term of the AND array can be connected to any of the OR output gates
• PAL (Programmable Array Logic):
– AND array is programmable
– OR array has fixed connection points (OR gates)
• PROM (Programmable Read Only Memory):
– AND array is hardwired
– OR array is programmable
– the set of all possible product terms is realized
Architectures (1)
Architectures (2)
Example (1)
• PROM implementation realizes all of the 8 product terms
x0 x1 x2 z0 z1
0 0 0 1 1
0 0 1 1 1
0 1 0 0 0
0 1 1 0 0
1 0 0 0 0
1 0 1 0 0
1 1 0 1 0
1 1 1 0 1
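$z_0 = \overline{x_0}\,\overline{x_1}\,\overline{x_2} + \overline{x_0}\,\overline{x_1}\,x_2 + x_0\,x_1\,\overline{x_2} = \overline{x_0}\,\overline{x_1} + x_0\,x_1\,\overline{x_2}$
$z_1 = \overline{x_0}\,\overline{x_1}\,\overline{x_2} + \overline{x_0}\,\overline{x_1}\,x_2 + x_0\,x_1\,x_2 = \overline{x_0}\,\overline{x_1} + x_0\,x_1\,x_2$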
Example (2)
• PLA implementation needs only 3 product terms
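x2 x1 x0 | z1 z0
1 1 1 | 1 0
0 1 1 | 0 1
X 0 0 | 1 1
$z_0 = \overline{x_0}\,\overline{x_1}\,\overline{x_2} + \overline{x_0}\,\overline{x_1}\,x_2 + x_0\,x_1\,\overline{x_2} = \overline{x_0}\,\overline{x_1} + x_0\,x_1\,\overline{x_2}$
$z_1 = \overline{x_0}\,\overline{x_1}\,\overline{x_2} + \overline{x_0}\,\overline{x_1}\,x_2 + x_0\,x_1\,x_2 = \overline{x_0}\,\overline{x_1} + x_0\,x_1\,x_2$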
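As a cross-check, the three product terms of the PLA example can be evaluated for all eight input combinations and compared with the truth table of the PROM example. This is only a sketch: the data layout (`PLA_ROWS`, `TRUTH_TABLE`) and the function name `pla_eval` are illustrative assumptions.

```python
# Personality matrix of the PLA example: pattern over (x2, x1, x0) -> outputs (z1, z0).
PLA_ROWS = [("111", "10"), ("011", "01"), ("X00", "11")]

# Truth table of the example, keyed by (x0, x1, x2) with values (z0, z1).
TRUTH_TABLE = {
    (0, 0, 0): (1, 1), (0, 0, 1): (1, 1), (0, 1, 0): (0, 0), (0, 1, 1): (0, 0),
    (1, 0, 0): (0, 0), (1, 0, 1): (0, 0), (1, 1, 0): (1, 0), (1, 1, 1): (0, 1),
}


def pla_eval(x0, x1, x2):
    """AND plane: match each product term; OR plane: OR the outputs of matching terms."""
    bits = (x2, x1, x0)
    z1 = z0 = 0
    for pattern, outputs in PLA_ROWS:
        if all(p == "X" or int(p) == b for p, b in zip(pattern, bits)):
            z1 |= int(outputs[0])
            z0 |= int(outputs[1])
    return z0, z1


mismatches = [k for k, v in TRUTH_TABLE.items() if pla_eval(*k) != v]
print(mismatches)  # []
```

An empty mismatch list indicates that the three shared product terms reproduce both output functions.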
Floor Plan for PLA
A AND plane programming cell
O OR plane programming cell
AO AND-OR communication cell
IN AND plane input cell
OUT OR plane output cell
LA left AND plane cell
RO right OR plane cell
BL bottom left cell
BM bottom middle cell
BR bottom right cell
TL top left cell
TA top AND cell
TM top middle cell
TO top OR cell
TR top right cell
PLA generic floor plan
Static nMOS and Pseudo-nMOS PLA
• nMOS PLA: Pull-up network realized by single nMOS depletion transistor
• Pseudo nMOS PLA: Pull-up by high resistance pMOS transistor with permanently grounded gate input
• But: AND-OR structure not suited to MOS circuit technology
• Therefore: AND and OR planes are implemented through NOR or NAND gate structures
• The transformation is based on deMorgan’s law
INV-NOR-NOR-INV Structure (1)
Transformation according to deMorgan’s law:
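Written out as equations, the transformation might look as follows. The symbols $p_i$ (product terms), $z_j$ (outputs) and the input names $x_a \dots x_k$ are notation assumed here for illustration.

```latex
\begin{aligned}
p_i &= x_a \, x_b \cdots x_k
     = \overline{\overline{x_a} + \overline{x_b} + \dots + \overline{x_k}}
     && \text{(AND plane: INV followed by NOR)} \\
z_j &= p_1 + p_2 + \dots + p_m
     = \overline{\overline{p_1 + p_2 + \dots + p_m}}
     && \text{(OR plane: NOR followed by INV)}
\end{aligned}
```

Each AND becomes a NOR of inverted inputs and each OR becomes an inverted NOR, which gives the INV-NOR-NOR-INV structure.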
INV-NOR-NOR-INV Structure (2)
Example:
General structure:
INV-NOR-NOR-INV Structure (3)
Properties:
• high static power dissipation
• small area
• useful if high speed is not required
INV-NOR-NOR-INV Structure (4)
Pseudo nMOS NOR-NOR PLA circuit
INV-NOR-NOR-INV Structure (5)
PLA implementation in pseudo nMOS logic
INV-NOR-NOR-INV Structure (6)
Stick diagram of a nMOS PLA
NAND-NAND Structure (1)
Transformation according to deMorgan’s law:
Example:
NAND-NAND Structure (2)
Properties:
• NAND-NAND approach not recommended:
• decreasing performance at increasing number of inputs (because of series connection of nMOS transistors)
• high static power dissipation
Static CMOS PLA (1)
• NOR gates with a large number of inputs should be avoided in CMOS (because the p-channel devices are in series)
• Static CMOS PLAs are usually realized in NAND-INV-INV-NAND structure in order to avoid long chains of pMOS transistors
Properties:
• no static power dissipation
• area increase becomes unacceptable for large PLAs
• fast operation
Static CMOS PLA (2)
PLA NAND-INV-INV-NAND implementation
Static CMOS PLA Layout
Dynamic CMOS PLA (1)
• smaller than static CMOS
• fast
• 2-phase clocking
• state Φ1 = 1:
– no path to ground
– inputs change
– both NOR planes are precharged
• state Φ1 = 0:
– first NOR plane discharges
– dummy: worst-case discharge (prevents the second NOR plane from discharging)
– after the first NOR plane, the second plane evaluates
Dynamic CMOS PLA (2)
• Φ2 is used to latch the second stage
• An intermediate clock is required to precharge the OR plane
– generated by the cells TL, TA and TM
– uses a dummy product row that discharges at the worst-case rate according to the loading of the AND array
Dynamic CMOS PLA (3)
Dynamic 2-phase PLA circuit
Noise in PLA circuits (1)
• Noise Problems on switched supply lines in dynamic PLAs
• The discharge current generates transients in the power supply bus
• To reduce noise: locally grounding the PLA; use of metal lines for power supply whenever possible (reduced impedance)
Noise in PLA circuits (2)
Optimization of PLAs – Logic Minimization
• optimization (minimization) of the Boolean equations in order to reduce the number of product terms or literals
• decoder in front of the AND plane to generate combined input variables
• if a term is needed both positive and negative, a reduction can sometimes be achieved by using negative logic
Example:
z = x1 + x0x1’x2’ + x0’x1’x2 (3 product terms)
z’ = (x1 + x0x1’x2’ + x0’x1’x2)’
= x1’(x0x1’x2’)’(x0’x1’x2)’
= x1’(x0’ + x1 + x2)(x0 + x1 + x2’)
= (x0’x1’ + x1’x2)(x0 + x1 + x2’)
= x0x1’x2 + x0’x1’x2’ (2 product terms)
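The result of the derivation can be checked by exhaustive enumeration of the three inputs. The sketch below simply encodes the first and the last line of the example as Boolean functions; the function names are chosen for illustration.

```python
from itertools import product


def z(x0, x1, x2):
    """Original function: z = x1 + x0 x1' x2' + x0' x1' x2."""
    return bool(x1 or (x0 and not x1 and not x2) or (not x0 and not x1 and x2))


def z_complement(x0, x1, x2):
    """Minimized complement: z' = x0 x1' x2 + x0' x1' x2'."""
    return bool((x0 and not x1 and x2) or (not x0 and not x1 and not x2))


# The complement must disagree with z for every input combination.
mismatches = [bits for bits in product((False, True), repeat=3)
              if z_complement(*bits) != (not z(*bits))]
print(mismatches)  # []
```

Implementing z’ and inverting the output therefore needs two product rows instead of three.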
Optimization of PLAs – Folding
PLA before folding
Row-folded PLA
Column-folded PLA
Optimization of PLAs – Multi Sided Access
• An advantage of multi-sided access and folding is the decreased layout area, but the layout structure has changed and the wiring is more difficult.
Multi sided input/output access
Timing & Power Dissipation of a Static PLA
• Delay is determined by
– (W/L) of the AND/OR load
– (W/L) of the AND/OR cells
• Minimum delay:
– large load current Iload
– (W/L)OR plane = e * (W/L)AND plane
• Limitations:
– Iload is limited by:
• the total power of the PLA
• the internal logical ‘0’: the voltage Iload * RnMOS that represents ‘0’ must stay below VT
– the stage sizing factor e for successive stages cannot always be realized due to the floorplan
Automatic PLA Layout Generation (1)
Input: boolean equations
logical optimization
truth table = matrix
floorplanner
Output: layout with mask data
Cells: input/output buffer, clock driver, VDD/VSS cells, Schmitt trigger, …
structure of PLA
Automatic PLA Layout Generation (2)
Example: PLA generator input file
PLA adderpla;
INPUT: I1,I2,I3;
OUTPUT: O1,O2;
PRODUCT: P1,P2,P3,P4,P5,P6,P7;
AND_BEGIN
P1 := I1 * I2;
P2 := I1 * I3;
P3 := I2 * I3;
P4 := I1 * I2' * I3';
P5 := I1' * I2 * I3';
P6 := I1' * I2' * I3;
P7 := I1 * I2 * I3;
AND_END
OR_BEGIN
O1 := P1 + P2 + P3;
O2 := P4 + P5 + P6 + P7;
OR_END
Truth table matrix (optimized intermediate result), columns I1 I2 I3 O1 O2:
1 1 X 1 0
1 X 1 1 0
X 1 1 1 0
1 0 0 0 1
0 1 0 0 1
0 0 1 0 1
1 1 1 0 1
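The step from the input file to the truth table matrix can be sketched in a few lines. The dictionaries below restate the AND and OR sections of the example; the parsing helpers (`and_row`, `or_row`) are hypothetical and do not describe the actual PLA generator.

```python
INPUTS = ["I1", "I2", "I3"]
OUTPUTS = ["O1", "O2"]
AND_PLANE = {
    "P1": "I1 * I2", "P2": "I1 * I3", "P3": "I2 * I3",
    "P4": "I1 * I2' * I3'", "P5": "I1' * I2 * I3'",
    "P6": "I1' * I2' * I3", "P7": "I1 * I2 * I3",
}
OR_PLANE = {"O1": "P1 + P2 + P3", "O2": "P4 + P5 + P6 + P7"}


def and_row(expression):
    """One AND-plane row: '1' for a true literal, '0' for a negated one, 'X' if unused."""
    row = {name: "X" for name in INPUTS}
    for literal in (t.strip() for t in expression.split("*")):
        if literal.endswith("'"):
            row[literal[:-1]] = "0"
        else:
            row[literal] = "1"
    return [row[name] for name in INPUTS]


def or_row(product_name):
    """One OR-plane row: '1' where the product term feeds the output."""
    return ["1" if product_name in (t.strip() for t in OR_PLANE[out].split("+")) else "0"
            for out in OUTPUTS]


matrix = [and_row(expr) + or_row(name) for name, expr in AND_PLANE.items()]
for row in matrix:
    print(" ".join(row))
```

The printed rows correspond to the matrix listed above, one row per product term.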
13. Finite State Machines
Integrated Electronic Systems Lab13: FSMs 459
Finite State Machines - Basics
• Finite State Machines (FSMs) can be divided into 2 classes:
– Moore Machines
• The outputs depend only on the current state
• The next state depends on current state and inputs
– Mealy Machines
• The outputs depend on current state and inputs
• The next state depends on current state and inputs
Moore Machines
Characteristics of a Moore Machine:
• Outputs depend only on the current state
• Next state depends on current state and inputs
[Figure: block diagram of a Moore machine. Labels: inputs, logic, next state, state register (clocked by Φ), state, logic, outputs.]
Moore Machines
Alternative implementation of a Moore Machine with registered outputs:
• Outputs still depend only on the current state!
– (but are now calculated from the next-state signal)
– At the rising clock edge, the next state and its corresponding outputs are loaded into the registers
[Figure: block diagram of a Moore machine with registered outputs. Labels: inputs, logic, next state, state register (clocked by Φ), state, logic, output register (clocked by Φ), outputs.]
Mealy Machines
Characteristics of a Mealy Machine:
• Outputs and next state both depend on current state and inputs
[Figure: block diagram of a Mealy machine. Labels: inputs, logic, outputs, next state, state register (clocked by Φ), state.]
Mealy Machines
Implementation of a Mealy Machine with registered outputs
• Note that the required logic would be different from that of a Mealy Machine with unregistered outputs (like the one shown on the previous slide)
[Figure: block diagram of a Mealy machine with registered outputs. Labels: inputs, logic, next state, state register (clocked by Φ), state, output register (clocked by Φ), outputs.]
Table Notation
• FSMs can be represented as a State Transition Table
– The table exactly defines the values for the next state and all outputs (right side of the table) depending on the current state and the inputs (left side)
– Logic functions can be easily derived from the table, e.g.
$S_0' = \overline{S_2}\,\overline{S_1}\,\overline{S_0}\,\overline{a}\,b + \overline{S_2}\,\overline{S_1}\,\overline{S_0}\,a + \dots$
– Current state and next state are encoded binary (in the example: 3 bits)
– “Don’t cares” in the input conditions are indicated by an ‘x’
– In each state, every possible combination of input values should be covered by exactly one line in the table (not more, not less)
current state | inputs | next state | outputs
S2 S1 S0 | a b | S2' S1' S0' | x y
0 0 0 | 0 0 | 0 0 0 | 0 0
0 0 0 | 0 1 | 0 0 1 | 0 0
0 0 0 | 1 x | 1 0 1 | 0 0
0 0 1 | 1 0 | 0 1 0 | 0 1
... | ... | ... | ...
Graph Notation
• FSMs can also be represented as a graph
– Every state is a node in the graph
– Every state transition is an edge (arrow)
• The arrows indicate which state is taken in the next cycle, depending on the inputs and the current state
– State encoding is displayed inside the nodes
[Figure: an initial state and some other state, drawn as nodes containing their binary state encodings (001 and 010), connected by a state-transition arrow that is labeled with its input condition (a Boolean expression).]
Example for a Moore Machine
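Notation: each node shows the current state S1S0 and, below it, the assigned output value x.
[State diagram: nodes 00 (x = 0), 01 (x = 0) and 11 (x = 1). Transitions: 00 → 00 for a = 0, 00 → 01 for a = 1, 01 → 00 for a = 0, 01 → 11 for a = 1, 11 → 00 always.]
current state | inputs | next state | outputs
S1 S0 | a | S1' S0' | x
0 0 | 0 | 0 0 | 0
0 0 | 1 | 0 1 | 0
0 1 | 0 | 0 0 | 0
0 1 | 1 | 1 1 | 0
1 1 | x | 0 0 | 1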
Example for a Mealy Machine
• Because the outputs of a Mealy Machine also depend on the inputs, the values assigned to them are annotated at the transitions
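• The notation is: input condition / output assignment
[State diagram: nodes 00, 01 and 11. Transitions: 00 → 00 with a = 0 / x = 0, 00 → 01 with a = 1 / x = 1, 01 → 00 with a = 0 / x = 0, 01 → 11 with a = 1 / x = 1, 11 → 00 with always / x = 1.]
current state | inputs | next state | outputs
S1 S0 | a | S1' S0' | x
0 0 | 0 | 0 0 | 0
0 0 | 1 | 0 1 | 1
0 1 | 0 | 0 0 | 0
0 1 | 1 | 1 1 | 1
1 1 | x | 0 0 | 1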
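The two example machines can be simulated directly from their transition tables, which makes the difference in output timing visible. The sketch assumes that both machines start in state 00 and that one input value is applied per clock cycle; the dictionary layout and function names are illustrative.

```python
# Moore example: next state from (state, a); output from the state alone.
MOORE_NEXT = {("00", 0): "00", ("00", 1): "01", ("01", 0): "00",
              ("01", 1): "11", ("11", 0): "00", ("11", 1): "00"}
MOORE_OUT = {"00": 0, "01": 0, "11": 1}

# Mealy example: (state, a) -> (next state, output x).
MEALY = {("00", 0): ("00", 0), ("00", 1): ("01", 1), ("01", 0): ("00", 0),
         ("01", 1): ("11", 1), ("11", 0): ("00", 1), ("11", 1): ("00", 1)}


def run_moore(inputs, state="00"):
    outputs = []
    for a in inputs:
        outputs.append(MOORE_OUT[state])   # output depends on the state only
        state = MOORE_NEXT[(state, a)]
    return outputs


def run_mealy(inputs, state="00"):
    outputs = []
    for a in inputs:
        state_next, x = MEALY[(state, a)]  # output depends on state and input
        outputs.append(x)
        state = state_next
    return outputs


stimulus = [1, 0, 1, 1, 0]
print(run_moore(stimulus))  # [0, 0, 0, 0, 1]
print(run_mealy(stimulus))  # [1, 0, 1, 1, 1]
```

The Mealy outputs react to the input within the same cycle, whereas the Moore output only changes once state 11 has been reached.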
State Encoding
• The encoding of the states plays a key role for the implementation of a FSM
– It influences the complexity of the logic functions, the hardware costs of the circuits, timing issues, power, etc.
• Therefore, several common coding styles with different features exist
– regular encoding
– „one hot“ encoding
– ...
• The optimum choice depends on the used technology (ASIC, PLA, FPGA, etc.) as well as on the given design goals
State Encoding
• Regular Encoding
– The minimum number of bits is used to encode the states
• At least N bits are required to encode up to 2^N states
– Codes can be assigned to states arbitrarily or according to certain rules (e.g., in order to minimize complexity of the logic)
– Advantages:
• Minimum number of flipflops required
– Disadvantages:
• Due to the compactness of the state encoding, the logic functions for calculating the next state and the outputs can become more complex
• On average, many bits switch when the state changes
– Higher power consumption
– Glitches can occur
State Encoding
• One Hot Encoding
– N bits are used to encode N states
• In each state, exactly one bit is ‘1’, all others are ‘0’
• hence the name “one hot” encoding
– Advantages:
• In many cases, less logic is required
– many small logic functions are used instead of few complex functions
– particularly advantageous for FPGA implementations
• Low switching activity, resulting in
– lower power consumption
– fewer glitches
– Disadvantages:
• The number of required flipflops grows linearly with the number of states
– High hardware costs for large FSMs
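The flipflop counts of the two coding styles can be compared with a short sketch. The helper names are illustrative, and assigning the regular codes in plain counting order is only one of the arbitrary assignments mentioned above.

```python
def regular_codes(num_states: int) -> list[str]:
    """Regular (binary) encoding: the minimum number of bits, codes in counting order."""
    width = max(1, (num_states - 1).bit_length())
    return [format(i, f"0{width}b") for i in range(num_states)]


def one_hot_codes(num_states: int) -> list[str]:
    """One-hot encoding: one bit (flipflop) per state, exactly one bit set."""
    return [format(1 << i, f"0{num_states}b") for i in range(num_states)]


for n in (4, 5, 16):
    print(n, len(regular_codes(n)[0]), len(one_hot_codes(n)[0]))
```

For 16 states the regular encoding needs 4 flipflops and the one-hot encoding 16, which illustrates the linear growth noted under the disadvantages.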
State Encoding
• One Hot Encoding – Implementation Aspects
– Best suited for distributed implementation
• One flipflop for each state
• One small transition logic for each flipflop
– Each flipflop can be used to directly activate some other hardware block or logic function that is only needed in this state
– From an abstract point of view, all N flipflops together can also be seen as one single state register of size N
[Figure: four logic/flipflop (FF) pairs whose outputs form the current state; one flipflop output drives the enable input of some specific functional block.]
Examples for State Encoding
[Figure: a four-state FSM drawn twice, once with one-hot encoding (state codes 0001, 0010, 0100, 1000) and once with regular encoding (state codes 00, 01, 10, 11). The following four slides repeat this figure and step through the four states, showing the flipflop values for each encoding.]
14. ASIC Design Concepts: Gate Arrays & Standard Cells
Integrated Electronic Systems Lab14: Gate Arrays 478
Cost Issues
• Design Costs
• Non-recurring Engineering Costs (NRE)
• Manufacturing Costs
[Figures: total costs and costs per chip, each plotted against the number of manufactured chips; design + NRE costs = fixed costs.]
Cost Issues: Design Costs
Design Costs reduced by
• raising level of abstraction
• re-use
• powerful synthesis methods
Cost-affecting Decisions:
• System Level:
– System architecture
– Communication architecture
• Block-Level:
– appropriate modeling of control-dominated and data path oriented components
Synthesis:
• High-level Synthesis (allocation, scheduling, binding)
• Logic Synthesis (RTL to logic translation, FSM synthesis, logic optimisation, retiming)
• Layout Synthesis (module generators, PLA generators, Place & Route)
Cost Issues: Manufacturing Costs
...depending on Design Style:
[Figure: classification of ASIC design styles. Labels: ASIC (synthesized); full custom and semi custom; cell-based (standard cells, macro cells) and array-based (gate arrays, FPGAs/PLDs); gate arrays are highlighted.]
Gate Arrays – Introduction (1)
Gate Arrays (Masterslices):
• Prefabricated active elements (master)
• Construction of logic functions by personalization (wiring macros from a cell library, intra-cell routing)
• Connection of functional blocks by inter-cell routing in 1...3 layers plus contact/via layers
• Arrangement of gate arrays:
– row structure
– island structure
– matrix of structures (= sea of gates)
• Mixed analog/digital gate arrays
Gate Arrays – Introduction (2)
Gate array floor plan with row structure
Gate Arrays – Introduction (3)
Floor plan for a sea of gates array
Gate Array Design Flow
Qualification of Gate Array Design Style
• Advantages:
– Lower number of individual masks needed
– Higher number of pieces for uncustomized master (cost reduction)
– Many others for masters, second source fabrication, libraries and design systems
• Disadvantages:
– Area overhead (by unused transistor cells)
– Overdimensioned routing channels
– Larger cell size
Advantages dominate for smaller production volumes
Costs: Full Custom vs. Gate Array
• Gate Arrays: Reduction of fixed costs (reduced mask costs)
• Increased per-piece costs: since the utilisation of transistors is not optimal, the chip area is larger and the yield lower, which implies higher cost
[Figures: total costs and costs per chip versus the number of manufactured chips, comparing full custom and gate array; design + NRE costs = fixed costs.]
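The trade-off between fixed costs and per-piece costs can be illustrated with a simple linear cost model. All numbers below are hypothetical and chosen only to show a break-even point; they are not taken from the lecture.

```python
def total_cost(fixed: float, per_chip: float, volume: int) -> float:
    """Total costs = fixed costs (design + NRE) + per-chip manufacturing costs * volume."""
    return fixed + per_chip * volume


def cost_per_chip(fixed: float, per_chip: float, volume: int) -> float:
    """Fixed costs are shared among all manufactured chips."""
    return fixed / volume + per_chip


def break_even_volume(fixed_a: float, per_chip_a: float,
                      fixed_b: float, per_chip_b: float) -> float:
    """Volume at which design styles A and B have the same total costs."""
    return (fixed_a - fixed_b) / (per_chip_b - per_chip_a)


# Hypothetical figures: full custom has high fixed costs and cheap chips,
# the gate array has low fixed costs and more expensive chips.
FULL_CUSTOM = {"fixed": 1_000_000.0, "per_chip": 5.0}
GATE_ARRAY = {"fixed": 200_000.0, "per_chip": 13.0}

volume = break_even_volume(FULL_CUSTOM["fixed"], FULL_CUSTOM["per_chip"],
                           GATE_ARRAY["fixed"], GATE_ARRAY["per_chip"])
print(volume)  # 100000.0
```

Below the break-even volume the gate array is cheaper in this model; above it, full custom wins.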
Standard Cells
• Standard cell libraries are required by almost all CAD tools for chip design
• Standard cell libraries contain primitive cells required for digital design
• However, more complex cells that have been specially optimized can also be included
• The main purpose of the CAD tools is to implement the so called RTL-to-GDS flow
• The input to the design process, in most cases, is the circuit description at the register-transfer level (RTL)
• The final output from the design process is the full chip layout, mostly in the GDSII (gds2) format
• Producing a functionally correct design that meets all the specifications and constraints requires a combination of different tools in the design flow
• These tools require specific information in different formats
Standard Cell Library Formats
• The formats explained here are for Cadence tools; however, similar information is required for other tool suites.
• Physical Layout (GDSII, Virtuoso Layout Editor)
– Should follow specific design standards, e.g. constant height, offsets etc.
• Logical View (Verilog description or TLF or LIB)
– Verilog is required for dynamic simulation. Place and route tools usually can use TLF.
– Verilog description should preferably support back annotation of timing information.
• Abstract View (Cadence Abstract Generator, LEF)
– LEF: Contains information about each cell as well as technology information
• Timing, power and parasitics (TLF or LIB)
– Transistor and interconnect parasitics are extracted using Cadence or other extraction tools.
– A Spice or Spectre netlist is generated and detailed timing simulations are performed.
– Power information can also be generated during these simulations.
– Data is formatted into a TLF or LIB file including process, temperature and supply voltage variations.
– Logical information for each cell is also contained in this file.
Standard Cell Design Flow
Standard Cell Layout
• Routing Grids
• Both vertical and horizontal routing grids need to be defined
• HVH or VHV routing is defined for alternating metal layers
• All standard cell pins should ideally be placed on intersection of horizontal and vertical routing grids
• Exceptions are abutment type pins (VDD and GND)
• Grids are defined with respect to the cell origin
• Grids can be offset from the origin, but only by exactly half the grid spacing
• The cell height must be a multiple of the horizontal grid spacing
• All cells must have the same height, but some complex cells can be designed with double height
• The cell width must be a multiple of the vertical grid spacing
• However, limited routing tracks are the bottleneck even with wider cells
Standard Cell Layout
Standard Cells
Standard Cell Layout
Standard Cell Example: Layout of Inverter
Standard Cell Example: Layout of NAND2
Standard Cell Library
• Cell libraries determine the overall performance of the synthesized logic
• Synthesis engines rely on a number of factors for optimization
• The cell library should be designed specifically for the synthesis approach
• Here are some guidelines:
– A variety of drive strengths for all cells
– Larger varieties of drive strengths for inverters and buffers
– Cells with balanced rise and fall delays (for clock tree buffers/gated clocks)
– Same logical function and its inversion as separate outputs, within same cell
– Complex cells
– High fanin cells
Standard Cell Library
– Variety of flip-flops, both positive and negative edge triggered, preferably with multiple drive strengths
– Single or Multiple outputs available for each flip-flop (e.g. Q only, or Qbar only or both), preferably with multiple drive strengths
– Flops to contain different inputs for Set and Reset (e.g. Set only, Reset only, both)
– Variety of latches, both positive and negative level sensitive
– Several delay cells. Useful for fixing hold time violations
– To enable scan testing of the designs, each flip-flop should have an equivalent scan flop
• Using high fan-in cells reduces the overall cell area, but may cause routing congestion and thereby inadvertently degrade timing. Therefore they should be used with caution
15. Programmable Logic Devices
Integrated Electronic Systems Lab15: PLDs 499
Overview
• Introduction
• Programming Technologies
• Basic Programmable Logic Device (PLD) Concepts
• Complex PLD
• Field Programmable Gate Array (FPGA)
• CAD (Computer Aided Design) for FPGAs
• Design flow for Xilinx FPGAs
• Economical Considerations
• Logic design Alternatives
Introduction
• A Programmable Logic Device is an integrated circuit with internal logic gates and interconnects. These gates can be connected to obtain the required logic configuration.
• The term “programmable” means that the configuration of the internal logic and interconnects can be changed, either in hardware or in software.
• The configuration of the internal logic is done by the user.
• PROM, EPROM, PAL, GAL etc. are examples of Programmable Logic Devices.
Programming Technologies
A Programmable Logic Device can be programmed in two ways:
1. Mask programming (in a few cases)
2. Field programming (typical)
1.) Mask programming: the device is programmed at the mask level.
+ good timing performance due to internal connections hardwired during manufacture
+ cheap at high volume production
- programmed by manufacturer
- development cycle = weeks or months
- not re-programmable
Programming Technologies (II)
2.) Field programming: the device is programmed by the user. There are two types of programming technology:
Permanent type (non-volatile):
• Fuse (normally on): ‘CLOSE (intact)’ → ‘OPEN (blown)’
• Anti-fuse (normally off): just the opposite of a fuse
• EPROM
• EEPROM
Nonpermanent type (volatile):
• driving an n-MOS pass transistor by SRAM
• NOTE: When the power of the device is switched off, the content of the SRAM is lost.
Basic PLD Concepts
1.) PLA (Programmable Logic Array):
• both the AND array and the OR array are programmable
• product term sharing: every product term of the AND array can be connected to the input of any OR gate
• unidirectional input/output pins
Figure 1: PLA device
Basic PLD Concepts (II)
2.) Memory based: Device with fixed AND array and programmable OR array
• output of OR gate has fixed connection with input of AND gates
• PROM, EPROM and EEPROM are memory-based PLD devices
3.) PAL/GAL (Programmable Array Logic / Generic Array Logic):
The AND array is programmable and the OR array has fixed connections to the outputs of the AND gates. PAL/GAL devices may have bi-directional I/O pins.
There are three different types of PAL/GAL devices:
• combinational PAL devices are used for the implementation of logic functions
• sequential PAL devices are used for the implementation of sequential logic (finite state machines)
• arithmetic PAL devices: sums of product terms may be combined by XOR gates at the input of the macrocell D flip-flop
Basic PLD Concepts (IV)
Additional features of PAL/GAL devices
• PAL:
- EPROM-based programming technology
• GAL:
- has an array of programmable AND gates and OLMCs (Output Logic Macro Cells)
- EEPROM-based programming technology
- programmable output polarity
- device can be configured in dedicated input and output mode
Figure 2: Combinational PAL device, AMD PAL16L8
Figure 3: Sequential PAL device, AMD PAL16R8
Figure 4: Arithmetic PAL device, AMD PAL16A4
Figure 5: GAL device, GAL 16V8
• the GAL16V8 has 8 configurable OLMCs (Output Logic Macro Cells)
• each OLMC has a programmable XOR gate to produce an active-low or active-high output signal
• there is a feedback from output to input
• a CPLD is a combination of multiple PAL- or GAL-type devices on a single chip
• a CPLD architecture consists of
- Macrocells
- configurable flip-flop (D, T, JK or SR)
- Output enable/clock select
- Feedback select
• a CPLD has predictable delays because of its hierarchical interconnect
• easy to route, very fast turnaround
• performance independent of netlist
• devices are erasable and programmable, with non-volatile EPROM or EEPROM configuration
• wide designer acceptance
• higher logic density than any classical PLD
• relatively mature technology, but some innovation still ongoing
Complex PLD (CPLD)
Figure 6: Complex PLD device Altera EP1800
Complex PLD (II)
• The EP1800 is an erasable PLD device with 48 macrocells, 16 dedicated input pins and 48 I/O pins.
• The device is divided into four quadrants; each contains 12 macrocells and has a local bus with 24 lines and a local clock
• of the 12 macrocells in a quadrant, 8 are “local” macrocells and 4 are “global” macrocells
Figure 7: Local macrocell Figure 8: Global macrocell
Erasable CPLD
• the global bus has 64 lines and runs through all four quadrants: true and complement signals of 12 inputs (= 24 lines) + true and complement of 4 clocks (= 8 lines) + true and complement of the I/O pins of the 4 global macrocells in each quadrant (= 32 lines)
• macrocells: combinational or registered data output; the flip-flop is configurable as D, T, JK or SR type.
Erasable CPLD (II)
Figure 9: Synchronous clock, output enable by product term
Figure 10: Asynchronous clock, output permanently enabled
Figure 11: Block diagram of Altera MAX 7000 family
Electrically Erasable PLD
• MAX 7000 is an EEPROM-based programmable logic device
• its architecture includes the following elements:
- Logic Array Blocks (LABs)
- Macrocells
- Programmable Interconnect Array (PIA)
- I/O control blocks
• pin-to-pin delay is about 5 ns
• predictable delay because of the hierarchical routing structure of the PIA
Figure 12: MAX 7000 device, macrocell
Electrically Erasable PLD (II)
• each Logic Array Block (LAB) has 16 macrocells
• each macrocell consists of a logic array, a product term select matrix and a programmable register
• the product term select matrix allocates product terms from the logic array, either as primary logic inputs to the OR and XOR gates or as secondary inputs to the clear, preset, clock and clock enable control functions of the macrocell register
Figure 13: MAX 7000 device, Programmable Interconnect Array (PIA)
Electrically Erasable PLD (III)
• logic is routed among LABs via the PIA.
• dedicated inputs, I/O pins, and macrocell outputs feed the PIA, which makes the signals available throughout the entire device
• only the signals required by each LAB are actually routed from the PIA into the LAB
• the selection of a signal from the PIA into a LAB is controlled by an EEPROM cell
Field Programmable Gate Array
• An FPGA is a general-purpose, multi-level programmable logic device
• An FPGA is composed of:
- logic blocks to implement combinational and sequential logic circuits
- programmable interconnect wires to connect the inputs and outputs of the logic blocks
- I/O blocks: logic blocks at the periphery of the device for external connections
• “The routing resources are both the greatest strength and weakness of the FPGAs”
Field Programmable Gate Array (II)
Figure 14: Symmetrical array architecture of FPGAs
• There are four main categories of FPGAs available commercially,
- symmetrical array
- row - based
- hierarchical PLD
- sea of gates
• They differ from each other in their interconnect and in how they are programmed
Field Programmable Gate Array (III)
Figure 15: Category of different FPGA
Programming Technologies
• Currently, there are four programming technologies for FPGAs,
- static RAM cells
- anti fuse
- EPROM transistor
- EEPROM transistor
Static RAM programming technology:
a) pass transistor b) transmission gate c) multiplexer
Figure 16: SRAM based programming technology
• completely reusable - no limit concerning re-programmability
• pass gate closes when a “1” is stored in the SRAM cell
• allows iterative prototyping
• volatile memory - power must be maintained
• large area - five transistor SRAM cell plus pass gate
• memory cells distributed throughout the chip
• fast re-programmability (tens of milliseconds)
• only standard CMOS process required
SRAM Programming technology
• An anti-fuse is the opposite of a normal fuse.
• Anti-fuses are made with a modified CMOS process having an extra step
• This step creates a very thin insulating layer which separates two conducting layers
• The thin insulating layer is fused by applying a high voltage across the conducting layers
• Such a high voltage can be destructive for CMOS logic circuits
• Non-volatile (permanent)
• Requires extra programming circuitry, including a programming transistor
Anti-fuse Programming
Actel PLICE Anti-fuse programming technology
Figure 17: Actel PLICE anti-fuse structure
• The Actel PLICE anti-fuse consists of a layer of heavily n-doped silicon (n+ diffusion), a dielectric layer (oxide-nitride-oxide, ONO) and a layer of polysilicon
• it is programmed by placing a relatively high voltage (18 V) across the anti-fuse terminals, which results in a current of about 5 mA through it
• typical resistance of a fused contact is 300 to 500 Ω
• requires 3 additional masks on top of a normal CMOS process
Quicklogic ViaLink Anti fuse programming technology
Figure 18: Four-layer metal ViaLink structure
Figure 19: ViaLink element
• amorphous silicon is used as an insulating layer
• direct metal-to-metal contact results in a path resistance below 50 Ω
• 10 V terminal voltage is required to fuse the amorphous silicon
EEPROM programming technology
• static charge on the floating gate turns the transistor permanently off
• re-programmable
• non-volatile
• external permanent memory is not required
• slow re-configuration time
• floating-gate FET has a relatively high on-resistance
• higher static power consumption due to the pull-up resistor
Figure 20: EEPROM programming technology
Commercially available FPGAs
Xilinx FPGA
Figure 21: General architecture of Xilinx FPGA
• The Xilinx architecture comprises a two-dimensional array of logic blocks called CLBs.
• They are interconnected via horizontal and vertical routing channels
• I/O blocks are user-configurable and provide an interface between the external package pins and the internal logic
• Each I/O can be configured as an input, output or bi-directional signal
Figure 22: Xilinx XC4000 CLB
Xilinx FPGA (II)
• Xilinx XC4000 is an SRAM based FPGA
• each CLB has three LUTs (Look-Up Tables) and two flip-flops.
• the result of the combinatorial logic is stored in 16x1 SRAM LUTs
• LUTs can also be used as RAM
• the combinatorial result of a CLB is passed to the interconnect network directly, or stored in flip-flops and then passed to the interconnect network
• with the two stages of LUTs, two functions of 4 variables or one function of 5 variables can be implemented
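The LUT principle can be illustrated with a small Python sketch. It assumes the usual XC4000 arrangement of two 4-input LUTs (called F and G here) feeding a 3-input LUT (H); the helper names `make_lut`, `read_lut` and `map_five_input_function` are hypothetical and only serve the illustration. A 5-variable function is split by Shannon expansion on the fifth variable.

```python
from itertools import product

def make_lut(func, n_inputs):
    """Store the truth table of func as 2**n_inputs one-bit entries."""
    return [func(*bits) & 1 for bits in product((0, 1), repeat=n_inputs)]

def read_lut(table, *bits):
    """Read the LUT: the input bits form the address (first bit = MSB)."""
    address = 0
    for bit in bits:
        address = (address << 1) | bit
    return table[address]

def map_five_input_function(f):
    """Split f(a, b, c, d, e) over two 4-input LUTs and one 3-input LUT."""
    lut_f = make_lut(lambda a, b, c, d: f(a, b, c, d, 0), 4)  # cofactor e = 0
    lut_g = make_lut(lambda a, b, c, d: f(a, b, c, d, 1), 4)  # cofactor e = 1
    # Second stage: H selects between the two cofactors, depending on e.
    lut_h = make_lut(lambda f_out, g_out, e: g_out if e else f_out, 3)

    def evaluate(a, b, c, d, e):
        f_out = read_lut(lut_f, a, b, c, d)
        g_out = read_lut(lut_g, a, b, c, d)
        return read_lut(lut_h, f_out, g_out, e)

    return evaluate, (lut_f, lut_g, lut_h)

# Example: 5-input parity mapped onto the three LUTs.
parity5, luts = map_five_input_function(lambda a, b, c, d, e: a ^ b ^ c ^ d ^ e)
print(parity5(1, 0, 1, 1, 0))
```

Each 4-input LUT holds 16 one-bit entries, which matches the 16x1 SRAM organisation mentioned above.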
Figure 24: Switch matrix
Figure 23: Programmable interconnect associated with XC4000 series CLB
Xilinx FPGA (III)
Horizontal longlines
Single length lines
Double length lines
Xilinx FPGA (IV)
• the interconnect of XC4000 devices is arranged in horizontal and vertical channels
• each channel contains a number of wire segments of the following types:
Single-length lines:
• they span a single CLB
• provide the highest interconnect flexibility and offer fast routing
• incur a delay whenever the line passes through a switch matrix
• they are not suitable for routing signals over long distances
Double-length lines:
• they span two CLBs, so that each line is twice as long as a single-length line
• provide faster signal routing over intermediate distances
Longlines:
• longlines form a grid of metal interconnect segments that run the entire length or width of the array
• they are intended for high fan-out nets and nets with critical delay
Xilinx, Virtex-II Pro™ FPGA family
• The Virtex-II Pro Platform FPGA is the most technically sophisticated silicon and software product development in the history of the programmable logic industry.
• The Virtex-II Pro FPGAs are manufactured in a 0.13-micron process.
• It is capable of implementing high-performance System-on-a-Chip designs at low development cost
• It can be used in applications such as system architectures for networking, deeply embedded systems and digital signal processing systems.
• Virtex-II Pro devices incorporate one to four PowerPC 405 processor cores. The PowerPC 405 cores are fully embedded within the FPGA, where all processor nodes are controlled by the FPGA routing resources.
• Each PowerPC 405 core is capable of more than 300 MHz clock frequency.
Xilinx, Virtex-II Pro™ FPGA family (II)
Figure 25: Virtex-II Pro Generic Architecture Overview
• The Virtex-II Pro FPGA consists of the following components:
- Embedded Rocket I/O™ Multi-Gigabit Transceivers (MGTs)
- Processor Blocks containing embedded IBM ® PowerPC ® 405 RISC CPU (PPC405) cores and integration circuitry
- FPGA fabric based on Virtex- II architecture.
Xilinx, Virtex-II Pro™ FPGA family (III)
• Each CLB (Configurable Logic Block) includes four slices and two 3-state buffers
• All slices are equivalent and each contains:
- Two function generators (F & G)
- Two storage elements
- Arithmetic logic gates
- Large multiplexers
- Wide function capability
- Fast carry look-ahead chain
- Horizontal cascade chain (OR gate)
Figure 26: CLB (Configurable Logic Block) of Virtex-II Pro FPGA
Xilinx, Virtex-II Pro™ FPGA family (IV)
• IOB blocks include six storage elements, as shown in Figure 27.
• Each storage element can be configured either as an edge-triggered D-type flip-flop or as a level-sensitive latch.
• On the input, output, and 3-state path, one or two DDR (Double Data Rate) registers can be used.
• Double data rate operation is achieved directly by the two registers on each path, clocked by the rising edges (or falling edges) of two different clock nets.
Figure 27: IOB block of Virtex-II Pro FPGA
Actel/TI FPGA architecture
Figure 28: General architecture of Actel FPGA
• Actel offers three main families:
- Act 1, Act 2, Act 3
• programmable logic blocks are arranged in rows
• horizontal routing channels run between adjacent rows
• Actel FPGAs are based on anti-fuse technology
• instead of LUTs, the logic blocks use multiplexers
Actel/TI FPGA architecture (II)
Act-1 Logic Module:
• The Act-1 logic module is a logic circuit with 8 inputs and 1 output
• it contains only combinatorial logic
• The logic module can implement the four basic functions NAND, AND, NOR and OR
Figure 29: Act-1 logic module
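How a multiplexer-based module yields the four basic functions can be sketched in Python. The structure below (two first-level 2:1 multiplexers, and an output 2:1 multiplexer whose select is the OR of two inputs) is an assumption based on how the Act-1 module is commonly described; the figure itself is not reproduced here, and all signal names are hypothetical.

```python
def mux2(sel, d0, d1):
    """2:1 multiplexer: returns d1 when sel is 1, otherwise d0."""
    return d1 if sel else d0

def act1_module(a0, a1, sa, b0, b1, sb, s0, s1):
    """Assumed Act-1 style module: 8 inputs, 1 output, combinatorial only."""
    first = mux2(sa, a0, a1)
    second = mux2(sb, b0, b1)
    return mux2(s0 | s1, first, second)

# The basic gates are obtained by tying unused inputs to constants.
def and2(a, b):
    return act1_module(0, 0, 0, 0, 1, b, a, 0)   # a ? b : 0

def or2(a, b):
    return act1_module(0, 0, 0, 1, 1, 0, a, b)   # (a or b) ? 1 : 0

def nand2(a, b):
    return act1_module(1, 1, 0, 1, 0, b, a, 0)   # a ? (not b) : 1

def nor2(a, b):
    return act1_module(1, 1, 0, 0, 0, 0, a, b)   # (a or b) ? 0 : 1

for a in (0, 1):
    for b in (0, 1):
        print(a, b, and2(a, b), or2(a, b), nand2(a, b), nor2(a, b))
```

In an anti-fuse device the constant inputs would be fixed by programming the interconnect, not by changing the module itself.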
Actel/TI FPGA architecture (III)
Figure 30: Act-2 logic module
C module
S module
Act-2 Logic Module:
• The Act-2 family has a two-module architecture, consisting of the C module (combinatorial) and the S module (sequential)
• the logic module is optimized for both combinatorial and sequential designs
Act-3 Logic Module:
• it comprises an AND gate and an OR gate that are connected to a multiplexer-based circuit block.
• The multiplexer circuit is arranged such that, in combination with the two logic gates, a very wide range of functions can be realized in a single logic block
• about half of the logic blocks in an Act-3 device also contain a flip-flop
Figure 31: Act-3 Logic module
Actel/TI FPGA architecture (IV)
Figure 32: Act-1 programmable interconnection architecture
Actel/TI FPGA architecture (V)
CAD for FPGAs
Figure 33: Design flow for FPGA
Initial Design Entry → Logic Optimization → Technology Mapping → Placement → Routing → Programming Unit → Configured FPGA
DESIGN IMPLEMENTATION
Design Entry
Design validation
Device Selection
Design Synthesis Optimization
Mapping
Placement
Routing
Design validation/ Back Annotation
Bitstream generation
Download to Xilinx FPGA
Design validation
Design flow for Xilinx FPGA
Economical Considerations
Figure 34: Cost per Chip
Economical Considerations (I)
FPGA:
1. Cost per chip is lower for low volumes (low fixed cost)
2. Short turnaround time
3. Design flexibility is high and the cost of re-designing is low
4. Speed is relatively low because of the resistance and capacitance of the programmable switches
5. Programmable switches and the configuration network require chip area, which reduces the logic density
MPGA (mask-programmable gate array):
1. Lower cost per chip for high volumes
2. Fabrication is done with a hardwired metal connection layer, which results in fast operation
3. High logic density
4. Very high costs for low volumes (high fixed cost)
5. No redesign flexibility
Logic design Alternatives
                         | SSI and MSI ICs | PLDs           | Programmable gate arrays | Gate arrays  | Custom ICs
Chip complexity          | small           | medium         | medium                   | large        | ultra large
Speed                    | Fast            | Slow to medium | Slow to medium           | Slow to fast | Fast
Function defined by user | No              | Yes            | Yes                      | Yes          | Yes
Time to customize        | -               | Seconds        | Seconds                  | Months       | Year
User programmable        | No              | Yes            | Yes                      | No           | No
Logic design Alternatives (I)
Figure 35: Relative merits of various ASIC implementation styles
CPLDs and FPGAs
             | Complex Programmable Logic Device (CPLD) | Field-Programmable Gate Array (FPGA)
Architecture | More combinational                       | Gate array-like, more registers + RAM
Density      | Low to medium                            | Medium to high
Performance  | Predictable timing                       | Application dependent
Interconnect | “Crossbar switch”                        | Incremental
Integrated Electronic Systems Lab
16. Arithmetic Units
Integrated Electronic Systems Lab16: Arithmetic Units 548
Basic Adder Cells
Adders / Subtracters
• Half Adder:
• Can be used to calculate the sum of two bits A1 and A2:
$C = A_1 A_2$
$S = A_1 \oplus A_2$
• Full Adder:
• For adding binary numbers with a bit width of more than a single bit:
$C_{out} = C_{in}(A_1 + A_2) + A_1 A_2$
$S = A_1 \oplus A_2 \oplus C_{in}$
• These equations can be realized either by logic gates (AND, OR, XOR) or by two half-adders and an OR gate.
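The statement that a full adder can be built from two half-adders and an OR gate can be checked with a short Python sketch (the function names are chosen for illustration only). The exhaustive loop compares the construction with the full-adder equations.

```python
def half_adder(a1, a2):
    """Return (sum, carry) of two bits: S = A1 xor A2, C = A1 and A2."""
    return a1 ^ a2, a1 & a2

def full_adder(a1, a2, c_in):
    """Full adder built from two half-adders and one OR gate."""
    s1, c1 = half_adder(a1, a2)
    s, c2 = half_adder(s1, c_in)
    return s, c1 | c2

# Compare with C_out = C_in (A1 + A2) + A1 A2 and S = A1 xor A2 xor C_in.
for a1 in (0, 1):
    for a2 in (0, 1):
        for c_in in (0, 1):
            s, c_out = full_adder(a1, a2, c_in)
            assert s == a1 ^ a2 ^ c_in
            assert c_out == (c_in & (a1 | a2)) | (a1 & a2)
```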
Adders / Subtracters for Binary Coded Integers
Serial Adders
• The n-bit sum and the carry output are available after (n+1) clock cycles (1 operand load, n calculations).
• The serial adder has the smallest hardware complexity (wordlength independent if the shift registers are not considered) but requires the highest computation time of all adder implementations.
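A behavioural sketch of the serial adder in Python, assuming LSB-first processing, one full-adder cell and one carry flip-flop; the shift registers are modelled simply as integers and `serial_add` is a hypothetical name.

```python
def serial_add(a, b, n):
    """Add two n-bit unsigned integers bit-serially, LSB first.

    Returns (n-bit sum, carry out, number of clock cycles).
    """
    carry = 0      # carry flip-flop, cleared while the operands are loaded
    total = 0
    cycles = 1     # one cycle to load the operands
    for i in range(n):
        a_bit = (a >> i) & 1
        b_bit = (b >> i) & 1
        total |= (a_bit ^ b_bit ^ carry) << i
        carry = (a_bit & b_bit) | (carry & (a_bit | b_bit))
        cycles += 1  # one full-adder evaluation per clock cycle
    return total, carry, cycles

print(serial_add(11, 6, 4))
```

The cycle count returned for n bits is n + 1, in line with the figure given above.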
Adders / Subtracters for Binary Coded Integers
Parallel Adders
• Ripple Carry Adder:
• Chained full-adders where the carry „ripples“ through the whole chain from the LSB to the MSB.
• The addition time depends on the wordlength of the operands.
Parallel Adders
• Carry Lookahead Adder:
• The carry input of a stage i is calculated directly from the inputs of the preceding stages i-1, i-2, ... i-k.
• The Cout of ordinary full adders is substituted by the generate and propagate signals:
$g_i = a_i b_i$, $p_i = a_i + b_i$
• The carry input of stage i+1 is defined by:
$c_{in,i+1} = c_i = g_i + p_i c_{i-1}$
• Example (4 bit adder):
$c_{in,1} = c_0 = g_0 + p_0 c_{in}$
$c_{in,2} = c_1 = g_1 + p_1 g_0 + p_1 p_0 c_{in}$
$c_{in,3} = c_2 = g_2 + p_2 g_1 + p_2 p_1 g_0 + p_2 p_1 p_0 c_{in}$
$c_{out} = c_3 = g_3 + p_3 g_2 + p_3 p_2 g_1 + p_3 p_2 p_1 g_0 + p_3 p_2 p_1 p_0 c_{in}$
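The expanded carry equations of the 4-bit example can be evaluated directly. The following Python sketch (function name `cla4` is hypothetical) computes every carry as a two-level AND/OR expression of the generate and propagate signals and checks the result against ordinary integer addition.

```python
def cla4(a, b, c_in=0):
    """4-bit carry lookahead addition; returns (4-bit sum, carry out)."""
    g = [(a >> i) & (b >> i) & 1 for i in range(4)]      # g_i = a_i and b_i
    p = [((a >> i) | (b >> i)) & 1 for i in range(4)]    # p_i = a_i or b_i
    # Every carry depends only on g, p and c_in (no rippling).
    c0 = g[0] | (p[0] & c_in)
    c1 = g[1] | (p[1] & g[0]) | (p[1] & p[0] & c_in)
    c2 = g[2] | (p[2] & g[1]) | (p[2] & p[1] & g[0]) | (p[2] & p[1] & p[0] & c_in)
    c3 = (g[3] | (p[3] & g[2]) | (p[3] & p[2] & g[1])
          | (p[3] & p[2] & p[1] & g[0]) | (p[3] & p[2] & p[1] & p[0] & c_in))
    carries_in = [c_in, c0, c1, c2]
    s = 0
    for i in range(4):
        s |= (((a >> i) ^ (b >> i) ^ carries_in[i]) & 1) << i
    return s, c3

for a in range(16):
    for b in range(16):
        for c in (0, 1):
            s, c_out = cla4(a, b, c)
            assert s + 16 * c_out == a + b + c
```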
• The carry lookahead circuits can be realized by a two level logic implementation: the addition is performed in constant time.
• Carry lookahead adder for 4 bits:
• The number of gate inputs (the wordlength) is restricted due to technological constraints.
• Clustered Carry Lookahead Adder:
• Big wordlengths are split into smaller groups processed by carry lookahead adders with reasonable length.
• The carry ripples through different blocks as in the carry ripple adder.
• Alternative: group-generate and group-propagate signals can be generated and then evaluated by a second-level carry lookahead circuit.
• Carry Select Adder:
• Carry Select Adder:
– The additions are performed in each cluster in parallel for the following cases:
• Carry in is „0“
• Carry in is „1“
– Cluster carry out and partial sum C/Sum[i:j] are forwarded to multiplexors.
– The multiplexors select the appropriate value depending on the carry output of the preceding stages.
– The overall addition time is almost independent of the wordlength.
– The hardware amount is almost twice that of a ripple carry adder.
– It is slower than a carry lookahead adder.
– Has a higher regularity, thus better suited for VLSI implementation.
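The selection scheme can be sketched in Python as follows. The cluster size of 4 bits and the names `block_add` and `carry_select_add` are assumptions for illustration; the addition inside a cluster is modelled with integer arithmetic, standing in for a small ripple carry adder.

```python
def block_add(a_blk, b_blk, c_in, width):
    """Stand-in for a small adder: returns (sum, carry out) of one cluster."""
    total = a_blk + b_blk + c_in
    return total & ((1 << width) - 1), total >> width

def carry_select_add(a, b, width=16, block=4):
    """Carry-select addition: both carry cases are precomputed per cluster."""
    mask = (1 << block) - 1
    result, carry = 0, 0
    for low in range(0, width, block):
        a_blk = (a >> low) & mask
        b_blk = (b >> low) & mask
        # In hardware these two additions run in parallel in every cluster.
        sum0, c0 = block_add(a_blk, b_blk, 0, block)   # case: carry in is 0
        sum1, c1 = block_add(a_blk, b_blk, 1, block)   # case: carry in is 1
        # Multiplexer: the carry of the preceding cluster selects the result.
        s, carry = (sum1, c1) if carry else (sum0, c0)
        result |= s << low
    return result, carry

print(carry_select_add(0x1234, 0x0FCD))
```

Only the multiplexer selection depends on the preceding cluster, which is why the additions themselves do not have to wait for the carry.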
• Carry Save Adder:
Example for 4 operands (V, W, X, Y):
• Carry Save Adder:
– Achieves constant addition time complexity.
– The propagation of computed carry results is avoided.
– S and Cout are connected to the correct adder in the succeeding stage.
– Requires a final addition to merge the sum and the carry vector of the final stage (e.g. with a carry ripple adder).
– The adder delay is increased by one full-adder delay if it is extended by an additional operand.
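A minimal Python sketch of carry-save addition for the four operands V, W, X, Y (function names are illustrative). Each stage reduces three operands to a sum vector and a carry vector without propagating carries; only the final merge is a carry-propagate addition, modelled here with `+`.

```python
def carry_save_stage(x, y, z):
    """One row of full adders: three operands in, sum and carry vectors out."""
    sum_vector = x ^ y ^ z
    carry_vector = ((x & y) | (x & z) | (y & z)) << 1   # carries have weight 2
    return sum_vector, carry_vector

def add_four(v, w, x, y):
    """Add four operands with two carry-save stages and one final addition."""
    s1, c1 = carry_save_stage(v, w, x)
    s2, c2 = carry_save_stage(s1, c1, y)   # one more stage per extra operand
    return s2 + c2                         # final merge of sum and carry vector

print(add_four(3, 5, 7, 9))
```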
Multipliers
Shift and Add (SAA) Multiplier
• The most common multiplier
• Multiplies two unsigned integer words X and Y of bit-size Nx and Ny:
$X = \sum_{i=0}^{N_x-1} x_i 2^i$, $Y = \sum_{j=0}^{N_y-1} y_j 2^j$
$Z = X \cdot Y = \sum_{i=0}^{N_x-1} x_i Y 2^i = x_0 Y + 2(\dots + 2(x_{N_x-2} Y + 2 x_{N_x-1} Y) \dots)$
• The following recurrence can be derived:
$D_0 = 0$, $D_{i+1} = D_i 2^{-1} + x_i Y$, $Z = D_{N_x} 2^{N_x-1}$
• At each step, one bit of X is AND-ed with Y and added to Di, which is shifted by one bit.
• It takes Nx clock cycles to complete the multiplication (one bit of X is processed in each step).
• The delay is approximately Ny·δFA (where δFA is the delay of a full adder).
• The cost of a SAA multiplier is (3N + 2N)γFA (the cost of a full adder γFA is assumed to be equal to the cost of a register).
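The recurrence can be traced with a short Python sketch. To stay in integer arithmetic, the accumulator below holds the scaled value D_i · 2^(Nx-1), so the division by two becomes a right shift; this scaling and the name `shift_add_multiply` are choices made for the illustration.

```python
def shift_add_multiply(x, y, nx):
    """Multiply unsigned x (nx bits) by y, processing one bit of x per step."""
    acc = 0                       # acc = D_i * 2**(nx - 1), with D_0 = 0
    for i in range(nx):           # one clock cycle per bit of X, LSB first
        x_i = (x >> i) & 1
        if x_i:
            acc += y << nx        # add x_i * Y into the upper part
        acc >>= 1                 # shift right by one bit (factor 2**-1)
    return acc                    # equals Z = D_nx * 2**(nx - 1)

print(shift_add_multiply(13, 11, 4))
```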
Carry Save Multiplier (CSM)
• Calculates the result in one step.
• Every bit of the first argument is multiplied with every bit of the second argument concurrently.
• The CSM consists of combinatorial logic only.
• Example for two 4-bit binary numbers:
                X3  X2  X1  X0
              × Y3  Y2  Y1  Y0
                P30 P20 P10 P00
            P31 P21 P11 P01
        P32 P22 P12 P02
    P33 P23 P13 P03
Z7  Z6  Z5  Z4  Z3  Z2  Z1  Z0
where Pij = Xi ∧ Yj
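The partial-product array of the example can be generated and summed in Python. This sketch only models the arithmetic (each Pij has weight 2^(i+j)); it does not model the carry-save adder rows that perform the summation in hardware, and the function names are illustrative.

```python
def partial_products(x, y, nx=4, ny=4):
    """Return rows[j][i] = P_ij = X_i and Y_j for all bit pairs."""
    return [[((x >> i) & 1) & ((y >> j) & 1) for i in range(nx)]
            for j in range(ny)]

def sum_partial_products(rows):
    """Add all partial products with their weights 2**(i + j)."""
    return sum(p << (i + j)
               for j, row in enumerate(rows)
               for i, p in enumerate(row))

rows = partial_products(13, 11)
print(sum_partial_products(rows))
```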
Part III Part II Part I
• It is assumed that Nx ≥ Ny (if Nx = Ny, then Part II is omitted).
• The multiplier delay is (Nx + Ny - 2)δFA
• The cost is (Nx - 1)NyγFA plus (2Ny + 2Nx) γFA, if X, Y, and the Z-register are accounted.
Block Multiplier
• Can be configured from working fully serial to working fully parallel.
• Arguments divided into blocks of same size.
• Individual blocks are multiplied in a fast Carry Save Multiplier.
• The arguments and the intermediate result have to be shifted in an appropriate way.
• The intermediate result has to be shifted in both directions (requires a bidirectional shift register).
• The controller can be realized using a simple counter.
• The multiplier needs kx·ky clock cycles to perform a multiplication (where kx and ky are the number of separated blocks of the first and of the second argument, respectively).
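The block scheme can be sketched in Python under simplifying assumptions: both arguments are cut into blocks of `block` bits, one block product is formed per clock cycle (the `*` stands in for the fast carry save multiplier), and the shifting of the intermediate result is modelled by weighting each block product. All names are hypothetical.

```python
def block_multiply(x, y, width_x, width_y, block):
    """Multiply x by y block by block; returns (product, clock cycles)."""
    mask = (1 << block) - 1
    kx = -(-width_x // block)     # number of blocks of the first argument
    ky = -(-width_y // block)     # number of blocks of the second argument
    acc, cycles = 0, 0
    for i in range(kx):
        for j in range(ky):
            x_blk = (x >> (i * block)) & mask
            y_blk = (y >> (j * block)) & mask
            # Block product, aligned according to the position of both blocks.
            acc += (x_blk * y_blk) << ((i + j) * block)
            cycles += 1
    return acc, cycles

print(block_multiply(0xABCD, 0x1234, 16, 16, 4))
```

With a block size equal to the word length the sketch degenerates to a single parallel multiplication; with a block size of one bit it needs one cycle per bit pair.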
Integrated Electronic Systems Lab
17. Microarchitectures
Integrated Electronic Systems Lab17: Microarchitectures 565
Microarchitecture
• Components:
– Data Path
– Control Path (can be interpreted like a FSM)
• hardwired
• programmable
– I/O Unit
Datapath Design
• Example:
• Implementation:
– Standard cells (gates, muxes, registers, ...), or
– Datapath compiler: several layout tiles.
• Layout scheme:
• Datapath compiler: creates a regular layout by stacking the appropriate number of tiles (depending on the wordlengths of the operands).
• Bit slice: a horizontal slice of tiles performing all functions for a single bit.
• Functional slice: vertical layout block implementing a single function.
Bit-slice ALU AMD 2901
• 16-word register set
• Q register (used in add-shift multiplications and divisions)
• ALU
• Shifter
• Instruction decoder
• All operations and registers are designed for 4-bit operands.
Bit-slice ALU AMD 2901
• The instructions are encoded in a 9-bit I vector, provided by an external microcode controller.
• First table: selection of the sources for both ALU inputs (R and S).
• Second table: ALU functions.
• Third table: ALU results.
16-bit bit-sliced ALU:
• Cascaded 2901 ICs for wordlengths with multiples of 4 bits.
• Simple carry propagation scheme (alternatively, carry-lookahead circuits AMD 2902 can be used).
Controller Implementations
Combinational logic block implementation:
• Early microprocessors (≤ 8 bit) and RISC: random logic
– separate gates
– modifications require redesigning of a whole combinational gate network
• CISC processors: microprogramming
– regular layout structures (ROMs or PLAs)
– modifications of the control sequence only require redefining the contents of a PLA or ROM
Microprogrammed Controllers
ROM based controller PLA based controller
• Microinstruction = the concatenation of the control signals (for the data path) and the next address (NA).
Horizontal Microinstructions
• Control word directly applied to the controlled circuit.
• Each control point has a corresponding entry in the control word.
• Very long control words
• Big control memories
• Very specific encoding is possible
• High degree of parallelism in the operations
Vertical Microinstructions
• n-bit control word: 2^n configurations possible (hardly used).
• M control vectors are encoded into a vector of ⌈log2 M⌉ bits.
• The n-bit control word is fetched from a secondary memory: the control vector decoder (ROM or PLA).
• Alternative: encoding the control vector in groups for different units (ALU, shifter,...).
• Group by group decoding instead of using a single and large control vector decoder.
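The encoding step can be made concrete with a small Python sketch. The 16-bit control words below are made-up example data, and `encode_vertical` is a hypothetical helper: it assigns a short code to each distinct control vector and builds the decoder table that maps a code back to the full control word.

```python
def encode_vertical(control_words):
    """Encode the M distinct control vectors into ceil(log2 M) bits."""
    distinct = sorted(set(control_words))
    m = len(distinct)
    code_width = max(1, (m - 1).bit_length())   # ceil(log2 M) for M >= 2
    code_of = {word: code for code, word in enumerate(distinct)}
    decoder_rom = distinct                      # index = code, entry = control word
    return code_width, code_of, decoder_rom

# Hypothetical microprogram: 8 steps, but only 5 distinct 16-bit control words.
program = [0x8100, 0x0042, 0x8100, 0x2C00, 0x0042, 0x0001, 0x4010, 0x0001]
width, code_of, decoder_rom = encode_vertical(program)
encoded = [code_of[word] for word in program]
print(width, encoded)
```

In this example the control memory stores 3 bits per step instead of 16, at the price of the additional decoder table.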
Microcode / Nanocode Controller
• Microinstruction = a sequence of nanoinstructions.
• MNA (microcode next address) register is halted while the nanocode sequence runs.
• Feedback via the NNA (nanocode next address): control sequences can be generated by the nanocode PLA.
• If the same nanocode sequences are used in many microinstructions, savings in implementation area are achieved.
Integrated Electronic Systems Lab
18. Semiconductor Memories
18: Semiconductor MemoriesIntegrated Electronic Systems Lab 577
• Introduction
• Read Only Memory (ROM)
• Nonvolatile Read/Write Memory, esp. Flash (RWM)
• Static Random Access Memory (SRAM)
• Dynamic Random Access Memory (DRAM)
• Summary
Overview
Market
Total DRAM market 2008: 31 B$ (Source: Gartner 2009)
Total Flash market 2008: 28 B$ (Source: Gartner 2009)
Total SRAM market 2008: 2 B$ (Source: Gartner 2009)
2 main driving forces for emerging technologies:
• Find lower-cost solutions (shrink capabilities are limited by cost rather than physics)
• Find a „unified memory“ combining the strengths of all known technologies (e.g. low power & speed)
Memory Requirement
Physical Principles of Semiconductor Memories
Memory type | Physical effect
DRAM        | Charge (capacitor)
SRAM        | Cross-coupled transistors
Flash       | Charge (gate of FET)
CBRAM       | Ion relocation (resistance)
FeRAM       | Polarization
MRAM        | Magnetization (resistance)
ORAM        | Phase change (resistance)
PCRAM       | Material phase (resistance)
Semiconductor Memory Classification
Volatile Memory
  Read/Write Memory
    Random Access: SRAM, DRAM
    Non-Random Access: FIFO, LIFO, Shift Register
Non-Volatile Memory
  Read/Write Memory (RWM): EPROM, E2PROM, FLASH
  Read Only Memory (ROM): Mask-Programmable ROM, Programmable ROM
Abbreviations:
SRAM - Static Random Access Memory
DRAM - Dynamic Random Access Memory
FIFO - First-In First-Out
LIFO - Last-In First-Out
EPROM - Erasable Programmable ROM
E2PROM - Electrically Erasable Programmable ROM
Random Access Memory Array Organization
Each memory cell
• stores one bit of binary information (”0“ or ”1“ logic)
• shares common connections with other cells: rows, columns
Memory array
• Memory storage cells
• Address decoders
• Simple combinatorial Boolean network which produces a specific output for each input combination (address)
• ”1“ bit stored - absence of an active transistor
• ”0“ bit stored - presence of an active transistor
• Organized in arrays of 2^N words
• Typical applications:
– store the microcoded instruction set of a microprocessor
– store a portion of the operating system for PCs
– store the fixed programs for microcontrollers (firmware)
Read Only Memory - ROM
Mask Programmable NOR ROM (1)
NOR ROM with 4-bit words
• Each column Ci (NOR gate) corresponds to one bit of the stored word
• A word is selected by raising the corresponding wordline to “1“
• All the wordlines are “0“ except the selected wordline which is “1“
• ”1“ bit stored - absence of an active transistor
• ”0“ bit stored - presence of an active transistor
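The read operation of the NOR ROM can be modelled in a few lines of Python. The transistor map below is made-up example data (1 = active transistor present at that row/column crossing), and `read_nor_rom` is a hypothetical name. A column is pulled low if any transistor on it has its wordline at “1”.

```python
def read_nor_rom(transistors, selected_row):
    """Return the word stored in selected_row of a NOR ROM."""
    n_rows = len(transistors)
    # All wordlines are 0 except the selected one.
    wordlines = [1 if row == selected_row else 0 for row in range(n_rows)]
    word = []
    for col in range(len(transistors[0])):
        # Each column is a NOR gate over the wordlines that have a transistor.
        pulled_low = any(wordlines[row] and transistors[row][col]
                         for row in range(n_rows))
        word.append(0 if pulled_low else 1)
    return word

# Hypothetical 4 x 4 array: 1 marks an active transistor (stored "0").
transistors = [
    [1, 0, 1, 0],
    [0, 0, 1, 1],
    [1, 1, 0, 0],
    [0, 1, 0, 1],
]
print(read_nor_rom(transistors, 0))
```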
Mask Programmable NOR ROM (2)
• “1” bit stored - the drain/source connection (or the gate electrode) is omitted in the final metallization step
• “0” bit stored - the drain of the corresponding transistor is connected to the metal bit line
[Figure: NOR ROM layout with common ground line; transistor terminals labelled D, S, G]
Cost-efficient, since only a few masks have to be manufactured
Implant Mask Programmable NOR ROM
Idea: deactivation of the NMOS transistors by raising their threshold voltage above the VOH level through channel implants
• “1” bit stored - the corresponding transistor is turned off through channel implant
• “0” bit stored - non-implanted (normal) transistors
Advantage: higher density (smaller area)!
Implant Mask Programmable NAND ROM (1)
• Each column Ci (NAND gate) corresponds to one bit of the stored word
• A word is selected by setting the corresponding wordline Ri to “0“
• All the wordlines Ri are “1“ except the selected wordline which is “0“
Normally on transistors: have a lower threshold voltage (channel implant)
NAND ROM with 4-bit words
• “1” bit stored - presence of a transistor that can be switched off
• “0” bit stored - shorted/normally-on transistor
Implant-Mask-Programmable NAND ROM (2)
4x4 bit NAND ROM array layout
• The structure is more compact than the NOR array (no contacts)
• The access time is longer than that of the NOR array (series chain of nMOS transistors)
NOR Row Address Decoder for a NOR ROM Array
NOR ROM Array
• The decoder must select one row by raising its voltage to logic “1”
• Different combinations for the address bits A1A2 select the desired row
• The NOR decoder array and the NOR ROM array are fabricated as two adjacent arrays, using the same layout strategy
A1 A2 R1 R2 R3 R4
0 0 1 0 0 0
0 1 0 1 0 0
1 0 0 0 1 0
1 1 0 0 0 1
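The truth table above can be reproduced with one NOR gate per row. The assignment of true and complemented address bits to each gate in the following Python sketch is derived from the table and is an assumption about the actual array wiring; function names are illustrative.

```python
def nor(*inputs):
    """NOR gate: output is 1 only if all inputs are 0."""
    return 0 if any(inputs) else 1

def nor_row_decoder(a1, a2):
    """2-to-4 row decoder built from NOR gates; returns (R1, R2, R3, R4)."""
    n1, n2 = 1 - a1, 1 - a2      # complemented address bits
    r1 = nor(a1, a2)             # selected for A1 A2 = 0 0
    r2 = nor(a1, n2)             # selected for A1 A2 = 0 1
    r3 = nor(n1, a2)             # selected for A1 A2 = 1 0
    r4 = nor(n1, n2)             # selected for A1 A2 = 1 1
    return r1, r2, r3, r4

for a1 in (0, 1):
    for a2 in (0, 1):
        print(a1, a2, *nor_row_decoder(a1, a2))
```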
NAND Row Address Decoder for a NAND ROM Array
• The decoder has to lower the voltage level of the selected row to logic “0” while keeping all the other rows at logic “1”
• The NAND row decoder of the NAND ROM array is implemented using the same layout strategy as the memory itself
NOR Column Address Decoder for a NOR ROM Array
NOR address decoder + 2^M pass transistors
• Large area!
Binary selection tree decoder
• No NOR address decoder is needed, but additional inverters are necessary!
• Smaller area
• Drawback - long data access time
Nonvolatile Read-Write Memories
• The architecture is similar to the ROM structure
• Array of transistors placed on a word-line/bit-line grid
• Special transistor that permits its threshold to be altered electrically
• Programming: selectively disabling or enabling some of these transistors
• Reprogramming: erasing the old threshold values and starting a new programming cycle
Method of erasing:
• ultraviolet light - EPROMs
• electrically - EEPROMs
EPROM (1)
The floating gate avalanche-injection MOS (FAMOS) transistor:
• extra polysilicon strip is inserted between the gate and the channel - floating gate
• impact: double the gate oxide thickness, reduce the transconductance, increase the threshold voltage
• the threshold voltage is programmable by trapping electrons on the floating gate through avalanche injection
Schematic symbol
[Figure panels: Avalanche injection; Removing the programming voltage leaves charge trapped; Programming results in a higher VT]
EPROM (2)
• Electrons acquire sufficient energy to become “hot” and traverse the first oxide insulator (100 nm), so that they get trapped on the floating gate
• Electron accumulation on the floating gate is a self-limiting process that increases the threshold voltage (~7V)
• The trapped charge can be stored for many years
• The erasure is performed by shining strong ultraviolet light on the cells through a transparent window in the package
• The UV radiation renders the oxide conductive by direct generation of electron-hole pairs
EPROM (3)
• The erasure process is slow (~minutes)
• The erasure procedure is off-system!
• Programming takes several µs per word
• Limited endurance: max. 1000 erase/program cycles
• The cell is very simple and dense: large memories at low cost!
• Suited to applications that do not require regular reprogramming

EEPROM
• Reversible programming by reversing the applied voltage (raising and lowering the threshold voltage)
– The threshold voltage is difficult to control
– An extra transistor is required as access device
• Larger area than EPROM
• More expensive technology than EPROM
• Offers higher versatility than EPROM
• Can support 10^5 erase/write cycles
• Provides an electrical erasure procedure
• Modified floating-gate device, floating-gate tunneling oxide (FLOTOX):
– the distance between floating gate and channel is reduced near the drain
– Fowler-Nordheim tunneling mechanism (when about 10 V is applied across the thin insulator)
Flash Memories
Combines the density of the EPROM with the versatility of EEPROM structures
• Programming: avalanche hot-electron injection
• Erasure: Fowler-Nordheim tunneling (as for EEPROM cells)
• Difference: erasure is performed in bulk for the complete memory chip (or a subsection of it), which reduces flexibility!
• The extra access transistor of the EEPROM is eliminated because the global erasure process allows careful monitoring of the device characteristics and control of the threshold voltage!
• High integration density
ETOX Flash cell - introduced by Intel
Static Random Access Memory - SRAM (1)
• Permits the modification (writing) of stored data bits
• The stored data can be retained indefinitely (as long as power is supplied), without any refresh operation
• Data storage cell: simple latch circuit with 2 stable states
• A sufficiently large voltage disturbance switches the latch from one stable point to the other stable point
• Two switches are required to access (read/write) the data
[Figure: voltage transfer characteristic vO versus vI of the inverter pair (VOL and VOH marked), showing two stable Q-points and one unstable Q-point on the line vO = vI; panels (a) and (b): the two-inverter latch with its logic levels]

Static Random Access Memory - SRAM (2)
a) general structure of an SRAM cell based on a two-inverter latch circuit
b) implementation of the SRAM cell
c) resistive-load (undoped polysilicon resistors) SRAM cell
d) depletion-load NMOS SRAM cell
e) full CMOS SRAM cell
Resistive Load SRAM Cell - Operation Principle (1)
• MP1, MP2: pull-up transistors that charge up the large parasitic capacitances CC and CCbar of the two complementary columns (the suffix "bar" marks the complementary column, drawn with an overbar on the slides)
• The steady-state column voltage: VC = VDD - VT ≈ 3.5 V
The basic operations on SRAM cells:
RS = 1 (M3, M4 on)
• Read/Write “1”
• Read/Write “0”
RS = 0 (M3, M4 off)
• Data is being held
Here we define the memory content to be located at node V1 (V1 and V2 are the two complementary storage nodes)

Resistive Load SRAM Cell - Operation Principle (2)
• Write “1” operation (RS = 1: M3, M4 on)
VCbar is forced to 0 by the data write circuitry; V2 decreases to 0; M1 turns off; V1 increases
Final state: V1 = 1, V2 = 0
• Read “1” operation (RS = 1: M3, M4 on)
M1 off; M2, M4 on; VCbar is pulled down; VC > VCbar is read as logic “1”
• Write “0” operation (RS = 1: M3, M4 on)
VC is forced to 0 by the data write circuitry; V1 goes to 0; M2 turns off; V2 increases to 1
Final state: V1 = 0, V2 = 1
• Read “0” operation (RS = 1: M3, M4 on)
M2 off; M1, M3 on; VC is pulled down; VC < VCbar is read as logic “0”
Full CMOS SRAM Cell
• Low-power SRAM cell: the static power dissipation is limited to the leakage current; significant current is drawn only during a switching event
• The pMOS pull-up transistors allow the column voltage to reach the full VDD level
• High noise immunity due to larger noise margins
• Can operate at lower power supply voltages than the resistive-load SRAM cell
• Drawback: large area!

CMOS SRAM Cell Design Strategy (1)
[Figures: layout of the resistive-load SRAM cell; layout of the CMOS SRAM cell]
CMOS SRAM Cell Design Strategy (2)
(1) The data read operation should not destroy the stored information
Assume that a logic “0” is stored in the cell (V1 = 0, V2 = 1: M1, M6 linear; M2, M5 off)
• RS = 0: M3, M4 off
• RS = 1: M3 saturation; M4, M1 linear
VC decreases, V1 increases slowly
Condition: M2 must remain turned off during the data read operation:
V1,max ≤ VT,2 ; IM3 = IM1 ⇒
Design rule:
$$\frac{(W/L)_3}{(W/L)_1} < \frac{2\,(V_{DD} - 1.5\,V_{T,n})\,V_{T,n}}{(V_{DD} - 2\,V_{T,n})^2}$$
A symmetrical rule is also valid for M2 and M4
CMOS SRAM Cell Design Strategy (3)
(2) The cell should allow modification of the stored information during the data write phase
Consider the write “0” operation, assuming that “1” is stored in the cell (V1 = 1, V2 = 0: M1, M6 off; M2, M5 linear)
• RS = 0: M3, M4 off
• RS = 1: M3, M4 saturation; M5 linear
In order to change the stored information: V1 = 0, V2 = 1 ⇒ M1 on and M2 off!
But V2 < VT,1 (previous design condition) ⇒ M1 cannot be switched on! ⇒ M2 must be switched off ⇒ V1 must be reduced below VT,2
V1 ≤ VT,2 ; IM3 = IM5 ⇒
[Figure labels: VDD, 0 V, 0 V]
Design rule (equality corresponds to the limiting case V1 = VT,2):
$$\frac{(W/L)_5}{(W/L)_3} < \frac{\mu_n}{\mu_p}\cdot\frac{2\,(V_{DD} - 1.5\,V_{T,n})\,V_{T,n}}{(V_{DD} + V_{T,p})^2}$$
A symmetrical rule is also valid for M6 and M4
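To get a feeling for the two sizing conditions, the sketch below evaluates them numerically, taking the read rule as (W/L)3/(W/L)1 < 2(VDD - 1.5 VT,n) VT,n / (VDD - 2 VT,n)^2 and the write rule as (W/L)5/(W/L)3 < (µn/µp) · 2(VDD - 1.5 VT,n) VT,n / (VDD + VT,p)^2. All numerical values (VDD = 5 V, VT,n = 1 V, VT,p = -1 V, µn/µp = 2.5) are assumed for illustration only and do not come from the lecture.

```python
def read_ratio_limit(vdd, vtn):
    """Upper bound on (W/L)3 / (W/L)1 so that V1 stays below VT,n during a read."""
    return 2.0 * (vdd - 1.5 * vtn) * vtn / (vdd - 2.0 * vtn) ** 2


def write_ratio_limit(vdd, vtn, vtp, mobility_ratio):
    """Upper bound on (W/L)5 / (W/L)3 so that V1 can be pulled below VT,n in a write.

    vtp is the (negative) pMOS threshold voltage, mobility_ratio is mu_n / mu_p.
    """
    return mobility_ratio * 2.0 * (vdd - 1.5 * vtn) * vtn / (vdd + vtp) ** 2


# Assumed example values, not taken from the lecture
VDD, VTN, VTP, MU_RATIO = 5.0, 1.0, -1.0, 2.5

read_limit = read_ratio_limit(VDD, VTN)
write_limit = write_ratio_limit(VDD, VTN, VTP, MU_RATIO)
print(f"(W/L)3 / (W/L)1 < {read_limit:.3f}")
print(f"(W/L)5 / (W/L)3 < {write_limit:.3f}")
```

With these assumed numbers the access transistor M3 has to be weaker than the driver M1, while the pull-up M5 may be at most about as strong as M3.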
SRAM Write Circuitry
The write operation is performed by forcing the voltage level of either column (bit line) to “0”
W | DATA | WB | WBbar | Operation
0 | 1 | 1 | 0 | M1 off, M2 on; VC high, VCbar low
0 | 0 | 0 | 1 | M1 on, M2 off; VC low, VCbar high
1 | X | 0 | 0 | M1, M2 off; VC and VCbar high
SRAM Read Circuitry
The read circuitry must detect a very small voltage difference between the two complementary columns (sense amplifier)
$$\frac{\partial (V_{o1} - V_{o2})}{\partial (V_C - V_{\bar{C}})} = -g_m R, \quad \text{where } g_m = \frac{\partial I_D}{\partial V_{GS}} = \sqrt{2\,k_n\,I_D}$$
The gain can be increased by using
• active loads
• cascode configuration
Precharging of the bit lines plays a significant role in the access time!
• Equalization of the bit lines prior to each new access (between two access cycles)
Dual Port SRAM Arrays
Allows simultaneous access to the same location in the memory array (systems with multiple high-speed processors).
• Eliminates wait states for the processors during data read operations
• Problems can occur if:
– two processors attempt to write data simultaneously onto the same cell
– one processor attempts to read while the other writes data onto the same cell
• Solution: contention arbitration logic

Summary of the SRAM properties
– 6 transistors required (layout area about 100 F^2)
– Circuit is always in a stable state
– Current/power consumption only on a change of state
– Area required: approx. 3 × the area of an inverter
– Very fast read and write cycles
Introduction to the DRAM cell
• The typical DRAM cell consists of 1 transistor / 1 capacitor
WRITE:
– WL activation: transistor on
– Writing a 1 (or 0) to BL and to CS
– WL deactivation: CS isolated, transistor is off
[Figure: cell schematic (WL = wordline, BL = bitline, storage capacitor CS) and waveforms of VWL, VBL and VCS over t for writing 1 and 0; marked levels VDD, VDD/2 and VDD-Vth]
READ:
– Loading BL to VDD/2; BL not driven
– WL activation: transistor on
– Transferring the CS charge to BL, towards a sense amplifier
– CBL >> CS !
[Figure: cell schematic with bitline capacitance CBL and waveforms of VWL, VBL and VCS over t; marked levels VDD, VDD/2 and VDD-Vth]
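The read operation is a charge-sharing step between CS and the much larger CBL. The sketch below estimates the resulting bit-line swing from charge conservation; the supply voltage, threshold voltage and capacitance values are assumed for illustration and are not taken from the lecture.

```python
def bitline_after_read(v_cell, v_precharge, c_s, c_bl):
    """Bit-line voltage after the storage capacitor shares its charge with the bit line."""
    return (c_s * v_cell + c_bl * v_precharge) / (c_s + c_bl)


# Assumed example values
VDD, VTH = 2.0, 0.5           # volts
C_S, C_BL = 30e-15, 300e-15   # farads, C_BL >> C_S

v_pre = VDD / 2               # bit line precharged to VDD/2 and left floating
swing_one = bitline_after_read(VDD - VTH, v_pre, C_S, C_BL) - v_pre
swing_zero = bitline_after_read(0.0, v_pre, C_S, C_BL) - v_pre
print(f"stored 1: {swing_one * 1e3:+.1f} mV, stored 0: {swing_zero * 1e3:+.1f} mV")
```

Because CBL >> CS, the swing is only a few tens of millivolts around VDD/2, which is why the bit line is routed to a sense amplifier.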
DRAM realization (Trench)
[Figure: trench-capacitor cell cross-section with the following labels]
– Single-sided buried strap (= cell contact)
– Deep trench isolation: strap cut, isolation collar
– Wordline (= gate)
– CB (contact to bitline)
– Bitline
– Deep trench: common electrode, storage electrode
– Current path

DRAM Stack realization (buried wordline)
[Figures only]

Summary of the DRAM properties
– Only one transistor and one capacitor needed: very efficient and cheap! (area currently 8 F^2 to 6 F^2, path to 4 F^2 demonstrated)
– The capacitor is leaking, therefore refresh cycles are required
– Very low area required for realization!
– Somewhat slower read & write cycles compared to SRAM
High End: Graphics DRAM
Qimonda 512 Mbit GDDR5, 2008
• Die size: 11326.74 µm × 9898 µm
• Application area: high-end graphics cards (ATI HD4870), up to 6 Gbit/s per pin (HD4870: 115 GB/s)
• Technology: 75 nm, 3 metal layer interconnect (Al, W)
• Area: 112 mm^2
• 750 million transistors
• Selling price: about 8 US$ at launch time

Flash
Flash :=
• non-volatile memory (10 years)
• electrically programmable & erasable
– EEPROM: single bytes erasable
– Flash: large blocks erasable
• applications: camera, mobile, chip card, solid state disk / storage ...
• storage element := MOS transistor with adjustable threshold voltage: transistor on <-> off
Flash Introduction
[Figure: floating-gate transistor cross-section (source, drain, substrate, floating gate, control gate, TOX), a cell image labelled Samsung, and the Idrain versus Vgate characteristic]
• Floating gate: charge storage, completely encapsulated; keeps its charge for 10 years
• TOX thickness: ca. 8 nm
• Thick dielectric (gate coupling)

Flash Introduction
The 2 storage states (read with Vcontrol gate = 2.5 V, Vdrain = 1 V):
• Negative charge in the floating gate => no current (no current Is-d)
• Neutral (or positive) charge in the floating gate => current along the channel (Is-d = 30 µA)
[Figure: cross-sections and Idrain versus Vgate characteristics for both states, evaluated at Vgate = 2.5 V]
Flash Introduction
Electrical programming & erase:
[Figure: two cross-sections with source and drain at 0 V and -20 V / +20 V applied; electrons (e-) tunnel through the oxide]
• ∆Vt ≈ 6 V, ∆Q ≈ 500 electrons
• Leakage rate < 1 electron per week!
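The quoted numbers can be cross-checked with a little arithmetic. The sketch below derives the effective capacitance implied by ∆Q ≈ 500 electrons and ∆Vt ≈ 6 V, and the average leakage rate at which all 500 electrons would be lost within the 10-year retention time mentioned on the earlier slides. Treating the relation as a simple ∆Q = C · ∆Vt is an assumption of this example.

```python
ELEMENTARY_CHARGE = 1.602176634e-19  # coulomb


def effective_capacitance(n_electrons, delta_vt):
    """Capacitance implied by a threshold shift delta_vt for n_electrons of stored charge."""
    return n_electrons * ELEMENTARY_CHARGE / delta_vt


c_eff = effective_capacitance(500, 6.0)      # farad
weeks_in_ten_years = 10 * 365.25 / 7
loss_rate = 500 / weeks_in_ten_years         # electrons per week to lose everything

print(f"effective capacitance: {c_eff * 1e18:.1f} aF")
print(f"complete charge loss in 10 years: {loss_rate:.2f} electrons/week")
```

Under these assumptions the stored charge sits on a capacitance of only a few attofarads, and the leakage limit of less than one electron per week is of the same order as the rate that would empty the cell in 10 years.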
Flash Introduction
Electric field during electrical programming & erase:
[Figure: the two cross-sections with source and drain at 0 V and -20 V / +20 V applied; electrons (e-) tunnel through the oxide]
• Flash transistor programming field: 10 MV/cm
• For comparison, thunderstorm lightning: 0.03 MV/cm
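As a plausibility check, the sketch below converts the quoted programming field into a voltage across the oxide. It assumes that the 10 MV/cm field, the TOX thickness of ca. 8 nm and the 20 V programming voltage from the neighbouring slides refer to the same device; reading the resulting ratio as the share of the applied voltage that drops across TOX is an interpretation, not a statement from the lecture.

```python
def oxide_voltage(field_mv_per_cm, thickness_nm):
    """Voltage across an oxide layer for a given field (MV/cm) and thickness (nm)."""
    field_v_per_cm = field_mv_per_cm * 1e6
    thickness_cm = thickness_nm * 1e-7
    return field_v_per_cm * thickness_cm


v_tox = oxide_voltage(10.0, 8.0)    # volts across the tunnel oxide
share_of_20v = v_tox / 20.0         # fraction of the applied 20 V
field_vs_lightning = 10.0 / 0.03    # programming field relative to lightning

print(f"{v_tox:.1f} V across TOX, {share_of_20v:.0%} of 20 V, "
      f"{field_vs_lightning:.0f}x the lightning field")
```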
NAND Flash Memory Architecture
[Figure: cell array of memory cells at wordline/bitline crossings with decoder, amplifier, control signals, decoding of bits and input/output; NAND string with select lines GSL and BSL and wordlines WL1, WL2, ..., WL32]
• A bit is accessed at the intersection of wordline and bitline (amplifier / decoder)
• NAND has strings of 32 cells, with select transistors at the ends
• 4 F^2 cell size with 2/4 bits per cell
• Smallest cell size of all memories
• But: slow random single-bit access
• Used for large data storage (fast serial data access)
• The system level solves the limitations of serial access

NOR Flash Memory Architecture
• NOR cells are fully in parallel
• Random access / erase, but low density
• Used for execute in place (XiP): no need to copy into RAM before executing

Mechanism of Slow Charge Loss
Model: trap-assisted tunneling
Floating Gate vs Trapping
[Figure: band diagrams (EC, EV) of the floating-gate stack (labels Si, O, Poly, ONO, Poly) and of the charge-trapping stack (labels Si, ONO, Poly)]
Cell optimisation: intense work on dielectrics / energy barriers
2 options for bringing the charge into the poly or nitride layer: Fowler-Nordheim (FN) or hot electron (CHE) programming
Floating gate – today's winner for Data Flash!
[Figure: IDrain versus Vgate curves shifted in Vt by programming / erase (500-1000 electrons): electrons in the floating gate => Vt high; no electrons in the floating gate => Vt low]
• Different levels of Vt allow storage of 2 bits/cell
• Program / erase by Fowler-Nordheim tunneling
• Energy barrier required to secure 10 years of retention
• High voltage (10-20 V)
• Slow program / erase (µs-ms)
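Storing 2 bits per cell means distinguishing four threshold-voltage levels, which can be done by comparing Vt against three reference voltages. The sketch below shows the idea; the reference voltages and the Gray-coded bit assignment are hypothetical values chosen for this example, not device data.

```python
from bisect import bisect_right

# Hypothetical reference voltages (volts) separating the four Vt levels
READ_REFERENCES = (1.0, 2.5, 4.0)
# Hypothetical Gray-coded assignment; level 0 is the erased (lowest Vt) state
LEVEL_TO_BITS = ("11", "10", "00", "01")


def read_cell(vt):
    """Map a cell threshold voltage to the 2 bits stored in the cell."""
    level = bisect_right(READ_REFERENCES, vt)  # number of references below vt
    return LEVEL_TO_BITS[level]


samples = [0.2, 1.8, 3.1, 5.0]
decoded = [read_cell(vt) for vt in samples]
print(decoded)
```

A Gray code is used here so that a small Vt drift across one reference corrupts only a single bit; this is a design choice of the sketch.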
Summary of Established Memory Technologies
Every established memory technology has shortcomings.
Quest for a universal memory that combines non-volatility with high speed, high write endurance and a small size
Established memory technologies:
Parameter | SRAM | DRAM | NAND Flash | NOR Flash
Cell size per bit in F^2 | 100 | 8 .. 6 .. (4) | 2 | 5
Retention time | ∞ (with power) | 64 ms | 10 yrs | 10 yrs
Random read access | 2-100 ns | 30 ns | 10 µs | 90 ns
Random write access | 2-100 ns | 30 ns | 100 µs (erase 100 ms) | 10 µs (erase 100 ms)
Endurance | >10^15 | >10^15 | >10^15 read, 10^5 write | >10^15 read, 10^5 write
Integrated Electronic Systems Lab
19. ASIC Design Guidelines
Integrated Electronic Systems Lab19: ASIC Design Guidelines 631
Introduction
• The following design guidelines have been adapted from [2]: European Silicon Structures (ES2), Zone Industrielle, 13106 France. Solo 2030 User Guide, e02a02 edition, June 1992
• These recommendations are useful in order to avoid functional faults and get the desired functionality
Synchronous Circuits (1)
• All data storage elements are clocked
• The same active edge of a single clock is applied at precisely the same time to all storage elements

Synchronous Circuits (2)
• NON-RECOMMENDED CIRCUITS:
– Flip-flop driving the clock input of another flip-flop:
– The clock input of the second FF is skewed by the clock-to-q delay of the first FF and is not activated at every active clock edge (e.g. ripple counter)

Synchronous Circuits (3)
• NON-RECOMMENDED CIRCUITS:
– Gated clock line:
– Clock skew caused by gating the clock line (e.g. multiplexer in clock line)

Synchronous Circuits (4)
• NON-RECOMMENDED CIRCUITS:
– Double-edged clocking:
– FFs are clocked on opposite edges of the clock signal
– Insertion of a scan path is impossible
– Difficulties in determining critical path lengths

Synchronous Circuits (5)
• NON-RECOMMENDED CIRCUITS:
– Flip-flop driving the asynchronous reset of another flip-flop:
– The synchronous design principle that all FFs change state at exactly the same time is not fulfilled
• Recommended circuits are described in the following sections
Clock Buffering (1)
• NON-RECOMMENDED CIRCUITS:
– Unequal depth of clock buffering:
– causes clock skew

Clock Buffering (2)
• NON-RECOMMENDED CIRCUITS:
– Unbalanced fanout of clock buffers:
– Clock skew caused by different load-dependent delays
– Excessive clock fanout should be avoided (slow edges)

Clock Buffering (3)
• Recommended circuits:
– Balanced clock tree buffering
– Same depth of buffering
– Same fanout
– Limited fanout in order to achieve sharp clock edges

Clock Buffering (4)
• Recommended circuits:
– Combined geometric/tree buffering
– Using an intermediate buffer of suitable strength at each fanout point
Gated Clocks (1)
• NON-RECOMMENDED CIRCUITS:
– Multiplexer on clock line:
– A signal change at the multiplexer input can cause a glitch at the clk input (FF captures invalid data)
– Gating the clock line introduces clock skew

Gated Clocks (2)
• Recommended circuits:
1) Enabled (E-type) flip-flop
2) Toggle (T-type) flip-flop
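The idea behind the enabled flip-flop can be shown with a small behavioural model: the clock reaches the flip-flop ungated on every active edge, and the enable input only decides whether new data is loaded or the old state is held. The class below is a hypothetical software model written for this example, not a description of a specific library cell.

```python
class EnabledFlipFlop:
    """Behavioural model of an enabled (E-type) D flip-flop."""

    def __init__(self, q=0):
        self.q = q

    def clock_edge(self, d, enable):
        # The flip-flop sees every active clock edge; the enable input
        # selects between loading d and holding the current state.
        if enable:
            self.q = d
        return self.q


ff = EnabledFlipFlop()
stimulus = [(1, 0), (1, 1), (0, 0), (0, 1)]   # (d, enable) per clock edge
trace = [ff.clock_edge(d, en) for d, en in stimulus]
print(trace)
```

Since the clock line itself is never touched, neither glitches nor additional skew are introduced on the clock input.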
Double-edged Clocking (1)
• NON-RECOMMENDED CIRCUITS:
– Pipelined logic with double-edged clocking:
– Not recommended in the context of scan-path methods

Double-edged Clocking (2)
• Recommended circuits:
– Pipelined logic with single-edged clocking:

Asynchronous Resets (1)
• NON-RECOMMENDED CIRCUITS:
– Flip-flop driving the asynchronous reset of another flip-flop:

Asynchronous Resets (2)
• Recommended circuits:
– Global asynchronous reset by external signal:

Asynchronous Resets (3)
• Recommended circuits:
– Flip-flop driving the synchronous reset of another flip-flop:
Shift Registers (1)
• NON-RECOMMENDED CIRCUITS:
– Shift register with forward or reverse chain of clock buffers:
– Internal clock skew can cause data fall-through

Shift Registers (2)
• Recommended circuits:
– Shift register with balanced tree of clock buffers:

Asynchronous Inputs (1)
• NON-RECOMMENDED CIRCUITS:
– Circuits with complicated feedback loops to capture asynchronous inputs (very sensitive to noise, and functionality can be influenced by placement and routing delays)

Asynchronous Inputs (2)
• Recommended circuits:
– Chain of two or more D-type flip-flops for capturing an asynchronous input:
– The probability of propagating a metastable state decreases with an increasing number of register stages

Asynchronous Inputs (3)
• Recommended circuits:
– Use of a 4-bit register as shift register for capturing an asynchronous input:
– The probability of propagating a metastable state decreases with an increasing number of register stages

Asynchronous Inputs (4)
• Recommended circuits:
– Asynchronous handshake circuit:

Asynchronous Inputs (5)
• The asynchronous handshake circuit works as follows:
a) The first flip-flop is reset asynchronously when the r input is zero or when the qb outputs of the second and the third FF both have the value 0
b) The q output of the first FF is asynchronously set to high when a positive edge occurs at its ck input
c) The high output of the first FF is propagated through the second and the third FF in the two following cycles. The qb outputs of these FFs are set to zero and the reset logic for the first FF is activated. Now the first FF is ready to receive another edge at its input.

Asynchronous Inputs (6)
d) Three cases of metastability caused by simultaneously rising edges of the asynchronous input and the system clock:
1) The second FF stabilizes to q=1 before the next rising clock edge (circuit works as desired)
2) The second FF settles to q=0 and the third FF remains in its state. Since the output q of the first FF is high, the propagation of this output works correctly, but it needs one cycle more than in the first case.
3) The metastable state of the second FF is still present at the next rising edge of the clock signal. Then the third FF also becomes metastable. The probability of receiving a metastable d (internal) signal can be reduced by increasing the length of the register chain.

Asynchronous Inputs (7)
• Operation of the asynchronous handshake circuit:
Delay Lines and Monostables (1)
• NON-RECOMMENDED CIRCUITS:
– In general, circuits whose functionality relies on delays cannot be recommended.
– E.g. monostable pulse generator:

Delay Lines and Monostables (2)
• NON-RECOMMENDED CIRCUITS:
– Pulse generator using flip-flop:
– Multivibrator:

Delay Lines and Monostables (3)
• Recommended circuits:
– Synchronous pulse generator:
– Use of a higher clock speed
– The minimum time resolution is given by the clock cycle

Bistable Elements (1)
• NON-RECOMMENDED CIRCUITS:
– Cross-coupled flip-flops and RS flip-flops
– Bistable storage elements formed by cross-coupled NAND or NOR gates:

Bistable Elements (2)
• NON-RECOMMENDED CIRCUITS:
– Asynchronous RS flip-flop:

Bistable Elements (3)
• Recommended circuits:
– Use D-types with set/reset
– Use a latch configured as RS flip-flop:
RAMs and ROMs in Synchronous Circuits (1)
• Problem: RAMs are double-edge triggered. The address is latched on the opposite edge to the data
• Timing scheme:

RAMs and ROMs in Synchronous Circuits (2)
• Recommended circuits:
– Interfacing a RAM into a synchronous circuit: ME and WEbar generation

RAMs and ROMs in Synchronous Circuits (3)
• Recommended circuits:
– Using a flip-flop for WEbar generation: timing scheme

RAMs and ROMs in Synchronous Circuits (4)
• Recommended circuits:
– Avoiding floating RAM/DPRAM output propagation
Tristates (1)
• NON-RECOMMENDED CIRCUITS:
– Tristate bus with non-central enable control:

Tristates (2)
• Recommended circuits:
– Tristate bus with central control of all tristate enable signals and one additional driver that is activated in non-controlled states

Tristates vs. Multiplexer
Tristates:
– large area
– limited buffering
– large routing load: slow
Multiplexer:
– small area
– efficient routing
• Control decoding expense is the same for tristates and multiplexers.
• Multiplexers are more favourable
Parallel Signals
• NON-RECOMMENDED CIRCUITS:
– Wired-OR part used to create higher fanout:
• Recommended circuits:
– High-fanout buffer replacing the wired-OR part

Fanout (1)
• NON-RECOMMENDED CIRCUITS:
– Excessive fanout on control signals:

Fanout (2)
• Recommended circuits:
– Geometric buffering on control signal:

Fanout (3)
• Recommended circuits:
– Tree buffering on control signal:
Design for Speed (1)
• Use a maximum of 2 inputs on all combinational logic gates:
• Use AOI logic (complex cells from the standard cell library) where possible. The figure below shows a multiplexer using AOI logic:

Design for Speed (2)
• Feed late-changing inputs late into the combinational logic:
• Use shift (Johnson) counters instead of binary counters:
q0 q1 q2 q3
0 0 0 0
1 0 0 0
1 1 0 0
1 1 1 0
1 1 1 1
0 1 1 1
0 0 1 1
0 0 0 1
0 0 0 0
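The state table of the Johnson counter follows from a single rule: the register shifts by one position per clock, and the inverted output of the last stage is fed back into the first stage. The sketch below generates the sequence for an arbitrary width; the function name `johnson_sequence` is chosen for this example.

```python
def johnson_sequence(width):
    """Return one full cycle of states (q0 .. q[width-1]) of a Johnson counter."""
    state = [0] * width
    states = []
    for _ in range(2 * width):
        states.append(tuple(state))
        # shift towards higher indices; q0 receives the inverted last stage
        state = [1 - state[-1]] + state[:-1]
    return states


sequence = johnson_sequence(4)
for q in sequence:
    print(*q)
```

Successive states differ in exactly one bit and the next-state logic is a single inverter, so there is no carry chain in the critical path; the price is that n flip-flops provide only 2n states.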
Design for Speed (3)
• Use duplicate logic to reduce fanout:
• Use fast library cells where available
• Reduce the length of critical signal paths
• Use Schmitt trigger inputs in noisy environments

Design for Testability (1)
• Testability = Controllability + Observability
• NON-RECOMMENDED CIRCUITS:
– Circuit with inaccessible internal logic: only the first block is controllable, and only the last block is directly observable

Design for Testability (2)
• Recommended circuit:
– Insert test inputs and outputs

Design for Testability (3)
• NON-RECOMMENDED CIRCUITS:
– Chain of counters: the first counter is not directly observable and the second counter is not directly controllable

Design for Testability (4)
• Recommended circuit:
– Break long counter / shift register chains
– Chain of counters broken by test input tc and output signals:

Design for Testability (5)
• NON-RECOMMENDED CIRCUITS:
– Counter with closed feedback loop: the initial state is not known

Design for Testability (6)
• Recommended circuit:
– Open feedback loops
– Counter with feedback loop opened by test control tr and output signals:

Design for Testability (7)
• Recommended circuits:
– Use BIST (Built-In Self-Test) with compiled megacells
– Compiled megacell with compiled inputs/outputs:

Design for Testability (8)
• Recommended circuits:
– Scan path testing
– E-type scan path flip-flop (right):
– Circuit with scan path (below):

Design for Testability (9)
• Recommended circuits:
– Use of JTAG boundary scan path
– JTAG test circuitry:
Integrated Electronic Systems Lab
20. Testing and Design for Testability
Integrated Electronic Systems Lab20: Testing 687
Motivation
• Stable chip manufacturing costs
• Increasing testing costs:
– Increasing number of gates/device
– Limited number of pins
– Increasing number of internal states
– Increasing logical and sequential depth
• Example:
– Exhaustive testing (all 2^n input patterns) of a combinational circuit with n inputs (10 MHz, one test per cycle)
n | time for test
25 | 3 s
30 | 107 s
40 | 1 day
50 | 3.5 years
60 | 3656 years
• Testability has to be considered in all phases of design
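The test times in this example are consistent with applying all 2^n input patterns at 10 MHz. The sketch below recomputes them under that assumption; the 365-day year used for the conversion is also an assumption.

```python
TEST_RATE_HZ = 10e6                 # one test pattern per cycle at 10 MHz
SECONDS_PER_DAY = 24 * 3600
SECONDS_PER_YEAR = 365 * SECONDS_PER_DAY


def exhaustive_test_time(n_inputs, rate_hz=TEST_RATE_HZ):
    """Seconds needed to apply all 2**n input patterns."""
    return 2 ** n_inputs / rate_hz


for n in (25, 30, 40, 50, 60):
    seconds = exhaustive_test_time(n)
    print(f"n = {n}: {seconds:.3g} s = {seconds / SECONDS_PER_DAY:.3g} days "
          f"= {seconds / SECONDS_PER_YEAR:.3g} years")
```

Every additional input doubles the test time, which is why exhaustive testing is impractical beyond a few dozen inputs.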
Economical Considerations (1)
• Average Quality Level (AQL):
$$aql = \frac{\#\,\text{DefectiveParts}}{\#\,\text{AcceptedParts}}$$

Economical Considerations (2)
• Correlation: fault coverage and defective parts

Economical Considerations (3)
• Correlation: fault coverage and defective parts
– DL (= AQL): defect level; fraction of defective circuits which have been classified as working correctly (testing with fault coverage T)
– Y: yield
– T: fault coverage
$$DL = 1 - Y^{\,1-T}$$
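Reading the defect-level formula as DL = 1 - Y^(1-T), the sketch below evaluates it for a few fault coverages. The yield of 50% and the coverage values are assumed example numbers, not data from the lecture.

```python
def defect_level(yield_fraction, fault_coverage):
    """Defect level DL = 1 - Y**(1 - T) for yield Y and fault coverage T (both 0..1)."""
    return 1.0 - yield_fraction ** (1.0 - fault_coverage)


ASSUMED_YIELD = 0.5
for coverage in (0.90, 0.99, 0.999):
    dl = defect_level(ASSUMED_YIELD, coverage)
    print(f"T = {coverage:.3f}: DL = {dl * 1e6:.0f} defective parts per million shipped")
```

The two limiting cases behave as expected: with complete fault coverage (T = 1) no defective part is shipped, and without any testing (T = 0) the defect level equals 1 - Y.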
Economical Considerations (3)
[Figure: defect level as a function of yield and fault coverage]

Design Flow: Testing (1)

Design Flow: Testing (2)
• Chip test after manufacturing:
– Manufacturing process
– Parametric test (current/power dissipation); erroneous chips are marked with colour dots and removed after sawing
– Chip test on tester

Fundamental Definitions
• Relationship between faults, errors and failures: fault → error → failure
• Fault: physical defect, imperfection or flaw which occurs in a hardware or software component
• Error: manifestation of a fault (erroneous information on a hardware line or in a program, caused by a fault)
• Failure: malfunction of a system
• Three-universe model of a system:
– Physical universe: faults
– Informational universe: errors
– External universe: failures
Fault Models (1)
• Basis: physical phenomena
– Oxide defects
– Missing implants
– Lithographic defects
– Junction defects
– Metal shorts & opens
– Moisture accumulation
– Impurities / contaminations
– Static discharge
• Examples of physical faults:

Fault Models (2)

Fault Models for Gates (1)
• The GATE model: stuck-at
– stuck @0
– stuck @1
– 1 fault at a time (single-stuck)
[Figure: PHYSICAL (analog) versus LOGICAL (digital)]

Fault Models for Gates (2)
• Issue: complexity
– as 1 model: 12 faults
– as 12 gates: 30 (collapsed) faults, 12x larger netlist, 30x computation
– as 60 transistors: 90 (collapsed) faults, 60 transistors, 400x computation

Fault Models for Gates (3)
• The controversy:
– IBM: comprehensive stuck-at testing; no empirical need for MOS fault models
– UNISYS: MOS model required for < 1% AQL
Fault Models for Gates (4)
• The MOS problem: Gates → Memory
• Example: the output floats
– Fault-free: C always driven
– Fault: C un-driven; assumes last value; sequential!
• Need 2-pattern test
– set C to opposite
– test
branch | Set A B | Test A B
a | 0 0 | 1 1
a | 0 1 | 1 1
a | 1 0 | 1 1
b | 1 1 | 0 1
c | 1 1 | 1 0
For branch a, any of the three set patterns works!
Fault Tolerant Design (1)
• Fault tolerance is achieved by redundancy techniques:
– Duplication with complementary logic
– Self-checking logic
– Reconfigurable array structures
[Figure: fault detection by duplication with complementary logic]

Fault Tolerant Design (2)
[Figure: 4-by-4 array with one spare column]

Fault Tolerant Design (3)
[Figure: reconfigured array]

Test Pattern Generation (1)
• manually
• pseudo-random (leads to up to 60% fault coverage)
• algorithmic
• special test patterns for RAMs
• Is the fault coverage sufficient? ⇒ fault simulation
Integrated Electronic Systems Lab20: Testing 705
The D-Algorithm (1)
• Every test generation procedure has to solve the following problems:
– Creation of a change at the faulty line
– Propagation of the change to the primary output line
• In the D-Algorithm the symbols D and D̄ are used to refer to the changes. D and D̄ are used as follows:
– D: used if a line has the value 1 in absence of a fault and the value 0 in case of a fault occurrence
– D̄: used if a line has the value 0 if no fault occurs and otherwise the value 1
• The D-algorithm method for path sensitization consists of two principal phases:
– forward drive (propagation) of a D-value to a primary output
– backward trace (consistency operation)
• These two steps are iterated for different propagation paths for the D-value from one dedicated internal point i to one dedicated primary output point o until the backward trace phase is finished without any contradiction (a test vector for a fault at i has been found) or until all possible paths from i to o have been examined.
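The D and D̄ symbols can be modelled as pairs (fault-free value, faulty value). The following is a minimal sketch of this idea for a two-input NAND gate, not the lecture's own implementation; the class name `Sig` and the helper names are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Sig:
    """A line value as a pair: (value without fault, value with fault)."""
    good: int
    faulty: int

    def __str__(self):
        names = {(0, 0): "0", (1, 1): "1", (1, 0): "D", (0, 1): "D'"}
        return names[(self.good, self.faulty)]


ZERO, ONE = Sig(0, 0), Sig(1, 1)
D = Sig(1, 0)      # 1 in the fault-free circuit, 0 in the faulty circuit
D_BAR = Sig(0, 1)  # 0 in the fault-free circuit, 1 in the faulty circuit


def nand(a, b):
    """Evaluate a NAND gate separately for the fault-free and faulty circuit."""
    return Sig(1 - (a.good & b.good), 1 - (a.faulty & b.faulty))


def carries_fault(sig):
    """True if the fault is still visible on this line (value is D or D')."""
    return sig.good != sig.faulty


# A D on one input propagates only if the other input is non-controlling (1).
print(nand(D, ONE), carries_fault(nand(D, ONE)))
print(nand(D, ZERO), carries_fault(nand(D, ZERO)))
```

Setting the side input to 1 corresponds to the forward drive phase; finding primary input values that justify this 1 is the task of the backward trace.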
Integrated Electronic Systems Lab20: Testing 706
The D-Algorithm (2)
Basic concept of D-algorithm
Integrated Electronic Systems Lab20: Testing 707
The D-Algorithm (3)
• A primitive D-cube of a failure is a D-cube associated with a fault l/α on the output line l of a gate G. This produces the value D or D̄ on l, and the input lines have values which would produce ᾱ in the fault-free case.
Primitive D-cube of fault (pdcf) for two-input NAND gate
Integrated Electronic Systems Lab20: Testing 708
The D-Algorithm (4)
• A propagation D-cube of a failure specifies the propagation of changes at one (or more) inputs of a gate G to its output l.
Propagation D-cube (pdc) for two-input NAND gate
Integrated Electronic Systems Lab20: Testing 709
The D-Algorithm (5)
• A singular cover of a gate G is a 0, 1, X truth table representation of G.
Singular cover for two-input NAND gate
Integrated Electronic Systems Lab20: Testing 710
The D-Algorithm (6)
Singular covers for several basic logic gates
Integrated Electronic Systems Lab20: Testing 711
The D-Algorithm (7)
Construction of the singular cover of a logic module
Integrated Electronic Systems Lab20: Testing 712
D-Algorithm Example (1)
• In the following the D-Algorithm is illustrated for the example circuit given below:
Integrated Electronic Systems Lab20: Testing 713
D-Algorithm Example (2)
Propagation D-cube table
Integrated Electronic Systems Lab20: Testing 714
D-Algorithm Example (3)
Singular cover table
Integrated Electronic Systems Lab20: Testing 715
D-Algorithm Example (4)
D-cube intersection table
Integrated Electronic Systems Lab20: Testing 716
D-Algorithm Example (5)
• Running the D-Algorithm for generating a test for line 5/0:1) Start with D-cube for the fault 5/0:
2) The D of line 5 is automatically propagated to line 6 and 7 by cube j
3) Now the propagation along path 6 → 9 → 11 is considered: D on line 6 is propagated to line 9 by cube d. Combining d and k yields cube l:
Integrated Electronic Systems Lab20: Testing 717
D-Algorithm Example (6)
• Running the D-Algorithm (continued):
4) If cube i is used with D̄ instead of D, the propagation to the output can be done:
5) Now the consistency phase is started and a value for line 4 has to be found. From the singular cover table it can be seen that a 0 on line 10 implies both line 7 and line 8 to be 1. In cube m line 7 is a D (and also line 5, which is connected to 7 by j), and this D must now be set to 1, which is a contradiction that disables the path sensitization 5 → 6/7 → 9 → 11.
Integrated Electronic Systems Lab20: Testing 718
D-Algorithm Example (7)
• Running the D-Algorithm (continued):
6) Starting the propagation along 5 → 7 → 10 → 11 leads to the following cube:
7) From the singular cover table we get the information that a 1 on line 8 is the same as a 0 on line 4. Additionally, it can be seen that the 0 on line 9 can be obtained by a 1 on line 1.
8) This yields the final cube:
1 1 1 0 D D D 1 0
9) A test vector for line 5/0 is given by:
1 1 1 0
D D
Integrated Electronic Systems Lab20: Testing 719
Fault Simulation
• Algorithms: Serial Fault Simulation
• Improved Algorithms:– Parallel Fault Simulation
– Concurrent Fault Simulation
discussed in CAD lecture
Integrated Electronic Systems Lab20: Testing 720
Design for Testability (1)
• Circuit level: restriction of physically possible faults
• Logic level: restrict possibilities of realizations
• System level: restrict size of component and number of states
Testability:
• controllability
• observability
• additional chip area required
• shorter design cycle
Methods to improve controllability and observability:
• ad-hoc techniques
• structured approaches
Integrated Electronic Systems Lab20: Testing 721
Design for Testability (2)
Design for testability: complex gate (a) not testable with stuck-at model; (b) fully testable with stuck-at model
Integrated Electronic Systems Lab20: Testing 722
Design for Testability (3)
• Ad-Hoc Techniques:– developed for special design
– less silicon area
– design automation almost impossible
– partitioning (test of circuit components by use of dedicated multiplexers)
Integrated Electronic Systems Lab20: Testing 723
Design for Testability (4)
Ad-hoc techniques: partitioning for testability
Integrated Electronic Systems Lab20: Testing 724
Design for Testability (5)
Ad-hoc techniques: insertion of registers in order to limit logic depth to a given maximum value
Integrated Electronic Systems Lab20: Testing 725
Design for Testability (6)
Ad-hoc techniques: test shift registers for PLA test (increasing PLA area)
Integrated Electronic Systems Lab20: Testing 726
Scan-Path Methods (1)
• Main idea: test of sequential network is reduced to test of combinational network
• for circuits consisting of logic with some feedbacks
• can be realized by reconfiguration of latches as shift registers (two modes of use)
Feedback logic with scan-path
Integrated Electronic Systems Lab20: Testing 727
Scan-Path Methods (2)
• Test scan-path / register function first:
– Flush test ( 0...010...0 ) or
– Shift test ( 00110011... ) (each register transfer is tested by this combination: 0 → 0, 0 → 1, 1 → 1, 1 → 0 ).
• Cycle for testing combinational logic function:1) Scan mode: Preload Y and set PI
2) System operation mode: Wait until inputs of Y are steady. Clock new state into Y.
3) Shift state out. Compare PO and state values with expected responses.
Integrated Electronic Systems Lab20: Testing 728
Scan-Path Methods (3)
• Advantages:
– Testability of clocked circuits is improved and guaranteed at design stage
– Consistent with good VLSI design practice (rules, abstraction, modularity, ...)
– Does not require special CAD
• Disadvantages:
– Wastes silicon
– Constrains designer to design according to given conditions
– Additional complexity
• Overhead:
– ≈ 2% for a fundamentally ‘structured’ design
– ≈ 30% for ‘wild’ logic
Integrated Electronic Systems Lab20: Testing 729
Built-In Tests (1)
• System generates test vectors on its own
• Analysis and evaluation of test vectors is also done automatically
• Compromise: silicon area ↔ testability
Test Pattern Generators:
• Test patterns are generated inside the circuit to be tested
• Short design time, simple test programs, self-test
• Example: Test pattern memories, deterministic generators, counter
Integrated Electronic Systems Lab20: Testing 730
Built-In Tests (2)
Two examples for built-in test pattern generators
Integrated Electronic Systems Lab20: Testing 731
Built-In Tests (3)
• Pseudo Random Number Generators:
– used as pseudo random pattern generator
$K(x) = k_n x^n + k_{n-1} x^{n-1} + \dots + k_1 x^1 + k_0$
$x_1(t) = \sum_{i=1}^{n} k_i \cdot x_i(t-1) \pmod 2$
$x_i(t) = x_{i-1}(t-1)$ for $2 \le i \le n$
Integrated Electronic Systems Lab20: Testing 732
Built-In Tests (4)
• Pseudo Random Number Generators:
– Example for pseudo random pattern generator:
$K(x) = x^4 + x + 1$
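The shift-register recurrence for this example can be simulated directly. This is a sketch under the assumption that the feedback taps are the stages i with k_i = 1 for i ≥ 1, i.e. stages 1 and 4 for K(x) = x^4 + x + 1, and that stage 1 receives the mod-2 sum while all other stages shift by one position. The function name and the seed are chosen for illustration only.

```python
def lfsr_states(taps, seed):
    """Return the list of states visited before the register state repeats.

    taps: 1-based stage indices i with k_i = 1 (feedback taps).
    seed: initial state (x_1, ..., x_n); must not be all zeros.
    """
    state = tuple(seed)
    visited = []
    while state not in visited:
        visited.append(state)
        feedback = 0
        for i in taps:
            feedback ^= state[i - 1]        # mod-2 sum of the tapped stages
        state = (feedback,) + state[:-1]    # x_1 <- feedback, x_i <- x_(i-1)
    return visited


states = lfsr_states(taps=[1, 4], seed=(1, 0, 0, 0))
for s in states:
    print("".join(str(bit) for bit in s))
print("period:", len(states))
```

With a four-stage register, a maximum-length sequence visits every state except all zeros, which is why the all-zero seed has to be avoided.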
Integrated Electronic Systems Lab20: Testing 733
Evaluation of Testing Data (1)
• Evaluation of testing results inside the circuit
• Counting techniques, signature analysis
Example: Counting techniques for test data evaluation
$F \approx 1 - \frac{1}{\sqrt{\pi \cdot m}}$
Integrated Electronic Systems Lab20: Testing 734
Evaluation of Testing Data (2)
• Signature analysis
– Communication technique: coding theory
– Code words: data stream D, polynomial P(x), division modulo 2
– Evaluation of testing data
$\frac{D}{P} = Q + \frac{R}{P}$
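The division modulo 2 can be sketched with integers whose bits are the polynomial coefficients. The data stream and the polynomial below are hypothetical values picked for illustration; the remainder R plays the role of the signature.

```python
def gf2_divmod(data, poly):
    """Divide data by poly over GF(2); return (quotient Q, remainder R)."""
    quotient, remainder = 0, data
    while remainder.bit_length() >= poly.bit_length():
        shift = remainder.bit_length() - poly.bit_length()
        quotient ^= 1 << shift
        remainder ^= poly << shift   # subtraction modulo 2 is XOR
    return quotient, remainder


def gf2_mul(a, b):
    """Multiply two polynomials over GF(2) (carry-less multiplication)."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        a <<= 1
        b >>= 1
    return result


data = 0b110101101   # hypothetical data stream D
poly = 0b10011       # hypothetical P(x) = x^4 + x + 1
q, r = gf2_divmod(data, poly)
print(f"Q = {q:b}, R (signature) = {r:04b}")
# Check the identity D = Q * P + R (addition modulo 2 is XOR)
print(gf2_mul(q, poly) ^ r == data)
```

A single flipped bit in the data stream changes the remainder, so the faulty stream is recognised by its different signature.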
Integrated Electronic Systems Lab20: Testing 735
Evaluation of Testing Data (3)
Example: Test data evaluation by signature analysis
Integrated Electronic Systems Lab20: Testing 736
Evaluation of Testing Data (4)
• Signature analysis: Degree of Fault Recognition
1) Length of sequence: $m$ bit → $2^m$ sequences possible
2) One sequence contains no faults → number of erroneous sequences is $2^m - 1$
3) Length of signature register: $n$ bit → $2^n$ signatures
4) $2^m$ sequences are mapped on $2^n$ signatures → number of non-detectable faults is: $\frac{2^m}{2^n} - 1 = 2^{m-n} - 1$
5) Probability of non-detection of an erroneous sequence: number of non-detectable faults divided by number of possible faults: $F_N = \frac{2^{m-n} - 1}{2^m - 1}$
6) Fault detection rate: $F = 1 - F_N = 1 - \frac{2^{m-n} - 1}{2^m - 1} \approx 1 - 2^{-n}$
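The detection rate can be evaluated exactly with rational arithmetic, which avoids overflow for long sequences. This is a small sketch; the function name is an assumption, and the case m ≤ n is treated as full detection, in line with the counting argument above.

```python
from fractions import Fraction


def detection_rate(m, n):
    """Fault detection rate F for an m-bit sequence and an n-bit signature."""
    if m <= n:
        return Fraction(1)   # every sequence can have its own signature
    non_detectable = 2 ** (m - n) - 1
    erroneous = 2 ** m - 1
    return 1 - Fraction(non_detectable, erroneous)


for m in (20, 100, 1000):
    exact = float(detection_rate(m, 16))
    print(f"m = {m:4d}: F = {exact:.8f}, 1 - 2^-n = {1 - 2 ** -16:.8f}")
```

For long sequences the result depends almost only on the register length n, which is the approximation stated in step 6.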
Integrated Electronic Systems Lab20: Testing 737
Evaluation of Testing Data (5)
• Interpretation:
– all faults recognized if m < n (trivial)
– long sequences: only n is important
– n = 16 bit → F = 99.9985%
• Parallel signature register with k inputs:
$F = 1 - \frac{2^{mk-n} - 1}{2^{mk} - 1}$
Integrated Electronic Systems Lab20: Testing 738
Built-in Logic Block Observation (1)
• A BILBO register is a universal element for use in either a scan-path environment or a self-test (signature analysis) environment.
BILBO register: 1. full circuit, 2. normal use, 3. scan-path, 4. signature analysis
Integrated Electronic Systems Lab20: Testing 739
Built-in Logic Block Observation (2)
• Advantages:
– Versatility
  • Normal operation
  • Scan-path test: enhances testability
  • Test vector generation via LFSR
  • Data compression via LFSR
  • Combined scan-path/self-test using LFSRs
• Disadvantages:
– silicon area
  • BILBO latch can be ≈ 50% larger than ordinary latch
Integrated Electronic Systems Lab20: Testing 740
Built-in Logic Block Observation (3)
Example: Self-testing circuit
feedback disconnect: open in test mode
Test Clock
For clarity, mode control lines, normal system clocks, and preset/clear facilities have been omitted
binary up-counter
decoder
pass gate
red LED,
green LED
go / no go output
Integrated Electronic Systems Lab
21. Future Trends:
Design of robust Circuits and Systems under Consideration of Reliability
Constraints
Integrated Electronic Systems Lab21: Future Trends 742
• Introduction and Definitions
• Reliability Challenges for nano-scaled CMOS Technologies
• Reliability Challenges for Technologies based on new Material Classes: Printed Electronics
• Conclusions and Outlook
Overview
Integrated Electronic Systems Lab21: Future Trends 743
Basic Definitions
• Reliability:
... is the ability of a system or a component to perform its required functions under stated conditions for a specified period of time (IEEE)
• Robustness:
Robustness is the quality of being able to withstand stresses, pressures, or changes in procedure or circumstance. A system, organism or design may be said to be "robust" if it is capable of coping well with variations (sometimes unpredictable variations) in its operating environment with minimal damage, alteration or loss of functionality. (Wikipedia)
• Reliability (Zuverlässigkeit):
... of a technical product is a property (behavioural characteristic) that indicates how dependably a function assigned to the product is fulfilled within a time interval. It is subject to a stochastic process and can be described qualitatively or quantitatively (by the survival probability); it cannot be measured directly. (Wikipedia)
• Robustness (Robustheit):
... is the property of a system or method to still function reliably even under unfavourable conditions (Wikipedia)
Integrated Electronic Systems Lab21: Future Trends 744
Reliability: Devices, Components, Systems
• Technology Issue: solve reliability problems in new technologies; adequate technology modeling
• Device Issues: appropriate device models; device and circuit simulation; robust circuit design
• Circuit Design Issue: cope with limited device reliability >> device tolerant design techniques
Device + Device + Device + ... = Component
$R_{C_j} = \prod_{i=1}^{N} R_{D_i}$
Component + Component + Component + ... = System
$R_{S_k} = \prod_{j=1}^{M} R_{C_j}$
Example: $0.99^{10} = 90.4\%$, $0.99^{100} = 36.6\%$
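The product rule can be checked numerically. The sketch below assumes independent parts that all have to work (series reliability); the function name is chosen for illustration.

```python
import math


def series_reliability(reliabilities):
    """Reliability of a unit that works only if all of its parts work."""
    return math.prod(reliabilities)


# Each device has a reliability of 0.99
print(f"10 devices:  {series_reliability([0.99] * 10):.1%}")
print(f"100 devices: {series_reliability([0.99] * 100):.1%}")
```

The same function applies one level up: passing the component reliabilities yields the system reliability.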
Integrated Electronic Systems Lab21: Future Trends 745
Reliability: Devices, Components, Systems
• System Design Issue: flexible adaptive systems with masking capability for lower level deviations/defects
• Application Design Issue: select adequate manufacturing technologies, design techniques and system architectures
• Test / Quality Control Issue:test, if guaranteed system functionality is available
Source: NXP / Spoerle
Source: sees-project.net
Physics / Technology Models Test
Integrated Electronic Systems Lab21: Future Trends 747
Major Challenges in CMOS IC Design
• Power consumption and design robustness are contradictory in nature: designs for minimizing power consumption lead to reduced reliability
• Solution: joint optimization of power and reliability
Integrated Electronic Systems Lab21: Future Trends 748
Power Consumption
• Traditionally: the driving force behind technology changes (Bipolar, NMOS, CMOS) [Sakurai 2004 ISSCC]
• Currently: rapidly-growing power densities (90 nm and beyond)
– Causes: exponential growth in subthreshold and gate leakage currents
• Research for a mature low-power technology alternative to CMOS (10 ... 20 years):
– Single electron transistors
– Spin transistors
– Carbon nanotube FETs
– Ferromagnetic logic devices
Integrated Electronic Systems Lab21: Future Trends 749
Major Challenges
• Inherent Tradeoff: Critical Delay
– Initially: many noncritical delay paths (initial path delay distribution)
– Power optimization: distribution pushed towards the initial critical path delay (timing wall)
• (Near-)Critical Paths:
– affect the yield due to variability (process variability) and power (particularly leakage)
– require the most additional EDA investment
– Intersection: the most efforts from the CAD community
[Sylvester 2007 ProcIEEE]
Integrated Electronic Systems Lab21: Future Trends 750
Dynamic Power reduction:
$P_{dyn} = \sum_{i=1}^{N_{tot}} \alpha_i C_i V_{dd}^2 f$ (plus short circuit current)
– $\alpha_i$: switching probability
– for each node i, not straightforward to determine $\alpha_i$ and $C_i$
• Gate Sizing: linear power reduction; convex problem to be solved (polynomial time); enhanced standard cell libraries
• Clustered Voltage Scaling (CVS): quadratic power reduction; but: delay penalty
– Dual VDD approaches in most cases (power supply overhead!) [Usami 98 JSSC]
Integrated Electronic Systems Lab21: Future Trends 751
Static Power Minimization
• Static Power: to be considered in active mode and standby
– Has become a significant contributor to the total power budget
– Particularly a problem for mobile applications
• Leakage Current:
– Affected by the input-vector probability: Stack Effect
– Subthreshold leakage: the dominant contribution
– Gate leakage: relevant in 65 nm and beyond (use high-K)
[Actel]
Integrated Electronic Systems Lab21: Future Trends 752
Static Power: Active Mode Leakage Reduction
• Multi-Vth Assignment:
– Analogous to dual-Vdd assignment, but for leakage power
– Optimal choice of Vth values (optimization problem): $V_{th,High} - V_{th,Low} \approx 10\% \cdot V_{DD}$
– Exponential dependence of leakage current on Vth
– Implementation: post-layout
– Sensitive to Vth variations
• Effective Gate Length (Leff) Biasing:
– introduce longer-than-minimum channel lengths (max 10%)
– very small delay and power penalties
– substantial reduction in leakage
– 54% less worst-case variability! [Gupta 2004 DAC]
Integrated Electronic Systems Lab21: Future Trends 753
Standby Mode Leakage Reduction
[Macii 2007 CLEAN Ws]
• Input Vector Controlling (IVC)
– Uses the stack effect to reduce leakage
– Force gates to a low leakage state
– Only a few nodes in a design can be assigned to a given state
– Hard problem: determine the state that should be forced (heuristics, random sampling)
– leakage reduction up to 20% [1999 Johnson TransCAD]
• High-Vth Sleep Transistors
– very large
– area and delay penalties
• Body Biasing:
– Reverse body biasing: worse short-channel effects [Keshavarzi 2002 ISVLSI]
– Current implementations: Forward BB to lower Vth during active mode operation
• Combination of IVC and Dual-Vth:
– Up to 5x more leakage savings than IVC alone! [Lee 2005 TransCAD]
Figure: Vth (V) versus Vbs (V)
Integrated Electronic Systems Lab21: Future Trends 754
Quantifying the Tradeoff
• Parametric Yield given Timing and Power Constraints: two-sided yield constraint
• Major Concern: yield loss due to power constraints violation
– Leff variations affect delay and leakage, which are inversely correlated (opposite sensitivities to Leff)
– Dynamic power depends on Leff sublinearly, leakage power exponentially!
[Sylvester 2007 ProcIEEE]
Integrated Electronic Systems Lab21: Future Trends 755
Total Power Optimization under Variability
• Dynamic Power:
– Linear dependence on process parameters
– Variations in dynamic power: ~ same range as process parameters
• Leakage Power:
– Exponential dependence on process parameters
– Significantly higher leakage current variations
• Efficient methods are required for statistical analysis and optimization of leakage power
• Combined approaches: Dual-Vdd/Vth: improvements of 15%-45% in total power
• Interconnect design is another important issue!
Integrated Electronic Systems Lab21: Future Trends 757
Materials Sciences
Electronics
TU Darmstadt: Research Center for Printed Electronics
Chemistry
Printing Technology
• Advanced Materials Synthesis
• Materials Optimization
• Materials Characterization
• Circuit Design
• Antenna Design
• Device Modeling
• Device Testing
• Printing, Processing
• Quality Management
Research Topics
[Source: PolyIC]
Application Scenario:
Printed RFID
Integrated Electronic Systems Lab21: Future Trends 758
Manufacturing Technology>> Printing Technology
Materials Research
Circuit Design
Applications
Device & Process Models
– Materials– Printing– Modeling & Design
Technology & Design
Research Center for Printable Electronics
TUD MerckLab:
Joint University / Industry Research Lab
Integrated Electronic Systems Lab21: Future Trends 759
Mixed Level/Domain Models based on Verilog-A: UHF RFID system
• Modeled Components: – Reader– Wireless Channel– Transponder
• Mixed Wave Domain (s-Parameter) and Circuit Modelling
Integrated Electronic Systems Lab21: Future Trends 760
Circuit-level Simulation and Design of a RFID transponder: Rectifier
• Rectifier– Three-stage modified Villard rectifier
– LC matching network
• Rectifier impedance evaluation: $\hat{V}_{in} = 0.5\,\mathrm{V}$, $V_{+} = 1.5\,\mathrm{V}$
– Simulation result: 2 kΩ
Integrated Electronic Systems Lab21: Future Trends 761
RFID Reader Technology: 13.56 MHz Interrogator
Analog FrontEnd
Lantronix XPort
Xilinx Spartan3 FPGA Board
Antenna
Integrated Electronic Systems Lab21: Future Trends 763
Future Directions in IC Design
• Multiple Cores
– Particularly interesting: nonuniform cores (different supply voltages and different power/performance ratios)
– Dedicated hardware accelerators for very low voltages
• Interconnect Design Trends
– Problem: shrinking wires >> larger delays
– Solution requirements:
• Meet stringent timing and signal integrity requirements
• Reduce both static and dynamic power
– Currently: aggressive shielding to avoid highly inductive lines
– Future: improved signaling techniques: low-swing, pulsed signaling, ultra high-speed serial lines, bus encoding
– Global wiring optimization for low power rather than performance
– Adaptive SoC top-level NoC-based interconnection architectures
[IBM Cell Processor]
Integrated Electronic Systems Lab21: Future Trends 764
Future Directions in IC Design
Robust design strategy
Generality and applicability to many optimization tools
Closely related CAD and technology improvements!
• Advanced circuit modeling and characterisation approaches required (simulation)
• New standard cell design approaches based on reliability criteria
• Usage of assertion based verification techniques on component level
• CAD/Design: Multiobjective Optimization(static and dynamic power, performance, yield)
– Parametric yield should be the objective of CAD flows
• Not simply: timing, power, area, ...
– Possible approaches:
• Use SSTA (statistical static timing analysis) with current optimization engines
• Use fast deterministic analysis with variation space sampling [Sylvester 2007]
Integrated Electronic Systems Lab21: Future Trends 765
Conclusions
• NanoScale CMOS:
– Power is the key limiter of Moore's law [Sylvester ProcIEEE 2007]
– Design Goals: low-power and robustness (parametric yield)
– Power and robustness have to be considered on all levels of the design flow
– New CAD techniques for multi-objective optimization needed
– Design of adaptive circuits required (adaptive body biasing has been successful)
– Signal transmission one of the central future challenges (smart repeaters, pulses)
• New Technologies (e.g. printed organic/inorganic electronics):
– Reliability challenge for new manufacturing technologies
– Multi-level and multi-domain modeling required for optimized circuit design
– Realistic physical and design oriented modeling and characterisation of devices
– Technology modeling
Integrated Electronic Systems Lab21: Future Trends 766
Thank you!
Integrated Electronic Systems Lab
Advanced Digital Integrated Circuit Design
Exercises
Integrated Electronic Systems Lab
Advanced Digital Integrated Circuit Design
1. Exercise: Short Channel MOSFETs
Integrated Electronic Systems Lab
1. Exercise: Short Channel MOSFETs 769
1. Problem: Short Channel MOSFETs
• Complete the table on the next slide (calculate K')
• What is the value of κ for a long channel MOSFET?
• Estimate the drain current IDS for both MOSFETs in ohmic region using the classical expression and using the velocity saturation effect. Compare both results by calculating the percentage of error between the results.
• Calculate the value of VDSAT and compare it with the classical assumption that the device enters saturation when VDS ≥ VGS-VT0
• Find an expression for the on-resistance of short channel devices and estimate the on-resistance for both devices.
Integrated Electronic Systems Lab
1. Exercise: Short Channel MOSFETs 770
1. Problem: Short Channel MOSFETs
Given the following parameters:
        VT [V]   K' [A/V²]   µ [cm²/Vs]
NMOS     0.4                 µn = 1.15*10^4
PMOS    -0.4                 µp = 3.00*10^3
COX = 10^-8 F/cm²
|VGS| = 0.6V
|VDS| = 0.1V
L = 0.25µm
W = 0.75µm
EC = 1.5*10^6 V/m
Integrated Electronic Systems Lab
1. Exercise: Short Channel MOSFETs 771
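1. Problem: Short Channel MOSFETs
Formulas:
$\kappa(V_{DS}) = \frac{1}{1 + V_{DS}/(E_C L)}$
$I_{DS}(V_{DS}) = \kappa(V_{DS}) \, \mu \, C_{OX} \frac{W}{L} \left[ (V_{GS} - V_T) V_{DS} - \frac{V_{DS}^2}{2} \right]$
$R_{on} = \left. \frac{1}{g_{DS}} \right|_{V_{DS} \to 0}$
$g_{DS} = \frac{\partial I_{DS}}{\partial V_{DS}}$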
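The formulas for the ohmic region can be evaluated with a few lines of code. This is a sketch, not a model solution: the function names are assumptions, all quantities are taken in SI units, and the example call uses the NMOS values from the parameter slide converted to SI (µn in m²/Vs, COX in F/m²).

```python
def kappa(v_ds, e_c, length):
    """Velocity saturation factor kappa(V_DS) = 1 / (1 + V_DS / (E_C * L))."""
    return 1.0 / (1.0 + v_ds / (e_c * length))


def i_ds_ohmic(mu, c_ox, width, length, v_gs, v_t, v_ds, e_c=None):
    """Drain current in the ohmic region.

    With e_c=None the classical expression is used; otherwise the
    classical current is scaled by kappa(V_DS).
    """
    classical = mu * c_ox * (width / length) * ((v_gs - v_t) * v_ds - v_ds ** 2 / 2)
    if e_c is None:
        return classical
    return kappa(v_ds, e_c, length) * classical


# NMOS values from the parameter table, converted to SI units
nmos = dict(mu=1.15e4 * 1e-4, c_ox=1e-8 * 1e4, width=0.75e-6, length=0.25e-6,
            v_gs=0.6, v_t=0.4, v_ds=0.1)
classical = i_ds_ohmic(**nmos)
short_channel = i_ds_ohmic(**nmos, e_c=1.5e6)
print(f"classical:           {classical * 1e6:.3f} uA")
print(f"velocity saturation: {short_channel * 1e6:.3f} uA")
print(f"relative difference: {(classical - short_channel) / short_channel:.1%}")
```

For a long channel, E_C·L becomes large compared with V_DS, so kappa approaches 1 and both expressions agree.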
Integrated Electronic Systems Lab
Advanced Digital Integrated Circuit Design
2. Exercise:
NMOS and CMOS Inverters
Integrated Electronic Systems Lab
2. Exercise: NMOS and CMOS Inverters 773
1. NMOS Inverter
Assume three types of NMOS inverters:
a) with resistive load
b) with enhancement MOSFET load
c) with depletion MOSFET load
a) resistive load b) enhancement load
c) depletion load
Iout Iout Iout
Integrated Electronic Systems Lab
2. Exercise: NMOS and CMOS Inverters 774
1. NMOS Inverter
Draw the simplified pull-up characteristic of the three types of NMOS inverters shown before.
Use the appended diagram “Pull-Up-Characteristics” for this purpose.
Assume
VDD = 5V,
RL = 20 kΩ,
VT,enh = 1V,
VT,dep = -1V,
λ = 0
The short-circuit current of both inverters with active load is
IQ = 0.2mA
Neglect short channel and body effects of the transistors.
Integrated Electronic Systems Lab
2. Exercise: NMOS and CMOS Inverters 775
1. NMOS Inverter
The next appended diagram shows the output characteristics of the driver transistor QS.
The low-state output voltage should not exceed 0.8V. Determine graphically, for an input voltage of 2.5V and 3V, how much current the NMOS inverter can sink if:
• a load resistor $R_L = 20\,k\Omega$ is used,
• a depletion transistor with $I_Q = 0.2\,mA$ is used, neglecting body and short channel effects
Integrated Electronic Systems Lab
2. Exercise: NMOS and CMOS Inverters 776
1. NMOS Inverter
For the NMOS inverter with saturated enhancement load, the voltage transfer characteristics should be estimated.
Use the appended diagram “Determination of VTC” to determine the Voltage Transfer Characteristic (VTC) of the NMOS inverter with saturated enhancement load graphically. Draw the VTC in the empty diagram “VTC of NMOS-Inverter”.
Integrated Electronic Systems Lab
2. Exercise: NMOS and CMOS Inverters 777
1. NMOS Inverter
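This inverter is characterized by the following parameters:
$V_{DD} = 5\,\mathrm{V}$, $2\phi_F = 0.6\,\mathrm{V}$, $\gamma = 0.37\,\sqrt{\mathrm{V}}$, $V_{T0} = 1.0\,\mathrm{V}$, $K_R = \beta_R = \beta_1 / \beta_2 = 8$
• Calculate VOH
• Calculate VOL
• Calculate VIH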
Integrated Electronic Systems Lab
2. Exercise: NMOS and CMOS Inverters 778
1. NMOS Inverter
Hints:
• The body effect (influence of the bulk-source voltage) of the load transistor must be taken into account when determining its threshold voltage. Therefore the following equation for the threshold voltage can be used:
$V_{TH} = V_{T0} + \gamma \left( \sqrt{|2\phi_F| + V_{SB}} - \sqrt{|2\phi_F|} \right)$
• An equation of type x = f(x) can be solved numerically by starting at any value for x and iteratively calculating f(x) until the result reaches the desired precision.
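The iteration described in the second hint can be sketched as follows. The example assumes that, for the saturated enhancement load, the high output level satisfies V_OH = V_DD − V_TH(V_SB = V_OH), because the source of the load transistor sits at the output node while its bulk is grounded. This relation, the function names and the stopping tolerance are assumptions made for illustration.

```python
import math


def threshold(v_sb, v_t0=1.0, gamma=0.37, two_phi_f=0.6):
    """Threshold voltage including the body effect."""
    return v_t0 + gamma * (math.sqrt(two_phi_f + v_sb) - math.sqrt(two_phi_f))


def fixed_point(f, x0, tol=1e-9, max_iter=100):
    """Solve x = f(x) by repeated substitution."""
    x = x0
    for _ in range(max_iter):
        x_new = f(x)
        if abs(x_new - x) < tol:
            return x_new
        x = x_new
    raise RuntimeError("fixed-point iteration did not converge")


V_DD = 5.0
v_oh = fixed_point(lambda v: V_DD - threshold(v), x0=V_DD)
print(f"V_OH = {v_oh:.3f} V, V_TH at this point = {threshold(v_oh):.3f} V")
```

The iteration converges quickly here because the right-hand side changes only slowly with the output voltage.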
Integrated Electronic Systems Lab
2. Exercise: NMOS and CMOS Inverters 779
2. VIL and VIH for a CMOS Inverter
A CMOS process is characterized by the following parameters:
$V_{T0n} = +0.8\,\mathrm{V}$, $\beta_n = 40\,\mu A/V^2$
$V_{T0p} = -0.8\,\mathrm{V}$, $\beta_p = 40\,\mu A/V^2$
• Calculate the values of VIL and VIH for a supply voltage VDD = 5V, 10V and 15V
• At which operation point does the current consumption of the inverter reach its maximum?
• Calculate the current consumption of the inverter at these supply voltages.
Integrated Electronic Systems Lab
Advanced Digital Integrated Circuit Design
3. Exercise: CMOS Inverter Technology
Integrated Electronic Systems Lab
3. Exercise: CMOS Inverter Technology 781
Problem 1
The figure below shows the layout of a CMOS inverter, whose dimensions are given in micrometers. The inverter is realized in an n-well CMOS process. The oxide capacitance is Cox = 69.1 nF/cm² for both n- and p-channel transistors. The drain-bulk and source-bulk depletion capacitances of the transistors are given by the following parameters:
                    NMOS     PMOS
C_j0   [fF/µm²]     0.0975   0.0298
C_jsw0 [fF/µm]      0.107    0.362
φ_0    [V]          0.879    0.939
φ_0sw  [V]          0.921    0.985
Integrated Electronic Systems Lab
3. Exercise: CMOS Inverter Technology 782
Although not explicitly shown in the figure, an overlap of 0.3 µm is assumed and must be included in calculations. The supply voltage is $V_{DD} = 5\,\mathrm{V}$.
a) Compute the maximum value of $C_{GDn}$ and $C_{GDp}$.
b) Determine the zero bias value of $C_{dbn}$ and $C_{dbp}$. Take the sidewall and the bottom regions into account separately:
$C_{bottom} = \frac{C_{j0} \cdot \text{area}}{\sqrt{1 + V_r/\phi_0}}$ ; $C_{sidewall} = \frac{C_{jsw0} \cdot \text{perimeter}}{\sqrt{1 + V_r/\phi_{0sw}}}$
c) Compute $K(V_L, V_H)$ for the inverter and herewith determine $C_{out}$, i.e. ignore the interconnecting wires and $C_G$.
$C_{db,average} = K(V_{OL}, V_{OH}) \cdot C_{db,max}$ ; average for V between $V_{OL}$ and $V_{OH}$
d) Compute $t_{HL}$ and $t_{LH}$ for the inverter, by using the value of $C_{out}$ determined above. Use the following parameters for the transistors:
NMOS: $V_{T0n} = +0.80\,\mathrm{V}$, $K'_n = 40\,\mu A/V^2$ ; PMOS: $V_{T0p} = -0.80\,\mathrm{V}$, $K'_p = 16\,\mu A/V^2$
Integrated Electronic Systems Lab
3. Exercise: CMOS Inverter Technology 783
Hints:
MOS Overlap Capacitors
MOS Gate Capacitances
Integrated Electronic Systems Lab
3. Exercise: CMOS Inverter Technology 784
MOS Gate Capacitances
1. Cutoff: no inversion layer channel
2. Nonsaturation: the channel shields the bulk electrode from the gate
3. Saturation: the channel is pinched off and does not contact the drain n+ region
Integrated Electronic Systems Lab
3. Exercise: CMOS Inverter Technology 785
MOS Gate Capacitances
Combination of the gate capacitances with the overlap contributions:
The Bulk Junction Capacitances
The total depletion capacitance of a pn junction
is given (considering the bottom and sidewall
regions) by:
where Vr is the magnitude of the reverse-bias voltage applied to the junction:
• For drain regions: Vr = VDB
• For source regions: Vr = VSB
Integrated Electronic Systems Lab
3. Exercise: CMOS Inverter Technology 786
An average depletion capacitance can be defined by:
where
Defining a dimensionless voltage factor
yields
Integrated Electronic Systems Lab
3. Exercise: CMOS Inverter Technology 787
Problem 2
The figure below shows the layout of two cascaded CMOS inverters, each stage being identical to the one analysed in problem 1. Capacitances and the connecting wires are now taken into account. Let $C_{p-f} = 0.0576$ fF/µm² and $C_{m-f} = 0.0345$ fF/µm².
a) Compute the metal-field capacitance from the output of the first stage to the metal-poly contact of the second stage. Consider only the metal-field regions, ignoring the regions in which metal overlaps n+, p+ or poly.
b) Determine the input capacitance of the second stage, as seen from the beginning of the poly line. Determine the sum $C_{out} + C_{line} + C_g$, using the value of $C_{out}$ calculated in problem 1. Is one of the two capacitances dominating?
Integrated Electronic Systems Lab
3. Exercise: CMOS Inverter Technology 788
Problem 3
Let’s consider a CMOS inverter with βn = βp = 35 µA/V2 and VT0n = 0.9V,
VT0p = -0.8V. The output capacitance is Cout = 125 fF and the supply voltage
is VDD = 5V.
a) Compute tHL and tLH for the inverter.
b) Determine the propagation delay time tp. You may assume an input voltage that has a rise or fall time of 0ns, i.e. the input signal goes immediately from 0V to 5V and vice versa.
Integrated Electronic Systems Lab
Advanced Digital Integrated Circuit Design
4. Exercise: CMOS and Pass Transistor Logic
Integrated Electronic Systems Lab
4. Exercise: CMOS and Pass Transistor Logic 790
Determine the logic function of the following NMOS circuits:
a)
b)
1. Problem: Logic Function Analysis
Integrated Electronic Systems Lab
4. Exercise: CMOS and Pass Transistor Logic 791
Synthesize the CMOS circuit for a parity generator with four inputs: $Z = A \oplus B \oplus C \oplus D$
3. Problem: Full Adder
Synthesize the CMOS circuit for a full-adder, which has the following truth table:
4. Problem: CMOS Logic
Implement the following function using static CMOS logic:
( ) ( )EDCABf ++=
2. Problem: CMOS Logic
Integrated Electronic Systems Lab
4. Exercise: CMOS and Pass Transistor Logic 792
The figure below shows an implementation with CMOS transmission gates of the function:
a) Build the equivalent multistage circuit with elementary gates (AND, OR, INV)
b) Implement the circuit as a Complex-Gate
c) Compare the transistor count. Point out the advantages and disadvantages of all three solutions
BSSAF +=
5. Problem: Transistor Count
Integrated Electronic Systems Lab
4. Exercise: CMOS and Pass Transistor Logic 793
6. Problem: Pass Transistor Logic
Implement the following function:
You may use 8 PMOS and 8 NMOS transistors respectively. The literals are available in both inverted and non-inverted form.
bcabcaacddcaF ′′+′′++′′=
Integrated Electronic Systems Lab
4. Exercise: CMOS and Pass Transistor Logic 794
7.Problem: Pass Transistor Logic
• Given are the following five logic functions, which are implemented in Pass Transistor Logic.
• Are these implementations correct?
– If not, under which condition of the input signals does the output not show the correct result?
– Hint: Take a look at the Karnaugh charts
– Try to draw the correct circuits
Integrated Electronic Systems Lab
4. Exercise: CMOS and Pass Transistor Logic 795
7. Problem: Pass Transistor Logic (cont)
cbacbaf ++=1
1f
b b c c
a
1a
cdcbdacf ++=2
2f
a a c c
b
1
d
Integrated Electronic Systems Lab
4. Exercise: CMOS and Pass Transistor Logic 796
cbacbdabcf3 ++=
3f
b b c c
a
d
bdadcbdbaf4 +++=
4f
b b d d
1c
a
a
a
7. Problem: Pass Transistor Logic (cont)
Integrated Electronic Systems Lab
4. Exercise: CMOS and Pass Transistor Logic 797
dcbacabf ++=5
5f
b b c c
a
a
d
7. Problem: Pass Transistor Logic (cont)
Integrated Electronic Systems Lab
5. Exercise: Dynamic Logic
Advanced Digital Integrated Circuit Design
Integrated Electronic Systems Lab 799
5. Exercise: Dynamic Logic
Problem 1: Dynamic Logic Full Adder
Draw the transistor level circuit of a dynamic ripple carry full adder, whose logic functions are the following.
$S_n = \overline{C_{n+1}} \cdot (A_n + B_n + C_n) + A_n \cdot B_n \cdot C_n$
$C_{n+1} = A_n \cdot B_n + C_n \cdot (A_n + B_n)$
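Before drawing the circuit, it can help to confirm that the two equations really describe a full adder. The following sketch assumes the carry is A·B + C·(A + B) and that the sum reuses the complemented carry, S = NOT(C_next)·(A + B + C) + A·B·C. The function names are illustrative.

```python
from itertools import product


def carry(a, b, c):
    """Carry out: C_next = A*B + C*(A + B)."""
    return (a & b) | (c & (a | b))


def sum_bit(a, b, c):
    """Sum: S = NOT(C_next)*(A + B + C) + A*B*C."""
    not_c_next = 1 - carry(a, b, c)
    return (not_c_next & (a | b | c)) | (a & b & c)


# Compare against integer addition for all eight input combinations.
for a, b, c in product((0, 1), repeat=3):
    total = a + b + c
    assert carry(a, b, c) == total // 2
    assert sum_bit(a, b, c) == total % 2
    print(a, b, c, "->", "S =", sum_bit(a, b, c), "C_next =", carry(a, b, c))
```

Reusing the carry in the sum expression is what makes this form attractive at transistor level: the sum stage needs only one additional complex gate.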
Problem 2: Charge Sharing
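The function Z of the inputs A, B, C, D, E and F (an AND-OR expression with four OR operators and one parenthesized group; the exact grouping is not recoverable from the source) must be implemented using domino logic. Could charge-sharing effects occur? If so, how can they be avoided?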
Problem 3: Charge Sharing
All input variables in the above circuit come from domino logic blocks, so that immediately after the precharge we have $A = B = C = D = F = 0\,\text{V}$.
For which possible 0 → 1 transitions does the charge-sharing effect have the greatest influence? The capacitances are $C_{X1} = C_{X2} = 10\,\text{fF}$ and $C_{out,1} = 185\,\text{fF}$.
Calculate the voltage $V_{out,1}$. Repeat the calculation for $C_{X1} = C_{X2} = 40\,\text{fF}$.
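The charge-redistribution arithmetic can be sketched as follows. The sketch assumes a worst case in which both internal nodes X1 and X2 are initially at 0 V and become connected to the precharged output node without a path to ground. Since the supply voltage is not given here, only the ratio V_out,1 / V_DD is computed. The function name is illustrative.

```python
def shared_voltage_ratio(c_out_fF, internal_caps_fF):
    """Return V_out / V_DD after charge sharing.

    Charge conservation: C_out * V_DD = (C_out + sum(C_X)) * V_out,
    assuming every listed internal node starts at 0 V.
    """
    return c_out_fF / (c_out_fF + sum(internal_caps_fF))


# Values from the problem statement (all in fF).
for c_x in (10.0, 40.0):
    ratio = shared_voltage_ratio(185.0, [c_x, c_x])
    print(f"C_X1 = C_X2 = {c_x} fF: V_out,1 / V_DD = {ratio:.3f}")
```

Multiplying the ratio by the actual supply voltage gives V_out,1. The larger internal capacitances pull the output noticeably further below V_DD.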
6. Exercise: Line Propagation Delay, Buffer Stages
Problem 1: Line Propagation Delay
Assume a polysilicon line with a length of l = 3 mm, a line resistance of r = 12 Ω/µm and a line capacitance of $c = 4 \cdot 10^{-4}$ pF/µm.
a) Calculate the delay of the line.
b) Insert a buffer with a delay of 3 ns. At which position must the buffer be inserted to minimize the total delay (line delay plus buffer delay)? Calculate this delay.
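A numerical sketch of both parts is given below. It assumes the distributed RC approximation t = k·r·c·l² with k = 1/2. The constant k depends on the delay model used in the lecture and is therefore exposed as a parameter. Function names are illustrative.

```python
def line_delay_ns(length_um, r_ohm_per_um, c_pf_per_um, k=0.5):
    """Distributed RC line delay k*r*c*l^2 (ohm * pF = ps, hence / 1000)."""
    return k * r_ohm_per_um * c_pf_per_um * length_um ** 2 / 1000.0


def best_buffer_position(length_um, r, c, buffer_ns, k=0.5):
    """Scan buffer positions in 1 um steps and return (position, delay)."""
    def total(x):
        return (line_delay_ns(x, r, c, k) + buffer_ns
                + line_delay_ns(length_um - x, r, c, k))
    x_best = min(range(length_um + 1), key=total)
    return x_best, total(x_best)


L_UM, R, C, T_BUF = 3000, 12.0, 4e-4, 3.0  # values from the problem statement
print("unbuffered line [ns]:", line_delay_ns(L_UM, R, C))
print("best buffer position [um], delay [ns]:",
      best_buffer_position(L_UM, R, C, T_BUF))
```

With k = 1/2 this gives about 21.6 ns for the unbuffered line and about 13.8 ns with the buffer at the midpoint. Because the delay is quadratic in the segment length, the midpoint is optimal for any value of k.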
Problem 2: Inverter Chain
Consider an inverter chain with M stages like the one depicted below:
• Assume that the inverters in the chain are symmetrical, i.e. the rise and fall times at each inverter output are equal. Furthermore, the gate capacitance of the NMOS transistor of the first stage is C1 = 6 fF. The line capacitances are negligible. The load capacitance is CL = 150 pF.
• Determine M and S so that the delay of the inverter chain is minimal. The output must not be inverted.
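The sizing can be sketched numerically under the common assumption that the chain delay is proportional to M·S, where S is the stage-to-stage scaling factor and S^M = CL / Cin. The input capacitance Cin of the first inverter is not stated directly. The example below assumes a PMOS twice as wide as the NMOS, i.e. Cin = 3·C1 = 18 fF, which is an assumption and not part of the problem statement. The function name is illustrative.

```python
import math


def best_even_chain(c_in, c_load):
    """Return (M, S) minimizing M * S for an even number of stages M.

    The unconstrained optimum is M = ln(c_load / c_in) with S = e; an even
    M keeps the output non-inverted, so both even neighbours are compared.
    """
    ratio = c_load / c_in
    m_ideal = math.log(ratio)
    lower = max(2, 2 * math.floor(m_ideal / 2))
    upper = max(2, 2 * math.ceil(m_ideal / 2))
    m = min({lower, upper}, key=lambda stages: stages * ratio ** (1 / stages))
    return m, ratio ** (1 / m)


C_IN = 18e-15     # assumed: NMOS 6 fF plus PMOS 12 fF (hypothetical width ratio)
C_LOAD = 150e-12  # load capacitance from the problem statement
stages, scale = best_even_chain(C_IN, C_LOAD)
print("M =", stages, "S =", round(scale, 3))
```

The delay minimum is shallow, so neighbouring even stage counts differ only slightly in delay. In practice the smaller chain may be preferred for area reasons.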
7. Exercise: Gate-Matrix, Stick-Diagrams, Euler Graphs
Problem 1: Full adder - Stick Diagram
Consider a full adder with the input signals A, B and Cin. The outputs are S and Cout.
A) Draw the truth table for the full adder and determine the equations for S and Cout.
B) Draw the stick diagram of the full adder.
Problem 2: Barrel Shifter
Draw the stick diagram of a barrel shifter for a 4-bit word with shift amount n ∈ {0, …, 3}. Each shift amount has its own shift-enable input. Assume that these inputs are properly driven by a decoder, i.e. only one input can be enabled at a time.
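A behavioural model can help to check the connectivity of the pass-transistor array before drawing it. The sketch below assumes that the shifter rotates the word (bits shifted out re-enter on the other side) and that the enable lines are one-hot. Both the rotation direction and the function name `barrel_shift` are assumptions made for illustration.

```python
def barrel_shift(word, enables):
    """Rotate a 4-bit word by the position of the single active enable.

    word    : list of 4 bits, index 0 is the least significant bit
    enables : one-hot list of 4 shift-enable lines; enables[n] selects shift n
    """
    if len(word) != 4 or len(enables) != 4:
        raise ValueError("expected a 4-bit word and 4 enable lines")
    if sum(enables) != 1:
        raise ValueError("exactly one shift-enable line must be active")
    n = enables.index(1)
    # Output column i is connected to input (i + n) mod 4 by one pass transistor.
    return [word[(i + n) % 4] for i in range(4)]


print(barrel_shift([1, 0, 0, 0], [0, 1, 0, 0]))
```

Each (output, enable) pair corresponds to exactly one pass transistor, so a 4-bit shifter of this kind needs a 4 × 4 transistor array.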
Problem 3: Gate-Matrix Method
The figure below shows an implementation with CMOS transmission gates of the function $F = A \cdot S + S \cdot B$ (one of the two S literals is complemented; the overbar is lost in the source).
a) Build the equivalent multi-stage circuit with elementary gates (AND, OR, INV).
b) Compare the transistor counts. Show the advantages and disadvantages of both solutions.
c) Implement the circuit from a) using the gate-matrix technique. Draw the corresponding stick diagram.
Problem 4: Euler Graphs
Given the following function:
a) Show the transistor level circuit implemented using static CMOS logic.
b) Build the optimal layout using the Euler graph method.
1) Show the complex-gate implementation
2) Modify the circuit so that applying the Euler graph method yields the optimal result
3) Determine the Euler path for the graph reduction and the subsequent graph expansion
c) Draw the layout as stick-diagram.
$F = f(i_1, \dots, i_8)$ (AND-OR expression with two parenthesized groups; the exact grouping is not recoverable from the source)
8. Exercise: PLA Structures
Problem 1: PLA - Stick diagram
Draw the stick diagram of an NMOS PLA that implements a full adder stage. The input and output registers are clocked using φ1 and φ2, respectively.
Problem 2: FSM implementation with PLA
Design a traffic light controller for the crossroads shown below and implement it with a PLA. The farm road has sensors for detecting waiting cars.
There is also a timer available, which is triggered by the rising edge of a ‘Start’ signal and provides two output signals:
TShort - during the yellow phase
TLong - for timing the green phase
[Figure: timer timing diagram with the signals Start, TShort and TLong]
S - Signal when a car is on the farm road
TL - Timer signal for green (active low)
TS - Timer signal for yellow (active low)
HG - Highway green state
HY - Highway yellow state
FG - Farm road green state
FY - Farm road yellow state
First, draw the schematic of the controller, showing the PLA, the timer and the traffic lights.