Upload
others
View
2
Download
0
Embed Size (px)
Citation preview
1
Variability in Microprocessor Logic Design:
Trends, Sources, Consequences, & Solutions
Keith A. Bowman
Circuit Research Lab, Intel
Acknowledgements:
Jim Tschanz, Vivek De, Tanay Karnik, and Steve Duvall
June 7, 2010
UPC Seminar
2
Problem Statement:
� Variability is one of the primary challenges in the semiconductor industry
� Adversely impacts performance, power, yield, reliability, & time-to-market
Focus Areas:
1) Impact of variations on logic design
2) Variation-tolerant circuits
3) Tomorrow: Resilient microprocessor design for dynamic variation tolerance
3
Outline• Motivation & Technology Trends
• Sources of Variability
• Static Variations:
� Impact on Logic Design
� Variation-Tolerant Circuits
• Dynamic Variations:
� Impact on Logic Design
� Variation-Tolerant Circuits
• Summary
4
Moore’s Law Continues…
� Microprocessors with multi-billion transistors…
� Trillion instructions per second performance…
� Constant power envelope…
� Lower costs…
0.001
0.01
0.1
1
10
100
1000
10000
1970 1980 1990 2000 2010 2020
Transistor Count (106)
8080
8088
286386
486Pentium®
Pentium® II
Pentium® 4 CoreTM 2 Duo
CoreTM i7
Pentium® III
5
Challenge: Variations
� Gate length control is one of the “grand challenges” in the semiconductor industry – ITRS, 2009
65nm90nm
130nm
45nm
32nm
180nm
1980 1990 2000 2010 2020
LithographyWavelength
GenerationGap
193nm248nm
365nm
10
100
1000
nm
6
Cost of Variations
Underestimating Variations
• Functional yield loss
• Performance reduction
• Increases silicon debug time
� Impacts manufacturing
Overestimating Variations
• Increases design time
• Larger power & die size
• Rejection of otherwise good design options
• Missed market windows
� Impacts design
7
Impact of Variations on Revenue
Normalized FMAX
1
10
100
0.85 0.90 0.95 1.00 1.05 1.10 1.15
Normalized Leakage
8
Impact of Variations on Revenue
PowerLimit
Normalized FMAX
1
10
100
0.85 0.90 0.95 1.00 1.05 1.10 1.15
Normalized Leakage
9
Impact of Variations on Revenue
$$$$$/chip$$$$$$$$$$
PowerLimit
Normalized FMAX
1
10
100
0.85 0.90 0.95 1.00 1.05 1.10 1.15
Normalized Leakage
� Revenue exponentially increases across FMAX bins
10
81116223245Technology Node (nm)
High Probability Low ProbabilityBulk Planar CMOS
Low Probability High ProbabilityAlternative Device (3G)
Medium High Very HighVariability
2016 20182014201220102008Year
Technology Outlook
Source: Shekhar Borkar, Intel
11
Gate Overdrive Degradation
� Gate overdrive reduction amplifies impact of VCC, VT, & L variations on drive current
0
1
2
3
4
5
6
1980 1990 2000 2010
VCC or VT (V)
Gate Overdrive(VCC-VT)
VCC
VT
12
Design Reliable Systems with Unreliable Components
Strategic Research Objective
13
Outline• Motivation & Technology Trends
• Sources of Variability
• Static Variations:
� Impact on Logic Design
� Variation-Tolerant Circuits
• Dynamic Variations:
� Impact on Logic Design
� Variation-Tolerant Circuits
• Summary
14
Sources of Variability
1) Static Process Variations
2) Dynamic Operational Variations
3) Simulation Tool Uncertainty
15
Scale of Variations
Within-Die (WID) Variations
Systematic
Die-to-Die (D2D) Variations
Feature ScaleDie Scale
Random
Wafer Scale
16
Static Variations
17
Gate Length Variation
Source: Nagib Hakim
Source: Steve Duvall
-14.7
-8.4
-2.1
4.2
10.5
-9.9
-4.95
0
4.95
9.9
Across scan location (mm)
Across lens location
(mm)
1 2
Die-to-Die Variation• Examples: Processing temperatures, equipment properties, polishing, die placement, resist thickness
Systematic Within-Die Variation• Examples: Lens aberrations, mid-range flare, stepper non-uniformities, scanner overlay control, multiple dies per reticle, wafer topography
Random Within-Die Variation• Examples: Patterning limitations, short-range flare, line edge roughness
18
Systematic-WID Variation
� From circuit design perspective, systematic-WID variation behaves as a correlated random-WID variation
Source: H. Masuda, et al., CICC, 2005.
19
Gate Length Variation Trends
� Total CD control approximately fixed percentage of nominal gate length
� Random-WID variations increase with scaling
250 180 130 90 65 45
Technology Generation (nm)
Relative CD Variation Total D2D & WID
Total WID
Systematic WID
Random WID
λλλλ=248nm λλλλ=193nm
20
Systematic-WID Correlation Length
250 180 130 90 65 45
Technology Generation (nm)
Correlation Length Correlation Length
Ideal Scaling
λλλλ=248nm λλλλ=193nm
� Correlation length scaling by ~1/sqrt(2)
21
Random Dopant Fluctuation
Source: X. Tang
10
100
1000
10000
1000 500 250 130 65 32
Technology Node (nm)
10
100
1000
10000
1000 500 250 130 65 32
Technology Node (nm)
Mean Number of DopantAtoms
22
Interconnect Variations
1) Depth of Focus Variation
�Depends on neighboring interconnects
2) Chemical Mechanical Polishing (CMP)
�Depends on metal density
3) Etching
�Smaller than depth of focus & CMP variations
23
Impact of OPC on Isolated Lines
Bossung Plot Example (Isolated Drawn Lines)
CD
FocusFocus Window
150nm
100nm
Max CD
Min CD
Chip Topography
Focus Variations
Light Source
FP2FN2
24
Dynamic Variations
25
Supply Voltage Variations
� Processor activity change
� Current delivery (RLC)
� Dynamic: ns – 100µµµµs
1.00
1.05
1.10
1.15
1.20
1.25
0 10 20 30 40 50
Time (ns)
VCC (V)
Reliability
Performance
26
Temperature Variations
� Processor activity & ambient change
� Dynamic: 100 – 1000µµµµs
Single Core
Hamann et. al, ITHERM, 2006.
Dual Core
27
Transistor Aging
� NMOS & PMOS threshold voltages degrade from bias & temperature stress
J. Tschanz, et al., Symp. VLSI Circuits, 2009.
0.0%
1.0%
2.0%
3.0%
4.0%
5.0%
6.0%
0.0 0.2 0.4 0.6 0.8 1.0
Stress Time (a.u.)
Path Delay Change
28
Additional Dynamic Variations
Cross-CouplingCapacitance
Multiple-InputSwitching
CC
CC
29
Outline• Motivation & Technology Trends
• Sources of Variability
• Static Variations:
� Impact on Logic Design
� Variation-Tolerant Circuits
• Dynamic Variations:
� Impact on Logic Design
� Variation-Tolerant Circuits
• Summary
30
Impact of Variations on Delay
Statistical Circuit Simulator
Netlist of Critical Paths,Process File, Gate Location
Die-to-Die and Within-Die Process Variation Models
Within-Die Critical Path Delay Distribution
Critical Path Delay (s)
Die-to-Die Critical Path Delay Distribution
Critical Path Delay (s)
K. Bowman, et al., JSSC, 2002.
31
Impact of WID Variations on Delay
K. Bowman, et al., JSSC, 2002.
� As Ncp increases, WID distribution mean increases and variance decreases
0
5
10
15
20
25
0.8 0.9 1.0 1.1 1.2
Normalized Maximum Critical Path Delay
Probability Density
Ncp=1
Ncp=2
Ncp=10
Ncp=100
Ncp=1000
Ncp=10000
Ncp ≡≡≡≡ Number of Independent Critical Paths
32
Impact of Variations on Delay
K. Bowman, et al., JSSC, 2002.
� WID variations impact delay mean
� D2D variations impact delay variance
0
5
10
15
20
0.8 0.9 1.0 1.1 1.2 1.3
Normalized Maximum Critical Path Delay
Probability Density WID Distribution
D2D DistributionD2D & WID Distribution
33
FMAX Distribution Model Validation
K. Bowman, et al., JSSC, 2002.
� Model agrees closely with measured data from a 0.25µµµµm microprocessor in mean, variance, & shape
� No fitting parameters used in the comparison
1.0E-08
1.0E-06
1.0E-04
1.0E-02
1.0E+00
-4 -3 -2 -1 0 1 2 3 4
Number of FMAX Standard Deviations
Normalized Probability Density
Measured DataModel
34
Impact of Variations on FMAX
K. Bowman, et al., JSSC, 2002.
� WID variations impact FMAX mean
� D2D variations impact FMAX variance
0
20
40
60
80
100
0.7 0.8 0.9 1.0 1.1 1.2 1.3
Normalized FMAX
Cumulative Distribution (%)
Model: Only WID Variations
Model: Only D2D Variations
Model: D2D & WID Variations
Measured Data
Mean FMAX Reduction
35
Impact of Variations on Logic Depth
K. Bowman, et al., JSSC, 2002.
� Random-WID variations average across N stages
GATE
T
GATE
T
CP
T
TNT
N
TGATEGATECP
σσσσ====
σσσσ====
σσσσ
12
N12
N
GATE
T
GATE
T
CP
T
TN
1
NT
N
TGATEGATECP
σσσσ====
σσσσ====
σσσσ
Critical Path
Systematic-WID Variations (ρρρρ=1)
Random-WID Variations
36
Impact of Variations on Logic Depth
� Impact of random-WID variation increases with deeper pipelining
� Impact of systematic-WID variation insensitive to pipelining
0%
5%
10%
15%
20%
2 4 8 12 20
Logic depth
% mean Fmax loss Systematic + random
Random only
Deeper pipeline
37
Impact of Variations on Leakage
S. Burns, et al., DAC, 2007.
� WID variations impact leakage median
� D2D variations impact leakage variance
0.0
0.2
0.4
0.6
0.8
1.0
1 10
Normalized Leakage
Cumulative
Nominal @ tttt
WID
D2D
D2D & WID
WID Distribution
Nominal Leakage
D2D Distribution
D2D & WID Distribution
38
Static Variation Compensation
�Measure:
• Clock Frequency
• Power
�Control Knobs:
• Supply Voltage
• Body Bias
39
Body Bias Review
�Reverse body bias (RBB)
• PMOS: VBP > VDD• NMOS: VBN < VSS• VT increases (ION, IOFF reduce)
+Ve
-Ve
VSS
VDD
VBP
VBN
+Ve
-Ve
VSS
VDD
VBP
VBN
�Forward body bias (FBB)
• PMOS: VBP < VDD• NMOS: VBN > VSS• VT reduces (ION, IOFF increase)
40
Adaptive VCC & Body Bias
0.1
1
10
2.1 2.2 2.3 2.4 2.5 2.6 2.7 2.8 2.9 3 3.1
Fmax (GHz)
Leakage (A)
Power Limit
Apply RBB & Reduce VCC
Apply FBB & Increase VCC (if possible)
Reduce Impact of D2D Variations
41
Adaptive Supply Voltage
J. Tschanz, et al., JSSC, 2003.
1
10
100
0.85 0.9 0.95 1 1.05 1.1
Frequency (normalized)
40°C
0.5 W/cm2
0
200
400
8
9
1010 W/cm
2
1.05V 110°C αααα: 0.030 0
Standby
leakage
power
(normalized)
Total power
(normalized)
Switched
capacitance
(normalized)
1
10
100
0.85 0.9 0.95 1 1.05 1.1
Frequency (normalized)
40°C
0.5 W/cm2
0
200
400
8
9
1010 W/cm
2
1.05V 110°C αααα: 0.030 0
Standby
leakage
power
(normalized)
Total power
(normalized)
Switched
capacitance
(normalized)
0%20%40%60%80%100%
0.9 0.95 1 1.05
Frequency Bin
Die count
0%20%40%60%80%100%
0.9 0.95 1 1.05
Frequency Bin
Die count
Adaptive supplyWithout adaptive supply
42
74%
79%
20%
60%
100%
0.9 0.95 1 1.05
Frequency Bin
Die count
20%
60%
100%
0.9 0.95 1 1.05
Frequency Bin
Die count
74%
79%
20%
60%
100%
0.9 0.95 1 1.05
Frequency Bin
Die count
20%
60%
100%
0.9 0.95 1 1.05
Frequency Bin
Die count
Adaptive supply Adaptive body bias
Effectiveness of Adaptive Biasing
J. Tschanz, et al., JSSC, 2003.
0%
0%
79%
71%
20%
60%
100%
0.9 0.95 1 1.05
Frequency Bin
Die count
0%
0%20%
60%
100%
0.9 0.95 1 1.05
Frequency Bin
Die count
0%
0%
79%
71%
20%
60%
100%
0.9 0.95 1 1.05
Frequency Bin
Die count
0%
0%20%
60%
100%
0.9 0.95 1 1.05
Frequency Bin
Die count
Adaptive body bias Adaptive supply + body bias
-0.4
-0.2
0
0.2
0.4
-0.4
-0.2 0
0.2
0.4
NMOS body bias (V)
PMOS body bias (V)
P FBB
N RBB
P RBB
N RBB
P FBB
N FBB
P RBB
N FBB
Adaptive VBS
-0.4
-0.2 0
0.2
0.4
NMOS body bias (V)
P FBB
N RBB
P RBB
N RBB
P FBB
N FBB
P RBB
N FBB
Adaptive VDD+VBS
-0.4
-0.2 0
0.2
0.4
NMOS body bias (V)
P FBB
N RBB
P RBB
N RBB
P FBB
N FBB
P RBB
N FBB
Adaptive VDD+VBS
PMOS body bias (V)
-0.4
-0.2
0
0.2
0.4
…2% 25%…2% 25%
43
Effectiveness of Adaptive Biasing
�Slower Parts (Lower Power)
• Leakage is a small percentage of total power
• Trade-off leakage increase for performance gain
• More effective to apply a forward body bias (FBB)
�Faster Parts (Higher Power)
• Active & leakage contribute significantly to total power
• VCC reduction lowers both active and leakage power
• More effective to reduce VCC
44
Static ABB for WID Variations
J. Tschanz, et al., JSSC, 2003.
0%
20%
40%
60%
80%
0.85 0.90 0.95 1.00 1.05
Bin Fmax (normalized)Die Count
WID ABB Concept WID ABB Effectiveness
� No body bias for clock
� Requires triple-well process
Adaptive VCC+ ABB
Adaptive VCC
+ WID ABB
More parts in higher bins
PLLPLLPLLPLL
150nm Technology Test-Chip
Body bias2
Body bias3
Body bias
1
45
Outline• Motivation & Technology Trends
• Sources of Variability
• Static Variations:
� Impact on Logic Design
� Variation-Tolerant Circuits
• Dynamic Variations:
� Impact on Logic Design
� Variation-Tolerant Circuits
• Summary
46
FFD
CLK
QFF
CLK
VCC Droop
CLK
D (with VCC Droop)
D (nominal VCC)
Setup Time
TimingGuardband
� Guardbands required to ensure correct operation within the presence of dynamic variations
Impact of Dynamic Variations on Conventional Design
47
� Detect temperature, VCC, & aging variations
� Adapt FCLK & VCC to avoid timing violations
T. Fisher, et al., JSSC, 2006. J. Tschanz, et al., ISSCC, 2007.R. McGowen, et al., JSSC, 2006.
Sensors with Dynamic Voltage & Frequency (DVF) Control
Processor
VCC Clock Generator
Aging Sensor
Delay
Time
VCC
Time
VCC Sensor
Temp
Time
Thermal Sensor
DVF ControlVRM
48
Measured Dynamic Response
2600
2700
2800
2900
3000
3100
0 500 1000 1500 2000 2500 3000 3500
Time (ms)
Frequency (MHz)
0.0
0.2
0.4
0.6
0.8
1.0
Body Bias (V)
0
20
40
60
80
100
Temperature (C)
� Adaptive FCLK & body bias ensures correct operation & lower leakage at higher temperatures
J. Tschanz, et al., ISSCC, 2007.
49
Power Sensor with DVF
Voltage
Power
Current
Transition to High Power Draw
Transition to Low Power Draw
� Power management scheme increases performance within a power & thermal envelope
T. Fisher, et al., JSSC, 2006.R. McGowen, et al., JSSC, 2006.
50
Power Sensor with DVF
� Dynamic adaptation to reduce power variation
T. Fisher, et al., JSSC, 2006.R. McGowen, et al., JSSC, 2006.
Power spread due to
process variation
0
10
20
30
40
50
Power Distribution at Fixed V/F
90 100 110 120 130 140
Power upper
bound
90 100 110 120 130 1400
10
20
30
40
50
Power Range with Foxton
51
Sensors with Dynamic Voltage & Frequency (DVF) Control
Advantages:� Reduces guardbands for slow-changing global dynamic variations
� Low design overhead
Disadvantages:& Cannot detect fast-changing or local dynamic variations
& Requires post-silicon calibration
52
Summary
• Technology trends amplify microprocessor performance & power variability
• Static Variations:
� Within-die impacts FMAX mean & leakage median
� Die-to-die impacts FMAX & leakage variances
• Dynamic Variations:
� Impact FCLK guardbands
• Variation-tolerant circuits mitigate the impact of variations on performance & power
53
References (1)[1] S. G. Duvall, “Statistical Circuit Modeling and Optimization,” in 5th Intl. Workshop Statistical Metrology, June
2000, pp. 56–63.
[2] K. A. Bowman, S. G. Duvall, and J. D. Meindl, “Impact of Die-to-Die and Within-Die Parameter Fluctuations
on the Maximum Clock Frequency Distribution for Gigascale Integration,” IEEE J. Solid-State Circuits, pp.
183-190, Feb. 2002.
[3] S. Borkar, et al., “Parameter Variations and Impact on Circuits and Microarchitecture,” in Proc. 2003 Design
Automation Conf. (DAC), June 2003, pp. 338-342.
[4] H. Masuda, S. Ohkawa, A. Kurokawa, and M. Aoki, “Challenge: Variability Characterization and Modeling for
65- to 90-nm Processes,” in IEEE Custom Integrated Circuits Conf. (CICC), Sept. 2005, pp. 593-600.
[5] Y. Abulafia and A. Kornfeld, “Estimation of FMAX and ISB in Microprocessors,” IEEE Trans. VLSI Syst., pp.
1205–1209, Oct. 2005.
[6] S. M. Burns, M. Ketkar, N. Menezes, K. A. Bowman, J. W. Tschanz, and V. De, “Comparative Analysis of
Conventional and Statistical Design Techniques,” in Proceedings of the 44th ACM/IEEE Design Automation
Conference (DAC), June 2007, pp. 238-243.
[7] K. A. Bowman, A. R. Alameldeen, S. T. Srinivasan, and C. B. Wilkerson, “Impact of Die-to-Die and Within-Die
Parameter Variations on the Clock Frequency and Throughput of Multi-Core Processors,” IEEE Trans. VLSI
Syst., pp. 1679-1690, Dec. 2009..
[8] S. Herbert and D. Marculescu, “Characterizing Chip-Multiprocessor Variability-Tolerance,” in Proc. 2008
Design Automation Conf. (DAC), June 2008, pp. 313-318.
[9] J. Tschanz, et al., “Adaptive Body Bias for Reducing Impacts of Die-to-Die and Within-Die Parameter
Variations on Microprocessor Frequency and Leakage,” IEEE J. Solid-State Circuits, pp. 1396-1402, Nov.
2002.
[10] J. Tschanz, S. Narendra, R. Nair, and V. De, “Effectiveness of Adaptive Supply Voltage and Body Bias for
Reducing Impact of Parameter Variations in Low Power and High Performance Microprocessors,” IEEE J.
Solid-State Circuits, pp. 826-829, May 2003.
[11] K. Bowman, J. Tschanz, M. Khellah, M. Ghoneima, Y. Ismail, and V. De, “Time-Borrowing Multi-Cycle On-
Chip Interconnects for Delay Variation Tolerance,” in Proceedings of the 2006 International Symposium on
Low Power Electronics and Design (ISLPED), Oct. 2006, pp. 79-84.
54
[12] S. Dighe, et al., “Within-Die Variation-Aware Dynamic-Voltage-Frequency Scaling Core Mapping and Thread
Hopping for an 80-Core Processor,” in IEEE ISSCC Dig. Tech. Papers, Feb. 2010, pp. 174-175.
[13] A. Muhtaroglu, G. Taylor, and T. R. Arabi, “On-Die Droop Detector for Analog Sensing of Power Supply
Noise,” IEEE J. Solid-State Circuits, pp. 651-660, Apr. 2004.
[14] T. Fischer, J. Desai, B. Doyle, S. Naffziger, and B. Patella, “A 90-nm Variable Frequency Clock System for a
Power-Managed Itanium Architecture Processor,” IEEE J. Solid-State Circuits, pp. 218-228, Jan. 2006.
[15] R. McGowen, et al., “Power and Temperature Control on a 90-nm Itanium Family Processor,” IEEE J. Solid-
State Circuits, pp. 229-237, Jan. 2006.
[16] J. Tschanz, et al., “Adaptive Frequency and Biasing Techniques for Tolerance to Dynamic Temperature-
Voltage Variations and Aging,” in IEEE ISSCC Dig. Tech. Papers, Feb. 2007, pp. 292-293.
References (2)
55
Q&A