Upload
nitheshksuvarna
View
67
Download
3
Embed Size (px)
DESCRIPTION
hi
Citation preview
04/10/23 1
Fundamentals of Digital Test and DFT
Vishwani D. AgrawalRutgers University, Dept. of ECE
New Jersey
http://cm.bell-labs.com/cm/cs/who/va
January 2003
04/10/23 2
Course Outline Basic concepts and definitions Fault modeling Fault simulation ATPG DFT and scan design BIST Boundary scan IDDQ test
04/10/23 3
VLSI Realization Process
Determine requirements
Write specifications
Design synthesis and Verification
FabricationManufacturing test
Chips to customer
Customer’s need
Test development
04/10/23 4
Definitions Design synthesis: Given an I/O function, develop a
procedure to manufacture a device using known materials and processes.
Verification: Predictive analysis to ensure that the synthesized design, when manufactured, will perform the given I/O function.
Test: A manufacturing step that ensures that the physical device, manufactured from the synthesized design, has no manufacturing defect.
04/10/23 5
Realities of Tests Based on analyzable fault models, which may
not map onto real defects. Incomplete coverage of modeled faults due to
high complexity. Some good chips are rejected. The fraction (or
percentage) of such chips is called the yield loss.
Some bad chips pass tests. The fraction (or percentage) of bad chips among all passing chips is called the defect level.
04/10/23 6
Costs of Testing Design for testability (DFT)
Chip area overhead and yield reduction Performance overhead
Software processes of test Test generation and fault simulation Test programming and debugging
Manufacturing test Automatic test equipment (ATE) capital cost Test center operational cost
04/10/23 7
Cost of Manufacturing Testing in 2000AD
0.5-1.0GHz, analog instruments,1,024 digital pins: ATE purchase price = $1.2M + 1,024 x $3,000 = $4.272M
Running cost (five-year linear depreciation) = Depreciation + Maintenance + Operation = $0.854M + $0.085M + $0.5M = $1.439M/year
Test cost (24 hour ATE operation) = $1.439M/(365 x 24 x 3,600) = 4.5 cents/second
04/10/23 8
Present and Future*
Transistors/sq. cm 4 - 10M 18 - 39M Pin count 100 - 900 160 - 1475Clock rate (MHz) 200 - 730 530 - 1100Power (Watts) 1.2 - 61 2 - 96
Feature size (micron) 0.25 - 0.15 0.13 - 0.101997--2001 2003--2006
* SIA Roadmap, IEEE Spectrum, July 1999
04/10/23 9
Method of Testing
04/10/23 10
ADVANTEST Model T6682 ATE
04/10/23 11
LTX FUSION HF ATE
04/10/23 12
VLSI Chip Yield A manufacturing defect is a finite chip area with
electrically malfunctioning circuitry caused by errors in the fabrication process.
A chip with no manufacturing defect is called a good chip.
Fraction (or percentage) of good chips produced in a manufacturing process is called the yield. Yield is denoted by symbol Y.
Cost of a chip:
Cost of fabricating and testing a wafer-------------------------------------------------------
-------------Yield x Number of chip sites on the
wafer
04/10/23 13
Defect Level or Reject Ratio Defect level (DL) is the ratio of faulty chips
among the chips that pass tests. DL is measured as parts per million (ppm). DL is a measure of the effectiveness of tests. DL is a quantitative measure of the
manufactured product quality. For commercial VLSI chips a DL greater than 500 ppm is considered unacceptable.
04/10/23 14
Example: SEMATECH Chip Bus interface controller ASIC fabricated and
tested at IBM, Burlington, Vermont 116,000 equivalent (2-input NAND) gates 304-pin package, 249 I/O Clock: 40MHz, some parts 50MHz 0.45 CMOS, 3.3V, 9.4mm x 8.8mm area Full scan, 99.79% fault coverage Advantest 3381 ATE, 18,466 chips tested at
2.5MHz test clock Data obtained courtesy of Phil Nigh (IBM)
04/10/23 15
Computed DL
Stuck-at fault coverage (%)
Def
ect
leve
l in
ppm
237,700 ppm (Y = 76.23%)
04/10/23 16
Summary: Introduction VLSI Yield drops as chip area increases; low yield
means high cost Fault coverage measures the test quality Defect level (DL) or reject ratio is a measure of chip
quality DL can be determined by an analysis of test data For high quality: DL < 500 ppm, fault coverage ~
99%
04/10/23 17
Fault Modeling
04/10/23 18
Why Model Faults?
I/O function tests inadequate for manufacturing (functionality versus component and interconnect testing)
Real defects (often mechanical) too numerous and often not analyzable
A fault model identifies targets for testing A fault model makes analysis possible Effectiveness measurable by experiments
04/10/23 19
Some Real Defects in Chips Processing defects
Missing contact windows Parasitic transistors Oxide breakdown . . .
Material defects Bulk defects (cracks, crystal imperfections) Surface impurities (ion migration) . . .
Time-dependent failures Dielectric breakdown Electromigration . . .
Packaging failures Contact degradation Seal leaks . . .
Ref.: M. J. Howes and D. V. Morgan, Reliability and Degradation - Semiconductor Devices and Circuits, Wiley, 1981.
04/10/23 20
Observed PCB DefectsDefect classes
ShortsOpensMissing componentsWrong componentsReversed componentsBent leadsAnalog specificationsDigital logicPerformance (timing)
Occurrence frequency (%)
51 1 613 6 8 5 5 5
Ref.: J. Bateson, In-Circuit Testing, Van Nostrand Reinhold, 1985.
04/10/23 21
Common Fault Models Single stuck-at faults Transistor open and short faults Memory faults PLA faults (stuck-at, cross-point, bridging) Functional faults (processors) Delay faults (transition, path) Analog faults For more examples, see Section 4.4 (p. 60-
70) of the book.
04/10/23 22
Single Stuck-at Fault Three properties define a single stuck-at fault
Only one line is faulty The faulty line is permanently set to 0 or 1 The fault can be at an input or output of a gate
Example: XOR circuit has 12 fault sites ( ) and 24 single stuck-at faults
a
b
c
d
e
f
10
g h i 1
s-a-0j
k
z
0(1)1(0)
1
Test vector for h s-a-0 fault
Good circuit valueFaulty circuit value
04/10/23 23
Fault Equivalence Number of fault sites in a Boolean gate circuit =
#PI + #gates + #(fanout branches). Fault equivalence: Two faults f1 and f2 are
equivalent if all tests that detect f1 also detect f2. If faults f1 and f2 are equivalent then the
corresponding faulty functions are identical. Fault collapsing: All single faults of a logic circuits
can be divided into disjoint equivalence subsets, where all faults in a subset are mutually equivalent. A collapsed fault set contains one fault from each equivalence subset.
04/10/23 24
Equivalence Examplesa0 sa1
sa0 sa1
sa0 sa1
sa0 sa1
sa0 sa1
sa0 sa1
sa0 sa1
sa0 sa1
sa0 sa1
sa0 sa1
sa0 sa1
sa0 sa1
sa0 sa1
sa0 sa1
sa0 sa1
sa0 sa1
Faults in redremoved byequivalencecollapsing
20Collapse ratio = ----- = 0.625 32
04/10/23 25
Summary: Fault Models Fault models are analyzable approximations of
defects and are essential for a test methodology. For digital logic single stuck-at fault model offers
best advantage of tools and experience. Many other faults (bridging, stuck-open and
multiple stuck-at) are largely covered by stuck-at fault tests.
Stuck-short and delay faults and technology-dependent faults require special tests.
Memory and analog circuits need other specialized fault models and tests.
04/10/23 26
Fault Simulation
04/10/23 27
Problem and Motivation Fault simulation Problem: Given
A circuit A sequence of test vectors A fault model
Determine Fault coverage - fraction (or percentage) of modeled
faults detected by test vectors Set of undetected faults
Motivation Determine test quality and in turn product quality Find undetected fault targets to improve tests
04/10/23 28
Fault simulator in a VLSI Design Process
Verified designnetlist
Verificationinput stimuli
Fault simulator Test vectors
Modeledfault list
Testgenerator
Testcompactor
Faultcoverage
?
Remove tested faults
Deletevectors
Add vectorsLow
AdequateStop
04/10/23 29
Fault Simulation Scenario
Circuit model: mixed-level Mostly logic with some switch-level for high-impedance
(Z) and bidirectional signals High-level models (memory, etc.) with pin faults
Signal states: logic Two (0, 1) or three (0, 1, X) states for purely Boolean
logic circuits Four states (0, 1, X, Z) for sequential MOS circuits
Timing: Zero-delay for combinational and synchronous circuits Mostly unit-delay for circuits with feedback
04/10/23 30
Fault Simulation Scenario (continued)
Faults: Mostly single stuck-at faults Sometimes stuck-open, transition, and path-delay
faults; analog circuit fault simulators are not yet in common use
Equivalence fault collapsing of single stuck-at faults Fault-dropping -- a fault once detected is dropped from
consideration as more vectors are simulated; fault-dropping may be suppressed for diagnosis
Fault sampling -- a random sample of faults is simulated when the circuit is large
04/10/23 31
Essence of Fault Sim.
Disadvantage: Much repeated computation; CPU time prohibitive for VLSI circuits
Alternative: Simulate many faults together
Test vectors Fault-free circuit
Circuit with fault f1
Circuit with fault f2
Circuit with fault fn
Comparator f1 detected?
Comparator f2 detected?
Comparator fn detected?
04/10/23 32
Fault Sampling A randomly selected subset (sample) of faults
is simulated. Measured coverage in the sample is used to
estimate fault coverage in the entire circuit. Advantage: Saving in computing resources
(CPU time and memory.) Disadvantage: Limited data on undetected
faults.
04/10/23 33
Random Sampling Model
All faults witha fixed butunknowncoverage
Detectedfault
Undetectedfault
Randompicking
Np = total number of faults (population size) C = fault coverage (unknown)
Ns = sample size
Ns << Npc = sample coverage (a random variable)
04/10/23 34
Probability Density of Sample Coverage, c (x--C )2 -- ------------ 1 2 2p (x ) = Prob(x < c < x +dx ) = -------------- e 2 1/2
p (x
)
C C +3C -3 1.0xSample coverage
C (1 - C)Variance2 = ------------ Ns
Mean = C
Samplingerror
x
04/10/23 35
Sampling Error Bounds C (1 - C ) | x - C | = 3 -------------- 1/2 NsSolving the quadratic equation for C, we get the
3-sigma (99.7% confidence) estimate: 4.5C 3 = x ------- [1 + 0.44 Ns x (1 - x )]1/2
NsWhere Ns is sample size and x is the measured faultcoverage in the sample.Example: A circuit with 39,096 faults has an actualfault coverage of 87.1%. The measured coverage ina random sample of 1,000 faults is 88.7%. The aboveformula gives an estimate of 88.7% 3%. CPU time for sample simulation was about 10% of that for all faults.
04/10/23 36
Summary: Fault Sim. Fault simulator is an essential tool for test development. Concurrent fault simulation algorithm offers the best choice. For restricted class of circuits (combinational and
synchronous sequential with only Boolean primitives), differential algorithm can provide better speed and memory efficiency (Section 5.5.6.)
For large circuits, the accuracy of random fault sampling only depends on the sample size (1,000 to 2,000 faults) and not on the circuit size. The method has significant advantages in reducing CPU time and memory needs of the simulator.
04/10/23 37
Automatic Test-pattern Generation (ATPG)
04/10/23 38
Functional vs. Structural ATPG
04/10/23 39
Functional vs. Structural(Continued)
Functional ATPG – generate complete set of tests for circuit input-output combinations 129 inputs, 65 outputs: 2129 = 680,564,733,841,876,926,926,749,
214,863,536,422,912 patterns Using 1 GHz ATE, would take 2.15 x 1022 years
Structural test: No redundant adder hardware, 64 bit slices Each with 27 faults (using fault equivalence) At most 64 x 27 = 1728 faults (tests) Takes 0.000001728 s on 1 GHz ATE
Designer gives small set of functional tests – augment with structural tests to boost coverage to 98+ %
04/10/23 40
Random-Pattern Generation
Flow chart for method
Use to get tests for 60-80% of faults, then switch to D-algorithm or other ATPG for rest
04/10/23 41
Path Sensitization Method Circuit Example
1 Fault Activation2 Fault Propagation3 Line Justification
04/10/23 42
Path Sensitization Method Circuit Example Try path f – h – k – L blocked at j, since
there is no way to justify the 1 on i
10
D
D1
1
1DDD
04/10/23 43
Path Sensitization Method Circuit Example Try simultaneous paths f – h – k – L and g – i – j – k – L blocked at k because
D-frontier (chain of D or D) disappears
1DD D
DD
1
1
04/10/23 44
Path Sensitization Method Circuit Example Final try: path g – i – j – k – L – test found!
0
D D D
1 DD
1
0
1
04/10/23 45
Sequential Circuits A sequential circuit has memory in addition to
combinational logic. Test for a fault in a sequential circuit is a
sequence of vectors, which Initializes the circuit to a known state Activates the fault, and Propagates the fault effect to a primary output
Methods of sequential circuit ATPG Time-frame expansion methods Simulation-based methods
04/10/23 46
Concept of Time-Frames
If the test sequence for a single stuck-at fault contains n vectors,
Replicate combinational logic block n times Place fault in each block Generate a test for the multiple stuck-at fault
using combinational ATPG with 9-valued logic
Comb.block
Fault
Time-frame
0
Time-frame
-1
Time-frame-n+1
Unknownor given
Init. state
Vector 0Vector -1Vector -n+1
PO 0PO -1PO -n+1
Statevariables
Nextstate
04/10/23 47
An Example of Seq. ATPG
FF2
FF1
A
B
s-a-1
04/10/23 48
Nine-Valued Logic (Muth)0,1, 1/0, 0/1, 1/X, 0/X, X/0, X/1,
XA
B
X
X
X
0s-a-1
0/1
A
B
0/X 0/X
0/1
X
s-a-1X/1
FF1 FF1
FF2 FF20/1 X/1
Time-frame -1 Time-frame 0
04/10/23 49
Seq. ATPG Results s1423 s5378 s35932
Total faults 1,515 4,603 39,094
Detected faults 1,414 3,639 35,100
Fault coverage 93.3% 79.1% 89.8%
Test vectors 3,943 11,571 257
CPU time 1.3 hrs. 37.8 hrs. 10.2 hrs.HP J200 256MB
Ref.: M. S. Hsiao, E. M. Rudnick and J. H. Patel, “Dynamic State Traversal for Sequential Circuit Test Generation,” ACM Trans. on Design Automation of Electronic Systems (TODAES), vol. 5, no. 3, July 2000.
04/10/23 50
Summary: ATPG Combinational ATPG is significantly more efficient
than sequential ATPG. Combinational ATPG tools are commercially
available. Design for testability is essential if the circuit is
large (million or more gates) and high fault coverage (~95%) is required.
04/10/23 51
Design for Testability
04/10/23 52
Definition Design for testability (DFT) refers to those design
techniques that make test generation and test application cost-effective.
DFT methods for digital circuits: Ad-hoc methods Structured methods:
Scan Partial Scan Built-in self-test (BIST) Boundary scan
DFT method for mixed-signal circuits: Analog test bus
04/10/23 53
Ad-Hoc DFT Methods Good design practices learnt through experience are used as
guidelines: Avoid asynchronous (unclocked) feedback. Make flip-flops initializable. Avoid redundant gates. Avoid large fanin gates. Provide test control for difficult-to-control signals. Avoid gated clocks. . . . Consider ATE requirements (tristates, etc.)
Design reviews conducted by experts or design auditing tools.
Disadvantages of ad-hoc DFT methods: Experts and tools not always available. Test generation is often manual with no guarantee of high
fault coverage. Design iterations may be necessary.
04/10/23 54
Scan Design Circuit is designed using pre-specified design rules. Test structure (hardware) is added to the verified
design: Add a test control (TC) primary input. Replace flip-flops by scan flip-flops (SFF) and connect to
form one or more shift registers in the test mode. Make input/output of each scan shift register
controllable/observable from PI/PO. Use combinational ATPG to obtain tests for all
testable faults in the combinational logic. Add shift register tests and convert ATPG tests into
scan sequences for use in manufacturing test.
04/10/23 55
Scan Design Rules
Use only clocked D-type of flip-flops for all state variables.
At least one PI pin must be available for test; more pins, if available, can be used.
All clocks must be controlled from PIs. Clocks must not feed data inputs of flip-flops.
04/10/23 56
Scan Flip-Flop (SFF)D
TC
SD
CK
Q
QMUX
D flip-flop
Master latch Slave latch
CK
TC Normal mode, D selected Scan mode, SD selected
Master open Slave open t
t
Logicoverhead
04/10/23 57
Level-Sensitive Scan-Design Flip-Flop (LSSD-SFF)D
SD
MCK
Q
Q
D flip-flop
Master latch Slave latch
t
SCK
TCK
SCK
MCK
TCK Norm
alm
ode
MCKTCK Sc
anm
ode
Logic
overhead
04/10/23 58
Adding Scan Structure
SFF
SFF
SFF
Combinational
logic
PI PO
SCANOUT
SCANINTC or TCK Not shown: CK or
MCK/SCK feed allSFFs.
04/10/23 59
Comb. Test Vectors
I2 I1 O1 O2
S2S1 N2N1
Combinational
logic
PI
Presentstate
PO
Nextstate
SCANINTC SCANOUT
04/10/23 60
Testing Scan Register Scan register must be tested prior to application
of scan test sequences. A shift sequence 00110011 . . . of length nsff+4 in
scan mode (TC=0) produces 00, 01, 11 and 10 transitions in all flip-flops and observes the result at SCANOUT output.
Total scan test length: (ncomb + 2) nsff + ncomb + 4 clock periods.
Example: 2,000 scan flip-flops, 500 comb. vectors, total scan test length ~ 106 clocks.
Multiple scan registers reduce test length.
04/10/23 61
Scan Overheads IO pins: One pin necessary. Area overhead:
Gate overhead = [4 nsff/(ng+10nff)] x 100%, where ng = comb. gates; nff = flip-flops; Example – ng = 100k gates, nff = 2k flip-flops, overhead = 6.7%.
More accurate estimate must consider scan wiring and layout area.
Performance overhead: Multiplexer delay added in combinational path;
approx. two gate-delays. Flip-flop output loading due to one additional
fanout; approx. 5-6%.
04/10/23 62
ATPG Example: S5378Original
2,781 179 0 0.0% 4,603 35/49 70.0% 70.9% 5,533 s 414 414
Full-scan
2,781 0 179 15.66% 4,603214/228 99.1% 100.0% 5 s 585105,662
Number of combinational gatesNumber of non-scan flip-flops (10 gates each)Number of scan flip-flops (14 gates each)Gate overheadNumber of faultsPI/PO for ATPGFault coverageFault efficiencyCPU time on SUN Ultra II, 200MHz processorNumber of ATPG vectorsScan sequence length
04/10/23 63
Summary: Scan Design Scan is the most popular DFT technique:
Rule-based design Automated DFT hardware insertion Combinational ATPG
Advantages: Design automation High fault coverage; helpful in diagnosis Hierarchical – scan-testable modules are easily
combined into large scan-testable systems Moderate area (~10%) and speed (~5%) overheads
Disadvantages: Large test data volume and long test time Basically a slow speed (DC) test
04/10/23 64
Built-In Self-Test(BIST)
04/10/23 65
BIST Process
Test controller – Hardware that activates self-test simultaneously on all PCBs
Each board controller activates parallel chip BIST Diagnosis effective only if very high fault coverage
04/10/23 66
Example External XOR LFSR
Characteristic polynomial f (x) = 1 + x + x3
(read taps from right to left)
04/10/23 67
Definitions Aliasing – Due to information loss,
signatures of good and some bad machines match
Compaction – Drastically reduce # bits in original circuit response – lose information
Compression – Reduce # bits in original circuit response – no information loss – fully invertible (can get back original response)
Signature analysis – Compact good machine response into good machine signature. Actual signature generated during testing, and compared with good machine signature
04/10/23 68
Example Modular LFSR Response Compacter
LFSR seed value is “00000”
04/10/23 69
Multiple-Input Signature Register
(MISR) Problem with ordinary LFSR response
compacter: Too much hardware if one of these is put
on each primary output (PO) Solution: MISR – compacts all outputs into
one LFSR Works because LFSR is linear – obeys
superposition principle Superimpose all responses in one LFSR –
final remainder is XOR sum of remainders of polynomial divisions of each PO by the characteristic polynomial
04/10/23 70
Modular MISR Example
04/10/23 71
Built-in Logic Block Observer (BILBO)
Combined functionality of D flip-flop, pattern generator, response compacter, & scan chain
Reset all FFs to 0 by scanning in zeros
04/10/23 72
Circuit Initialization Full-scan BIST – shift in scan chain seed before
starting BIST Partial-scan BIST – critical to initialize all FFs before
BIST starts Otherwise we clock X’s into MISR and signature is
not unique and not repeatable Discover initialization problems by:
1. Modeling all BIST hardware2. Setting all FFs to X’s3. Running logic simulation of CUT with BIST
hardware
04/10/23 73
Summary: BIST LFSR pattern generator and MISR response
compacter – preferred BIST methods BIST has overheads: test controller, extra
circuit delay, Input MUX, pattern generator, response compacter, DFT to initialize circuit & test the test hardware
BIST benefits: At-speed testing for delay & stuck-at faults Drastic ATE cost reduction Field test capability Faster diagnosis during system test Less effort to design testing process Shorter test application times
04/10/23 74
IEEE 1149.1 Boundary Scan Standard
04/10/23 75
System Test Logic
04/10/23 76
Serial Board / MCM Scan
04/10/23 77
Parallel Board / MCM Scan
04/10/23 78
Tap Controller Signals Test Access Port (TAP) includes these signals:
Test Clock Input (TCK) -- Clock for test logic Can run at different rate from system
clock Test Mode Select (TMS) -- Switches system
from functional to test mode Test Data Input (TDI) -- Accepts serial test
data and instructions -- used to shift in vectors or one of many test instructions
Test Data Output (TDO) -- Serially shifts out test results captured in boundary scan chain (or device ID or other internal registers)
Test Reset (TRST) -- Optional asynchronous TAP controller reset
04/10/23 79
Summary: Bound. Scan Functional test: verify system hardware, software,
function and performance; pass/fail test with limited diagnosis; high (~100%) software coverage metrics; low (~70%) structural fault coverage.
Diagnostic test: High structural coverage; high diagnostic resolution; procedures use fault dictionary or diagnostic tree.
SOC design for testability: Partition SOC into blocks of logic, memory and analog
circuitry, often on architectural boundaries. Provide external or built-in tests for blocks. Provide test access via boundary scan and/or analog
test bus. Develop interconnect tests and system functional tests. Develop diagnostic procedures.
04/10/23 80
IDDQ Test
04/10/23 81
Basic Principle of IDDQ Testing
Measure IDDQ current through Vss bus
04/10/23 82
Capacitive Coupling of Floating Gates
Cpb – capacitance from poly to bulk
Cmp – overlapped metal wire to poly
Floating gate voltage depends on capacitances and node voltages
If nFET and pFET get enough gate voltage to turn them on, then IDDQ test detects this defect
K is the transistor gain
04/10/23 83
Sematech Results Test process: Wafer Test Package Test Burn-In & Retest Characterize & Failure
Analysis Data for devices failing some, but not all, tests.
passpassfailfail
pass
146
52pass
pass60136fail
fail14633413
1251pass
fail718
fail
passfail
passfail
Scan
-bas
ed S
tuck
-at
IDDQ (5 A limit)
Functional Scan
-bas
ed d
elay
04/10/23 84
Summary: IDDQ Test IDDQ tests improve reliability, find defects
causing: Delay, bridging, weak faults Chips damaged by electro-static discharge
No natural breakpoint for current threshold Get continuous distribution – bimodal would be
better Conclusion: now need stuck-fault, IDDQ, and delay
fault testing combined Still uncertain whether IDDQ tests will remain
useful as chip feature sizes shrink further
04/10/23 85
References M.L. Bushnell and V. D. Agrawal, Essentials of
Electronic Testing for Digital, Memory and Mixed-Signal VLSI Circuits, Boston: Kluwer Academic Publishers, 2000, ISBN 0-7923-7991-8.
For the material on a course taught by the authors at Rutgers University, and a complete bibliography from the above book, see website:
http://cm.bell-labs.com/cm/cs/who/va