Electronic Parts Engineering
Ramin Roosta, Jet Propulsion Laboratory
Office 514
Xilinx SRAM Based FPGA Testing, Testability, and Reliability Issues
New Electronic Technologies and Insertion into Flight Programs Workshop
January 30 - February 1, 2007 at NASA/GSFC in Greenbelt, MD
2-1-2007 Ramin Roosta
Table of Contents
● FPGA Testing
  - FPGA Test Goals
  - FPGA Testing Phases
  - Why FPGA Testing and Testability Analysis is Difficult?
  - FPGA testing approach requirements
● Virtex Product Test Flow
● Application-independent Testing
  - Problems of Application-independent Testing
● Application Dependent Testing
  - Interconnect Testing
  - Configurable Logic Blocks (CLB) Testing
  - The fault models in FPGA testing
  - Related works
● Deep Submicron process and its effects on Testing
● FPGA (Virtex-4 Power Reduction)
● FPGA Reliability Analysis/Concerns
● Conclusion
FPGA Test Goals
● Fully verify all parameters and the functionality of all features and resources to ensure full compliance with the data sheet
● Product features are fully characterized across temperature and voltages, with key parameters measured to guarantee performance
FPGA Testing Phases
● Design verification phase: behavioral simulation, static timing analysis, post-layout functional and timing simulation, back-annotated testing, and prototype testing. This phase relies heavily on design automation tools (simulation, logic/physical synthesis, and place & route), which raises the question: how accurate are these tools?
● Production test phase: screening tests such as burn-in, functional test, fault coverage analysis, internal speed test, at-speed test, and external speed test (including verification of the IC's setup/hold and delay characteristics), followed by I/O level tests and finally analog parametric tests covering gain, noise, delay, time constants, precision, and margins.
Why FPGA Testing and Testability Analysis is Difficult?
● To Xilinx the FPGA looks like an ASIC. To the consumers it is an FPGA. This distinction should be kept in mind when testing the device
● Today's FPGAs are practically “System on a Chip” devices, so testing the chip thoroughly is a daunting task, especially without the benefit of DFT
● Re-programmability of Xilinx FPGAs should be used to make several (different) images to test specific resource(s) of the FPGA
FPGA Testing Approach Requirements
● Test methodology must be generic, uniform and application independent
● Test methodology must be scalable and independent of array size
● Test methodology must be reusable and lend itself to automation
● Test methodology must have measurable test quality metrics
Source: [1]
Xilinx Test Flow
FPGA Functional Test Descriptions
● I/O testing
  - Opens and Shorts
  - Icc and Leakage
  - I/O Parametric
● Functional tests
  - CLB Test
  - BRAM memory test
  - Configuration memory test
● Router Driven Test Methods
● Layout Driven Metal Test Methods
● Speed tests
FPGA Architecture (Virtex-II Fabric)
Every instance of LUT, SELRAM, flip-flop, TBUF, BRAM, DCM, global clock, carry logic, etc. is tested
Virtex-4 Architecture
Testing the Slice Using Serial Shift Register
● Easy to understand and document
● Quick diagnostics of a failure
● Consistency across the array
● Few I/Os required for test environment
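One way to picture the serial shift register test is the small Python simulation sketched below. It assumes the slice flip-flops are chained into one long shift register, that a known bit pattern is clocked in at one end and compared at the other, and that a defect behaves as a stage stuck at 0 or 1. The name `shift_chain` and the fault model are illustrative assumptions, not part of the Xilinx test program.

```python
from typing import List, Optional, Tuple


def shift_chain(pattern: List[int], length: int,
                stuck: Optional[Tuple[int, int]] = None) -> List[int]:
    """Clock `pattern` through `length` chained flip-flops and return the
    bits observed at the serial output, aligned with the input pattern.

    `stuck` is a hypothetical (stage_index, value) pair that models one
    stage whose output is stuck at 0 or 1.
    """
    regs = [0] * length
    observed = []
    # Feed the pattern, then flush with zeros so every bit reaches the output.
    for bit in pattern + [0] * length:
        observed.append(regs[-1])   # sample the last stage before the clock edge
        regs = [bit] + regs[:-1]    # shift every bit by one stage
        if stuck is not None:
            stage, value = stuck
            regs[stage] = value     # the faulty stage overrides what was shifted in
    return observed[length:]        # drop the initial latency of the chain


pattern = [1, 0, 1, 1, 0, 0, 1, 0]
good = shift_chain(pattern, length=4)
bad = shift_chain(pattern, length=4, stuck=(2, 0))
print("fault-free chain passes:", good == pattern)
print("stuck stage detected:", bad != pattern)
```

Only a serial input, a serial output and a clock are needed in this model, which matches the "few I/Os" point above.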
BlockRAM Memory
Testing Configuration Memory
● Readback: the process of reading back the contents of configuration memory
● Four Readback test patterns for Configuration Memory
  - All zeros for Stuck-At-1
  - All ones for Stuck-At-0
  - Checkerboard for Coupling (AND/OR)
  - Inverted Checkerboard for Coupling (AND/OR)
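The four readback patterns can be sketched in a few lines of Python. The model below is an assumption for illustration: configuration memory is treated as a small 2-D bit array, a "write then readback" function stands in for the device, and a defect is a cell stuck at a fixed value. None of these names come from Xilinx tooling.

```python
def readback_patterns(rows, cols):
    """Build the four test frames named on the slide for a rows x cols array."""
    checker = [[(r + c) % 2 for c in range(cols)] for r in range(rows)]
    return {
        "all_zeros": [[0] * cols for _ in range(rows)],
        "all_ones": [[1] * cols for _ in range(rows)],
        "checkerboard": checker,
        "inverted_checkerboard": [[1 - b for b in row] for row in checker],
    }


def make_memory(stuck_cells=None):
    """Return a hypothetical write-then-readback function.

    `stuck_cells` maps (row, col) to the value that cell is stuck at.
    """
    stuck_cells = stuck_cells or {}

    def write_and_readback(frame):
        data = [row[:] for row in frame]
        for (r, c), value in stuck_cells.items():
            data[r][c] = value      # a stuck cell ignores what was written
        return data

    return write_and_readback


def failing_patterns(write_and_readback, rows, cols):
    """Names of the patterns whose readback differs from what was written."""
    return [name for name, frame in readback_patterns(rows, cols).items()
            if write_and_readback(frame) != frame]


print(failing_patterns(make_memory(), 4, 4))
print(failing_patterns(make_memory({(1, 2): 1}), 4, 4))
```

In this model a stuck-at-1 cell is exposed by the all-zeros frame and by whichever checkerboard writes a 0 into that cell, while the all-ones frame reads back clean.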
Memory Fault Models
● Address Fault (AF): caused by defects in the address lines and address decoder
● Stuck-at Fault (SAF): the logic value of a stuck-at memory cell is always 0 or 1
● Transition Fault (TF): a faulty cell or line with a rising (falling) transition fault fails to undergo a 0-1 (1-0) transition when written
● Stuck Open Fault (SOF): the word line retains the previous value when certain cells are accessed (typically, an open in word line access transistors)
● Coupling Fault (CF): shorts and crosstalk between memory cells or lines; idempotent (forces a cell), inversion (flips a cell), or bridging (AND/OR)
● Passive Neighborhood Pattern Sensitive Fault (PNPSF): the contents of a memory cell cannot be changed due to a certain neighborhood pattern
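To make the difference between a stuck-at fault and a transition fault concrete, here is a minimal single-cell model in Python. The class `Cell`, the fault labels and the write sequence are illustrative assumptions (the sequence is similar in spirit to one element of a march test, which the slides do not describe).

```python
class Cell:
    """One memory cell with an optional fault: SA0, SA1, TF_rise or TF_fall."""

    def __init__(self, fault=None):
        self.fault = fault
        self.value = 1 if fault == "SA1" else 0

    def write(self, bit):
        if self.fault == "SA0":
            self.value = 0
        elif self.fault == "SA1":
            self.value = 1
        elif self.fault == "TF_rise" and self.value == 0 and bit == 1:
            pass                    # the cell fails to make the 0 -> 1 transition
        elif self.fault == "TF_fall" and self.value == 1 and bit == 0:
            pass                    # the cell fails to make the 1 -> 0 transition
        else:
            self.value = bit

    def read(self):
        return self.value


def detects(fault, writes):
    """True if reading after any write in `writes` exposes the fault."""
    faulty, golden = Cell(fault), Cell()
    for bit in writes:
        faulty.write(bit)
        golden.write(bit)
        if faulty.read() != golden.read():
            return True
    return False


for fault in ("SA0", "SA1", "TF_rise", "TF_fall"):
    print(fault, detects(fault, [0, 1, 0]))
```

A single write is not enough for the falling transition fault in this model: the cell must first be driven to 1 and then written back to 0, which is why the sequence has three steps.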
Router Driven Test Method
- Patterns routed using same software as customer (Xilinx PAR)
- Most favorable interconnect routed first (not all interconnects routed equal)
- Utilization of interconnect is <3% for a single design, customer or test
- 99% interconnect coverage when compared to customer design utilization
Test pattern generation flow
- Input routed design to PAR
- Route the design using PAR
- Routing information enters DB
- PAR references DB to route next design
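The feedback loop between the router and the database can be pictured with a toy model. The Python sketch below is an assumption-heavy illustration, not how Xilinx PAR works internally: interconnect resources are plain integers, each test design may use only a fixed number of them (standing in for the <3% utilization figure), and the hypothetical router simply prefers resources that the database has not yet seen.

```python
def generate_test_designs(num_resources, per_design, target_percent=99):
    """Keep routing new test designs until the coverage target is reached.

    Returns the list of designs (each a list of resource ids) and the
    number of distinct resources covered.
    """
    if per_design <= 0:
        raise ValueError("per_design must be positive")
    covered = set()                 # the "DB" of interconnect already exercised
    designs = []
    while len(covered) * 100 < target_percent * num_resources:
        # Hypothetical router step: consult the DB and use uncovered resources first.
        uncovered = [r for r in range(num_resources) if r not in covered]
        design = uncovered[:per_design]
        designs.append(design)
        covered.update(design)      # routing information enters the DB
    return designs, len(covered)


designs, covered = generate_test_designs(num_resources=1000, per_design=30)
print(len(designs), "test designs cover", covered, "of 1000 resources")
```

Even in this idealized model, where no resource is ever reused, low per-design utilization forces dozens of configurations to approach full coverage; a real router reuses favorable resources, so the count grows further.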
Configurable Logic Block Tile
Routing Phases
FPGA Routing Resources
● The nature and availability of “Routing Resources” ultimately dictate the interconnect scenarios within the FPGA.
● Interconnect is a major impediment to performance and power consumption: wires consume power, threatening chip performance.
Main routing components are:
- Wire segment
- Switching matrix
- Low-skew (clock)
- Low-skew distribution
IEEE Spectrum - June 2006
FPGA Test Coverage Claims (Xilinx)
● Features Coverage
  - 100% of FPGA features are tested
  - Every instance of LUT RAM, flip-flop, carry, TBUF, BlockRAM, DCM, etc. is tested
● Interconnect Coverage
  - Overall interconnect coverage is > 99.7%
  - For customer designs, coverage is > 99.9%
  - Interconnect SAF and TF coverage is by pattern construction, structured (BIST) method
● Today’s (Xilinx) test program contains 1800+ test configurations
● Zero customer returns related to missing interconnect coverage
CF coverage is in development
Layout Driven Metal Test (LDMT)
● Test metal lines based on the physical layout
● Attach test logic to each line
  - Proven to be very effective at detecting metal shorts
The Fault Models in FPGA Testing
● Bridging Fault: a short between a group of signals. The logic value of the shorted signals may be:
  - 1-dominant (OR bridge)
  - 0-dominant (AND bridge)
  - Indeterminate
● Stuck-at Fault: a signal line in the circuit is fixed at a value (0 or 1). Single stuck-at faults are the most popular form (the classical fault model)
● Delay Fault: the combinational delay of a circuit exceeds the clock period
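The two deterministic bridging models can be written down directly. The Python sketch below is a minimal illustration under the assumption that a bridge ties exactly two lines together; the function names are hypothetical. It enumerates which input pairs make the bridge visible.

```python
def bridged(a, b, kind):
    """Values seen on two shorted lines driven with a and b.

    kind "OR" is the 1-dominant bridge, kind "AND" the 0-dominant bridge.
    """
    if kind == "OR":
        value = a | b
    elif kind == "AND":
        value = a & b
    else:
        raise ValueError("kind must be 'OR' or 'AND'")
    return value, value


def detecting_vectors(kind):
    """Driven (a, b) pairs for which the observed values differ from the driven ones."""
    return [(a, b) for a in (0, 1) for b in (0, 1)
            if bridged(a, b, kind) != (a, b)]


print("OR bridge detected by:", detecting_vectors("OR"))
print("AND bridge detected by:", detecting_vectors("AND"))
```

In both models the bridge is only observable when the two lines are driven to opposite values, so test patterns that hold all lines at the same value cannot expose it.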
The Functional Defects
● Interconnect defect: modeled by bridging faults and/or stuck-at faults
● CLB defect: a faulty CLB can be detected through “the functional test” of the CLB
● IOB defect: the information exchange with other components in the system may not be possible or reliable
CLB: configurable logic block, IOB: input/output block
Problems of Application-independent Testing
● Low efficiency in detecting timing-related faults
  - It is impossible to test even a small fraction of “all possible” interconnection patterns that may occur in the user-defined configurations
● The decreased yield of FPGA vendors
  - Some defective chips could still be used in some designs
  - The defective resources are not used by those designs
Application (Specific) Dependent Testing
● Only resources used by a specific configuration (design) are tested
● Avoids the disadvantage of application-independent FPGA testing
● Saves test time
● The increased yield of FPGA vendors
● Xilinx uses this approach
“A Dynamic Platform for Reliability and Environmental Test of Re-programmable Xilinx Virtex-II 3000 FPGA”; Ramin Roosta, Ph.D. et al, Electronic Parts Engineering, NASA/JPL (Sponsored By NASA Electronic Parts and Packaging Program (NEPP))
Multi-Configuration Strategy (MCS)
The MCS has three test configurations:
● Interconnect testing (2)
  - All the (built-in) LUTs in the used CLBs are reconfigured to implement logic “AND” or logic “OR” functions (all-0/1 pattern test vectors at the PIs)
  - All the flip-flops in the application need to be preset to value “1” or “0”
● CLB testing (1)
  - The interconnect network is reprogrammed so that each used CLB is controlled by the primary inputs (PIs)
MCS: Multi-configuration strategy, CLB: configurable logic block, LUT: Look-Up Table
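The idea behind the two interconnect test configurations can be sketched with a tiny netlist simulation in Python. The three-LUT netlist, the net names and the single-output observation point below are assumptions made for illustration; only the principle (AND LUTs with all-1 inputs expose stuck-at-0, OR LUTs with all-0 inputs expose stuck-at-1) is taken from the strategy described above.

```python
# Hypothetical netlist: each LUT output net maps to the nets that feed it.
# The dict is written in topological order so one pass evaluates every LUT.
NETLIST = {
    "n1": ["a", "b"],
    "n2": ["b", "c"],
    "out": ["n1", "n2"],
}
PRIMARY_INPUTS = ["a", "b", "c"]


def simulate(lut_fn, pi_value, stuck=None):
    """Evaluate the netlist with every LUT replaced by `lut_fn` (all or any).

    `stuck` is an optional (net, value) pair modelling a stuck-at fault.
    Returns the value observed at the primary output "out".
    """
    values = {pi: pi_value for pi in PRIMARY_INPUTS}

    def observe(net):
        if stuck is not None and stuck[0] == net:
            return stuck[1]         # the faulty net always shows its stuck value
        return values[net]

    for net, fanin in NETLIST.items():
        values[net] = int(lut_fn(observe(i) for i in fanin))
    return observe("out")


all_nets = PRIMARY_INPUTS + list(NETLIST)
# Configuration 1: LUTs as AND, all-1s at the PIs, expect 1 at the output.
sa0_detected = [n for n in all_nets if simulate(all, 1, (n, 0)) != simulate(all, 1)]
# Configuration 2: LUTs as OR, all-0s at the PIs, expect 0 at the output.
sa1_detected = [n for n in all_nets if simulate(any, 0, (n, 1)) != simulate(any, 0)]
print("stuck-at-0 detected on:", sa0_detected)
print("stuck-at-1 detected on:", sa1_detected)
```

Because an AND gate passes a 0 from any input and an OR gate passes a 1 from any input, a single vector per configuration is enough to propagate a fault on any used net to the output in this model.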
General Model of the Interconnect Test Configuration
[Figure: combinational part built from LUTs configured as AND, feeding flip-flops (1D, C1) driven by the clock; all-1s at the PIs and POs]
An Example of Interconnect Testing
[Figure: example circuit with LUTs L1 to L9 configured as AND, flip-flops, net n1, and the value 1 on every net]
Original Application Configuration
[Figure: original application configuration with functions F1 to F9]
Modified Configuration
Deep Submicron process and its effects on Testing
Xilinx Triple-Oxide Technology (90nm)
● Law of physics: leakage current increases as channel length and gate oxide thickness decrease
● Two oxide thicknesses are commonly used
  - Thin oxide in the fast core logic
  - Thick oxide in the versatile I/O
● Virtex-4 adds a third, medium-thickness oxide to reduce leakage current without compromising performance
Nanometer-scale CMOS technologies Challenges
● Modeling, simulation and verification of system components
● Accurate prediction of timing and power dissipation
● Design robustness and fault tolerance in the presence of highly unpredictable device behavior
● Some physical design issues such as floor planning and routing give rise to challenges in system timing and signal integrity
Nanometer-scale CMOS technologies Issues
● Design size and complexity
● Timing based on signal integrity and IR drop
● IR Drop
● Crosstalk and Inductance
● Electro-migration
● Digital/Analog Integration
● Power consumption
● System signal transmission
● Manufacturing rules
● Yield optimization
Deep Submicron process and its effects on Testing
● Increased variability (on chip)
● Decreased reliability
● Leakage Current, Power Consumption
● Loss of operating margin
● Junction Temperature and Thermal Issues (Thermal Runaway)
● Signal Integrity Issues (caused by faster I/O) at Chip/Board Level
● Design Entry and Power prediction tools’ Accuracy
Static Power Variation & Saving
● At 90nm process technology static power becomes the dominant power factor (I/Os draw minimal power)
  - Some FPGAs offer a low-power mode that disables the I/O, putting it into a sleep mode that further reduces static power
● Static power from leakage increases exponentially with temperature
  - Proportional to voltage (0.3 VCCINT /1.2)
  - Increases exponentially due to source → drain leakage
  - At 1.26V static power for VCCINT is ~20% higher than at 1.2V (try to use VCCINT close to 1.2V)
  - Keep junction temperature as low as possible
● Static power scales linearly with part size
  - Use the smallest part to reduce leakage (LX60 has 40% less leakage power than LX100)
● Static power increases with process variation [VT and gate length (2.5x)]
  - Look at worst case and typical at a given temperature
Dynamic Power Variation & Saving
Dynamic power consumption (and performance) is very sensitive to switched capacitance (mainly routing capacitance in Xilinx FPGAs)
- Dynamic Power = N * C * V^2 * f
  N = Number of nodes switching
  C = Capacitive load
  V = Voltage swing
  f = Switching rate
- Dynamic power varies linearly with frequency
● Tighten VCCINT and run at center of range or down to 5% below center (running at 1.2 V instead of 1.26 V reduces leakage by better than 10%)
● Run non-critical functions with a low speed clock (rather than an arbitrary high speed clock present in the design)
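The dynamic power relation above is easy to explore numerically. The Python sketch below evaluates it for two supply voltages and two clock rates; the node count, load capacitance and frequency are made-up illustrative values, not Virtex-4 data, and only the ratios matter.

```python
def dynamic_power(n_nodes, c_load, v_swing, f_switch):
    """Dynamic power in watts: N * C * V^2 * f (C in farads, V in volts, f in hertz)."""
    return n_nodes * c_load * v_swing ** 2 * f_switch


# Illustrative (assumed) operating point: 10,000 switching nodes, 10 fF each, 100 MHz.
base = dynamic_power(10_000, 10e-15, 1.20, 100e6)
high_v = dynamic_power(10_000, 10e-15, 1.26, 100e6)
half_f = dynamic_power(10_000, 10e-15, 1.20, 50e6)

print(f"1.26 V relative to 1.2 V: {high_v / base:.4f}x")
print(f"half the clock rate:      {half_f / base:.2f}x")
```

Because voltage enters as a square, running at the top of the VCCINT range costs about (1.26/1.2)^2, roughly 10% more dynamic power than at 1.2 V, whereas halving the switching rate halves it.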
Overall Power Minimization in Virtex-4
● Power minimization falls into a few areas
- Static & dynamic power (adjustment to operating environment)
- Design code optimization
- Interconnect transistors
-- Bump up performance target for XST router (may be able to gain 5-10% power improvement)
-- Minimize path length (capacitance and power are lowered)
-- Minimize interconnect hops (capacitance and power are lowered)
-- Interconnect capacitance
-- Use Relationally Placed Macros (RPM) or another placement method to guide tighter placement and help reduce routing length, especially on repeated macros
-- Number of nodes switching into a capacitive load
-- Minimize logic levels; try to pack logic (if possible)
-- Clocks driving loads: use BUFGMUX, which reduces switching at target flip-flops, a common practice in ASIC design
FPGA Reliability Analysis/Concerns
● Transient errors due to complexity and feature size reduction in FPGAs, addressed through redundancy-based techniques
● The rate of degradation due to accelerated aging phenomena depends on supply voltage, temperature, switching activity, and leakage currents
● Impact of different aging phenomena resulting in permanent failures of the FPGA's components and interconnect circuitry, such as TDDB (reduces as gate leakage increases), HCE (as a function of switching activity) and EM (interconnect)
● A study of the aging impact of TDDB, EM and HCE on Xilinx-style (SRAM-based) FPGAs using a set of benchmarks shows that a significant portion of the FPGA resources (LUTs) may fail in the first 3 to 5 years of (commercial) operation [8]
Conclusion (FPGA Testing)
● Multi-Configuration Strategy (MCS) provides a simple way to perform interconnect and CLB testing in application-dependent testing
● FPGAs are really SOCs requiring some DFT to be built in
● Use BIST for memory (MBIST) megacells (ROMs, RAMs, FIFOs)
● Embedded scan to improve manufacturability
● Iddq measurement as a substitute for, or complement to, burn-in
● Signal integrity related issues will dominate
● Power consumption, junction temperature, thermal runaway
● Some design code and place/route optimization is required
● Leave plenty of margin
References
[1] M. B. Tahoori, E. J. McCluskey, M. Renovell, P. Faure, “A Multi-Configuration Strategy for an Application Dependent Testing of FPGAs,” Proc. VLSI Test Symp., 2004.
[2] M. B. Tahoori, “Application-Dependent Testing of FPGA Interconnects,” Proc. Int’l Symp. on Defect and Fault Tolerance, 2003.
[3] C. Jordan, W. P. Marnane, “Incoming inspection of FPGAs,” Proc. European Test Conf., pp. 371-377, 1993.
[4] W. K. Huang, F. J. Meyer, X.-T. Chen, F. Lombardi, “Testing Configurable LUT-Based FPGAs,” IEEE Trans. on VLSI Systems, pp. 276-283, June 1998.
[5] M. Abramovici, C. Stroud, “BIST-Based Detection and Diagnosis of Multiple Faults in FPGAs,” Proc. of Int’l Test Conf., 2000.
[6] A. Krasniewski, “Application-Dependent Testing of FPGA Delay Faults,” Proc. 25th EUROMICRO Conf., vol. 1, pp. 260-267, 1999.
[7] Das, N. A. Touba, “A Low Cost Approach for Detecting, Locating, and Avoiding Interconnect Faults in FPGA-Based Reconfigurable Systems,” Proc. of Int’l Conf. on VLSI Design, 1999.
[8] S. Srinivasan, N. Vijaykrishnan, K. Sarpatvari, “FLAW: FPGA Lifetime Awareness,” DAC 2006, July 24-28, 2006, San Francisco.
[9] S. Mahapatra, V. R. Rao, B. Cheng, M. Khare, C. D. Parikh, J. C. S. Woo, J. M. Vasi, “Performance and hot-carrier reliability of 100 nm channel length jet vapor deposited Si3N4 MNSFETs,” IEEE Transactions on Electron Devices, vol. 48, no. 4, pp. 679-84, April 2001.
[10] S. M. Alam, C. L. Gan, D. E. Troxel, C. V. Thompson, “Circuit-Level Reliability Analysis of Cu Interconnects,” Proc. International Symposium on Quality Electronics Design (ISQED), 2004.
[11] J. Srinivasan, S. V. Adve, P. Bose, J. A. Rivers, “The Impact of Technology Scaling on Lifetime Reliability,” Proc. International Conference on Dependable Systems and Networks (DSN), 2004.
[12] X. Xuan, A. Chatterjee, A. D. Singh, “Local Redesign for Reliability of CMOS Digital Circuits Under Device Degradation,” Proc. International Reliability Physics Symposium (IRPS), 2004.
[13] F. N. Najm, “Transition density, a stochastic measure of activity in digital circuits,” Proc. Annual ACM IEEE Design Automation Conference, 1991.
[14] J. H. Anderson, F. Najm, T. Tuan, “Active leakage power optimization for FPGAs,” Proc. ACM/SIGDA International Symposium on Field-programmable gate arrays, 2004.
[15] “Critical Reliability Challenges for the International Technology Roadmap for Semiconductors,” International Sematech Technology transfer 03024377A-TR, 2003.
[16] S. Srinivasan, A. Gayasen, N. VijayKrishnan, T. Tuan, “Leakage control in FPGA routing fabric,” Proc. Asia-Pacific Design Automation Conference (ASPDAC), 2005.