Electronic Parts Engineering

Ramin Roosta, Jet Propulsion Laboratory

Office 514

Xilinx SRAM Based FPGA Testing, Testability, and Reliability Issues

New Electronic Technologies and Insertion into Flight Programs Workshop

January 30 - February 1, 2007 at NASA/GSFC in Greenbelt, MD

2-1-2007 Ramin Roosta

Table of Contents

● FPGA Testing
  - FPGA Test Goals
  - FPGA Testing Phases
  - Why FPGA Testing and Testability Analysis is Difficult?
  - FPGA testing approach requirements

● Virtex Product Test Flow

● Application-independent Testing
  - Problems of Application-independent Testing

● Application Dependent Testing
  - Interconnect Testing
  - Configurable Logic Blocks (CLB) Testing
  - The fault models in FPGA testing
  - Related works

● Deep Submicron process and its effects on Testing

● FPGA (Virtex-4 Power Reduction)

● FPGA Reliability Analysis/Concerns

● Conclusion

FPGA Test Goals

● Fully verify all parameters and the functionality of all features and resources to ensure full compliance with the data sheet

● Product features are fully characterized across temperature and voltages, with key parameters measured to guarantee performance

FPGA Testing Phases

● Design verification phase: behavioral simulation, static timing analysis, post-layout functional and timing simulation, back-annotated testing and prototype testing. (Relies heavily on design automation tools, such as simulation, logic/physical synthesis, and place & route tools.) How accurate are these tools?

● Production phase test: includes screening tests such as burn-in test, functional test, fault coverage analysis, internal speed test, at-speed test, external speed test (including verification of the setup/hold and delay characteristics of the IC), I/O level test, and finally analog parametric tests including gain, noise, delay, time constants, precision and margins.

Why FPGA Testing and Testability Analysis is Difficult?

● To Xilinx the FPGA looks like an ASIC. To the consumers it is an FPGA. This distinction should be kept in mind when testing the device

● Today's FPGAs are practically a “System on a Chip”, so testing the chip thoroughly is a daunting task, especially without the benefit of design for test (DFT)

● Re-programmability of Xilinx FPGAs should be used to make several (different) images to test specific resource(s) of the FPGA

FPGA Testing Approach Requirements

● Test methodology must be generic, uniform and application independent

● Test methodology must be scalable and independent of array size

● Test methodology must be reusable and lend itself to automation

● Test methodology must have measurable test quality metrics

Source: [1]

Xilinx Test Flow

FPGA Functional Test Descriptions

● I/O testing
  - Opens and Shorts
  - Icc and Leakage
  - I/O Parametric

● Functional tests
  - CLB Test
  - BRAM memory test
  - Configuration memory test

● Router Driven Test Methods

● Layout Driven Metal Test Methods

● Speed tests

FPGA Architecture (Virtex-II Fabric)

Every instance of LUT, SELRAM, Flip-Flops, TBUF, BRAM, DCM, Global Clocks, Carry, etc. is tested

Virtex-4 Architecture

Testing the Slice Using Serial Shift Register

● Easy to understand and document

● Quick diagnostics of a failure

● Consistency across the array

● Few I/Os required for test environment

BlockRAM Memory

Testing Configuration Memory

● Readback
  - Process of reading back the contents of configuration memory

● Four Readback test patterns for Configuration Memory
  - All zeros for Stuck-At-1
  - All ones for Stuck-At-0
  - Checkerboard for Coupling (AND/OR)
  - Inverted Checkerboard for Coupling (AND/OR)
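The four readback patterns can be illustrated with a minimal Python sketch. It models configuration memory as a small bit array with injected stuck-at cells; the memory model, the array size and every function name here are assumptions made for illustration and do not correspond to any Xilinx tool or API.

```python
PATTERNS = ["all_zeros", "all_ones", "checkerboard", "inverted_checkerboard"]

def make_pattern(name, rows, cols):
    """Build one of the four readback patterns as a list of rows of bits."""
    if name == "all_zeros":
        return [[0] * cols for _ in range(rows)]
    if name == "all_ones":
        return [[1] * cols for _ in range(rows)]
    if name == "checkerboard":
        return [[(r + c) % 2 for c in range(cols)] for r in range(rows)]
    if name == "inverted_checkerboard":
        return [[1 - (r + c) % 2 for c in range(cols)] for r in range(rows)]
    raise ValueError(f"unknown pattern: {name}")

def write_and_readback(pattern, stuck_cells):
    """Hypothetical memory model: stuck_cells maps (row, col) to a stuck value."""
    return [[stuck_cells.get((r, c), bit) for c, bit in enumerate(row)]
            for r, row in enumerate(pattern)]

def run_readback_test(rows, cols, stuck_cells):
    """Return {pattern name: list of (row, col) cells that read back wrong}."""
    report = {}
    for name in PATTERNS:
        expected = make_pattern(name, rows, cols)
        observed = write_and_readback(expected, stuck_cells)
        report[name] = [(r, c) for r in range(rows) for c in range(cols)
                        if observed[r][c] != expected[r][c]]
    return report

# One cell stuck at 1 and one cell stuck at 0 (illustrative faults).
report = run_readback_test(4, 4, {(1, 2): 1, (3, 0): 0})
```

In this toy model the all-zeros pattern exposes the stuck-at-1 cell and the all-ones pattern exposes the stuck-at-0 cell, matching the pairing on the slide.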

Memory Fault Models

● Address Fault (AF)
  - Caused by defects in the address lines and address decoder

● Stuck-at Fault (SAF)
  - The logic value of a stuck-at memory cell is always 0 or 1

● Transition Fault (TF)
  - A faulty cell or line with a rising (falling) transition fault fails to undergo a 0-1 (1-0) transition when written

● Stuck Open Fault (SOF)
  - The word line retains the previous value when certain cells are accessed (typically, an open in word line access transistors)

● Coupling Fault (CF)
  - Shorts and crosstalk between memory cells or lines
  - Idempotent (forces a cell), Inversion (flips a cell), or Bridging (AND/OR)

● Passive Neighborhood Pattern Sensitive Fault (PNPSF)
  - The contents of a memory cell cannot be changed due to a certain neighborhood pattern
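As a small illustration of the bridging (AND/OR) coupling fault, the sketch below shorts two neighboring cells and checks which static patterns reveal the short. The one-dimensional cell array and the helper names are hypothetical simplifications, not a model of any real device.

```python
def read_with_bridge(written, bridged_pairs, kind):
    """Return the values read back when each pair of cells is shorted.

    kind is "AND" (0-dominant) or "OR" (1-dominant).
    """
    observed = list(written)
    for i, j in bridged_pairs:
        if kind == "AND":
            value = written[i] & written[j]
        else:
            value = written[i] | written[j]
        observed[i] = observed[j] = value
    return observed

def detects(pattern, bridged_pairs, kind):
    """A pattern detects the fault if the readback differs from what was written."""
    return read_with_bridge(pattern, bridged_pairs, kind) != pattern

n = 8
all_zeros = [0] * n
all_ones = [1] * n
checkerboard = [i % 2 for i in range(n)]
bridge = [(2, 3)]  # cells 2 and 3 are shorted (illustrative fault)

results = {
    "all_zeros": detects(all_zeros, bridge, "AND"),
    "all_ones": detects(all_ones, bridge, "AND"),
    "checkerboard_and": detects(checkerboard, bridge, "AND"),
    "checkerboard_or": detects(checkerboard, bridge, "OR"),
}
```

Uniform patterns write the same value to both shorted cells, so the short stays hidden; a checkerboard gives neighbors opposite values, which is why it is the pattern associated with coupling faults.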

Router Driven Test Method

- Patterns routed using same software as customer (Xilinx PAR)
- Most favorable interconnect routed first (not all interconnects are routed equally)
- Utilization of interconnect is <3% for a single design, customer or test
- 99% interconnect coverage when compared to customer design utilization

Test pattern generation flow:
- Input routed design to PAR
- Route the design using PAR
- Routing information enters the DB
- PAR references the DB to route the next design
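The generation flow above is a loop around a coverage database. The following Python sketch shows that loop in an idealized form; the router stand-in, the wire counts and the names are all assumptions for illustration (a real router such as PAR is constrained by connectivity and cannot pick wires freely).

```python
import random

def route_design(all_wires, covered, wires_per_design, rng):
    """Hypothetical router stand-in: prefer wires not yet in the coverage DB."""
    uncovered = [w for w in all_wires if w not in covered]
    already = [w for w in all_wires if w in covered]
    rng.shuffle(uncovered)
    rng.shuffle(already)
    return (uncovered + already)[:wires_per_design]

def generate_test_configs(num_wires, wires_per_design, target_percent, seed=0):
    """Route designs until the coverage DB reaches the target percentage."""
    rng = random.Random(seed)
    all_wires = list(range(num_wires))
    covered = set()   # the coverage "DB" of wires exercised so far
    configs = []
    while len(covered) * 100 < target_percent * num_wires:
        used = route_design(all_wires, covered, wires_per_design, rng)
        configs.append(used)   # one test configuration
        covered.update(used)   # routing information enters the DB
    return configs, covered

# 30 of 1000 wires per design corresponds to 3% utilization.
configs, covered = generate_test_configs(num_wires=1000, wires_per_design=30,
                                         target_percent=99)
```

Even in this best case, where every configuration uses only new wires, 3% utilization per design means dozens of configurations are needed to approach 99% coverage.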

Configurable Logic Block Tile

Routing Phases

FPGA Routing Resources

● The nature and availability of “Routing Resources” ultimately dictate the interconnect scenarios within the FPGA.

● Interconnect is a major impediment to performance and power consumption
  - Wires consume power, threatening chip performance.

Main routing components are:
  - Wire segment
  - Switching matrix
  - Low-skew (clock)
  - Low-skew distribution

IEEE Spectrum - June 2006

FPGA Test Coverage Claims (Xilinx)

● Features Coverage
  - 100% of FPGA features are tested
  - Every instance of LUT RAM, Flip-Flops, Carry, Tbuf, BlockRAM, DCM, etc. is tested

● Interconnect Coverage
  - Overall Interconnect Coverage is > 99.7%
  - For Customer Designs, Coverage is > 99.9%
  - Interconnect SAF and TF coverage is obtained by pattern construction, a structured (BIST) method

● Today’s (Xilinx) test program contains 1800+ test configurations

● Zero customer returns related to missing interconnect coverage

CF coverage is in development

Layout Driven Metal Test (LDMT)

● Test Metal Lines based on the physical Layout

● Attach test logic to each line
  - Proven to be very effective at detecting metal shorts

The Fault Models in FPGA Testing

● Bridging Fault
  - A short between a group of signals
  - The logic value of the shorted net may be 1-dominant (OR bridge), 0-dominant (AND bridge), or indeterminate

● Stuck-at Fault
  - A fixed (0 or 1) value is assigned to a signal line in the circuit
  - Single stuck-at faults: most popular form (classical fault model)

● Delay Fault
  - A fault that causes the combinational delay of a circuit to exceed the clock period
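The delay fault definition can be made concrete with a short Python sketch that computes the longest combinational path through a tiny netlist and compares it with the clock period. The netlist, the per-gate delays and the function names are invented for illustration, and setup time and clock skew are ignored.

```python
def critical_path_delay(gates, input_arrival=0.0):
    """Longest arrival time at any net.

    gates: list of (output_net, [input_nets], delay_ns) in topological order.
    Nets that are not driven by a listed gate are treated as primary inputs.
    """
    arrival = {}
    for out, ins, delay in gates:
        latest_input = max(arrival.get(i, input_arrival) for i in ins)
        arrival[out] = latest_input + delay
    return max(arrival.values())

def has_delay_fault(gates, clock_period_ns):
    """A delay fault exists when the combinational delay exceeds the clock period."""
    return critical_path_delay(gates) > clock_period_ns

# Hypothetical three-gate path; the defective copy has one slow gate.
good = [("n1", ["a", "b"], 1.0), ("n2", ["n1", "c"], 1.5), ("out", ["n2", "d"], 1.0)]
slow = [("n1", ["a", "b"], 1.0), ("n2", ["n1", "c"], 3.0), ("out", ["n2", "d"], 1.0)]

good_fails = has_delay_fault(good, clock_period_ns=4.0)
slow_fails = has_delay_fault(slow, clock_period_ns=4.0)
```

Both circuits compute the same logic function, so a slow functional test would pass both; only an at-speed check against the clock period separates them.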

The Functional Defects

● Interconnect defect
  - Modeled by bridging faults and/or stuck-at faults

● CLB defect
  - A faulty CLB can be detected through “the functional test” of the CLB.

● IOB defect
  - The information exchange with other components in the system may not be possible or reliable

CLB: configurable logic block, IOB: input/output block

Problems of Application-independent Testing

● Low efficiency in detecting timing-related faults
  - It is impossible to test even a small fraction of “all possible” interconnection patterns that may occur in the user-defined configurations

● Decreased yield for FPGA vendors
  - Some defective chips can still be used in some designs
  - The defective resources are not used by those designs

Application (Specific) Dependent Testing

● Only resources used by a specific configuration (design) are tested

● Avoids the disadvantage of the application-independent FPGA testing

● Saves test time

● Increased yield for FPGA vendors

● Xilinx uses this approach

“A Dynamic Platform for Reliability and Environmental Test of Re-programmable Xilinx Virtex-II 3000 FPGA”; Ramin Roosta, Ph.D. et al, Electronic Parts Engineering, NASA/JPL (Sponsored By NASA Electronic Parts and Packaging Program (NEPP))

Multi-Configuration Strategy (MCS)

The MCS has three test configurations:

● Interconnect testing (2)
  - All the (built-in) LUTs in the used CLBs are reconfigured to implement logic “AND” or logic “OR” functions (all-0/1 pattern test vectors at the PIs)
  - All the flip-flops in the application need to be preset to value “1” or “0”

● CLB testing (1)
  - Reprogram the interconnect network so that each used CLB is controlled by the primary inputs (PIs)

MCS: Multi-configuration strategy, CLB: configurable logic block, LUT: Look-Up Table
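The idea behind the two interconnect test configurations can be sketched in Python. With every LUT set to AND and all-1s at the PIs, any single stuck-at-0 on a used net forces the output to 0; with every LUT set to OR and all-0s at the PIs, any single stuck-at-1 forces it to 1. The netlist below is hypothetical, the names are illustrative, and the flip-flop preset step is left out for brevity.

```python
def simulate(netlist, primary_inputs, pi_value, gate, stuck=None):
    """Evaluate a combinational netlist whose LUTs all implement `gate`.

    netlist: list of (output_net, [input_nets]) in topological order.
    stuck:   optional (net, value) pair describing a single stuck-at fault.
    """
    combine = all if gate == "AND" else any

    def observe(net, value):
        # A stuck net always shows its stuck value, whatever drives it.
        if stuck is not None and stuck[0] == net:
            return stuck[1]
        return value

    values = {pi: observe(pi, pi_value) for pi in primary_inputs}
    for out, ins in netlist:
        values[out] = observe(out, int(combine(values[i] for i in ins)))
    return values

# Hypothetical application netlist: only the connectivity is kept from the design.
PRIMARY_INPUTS = ["a", "b", "c", "d"]
NETLIST = [("n1", ["a", "b"]), ("n2", ["c", "d"]),
           ("n3", ["n1", "c"]), ("po", ["n3", "n2"])]

def detected_faults(gate, pi_value, stuck_value):
    """List the nets whose stuck-at fault changes the primary output."""
    good = simulate(NETLIST, PRIMARY_INPUTS, pi_value, gate)["po"]
    nets = PRIMARY_INPUTS + [out for out, _ in NETLIST]
    return [net for net in nets
            if simulate(NETLIST, PRIMARY_INPUTS, pi_value, gate,
                        (net, stuck_value))["po"] != good]

sa0_found = detected_faults("AND", 1, 0)   # AND configuration, all-1s vector
sa1_found = detected_faults("OR", 0, 1)    # OR configuration, all-0s vector
```

Each configuration needs only one test vector, which is what makes the strategy simple: the LUT contents change, while the interconnect under test stays exactly as the application uses it.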

General Model of the Interconnect Test Configuration

[Figure: block diagram with primary inputs (PIs) and primary outputs (POs), a combinational part built from AND-configured LUTs, and clocked flip-flops (1D, C1); the “all-1s” labels mark the applied values.]

An Example of Interconnect Testing

[Figure: example circuit with nets L1 to L9 and n1; the LUTs are configured as AND and every labeled value is 1. Legend: LUT, Flip-Flop.]

Original Application Configuration

[Figure: original application configuration with blocks F1 to F9.]

Modified Configuration

Deep Submicron process and its effects on Testing

● Law of physics: leakage current increases as channel length and gate oxide thickness decrease

Xilinx Triple-Oxide Technology (90nm)

● Two oxide thicknesses are commonly used
  - Thin oxide in the fast core logic
  - Thick oxide in the versatile I/O

● Virtex-4 adds a third, medium-thickness oxide to reduce leakage current without compromising performance

Nanometer-scale CMOS technologies Challenges

● Modeling, simulation and verification of system components

● Accurate prediction of timing and power dissipation

● Design robustness and fault tolerance in the presence of highly unpredictable device behavior

● Some physical design issues such as floor planning and routing give rise to challenges in system timing and signal integrity

Nanometer-scale CMOS technologies Issues

● Design size and complexity

● Timing based on signal integrity and IR drop

● IR Drop

● Crosstalk and Inductance

● Electro-migration

● Digital/Analog Integration

● Power consumption

● System signal transmission

● Manufacturing rules

● Yield optimization

Deep Submicron process and its effects on Testing

● Increased variability (on chip)

● Decreased reliability

● Leakage Current, Power Consumption

● Loss of operating margin

● Junction Temperature and Thermal Issues (Thermal Runaway)

● Signal Integrity Issues (caused by faster I/O) at Chip/Board Level

● Design Entry and Power prediction tools’ Accuracy

Static Power variation & Saving

● At 90nm process technology static power becomes the dominant power factor (I/O’s are drawing minimal power)
  - Some FPGAs offer a lower power mode feature that disables the I/O, putting it into a sleep mode that further reduces static power

● Static Power from leakage increases exponentially with temperature
  - Proportional to voltage (0.3 VCCINT /1.2)
  - Increases exponentially due to source → drain leakage
  - At 1.26V static power for VCCINT is ~20% higher than at 1.2V (try to use VCCINT close to 1.2V)
  - Keep junction temperature as low as possible

● Static power scales linearly with part size
  - Use smallest part to reduce leakage (LX60 has 40% less leakage power than LX100)

● Static power is increased with process variation [VT and gate length (2.5x)]
  - Look at worst case and typical at a given temperature

Dynamic Power Variation & Saving

Dynamic power consumption (and performance) is very sensitive to switched capacitance (mainly routing capacitance in Xilinx FPGAs)

- Dynamic Power = N·C·V²·f

N = Number of nodes switching
C = Capacitive load
V = Voltage swing
f = Switching rate (dynamic power varies linearly with frequency)

● Tighten VCCINT and run at the center of the range or down to 5% below center (running at 1.2 V rather than 1.26 V reduces leakage by better than 10%)

● Run non-critical functions with a low speed clock (rather than an arbitrary high speed clock present in the design)
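A short Python sketch of the dynamic power relation on this slide (N times C times V squared times f) shows how the two recommendations act on it. The node count, capacitance and clock rates are arbitrary illustrative values, not Virtex-4 data, and only the dynamic term is modeled (leakage is not included).

```python
def dynamic_power(n_nodes, cap_farads, v_swing, freq_hz):
    """Dynamic power in watts: N * C * V^2 * f."""
    return n_nodes * cap_farads * v_swing ** 2 * freq_hz

# Assumed design point: 10,000 switching nodes, 10 fF each, 100 MHz.
N, C, F = 10_000, 10e-15, 100e6

at_1v26 = dynamic_power(N, C, 1.26, F)
at_1v20 = dynamic_power(N, C, 1.20, F)

# Fractional saving from lowering the supply: 1 - (1.20 / 1.26) ** 2
voltage_saving = 1 - at_1v20 / at_1v26

# Halving the clock of a non-critical function halves its dynamic power.
half_clock_ratio = dynamic_power(N, C, 1.26, F / 2) / at_1v26
```

Because voltage enters squared, the 5% supply reduction from 1.26 V to 1.2 V trims roughly 9% of dynamic power in this model, while clock rate changes scale it linearly.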

Overall Power Minimization in Virtex-4

● Power minimization falls into a few areas

- Static & Dynamic Power (adjustment to operating environment)
- Design Code Optimization
- Interconnect transistors
-- Bump up performance target for XST router (may be able to gain 5-10% power improvement)
-- Minimize path length (capacitance and power are lowered)
-- Minimize interconnect hops (capacitance and power are lowered)
-- Interconnect capacitance
-- Use Relationally Placed Macros (RPMs) or another placement method to guide tighter placement and help reduce routing length, especially on repeated macros
-- Number of nodes switching into a capacitive load
-- Minimize logic levels; try to pack logic (if possible)
-- Clocks driving loads (use BUFGMUX); reduces switching at target flip-flops, a common practice in ASIC design

FPGA Reliability Analysis/Concerns

● Transient errors due to complexity and feature size reduction in FPGAs can be mitigated through redundancy-based techniques

● The rate of degradation due to accelerated aging phenomena depends on supply voltage, temperature, switching activity, and leakage currents

● Different aging phenomena result in permanent failures of the FPGA's components and interconnect circuitry, such as TDDB (time-dependent dielectric breakdown; reduces as gate leakage increases), HCE (hot-carrier effects; a function of switching activity) and EM (electromigration; interconnect)

● A study of the aging impact of TDDB, EM and HCE on Xilinx-style (SRAM-based) FPGAs, using a set of benchmarks, shows that a significant portion of the FPGA resources (LUTs) may fail in the first 3 to 5 years of operation (commercial) [8]

Conclusion (FPGA Testing)

● Multi-Configuration Strategy (MCS) provides a simple way to perform interconnect and CLB testing in application-dependent testing

● FPGAs are really SOCs requiring some DFT to be built in

● Use BIST for memory (MBIST) megacells (ROMs, RAMs, FIFOs)

● Embedded scan to improve manufacturability

● Iddq measurement as a substitute for, or complement to, burn-in

● Signal integrity related issues will dominate

● Power consumption, junction temperature, thermal runaway

● Some design code and place/route optimization is required

● Leave plenty of margin

References

[1] M. B. Tahoori, E. J. McCluskey, M. Renovell, P. Faure, “A Multi-Configuration Strategy for an Application Dependent Testing of FPGAs,” Proc. VLSI Test Symp., 2004.

[2] M. B. Tahoori, “Application-Dependent Testing of FPGA Interconnects,” Proc. Int’l Symp. On Defect and Fault Tolerance, 2003.

[3] C. Jordan, W. P. Marnane, “Incoming inspection of FPGAs”, Proc. European Test Conf. pp. 371-377, 1993.

[4] W. K. Huang, F. J. Meyer, X.-T. Chen, F. Lombardi, “Testing Configurable LUT-Based FPGAs,” IEEE Trans. on VLSI Systems, pp. 276-283, June 1998.

[5] M. Abramovici, C. Stroud, “BIST-Based Detection and Diagnosis of Multiple Faults in FPGAs,” Proc. of Int’l Test Conf., 2000.

[6] A. Krasniewski, “Application-Dependent Testing of FPGA Delay Faults,” Proc. 25th EUROMICRO Conf., vol. 1, pp. 260-267, 1999.

[7] Das, N. A. Touba, “A Low Cost Approach for Detecting, Locating, and Avoiding Interconnect Faults in FPGA-Based Reconfigurable Systems,” Proc. of Int’l Conf. on VLSI Design, 1999.

[8] S. Srinivasan, N. Vijaykrishnan, K. Sarpatvari, “FLAW: FPGA Lifetime Awareness,” DAC 2006, July 24-28, 2006, San Francisco.

[9] S. Mahapatra, V. R. Rao, B. Cheng, M. Khare, C. D. Parikh, J. C. S. Woo and J. M. Vasi, “Performance and hot-carrier reliability of 100 nm channel length jet vapor deposited Si3N4 MNSFETs,” IEEE Transactions on Electron Devices, vol. 48, no. 4, pp. 679-84, April 2001.

[10] S. M. Alam, C. L. Gan, D. E. Troxel, and C. V. Thompson, “Circuit-Level Reliability Analysis of Cu Interconnects,” in Proceedings of International Symposium on Quality Electronics Design (ISQED), 2004.

[11] J. Srinivasan, S. V. Adve, P. Bose and J. A. Rivers, “The Impact of Technology Scaling on Lifetime Reliability,” in Proceedings of International Conference on Dependable Systems and Networks (DSN), 2004.

[12] X. Xuan, A. Chatterjee, and A. D. Singh, “Local Redesign for Reliability of CMOS Digital Circuits Under Device Degradation,” in Proceedings of International Reliability Physics Symposium (IRPS), 2004.

[13] F. N. Najm, “Transition density, a stochastic measure of activity in digital circuits,” in Proceedings of Annual ACM IEEE Design Automation Conference, 1991.

[14] J. H. Anderson, F. Najm, and T. Tuan, “Active leakage power optimization for FPGAs,” in Proceedings of ACM/SIGDA International Symposium on Field-programmable gate arrays, 2004.

[15] “Critical Reliability Challenges for the International Technology Roadmap for Semiconductors,” in International Sematech Technology transfer 03024377A-TR, 2003.

[16] S. Srinivasan, A. Gayasen, N. VijayKrishnan and T. Tuan, “Leakage control in FPGA routing fabric,” in Proceedings of Asia-Pacific Design Automation Conference (ASPDAC), 2005.