SEE Validation of SEU Mitigation Methods for FPGAs

Carl Carmichael (1), Sana Rezgui (1), Gary Swift (2), Jeff George (3), & Larry Edmonds (2)

(1) Xilinx Corporation, San Jose CA
(2) Jet Propulsion Laboratory, Pasadena CA
(3) Aerospace Corporation, Albuquerque NM

P201-L/MAPLD2005

"This work was carried out in part by the Jet Propulsion Laboratory, California Institute of Technology, under contract with the National Aeronautics and Space Administration."

"Reference herein to any specific commercial product, process, or service by trade name, trademark, manufacturer, or otherwise, does not constitute or imply its endorsement by the United States Government or the Jet Propulsion Laboratory, California Institute of Technology."




XTMR SEE Testing

• Experiments were devised to focus TMR mitigation on major architectural elements of the Virtex-II FPGA.
– Sequential state machines were created with registers, multipliers, and memories.
• Configurable Logic Block
– Combinatorial logic, sequential logic, arithmetic, multiplexing.
– Design implementation is an array of counters.
• Multipliers
– Dedicated 18 x 18 bit multiply function blocks.
– Design implementation is an array of multiply-and-accumulate functions.
• Block Memories
– Synchronous dual-port 18 kbit RAM blocks.
– First design is a large memory block rewritten externally.
– Second design is implemented as an array of ROMs initialized to incrementing values with internal EDAC.
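To make the TMR idea behind these test designs concrete, here is a minimal Python sketch of a triplicated counter with a bitwise majority voter in its feedback path. The slides do not describe how XTMR inserts voters, so the function names and the 8-bit width are illustrative assumptions rather than the actual implementation.

```python
def majority_vote(a: int, b: int, c: int) -> int:
    """Bitwise 2-of-3 majority of three redundant copies of a value."""
    return (a & b) | (b & c) | (a & c)


def tmr_counter_step(states, width=8):
    """Advance a triplicated counter by one clock (hypothetical model).

    The three stored copies are voted first, so a single upset copy is
    outvoted and all three copies are rewritten with the same next value.
    """
    mask = (1 << width) - 1
    voted = majority_vote(*states)
    next_value = (voted + 1) & mask
    return [next_value, next_value, next_value]


# One copy has been upset (7 instead of 5); the voter masks the error.
print(tmr_counter_step([5, 5, 7]))
```

If two of the three copies are upset before the next vote, the voter passes the wrong value, which is the failure mode the beam tests are designed to provoke.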


Plot Definitions

• Predicted SEFI cross-section
– Static and dynamic SEE characterization of the Virtex-II FPGA revealed several Single Event Functional Interrupt (SEFI) modes: POR (2.5E-06), SMAP (1.72E-06), IOB (4.2E-06).
– Combined, these cross-sections represent the minimum functional error cross-section for a single Virtex-II (XQR2V6000) device on orbit.
• Worst Case Orbital Upset Rate
– The CREME96 calculation of the worst-case orbital upset rate for an XQR2V6000 is 7,740 bit-errors/day (9E-02 bit-errors/sec) in a GEO orbit at 36,000 km during the worst day of an Anomalously Large Solar Flare, accounting for both heavy ions and protons. In a 40 MeV Kr beam the same upset rate is reached at a flux of 1.25E-01 p/cm2/s. The equivalent upset rates for all other orbits and solar conditions therefore lie to the LEFT of this line.
• Single Event Functional Interrupts
– This is the average cross-section of the SEFIs observed while collecting the data shown in the plot. This cross-section is not flux dependent. Deviations from the predicted value reflect the limited statistics of the total fluence accumulated during each test.
• Functional Errors
– Data plot of the observed events in which the Device Under Test (DUT) returned an incorrect result. The cross-section is the number of error events divided by the total fluence at the specified flux. TMR denotes that the DUT design was fully mitigated with XTMR and scrubbing. The Unmitigated results were obtained with a functionally identical design without XTMR; scrubbing was still used for the unmitigated test.
• Extrapolation
– A derived function describing mitigation failure as a function of upset rate. Extending the function predicts functional error cross-sections at worst-case orbital upset rates that are less than the SEFI cross-sections.
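The arithmetic behind these definitions can be checked with a short Python sketch. The variable and function names are illustrative. The cm2/device unit for the SEFI values is assumed from the plot axes, and the implied device cross-section assumes that upset rate equals cross-section times flux.

```python
# SEFI mode cross-sections quoted above (assumed to be in cm2/device).
SEFI_SIGMA = {"POR": 2.5e-6, "SMAP": 1.72e-6, "IOB": 4.2e-6}
predicted_sefi_sigma = sum(SEFI_SIGMA.values())

# Worst-case orbital upset rate: bit-errors/day converted to bit-errors/sec.
BIT_ERRORS_PER_DAY = 7740
SECONDS_PER_DAY = 24 * 60 * 60
upsets_per_sec = BIT_ERRORS_PER_DAY / SECONDS_PER_DAY

# Beam flux (p/cm2/s) quoted as giving the same upset rate in the Kr beam.
EQUIVALENT_BEAM_FLUX = 1.25e-1
# Assumption: rate = sigma * flux, so the implied whole-device upset
# cross-section in that beam is rate / flux.
implied_device_upset_sigma = upsets_per_sec / EQUIVALENT_BEAM_FLUX


def cross_section(error_events: float, fluence: float) -> float:
    """Cross-section (cm2) = observed error events / fluence (particles/cm2)."""
    return error_events / fluence


print(f"Predicted SEFI cross-section: {predicted_sefi_sigma:.3g}")
print(f"Worst-case upset rate:        {upsets_per_sec:.3g} bit-errors/sec")
print(f"Implied device upset sigma:   {implied_device_upset_sigma:.3g} cm2")
```

The conversion reproduces the rounded 9E-02 bit-errors/sec figure, and the three SEFI modes sum to 8.42E-06.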


PLOT 1: XQR2V6000 Mitigation Error Statistics (CLB/IOB Logic: State-Machines)

[Plot: Sigma (cm2/device), 1.00E-07 to 1.00E-02, versus Beam Flux (particles/cm2/s), 1.E-02 to 1.E+04. Second axis: Configuration Bit Errors per Scrub Cycle, 3.5E-02 to 3.5E+03. Beam: 40 MeV Kr, LET = 22.3 MeV·cm2/mg.]

Series: Unmitigated Functional Errors; TMR Functional Errors; Extrapolation (square root function); Single Event Functional Interrupts (SEFIs); Worst Case Orbital Upset Rate (9E-2 Upsets/Sec); Predicted SEFI Cross-Section.

Annotations: 36,000 km GEO orbit, worst-day solar flare, 8,000 bit-errors/day; all other orbits.

• SEFIs drive error rate for all designs and all orbits.
• Mitigation errors on orbit are always less than SEFI errors by orders of magnitude.


PLOT 2: XQR2V6000 Mitigation Error Statistics (Dedicated Multipliers: Multiply-and-Accumulate)

[Plot: Sigma (cm2/device), 1.00E-07 to 1.00E-02, versus Beam Flux (particles/cm2/s), 1.E-02 to 1.E+05. Second axis: Configuration Bit Errors per Scrub Cycle, 3.5E-02 to 3.5E+03. Beam: 40 MeV Kr, LET = 22.3 MeV·cm2/mg.]

Series: Unmitigated Functional Errors; TMR Functional Errors; Extrapolation (square root function); Single Event Functional Interrupts (SEFIs); Worst Case Orbital Upset Rate (9E-2 Upsets/Sec); Predicted SEFI Cross-Section.

Annotations: 36,000 km GEO orbit, worst-day solar flare, 8,000 bit-errors/day; all other orbits.

• SEFIs drive error rate for all designs and all orbits.
• Mitigation errors on orbit are always less than SEFI errors by orders of magnitude.


PLOT 3: XQR2V6000 Mitigation Error Statistics (Block Memory: Read/Write)

[Plot: Sigma (cm2), 1.00E-12 to 1.00E+02, versus Beam Flux (particles/cm2/s), 1.E-02 to 1.E+05. Second axis: Configuration Bit Errors per Scrub Cycle, 3.5E-02 to 3.5E+03. Beam: 40 MeV Kr, LET = 22.3 MeV·cm2/mg.]

Series: TMR Functional Errors; Extrapolation (square root function); Single Event Functional Interrupts (SEFIs); Worst Case Orbital Upset Rate (9E-2 Upsets/Sec); Predicted SEFI Cross-Section.

Annotations: 36,000 km GEO orbit, worst-day solar flare, 8,000 bit-errors/day; all other orbits.

• SEFIs drive error rate for all designs and all orbits.
• Mitigation errors on orbit are always less than SEFI errors by orders of magnitude.


Improved SEE Test Methodology for Mitigation

• The functional error rate of a mitigated system is expected to follow a physical relationship with the upset rate. The relationship gives the probability, which increases with bit upset rate, of upsetting bit combinations that cause a mitigated (TMR) system to fail:

$$R = \frac{1}{T_C} - \frac{1}{T_C} \prod_{i=1}^{M} \left[ 3\exp(-2 N_i r T_C) - 2\exp(-3 N_i r T_C) \right]$$

– R = Mitigation Error Rate
– M = Number of groups of relevant bits
– N_B = Average number of relevant bits per group (N_i in the equation is the value for group i)
– T_C = Scrub Time
– r = Upset Rate of relevant bits

• Therefore, testing at extremely high fluxes that span several orders of magnitude can reveal this functional relationship between mitigation error rate and bit upset rate.

• This function can then be extrapolated to make predictions at the much lower upset rates of Earth orbits.
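As a rough numerical sketch of this relationship, the Python below evaluates a mitigation error rate under two assumptions: a TMR group fails when at least two of its three blocks take an upset within one scrub cycle, and upsets arrive as a Poisson process at rate r per bit. The function name is illustrative. The example parameters (9224 groups, 200 bits per block, 0.266 s scrub time) are the values quoted with the counter-data fit in these slides.

```python
import math


def tmr_error_rate(r, bits_per_block, scrub_time):
    """Mitigation error rate R (system errors/second), assumed TMR model.

    r              -- upset rate per relevant bit (upsets/bit-second)
    bits_per_block -- iterable with one entry N_i per group
    scrub_time     -- scrub cycle time T_C in seconds
    """
    log_survive = 0.0
    for n in bits_per_block:
        # Probability that one block takes at least one upset in a cycle.
        p_block = -math.expm1(-n * r * scrub_time)
        # Probability that at least two of the three blocks are upset.
        p_group = p_block * p_block * (3.0 - 2.0 * p_block)
        if p_group >= 1.0:
            return 1.0 / scrub_time  # saturated: an error every cycle
        log_survive += math.log1p(-p_group)
    # The system errs in a cycle unless every group survives it.
    return -math.expm1(log_survive) / scrub_time


groups = [200] * 9224
for rate in (1e-7, 1e-6, 1e-5, 1e-4, 1e-3):
    print(f"r = {rate:.0e}  R = {tmr_error_rate(rate, groups, 0.266):.3e}")
```

In this model R grows roughly as the square of r at low upset rates and saturates near 1/T_C at high rates. Working with `expm1` and `log1p` keeps the very small per-group probabilities from being lost to rounding.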


Mitigation System Topology

Group      Module 1                 Module 2                 Module 3
Group 1    Block (1,1): N1 bits     Block (1,2): N1 bits     Block (1,3): N1 bits
Group 2    Block (2,1): N2 bits     Block (2,2): N2 bits     Block (2,3): N2 bits
…          …                        …                        …
Group M    Block (M,1): NM bits     Block (M,2): NM bits     Block (M,3): NM bits
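The topology can be exercised with a small Monte Carlo sketch in Python. It assumes that each of the three blocks in a group is independently upset with the same probability during a scrub cycle, and that the system fails in a cycle when any group has two or more upset blocks. The function names, group count, and probabilities are illustrative and are not measured values.

```python
import random


def simulate_system_failure(num_groups, p_block, cycles, seed=0):
    """Estimate the per-cycle system failure probability by simulation."""
    rng = random.Random(seed)
    failures = 0
    for _ in range(cycles):
        system_failed = False
        for _group in range(num_groups):
            # Count which of the three redundant blocks were upset this cycle.
            upset_blocks = sum(rng.random() < p_block for _ in range(3))
            if upset_blocks >= 2:
                system_failed = True  # the voter can no longer mask the error
        failures += system_failed
    return failures / cycles


def analytic_system_failure(num_groups, p_block):
    """Closed form for the same model: 1 - (1 - P_group)^M."""
    p_group = p_block**2 * (3.0 - 2.0 * p_block)
    return 1.0 - (1.0 - p_group) ** num_groups


print(simulate_system_failure(5, 0.1, 20000))
print(analytic_system_failure(5, 0.1))
```

The two estimates should agree to within sampling noise, which gives a quick check on the two-of-three failure model before it is applied to a real design.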


Probability Function Fit for Counter Data

[Plot "Counters": R (system errors/second), 0.0001 to 10, versus r (bit errors/bit-second), 1e-7 to 1e-3.]

Series: data; small-r form extrapolated; fit using exact equation.

Fit parameters:
– M = 9224
– Ni = 200 (same number of bits in each block)
– Sigma per bit = 2.1E-8 cm2
– TC = 0.266 sec
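A hedged sketch of how these fit parameters might be used to extrapolate to orbit follows. It assumes that the per-bit upset rate is the per-bit cross-section times the beam flux, and that the "small-r form" is the second-order expansion R ≈ 3·T_C·r²·Σ N_i² of the two-of-three TMR failure model. Neither formula is spelled out in the slides. The equivalent orbital flux of 1.25E-01 p/cm2/s is the value quoted in the plot definitions.

```python
SIGMA_PER_BIT_CM2 = 2.1e-8   # per-bit upset cross-section from the fit
NUM_GROUPS = 9224            # M
BITS_PER_BLOCK = 200         # N_i, identical for every block
SCRUB_TIME_S = 0.266         # T_C


def per_bit_upset_rate(flux):
    """Assumed conversion: r (upsets/bit-second) = sigma_bit * flux."""
    return SIGMA_PER_BIT_CM2 * flux


def small_r_error_rate(r):
    """Assumed small-r form: R ~= 3 * T_C * r^2 * sum(N_i^2)."""
    return 3.0 * SCRUB_TIME_S * r * r * NUM_GROUPS * BITS_PER_BLOCK**2


ORBIT_EQUIVALENT_FLUX = 1.25e-1  # p/cm2/s, worst-case GEO equivalent

r_orbit = per_bit_upset_rate(ORBIT_EQUIVALENT_FLUX)
R_orbit = small_r_error_rate(r_orbit)
# Express the extrapolated rate as an equivalent cross-section (rate / flux)
# so it can be compared with the SEFI cross-sections on the same axis.
sigma_equivalent = R_orbit / ORBIT_EQUIVALENT_FLUX

print(f"r on orbit       ~ {r_orbit:.3g} upsets/bit-second")
print(f"R on orbit       ~ {R_orbit:.3g} system errors/second")
print(f"equivalent sigma ~ {sigma_equivalent:.3g} cm2")
```

Under these assumptions the extrapolated mitigation cross-section falls well below the combined SEFI cross-section, in line with the annotations on the plots.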


Conclusions

• Validation of mitigation techniques becomes far more efficient and accurate when the upset-rate dependency of the mitigation method is demonstrated by testing at flux rates that overwhelm the mitigation.

• The static SEFI cross-section is the dominating factor when calculating orbital error rates for any Virtex-II design mitigated with full XTMR and scrubbing.

• Additional Work
– Self-scrubbing BlockRAMs
– Self-scrubbing FPGA configuration
– Soft-core processors (e.g. MicroBlaze)