SEE Mitigation Strategies for Digital Circuit Design Applicable to ASIC and FPGAs Prof. Fernanda Lima Kastensmidt, Ph.D. Instituto de Informatica Universidade Federal do Rio Grande do Sul Porto Alegre – RS – Brazil

Motivation

A large set of electronic devices used in avionic, space and ground-level applications can be upset by ionizing particles.

[Figure: a general system composed of memory, processors, analog electronics, FPGA and ASIC. Hardened components: high cost, high reliability. COTS components: low cost, low reliability.]

Motivation

Solution I: if hardened components are too expensive, the solution may be to design your own hardened device.

– Which fault tolerance techniques should be used?
– How much fault tolerance is enough?

It is necessary to qualify your hardened design.

[Figure: hardened components, high cost, high reliability.]

Motivation

Solution II (COTS components): it is necessary to qualify the device to analyze its robustness for the application.

– Is it possible to apply some fault tolerance technique?
  – Software level
  – Component replication level

[Figure: COTS components, low cost, low reliability.]

Types of SEE

Single event phenomena can be classified into three effects (in order of permanency):

– Single event upset and single event transient (soft error)
– Single event latchup (soft or hard error)
– Single event burnout (hard failure)

Hard errors or Single Event Latchup (SEL) are due to shorts between ground and power, and cause permanent functional damage.

Collected Charge

Depending on the circuit, the transistor size and the charge energy, current pulses of different amplitudes, durations and shapes will appear.

Charge Collection Mechanism

$I_C(t) = I_{CRITICAL}(t) = I_P(t) - I_{ON}(t)$

[Figure: circuit node annotated with the currents I_P, I_ON and I_C.]

A soft error occurs when $Q_{collected} > Q_{critical}$.
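The upset condition (collected charge greater than critical charge) can be tried numerically. The sketch below is only an illustration: it assumes a double-exponential current pulse, a common modelling choice that the slides do not state, and every numeric value and function name is hypothetical.

```python
import math

def double_exp_current(t, q_total, tau_fall, tau_rise):
    """Current in amperes at time t (seconds) for a pulse that deposits q_total coulombs."""
    scale = q_total / (tau_fall - tau_rise)
    return scale * (math.exp(-t / tau_fall) - math.exp(-t / tau_rise))

def collected_charge(q_total, tau_fall, tau_rise, t_end, steps=10_000):
    """Integrate the pulse from 0 to t_end with the trapezoidal rule."""
    dt = t_end / steps
    samples = [double_exp_current(i * dt, q_total, tau_fall, tau_rise)
               for i in range(steps + 1)]
    return dt * (sum(samples) - 0.5 * (samples[0] + samples[-1]))

def is_soft_error(q_collected, q_critical):
    """Condition from the slide: an upset needs Qcollected > Qcritical."""
    return q_collected > q_critical

# Illustrative values only: 100 fC deposited, 200 ps fall and 50 ps rise constants.
q = collected_charge(100e-15, 200e-12, 50e-12, t_end=5e-9)
print(is_soft_error(q, q_critical=50e-15))   # node with a 50 fC critical charge
print(is_soft_error(q, q_critical=150e-15))  # node with a 150 fC critical charge
```

With these assumed numbers the same strike upsets the node with the smaller critical charge and leaves the other one intact.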

Fault Tolerance

[Figure: ionization in the device leading to a FAILURE.]

Fault masking: any technique that prevents faults from introducing errors to the output (failure).

Fault Tolerance

[Figure: chain from ionization to failure. Ionization produces a transient current (injected into or extracted from the junction), then a transient voltage pulse (capacitor node), then a BIT-FLIP at a clk edge, then a FAILURE. The stages are labelled FAULT, ERROR, FAULT EFFECT and FAILURE, with fault latency and error latency marked between them.]

Techniques shown along the chain:

– Shielding
– Sensors (detection)
– Fault masking (hardening by design):
  – Hardware and time redundancy
  – Hardened memory cells
  – Error-correction codes
  – Self-checking mechanisms with recovery

Fault Tolerance

[Figure: chain from ionization to failure. Ionization produces a transient current (injected into or extracted from the junction), then a transient voltage pulse (capacitor node), then a BIT-FLIP at a clk edge, then a FAILURE. The stages are labelled FAULT, ERROR, FAULT EFFECT and FAILURE, with fault latency and error latency marked between them.]

Techniques shown along the chain:

– Sensors (detection)
– Fault masking (hardening by design):
  – Hardware and time redundancy
  – Hardened memory cells
  – Error-correction codes
  – Self-checking mechanisms with recovery
– Redundant spare components, for when the number of faults overcomes the mitigation technique

Outline

Radiation Effects on Digital ICs

Radiation Hardening by Design: Strategies for ASICs

Radiation Effects on FPGAs

Radiation Hardening by Design: Strategies for FPGAs

Final Remarks

Single Event Effects (SEEs)

– Single Event Upset (SEU): bit-flip in a sequential logic element
– Digital Single Event Transient (DSET): transient voltage pulse in the combinational logic

[Figure: combinational logic between two sequential logic stages, with a transient effect in the combinational logic.]

SEU in Sequential Logic

[Figure: memory cell with word lines (WL) and its OFF transistors marked. Ionization flips the stored value from 1/0 to 0/1 (BIT-FLIP).]

Hardened Memories

Approach 1: use decoupling resistors to slow the regenerative feedback response of the cell, avoiding the bit-flip.

[Rockett, R., IEEE TNS, 1992]

Hardened Memory

Approach 2: add transistors to create an appropriate feedback devoted to restoring the corrupted data.

[Figure: transistor-level schematics of the IBM memory cell [Rockett cell, 88] and the HIT memory cell [Velazco, 92].]

Hardened Memories

The principle is to store the data in two different locations within the cell in such a way that the corrupted part can be restored.

[Figure: transistor-level schematics of the Whitaker/Liu memory cell [Liu, 92] and the DICE memory cell [Calin, 96].]

Dual Interlocked storage Cell (DICE)

[Figure sequence over three slides: DICE latch with clk inputs, OFF transistors marked and outputs Qa and Qb. One internal node is flipped while the other nodes keep their values.]

The original value is restored.

Challenges in Sequential Logic

– Particle incidence angle
– Transistor dimensions
– Voltage supply
– Memory array density

[Figure: MULTIPLE BIT UPSETS. An ionization track affecting a single memory cell versus multiple memory cells.]

Charge Sharing (NMOS transistor)

[Figure: snapshots at T=0, T=50ps, T=100ps, T=250ps, T=800ps and T=2ns.]

[Reed, et al., New Electronic Technologies Insertion into Flight Programs Workshop, 2007]

Limitations of Hardened Memory

Multiple nodes collecting charge are able to upset hardened memory cells.

Solutions:

– Shallow Trench Isolation (STI) structures
– Suitable transistor placement and routing
– Hardened memory cells combined with hardware redundancy

[Figure: ionization depositing charge at multiple nodes.]

Triple Modular Redundancy

[Figure: combinational logic, triplicated sequential logic clocked by clk, and a majority voter (MAJ). A fault (X) in one copy is voted out (OK).]

MAJ truth table:

inputs  MAJ
000     0
001     0
010     0
011     1
100     0
101     1
110     1
111     1

Each master-slave flip-flop can be composed of:

– standard latches: robust to charge collected at multiple nodes in the same latch
– hardened latches: robust to charge collected at multiple nodes in latches crossing domains too
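The MAJ truth table corresponds to a 2-of-3 majority function. A minimal Python sketch of voting over three redundant register copies follows; the function names and the stored value are illustrative, not taken from the slides.

```python
def majority(a, b, c):
    """Bitwise 2-of-3 majority: an output bit is 1 when at least two inputs have it set."""
    return (a & b) | (a & c) | (b & c)

def tmr_read(copies):
    """Vote the three redundant register copies into one value."""
    r0, r1, r2 = copies
    return majority(r0, r1, r2)

stored = 0b1011
copies = [stored, stored, stored]
copies[1] ^= 0b0100           # SEU: one bit flips in one copy
print(bin(tmr_read(copies)))  # the vote still returns the stored value
copies[2] ^= 0b0100           # a second upset hits the same bit of another copy
print(bin(tmr_read(copies)))  # two copies agree on the wrong value, so the vote is wrong
```

The second case shows why accumulated upsets in distinct copies defeat plain TMR.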

Triple Modular Redundancy

[Figure: combinational logic, triplicated sequential logic clocked by clk, and a single MAJ voter that outputs 1 when at least two of its inputs are 1. The fault (X) is at the voter.]

The voter's output can show a transient wrong value that may be captured by the next memory cell.

Triple Modular Redundancy

Triple MAJ voter:

• Increases current drive, helping to keep the node at its original value.

[Figure: combinational logic and sequential logic clocked by clk, followed by three MAJ voters. All three paths are OK. Label: current strength.]

Triple Modular Redundancy

[Figure: combinational logic, triplicated sequential logic clocked by clk, and a triple MAJ voter, with faults (X) on all paths.]

Catastrophic effect: the system votes three wrong values out of three and the result is assumed to be correct.

SET in Combinational Logic

Each node has an associated:

– Capacitance
– Resistance

[Figure: current versus time at the node. The charge Qi has a drift part (Qdrift) and a diffusion part (Qdiffusion) and is related to the critical charge QCRIT. The SET pulse is characterized by amplitude x width.]

SET in Combinational Logic

Not all SETs are captured by a memory cell. They can be:

– Logically masked: the pulse reaches a gate whose output is fixed by its other inputs.
– Electrically masked: the pulse is attenuated along the path until it becomes a negligible pulse.
– Latch-window masked: the pulse does not coincide with the clk edge of the memory cell.

[Figure sequence over three slides: a small gate network with inputs labelled e0, e1, e2 and a3 feeding a memory cell with output Q, showing one masking case per slide.]
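Latch-window masking can be sketched as an interval test. The window model below (a setup time before the clock edge and a hold time after it) is an assumption made for illustration, as are the function name and the numbers; all times are integers in picoseconds.

```python
def latch_window_masked(pulse_start, pulse_width, clk_edge, setup, hold):
    """Return True when the SET pulse misses the latching window entirely.

    The window is modelled as [clk_edge - setup, clk_edge + hold].
    """
    pulse_end = pulse_start + pulse_width
    window_start = clk_edge - setup
    window_end = clk_edge + hold
    return pulse_end <= window_start or pulse_start >= window_end

# Clock edge at 1000 ps with an assumed 50 ps setup and 50 ps hold time.
print(latch_window_masked(100, 200, 1000, 50, 50))  # pulse ends long before the window
print(latch_window_masked(900, 200, 1000, 50, 50))  # pulse overlaps the window
```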

Electrical Masking

Heavy ion radiation results: 180nm CMOS

[Figure annotation: pulse too narrow.]

[Bruguier, G., et al., IEEE TNS, 1996]

SET vs. Frequency

Radiation results: DSET for 180nm vs. frequency

[Figure: results plotted against clk frequency.]

[Benedetto et al., IEEE TNS, 2004]
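One way to see why the clock frequency matters is a first-order estimate of the chance that a pulse overlaps a latching window. The model below is an assumption of this sketch, not the measured data behind the plot: the SET arrives at a uniformly random time, and it is captured if it overlaps a window of fixed length once per clock period.

```python
def capture_probability(tw_ps, window_ps, clk_period_ps):
    """First-order chance that a SET of width tw_ps overlaps one latching window."""
    return min(1.0, (tw_ps + window_ps) / clk_period_ps)

# Hypothetical 200 ps transient and 50 ps latching window at three clock rates.
for freq_mhz in (100, 500, 1000):
    period_ps = 1_000_000 // freq_mhz
    print(freq_mhz, "MHz:", capture_probability(200, 50, period_ps))
```

In this model the capture probability grows linearly with frequency until it saturates at 1.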

Challenges in Combinational Logic

The SET Transient Width (TW) may vary from a few hundred picoseconds to a few nanoseconds according to the LET.

[Figure: SET pulses of width TW shown against clk edges, and a plot of critical transient width (ps) versus process technology (nm) with clock-frequency marks at 500 MHz, 1 GHz, 2.5 GHz, 5 GHz and 100 GHz.]

[Dodd, P., IEEE TNS 2004]

SET vs. SEU Error Rate

Challenges in Combinational Logic

Depending on the fan-out of the logic topology, a single SET may originate multiple SETs.

[Figure: gate network with inputs a0 to a5 and outputs y0 and y1 captured in Q0 and Q1. One SET reaches both outputs (X, X).]

Identifying the most sensitive nodes

Fault injection performed by electrical (SPICE) and logic simulations can identify the most sensitive nodes:

– Lower critical charge (QCRIT)
– Lower SET logical masking probability

[Figure: gate network with inputs A to F and output Z, with the most sensitive nodes marked.]

Transistor Resizing

[Figure: gate network with inputs A to F and output Z, with the most sensitive nodes marked. Label: QCRITICAL.]

[Zhou et al., IRPS 2004] [Cazeaux et al., IOLTS 2005] [Dhillon et al., IEEE Transaction on ISVLSI 2006]

Gate Replication

• Increases current drive, helping to keep the node at its original value.

[Figure: gate network with inputs A to F and output Z, with the most sensitive nodes marked. Label: current strength.]

[Lisboa, C., et al., SBCCI 2005] [Nieuwland et al., IOLTS 2006]

Temporal Filtering

Votes the SET out by time redundancy. The time redundancy is implemented by delays on the clock lines or at the latch/flip-flop inputs.

[Figure: two implementations. In the first, the combinational logic feeds three sequential elements clocked at clk, clk+ΔT and clk+2ΔT. In the second, the sequential elements share clk and receive the combinational output directly, delayed by ΔT and delayed by 2ΔT. Both end in a triple or single MAJ voter, and the SET (X) is voted out (OK).]

[Nicolaidis, VTS 1999], [Anghel et al., DATE 2000]
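The voting-in-time idea can be sketched by sampling one signal at three shifted instants and taking the majority. This is a simplified behavioural model under stated assumptions: the SET is a rectangular pulse that inverts the correct bit, times are integers in picoseconds, and all names and values are hypothetical.

```python
def majority(a, b, c):
    """2-of-3 majority of three bits."""
    return (a & b) | (a & c) | (b & c)

def sample(correct_bit, t, set_start, set_width):
    """Value seen at time t: the SET inverts the correct bit while it lasts."""
    hit = set_start <= t < set_start + set_width
    return correct_bit ^ 1 if hit else correct_bit

def temporal_filter(correct_bit, clk_edge, delta, set_start, set_width):
    """Sample at clk_edge, clk_edge + delta and clk_edge + 2*delta, then vote."""
    samples = [sample(correct_bit, clk_edge + k * delta, set_start, set_width)
               for k in range(3)]
    return majority(*samples)

# A 200 ps SET starting at 950 ps, with the first sample taken at 1000 ps.
print(temporal_filter(1, 1000, 300, 950, 200))  # delay longer than the pulse
print(temporal_filter(1, 1000, 100, 950, 200))  # delay shorter than the pulse
```

With a delay longer than the pulse only one sample is corrupted and the vote is correct; with a shorter delay two samples fall inside the pulse and the wrong value wins.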

Full time redundancy

The delay ΔT is directly proportional to the SET Transient Width (TW).

[Figure: combinational logic feeding three flip-flops (ffp0, ffp1, ffp2) clocked at clk, clk+ΔT and clk+2ΔT, followed by a triple or single MAJ voter. Timing diagram: a SET of width TW at the comb output, the three sampling instants, and the MAJ output, which is OK after the MAJ + comb delays.]

[Nicolaidis, VTS 1999] [Anghel et al., DATE 2000]

Full time redundancy

[Figure: combinational logic feeding three flip-flops (ffp0, ffp1, ffp2) clocked at clk, clk+2ΔT and clk+4ΔT, followed by a triple or single MAJ voter. The SET (X) is voted out (OK). Timing diagram labels: SET, comb, MAJ + comb delays, TW, clk period (T), 2·TW.]

Temporal Latching to Trigger SETs

The error cross-section decreases as ΔT increases.

[Benedetto et al., IEEE TNS 2004]

Triple Sample Memory Robust to Multiple Bit Upsets and SET

[Figure sequence over four slides: combinational logic feeding a transistor-level storage cell (nodes A, B, C, D) sampled with shifted clocks. The slides mark a SET from the combinational logic (X) and charge collected at multiple nodes, with OK marks on the unaffected paths and on the output.]

[MAVIS, IRPS 2002]

Full Triple Modular Redundancy (TMR) with self-recovery

[Figure sequence over three slides: three redundant domains, each with its own combinational logic, register (TR0, TR1, TR2) and clock (clk0, clk1, clk2). Three voters each receive TR0, TR1 and TR2 and produce TRV0, TRV1 and TRV2. Other signals are labelled D0 to D2 and E0 to E2. A fault (X) in one domain is voted out (OK on all three voter outputs). On the last slide the three domains drive output pads that are combined by a wired voter into one output pad.]

How much mitigation is enough?

Circuits are becoming more and more complex.

Hardware and time redundancy techniques can provide a certain level of protection against:

– Single Event Upsets (SEU)
– Single Event Transients (SET)
– Multiple bit or node upsets

Problem: in some cases multiple faults can overcome the mitigation techniques, provoking a system failure.

Multiple Faults in the Full TMR

[Figure: the full TMR scheme (registers TR0 to TR2, voters producing TRV0 to TRV2, clocks clk0 to clk2) with faults (X) in two places. Result: WRONG VALUE.]

How much mitigation is enough?

How is it possible to know that the mitigation technique is working properly for a certain Soft Error Rate (SER)?

It is necessary to have a mechanism that informs the system when the number of multiple faults has passed a certain level.

Built-in Self Test (BIST) mechanism:

– sensors working as watchdogs
– each time an ionization occurs, the system is informed

How about sensors working as watchdogs?

[Figure sequence over three slides: full TMR with sensors. The full TMR scheme (registers TR0 to TR2, voters producing TRV0 to TRV2, clocks clk0 to clk2, one combinational logic block per domain) with sensors attached to the redundant blocks.]

If sensors detect:

• One upset at a time: the technique is working!
• Two or more upsets in distinct redundant modules at a time: the technique is not working!
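The decision rule for the sensors can be written as a small classifier. This sketch assumes the sensors report, for one observation window, the index of each redundant module in which they fired; the function name, the status strings and the notion of a window are illustrative choices, not part of the slides.

```python
def mitigation_status(fired_modules):
    """Classify one observation window from the modules whose sensors fired.

    fired_modules is an iterable of module indices (0, 1 or 2); repeated
    entries mean several sensors fired inside the same module.
    """
    distinct = set(fired_modules)
    if not distinct:
        return "no upset"
    if len(distinct) == 1:
        return "working"        # one module affected: TMR can vote it out
    return "not working"        # distinct modules affected: the vote may fail

print(mitigation_status([]))
print(mitigation_status([1, 1]))
print(mitigation_status([0, 2]))
```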

Bulk Built-in Current Sensors

During normal operation, the current in the bulk is approximately zero. When an energetic particle generates an ionization, it creates a current that flows between the struck node and Vdd or gnd. The bulk-BICS senses the current generated by ionization at the bulk terminal.

[Figure: ionization in a circuit monitored by a bulk-BICS.]

[Henes Neto et al. IEEE MICRO, 2006]

Bulk Built-in Current Sensors

Circuit Design

[Figure: schematics of BICS-N and BICS-P (transistors n1 to n6 and p1 to p6, reset inputs RST and nRST, supply nodes Vdd, Vdd' and Gnd'). Ionization in the monitored transistors flips the BICS latch.]

Trade-offs

There is always some penalty to be paid when protecting circuits against upsets.

Each technique may present a combination of:

– area overhead,
– performance penalty,
– power dissipation increase.

The challenge is to select the most cost-effective techniques for the target circuit application.

CASE-STUDY: Adder

Detection of SET and SEU in the ADDER:

– Duplication with Comparison (DWC): two ADDERs whose outputs are compared (=)
– Bulk-BICS: one ADDER monitored by bulk-BICS
– Recomputing with Shifted Operands: one ADDER computes S = A + B and then, with the operands shifted left (<<), 2.S = 2.A + 2.B. The second result is shifted right (>>) and compared (=) with the first.
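Recomputing with shifted operands can be illustrated with a behavioural adder model. The sketch below assumes a hypothetical fault (one adder output bit stuck at 0) and an adder wide enough to hold the shifted sum; the names and the operand values are illustrative.

```python
def faulty_add(a, b, stuck_bit=None, width=8):
    """Adder model whose output bit `stuck_bit` is stuck at 0 (None means fault-free)."""
    mask = (1 << (width + 1)) - 1      # one extra bit so shifted operands fit
    s = (a + b) & mask
    if stuck_bit is not None:
        s &= ~(1 << stuck_bit)
    return s

def reso_check(a, b, stuck_bit=None):
    """Compute S = A + B, recompute as (2A + 2B) / 2, and compare the two results."""
    first = faulty_add(a, b, stuck_bit)
    second = faulty_add(a << 1, b << 1, stuck_bit) >> 1
    return first, first == second

print(reso_check(5, 6))               # fault-free adder: results agree
print(reso_check(5, 6, stuck_bit=1))  # faulty adder: results disagree
```

Because the shift moves every result bit to a neighbouring adder position, the same faulty position corrupts different result bits in the two runs, which is what exposes the error in the comparison. A fault stays silent when the affected bits already match the stuck value in both runs.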

CASE-STUDY: Adder

SEU correction in the ADDER:

– Hardened flip-flops
– Error-Correction Code (Hamming): encoder (enc) and decoder (dec) blocks around the stored values
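The enc/dec blocks can be illustrated with the smallest Hamming code. The sketch below uses Hamming(7,4) purely as an example; the slides do not say which Hamming code or word width the case study uses, and the function names are hypothetical.

```python
def hamming74_encode(data):
    """Encode 4 data bits [d1, d2, d3, d4] as the codeword [p1, p2, d1, p3, d2, d3, d4]."""
    d1, d2, d3, d4 = data
    p1 = d1 ^ d2 ^ d4          # covers codeword positions 1, 3, 5, 7
    p2 = d1 ^ d3 ^ d4          # covers codeword positions 2, 3, 6, 7
    p3 = d2 ^ d3 ^ d4          # covers codeword positions 4, 5, 6, 7
    return [p1, p2, d1, p3, d2, d3, d4]

def hamming74_decode(codeword):
    """Correct up to one flipped bit; return (data bits, 1-based error position or 0)."""
    c = list(codeword)
    s1 = c[0] ^ c[2] ^ c[4] ^ c[6]
    s2 = c[1] ^ c[2] ^ c[5] ^ c[6]
    s3 = c[3] ^ c[4] ^ c[5] ^ c[6]
    syndrome = s1 | (s2 << 1) | (s3 << 2)
    if syndrome:
        c[syndrome - 1] ^= 1   # the syndrome is the position of the flipped bit
    return [c[2], c[4], c[5], c[6]], syndrome

def flip(codeword, index):
    """Return a copy of the codeword with one bit upset (0-based index)."""
    upset = list(codeword)
    upset[index] ^= 1
    return upset

word = hamming74_encode([1, 0, 1, 1])
print(hamming74_decode(flip(word, 4)))  # the data bits come back corrected
```

A single-error-correcting code like this one repairs one upset per word; two upsets in the same word are beyond it.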

CASE-STUDY: Adder

SEU and SET correction:

– TMR with single voter: three ADDERs and one voter
– TMR with triple voter: three ADDERs and three voters

CASE-STUDY: Adder

SEU and SET correction:

– Time redundancy with TMR in the registers: a single ADDER whose output is taken directly and through delays of ΔT and 2ΔT, with voters on the registers

AREA vs. PERFORMANCE

[Figure: bar chart of area and performance (axis from 0 to 3000) for each adder version: not protected, DWC, bulk-BICS, recomputation with shifted operands, hardened memory, ECC Hamming, TMR single voter, TMR triple voter, and time redundancy + TMR registers. The versions are grouped as SEU and SET detection, SEU correction, and SEU and SET correction. Annotations on the chart: "less than 50%", "more than 200%", "less than 50%".]

How about Qualifying for SEE?

– Testing by fault injection: model the SEU and SET effects at
  – SPICE level
  – logic level or RTL level
– Testing in a laser facility
– Testing at ground-level facilities (in front of a beam of protons, heavy ions or neutrons)
– Testing in space (actual environment)

[Figure: arrow alongside the list labelled accuracy and cost.]
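Logic-level fault injection can be sketched as flipping one stored bit at a time and checking whether the observed output changes. The two "designs" below (an unprotected bit and a triplicated bit with a majority vote) and every name in the code are illustrative assumptions, not the setup used for the case study.

```python
def majority(a, b, c):
    """2-of-3 majority of three bits."""
    return (a & b) | (a & c) | (b & c)

def inject_seu(bits, index):
    """Return a copy of the stored bits with one bit flipped."""
    flipped = list(bits)
    flipped[index] ^= 1
    return flipped

def count_failures(read_output, golden_bits):
    """Flip every stored bit in turn and count the flips that change the output."""
    golden = read_output(golden_bits)
    return sum(read_output(inject_seu(golden_bits, i)) != golden
               for i in range(len(golden_bits)))

def unprotected(bits):
    return bits[0]

def tmr(bits):
    return majority(bits[0], bits[1], bits[2])

print(count_failures(unprotected, [1]))   # every injected SEU reaches the output
print(count_failures(tmr, [1, 1, 1]))     # single SEUs are masked by the vote
```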

When testing in a ground-level facility for SEE:

Static testing:

– No application is running during the test.
– The register files are read during or after the test to check for SEU and/or SET and compared to a gold file.
– Tests memories, microprocessors and ASICs in general.

Dynamic testing:

– Applications are running during the test.
– Outputs are analyzed and compared to a gold design.
– SEU and SET can be checked during the test.
– Tests memories, microprocessors, ASICs in general, analog circuits, etc.
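The comparison against a gold file in static testing amounts to counting differing bits. The sketch below assumes the readback and the gold data are lists of integer words; the cross-section helper uses the usual definition (events divided by particle fluence), which is not spelled out on the slide, and all names are hypothetical.

```python
def count_upsets(readback, golden):
    """Count differing bits between the words read from the device and the gold file."""
    if len(readback) != len(golden):
        raise ValueError("readback and gold data must have the same length")
    return sum(bin(r ^ g).count("1") for r, g in zip(readback, golden))

def cross_section(upsets, fluence_per_cm2):
    """Upsets per unit fluence, in cm^2."""
    return upsets / fluence_per_cm2

golden = [0b1000, 0xFF, 0x00]
readback = [0b1010, 0xFF, 0x01]        # two words each show one flipped bit
upsets = count_upsets(readback, golden)
print(upsets, cross_section(upsets, 1e6))
```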

General System

[Figure: a general system composed of memory, processors, analog logic, FPGA and ASIC.]

Outline

Radiation Effects on Digital ICs

Radiation Hardening by Design: Strategies for ASICs

Radiation Effects on FPGAs

Radiation Hardening by Design: Strategies for FPGAs

Final Remarks

Field-Programmable Gate Arrays

– An array of logic blocks and interconnections customizable by programmable switches
– High logic density
– Customizable by the end user to realize different designs

[Figure: FPGA array showing configurable logic blocks (CLBs), interconnections and switches for customization.]

Programmable Technologies

Programmable switches can be based on:

Antifuse (antifuse-based FPGAs):
– an electrically programmable switch forms a low-resistance path between two metal layers
– one-time configurable

SRAM (SRAM-based FPGAs):
– the state of a static latch controls pass transistors or multiplexers connected to pre-defined metal layers
– reconfigurable

Flash (flash-based FPGAs):
– a floating gate controls the switches
– reconfigurable

Antifuse-based FPGAs

Non-volatile: they hold the customizable content even when not connected to the power supply. They can be programmed just once.

FPGA products for space:

– ACTEL
– AEROFLEX (based on Quicklogic)

Page 72: SEE Mitigation Strategies for Digital Circuit Design Applicable to ASIC and FPGAs Prof. Fernanda Lima Kastensmidt, Ph.D. Instituto de Informatica Universidade

Prof. Fernanda Lima Kastensmidt

ACTEL: RTAX-S device

Figure: RTAX-S device floorplan, a tiled array of blocks labeled SC, RAM, RAMC, RD, HD and CT, with an inset showing one Super Cluster built from C, R, B, RX and TX cells.

[Actel, RTAX-S RadTolerant FPGAs 2007]

Page 73

ACTEL: RTAX-S device

Figure: schematics of the C-cell (combinational) and the R-cell (register).

[Actel, RTAX-S RadTolerant FPGAs 2007]

R-cell: robust to SEU

C-cell: susceptible to SET (a transient in the C-cell can propagate and produce an error)

Page 74

Effects of Frequency Response

Circuit: Shift Register with 8 levels of C-cell between R-cells

Error cross-section increases when frequency increases.

Figure: number of errors (# ERROR) relative to the clock edge.

[Berg, M. et al., IEEE TNS 2006]
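The frequency dependence can be illustrated with a first-order model. The model is an assumption of this sketch and is not taken from the cited measurements: a transient is latched only if it overlaps the capture window of the flip-flop around the clock edge, so the capture probability per transient grows roughly linearly with the clock frequency. The function name and all numeric values are hypothetical.

```python
def set_capture_probability(pulse_width_ns, window_ns, freq_mhz):
    """First-order probability that one transient is latched by a flip-flop.

    Assumed model: the transient arrives at a uniformly random time within
    the clock period and is captured if it overlaps the setup+hold window.
    """
    period_ns = 1000.0 / freq_mhz
    return min(1.0, (pulse_width_ns + window_ns) / period_ns)


# Hypothetical values: 0.5 ns transient, 0.1 ns setup+hold window.
for f in (15, 37.5, 75, 150):
    print(f"{f:6.1f} MHz -> {set_capture_probability(0.5, 0.1, f):.4f}")
```

Under this assumed model, doubling the clock frequency doubles the chance that a given transient is captured, until the probability saturates at 1.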

Page 75

RadHard Eclipse FPGA from Aeroflex

Figure: logic cell with hardened flip-flops and ViaLink connections; an X marks a strike in the logic that leads to an ERROR.

Hardened flip-flops: robust to SEU

Page 76

Antifuse FPGAs: summary

Customized routing is not sensitive to SEU.

Flip-flops are not sensitive to SEU
– Actel and Aeroflex each provide a solution in which all flip-flops are hardened.

Logic is susceptible to DSETs
– The user may protect the logic by using high-level mitigation techniques in the VHDL/Verilog description of the design (TMR, duplication and others).

Page 77

SRAM-based FPGAs

Volatile: they lose their configuration when the memories are not connected to the power supply.

They can be reprogrammed as many times as necessary at the work site.

They are programmed by loading a bitstream.

FPGA products for space:
– XILINX
– ATMEL
– HONEYWELL

Page 78

SRAM-based FPGAs

A basic board must be composed of:
– FPGA
– Oscillator
– I/O interface
– Power supply (core and I/O)
– EEPROM
– FPGA loader and memory
– Programming interface

The original design bitstream must be stored in a memory outside the FPGA.

Memory size needed: the bitstream may range from kilobytes to several megabytes.

Page 79

Reconfigurability

Can offer benefits for space and remote applications by:

– saving space in the system: the same circuitry can be used with different configurations at different stages of a mission, reducing weight and power requirements.

– allowing in-orbit design changes, which reduces the mission cost by correcting errors.

If part of an FPGA fails, the circuitry can be reprogrammed to make use of the remaining functional portions of the chip.

Page 80

FPGA Design Flow

Figure: design flow, in order:
– Hardware Description Language
– Synthesis optimizations
– Logic mapping
– Placement
– Routing
– Configuration bitstream (… 101001110100000111 …)

Page 81

Technology Scaling in Xilinx FPGAs

Nanometer technologies

Embedded Hard microprocessor

Embedded memories (BRAM)

Page 82

SRAM-based FPGA Architecture

Figure: Xilinx FPGA architecture, showing configurable logic blocks (CLBs) made of slices, the general routing matrix (GRM), and BRAM. Each lookup table (LUT) stores the truth table of a Boolean function F(A,B,C,D) of its inputs A, B, C and D.

Page 83

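SEU in SRAM-based FPGAs: CLB slice

Figure: CLB slice with LUTs (inputs I1 to I4), routing, a flip-flop, and the configuration memory bits that define them.

Upset in a configuration memory bit (LUT or routing): persistent effect (corrected by scrubbing).

Upset in the flip-flop: transient effect (corrected at the next flip-flop load).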
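A minimal sketch of why an upset in a LUT configuration bit is persistent: the LUT output is simply the configuration bit selected by the inputs, so a flipped bit changes the implemented function until the original bits are written back. The helper names, the bit ordering, and the example function (a 4-input AND) are assumptions made for illustration.

```python
def lut_eval(config_bits, inputs):
    """Evaluate a 4-input LUT: the inputs select one configuration bit."""
    index = 0
    for bit in inputs:          # inputs = (I1, I2, I3, I4), I1 taken as MSB here
        index = (index << 1) | bit
    return config_bits[index]


def all_inputs():
    """All 16 input combinations as (I1, I2, I3, I4) tuples."""
    return [tuple((n >> s) & 1 for s in (3, 2, 1, 0)) for n in range(16)]


# Hypothetical function: 4-input AND, only the last truth-table entry is 1.
golden = [0] * 15 + [1]

upset = list(golden)
upset[5] ^= 1                    # SEU flips one configuration memory bit

# Input combinations for which the upset LUT now computes the wrong value.
wrong = [i for i in all_inputs() if lut_eval(golden, i) != lut_eval(upset, i)]
print(wrong)

scrubbed = list(golden)          # scrubbing rewrites the original bits
assert all(lut_eval(scrubbed, i) == lut_eval(golden, i) for i in all_inputs())
```

The error stays for every clock cycle in which the affected input combination occurs, which is what distinguishes it from an upset in a user flip-flop.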

Page 84

SET in SRAM-based FPGAs: CLB slice

Figure: CLB slice with LUTs (inputs I1 to I4), routing and configuration memory bits, with a transient (X) in the combinational path.

A SET may be captured by the flip-flop.

Page 85

General Routing Matrix (GRM)

Figure: routing resources between CLBs: direct lines, double lines, hex lines, long lines, and fast connect.

Page 86

SEU in SRAM-based FPGAs: Routing

Figure: direct connections and hex connections, where a bit-flip in a routing configuration bit produces an open or a short.

Page 87

Other sensitive structures

Digital Clock Manager (DCM)

Input and Output Blocks (IOB)

PowerPC hard IP

Multi-Gigabit Transceivers (MGT)

Single-Event Functional Interrupts (SEFI):

Power-on Reset (POR)
• Low probability of occurrence
• Signature: DONE pin transitions low, I/O becomes tri-stated, no user functionality available
• Solution: reconfigure device

SelectMAP and JTAG controllers
• Low probability of occurrence
• Signature: loss of communication; read access to configuration memory returns a constant value
• Solution: reconfigure device

Page 88

SEE Characterization – Heavy Ion: Static Testing in Virtex-4

BRAMs present a higher error cross-section than CLBs.

The error cross-section of the POR in Virtex-4 has improved compared to Virtex-II.

[George, et al. IEEE Radiation Effects Data Workshop, 2006]

Page 89

Scrubbing

Figure: design flow from the Hardware Description Language (with TMR applied by hand) through the ISE tool steps (synthesis optimizations, logic mapping, placement, routing) to the configuration bitstream (… 101001110100000111 …), followed by scrubbing (full or partial reconfiguration) and fault injection (fault tolerance verification) on the output.

Page 90

Scrubbing: continuous configuration

Figure: an SRAM-based FPGA whose configuration bits are continuously rewritten by a SCRUB controller from the original bitstream stored in XQR18V04 PROMs.

• No application interruption

It does not correct upsets in:
- Embedded memory (BRAM)
- CLB flip-flops

Page 91

Configuration Scrubbing Example: to correct persistent effect faults

Figure: the scrub column sweeping across the device, with a configuration upset (x) ahead of it.

Page 92

Configuration Scrubbing Example: to correct persistent effect faults

Figure: the scrub column has passed the affected location and the configuration upset is repaired.

The scrubbing rate is important to reduce the probability of multiple upsets.

Scrubbing can be performed:

– from outside the FPGA, by another FPGA acting as controller

– from inside the FPGA, through the Hardware Internal Configuration Access Port (HWICAP)
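The scrub pass can be sketched as a loop that rewrites every configuration frame from the stored original bitstream. This is a minimal software model under stated assumptions: the frame count, the frame width, and the function name are hypothetical, and a real controller would write through a configuration interface rather than a Python list.

```python
import random


def scrub(device_frames, golden_frames):
    """Rewrite every configuration frame from the golden copy.

    Returns the number of frames whose contents were actually changed.
    """
    repaired = 0
    for idx, golden in enumerate(golden_frames):
        if device_frames[idx] != golden:
            repaired += 1
        device_frames[idx] = golden      # written regardless of the comparison
    return repaired


# Hypothetical sizes: 64 frames of 32 bits each.
rng = random.Random(0)
golden_frames = [rng.getrandbits(32) for _ in range(64)]
device_frames = list(golden_frames)

# Two upsets accumulate between scrub passes.
device_frames[10] ^= 1 << 3
device_frames[41] ^= 1 << 17

repaired = scrub(device_frames, golden_frames)
print(repaired)
```

The shorter the interval between passes, the less likely it is that two upsets are present at the same time, which is why the scrubbing rate matters.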

Page 93

Mitigation Techniques

Figure: design flow from the Hardware Description Language (with TMR applied by hand) through the ISE tool steps (synthesis optimizations, logic mapping, placement, routing) to the configuration bitstream (… 101001110100000111 …), followed by scrubbing (full or partial reconfiguration) and fault injection (fault tolerance verification) on the output.

Page 94

X-TMR

Full TMR in:
– Combinational logic
– Sequential logic
– Input/output pads

Figure: FPGA with INPUT package pins, three copies of the redundant logic (tr0, tr1, tr2) between TMR flip-flops, and a TMR output voter driving the OUTPUT package pins.

Why do we need full TMR?

To guarantee the correct output in the presence of the persistent effect errors that are corrected only by loading the correct bitstream.

Page 95

TMR flip-flop

Figure: TMR flip-flop built from three flip-flops clocked by clk0, clk1 and clk2 and three majority voters (MAJ) with outputs tr0, tr1 and tr2, shown inside the TMR design (redundant logic tr0, tr1, tr2 and TMR output voter).

Majority voter truth table (LUT: 00010111_00010111):

R0 R1 R2 | MAJ
0  0  0  | 0
0  0  1  | 0
0  1  0  | 0
0  1  1  | 1
1  0  0  | 0
1  0  1  | 1
1  1  0  | 1
1  1  1  | 1

The recovery path is mandatory to correct the state of the flip-flops, especially in FSMs.
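The role of the recovery path can be sketched with the majority truth table 00010111 from this slide. The one-bit toggle state machine and the function names are assumptions made for illustration; the point is only that, with the voted value fed back, an upset flip-flop is overwritten on the next clock, while without feedback the wrong state stays in that domain.

```python
MAJ_LUT = [0, 0, 0, 1, 0, 1, 1, 1]    # truth table 00010111, index = R0 R1 R2


def maj(r0, r1, r2):
    """Majority of three bits, read from the LUT contents."""
    return MAJ_LUT[(r0 << 2) | (r1 << 1) | r2]


def next_state(state):
    """Hypothetical FSM: a 1-bit toggle, identical in the three domains."""
    return state ^ 1


def clock(regs, recovery_path=True):
    """Advance the three redundant flip-flops by one clock cycle."""
    if recovery_path:
        voted = maj(*regs)                       # voted value feeds every domain
        return [next_state(voted)] * 3
    return [next_state(r) for r in regs]         # each domain keeps its own state


regs = [0, 0, 0]
regs[1] ^= 1                                     # SEU in the flip-flop of domain tr1

with_recovery = clock(list(regs), recovery_path=True)
without_recovery = clock(list(regs), recovery_path=False)
print(with_recovery, without_recovery)
```

Without the feedback, the three domains of a state machine never resynchronize, so a second upset in another domain would defeat the voter.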

Page 96

Figure: TMR output voter. Each of the three output paths (R0, R1, R2) drives the OUTPUT package pin through a 3-state buffer (3-state_0, 3-state_1, 3-state_2) controlled by an output voter (O_voter) that compares its reference input (REF) with the other two.

Output voter truth table (LUT: 00011000_00011000):

R0 R1 R2 | O_voter
0  0  0  | 0
0  0  1  | 0
0  1  0  | 0
0  1  1  | 1
1  0  0  | 1
1  0  1  | 0
1  1  0  | 0
1  1  1  | 0

3-state control:

0: it allows the data to pass to the output pad.

1: it blocks the data.
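The output voter table 00011000 can be read as a minority detector: the control is 1 only when the reference input disagrees with two other inputs that agree with each other. The sketch below assumes that the first truth-table column is the reference domain and that the three pads are tied together outside the device; the function names are hypothetical.

```python
OUT_LUT = [0, 0, 0, 1, 1, 0, 0, 0]    # truth table 00011000, index = REF, other, other


def tristate_control(ref, other_a, other_b):
    """Return 1 to block the pad driver, 0 to let the data pass."""
    return OUT_LUT[(ref << 2) | (other_a << 1) | other_b]


def pad_drivers(r0, r1, r2):
    """Values actually driven onto the shared output, one per enabled domain."""
    domains = [(r0, r1, r2), (r1, r0, r2), (r2, r0, r1)]
    return [ref for ref, a, b in domains if tristate_control(ref, a, b) == 0]


print(pad_drivers(1, 1, 1))   # all three domains agree and drive the pad
print(pad_drivers(1, 0, 1))   # the domain carrying the wrong value is tri-stated
```

Only the domain that finds itself in the minority stops driving, so the remaining two keep the output at the correct value.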

Page 97

Evaluating TMR I/O pads

Inputs at 66 MHz

[Swift et al., IEEE TNS 2004]

Page 98

Evaluating TMR I/O pads

Heavy Ion

[Swift et al., IEEE TNS 2004]

Page 99

Evaluating Multiple Bit Upsets

Heavy ion radiation static test:

Figure: two panels, Virtex family (220 nm CMOS) and Virtex-II family (130 nm CMOS).

[Quinn, et al., IEEE TNS, 2005]

Page 100

Domain Crossing Events

Bit-flips in the routing can generate short-circuit connections among different blocks of the TMR (tr0, tr1 and tr2).

Figure: TMR design inside the FPGA, from the INPUT package pin through redundant logic (tr0, tr1, tr2), a TMR register with voters and refresh, and a TMR output majority voter to the OUTPUT package pin. Bit-flip a is marked (X) in tr0 and the voted outputs are OK.

Bit-flip a: affects only the redundant logic tr0; consequently, the majority voter chooses the correct result (two out of three outputs).

Page 101

Domain Crossing Events

Bit-flips in the routing can generate short-circuit connections among different blocks of the TMR (tr0, tr1 and tr2).

Figure: TMR design inside the FPGA, from the INPUT package pin through redundant logic (tr0, tr1, tr2), a TMR register with voters and refresh, and a TMR output majority voter to the OUTPUT package pin. Bit-flip b is marked across two domains and the output is wrong (X).

Bit-flip b: affects two redundant logic parts; consequently, the majority voter will not choose the correct result, because two out of the three outputs are affected.

Page 102

Solution to Reduce Domain Crossing Events

Voter insertion: a barrier of voters can reduce the probability of a bit-flip in the routing causing a short-circuit connection between two or more redundant blocks.

Figure: the redundant logic (tr0, tr1, tr2) is split into logic partitions separated by TMR majority voters, followed by the TMR register with voters and refresh and the TMR output majority voter. The marks show the effect of bit-flip b (X) being masked (OK) downstream.

[Kastensmidt, et al., DATE 2005]

Page 103

TMR BRAM (Embedded memory)

Upsets in BRAMs are not corrected by scrubbing.

TMR with refreshing must be used to mitigate upsets.

Need to use dual-port BRAMs.

Mechanism to refresh the memory contents:
– Counter
– Voters
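The refresh mechanism can be sketched as a counter that sweeps the address space while a voter writes the majority word back into all three copies. This is a software model under stated assumptions: the memory depth, the word width, and the function names are hypothetical, and in hardware the refresh would presumably use the second port of each dual-port BRAM so that the user port stays free.

```python
def vote(a, b, c):
    """Bitwise majority of three words."""
    return (a & b) | (a & c) | (b & c)


def refresh_step(mems, addr):
    """Read one address from the three copies and write the voted word back."""
    voted = vote(mems[0][addr], mems[1][addr], mems[2][addr])
    for mem in mems:
        mem[addr] = voted
    return voted


# Hypothetical memory: 16 words of 8 bits, triplicated.
contents = [(7 * i + 3) & 0xFF for i in range(16)]
mems = [list(contents) for _ in range(3)]

mems[2][9] ^= 0b00010000        # upset in one copy
mems[0][4] ^= 0b00000001        # later upset in another copy, different address

counter = 0
for _ in range(len(contents)):  # the counter sweeps the whole address space
    refresh_step(mems, counter)
    counter = (counter + 1) % len(contents)
```

After one full sweep all three copies hold the original contents again, so upsets do not accumulate at the same address across copies as long as the sweep is fast enough.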

Page 104

Verifying the Mitigated Design

Figure: design flow from the Hardware Description Language (with TMR applied by hand) through the ISE tool steps (synthesis optimizations, logic mapping, placement, routing) to the configuration bitstream (… 101001110100000111 …), followed by scrubbing (full or partial reconfiguration) and fault injection (fault tolerance verification) with output checking.

Page 105

Flash-based: Actel ProASIC3

Page 106

Flash-based FPGA: CLB tile

Page 107

Summary

Antifuse FPGAs:
- Fault tolerance techniques applied in VHDL/Verilog
- Protect against SET (SEU is protected by the vendor)

SRAM FPGAs:
- Fault tolerance techniques applied in VHDL/Verilog
- Scrubbing to clean persistent faults
- Protect against SET and SEU
- New vendor-protected FPGAs are coming out

Flash FPGAs:
- Fault tolerance techniques applied in VHDL/Verilog
- Protect against SEU and SET
- Flash transistor sensitivity to SEE is low, but still under investigation

Page 108

Outline

Radiation Effects on Digital ICs

Radiation Hardening by Design: Strategies for ASICs

Radiation Effects on FPGAs

Radiation Hardening by Design: Strategies for FPGAs

Final Remarks

Page 109

Final Remarks

Mitigation techniques for ASICs and FPGAs must take into account SEUs and SETs considering single and multiple effects.

ASICs: integrated systems fabricated in nanometer technologies should have mitigation techniques at different levels to ensure robustness:
– Charge dissipation (transistor resizing, capacitors, resistors)
– Sensors (bulk-BICS)
– Hardware and time redundancy
– Error-correction codes (ECCs)
– Self-checking and recomputation

Page 110

Final Remarks

FPGAs: new FPGA generations bring more flexibility and design capabilities, but also more challenges for reliable design.

The design can always be protected by high-level techniques (VHDL, Verilog) such as TMR.

In order to reduce the cost of TMR, solutions must be provided at the FPGA architectural level in:
– CLB logic: combinational blocks, sequential blocks, programmable switches
– Routing programmable switches

… to mitigate SEU and SET!

Page 111

Conferences

NSREC – IEEE Nuclear and Space Radiation Effects Conference, www.nsrec.com

RADECS – European Conference on Radiation Effects on Components and Systems, www.radecs.org

2011 – RADECS in Sevilla, Spain

Page 112

Schools

SERESSA

First: 2006 - Manaus - Brazil

Second: 2007 - Sevilla - Spain

Third: 2008 - Buenos Aires - Argentina

Fourth: 2009 - Florida, USA

2010 - France

2011 - Brazil

Page 113

Takasaki, Japan, December 2-4, 2009

TECHNICAL PROGRAM

Registration Daniel Loveless (Vanderbilt Univ.) TBD (ONERA)Basics Radiation testing

Welcome Michel Pignol (CNES)Robert Ecoffet (CNES) System hardening& Pascal Fouillat (IMS) Dale McMorrow (NRL)Environments & Anomalies & Vincent Pouget (IMS)

Laser testing

TBD (JAXA) Massimo Violante (Polito)TBD Software hardening TBD (JAEA)

TBD

Sarah Armstrong (NSWC) Fernanda Lima-Kastensmidt (UFRGS) Tour of JAEA Radiation testing facilitiesBasics SEU & SET in FPGA Ron Schrimpf (Vanderbilt Univ & ISDE)Single event effects

Raoul Velazco (TIMA) Guy Berger (UCL)Experiments & Rate prediction & Paul Peronnard (TIMA)

Remote Heavy Ion testing

Hugh Barnaby (ASU)Total dose effects Vincent Pouget (IMS)

TBD (LIMMS) Remote laser testingMEMS in space applications

Philippe Adell (JPL)Rad effects ConclusionsPower systems Bob Walters (NRL)

Radiation effects in solar cells


Page 114

SEE Mitigation Strategies for Digital Circuit Design Applicable to ASIC and FPGAs

Fernanda Lima Kastensmidt, Ph.D.

[email protected]