
CMOS INTEGRATED CIRCUIT DESIGN …tsiatouhas/CCD/Section_8_7-2p.pdf · 2017. 11. 14. · CMOS INTEGRATED CIRCUIT DESIGN TECHNIQUES University of Ioannina


CMOS INTEGRATED CIRCUIT DESIGN TECHNIQUES

On-Line Testing

University of Ioannina
Dept. of Computer Science and Engineering
VLSI Systems and Computer Architecture Lab

Y. Tsiatouhas

Overview

1. Reliability issues: transient faults
2. Soft errors
3. Timing errors
4. Error tolerance design techniques


Reliability in Nanometer Technologies

The evolution (scaling) of CMOS technology results in:

• the reduction of the transistor size
• the increase of the operating frequency
• the reduction of the power supply voltage
• the increase of the number of transistors on a die

which in turn affect the circuit noise margins and reduce circuit reliability.

Soft Error Rate (SER) increases with technology scaling.

[Figure: SER vs VDD for a 1MB SRAM and 64K dynamic logic. Source: IEDM, 1999]

Transient Faults and On-Line Testing

• Permanent Faults: faults that permanently affect the circuit operation.
• Temporary Faults: faults that do not permanently affect the circuit operation. They are divided into:
    Transient Faults: due to random fault generation mechanisms such as power supply disturbance, electromagnetic interference, radiation, etc.
    Intermittent Faults: due to the degradation of the circuit characteristics.
• On-Line Testing: testing procedures are performed during the circuit operation.
    Concurrent Testing: testing is performed concurrently with the normal circuit operation.
    Periodic Testing: testing is performed periodically, when the circuit is in the idle mode of operation.
• Off-Line Testing: testing is performed when the circuit is not in use (e.g. manufacturing testing).


Requirements

We need design techniques and architectures that guarantee correct operation under all operating conditions: techniques that provide error resilience or error detection/correction capabilities. We need self-checking and concurrent testing mechanisms. We need self-healing architectures that react dynamically to technology-related and environmental variability, recover after an error occurs, and operate correctly even in the presence of failures.

Towards this direction, error tolerance techniques have been proposed:

• error correction codes, self-checking circuits and checkers
• error mitigation techniques aiming to reduce error rates
• error detection and correction methodologies

Radiation and Soft Errors (I)

[Figure: cosmic ray shower above the Earth. Primary cosmic rays disappear at ~25 km; low energy particles are deflected; secondary cosmic rays (neutrons and pions) reach sea level at ~1/cm²·s. A particle strike in Si generates a burst of electronic charge of about 1M electrons/μm of Si.]

Soft errors in VLSI circuits are a problem. They are caused by Single Event Transients (SETs) and Single Event Upsets (SEUs) generated by:

• α-particles emitted by package impurities
• cosmic ray particles (neutrons, protons and pions)
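As a rough worked example of the charge figure quoted on this slide (about 1M electrons per μm of silicon), the sketch below converts a particle track length into generated charge. The track lengths and the function name are illustrative assumptions, and the result is the generated charge, not necessarily the charge that a circuit node actually collects.

```python
# Elementary charge in coulombs (exact SI value).
ELEMENTARY_CHARGE_C = 1.602176634e-19


def generated_charge_fc(track_length_um, electrons_per_um=1.0e6):
    """Charge in femtocoulombs generated along a particle track in silicon.

    electrons_per_um defaults to the ~1M electrons/um figure from the slide;
    track_length_um is a hypothetical input chosen for illustration.
    """
    electrons = electrons_per_um * track_length_um
    return electrons * ELEMENTARY_CHARGE_C * 1e15


if __name__ == "__main__":
    for length_um in (0.1, 0.5, 1.0):
        print(f"{length_um} um track: {generated_charge_fc(length_um):.1f} fC")
```

Under these assumptions a 1 μm track corresponds to roughly 160 fC of generated charge.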


Radiation and Soft Errors (II)

[Figure: a Single Event Transient (SET) in the logic that drives a D flip-flop (D, Q, CLK), shown with logic values and waveforms for two cases. Case 1: transient pulse attenuation/masking. Case 2: the transient pulse reaches D, is captured at Q by CLK, and a soft error is generated.]
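The two outcomes on this slide (masking versus soft error generation) can be sketched with a simple timing check. This is a minimal model under stated assumptions: a transient narrower than `min_width` is treated as attenuated, and a transient that overlaps the setup/hold window around the capturing clock edge is treated as latched. The function name, the parameter names and the default values are hypothetical, and real latching behaviour depends on the flip-flop's actual characteristics.

```python
def set_outcome(pulse_start, pulse_width, clock_edge,
                setup=0.05, hold=0.05, min_width=0.02):
    """Classify a single event transient seen at a flip-flop D input.

    All times share one arbitrary unit (e.g. ns). Returns one of
    "attenuated", "masked" or "soft error".
    """
    # Very narrow pulses are assumed to die out in the logic (attenuation).
    if pulse_width < min_width:
        return "attenuated"

    window_start = clock_edge - setup
    window_end = clock_edge + hold
    pulse_end = pulse_start + pulse_width

    # A pulse that overlaps the capture window is assumed to be latched.
    if pulse_start < window_end and pulse_end > window_start:
        return "soft error"

    # Otherwise the pulse misses the clock edge (timing masking).
    return "masked"


if __name__ == "__main__":
    print(set_outcome(0.20, 0.10, clock_edge=1.0))
    print(set_outcome(0.90, 0.10, clock_edge=1.0))
    print(set_outcome(0.97, 0.01, clock_edge=1.0))
```

The overlap test is deliberately pessimistic: it flags any pulse that touches the capture window, whereas a real flip-flop needs the disturbed value to persist long enough to be stored.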

Timing Errors (I)

• Transient faults due to crosstalk, power supply disturbance or ground bounce, and environmental variations (e.g. temperature gradients), contribute to timing error generation. Device aging (BTI, HCI effects) is also an important source of timing errors.

• Timing verification is becoming a hard task. Moreover, the increased path delay deviations, due to process variations and the statistical behavior of nano-devices, as well as manufacturing defects that affect circuit speed, may result in timing errors that are not easily detectable (in terms of test cost) in high-frequency, high-device-count ICs. Considering also the huge number of paths in modern circuit designs along with the complexity of testing, it is easy to see that a significant number of defective ICs may pass the fabrication tests.

• Modern systems running at multiple frequency and voltage levels may suffer from timing errors due to environmental, process-related and data-dependent variabilities that affect circuit performance.


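Timing Errors (II)

[Figure: a logic path driving a D flip-flop (D, Q, CLK), shown with logic values and waveforms for two cases. Error free case: the pulse at D arrives in time and is captured at Q. Timing error generation: the pulse at D is delayed by d, misses the CLK edge, and a timing error appears at Q.]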
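The difference between the error free case and the timing error case can be sketched as a comparison between the data arrival time and the clock period. This is a minimal model, assuming a single path launched at time 0 and captured one clock period later; the function and parameter names are hypothetical, and `extra_delay` plays the role of the delay d on the slide.

```python
def captured_value(old_value, new_value, path_delay, extra_delay,
                   clock_period, setup=0.0):
    """Value stored by the flip-flop at the capturing clock edge.

    The new value is captured only if it arrives at D no later than
    (clock_period - setup); otherwise the flip-flop stores the stale value.
    """
    arrival_time = path_delay + extra_delay
    if arrival_time <= clock_period - setup:
        return new_value
    return old_value


def has_timing_error(old_value, new_value, path_delay, extra_delay,
                     clock_period, setup=0.0):
    """True when the stored value differs from the value the logic produced."""
    stored = captured_value(old_value, new_value, path_delay, extra_delay,
                            clock_period, setup)
    return stored != new_value


if __name__ == "__main__":
    # Error free case: the path meets timing.
    print(has_timing_error(0, 1, path_delay=0.8, extra_delay=0.0, clock_period=1.0))
    # Timing error case: the extra delay pushes the arrival past the clock edge.
    print(has_timing_error(0, 1, path_delay=0.8, extra_delay=0.3, clock_period=1.0))
```

Note that a late arrival only shows up as an error when the new value differs from the old one, which is one reason timing errors are data dependent.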

The Scan Design Topology of Intel

[Figure: a system flip-flop (latches PH1 and PH2, clocked by CLK) between logic stages Sj and Sj+1, paired with a separate scan flip-flop (latches LA and LB) with scan input SI, scan output SO, scan clocks SCA and SCB, and the CAPTURE and UPDATE control signals.]

• Separated System and Scan Flip-Flops
• Extra cost of one Flip-Flop


Intel's Error Resilience Approach

[Figure: the scan design topology reused for error detection. A main flip-flop (latches PH1 and PH2, clocked by CLK) between logic stages Sj and Sj+1 and a scan flip-flop (latches LA and LB; Scan_IN, Scan_OUT, SCA, SCB, CAPTURE, UPDATE), with added XOR, AND and OR gates and a local error signal Error_L.]

• Additional cost of: 1 Flip-Flop, 2 XOR, 1 AND and 1 OR per Flip-Flop!
• The error detection latency is high!

Source: IEEE Computer, 38 (2), 2005. Univ. of Stanford & Intel

The BISER Architecture

[Figure: the BISER flip-flop between logic stages Sj and Sj+1, clocked by CLK. Labeled parts: main flip-flop (output A), secondary flip-flop and shadow latch (output B), and a C-element with a weak keeper that drives Q. Inset: transistor-level Muller C-element with inputs A and B and output Q, between VDD and Gnd.]

Source: ICICDT, 2007. Univ. of Stanford & Intel
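The role of the C-element in this topology can be illustrated with a behavioural sketch of a standard Muller C-element: the output follows the inputs only when they agree and otherwise holds its previous value, which is the job of the keeper. This is a minimal, non-inverting model with hypothetical class and method names; the transistor-level circuit on the slide may have inverted polarity.

```python
class CElement:
    """Behavioural model of a two-input Muller C-element with a keeper."""

    def __init__(self, initial=0):
        self.q = initial  # state held by the keeper

    def update(self, a, b):
        """Apply inputs A and B and return the output Q."""
        if a == b:
            self.q = a  # inputs agree: output follows them
        # inputs disagree: the keeper holds the previous output
        return self.q


if __name__ == "__main__":
    c = CElement(initial=0)
    print(c.update(1, 1))  # both storage elements hold 1
    print(c.update(0, 1))  # one copy upset: output is held
    print(c.update(1, 1))  # copies agree again
```

With the two storage elements driving A and B, an upset in one of them makes the inputs disagree, so the held output hides the flipped copy from the following logic stage.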


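The GRAAL Architecture

[Figure: a latch-based pipeline clocked by two phases, Φ1 and Φ2. Logic stages Sj and Sj+1 are separated by main latches (main register); each main latch is accompanied by a shadow flip-flop, an XOR and a MUX (inputs 0 and 1), producing the per-stage error signals Error_j and Error_j+1.]

Source: ITC, 2007. TIMA & iRoc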

The Razor Topology

[Figure: the Razor flip-flop between logic stages Sj and Sj+1. Labeled parts: an input MUX (inputs 0 and 1), a main flip-flop clocked by CLK, a shadow latch clocked by the delayed clock D_CLK, and an XOR that produces the local error signal Error_L. An OR gate combines the local error signals into the register error signal Error_Rj, which is stored in an error flip-flop (output Error).]

• Additional cost of: 1 Latch, 1 XOR and 1 MUX per Flip-Flop!

Source: IEEE Computer, 37 (3), 2004. Univ. of Michigan & ARM


The Razor Operation (I)

Error Free Case

[Figure: the Razor flip-flop topology (main flip-flop on CLK, shadow latch on D_CLK, MUX, XOR, OR, error flip-flop) annotated with signal values for the error free case: Error_L, Error_Rj and Error all stay at 0.]

The Razor Operation (II)

Erroneous Case

[Figure: the Razor flip-flop topology (main flip-flop on CLK, shadow latch on D_CLK, MUX, XOR, OR, error flip-flop) annotated with signal value transitions for the erroneous case: the main flip-flop and the shadow latch capture different values, the error signals are raised, and the correct value is restored.]

• No error detection latency!
• One clock cycle cost for error correction!
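The two Razor cases can be summarised with a small behavioural sketch. It assumes, as in the published Razor scheme, that the shadow latch is clocked late enough by D_CLK to always hold the correct value, so a mismatch with the main flip-flop signals a timing error and the shadow value is used for correction. The function names and the tuple layout are hypothetical.

```python
def razor_flip_flop(d_at_clk, d_at_dclk):
    """One Razor flip-flop evaluated for one cycle.

    d_at_clk  : value at D when the main flip-flop samples (CLK edge)
    d_at_dclk : value at D when the shadow latch samples (delayed D_CLK)
    Returns (main, shadow, error_l, corrected).
    """
    main = d_at_clk
    shadow = d_at_dclk
    error_l = main ^ shadow                 # XOR comparator
    corrected = shadow if error_l else main # MUX restores the shadow value
    return main, shadow, error_l, corrected


def razor_register(samples):
    """Evaluate a register of Razor flip-flops.

    samples is a list of (d_at_clk, d_at_dclk) pairs, one per bit.
    Returns (corrected_bits, error_r, penalty_cycles).
    """
    results = [razor_flip_flop(a, b) for a, b in samples]
    error_r = int(any(r[2] for r in results))  # OR of the local error signals
    corrected_bits = [r[3] for r in results]
    penalty_cycles = 1 if error_r else 0       # one clock cycle for correction
    return corrected_bits, error_r, penalty_cycles


if __name__ == "__main__":
    print(razor_register([(1, 1), (0, 0)]))  # error free case
    print(razor_register([(1, 1), (0, 1)]))  # second bit arrived late
```

The sketch ignores the short-path (hold) constraint that the delayed clock imposes on the logic feeding the shadow latch.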


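The Time Dilation Topology

[Figure: the Time Dilation (TD) flip-flop and TD register between logic stages Sj and Sj+1. Labeled parts: a MUX (inputs 0 and 1), a main flip-flop with input D and output QM clocked by CLK, an XOR that produces the local error signal Error_L, OR gates that form the register error signal Error_Rj, an error flip-flop (output Error), the Capture signal, and the capture clock Cap_CLK derived from CLK through a Delay element.]

• Additional cost of: 1 XOR and 1 MUX per Flip-Flop.

Source: IEEE ICECS, 2006. Univ. of Ioannina & Athens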
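The per-flip-flop overheads quoted on the slides for the Intel error resilience approach, Razor and Time Dilation can be tabulated for a side-by-side comparison. The sketch below only counts added elements as stated on the slides; it does not model area, since a flip-flop or latch is larger than a gate, and the 32-bit register width is a hypothetical example.

```python
# Added elements per flip-flop, as quoted on the corresponding slides.
OVERHEAD_PER_FLIP_FLOP = {
    "Intel error resilience": {"flip-flop": 1, "XOR": 2, "AND": 1, "OR": 1},
    "Razor": {"latch": 1, "XOR": 1, "MUX": 1},
    "Time Dilation": {"XOR": 1, "MUX": 1},
}


def total_elements(technique):
    """Number of added elements per flip-flop for one technique."""
    return sum(OVERHEAD_PER_FLIP_FLOP[technique].values())


def register_overhead(technique, width):
    """Added elements for a register of the given width (in bits)."""
    return {element: count * width
            for element, count in OVERHEAD_PER_FLIP_FLOP[technique].items()}


if __name__ == "__main__":
    for name in OVERHEAD_PER_FLIP_FLOP:
        print(name, total_elements(name), register_overhead(name, 32))
```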

The Time Dilation Operation (I)

Error Free Case

[Figure: the Time Dilation flip-flop topology (MUX, main flip-flop with D and QM, XOR, OR gates, error flip-flop, Capture, Cap_CLK with Delay) annotated with signal values for the error free case: the error signals stay at 0.]


The Time Dilation Operation (II)

Erroneous Case

[Figure: the Time Dilation flip-flop topology (MUX, main flip-flop with D and QM, XOR, OR gates, error flip-flop, Capture, Cap_CLK with Delay) annotated with signal value transitions for the erroneous case, in which the error signals are raised around Cap_CLK.]

• No error detection latency!
• One clock cycle cost for error correction!

Timing Diagrams

[Figure: waveforms of CLK, Cap_CLK, D, Error, Capture and Q over cycles i to i+3. Annotations on D: valid data with correct arrival (Data i) and a timing fault, that is, valid data with delayed arrival (Data i+1). Annotations on the MUX latch: memory state and extended memory state. Annotations on Q: correct data (Data i), erroneous data due to the timing error, error correction, and correct data (Data i+1).]


The Time Dilation Architecture

Pipeline Organization

[Figure: a pipeline of logic stages separated by TD flip-flop registers (IF, ID, EX, MEM and one further register), all clocked by CLK, with register inputs D0 to D4 and outputs Q0 to Q4. The register error signals Error_R0 to Error_R4 are combined by OR gates into the Error signal, which is stored in an error flip-flop. The Capture signal and the capture clock Cap_CLK (CLK through a Delay element) are shared by all registers.]

Pipeline Recovery

[Figure: pipeline diagram of instructions versus time in cycles through the IF, ID, EX and MEM stages. The failing stage and the erroneous stage are marked; the affected stage is then re-executed with correct values at the stage inputs.]
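The one-cycle recovery cost can be illustrated with a small pipeline timeline generator. This is a simplified sketch, assuming that an error stalls the whole pipeline for one cycle while every stage repeats its work with correct inputs; the stage names come from the slide, while the function name, the `stall_cycles` parameter and the "--" marker for an idle slot are hypothetical.

```python
STAGES = ["IF", "ID", "EX", "MEM"]


def pipeline_timeline(num_instructions, stall_cycles=()):
    """Return one row per instruction listing its stage in every cycle.

    In a cycle listed in stall_cycles the pipeline does not advance:
    each instruction repeats the stage it occupied in the previous cycle.
    """
    stall_cycles = set(stall_cycles)
    rows = [[] for _ in range(num_instructions)]
    progress = 0  # number of cycles in which the pipeline advanced
    cycle = 0
    last_step = num_instructions + len(STAGES) - 1

    while progress < last_step:
        if cycle in stall_cycles:
            step = progress - 1  # re-execute the previous step
        else:
            step = progress
            progress += 1
        for index, row in enumerate(rows):
            stage = step - index
            row.append(STAGES[stage] if 0 <= stage < len(STAGES) else "--")
        cycle += 1
    return rows


if __name__ == "__main__":
    for row in pipeline_timeline(3, stall_cycles={3}):
        print(" ".join(f"{stage:>3}" for stage in row))
```

With `stall_cycles={3}`, the work done in cycle 2 is repeated in cycle 3, and the total run takes exactly one cycle more than the error free schedule.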


References

• "Making Typical Silicon Matter with Razor", T. Austin, D. Blaauw, T. Mudge and K. Flautner, IEEE Computer, vol. 37, no. 3, pp. 57-65, 2004.
• "Robust System Design with Built-In Soft-Error Resilience", S. Mitra, N. Seifert, M. Zhang, Q. Shi and K-S. Kim, IEEE Computer, vol. 38, no. 2, pp. 43-52, 2005.
• "Built-In Soft Error Resilience for Robust System Design", S. Mitra et al., International Conference on IC Design and Technology, pp. 263-268, 2007.
• "GRAAL: A New Fault Tolerant Design Paradigm for Mitigating the Flaws of Deep Nanometric Technologies", M. Nicolaidis, International Test Conference, p. 4.3, 2007.
• "Testing and Reliable Design of CMOS Circuits", N. Jha and S. Kundu, Kluwer Academic Publishers, 1990.