Fault Tolerance: Basic Mechanisms. mMIC-SFT, September 2003. Anders P. Ravn, Aalborg University. Fault tolerance: means to isolate component faults ... and mask them. Prevents system failures. May increase system dependability. Dependability - means: fault prevention, fault tolerance, ...
Fault Tolerance: Basic Mechanisms
mMIC-SFT September 2003
Anders P. Ravn
Aalborg University
Fault Tolerance
Means to isolate component faults
... and mask them
Prevents system failures
May increase system dependability
Dependability - means
• Fault prevention
• Fault tolerance
• Error removal
• Failure forecasting
BW p. 106, ...
Fault Tolerance
FT - levels
• Full tolerance
• Graceful Degradation
• Fail safe
BW p. 107
FT basis: Redundancy
• Time: Try, Retry, Retry, ...
• Space: Try, Try, Try, ... (in parallel)
BW p. 109
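Redundancy in time can be sketched as a retry loop. This is a minimal illustration, not taken from the slides: `retry` and `make_flaky` are hypothetical names, and the fault is assumed to be transient so that repeating the same operation can succeed.

```python
def retry(operation, attempts=3):
    """Redundancy in time: repeat the same operation after a failure."""
    last_error = None
    for _ in range(attempts):
        try:
            return operation()
        except Exception as exc:  # a transient fault may be gone on the next try
            last_error = exc
    raise RuntimeError(f"all {attempts} attempts failed") from last_error


def make_flaky(failures):
    """Hypothetical operation that fails `failures` times, then succeeds."""
    state = {"left": failures}

    def operation():
        if state["left"] > 0:
            state["left"] -= 1
            raise IOError("transient fault")
        return "ok"

    return operation


result = retry(make_flaky(2), attempts=3)  # two failed tries, third succeeds
```

A permanent fault defeats this scheme: every retry fails the same way, which is where redundancy in space (several replicas or versions) comes in.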
N-version programming
V1 V2 V3
Driver (comparator)
Comparison vectors (votes)
Comparison status indicators
Comparison points
BW p. 109
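The driver's role can be sketched as a majority voter. This is an illustration under assumptions: the three versions below are trivial stand-ins for independently developed programs, results are assumed to be hashable and exactly comparable, and all names (`n_version_driver`, `v1`, `v2`, `v3`) are hypothetical.

```python
from collections import Counter

_FAILED = object()  # marks a version that crashed and so casts no vote


def n_version_driver(versions, x):
    """Run every version on the same input and vote on the results.

    Returns the majority value and a status list saying, per version,
    whether its vote agreed with the majority.
    """
    votes = []  # comparison vector: one entry per version
    for version in versions:
        try:
            votes.append(version(x))
        except Exception:
            votes.append(_FAILED)
    valid = [v for v in votes if v is not _FAILED]
    if not valid:
        raise RuntimeError("every version failed")
    value, count = Counter(valid).most_common(1)[0]
    if count <= len(versions) // 2:
        raise RuntimeError("no majority among the versions")
    status = [v is not _FAILED and v == value for v in votes]
    return value, status


def v1(x):
    return x * x


def v2(x):
    return x ** 2


def v3(x):
    return x * x + 1  # deliberately faulty version


value, status = n_version_driver([v1, v2, v3], 4)
```

Exact comparison is a simplification: with floating-point results the driver would need a tolerance-based (inexact) vote.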
Fault classification (scope of N-VP)
• Origin
  • physical (internal/external)
  • logical (design/interaction)
• Kind
  • omission
  • value
  • timing
  • byzantine
• Property
  • duration (permanent, transient)
  • consistency (determinate, nondeterminate)
  • autonomy (spontaneous, event-dependent)
N-VP coverage marks from the slide (their placement against the entries above is not recoverable): ++, (+) ++ (+), + / (+), + / ++ / +
Dynamic Redundancy
1. Error detection
2. Damage confinement and assessment
3. Error recovery
4. Fault treatment and continued service
BW p. 114
Error Detection
f: State × Input → State × Output
• Environment (exception)
• Application
BW p. 115
Assertion:
• precondition (input)
• postcondition (input, output)
• invariant (state, state')
Timing:
• WCET (f, input)
• Deadline (f, input)
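The application-level checks can be sketched as a wrapper around a function of the form f: State × Input → State × Output. This is an illustration under assumptions: `checked`, `DetectedError`, and the `withdraw` example are hypothetical, and the timing check only measures elapsed time after the call returns. It does not preempt an overrunning call, and it is not a WCET analysis.

```python
import time


class DetectedError(Exception):
    """Raised when one of the checks detects an error."""


def checked(f, pre, post, invariant, deadline_s):
    """Wrap f(state, inp) -> (new_state, out) with assertion and timing checks."""

    def wrapper(state, inp):
        if not pre(inp):
            raise DetectedError("precondition violated")
        start = time.monotonic()
        new_state, out = f(state, inp)
        elapsed = time.monotonic() - start
        if elapsed > deadline_s:
            raise DetectedError("deadline missed")
        if not post(inp, out):
            raise DetectedError("postcondition violated")
        if not invariant(state, new_state):
            raise DetectedError("invariant violated")
        return new_state, out

    return wrapper


def withdraw(balance, amount):
    """Hypothetical operation: state is a balance, input is an amount."""
    return balance - amount, amount


safe_withdraw = checked(
    withdraw,
    pre=lambda amount: amount > 0,
    post=lambda amount, out: out == amount,
    invariant=lambda old, new: new >= 0,  # the balance never goes negative
    deadline_s=1.0,
)

new_balance, paid = safe_withdraw(100, 30)
```

Calling `safe_withdraw(100, 200)` raises `DetectedError` because the invariant fails. What happens next (confinement, recovery) is left to the caller.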
Damage Confinement
• Static structure
• Dynamic structure
BW p. 117
Error Recovery
• Forward
• Backward
BW p. 118
Forward: repair the state, if you can!
Backward:
• define recovery points
• checkpoint state at r. p.
• roll back
• retry
Domino effect
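Backward recovery within a single process can be sketched as follows. This is a minimal illustration: `Recoverable` and `run_with_rollback` are hypothetical names, and checkpointing by deep copy is an assumption that only suits small in-memory state. The domino effect arises with several communicating processes and is not modelled here.

```python
import copy


class Recoverable:
    """Holds a state plus a stack of checkpoints taken at recovery points."""

    def __init__(self, state):
        self.state = state
        self._checkpoints = []

    def recovery_point(self):
        """Checkpoint the current state."""
        self._checkpoints.append(copy.deepcopy(self.state))

    def roll_back(self):
        """Restore the state saved at the most recent recovery point."""
        self.state = self._checkpoints.pop()

    def commit(self):
        """Discard the most recent checkpoint after a successful step."""
        self._checkpoints.pop()


def run_with_rollback(obj, action, attempts=2):
    """Checkpoint, run the action, and roll back and retry on error."""
    for _ in range(attempts):
        obj.recovery_point()
        try:
            action(obj.state)
            obj.commit()
            return True
        except Exception:
            obj.roll_back()
    return False


def faulty(state):
    state["items"].append(99)  # damages the state ...
    raise ValueError("error detected")  # ... before the error is detected


obj = Recoverable({"items": [1, 2]})
succeeded = run_with_rollback(obj, faulty)
```

After the failed attempts, `obj.state` is back to its value at the recovery point, so the partial update by `faulty` is undone.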
Recovery blocks
ENSURE acceptance_test
BY { module_1 }
ELSE BY { module_2 }
...
ELSE BY { module_m }
ELSE ERROR
BW p. 120
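The ENSURE ... BY ... ELSE BY scheme can be sketched in Python. This is an illustration under assumptions: `recovery_block` and the sorting modules are hypothetical, and the recovery point is approximated by handing each alternative a fresh deep copy of the input.

```python
import copy


def recovery_block(acceptance_test, modules, data):
    """Try each module in turn and return the first result that passes."""
    for module in modules:
        candidate = copy.deepcopy(data)  # each alternative starts from the recovery point
        try:
            result = module(candidate)
        except Exception:
            continue  # a crash counts as a failed alternative
        if acceptance_test(data, result):
            return result
    raise RuntimeError("recovery block failed: no alternative passed the test")


def ordered_same_length(data, result):
    """Acceptance test: the result is in order and no element was lost."""
    in_order = all(a <= b for a, b in zip(result, result[1:]))
    return len(result) == len(data) and in_order


def buggy_sort(items):
    return sorted(items)[:-1]  # primary module with a fault: drops one element


def simple_sort(items):
    return sorted(items)  # alternative module


result = recovery_block(ordered_same_length, [buggy_sort, simple_sort], [3, 1, 2])
```

The acceptance test here is deliberately cheaper than a full correctness check (it does not verify that the result is a permutation of the input). Acceptance tests commonly make this trade-off.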
The ideal FT-component
[Figure: a component with a normal mode part and an exception handler part. Labels on its interfaces, each appearing twice: request/response, interface exception, failure exception.]
BW p. 126
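The structure of such a component can be sketched with two exception types. This is an illustration under assumptions: `Sensor` and `Averager` are hypothetical components, and masking a lower-level failure with the last good value is just one possible handler policy.

```python
class InterfaceException(Exception):
    """The request itself was illegal; the caller is at fault."""


class FailureException(Exception):
    """The component could not deliver its service despite its own recovery."""


class Sensor:
    """Hypothetical lower-level component."""

    def __init__(self, readings):
        self._readings = list(readings)

    def read(self):
        if not self._readings:
            raise FailureException("sensor has no data")
        return self._readings.pop(0)


class Averager:
    """Hypothetical component with a normal mode and an exception handler."""

    def __init__(self, sensor):
        self._sensor = sensor
        self._last_good = None

    def request(self, n):
        if not isinstance(n, int) or n <= 0:
            raise InterfaceException("n must be a positive integer")
        try:
            # Normal mode: issue requests to the lower-level component.
            samples = [self._sensor.read() for _ in range(n)]
            self._last_good = sum(samples) / n
            return self._last_good
        except FailureException:
            # Exception handler: mask the failure if possible, else pass it up.
            if self._last_good is not None:
                return self._last_good
            raise FailureException("averager cannot deliver a value")


averager = Averager(Sensor([1.0, 3.0]))
first = averager.request(2)   # normal response
second = averager.request(1)  # sensor fails; handler masks it with the last good value
```

A component with no fallback, such as `Averager(Sensor([]))`, raises `FailureException` to its caller. An illegal request such as `request(0)` raises `InterfaceException`.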