System Safety M6 Common Cause Analysis V1.1
Matthew Squair
UNSW@Canberra
12 October 2015
1 Matthew Squair M6 Common Cause Analysis V1.1
Except for images whose sources are specifically identified, this copyright work is licensed under a Creative Commons Attribution-Noncommercial, No-derivatives 4.0 International licence.
To view a copy of this licence, visit http://creativecommons.org/licenses/by-nc-nd/4.0/
1 Introduction
2 Overview
3 A simple model of common cause failures
4 Methodology
5 Modelling common cause hazards
6 Controlling common cause failure hazards
7 Limitations, advantages and disadvantages
8 Conclusions
9 Further reading
Introduction
Learning outcomes
To be able to appropriately apply common cause analysis techniques as part of a hazard analysis
To understand the strengths and weaknesses of the method
Overview
Systems by their nature exhibit emergent behaviour
Component focused hazard analyses (e.g. FMECA) will find a set of hazards unique to that component
But if we look for unintended interactions of that component with the system and environment we will find another set
We also need to consider how other components reduce or ameliorate component hazards
These unintended interactions have been termed Common Cause Failures (CCF)
Many safety standards (ARP 4761, DEF-STAN 00-56, MIL-STD-882 inter alia) require CCF risks to be assessed
CCF can undermine designs & safety analyses
Key definitions
Dependent failure. A set of failure events whose joint probability cannot be expressed as the simple product of the unconditional failure probabilities of the individual events
Common cause failure. A specific type of dependent failure that arises in redundant components, where simultaneous (or near simultaneous) failures in different channels result from a single shared cause
Common mode failure. A common cause failure in which multiple items fail in the same mode
Cascade failure. All dependent failures that are not common cause failures, i.e. they do not affect redundant components
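The difference between assuming independence and allowing for dependence can be illustrated numerically. The sketch below uses assumed, purely illustrative probabilities for two redundant channels and compares the simple product with the chain rule P(A and B) = P(A) P(B|A).

```python
# Illustrative (assumed) values for two redundant channels A and B.
p_a = 1e-3          # unconditional probability that channel A fails
p_b = 1e-3          # unconditional probability that channel B fails
p_b_given_a = 0.1   # assumed probability that B fails given that A has failed


def joint_independent(pa: float, pb: float) -> float:
    """Joint failure probability if the two events really are independent."""
    return pa * pb


def joint_dependent(pa: float, pb_given_a: float) -> float:
    """Joint failure probability from the chain rule P(A and B) = P(A) P(B|A)."""
    return pa * pb_given_a


independent = joint_independent(p_a, p_b)
dependent = joint_dependent(p_a, p_b_given_a)

# How far the independence assumption underestimates the joint probability.
underestimate_factor = dependent / independent
```

With these assumed numbers the simple product understates the joint failure probability by roughly two orders of magnitude, which is the kind of gap a dependent failure analysis is meant to expose.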
Common cause analysis and the system lifecycle
CCA requires detailed knowledge of the system
A simple model of common cause failures
CCF basic model
Case study: The loss of XV230
Figure: (XV230 lost due to a fuel fire in No. 7 ’dry’ bay)
Case study: The loss of XV230 (The design error)
Figure: (No. 7 dry bay: Close proximity of fuel, catchment and ignition)
Case study: The loss of XV230 (The flawed safety analysis)
Figure: (No. 7 dry bay (ID 312): How not to do a zonal hazard analysis)
Objectives of CCA
Common cause analysis techniques focus on the detection of non-independence between events we may have assumed are independent
Functional or data sharing
Shared-equipment and services
Physical interactions (shared physical-spatial environment)
Human-interface (shared human error potential)
CCA is a known cause, unknown effect analysis
Case study. Aircraft hydraulic systems failure
Figure: (Data source: Hawker Siddeley)
Methodology Generic methodology
General process steps
Although there are different types of CCA, there is a general sequence of steps that are performed:
1 Identify groups of critical components from preceding analyses
2 Identify common features (location, design, services, data etc)
3 Identify credible common failure modes (EMI, electrical short)
4 Consider coupling mechanisms, failure mechanisms and their causes
5 Identify potential hazard controls (diversity, separation etc)
6 Assess the likelihood and risk (this is difficult without detail)
7 Document the analysis
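Steps 1 and 2 above amount to grouping critical components by the features they share. A minimal sketch follows; the component names and feature values are hypothetical and only illustrate the bookkeeping, not any particular system.

```python
from collections import defaultdict

# Hypothetical critical components from a preceding analysis (step 1),
# each tagged with illustrative features (step 2).
components = {
    "hyd_pump_1": {"location": "bay_3", "power": "bus_A", "design": "pump_mk2"},
    "hyd_pump_2": {"location": "bay_3", "power": "bus_B", "design": "pump_mk2"},
    "hyd_pump_3": {"location": "bay_7", "power": "bus_B", "design": "pump_mk5"},
}


def shared_features(items):
    """Return {(feature, value): [names]} for values shared by two or more components."""
    groups = defaultdict(list)
    for name, features in items.items():
        for feature, value in features.items():
            groups[(feature, value)].append(name)
    return {key: sorted(names) for key, names in groups.items() if len(names) > 1}


couplings = shared_features(components)
for (feature, value), names in sorted(couplings.items()):
    print(f"{feature}={value}: {', '.join(names)}")
```

Each shared feature reported is a candidate coupling mechanism to carry into steps 3 and 4; whether it is a credible common failure mode still needs engineering judgement.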
CCA analysis techniques
Specific analysis methods may be used to perform a CCA, depending upon what type of common cause is of concern:
Zonal (physical proximity)
Particular (external hazards)
Common mode (independence of components)
Cascading (knock on effects)
Common Cause analyses are often performed as part of the System Hazard Analysis (SHA)
Methodology Zonal hazard analysis
Zonal hazard analysis (proximity hazards)
The closer together critical components are (the more coupled), the higher the likelihood they will interact
Increased coupling = increased likelihood = increased risk
Systems also contain natural physical choke points (conduits etc)
System cabling, pipe-work etc comes together at these points
Production/Maintenance (and access) is more difficult
Inspection is more difficult
Higher interaction (coupling) with vehicle structure
Zonal and particular hazard analyses address these proximity driven hazards
Case study: Shuttle Boron-Al structural support tubes
The tubes provided structural support but were thin walled, so they buckled easily if bumped. Unfortunately, maintainers need access...
Zonal hazard analysis methodology
The zonal hazard analysis examines each physical zone of a system to ensure that equipment installation and potential physical interference with adjacent systems do not violate the independence requirements of the system
Should be based on the latest issue of drawings and mockups, together with an examination of the first representative system installation
Zonal hazard analysis methodology (cont’d)
However
Design is mature at this point
Higher cost of rectifying hazards if identified
Major redesigns may be out of reach
Addressing zonal hazards as part of the PHA
If the conceptual design permits, it may be possible to identify generic zonal hazards and hazard controls. These should then be allocated to specific chunks of the physical design. Requirements may take the form of installation guidelines, for example, “don’t install the fuel service lines over the air conditioning bleed air pack”
Case study: PHA fire hazard allocation to zones
Zonal hazard analysis methodology (steps)
1 Develop a checklist of coupling/common cause mechanisms
2 Subdivide the system into zones
3 Identify component location by zone
4 Identify component specific hazards (fire, leakage)
5 Review compliance with installation rules and guidelines
6 Assess impact on other components of component hazards
7 Identify overall coupling/common cause mechanisms
8 Review internal zone then external zone interactions
9 Assess likelihood and risk (difficult; may need detailed review)
10 Document the analysis
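Steps 2 to 8 above can be sketched as a simple lookup over zones. Everything in this example (zone names, components, hazards and adjacency) is hypothetical; a real analysis would draw this from installation drawings and inspection of the system.

```python
# Hypothetical zone layout and component data, for illustration only.
zone_of = {                         # steps 2 and 3: zones and component locations
    "fuel_line": "zone_7",
    "bleed_air_duct": "zone_7",
    "flight_ctrl_ch1": "zone_7",
    "flight_ctrl_ch2": "zone_8",
    "hyd_line_A": "zone_9",
}
hazards_of = {                      # step 4: component specific hazards
    "fuel_line": {"leakage"},
    "bleed_air_duct": {"ignition"},
    "hyd_line_A": {"leakage"},
}
adjacent = {"zone_7": {"zone_8"}, "zone_8": {"zone_7"}, "zone_9": set()}


def zone_hazards(zone):
    """Hazards generated by the components installed in one zone."""
    found = set()
    for comp, z in zone_of.items():
        if z == zone:
            found |= hazards_of.get(comp, set())
    return found


def exposure(component):
    """Hazards a component is exposed to: its own zone first, then adjacent zones (step 8)."""
    zone = zone_of[component]
    internal = zone_hazards(zone)
    external = set()
    for neighbour in adjacent[zone]:
        external |= zone_hazards(neighbour)
    return internal, external - internal


ch1_internal, ch1_external = exposure("flight_ctrl_ch1")
ch2_internal, ch2_external = exposure("flight_ctrl_ch2")
```

In this made-up layout both flight control channels see the same leakage and ignition sources, one from inside its zone and the other from the adjacent zone, so their assumed independence would need to be reviewed.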
Case study: External zone interactions (Fluid blow off)
Figure: Fluid tracking from overboard discharge into other zones
Case study: Jet pipe removal task for Gloster Javelin
Figure: Task is to remove jet pipe (Source: HISE U.York)
Zonal coupling/common cause mechanisms
Spatial positioning of components
Stresses on pipes & cables
Projectile and debris paths from explosive events
Fuel pooling and catchment areas
Cross contamination from clean to dirty systems
The spatial position of components within the system
Degradation of hazard ameliorators in the zone
Access, installation & removal difficulty and damage potential
Location (vulnerability) of operators or control systems
Major zone events (fire, flood etc)
Environment (temperature, EMI, moisture, shock or vibration)
Wiring and electrical cabling co-location (short potential)
Case Study: Electrical wiring
Functionally simple:
insulation
conductors
connectors
circuit breakers
clamps
conduits
But physically:
lots of it
crosses multiple zones
no single causal factor in failures
’fit and forget’ attitude
almost every function runs through it
failure can be catastrophic (TWA 800)
Spatial and functional design dependence
Spatial (geometrical) design and its design decisions can affect functional design in unexpected ways.
Methodology Particular hazard analysis
Particular hazard analysis
Specialist (technology/circumstance dependent) analyses that evaluate external hazards (e.g. fire, rotor bursts, HIRF) which could compromise functional redundancy. Usually deterministic.
Figure: Example: Penetration of wing by HPT disc fragment
Methodology Common mode hazard analysis
Common mode hazard analysis
Common mode hazard analyses verify that ANDed events in safety analyses such as Fault Trees, Dependence Diagrams or Markov Analyses are actually independent in real life
Example common mode failures:
Software design errors (important to model!)
Hardware design errors (important to model!)
Hardware failures
Production or repair flaws
Stress related events (abnormal environment)
Environment (temperature, vibration, acoustic)
Methodology is covered further in the fault tree module
Methodology Cascading failure hazard analysis
Cascading hazard analysis
Cascading failures are common in power and other services systems. When one element fails, the additional load on the others pushes them beyond their limits and their failure rate increases
Cascading failure can also occur in structural systems, e.g. the so called zipper effect
Complex networks like the power grid and computer networks also exhibit cascade failure effects, where an apparently small initiating event can cause loss of the system
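The load redistribution mechanism can be sketched with a toy model. The simplifications here are assumptions made for illustration: the total load is constant, it is shared equally among surviving elements, and an element fails outright once its share exceeds its capacity (rather than just suffering an increased failure rate).

```python
def cascade(capacities, loads, first_failure):
    """Simulate a simple cascade and return failed element indices in failure order.

    The total load is shared equally among the survivors, and any survivor
    whose share exceeds its capacity fails in the next round.
    """
    failed = [first_failure]
    alive = [i for i in range(len(loads)) if i != first_failure]
    total_load = sum(loads)
    while alive:
        share = total_load / len(alive)
        overloaded = [i for i in alive if share > capacities[i]]
        if not overloaded:
            break  # survivors can carry the redistributed load
        failed.extend(overloaded)
        alive = [i for i in alive if i not in overloaded]
    return failed


# Four feeders each carrying 60 units; only the capacity margins differ.
fragile = cascade(capacities=[100, 75, 90, 100], loads=[60, 60, 60, 60], first_failure=0)
robust = cascade(capacities=[100, 85, 90, 100], loads=[60, 60, 60, 60], first_failure=0)
```

In the first case a single failure takes out the weakest survivor, which in turn overloads the rest, so the whole system is lost. In the second case a slightly larger margin on one element stops the cascade at the initiating failure.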
Case study: Boeing 787 Li-ion battery cell cascade
Note the effect of progressive failure of battery cells due to physical proximity
Note also that the batteries vented into the internal battery box volume
Figure: NTSB X-ray of JAL 787 battery box
Cascading hazard analysis
For electrical components, relationships between load and failure rate are quantified in MIL-HDBK-217 Reliability Prediction of Electronic Equipment [DOD (US) 1995]
Cascading failures in structures are normally addressed by adherence to design standards that preclude these effects, rather than by formal analysis. Sometimes these standards get it wrong, though; see Aloha Flight 243 as an example
The analysis of complex network systems requires system specific simulation and modelling of the system and its responses
Network-to-network interactions can cause cascading failures. For example, a power outage knocks out phone lines that are used to control power system substations
Modelling common cause hazards
Modelling CCF
Two ways to model CCFs and hazards
Explicit
Explicitly model in fault or event trees
Model structural and functional interdependencies
System-specific, but does not completely cover the impact of potential CCF on safety
Implicit
Implicitly model the residual CCF fractions
Various types (Marshall-Olkin, beta, MGL etc)
In principle can cover all CCF, but misses structural/functional dependencies
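As a minimal sketch of the explicit approach, the shared cause can be added to the fault tree as its own basic event. The gate functions below assume independent inputs, and the event names and probabilities are hypothetical values chosen for illustration.

```python
from math import prod


def and_gate(*probs: float) -> float:
    """All inputs must occur; assumes the inputs are independent."""
    return prod(probs)


def or_gate(*probs: float) -> float:
    """At least one input occurs; assumes the inputs are independent."""
    return 1.0 - prod(1.0 - p for p in probs)


# Hypothetical basic events, values assumed for illustration.
p_channel_a = 1e-3      # channel A fails on its own
p_channel_b = 1e-3      # channel B fails on its own
p_shared_supply = 1e-4  # shared power supply fails, taking out both channels

# Tree that ignores the shared supply: both channels must fail independently.
naive_top = and_gate(p_channel_a, p_channel_b)

# Tree with the shared supply modelled explicitly as an extra basic event.
explicit_top = or_gate(and_gate(p_channel_a, p_channel_b), p_shared_supply)
```

With these assumed values the explicit tree gives a top event probability about a hundred times higher than the naive tree, driven almost entirely by the shared supply. Only dependencies that the analyst has identified get modelled this way, which is why residual CCF is handled implicitly.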
Explicit modelling example
Figure: Explicit modelling of CCF [Clements 1996]
Implicit modelling example - The Beta model
A simple model. Failures in a group of n components are either independent (k = 1) or all components fail (k = n). The component failure probability has an independent and a CCF component
$$Q_t = Q_1 + Q_n \qquad \text{(total failure probability)} \tag{1}$$
Beta is the ratio of CCF probability to total failure probability
$$\beta = \frac{Q_n}{Q_t} \tag{2}$$
And after some work (not shown) we get the failure probabilities
$$Q_k = \begin{cases} (1 - \beta)\,Q_t & \text{if } k = 1, \\ 0 & \text{if } n > k > 1, \\ \beta\,Q_t & \text{if } k = n \end{cases} \tag{3}$$
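The split of the total failure probability under the beta model can be written as a short function. This is a sketch of the three cases above; the function name and the example inputs are assumptions for illustration.

```python
def beta_factor_split(q_total: float, beta: float, n: int) -> dict:
    """Return {k: Q_k} for k = 1..n under the beta factor model.

    Q_1 is the independent part, Q_n the common cause part, and every
    intermediate multiplicity (1 < k < n) is assigned zero probability.
    """
    if not 0.0 <= beta <= 1.0:
        raise ValueError("beta must lie between 0 and 1")
    if n < 2:
        raise ValueError("a common cause group needs at least two components")
    q = {k: 0.0 for k in range(1, n + 1)}
    q[1] = (1.0 - beta) * q_total
    q[n] = beta * q_total
    return q


# Assumed inputs: total failure probability 1e-3, beta 0.1, three components.
split = beta_factor_split(q_total=1e-3, beta=0.1, n=3)
```

The independent and common cause parts always sum back to the total, so raising beta moves probability from the k = 1 case to the k = n case without changing the overall component failure probability.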
Implicit modelling example - The Beta model (cont’d)
Beta can be determined fairly easily from operational experience
Too conservative in the case of simultaneous failures of N > 2 items
As you decrease the CCF failure rates you increase the independent failure rates... which is odd
Easy to apply and a good ’sand boxing’ technique
IEC’s functional safety standard [IEC 61508] uses the beta factor method
Case study. NASA seal CCF analysis
Design concept for triple redundant seal considered only independent failures. Heritage data indicates CCF of redundant but similar seals can be expected
Modelling CCF using a Beta factor (assumed to be 0.1) and an independent seal failure rate of 0.001 (from historical records) gives
$$Q_3 = \beta \times Q_1 = 0.10 \times 0.001 = 0.0001$$
Compare to the independent failure rate of $0.001^3 = 10^{-9}$
Common cause failure dominance
As redundancy increases, common cause failure rates come to dominate safety considerations
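The dominance effect can be seen by repeating the case study arithmetic for different levels of redundancy. The sketch uses the case study's beta of 0.1 and seal failure probability of 0.001, and follows its approximation that the common cause term is simply beta times the single seal value.

```python
BETA = 0.10      # beta factor assumed in the case study
Q_SEAL = 0.001   # single seal failure probability from the case study


def independent_only(n: int) -> float:
    """All n seals fail independently."""
    return Q_SEAL ** n


def common_cause() -> float:
    """All seals fail from one shared cause (the beta x Q approximation)."""
    return BETA * Q_SEAL


for n in (2, 3, 4):
    ratio = common_cause() / independent_only(n)
    print(f"n={n}: independent={independent_only(n):.1e}  "
          f"CCF={common_cause():.1e}  ratio={ratio:.0e}")
```

Each added seal cuts the independent term by a further factor of a thousand, while the common cause term does not change, so the ratio between them grows with every level of redundancy.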
Case study. NASA seal CCF analysis (cont’d)
Figure: Fault tree representation of CCF [NASA OSMA 2002]
CCFs are rare
Individual systems present limited experience
Global industry (multi-system) experience is needed to make statistical inferences
But there can be significant variability among systems due to differences in coupling mechanisms and defenses
Modelling needs to address this uncertainty of CCF base data
CCF uncertainty
Uncertainty of CCF data due to the rare nature of such events means that, as we increase system redundancy, uncertainty of risk grows with the increasing significance of common cause effects
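One way to see this effect is to vary beta over a plausible range and watch how much the system estimate moves at each level of redundancy. The sketch below uses a rare event approximation with a beta factor split; the channel failure probability and the beta range are assumptions chosen for illustration.

```python
def system_failure(q: float, beta: float, n: int) -> float:
    """Rare event approximation: all n channels fail independently, or one CCF takes all."""
    if n == 1:
        return q                      # no redundancy, so no common cause term
    return ((1.0 - beta) * q) ** n + beta * q


def spread(q: float, beta_low: float, beta_high: float, n: int) -> float:
    """Ratio of the system estimate at the two ends of the assumed beta range."""
    return system_failure(q, beta_high, n) / system_failure(q, beta_low, n)


Q = 1e-3                          # assumed channel failure probability
BETA_LOW, BETA_HIGH = 0.01, 0.2   # assumed plausible range for beta
spreads = {n: spread(Q, BETA_LOW, BETA_HIGH, n) for n in (1, 2, 3)}
```

For a single channel the estimate does not depend on beta at all. With two or three channels the spread between the low and high beta estimates approaches the full width of the beta range, so the uncertainty in the CCF data passes almost directly into the system risk estimate.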
Model completeness and uncertainty
Figuring out what aspects of the real world you don’t have to model is an important (and subjective) part of model building
If we make our models too complex, we’ll have a hard time figuring out whether they’re internally consistent and make sense
But when we simplify we need to understand what sort of uncertainty, and therefore risk, we are introducing
Incompleteness in modelling CCF
The completeness with which CCF is (or is not) modelled can introduce uncertainty in the assessment of risk. In the worst case we may be underestimating the system risk
Controlling common cause failure hazards
Common cause hazard controls
Reduce functional coupling (common services/processes)
Reduce physical/spatial coupling
Reduce shared environments of redundant equipment
Isolate and contain physical failure effects
Harden component(s) against environmental effects
Increase inspection of high wear/stress zones
Reduce human error coupling
Error proof the systems
Independent inspection and cross checks
Minimise the conduct of simultaneous critical maintenance tasks
Limitations, advantages and disadvantages
Limitations
Limitations of the technique
Deciding on zones for zonal analysis can be problematic
Zonal and particular hazard analysis are qualitative hazard identification techniques
The level of detail can bog the analysis
Common mode analysis is really an analysis of a previous analysis; what happens when that analysis hasn’t been done?
Advantages and disadvantages
CCA has the following advantages
it can address common cause failure modes which are a significant contributor to system hazards
it can help focus design effort on aspects of system failure that are traditionally poorly handled
it can be used to refine FTA and ETA analyses
And the following disadvantages
The analysis effort climbs steeply with the number of elements
For complex systems (e.g wiring) a database may be required
Requires a maturity of design that makes subsequent changes difficult
Conclusions
Spatial design and functional design are not independent
Common cause failures dominate system behaviour as redundancy is increased
Their presence can invalidate safety analyses based on the premise of independence of events
Without some form of common cause analysis you are effectively making an assumption of independence of failure events, which may or may not be true
These issues tend to cross subsystem boundaries and fall between the cracks of system design, so it is worthwhile to focus early system safety attention on them
Further reading
Bibliography
[Clements 1996] Clements, P., (1996) Sverdrup System Safety Course Notes, Sverdrup.
[DOD (US) 1995] DoD (US) (1995) MIL-HDBK-217F, Reliability Prediction of Electronic Equipment, Ed. F, 1991, Not. 2.
[DoD (US) 1993] DoD (US) (1993) Standard Practice for System Safety, US Dept of Defense Standard MIL-STD-882C, 19 January 1993.
[Humphreys, Johnston 1987] Humphreys, P., and Johnston, B.D., (1987) Dependent Failure Procedure Guide SRD-418, UK AEA, Safety and Reliability Directorate.
[IEC 61508] IEC 61508 (2011) Functional safety of electrical/electronic/programmable electronic safety related systems, International Electrotechnical Commission, Geneva.
[NASA OSMA 2002] NASA (2002) Fault Tree Handbook with Aerospace Applications, Office of Safety and Mission Assurance (OSMA), V1.1.