
Contents lists available at ScienceDirect

Reliability Engineering and System Safety

journal homepage: www.elsevier.com/locate/ress

SIL verification for SRS with diverse redundancy based on system degradation using reliability block diagram

Long Ding a,b,⁎, Hong Wang a, Jin Jiang b, Aidong Xu a

a Key Laboratory of Networked Control Systems, Chinese Academy of Sciences, Shenyang Institute of Automation, University of Chinese Academy of Sciences, No. 114 Nanta Street, Shenyang 110016, PR China
b Department of Electrical and Computer Engineering, Western University, 1151 Richmond Street North, London, Ontario, Canada N6A 5B9

ARTICLE INFO

Keywords: Functional safety; SIL verification; Diverse redundancy; System degradation; MooN(D) voting group; RBD

ABSTRACT

Safety integrity level (SIL) verification is a critical step in the safety lifecycle of safety-related systems (SRS). Introducing redundancy into SRS raises two issues: voting group configuration and common cause failures (CCF). In order to minimize CCF, diverse redundancy is widely adopted in SRS. However, almost all past work on SIL verification has addressed identical redundancy, as reflected in IEC 61508, ISA-TR84.00.02 and the scientific literature. Therefore, a novel method for SIL verification of SRS with diverse redundancy based on system degradation is proposed. The key idea of the method is to calculate the average probability of dangerous failure on demand (PFDG) at each stage of system degradation, which is caused by failures of redundant channels. To validate the proposed method, it has been applied to the safety shutdown system of the Nuclear Power Control Test Facility, and the numerical result is compared with FTA and the FRANTIC model. Sensitivity studies and comparison of numerical results indicate that the method is highly consistent with FTA and the FRANTIC model. Moreover, two sets of general formulae for PFDG of any MooN(D) group with diverse redundancy are provided. From an engineering practice point of view, this makes the SIL verification process simpler.

1. Introduction

Safety integrity level (SIL) verification is a critical step in the safety lifecycle of a safety-related system (SRS), in accordance with IEC 61508 [1], to achieve the required SIL for safety functions. SIL verification generally means calculating the average probability of dangerous failure on demand (PFDG) of an SRS and verifying whether the PFDG of the designed SRS meets the required failure measure.
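As a minimal sketch of the final comparison step, the snippet below maps a computed PFDG onto the low-demand SIL bands commonly quoted from IEC 61508-1. The names `achieved_sil` and `verify_sil` are illustrative assumptions and do not come from the paper.

```python
# Low-demand-mode bands for the average PFD, as commonly quoted from IEC 61508-1.
# Each entry is (SIL, lower bound inclusive, upper bound exclusive).
SIL_BANDS = [
    (4, 1e-5, 1e-4),
    (3, 1e-4, 1e-3),
    (2, 1e-3, 1e-2),
    (1, 1e-2, 1e-1),
]


def achieved_sil(pfd_avg: float) -> int:
    """Return the SIL whose band contains pfd_avg, or 0 if no SIL is reached."""
    if pfd_avg < 1e-5:
        return 4  # better than the SIL 4 band; the claim is capped at SIL 4
    for sil, low, high in SIL_BANDS:
        if low <= pfd_avg < high:
            return sil
    return 0  # PFD of 0.1 or more does not reach SIL 1


def verify_sil(pfd_avg: float, required_sil: int) -> bool:
    """Hypothetical helper: True if the computed PFD meets the required SIL."""
    return achieved_sil(pfd_avg) >= required_sil
```

For instance, `verify_sil(5e-4, 2)` checks a PFDG of 5e-4 against a SIL 2 requirement. Architectural constraints and systematic capability, which the standard also requires, are outside this sketch.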

Redundancy is a technique that has been widely used to obtain high reliability and availability in SRS. Introducing redundancy into SRS raises two issues: voting group configuration and common cause failures (CCF). A consistent conclusion on SIL verification is that CCF is the dominant part of the overall PFDG for SRS [2–7]. In order to minimize CCF, diverse redundancy is widely adopted in SRS. However, almost all past attention on SIL verification has been paid to SRS with identical redundancy, and this is reflected in IEC 61508-6 [8], ISA-TR84.00.02 [9] and the scientific literature.

In most of the literature on SIL verification, redundant channels in a voting group are assumed to be statistically independent and identical [10–13]. However, in many practical applications, not all channels in a voting group can be treated as identical because they have different failure rates and diagnostic coverages, i.e. diverse redundancy.

Diverse redundancy refers to the use of two or more different systems, which are built using different components, algorithms, electronics, design methodology etc. to perform the same task. One benefit of diverse redundancy is the increased capability to reduce common mode and systematic failures such as those caused by design flaws.

There is clear evidence that diverse redundancy can bring benefits to system reliability. For example, Torres-Echeverría et al. [14] studied design optimization of safety instrumented systems (SIS) based on RAMS + C (reliability, availability, maintainability and safety plus cost) addressing IEC 61508 requirements and diverse redundancy. The results showed that, in general, design optimization with diversity gives better results than optimization without diversity, and that introducing diversity as a defense measure against CCF has a positive impact on the functional safety of the SIS, which in turn improves the overall plant safety.

http://dx.doi.org/10.1016/j.ress.2017.03.005
Received 29 June 2016; Received in revised form 3 March 2017; Accepted 6 March 2017

⁎ Corresponding author at: Key Laboratory of Networked Control Systems, Chinese Academy of Sciences, Shenyang Institute of Automation, University of Chinese Academy of Sciences, No. 114 Nanta Street, Shenyang 110016, PR China.

E-mail address: [email protected] (L. Ding).

Abbreviations: CCF, Common cause failures; DD, Dangerous detected; DU, Dangerous undetected; MooN(D), M-out-of-N(Diagnosis); MAT, Mean available time; MDT, Mean down time; MRT, Mean repair time; MTTR, Mean time to restoration; MUT, Mean up time; MUnT, Mean unavailable time; RAMS + C, Reliability, availability, maintainability and safety plus cost; RBD, Reliability block diagram; SIL, Safety integrity level; SIS, Safety instrumented system; SRS, Safety-related system

Reliability Engineering and System Safety 165 (2017) 170–187

Available online 21 March 2017
0951-8320/ © 2017 Elsevier Ltd. All rights reserved.

However, it is difficult to quantify these benefits and the effects of diverse redundancy on system reliability. Some research, though comparatively little, has been done on the reliability and availability of systems using diverse redundant components. Khatab et al. [15] studied the stationary availability of k-out-of-n: G systems with n non-identical components subject to repair priorities by using a multi-dimensional Markov model in which each component is characterized by its own failure rate. Moghaddass et al. [16] evaluated the availability of general k-out-of-n: G systems with non-identical components considering shut-off rules using a finite state-dependent non-homogeneous quasi-birth-death process. These methods for reliability and availability of systems with diverse redundancy are either not suitable for SRS or too cumbersome to be applied in engineering practice.

Classical methods, such as RBD, FTA, Markov analysis [17] and Monte Carlo simulation, can be used to verify SIL of diverse redundancy, but with much more complexity or limited accuracy. Kaczor et al. [18] used Monte Carlo simulation, applying the 2-out-of-N reliability structure and sectioning off so-called voting zones, to verify SIL and reduce the number of false alarms. However, the method requires specialist software and is time-consuming, and the results depend on the quality of the pseudo-random number generator and the simulation parameters. Rausand [19] gave an approach to determine PFDG for a MooN voting group of N independent and non-identical channels by using the Simplified Formulas method. This approach is easy to use in practice; however, it only takes dangerous undetected (DU) failures into account and does not consider dangerous detected (DD) failures or CCF. Details can be found on pp. 203–205 of [19].

Therefore, a novel method for SIL verification of SRS with diverse redundancy, which is based on system degradation using reliability block diagrams (RBD), is proposed in this paper. This paper is a further development of the authors' previous work [20], which studied the SIL verification of systems with identical redundant channels from the perspective of system degradation. The aim of this paper is to present a new, simpler and more accurate SIL verification method for SRS with diverse redundancy design in harmony with IEC 61508 requirements.

Two important issues when introducing diverse redundancy into SRS are voting group configuration and probability of CCF [14,21]. In this system degradation based method, the CCF part of PFDG is quantified by using the multi-β factor model [22] introduced in the PDS method [23] by SINTEF. The new contributions of this paper are the quantification method for the independent part of PFDG and two sets of general formulae to determine PFDG of any MooN(D) voting group with diverse redundancy. Some other generalized formulae for identical redundancy can be found in [2–4,13,24,25].

The key idea of the proposed method is to perform RBD analysis and calculate PFDG at each stage of system degradation, which is caused by failures of redundant channels. System degradation is the property resulting from redundant channels' failures in a multi-channel system, and it enables a system to continue operating properly in the event of failures until the system's fault tolerance reaches zero. Based on the degradation processes, from fully functioning to degraded functioning until the system can no longer execute the safety function on demand, this paper discusses the RBDs of safety architectures dynamically and gives combined RBDs composed of the RBD of each stage. Based on the RBDs obtained, formulae of PFDG for a voting group can be written out straightforwardly according to the degradation processes, and the equivalent mean down time (MDT) of one channel and of one voting group are also calculated based on the degradation processes.

Nomenclature

$C_i$  ith minimal cut set of (N − M + 1) channels fail
$MAT^{DU}_{CE}$  MAT of the first failed channel in a voting group over [0, T1]
$MAT^{DU}_{GE}$  MAT of the second failed channel in a voting group over [0, T1]
$MAT^{DU}_{G2E}$  MAT of the third failed channel in a voting group over [0, T1]
$MUnT^{DU}_{CE}$  MUnT of the first failed channel in a voting group over [0, T1]
$MUnT^{DU}_{GE}$  MUnT of the second failed channel in a voting group over [0, T1]
$MUnT^{DU}_{G2E}$  MUnT of the third failed channel in a voting group over [0, T1]
$PFD$  Average probability of dangerous failure on demand
$PFD_i$  PFD of any ith failed channel in a voting group
$PFD_{CCF}$  PFD caused by CCF
$PFD_G$  PFD for a voting group
$R(t)$  Component reliability
$R(t)_{CE}$  Reliability of the component with failure rate λDU,1 in the first failed channel
$R(t)_{GE}$  Reliability of the component with failure rate λDU,2 in the second failed channel, under the condition that any one channel has failed
$R(t)_{G2E}$  Reliability of the component with failure rate λDU,3 in the third failed channel, under the condition that any two channels have failed
$T_1$  Proof test interval
$\beta$  The fraction of DU failures that have a common cause
$\beta_D$  The fraction of DD failures that have a common cause
$\lambda$  Failure rate
$\lambda_{D,i}$  Dangerous failure rate of the ith failed channel in a voting group (λD,i = λDD,i + λDU,i)
$\bar{\lambda}_{DD}$  Geometric average of DD failure rates of all channels in a voting group
$\lambda_{DD,i}$  DD failure rate of the ith failed channel in a voting group
$\lambda^{ind}_{DD,i}$  Independent DD failure rate of the ith failed channel
$\bar{\lambda}_{DU}$  Geometric average of DU failure rates of all channels in a voting group
$\lambda_{DU,i}$  DU failure rate of the ith failed channel in a voting group
$\lambda^{ind}_{DU,i}$  Independent DU failure rate of the ith failed channel
$\lambda_{S,i}$  Safety failure rate of the ith failed channel in a voting group
$\lambda_{SD,i}$  Safe detected failure rate of the ith failed channel in a voting group
$\mu$  Repair rate
$t_{CE}$  Equivalent MDT for the first failed channel in a MooN voting group
$t_{GE}$  Equivalent MDT for the second failed channel in a MooN voting group
$t_{G2E}$  Equivalent MDT for the third failed channel in a MooN voting group
$t_{G(i-1)E}$  Equivalent MDT for the ith failed channel, i.e. equivalent MDT for a MooN voting group with (i − 1) fault tolerance
$t_{G(N-M)E}$  Equivalent MDT for a MooN voting group
$t'_{CE}$  Equivalent MDT for the first failed channel in a MooND voting group
$t'_{GE}$  Equivalent MDT for the second failed channel in a MooND voting group
$t'_{G(i-1)E}$  Equivalent MDT for the ith failed channel, i.e. equivalent MDT for a MooND voting group with (i − 1) fault tolerance
$t'_{G(N-M)E}$  Equivalent MDT for a MooND voting group


The method takes DU failures, DD failures and CCF into account, so it is more accurate than the Simplified Formulas method. To validate the credibility of the proposed method and the equations derived, the method has been applied to the safety shutdown system of the Nuclear Power Control Test Facility (NPCTF) at The University of Western Ontario, and a general comparison of numerical results among this method, Fault Tree Analysis (FTA) and FRANTIC is made to demonstrate the correctness of the method. FRANTIC (Formal Reliability Analysis including Normal Testing, Inspection, and Checking) is a set of computational procedures developed by the Nuclear Regulatory Commission of the United States in the 1980s. It solves the problem by numerical integration of the time-dependent system unavailability (PFDG) function built from the logical combination of individual component unavailability functions [26,27]. The comparisons demonstrate that the numerical result is very close to the results calculated with FTA and FRANTIC, and sensitivity studies of failure rate and proof test interval indicate that the method is highly consistent with FTA and the FRANTIC model.

The rest of this paper is organized as follows. The problem statement and assumptions are presented in Section 2. Section 3 gives RBDs of different voting groups and detailed calculation processes of PFDG. In Section 4, results and discussion are presented. Conclusions are given in Section 5.

2. Problem statement and assumptions

For SIL verification of systems using diverse redundancy, especially hardware of SRS, the IEC equations [8] and ISA equations [9] cannot be used as they are proposed for identical redundancy. Classical reliability analysis methods, such as RBD, FTA and Markov analysis (MA), can be used to verify SIL of diverse redundancy, but with much more complexity or limited accuracy. Fig. 1 shows usability and accuracy of existing reliability analysis methods for SIL verification of SRS.

When used for SIL verification of SRS, RBD, FTA and simplified equations (SE) cover few aspects of the system's safety-related behavior: these methods consider failure rates only, and do not consider repair rates and system degradation states. They are static analysis methods and easy to use; however, their accuracy is limited.

MA, Petri nets and Monte Carlo simulation can cover more aspects of the system's safety-related behavior: these methods can consider failure rates, repair rates and system degradation states by using state transitions. They are dynamic analysis methods and more accurate, but they are much more complex to apply to complex redundant SRS. Therefore specialized knowledge and skills are required.

Hybrid techniques are normally combinations of two or more of the methods presented above; they vary and are applied case by case. Compared to identical redundancy, all the methods above are more complicated when used for diverse redundancy.

This system degradation based method is a dynamic SIL verification method, and each operational stage due to system degradation can be considered as a system state. This method can be easily used in engineering practice to deal with SIL verification of diverse redundancy with reasonable accuracy. In order to describe the degradation processes and PFDG calculations as clearly as possible, the following assumptions are made as a basis.

i. All channels in a voting group are independent and diverse, except for common cause failures;

ii. Component failure rates are constant over the life of the system, and components (or blocks representing them in the RBD) can exist in only two states: working (“up” state) or failed (“down” state);

iii. All channels in a voting group are fully functioning after start-up of the system;

iv. For each safety function, there is perfect proof testing and repair, i.e. all failures that remain undetected are detected by the proof test;

v. The proof test interval is at least an order of magnitude greater than the MRT;

vi. For each system, there is a single T1 and MTTR: the fraction of failures specified by the diagnostic coverage is both detected and repaired within the MTTR. MTTR includes the diagnostic test interval which is typically less than 1 h, the remaining being the MRT;

vii. Failures of individual components are considered to be statistically independent events.

viii. λD,i is the failure rate of the ith failed channel and λD,j is the failure rate of the jth failed channel; for i < j, λD,i > λD,j. In other words, the first failed channel has the highest failure rate and shortest MTTF (mean time to failure, MTTF = 1/λ), and the last failed channel has the lowest failure rate and longest MTTF. Then, for a MooN voting group, $\prod_{i=1}^{N-M+1} \lambda_{D,i} \ge \prod_{i \in C} \lambda_{D,i}$, where the set C includes any (N − M + 1) of the N failure rates. Since $PFD_i \propto \lambda_{D,i}$, it follows that $\prod_{i=1}^{N-M+1} PFD_i \ge \prod_{i \in C} PFD_i$. Therefore, the results of PFDG obtained by using the proposed method are conservative.
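The ordering argument in assumption viii can be checked numerically. The following sketch, with hypothetical failure rates and illustrative function names that are not from the paper, compares the product of the (N − M + 1) largest dangerous failure rates with the product over every possible cut set of that size.

```python
from itertools import combinations
from math import prod


def worst_case_product(rates, m, n):
    """Product of the (n - m + 1) largest rates, i.e. the first channels to fail
    under the ordering of assumption viii."""
    k = n - m + 1
    return prod(sorted(rates, reverse=True)[:k])


def cut_set_products(rates, m, n):
    """Products of the rates over every cut set C of (n - m + 1) channels."""
    k = n - m + 1
    return [prod(c) for c in combinations(rates, k)]


# Hypothetical dangerous failure rates (per hour) of a 2oo3 group.
rates = [4.0e-6, 1.0e-6, 2.5e-6]
bound = worst_case_product(rates, m=2, n=3)
is_conservative = all(bound >= p for p in cut_set_products(rates, m=2, n=3))
print(bound, is_conservative)
```

Because the bound uses the largest rates, no other choice of cut set can give a larger product, which is what makes the resulting PFDG conservative.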

3. RBD and PFDG

3.1. Voting group

Table 1 shows typically used architectures and their primary attributes. The term “Safety Fault Tolerance” indicates the number of extra units that exist to maintain safety (avoid the fail-danger state). The term “Availability Fault Tolerance” indicates the number of extra units that exist to maintain availability (avoid the fail-safe state) [28]. These terms are an expansion of the term “Hardware Fault Tolerance” from IEC 61508.

Four of the architectures, 1oo1, 1oo2, 2oo2 and 2oo3, have existed since the early days of relay logic. Three of the architectures, 2oo3, 1oo2D, and 2oo2D, have become common in commercial implementations that provide both high safety and high availability, but with a bias towards safety or availability. The architectures with the D designation have one or more output switches controlled by automatic diagnostics. These diagnostics are used to control system failure modes and to modify the failure behavior of units within the system. The D architectures provide excellent performance for both safety and availability, which depends on exceptionally good diagnostic coverage. For more details about these architectures and their relationships, refer to [28].

SIS designers need to select a proper architecture as the voting group to meet a required SIL. For that, a fundamental task is the PFD evaluation of different voting groups. Both IEC 61508-6 [8] and ISA-TR84.00.02 [9] provide such equations for PFD evaluation of the above typical SIS architectures. However, from time to time, SIL analysts face situations where they need to evaluate the PFD of higher redundancy architectures [2], and actual commercial architectures vary considerably and may even be combinations of the typical architectures above. Therefore, besides those most widely used architectures, this paper also provides two sets of general formulae for PFDG of any MooN(D) voting group with diverse redundancy.

Fig. 1. Usability and accuracy of reliability analysis methods for SIL verification.

3.2. Multi-β factor model with diverse redundancy

A MooN voting group with diverse redundant channels considering DU failures, DD failures and CCF is shown in Fig. 2. The redundant channels have different failure rates, i.e., for i ≠ j, λD,i ≠ λD,j, λDU,i ≠ λDU,j, λDD,i ≠ λDD,j, with λD,i = λDU,i + λDD,i and λD,j = λDU,j + λDD,j.

The multi-β factor model [23] is used to incorporate the effects of CCF, where the multiplicity of the CCF (i.e. number of channels affected) is explicitly treated. In an early version of IEC 61508-6, the single-β factor model was introduced to quantify CCF. In the single-β factor model, the factor β represents the fraction of failures that have a common cause. The problem with the single-β factor model is that the rate of dependent failures is the same for any MooN voting. So the single-β factor model does not distinguish between different voting logics, and the same result is obtained e.g. for 1oo2, 1oo3 and 2oo3 voted systems [22].

Due to the limitations of the single-β factor model for CCF, the PDS method [23] provided an extension of the single-β factor model, i.e. the multi-β factor model, which is suggested in the new version of IEC 61508-6 published in 2010 [8].

In the PDS method [23], the multi-β factor model explicitly depends on the configuration, and the β factor of a MooN voting logic is expressed as β(MooN) = β ⋅ CMooN, where CMooN is a modification factor for various voting configurations, and β is the factor obtained for a 1oo2 voting group, which can be determined by using the checklist in IEC 61508-6.

The original multi-β factor model was defined for identical redundant channels with the same constant failure rate. However, diverse redundant channels are non-identical and have different failure rates. In this case it is more difficult to select the “joint” failure rate for inclusion of CCFs. To account for CCFs in this situation, the following two approaches are introduced by [23] to select a “joint” failure rate for the diverse channels:

1. Using some representative average value, typically the geometric mean of the CCF failure rates of the diverse channels [22].

2. Using the lowest failure rate of the diverse channels.

For diverse redundant channels, using the geometric mean has often been the preferred method [23]. This is an adequate approach if the failure rates are of the same order of magnitude. The second approach, using the lowest failure rate, is suggested when the failure rates differ by orders of magnitude; refer to [23] for a discussion of the use of these two approaches.

In this paper, the geometric mean approach is used to quantify CCFs. Let λDU,i be the DU failure rate of the ith failed channel, for i = 1, 2, …, N. The geometric mean of the N DU failure rates is

$$\bar{\lambda}_{DU} = \left( \prod_{i=1}^{N} \lambda_{DU,i} \right)^{1/N} \tag{1}$$

Then, the independent DU failure rate $\lambda_{DU,i}^{ind}$ of the ith failed channel becomes

$$\lambda_{DU,i}^{ind} = \lambda_{DU,i} - \beta \bar{\lambda}_{DU} \tag{2}$$

Similarly, the geometric mean of the N DD failure rates is

$$\bar{\lambda}_{DD} = \left( \prod_{i=1}^{N} \lambda_{DD,i} \right)^{1/N} \tag{3}$$

Then, the independent DD failure rate $\lambda_{DD,i}^{ind}$ of the ith failed channel becomes

$$\lambda_{DD,i}^{ind} = \lambda_{DD,i} - \beta_D \bar{\lambda}_{DD} \tag{4}$$
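Eqs. (1) to (4) can be sketched in a few lines of Python. The function names and the numerical rates in the usage note are illustrative assumptions, not values from the paper; the geometric mean is computed through logarithms only to avoid underflow with very small rates.

```python
import math


def geometric_mean(rates):
    """Eqs. (1) and (3): geometric mean of the channel failure rates."""
    return math.exp(sum(math.log(r) for r in rates) / len(rates))


def independent_rates(rates, beta):
    """Eqs. (2) and (4): remove the common-cause share beta * geometric mean
    from the failure rate of each channel."""
    joint = geometric_mean(rates)
    return [r - beta * joint for r in rates]


# Hypothetical DU failure rates (per hour) of two diverse channels, beta = 0.1.
lam_du = [1.0e-6, 4.0e-6]
print(geometric_mean(lam_du))
print(independent_rates(lam_du, beta=0.1))
```

The same two functions apply to the DD rates with βD in place of β. When the channel rates differ by orders of magnitude, the subtraction can produce a negative independent rate for the most reliable channel, so the inputs deserve a sanity check in practice.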

According to the multi-β factor model, the PFD contributed by CCF of a MooN voting group is

$$PFD_{CCF} = C_{MooN} \left( \beta \bar{\lambda}_{DU} + \beta_D \bar{\lambda}_{DD} \right) \tag{5}$$

where CMooN is the configuration factor. Some values of CMooN recommended by the PDS method [23] for MooN architectures are given in Table 2. The authors of the PDS method also investigated field experience on CCFs, presenting and deriving generic values of β-factors for typical components in the oil and gas industry and demonstrating how failure data may be used to construct checklists for updating the value of β in operation, based on a review of some 12,000 maintenance notifications from six different onshore and offshore petroleum facilities [29].

Diversity leads to a reduction in the probability of failure due to CCF, because the β factor is reduced compared to the β factor for identical redundancy. The methodology discussed in Annex D of IEC 61508-6 [8] states that diversity is one of the most effective mechanisms to reduce the β factor. One reason for this is that CCFs of diverse channels are likely to be non-simultaneous, thus increasing the probability of the diagnostic tests detecting the failure.

Table 1. Typical architectures and their primary attributes.

| Architecture | Hardware FT | Safety FT | Availability FT | Objective |
|---|---|---|---|---|
| 1oo1 | 0 | 0 | 0 | Base Unit |
| 1oo2 | 1 | 1 | 0 | High Safety |
| 2oo2 | 0 | 0 | 1 | High Availability – avoid process trip |
| 1oo1D | 0 | 0 – failure not detected; 1 – failure detected | 0 | High Safety |
| 2oo3 | 1 | 1 | 1 | Safety and Availability |
| 2oo2D | 0 | 0 – failure not detected; 1 – failure detected | 1 | Safety and Availability – bias towards availability, avoid process trip |
| 1oo2D | 1 | 1 | 0 – failure not detected; 1 – failure detected | Safety and Availability – bias towards safety |

Fig. 2. A MooN voted group with diverse redundancy.


3.3. MUnT and PFDG for single channel

IEC 61508 and papers like [2,19] use mean down time (MDT) to represent the mean unavailable time of functional safety due to failures. Some researchers argue that MDT is the mean non-operational time due to repair action, i.e. MDT = 1/μ, where μ is the repair rate, and that it should be differentiated from the time during which a component is "down due to a dangerous undetected failure", because in the latter case a dangerous undetected failure has occurred and the component is still working. However, the component is not able to deal with a failure of the "system being under control". So, it is inappropriate to use MDT to represent the mean unavailable time of functional safety due to failures. Therefore, to avoid misunderstanding in the derivation of equations, this paper defines MUnT (Mean Unavailable Time of Functional Safety) and MAT (Mean Available Time of Functional Safety) to replace MDT (Mean Down Time) and MUT (Mean Up Time), and keeps MDT with the meaning of mean non-operational time due to repair action. In the authors' opinion, MUnT and MAT are more appropriate to represent the availability time of functional safety of SIS due to failures. Other notations and acronyms are the same as in IEC 61508 and other papers.

For generalization, assume that a channel consists of n components connected in series, with respective failure rates λ1, λ2, …, λn. PFDG for a single channel in the time interval [0, MUnT] is

$$PFD_G \approx \lambda \cdot MUnT \tag{6}$$

where

$$\lambda = \sum_{i=1}^{n} \lambda_i, \qquad MUnT = \frac{1}{\lambda} \sum_{i=1}^{n} \lambda_i\, MUnT_i \tag{7}$$

The detailed derivation of Eqs. (6) and (7) is provided in Appendix 1.1.

3.4. 2oo2 voting group

A 2oo2 voting group consists of two channels connected in parallel so that both channels need to demand the safety function before it can take place. A failure of either channel would result in the failure of the voting group, so the RBD of a 2oo2 voting group is two RBDs of 1oo1 voting groups connected in series, as shown in Fig. 3.

PFDG of 1oo1 voting group [20] is

PFD λ λ t λ t= ( + ) =G DU DD CE D CE

where,

tλλ

TMRT

λλ

MTTR= (2

+ ) +CEDU

D

DD

D

1

These two equations are proved by many literatures, such as [17].Therefore, according to the RBD as shown in Fig. 3, for 2oo2 voting

group,

PFD λ t λ t= +G D CE D CE,1 ,1 ,2 ,2 (8)

where,

tλλ

TMRT

λλ

MTTR= (2

+ ) +CEDU

D

DD

D,1

,1

,1

1 ,1

,1

tλλ

TMRT

λλ

MTTR= (2

+ ) +CEDU

D

DD

D,2

,2

,2

1 ,2

,2

3.5. 2oo2D voting group

It should be noted that all architectures actually have diagnostic channels. The diagnostic channels in a MooND architecture report the detected fault and change the output states or output voting, whereas the diagnostic channel in a MooN architecture only reports the detected fault and does not change any output states or output voting. A MooND architecture and a MooN architecture may therefore react differently when a fault is detected.

Similarly, the RBD of a 2oo2D voting group is two RBDs of 1oo1D voting groups connected in series, as shown in Fig. 4. DD failures are not considered here because any DD failure leads the process into a safe state; the same holds for a MooND voting group once it has no more fault tolerance.

For a 1oo1 voting group, λDD does not change any output states, so λDD still causes dangerous failures on demand, and the equivalent mean down time is

$$t_{CE} = \frac{\lambda_{DU}}{\lambda_D}\left(\frac{T_1}{2} + MRT\right) + \frac{\lambda_{DD}}{\lambda_D}\, MTTR$$

For a 1oo1D voting group, λDD changes the output states and results in a safe state, so λDD makes no contribution to t'CE and is omitted. The equivalent mean down time is

$$t'_{CE} = \frac{\lambda_{DU}}{\lambda_{DU}}\left(\frac{T_1}{2} + MRT\right)$$

The PFDG of a 1oo1D voting group [20] is

$$PFD_G = \lambda_{DU}\, t'_{CE}$$

Therefore, according to the RBD shown in Fig. 4, for a 2oo2D voting group,

$$PFD_G = \lambda_{DU,1}\, t'_{CE,1} + \lambda_{DU,2}\, t'_{CE,2} \tag{9}$$

where

$$t'_{CE,1} = \frac{\lambda_{DU,1}}{\lambda_{DU,1}}\left(\frac{T_1}{2} + MRT\right)$$

$$t'_{CE,2} = \frac{\lambda_{DU,2}}{\lambda_{DU,2}}\left(\frac{T_1}{2} + MRT\right)$$
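The difference between Eqs. (8) and (9) can be made concrete with a short sketch. It assumes the usual split λDU = (1 − DC)·λD and λDD = DC·λD, and uses the two-channel parameters listed with Table 4 (λD,1 = 6·10−7/h, λD,2 = 5·10−7/h, DC = 0.9, MTTR = MRT = 8 h) with a proof test interval of one year taken as 8760 h. The function names are hypothetical.

```python
def split_dangerous_rate(lambda_d, dc):
    """Split lambda_D into (lambda_DU, lambda_DD) using diagnostic coverage dc."""
    return (1.0 - dc) * lambda_d, dc * lambda_d


def pfd_2oo2(channels, t1, mrt, mttr):
    """Eq. (8): sum of lambda_D,i * t_CE,i = lambda_DU,i*(T1/2 + MRT) + lambda_DD,i*MTTR."""
    return sum(du * (t1 / 2 + mrt) + dd * mttr for du, dd in channels)


def pfd_2oo2d(channels, t1, mrt):
    """Eq. (9): DD failures lead to the safe state, so only the DU terms remain."""
    return sum(du * (t1 / 2 + mrt) for du, _ in channels)


channels = [split_dangerous_rate(6e-7, 0.9), split_dangerous_rate(5e-7, 0.9)]
p_2oo2 = pfd_2oo2(channels, t1=8760.0, mrt=8.0, mttr=8.0)
p_2oo2d = pfd_2oo2d(channels, t1=8760.0, mrt=8.0)
print(f"2oo2: {p_2oo2:.4e}   2oo2D: {p_2oo2d:.4e}")
```

Under these assumptions the two values are about 4.9060·10−4 and 4.8268·10−4, which is consistent with the "This paper" entries for 2oo2 and 2oo2D at T1 = 1 year in Table 4; the gap between them is exactly the DD contribution (λDD,1 + λDD,2)·MTTR.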

3.6. 1oo2 voting group

A 1oo2 voting group consists of two channels connected in parallel, so a dangerous failure must occur in both channels before the safety function fails on demand. The group can continue operating in a degraded mode when a dangerous failure causes one channel to fail.

From system start until one channel fails to execute functional safety, both channels are working. A failure of either channel degrades the system to a 1oo1 voting group; the RBD of the system in this stage is shown in Fig. 5-1. Based on Eqs. (6) and (7), replacing λ1, λ2 with λDU, λDD and excluding the CCF represented by the multi-β factor model, the probability of dangerous failure of any one channel in the voting group is

Table 2
Configuration factors, CMooN, for selected MooN voting groups.

| M/N | N = 2 | N = 3 | N = 4 | N = 5 |
|---|---|---|---|---|
| M = 1 | 1.00 | 0.50 | 0.30 | 0.20 |
| M = 2 | – | 2.00 | 1.10 | 0.80 |
| M = 3 | – | – | 2.80 | 1.60 |
| M = 4 | – | – | – | 3.60 |

Fig. 3. 2oo2 voted group.


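$$PFD_1 = C_2^1 \left(\lambda_{DU,1}^{ind} + \lambda_{DD,1}^{ind}\right) t_{CE} = 2\left(\lambda_{DU,1}^{ind} + \lambda_{DD,1}^{ind}\right) t_{CE} \tag{10}$$

where $\lambda_{DU,1}^{ind}$ and $\lambda_{DD,1}^{ind}$ are the independent DU failure rate and the independent DD failure rate of the first failed channel, respectively, and

$$t_{CE} = \frac{\lambda_{DU,1}^{ind}}{\lambda_{D,1}^{ind}}\, MUnT_{DU}^{CE} + \frac{\lambda_{DD,1}^{ind}}{\lambda_{D,1}^{ind}}\, MUnT_{DD}^{CE} = \frac{\lambda_{DU,1}^{ind}}{\lambda_{D,1}^{ind}}\left(\frac{T_1}{2} + MRT\right) + \frac{\lambda_{DD,1}^{ind}}{\lambda_{D,1}^{ind}}\, MTTR$$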

After degrading to a 1oo1 voting group, the 1oo2 voting group fails completely if any dangerous failure occurs in the remaining channel. The MAT of the remaining channel (from the time the system starts working until it is unavailable to execute functional safety) is therefore the voting group equivalent mean up time, and the MUnT of the remaining channel is the voting group equivalent mean down time tGE. The RBD in this stage is shown in Fig. 5-2. The probability of dangerous failure of this remaining channel in the voting group is

$$PFD_2 = \left(\lambda_{DU,2}^{ind} + \lambda_{DD,2}^{ind}\right) t_{GE} \tag{11}$$

and

$$t_{GE} = \frac{\lambda_{DU,2}^{ind}}{\lambda_{D,2}^{ind}}\, MUnT_{DU}^{GE} + \frac{\lambda_{DD,2}^{ind}}{\lambda_{D,2}^{ind}}\, MUnT_{DD}^{GE}$$

where $MUnT_{DU}^{GE} = T_1 - MAT_{DU}^{GE} + MRT$ and $MUnT_{DD}^{GE} = MTTR$.

Under the condition that one of the two channels has failed, the reliability of the component with failure rate $\lambda_{DU,2}^{ind}$ in the remaining channel is

$$R^{GE}(t) = 2\left(1 - e^{-\lambda_{DU,1}^{ind} t}\right) e^{-\lambda_{DU,2}^{ind} t}$$

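Considering $\lambda T_1 \ll 1$ and $e^{-\lambda T_1} \approx 1 - \lambda T_1 + (\lambda T_1)^2/2 - (\lambda T_1)^3/6$, then, similar to [17],

$$MAT_{DU}^{GE} = \frac{\int_0^{T_1} t\, \lambda_{DU,2}^{ind}\, 2\left(1 - e^{-\lambda_{DU,1}^{ind} t}\right) e^{-\lambda_{DU,2}^{ind} t}\, dt}{\int_0^{T_1} \lambda_{DU,2}^{ind}\, 2\left(1 - e^{-\lambda_{DU,1}^{ind} t}\right) e^{-\lambda_{DU,2}^{ind} t}\, dt} \approx \frac{\tfrac{2}{3}\, \lambda_{DU,1}^{ind} \lambda_{DU,2}^{ind}\, T_1^3}{\lambda_{DU,1}^{ind} \lambda_{DU,2}^{ind}\, T_1^2} = \frac{2}{3}\, T_1$$

So,

$$MUnT_{DU}^{GE} = \frac{T_1}{3} + MRT$$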

Here it can be seen that the MUnT of DU failures of a 1oo2 voting group with diverse redundancy can be assumed to be independent of the failure rates, and that it is the same as the MUnT of DU failures of a 1oo2 voting group with identical redundancy [8,20], i.e. MUnT can be assumed not to be influenced by diverse redundancy. The qualifier "can be assumed" is used because a Taylor series is applied and the higher-order terms, which are normally very small, are ignored. The MUnT of DU failures can also be determined with the Simplified Formulas in [19]; details of the proof are provided in Appendix 1.2.
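The size of the neglected higher-order terms can be checked numerically. The sketch below evaluates the ratio of the two integrals that define MAT for DU failures with a composite Simpson rule and compares it with 2T1/3. The failure rates (6·10−8/h and 5·10−8/h) and T1 = 8760 h are illustrative assumptions, and the function name is hypothetical.

```python
import math


def mat_du_1oo2(lam1, lam2, t1, steps=20000):
    """Ratio of the two MAT integrals, evaluated with the composite Simpson rule.

    lam1, lam2: independent DU rates of the first failed and remaining channel.
    steps must be even; the common factor h/3 cancels in the ratio.
    """
    def density(t):
        # lam2 * 2 * (1 - exp(-lam1*t)) * exp(-lam2*t); expm1 avoids cancellation
        return lam2 * 2.0 * (-math.expm1(-lam1 * t)) * math.exp(-lam2 * t)

    h = t1 / steps
    num = den = 0.0
    for k in range(steps + 1):
        t = k * h
        weight = 1 if k in (0, steps) else (4 if k % 2 else 2)
        den += weight * density(t)
        num += weight * t * density(t)
    return num / den


T1 = 8760.0
mat = mat_du_1oo2(6e-8, 5e-8, T1)
print(f"MAT = {mat:.2f} h, 2*T1/3 = {2 * T1 / 3:.2f} h, ratio = {mat / (2 * T1 / 3):.6f}")
```

For rates of this magnitude the numerical MAT falls only marginally below 2T1/3, which supports treating the MUnT of DU failures as T1/3 + MRT regardless of how different the two channel rates are.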

Therefore, the equivalent mean down time of the 1oo2 voting group is

$$t_{GE} = \frac{\lambda_{DU,2}^{ind}}{\lambda_{D,2}^{ind}}\left(\frac{T_1}{3} + MRT\right) + \frac{\lambda_{DD,2}^{ind}}{\lambda_{D,2}^{ind}}\, MTTR$$

Using the multi-β factor model, the probability of failure due to CCF is

$$PFD_{CCF} = C_{1oo2}\left(\beta \lambda_{DU}\left(\frac{T_1}{2} + MRT\right) + \beta_D \lambda_{DD}\, MTTR\right) \tag{12}$$

where λDU and λDD are the geometric means of λDU,i and λDD,i of the voting group, respectively, for i = 1, 2, …, N.
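A minimal sketch of Eq. (12) follows. It takes C1oo2 = 1.00 from Table 2 and, as an assumption, uses the two-channel parameters listed with Table 4 (β = 2%, βD = 1%, DC = 0.9, so λDU,i = 0.1·λD,i and λDD,i = 0.9·λD,i) with T1 = 8760 h. The helper names are hypothetical.

```python
import math

C_1OO2 = 1.0  # Table 2, M = 1, N = 2


def geometric_mean(values):
    """Geometric mean of positive failure rates."""
    return math.exp(sum(math.log(v) for v in values) / len(values))


def pfd_ccf(c_moon, beta, beta_d, l_du, l_dd, t1, mrt, mttr):
    """Eq. (12): CCF part of PFD_G with geometric-mean DU and DD rates."""
    return c_moon * (beta * geometric_mean(l_du) * (t1 / 2 + mrt)
                     + beta_d * geometric_mean(l_dd) * mttr)


ccf = pfd_ccf(C_1OO2, beta=0.02, beta_d=0.01,
              l_du=[6e-8, 5e-8], l_dd=[5.4e-7, 4.5e-7],
              t1=8760.0, mrt=8.0, mttr=8.0)
print(f"PFD_CCF = {ccf:.4e}")
```

Under these assumptions the result is about 4.846·10−6, in line with the CCF entry reported for the 1oo2 group at T1 = 1 year in Table 4.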

Combining the RBDs of the two operation stages above and considering CCF as shown in Fig. 5-3, the combined RBD of this voting group is shown in Fig. 5-4.

This system degradation based method is a dynamic SIL verification method, and each operational stage due to system degradation can be considered as a system state. Based on the consecutive states of the degradation process, the independent part of PFDG can be calculated as "PFD1·PFD2". Consequently, based on Eqs. (10), (11) and (12) and considering CCF, for the 1oo2 voting group,

$$\begin{aligned}
PFD_G &= PFD_1 \cdot PFD_2 + PFD_{CCF} \\
&= 2\left(\lambda_{DU,1}^{ind} + \lambda_{DD,1}^{ind}\right) t_{CE} \cdot \left(\lambda_{DU,2}^{ind} + \lambda_{DD,2}^{ind}\right) t_{GE} + C_{1oo2}\left(\beta \lambda_{DU}\left(\frac{T_1}{2} + MRT\right) + \beta_D \lambda_{DD}\, MTTR\right) \\
&= 2 \prod_{i=1}^{2} \left(\lambda_{DU,i}^{ind} + \lambda_{DD,i}^{ind}\right) t_{CE}\, t_{GE} + C_{1oo2}\left(\beta \lambda_{DU}\left(\frac{T_1}{2} + MRT\right) + \beta_D \lambda_{DD}\, MTTR\right)
\end{aligned} \tag{13}$$
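Eq. (13) can be evaluated with a few lines of code. This is a sketch under stated assumptions: the independent rates are taken as (1 − β)·λDU and (1 − βD)·λDD, which is a common convention but is not defined in this section, so the independent part computed here is not expected to reproduce the corresponding Table 4 entries exactly. The function name and channel tuples are hypothetical; the parameters are those listed with Table 4.

```python
import math


def pfd_1oo2(ch1, ch2, beta, beta_d, t1, mrt, mttr, c_1oo2=1.0):
    """Eq. (13). ch1 is the first failed channel, ch2 the remaining one.

    Each channel is a (lambda_DU, lambda_DD) pair; returns (independent, ccf).
    """
    def independent_rates(ch):
        # ASSUMPTION: (1 - beta) convention for the independent failure rates
        return (1.0 - beta) * ch[0], (1.0 - beta_d) * ch[1]

    (du1, dd1), (du2, dd2) = independent_rates(ch1), independent_rates(ch2)
    # (lambda_DU + lambda_DD) * t = lambda_DU * (T1/k + MRT) + lambda_DD * MTTR
    stage1 = 2.0 * (du1 * (t1 / 2 + mrt) + dd1 * mttr)   # Eq. (10)
    stage2 = du2 * (t1 / 3 + mrt) + dd2 * mttr           # Eq. (11)
    ccf = c_1oo2 * (beta * math.sqrt(ch1[0] * ch2[0]) * (t1 / 2 + mrt)
                    + beta_d * math.sqrt(ch1[1] * ch2[1]) * mttr)  # Eq. (12)
    return stage1 * stage2, ccf


ind, ccf = pfd_1oo2((6e-8, 5.4e-7), (5e-8, 4.5e-7),
                    beta=0.02, beta_d=0.01, t1=8760.0, mrt=8.0, mttr=8.0)
print(f"independent = {ind:.4e}, CCF = {ccf:.4e}, PFD_G = {ind + ccf:.4e}")
```

The sketch makes the structure of the result visible: the independent part is a product of two small stage probabilities and is roughly two orders of magnitude below the CCF part for these parameters.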

3.7. 1oo2D voting group

A 1oo2D voting group consists of two channels connected in parallel; the group can degrade and continue operating as a 1oo1D voting group if one channel is down.

From system start until one channel fails to execute functional safety, both channels are working. Any detected failure in either channel causes a safety shutdown of that channel, and the system then degrades and continues operating as a 1oo1D voting group, i.e. λDD and λSD both lead to a degradation of the system. As a result, λDD and λSD are both taken into account in the RBD of this stage, as shown in Fig. 6-1. Considering CCF, the probability of failure of any one channel in this architecture is

$$PFD_1 = C_2^1 \left(\lambda_{DU,1}^{ind} + \lambda_{DD,1}^{ind} + \lambda_{SD,1}\right) t'_{CE} = 2\left(\lambda_{DU,1}^{ind} + \lambda_{DD,1}^{ind} + \lambda_{SD,1}\right) t'_{CE} \tag{14}$$

where $\lambda_{SD,1} = \lambda_{S,1} \cdot DC$ (Diagnostic Coverage [8]).

Based on Eq. (6) and tCE for the 1oo1 voting group, here

$$\begin{aligned}
t'_{CE} &= \frac{\lambda_{DU,1}^{ind}}{\lambda_{D,1}^{ind} + \lambda_{SD,1}}\left(\frac{T_1}{2} + MRT\right) + \frac{\lambda_{DD,1}^{ind}}{\lambda_{D,1}^{ind} + \lambda_{SD,1}}\, MTTR + \frac{\lambda_{SD,1}}{\lambda_{D,1}^{ind} + \lambda_{SD,1}}\, MTTR \\
&= \frac{\lambda_{DU,1}^{ind}\left(\frac{T_1}{2} + MRT\right) + \left(\lambda_{DD,1}^{ind} + \lambda_{SD,1}\right) MTTR}{\lambda_{DU,1}^{ind} + \left(\lambda_{DD,1}^{ind} + \lambda_{SD,1}\right)}
\end{aligned}$$

After degrading to a 1oo1D voting group, the 1oo2D voting group fails completely if any dangerous undetected failure occurs in the remaining channel, so the MUnT of the remaining channel is the voting group equivalent mean down time t'GE. The RBD in this stage is like the RBD of a 1oo1D voting group, as shown in Fig. 6-2.

The probability of dangerous failure of the remaining channel is

$$PFD_2 = \lambda_{DU,2}^{ind}\, t'_{GE} \tag{15}$$

Fig. 4. 2oo2D voted group.

Fig. 5. 1oo2 voted group.


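Based on the calculation of tGE for the 1oo2 voting group, here

$$t'_{GE} = \frac{\lambda_{DU,2}^{ind}}{\lambda_{DU,2}^{ind}}\left(\frac{T_1}{3} + MRT\right)$$

Combining the RBDs of the two stages above and considering CCF as shown in Fig. 5-3, the combined RBD of the 1oo2D voting group is shown in Fig. 6-3.

Consequently, based on Eqs. (14) and (15) and considering CCF, for the 1oo2D voting group,

$$\begin{aligned}
PFD_G &= PFD_1 \cdot PFD_2 + PFD_{CCF} \\
&= 2\left(\lambda_{DU,1}^{ind} + \lambda_{DD,1}^{ind} + \lambda_{SD,1}\right) t'_{CE} \cdot \lambda_{DU,2}^{ind}\, t'_{GE} + C_{1oo2}\left(\beta \lambda_{DU}\left(\frac{T_1}{2} + MRT\right) + \beta_D \lambda_{DD}\, MTTR\right) \\
&= 2\, \lambda_{DU,2}^{ind} \cdot \left(\lambda_{DU,1}^{ind} + \lambda_{DD,1}^{ind} + \lambda_{SD,1}\right) t'_{CE}\, t'_{GE} + C_{1oo2}\left(\beta \lambda_{DU}\left(\frac{T_1}{2} + MRT\right) + \beta_D \lambda_{DD}\, MTTR\right)
\end{aligned} \tag{16}$$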

3.8. 2oo3 voting group

A 2oo3 voting group consists of three channels connected in parallel, so dangerous failures must occur in any two channels before the safety function fails on demand. The group can continue operating in a degraded mode when a dangerous failure causes one channel to fail.

All three channels are functioning after the system is started; the group then degrades to a 2oo2 voting group and continues operating after one channel has failed.

The RBDs of the two operation stages, shown in Figs. 7-1 and 7-2, are therefore just like those of the 1oo2 voting group. The only difference between the 2oo3 voting group and the 1oo2 voting group is the number of working channels in the two operation stages.

The combined RBD of the 2oo3 voting group is shown in Fig. 7-3. With the same analysis method as for the 1oo2 voting group, in the 2oo3 voting group,

$$PFD_1 = C_3^1 \left(\lambda_{DU,1}^{ind} + \lambda_{DD,1}^{ind}\right) t_{CE} = 3\left(\lambda_{DU,1}^{ind} + \lambda_{DD,1}^{ind}\right) t_{CE} \tag{17}$$

$$PFD_2 = C_2^1 \left(\lambda_{DU,2}^{ind} + \lambda_{DD,2}^{ind}\right) t_{GE} = 2\left(\lambda_{DU,2}^{ind} + \lambda_{DD,2}^{ind}\right) t_{GE} \tag{18}$$

Based on the calculation of tCE for the 1oo2 voting group, here

$$t_{CE} = \frac{\lambda_{DU,1}^{ind}}{\lambda_{D,1}^{ind}}\left(\frac{T_1}{2} + MRT\right) + \frac{\lambda_{DD,1}^{ind}}{\lambda_{D,1}^{ind}}\, MTTR$$

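According to the sub-RBD shown in Fig. 7-2, likewise,

$$t_{GE} = \frac{\lambda_{DU,2}^{ind}}{\lambda_{D,2}^{ind}}\, MUnT_{DU}^{GE} + \frac{\lambda_{DD,2}^{ind}}{\lambda_{D,2}^{ind}}\, MUnT_{DD}^{GE},$$

where $MUnT_{DU}^{GE} = T_1 - MAT_{DU}^{GE} + MRT$ and $MUnT_{DD}^{GE} = MTTR$.

Under the condition that one of the three channels has failed, the reliability of the component with failure rate $\lambda_{DU,2}^{ind}$ in the remaining two channels is

$$R^{GE}(t) = 3\left(1 - e^{-\lambda_{DU,1}^{ind} t}\right) e^{-\lambda_{DU,2}^{ind} t}$$

Considering $\lambda T_1 \ll 1$ and $e^{-\lambda T_1} \approx 1 - \lambda T_1 + (\lambda T_1)^2/2 - (\lambda T_1)^3/6$, then,

$$MAT_{DU}^{GE} = \frac{\int_0^{T_1} t\, \lambda_{DU,2}^{ind}\, 3\left(1 - e^{-\lambda_{DU,1}^{ind} t}\right) e^{-\lambda_{DU,2}^{ind} t}\, dt}{\int_0^{T_1} \lambda_{DU,2}^{ind}\, 3\left(1 - e^{-\lambda_{DU,1}^{ind} t}\right) e^{-\lambda_{DU,2}^{ind} t}\, dt} \approx \frac{2}{3}\, T_1$$

so

$$MUnT_{DU}^{GE} = \frac{T_1}{3} + MRT$$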

Here it can be seen that, since a Taylor series is used and its higher-order terms are ignored, the MUnT of DU failures of the 2oo3 voting group can be assumed not to be influenced by diverse redundancy, and it is the same as the MUnT of DU failures of a 2oo3 voting group with identical redundancy [8,20]. This can also be proved with the Simplified Formulas; details of the proof are provided in Appendix 1.3.

Therefore, the equivalent mean down time of the 2oo3 voting group is

$$t_{GE} = \frac{\lambda_{DU,2}^{ind}}{\lambda_{D,2}^{ind}}\left(\frac{T_1}{3} + MRT\right) + \frac{\lambda_{DD,2}^{ind}}{\lambda_{D,2}^{ind}}\, MTTR$$

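Consequently, based on Eqs. (17) and (18) and considering CCF, for the 2oo3 voting group,

$$\begin{aligned}
PFD_G &= PFD_1 \cdot PFD_2 + PFD_{CCF} \\
&= 3\left(\lambda_{DU,1}^{ind} + \lambda_{DD,1}^{ind}\right) t_{CE} \cdot 2\left(\lambda_{DU,2}^{ind} + \lambda_{DD,2}^{ind}\right) t_{GE} + C_{2oo3}\left(\beta \lambda_{DU}\left(\frac{T_1}{2} + MRT\right) + \beta_D \lambda_{DD}\, MTTR\right) \\
&= 6 \prod_{i=1}^{2} \left(\lambda_{DU,i}^{ind} + \lambda_{DD,i}^{ind}\right) t_{CE}\, t_{GE} + C_{2oo3}\left(\beta \lambda_{DU}\left(\frac{T_1}{2} + MRT\right) + \beta_D \lambda_{DD}\, MTTR\right)
\end{aligned} \tag{19}$$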

3.9. 2oo3D voting group

As in the 2oo3 voting group, all three channels of a 2oo3D voting group are functioning after the system starts to work, and the group can degrade to a 2oo2D voting group and continue operating after one channel has failed.

The RBDs of the two operation stages are shown in Figs. 8-1 and 8-2; they are like those of the 1oo2D voting group. The only difference between the 2oo3D voting group and the 1oo2D voting group is the number of working channels in the two operation stages.

The combined RBD of the 2oo3D voting group is shown in Fig. 8-3. With the same analysis method as for the 1oo2D voting group, in the 2oo3D voting group,

$$PFD_1 = C_3^1 \left(\lambda_{DU,1}^{ind} + \lambda_{DD,1}^{ind} + \lambda_{SD,1}\right) t'_{CE} = 3\left(\lambda_{DU,1}^{ind} + \lambda_{DD,1}^{ind} + \lambda_{SD,1}\right) t'_{CE} \tag{20}$$

Fig. 6. 1oo2D voted group.

Fig. 7. 2oo3 voted group.


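$$PFD_2 = C_2^1\, \lambda_{DU,2}^{ind}\, t'_{GE} = 2\, \lambda_{DU,2}^{ind}\, t'_{GE} \tag{21}$$

Based on the calculations of t'CE for the 1oo2D voting group and tGE for the 2oo3 voting group, here

$$t'_{CE} = \frac{\lambda_{DU,1}^{ind}\left(\frac{T_1}{2} + MRT\right) + \left(\lambda_{DD,1}^{ind} + \lambda_{SD,1}\right) MTTR}{\lambda_{DU,1}^{ind} + \left(\lambda_{DD,1}^{ind} + \lambda_{SD,1}\right)}$$

$$t'_{GE} = \frac{\lambda_{DU,2}^{ind}}{\lambda_{DU,2}^{ind}}\left(\frac{T_1}{3} + MRT\right)$$

Consequently, based on Eqs. (20) and (21) and considering CCF, for the 2oo3D voting group,

$$\begin{aligned}
PFD_G &= PFD_1 \cdot PFD_2 + PFD_{CCF} \\
&= 3\left(\lambda_{DU,1}^{ind} + \lambda_{DD,1}^{ind} + \lambda_{SD,1}\right) t'_{CE} \cdot 2\, \lambda_{DU,2}^{ind}\, t'_{GE} + C_{2oo3}\left(\beta \lambda_{DU}\left(\frac{T_1}{2} + MRT\right) + \beta_D \lambda_{DD}\, MTTR\right) \\
&= 6\, \lambda_{DU,2}^{ind} \left(\lambda_{DU,1}^{ind} + \lambda_{DD,1}^{ind} + \lambda_{SD,1}\right) t'_{CE}\, t'_{GE} + C_{2oo3}\left(\beta \lambda_{DU}\left(\frac{T_1}{2} + MRT\right) + \beta_D \lambda_{DD}\, MTTR\right)
\end{aligned} \tag{22}$$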

3.10. 1oo3 voting group

A 1oo3 voting group consists of three channels connected in parallel, so dangerous failures must occur in all three channels before the safety function fails on demand. The group can continue operating in a degraded mode when dangerous failures cause any two channels to fail.

A 1oo3 voting group can degrade twice and therefore has three operation stages; the RBDs of the three operation stages are shown in Fig. 9-1, Fig. 9-2 and Fig. 9-3. tG2E represents the MUnT of the last failed channel in the group, so tG2E is the equivalent mean down time of the voting group.

With the same analysis method as for the 1oo2 voting group, in the 1oo3 voting group,

$$PFD_1 = C_3^1 \left(\lambda_{DU,1}^{ind} + \lambda_{DD,1}^{ind}\right) t_{CE} = 3\left(\lambda_{DU,1}^{ind} + \lambda_{DD,1}^{ind}\right) t_{CE} \tag{23}$$

$$PFD_2 = C_2^1 \left(\lambda_{DU,2}^{ind} + \lambda_{DD,2}^{ind}\right) t_{GE} = 2\left(\lambda_{DU,2}^{ind} + \lambda_{DD,2}^{ind}\right) t_{GE} \tag{24}$$

Based on the calculation of tCE and tGE for the 2oo3 voting group, here

$$t_{CE} = \frac{\lambda_{DU,1}^{ind}}{\lambda_{D,1}^{ind}}\left(\frac{T_1}{2} + MRT\right) + \frac{\lambda_{DD,1}^{ind}}{\lambda_{D,1}^{ind}}\, MTTR$$

$$t_{GE} = \frac{\lambda_{DU,2}^{ind}}{\lambda_{D,2}^{ind}}\left(\frac{T_1}{3} + MRT\right) + \frac{\lambda_{DD,2}^{ind}}{\lambda_{D,2}^{ind}}\, MTTR$$

After degrading to a 1oo1 voting group and continuing to operate, the 1oo3 voting group fails completely if any dangerous failure occurs in the remaining channel, so the MUnT of the remaining channel is the voting group equivalent mean down time tG2E. The RBD in this stage is shown in Fig. 9-3.

The probability of dangerous failure of this last channel in the group is

$$PFD_3 = \left(\lambda_{DU,3}^{ind} + \lambda_{DD,3}^{ind}\right) t_{G2E} \tag{25}$$

According to the sub-RBD shown in Fig. 9-3,

$$t_{G2E} = \frac{\lambda_{DU,3}^{ind}}{\lambda_{D,3}^{ind}}\, MUnT_{DU}^{G2E} + \frac{\lambda_{DD,3}^{ind}}{\lambda_{D,3}^{ind}}\, MUnT_{DD}^{G2E}$$

where $MUnT_{DU}^{G2E} = T_1 - MAT_{DU}^{G2E} + MRT$ and $MUnT_{DD}^{G2E} = MTTR$.

Under the condition that two of the three channels have failed, the reliability of the component with failure rate $\lambda_{DU,3}^{ind}$ in the remaining channel is

$$R^{G2E}(t) = 3\left(1 - e^{-\lambda_{DU,1}^{ind} t}\right)\left(1 - e^{-\lambda_{DU,2}^{ind} t}\right) e^{-\lambda_{DU,3}^{ind} t}$$

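Considering $\lambda T_1 \ll 1$ and $e^{-\lambda T_1} \approx 1 - \lambda T_1 + (\lambda T_1)^2/2 - (\lambda T_1)^3/6$, then,

$$MAT_{DU}^{G2E} = \frac{\int_0^{T_1} t\, \lambda_{DU,3}^{ind}\, 3\left(1 - e^{-\lambda_{DU,1}^{ind} t}\right)\left(1 - e^{-\lambda_{DU,2}^{ind} t}\right) e^{-\lambda_{DU,3}^{ind} t}\, dt}{\int_0^{T_1} \lambda_{DU,3}^{ind}\, 3\left(1 - e^{-\lambda_{DU,1}^{ind} t}\right)\left(1 - e^{-\lambda_{DU,2}^{ind} t}\right) e^{-\lambda_{DU,3}^{ind} t}\, dt} \approx \frac{\tfrac{3}{4}\, \lambda_{DU,1}^{ind} \lambda_{DU,2}^{ind} \lambda_{DU,3}^{ind}\, T_1^4}{\lambda_{DU,1}^{ind} \lambda_{DU,2}^{ind} \lambda_{DU,3}^{ind}\, T_1^3} = \frac{3}{4}\, T_1$$

so

$$MUnT_{DU}^{G2E} = \frac{T_1}{4} + MRT$$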

Here it can be seen that, since a Taylor series is used and its higher-order terms are ignored, the MUnT of DU failures of the 1oo3 voting group can be assumed not to be influenced by diverse redundancy, and it is the same as the MUnT of DU failures of a 1oo3 voting group with identical redundancy [8,20]. This can also be proved with the Simplified Formulas; details of the proof are provided in Appendix 1.4.

Therefore, the equivalent mean down time of the 1oo3 voting group is

$$t_{G2E} = \frac{\lambda_{DU,3}^{ind}}{\lambda_{D,3}^{ind}}\left(\frac{T_1}{4} + MRT\right) + \frac{\lambda_{DD,3}^{ind}}{\lambda_{D,3}^{ind}}\, MTTR$$

Consequently, based on Eqs. (23), (24) and (25) and considering CCF, for the 1oo3 voting group,

$$\begin{aligned}
PFD_G &= PFD_1 \cdot PFD_2 \cdot PFD_3 + PFD_{CCF} \\
&= 6 \prod_{i=1}^{3} \left(\lambda_{DU,i}^{ind} + \lambda_{DD,i}^{ind}\right) t_{CE}\, t_{GE}\, t_{G2E} + C_{1oo3}\left(\beta \lambda_{DU}\left(\frac{T_1}{2} + MRT\right) + \beta_D \lambda_{DD}\, MTTR\right)
\end{aligned} \tag{26}$$

From the system degradation point of view, the combined RBDs given in this paper are not genuine RBDs in the strict sense. They are combinations of the RBDs of all consecutive stages of system operation, including the degraded operation stages. In those RBDs, channels labeled with different MUnTs (tCE, tGE and tG2E) represent different operation stages of the voting group.

3.11. MooN(D) voting group

Consider a MooN(D) voting group of N independent and non-identical channels. The DU failure rate of channel j is λDU,j and the channels are proof-tested at the same time with test interval T1. The proof testing is assumed to be perfect, such that all DU faults are revealed by the test.

For a MooN voting group, based on the system degradation method, from the time the group is started until it fails on demand, the group has (N − M + 1) operation stages,

$$PFD_1 = C_N^1 \left(\lambda_{DU,1}^{ind} + \lambda_{DD,1}^{ind}\right) t_{CE} \tag{27}$$

Fig. 8. 2oo3D voted group.


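$$PFD_2 = C_{N-1}^1 \left(\lambda_{DU,2}^{ind} + \lambda_{DD,2}^{ind}\right) t_{GE} \tag{28}$$

$$PFD_i = C_{N-i+1}^1 \left(\lambda_{DU,i}^{ind} + \lambda_{DD,i}^{ind}\right) t_{G(i-1)E} \tag{29}$$

$$PFD_{N-M+1} = C_M^1 \left(\lambda_{DU,N-M+1}^{ind} + \lambda_{DD,N-M+1}^{ind}\right) t_{G(N-M)E} \tag{30}$$

where,

$$t_{CE} = \frac{\lambda_{DU,1}^{ind}}{\lambda_{D,1}^{ind}}\left(\frac{T_1}{2} + MRT\right) + \frac{\lambda_{DD,1}^{ind}}{\lambda_{D,1}^{ind}}\, MTTR$$

$$t_{GE} = \frac{\lambda_{DU,2}^{ind}}{\lambda_{D,2}^{ind}}\left(\frac{T_1}{3} + MRT\right) + \frac{\lambda_{DD,2}^{ind}}{\lambda_{D,2}^{ind}}\, MTTR$$

$$\begin{aligned}
t_{G(i-1)E} &= \frac{\lambda_{DU,i}^{ind}}{\lambda_{D,i}^{ind}}\left(\frac{T_1}{i - 1 + 2} + MRT\right) + \frac{\lambda_{DD,i}^{ind}}{\lambda_{D,i}^{ind}}\, MTTR \\
&= \frac{\lambda_{DU,i}^{ind}}{\lambda_{D,i}^{ind}}\left(\frac{T_1}{i + 1} + MRT\right) + \frac{\lambda_{DD,i}^{ind}}{\lambda_{D,i}^{ind}}\, MTTR
\end{aligned}$$

$$t_{G(N-M)E} = \frac{\lambda_{DU,N-M+1}^{ind}}{\lambda_{D,N-M+1}^{ind}}\left(\frac{T_1}{N - M + 2} + MRT\right) + \frac{\lambda_{DD,N-M+1}^{ind}}{\lambda_{D,N-M+1}^{ind}}\, MTTR$$

Detailed derivations of $t_{G(i-1)E}$ and $t_{G(N-M)E}$ are provided in Appendix 1.5.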
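The pattern behind the $T_1/(i+1)$ term can be outlined with the same first-order approximation used for the 1oo2, 2oo3 and 1oo3 groups. The following is an illustrative sketch only, not the derivation given in Appendix 1.5; it assumes $\lambda T_1 \ll 1$, so that $1 - e^{-\lambda t} \approx \lambda t$ and $e^{-\lambda t} \approx 1$, in which case the rate products and the combinatorial constant cancel between numerator and denominator.

```latex
% Stage i: channels 1 .. i-1 have already failed (DU), channel i fails next.
\[
MAT_{DU}^{G(i-1)E}
\approx \frac{\int_0^{T_1} t \cdot t^{\,i-1}\, dt}{\int_0^{T_1} t^{\,i-1}\, dt}
= \frac{T_1^{\,i+1}/(i+1)}{T_1^{\,i}/i}
= \frac{i}{i+1}\, T_1 ,
\]
\[
MUnT_{DU}^{G(i-1)E} = T_1 - MAT_{DU}^{G(i-1)E} + MRT \approx \frac{T_1}{i+1} + MRT .
\]
```

Setting i = 1, 2 and 3 recovers the terms $T_1/2$, $T_1/3$ and $T_1/4$ that appear in tCE, tGE and tG2E.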

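Consequently, based on Eqs. (27)–(30) and considering CCF, for the MooN voting group,

$$\begin{aligned}
PFD_G &= PFD_1 \cdot PFD_2 \cdots PFD_{N-M+1} + PFD_{CCF} \\
&= \prod_{i=1}^{N-M+1} C_{N-i+1}^1 \left(\lambda_{DU,i}^{ind} + \lambda_{DD,i}^{ind}\right) t_{G(i-1)E} + C_{MooN}\left(\beta \lambda_{DU}\left(\frac{T_1}{2} + MRT\right) + \beta_D \lambda_{DD}\, MTTR\right) \\
&= \left[\prod_{i=1}^{N-M+1} C_{N-i+1}^1 \left(\lambda_{DU,i}^{ind} + \lambda_{DD,i}^{ind}\right)\right] t_{CE}\, t_{GE}\, t_{G2E} \cdots t_{G(N-M)E} + C_{MooN}\left(\beta \lambda_{DU}\left(\frac{T_1}{2} + MRT\right) + \beta_D \lambda_{DD}\, MTTR\right)
\end{aligned} \tag{31}$$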
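The general formula lends itself to a short loop over the degradation stages. The sketch below is one possible reading of Eq. (31) under stated assumptions: channels are supplied in the order in which they are assumed to fail, the independent rates follow the (1 − β) convention (not defined in this section), and the configuration factors are those of Table 2. All names are hypothetical.

```python
import math

# Configuration factors C_MooN from Table 2, keyed by (M, N).
C_MOON = {(1, 2): 1.00, (1, 3): 0.50, (2, 3): 2.00,
          (1, 4): 0.30, (2, 4): 1.10, (3, 4): 2.80,
          (1, 5): 0.20, (2, 5): 0.80, (3, 5): 1.60, (4, 5): 3.60}


def pfd_moon(m, channels, beta, beta_d, t1, mrt, mttr):
    """Eq. (31). channels: list of (lambda_DU, lambda_DD), first failed first."""
    n = len(channels)
    independent = 1.0
    for i in range(1, n - m + 2):                         # stages 1 .. N-M+1
        l_du, l_dd = channels[i - 1]
        du, dd = (1 - beta) * l_du, (1 - beta_d) * l_dd   # ASSUMPTION: independent rates
        # C(N-i+1, 1) * (lambda_DU + lambda_DD) * t_G(i-1)E, with
        # (lambda_DU + lambda_DD) * t_G(i-1)E = lambda_DU*(T1/(i+1) + MRT) + lambda_DD*MTTR
        independent *= (n - i + 1) * (du * (t1 / (i + 1) + mrt) + dd * mttr)

    def gmean(values):
        return math.exp(sum(math.log(v) for v in values) / len(values))

    ccf = C_MOON[(m, n)] * (beta * gmean([c[0] for c in channels]) * (t1 / 2 + mrt)
                            + beta_d * gmean([c[1] for c in channels]) * mttr)
    return independent + ccf


chans = [(6e-8, 5.4e-7), (5e-8, 4.5e-7), (4e-8, 3.6e-7)]  # parameters listed with Table 4
p_1oo2 = pfd_moon(1, chans[:2], 0.02, 0.01, 8760.0, 8.0, 8.0)
p_1oo3 = pfd_moon(1, chans, 0.02, 0.01, 8760.0, 8.0, 8.0)
print(f"1oo2: {p_1oo2:.4e}   1oo3: {p_1oo3:.4e}")
```

For N = 2 and M = 1 the loop reduces to Eq. (13), which is a convenient way to check an implementation before applying it to larger groups.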

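Similarly, for a MooND voting group,

$$PFD_1 = C_N^1 \left(\lambda_{DU,1}^{ind} + \lambda_{DD,1}^{ind} + \lambda_{SD,1}\right) t'_{CE} \tag{32}$$

$$PFD_2 = C_{N-1}^1 \left(\lambda_{DU,2}^{ind} + \lambda_{DD,2}^{ind} + \lambda_{SD,2}\right) t'_{GE} \tag{33}$$

$$PFD_i = C_{N-i+1}^1 \left(\lambda_{DU,i}^{ind} + \lambda_{DD,i}^{ind} + \lambda_{SD,i}\right) t'_{G(i-1)E} \tag{34}$$

$$PFD_{N-M+1} = C_M^1\, \lambda_{DU,N-M+1}^{ind}\, t'_{G(N-M)E} \tag{35}$$

where,

$$t'_{CE} = \frac{\lambda_{DU,1}^{ind}\left(\frac{T_1}{2} + MRT\right) + \left(\lambda_{DD,1}^{ind} + \lambda_{SD,1}\right) MTTR}{\lambda_{DU,1}^{ind} + \left(\lambda_{DD,1}^{ind} + \lambda_{SD,1}\right)}$$

$$t'_{GE} = \frac{\lambda_{DU,2}^{ind}\left(\frac{T_1}{3} + MRT\right) + \left(\lambda_{DD,2}^{ind} + \lambda_{SD,2}\right) MTTR}{\lambda_{DU,2}^{ind} + \left(\lambda_{DD,2}^{ind} + \lambda_{SD,2}\right)}$$

$$t'_{G(i-1)E} = \frac{\lambda_{DU,i}^{ind}\left(\frac{T_1}{i + 1} + MRT\right) + \left(\lambda_{DD,i}^{ind} + \lambda_{SD,i}\right) MTTR}{\lambda_{DU,i}^{ind} + \left(\lambda_{DD,i}^{ind} + \lambda_{SD,i}\right)}$$

$$t'_{G(N-M)E} = \frac{\lambda_{DU,N-M+1}^{ind}}{\lambda_{DU,N-M+1}^{ind}}\left(\frac{T_1}{N - M + 2} + MRT\right)$$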

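Consequently, based on Eqs. (32)–(35) and considering CCF, for the MooND voting group,

$$\begin{aligned}
PFD_G &= PFD_1 \cdot PFD_2 \cdots PFD_{N-M+1} + PFD_{CCF} \\
&= \prod_{i=1}^{N-M} C_{N-i+1}^1 \left(\lambda_{DU,i}^{ind} + \lambda_{DD,i}^{ind} + \lambda_{SD,i}\right) t'_{G(i-1)E} \cdot C_M^1\, \lambda_{DU,N-M+1}^{ind}\, t'_{G(N-M)E} \\
&\quad + C_{MooN}\left(\beta \lambda_{DU}\left(\frac{T_1}{2} + MRT\right) + \beta_D \lambda_{DD}\, MTTR\right) \\
&= \left[\prod_{i=1}^{N-M} C_{N-i+1}^1 \left(\lambda_{DU,i}^{ind} + \lambda_{DD,i}^{ind} + \lambda_{SD,i}\right)\right] t'_{CE}\, t'_{GE}\, t'_{G2E} \cdots t'_{G(N-M-1)E} \cdot C_M^1\, \lambda_{DU,N-M+1}^{ind}\, t'_{G(N-M)E} \\
&\quad + C_{MooN}\left(\beta \lambda_{DU}\left(\frac{T_1}{2} + MRT\right) + \beta_D \lambda_{DD}\, MTTR\right)
\end{aligned} \tag{36}$$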
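The independent part of Eq. (36) differs from the MooN case in two ways: the stages before the last one also count λDD and λSD as causes of degradation, and the last stage counts only DU failures. A minimal sketch of that structure is given below, assuming the (1 − β) convention for independent rates and, purely for illustration, λS = λD so that λSD = DC·λD. The function name is hypothetical, and the CCF part is unchanged from Eq. (12).

```python
def pfd_moond_independent(m, channels, beta, beta_d, t1, mrt, mttr):
    """Independent part of Eq. (36).

    channels: list of (lambda_DU, lambda_DD, lambda_SD), first failed first.
    """
    n = len(channels)
    last_stage = n - m + 1
    result = 1.0
    for i in range(1, last_stage + 1):
        l_du, l_dd, l_sd = channels[i - 1]
        du, dd = (1 - beta) * l_du, (1 - beta_d) * l_dd   # ASSUMPTION: independent rates
        du_window = t1 / (i + 1) + mrt                    # MUnT of DU failures in stage i
        if i < last_stage:
            # Eq. (34): rate * t' = lambda_DU*window + (lambda_DD + lambda_SD)*MTTR
            term = du * du_window + (dd + l_sd) * mttr
        else:
            # Eq. (35): only DU failures defeat the last remaining stage
            term = du * du_window
        result *= (n - i + 1) * term                      # C(N-i+1, 1) = N-i+1
    return result


# 1oo2D with the two-channel parameters of Table 4 and lambda_SD = 0.9 * lambda_D (assumed).
chans_d = [(6e-8, 5.4e-7, 5.4e-7), (5e-8, 4.5e-7, 4.5e-7)]
ind_1oo2d = pfd_moond_independent(1, chans_d, 0.02, 0.01, 8760.0, 8.0, 8.0)
print(f"independent part, 1oo2D: {ind_1oo2d:.4e}")
```

For N = 2 and M = 1 the loop reproduces the structure of Eq. (16): a first stage with the factor 2 and the combined DU, DD and SD rate, followed by a single DU-only stage.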

4. Results and discussion

4.1. Results

From the system degradation point of view, Section 3 presents the RBDs of each operation stage and gives a combined RBD for several classical voting groups with diverse redundancy. In addition, based on the degradation processes, the PFDG of these voting groups is calculated, and two sets of general formulae to assess the PFDG of any MooN(D) voting group with diverse redundancy are given.

To validate the credibility of the proposed method and the derived equations, a demonstration is given below to illustrate how the method can be applied to industrial systems. The numerical results are compared with the FTA method [32] and FRANTIC [26,27], and sensitivity studies of the effect of failure rate and proof test interval on PFDG are given. A general comparison of numerical results among this method, FTA and FRANTIC then follows to demonstrate the correctness of the method.

Fig. 9. 1oo3 voted group.


4.1.1. Demonstration on safety shutdown system of NPCTF

This demonstration is performed on the safety shutdown system of NPCTF. There are two shutdown systems on NPCTF: SDS1 and SDS2. Either of them can trip NPCTF. For the high-temperature trip safety function, the architecture of the shutdown system is shown in Fig. 10.

The sensor subsystem includes three identical temperature sensors, which constitute a 2oo3 voting group. The logic subsystem includes an analog input module, a controller and a digital output module, which form a 1oo1 configuration. The actuator subsystem is a 1oo2 configuration composed of two diverse shutdown channels; either of the two shutdown channels can trip the heater process on demand.

Because the available reliability data are incomplete, the failure rates of the components used in this demonstration case come from the databases [30,31], and the failure rate of the logic subsystem comes from an example in IEC 61508-6 [8].

Strictly, the overall failure rate of every subsystem channel should be determined by FMEDA. However, since this demonstration is given for illustration purposes, it is assumed that the overall failure rate of each shutdown channel of the actuator subsystem is the sum of the failure rates of the components composing that shutdown channel. In addition, for simplicity and convenience, two other assumptions commonly used in IEC 61508-6 are made here: one is that the dangerous failure rate is equal to the safe failure rate for all subsystem channels, i.e. λD = λS = 50% * λ; the other is that, for the CCF of the sensor subsystem, β = 10% and βD = 5%. For the case study, a proof test period of one year is assumed.

Since the sensors and actuators have no self-diagnostic function, the diagnostic coverage (DC) of the sensors and actuators is 0. Since the two actuator channels are independent and completely different, the actuator subsystem has diverse redundancy and there is no CCF. Therefore, the architecture of the shutdown system on NPCTF can be simplified as shown in Fig. 11, with reliability data.

Numerical results of the case study are given in Table 3. As can be seen from Table 3, this system degradation based method gives results very close to those of FTA and FRANTIC. Compared with FTA and FRANTIC, the results are quite close, but this paper tends to give slightly conservative values. This is explained by assumption viii that the first failed channel has the highest failure rate and the last failed channel has the lowest failure rate, so the PFDG results obtained with the proposed method are conservative.

Detailed procedures of the three methods for the case study are given in Appendices 2, 3 and 4 to provide sufficient information. For more information about FTA and FRANTIC, interested readers can refer to [26,27,32].
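The closeness of the three methods can be quantified directly from the values reported in Table 3. The short sketch below assumes that the overall PFDG is the sum of the sensor, logic and actuator contributions (the reported totals are consistent with this) and computes the relative deviation of this paper's total from the FTA and FRANTIC totals. The variable names are hypothetical.

```python
# Values from Table 3: (sensor, logic, actuator, reported overall PFD_G).
table3 = {
    "FTA":        (1.9673e-4, 2.5880e-4, 2.4991e-4, 7.0543e-4),
    "FRANTIC":    (1.9651e-4, 2.5859e-4, 2.4612e-4, 7.0122e-4),
    "This paper": (1.9699e-4, 2.5900e-4, 2.5059e-4, 7.0658e-4),
}

for method, (sensor, logic, actuator, total) in table3.items():
    print(f"{method:10s} sum = {sensor + logic + actuator:.4e}  reported = {total:.4e}")

reference = table3["This paper"][3]
deviation = {method: (reference - values[3]) / values[3]
             for method, values in table3.items() if method != "This paper"}
for method, dev in deviation.items():
    print(f"This paper vs {method}: {100 * dev:+.2f} %")
```

Both deviations are positive and below one percent, which matches the observation that the proposed method is close to FTA and FRANTIC while being slightly conservative.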

4.1.2. Sensitivity study

To give more concrete evidence of the credibility of this method, sensitivity studies of failure rate and proof test interval on the NPCTF shutdown system are performed. The sensitivity study of failure rate is performed with λT2-1 varying from 10−7/h to 10−5/h, and the sensitivity study of proof test interval is performed with the test interval of sensor T2-1 varying from 1 to 24 months; the other parameters are kept the same as in the demonstration case above.

The sensitivity study of λT2-1 is shown in Fig. 12, which shows very good consistency among the proposed method, FTA and the FRANTIC model.

The sensitivity study of the test interval of sensor T2-1 is shown in Fig. 13, which also shows very good consistency among the proposed method, FTA and the FRANTIC model.

4.1.3. General comparison of numerical results

A general comparison of numerical results among this method, FTA and FRANTIC is given below to demonstrate the correctness of the method. The CCF part and the independent part are reported separately for this comparison.

The architectures used in the comparison are 2oo2, 2oo2D, 1oo2, 1oo2D, 2oo3, 2oo3D and 1oo3, which are explicitly mentioned in IEC 61508-6 [8] and IEC 61131-6 [33]. Three proof test intervals are used for each architecture: 1 year, 2 years and 5 years. Different failure rates are assumed for the redundant channels. The PFDG results for the various architectures are given in Table 4. For 2oo2 and 2oo2D voting, both channels need to demand the safety function at all times, so there is no need to consider CCF separately.

As can be seen from Table 4, the numerical values obtained with all three methods are close to each other for all architectures and proof test intervals. In comparison with FTA and FRANTIC, although the values are quite close, the proposed method tends to give slightly higher values for the independent part of PFDG and, similar to [2], FRANTIC tends to give lower values. This is explained by assumption viii that the first failed channel has the highest failure rate and the last failed channel has the lowest failure rate, so the PFDG results obtained with the proposed method are more conservative. However, since CCF contributes much more to PFDG, the discrepancies in the independent parts can be considered insignificant.

4.2. Discussion

SIL verification is essentially reliability prediction, and there is no way to know the real probability value of reliability. Each method for reliability prediction uses some kind of simplification and approximation, and uncertainty comes from many aspects, including the basic failure rates of electronic components. In the authors' opinion, the accuracy of a method should be evaluated from two sides: one is whether the numerical results show some credibility, and the other is how much of the system's safety-related behavior the method can cover. The Markov model is known as a more accurate method because it can cover most aspects of the system's safety-related behavior, except uncertainty and sensitivity analysis.

Fig. 10. Architecture of safety shutdown system on NPCTF.

Compared to SIL verification of systems using identical redundancy, research on SIL verification of systems using diverse redundancy is more universal, more difficult and more complicated. A method for SIL verification of identical redundancy cannot necessarily be used for diverse redundancy, or can be used for diverse redundancy only with much more complexity, as with FTA and Markov analysis.

Compared to traditional reliability analysis methods such as FTA and Markov analysis for SIL verification of diverse redundancy, this system degradation based method can be easily used in engineering practice and could make the SIL verification process simpler from the engineering point of view. Compared to the Simplified Formulas given for diverse channels in reference [19], this system degradation based method considers more elements and details: the λDD failure rate, diagnostic coverage, the effect of CCF, the restoration time of a faulty component and the dynamic degradation processes. This method provides more detail in reliability analysis and, in the authors' opinion, is more accurate than the Simplified Formulas method given in reference [19].

Fig. 11. Simplified architecture of with reliability data.

Table 3
Comparison of numerical values among FTA, FRANTIC and this paper.

| Method | FTA (Fault Tree Analysis) | FRANTIC | This paper |
|---|---|---|---|
| PFDG (all) | 7.0543×10−4 | 7.0122×10−4 | 7.0658×10−4 |
| Sensor | 1.9673×10−4 | 1.9651×10−4 | 1.9699×10−4 |
| Sensor, Ind | 2.0000×10−5 | 1.9941×10−5 | 2.0100×10−5 |
| Sensor, CCF | 1.7673×10−4 | 1.7657×10−4 | 1.7689×10−4 |
| Logic | 2.5880×10−4 | 2.5859×10−4 | 2.5900×10−4 |
| Actuator | 2.4991×10−4 | 2.4612×10−4 | 2.5059×10−4 |

Note: "Ind" and "CCF" represent the independent part and the CCF part of PFD for the sensor subsystem.

Fig. 12. Sensitivity studies of failure rate on shutdown system.

Fig. 13. Sensitivity studies of proof test interval on shutdown system.

So far, this paper does not consider repair rates as a parameter, nor staggered testing patterns. Another limitation is that components are assumed to have constant failure rates. Some components have increasing failure rates due to aging, and the modeling of time-variant failure rates for SIL verification is worth further study. In addition, although CCF is quantified by using the multi-β factor model to treat the multiplicity of CCF, the β factor for the 1oo2 voting group is still determined by using the checklist in IEC 61508-6. While there is clear evidence that diverse redundancy can reduce CCF, it is very difficult to quantify the effect of diverse redundancy on CCF, i.e. how much can CCF be reduced by using diverse redundancy? This question is worth further study.

5. Conclusions

IEC 61508 requires SIL verification for SRS. This study is a further development of previous work by the authors, with the aim of developing a SIL verification method for SRS with diverse redundancy in harmony with the IEC 61508 requirements. Since most existing research focuses on systems with identical redundancy, while diverse redundancy is widely used in engineering practice to minimize CCF, this paper uses the authors' system degradation based method to verify the SIL of systems with diverse redundancy.

In this system degradation based method for SIL verification of diverse redundancy, the CCF part of PFDG is existing knowledge from the PDS method produced by SINTEF, and the new contribution is the quantification method for the independent part of PFDG. The system degradation based method has been applied to the safety shutdown system of the Nuclear Power Control Test Facility. Sensitivity studies and a general comparison of numerical results indicate that the method has very good consistency with FTA and the FRANTIC model, which supports the correctness of the proposed method. The results also show that this method can be easily used in engineering practice to deal with SIL verification of diverse redundancy with reasonable accuracy.

In addition, this paper gives two sets of general formulae for SIL verification of any MooN(D) system with independent and non-identical channels, i.e. diverse redundancy. The two sets of general formulae can also be used for identical redundancy.

Acknowledgments

This research was carried out while the first author was a visiting graduate student at Western University, supported by the China Scholarship Council. This paper is also a result of work financially supported by the 863 Project (Project number: 2013AA040304) of China.

The authors would like to thank the reviewers for the comments and suggestions that contributed to a significant improvement of this paper.

Table 4. Comparison of numerical results among this method, FTA and FRANTIC.

Parameters: $\lambda_{D,1} = 6 \times 10^{-7}$/hour, $\lambda_{D,2} = 5 \times 10^{-7}$/hour, $\lambda_{D,3} = 4 \times 10^{-7}$/hour, $DC_1 = DC_2 = DC_3 = 0.9$, $\beta = 2\%$, $\beta_D = 1\%$, MTTR = 8 (hour), MRT = 8 (hour), $q_o = 1.0$, $p_f = 0.0$, $p = 0.0$, $\tau = 1.5$ (hour), $T_R$ = MTTR [26]

| Voting | T1 | This paper: PFDG | This paper: Independent | This paper: CCF | FTA: PFDG | FTA: Independent | FTA: CCF | FRANTIC: PFDG | FRANTIC: Independent | FRANTIC: CCF |
|---|---|---|---|---|---|---|---|---|---|---|
| 2oo2 | 1 | 4.9060e-004 | | | 4.9016e-004 | | | 4.8972e-004 | | |
| 2oo2 | 2 | 9.7240e-004 | | | 9.7196e-004 | | | 9.7152e-004 | | |
| 2oo2 | 5 | 2.4178e-003 | | | 2.4174e-003 | | | 2.4169e-003 | | |
| 2oo2D | 1 | 4.8268e-004 | | | 4.8224e-004 | | | 4.8180e-004 | | |
| 2oo2D | 2 | 9.6448e-004 | | | 9.6404e-004 | | | 9.6360e-004 | | |
| 2oo2D | 5 | 2.4099e-003 | | | 2.4094e-003 | | | 2.4090e-003 | | |
| 1oo2 | 1 | 4.9248e-006 | 7.8516e-008 | 4.8462e-006 | 4.9175e-006 | 7.5673e-008 | 4.8419e-006 | 4.9130e-006 | 7.5536e-008 | 4.8375e-006 |
| 1oo2 | 2 | 9.9514e-006 | 3.0710e-007 | 9.6443e-006 | 9.9386e-006 | 2.9870e-007 | 9.6399e-006 | 9.9340e-006 | 2.9843e-007 | 9.6355e-006 |
| 1oo2 | 5 | 2.5932e-005 | 1.8935e-006 | 2.4038e-005 | 2.5886e-005 | 1.8520e-006 | 2.4034e-005 | 2.5881e-005 | 1.8513e-006 | 2.4030e-005 |
| 1oo2D | 1 | 4.9237e-006 | 7.7462e-008 | 4.8462e-006 | 4.9194e-006 | 7.7575e-008 | 4.8419e-006 | 4.9130e-006 | 7.5519e-008 | 4.8375e-006 |
| 1oo2D | 2 | 9.9485e-006 | 3.0424e-007 | 9.6443e-006 | 9.9424e-006 | 3.0246e-007 | 9.6399e-006 | 9.9339e-006 | 2.9841e-007 | 9.6355e-006 |
| 1oo2D | 5 | 2.5919e-005 | 1.8805e-006 | 2.4038e-005 | 2.5895e-005 | 1.8613e-006 | 2.4034e-005 | 2.5881e-005 | 1.8513e-006 | 2.4030e-005 |
| 2oo3 | 1 | 6.7824e-006 | 2.3607e-007 | 6.5463e-006 | 6.7270e-006 | 1.8665e-007 | 6.5404e-006 | 6.7208e-006 | 1.8636e-007 | 6.5345e-006 |
| 2oo3 | 2 | 1.3951e-005 | 9.2334e-007 | 1.3028e-005 | 1.3758e-005 | 7.3662e-007 | 1.3022e-005 | 1.3752e-005 | 7.3625e-007 | 1.3016e-005 |
| 2oo3 | 5 | 3.8164e-005 | 5.6930e-006 | 3.2471e-005 | 3.7030e-005 | 4.5643e-006 | 3.2465e-005 | 3.7027e-005 | 4.5674e-006 | 3.2459e-005 |
| 2oo3D | 1 | 6.7793e-006 | 2.3303e-007 | 6.5463e-006 | 6.7317e-006 | 1.9134e-007 | 6.5404e-006 | 6.7254e-006 | 1.9095e-007 | 6.5345e-006 |
| 2oo3D | 2 | 1.3943e-005 | 9.1523e-007 | 1.3028e-005 | 1.3767e-005 | 7.4588e-007 | 1.3022e-005 | 1.3761e-005 | 7.4510e-007 | 1.3016e-005 |
| 2oo3D | 5 | 3.8128e-005 | 5.6570e-006 | 3.2471e-005 | 3.7053e-005 | 4.5873e-006 | 3.2465e-005 | 3.7045e-005 | 4.5853e-006 | 3.2459e-005 |
| 1oo3 | 1 | 2.1821e-006 | 2.1145e-011 | 2.1821e-006 | 2.1801e-006 | 1.9655e-011 | 2.1801e-006 | 2.1782e-006 | 1.9602e-011 | 2.1782e-006 |
| 1oo3 | 2 | 4.3427e-006 | 1.6249e-010 | 4.3425e-006 | 4.3407e-006 | 1.5446e-010 | 4.3405e-006 | 4.3387e-006 | 1.5425e-010 | 4.3386e-006 |
| 1oo3 | 5 | 1.0826e-005 | 2.4777e-009 | 1.0824e-005 | 1.0824e-005 | 2.3876e-009 | 1.0822e-005 | 1.0822e-005 | 2.3863e-009 | 1.0820e-005 |
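To make the degree of agreement in Table 4 concrete, the following sketch computes the signed relative deviation of this method's total PFDG from the FTA and FRANTIC totals for a few rows. The selection of rows and the function names are illustrative choices, not part of the original study; for the selected rows the 2oo3 group at T1 = 5 shows the largest deviation, about 3%.

```python
# Total PFD_G values copied from Table 4: (this paper, FTA, FRANTIC).
TABLE4_TOTALS = {
    ("2oo2", 1): (4.9060e-4, 4.9016e-4, 4.8972e-4),
    ("1oo2", 5): (2.5932e-5, 2.5886e-5, 2.5881e-5),
    ("2oo3", 5): (3.8164e-5, 3.7030e-5, 3.7027e-5),
    ("1oo3", 5): (1.0826e-5, 1.0824e-5, 1.0822e-5),
}


def relative_deviation(value, reference):
    """Signed relative deviation of value from reference."""
    return (value - reference) / reference


def deviations(table=TABLE4_TOTALS):
    """Map (voting, T1) to (deviation vs FTA, deviation vs FRANTIC)."""
    return {
        key: (relative_deviation(paper, fta), relative_deviation(paper, frantic))
        for key, (paper, fta, frantic) in table.items()
    }


if __name__ == "__main__":
    for (voting, t1), (d_fta, d_frantic) in deviations().items():
        print(f"{voting}, T1={t1}: {d_fta:+.3%} vs FTA, {d_frantic:+.3%} vs FRANTIC")
```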

L. Ding et al. Reliability Engineering and System Safety 165 (2017) 170–187


Appendix

1. Mathematical derivation

1.1. MUnT and PFDG for single channel

For generalization, assume that a channel is made up of two components in series, with respective failure rates and repair rates λ1, μ1 and λ2, μ2, and suppose that both λ1 and λ2 contribute to PFDG. Replace these two components by an equivalent component with failure rate λ and repair rate μ.

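Before replacement, the reliability of the channel is

$$R(t) = e^{-(\lambda_1 + \lambda_2)t};$$

after replacement, the reliability of the channel is

$$R(t) = e^{-\lambda t}.$$

Comparing the two equations above,

$$\lambda = \lambda_1 + \lambda_2.$$

Before replacement, the probability of the channel being in the available state is

$$P = A_1 A_2 = \frac{\mu_1 \mu_2}{(\lambda_1 + \mu_1)(\lambda_2 + \mu_2)};$$

after replacement, the probability of the channel being in the available state is

$$P = \frac{\mu}{\lambda + \mu}.$$

So,

$$\frac{\mu}{\lambda + \mu} = \frac{\mu_1 \mu_2}{(\lambda_1 + \mu_1)(\lambda_2 + \mu_2)}.$$

Therefore,

$$\mu = \frac{\lambda \mu_1 \mu_2}{\lambda_1 \lambda_2 + \lambda_1 \mu_2 + \lambda_2 \mu_1}.$$

As $\mathrm{MUnT} \approx 1/\mu$, $\mathrm{MUnT}_1 \approx 1/\mu_1$ and $\mathrm{MUnT}_2 \approx 1/\mu_2$, then

$$\mathrm{MUnT} = \frac{1}{\lambda_1 + \lambda_2}\left(\lambda_1 \mathrm{MUnT}_1 + \lambda_2 \mathrm{MUnT}_2 + \lambda_1 \lambda_2 \mathrm{MUnT}_1 \mathrm{MUnT}_2\right).$$

Since $\mu_1 \gg \lambda_1$ and $\mu_2 \gg \lambda_2$, $\lambda_1 \mathrm{MUnT}_1 + \lambda_2 \mathrm{MUnT}_2 + \lambda_1 \lambda_2 \mathrm{MUnT}_1 \mathrm{MUnT}_2 \approx \lambda_1 \mathrm{MUnT}_1 + \lambda_2 \mathrm{MUnT}_2$. Hence,

$$\mathrm{MUnT} \approx \frac{1}{\lambda_1 + \lambda_2}\left(\lambda_1 \mathrm{MUnT}_1 + \lambda_2 \mathrm{MUnT}_2\right).$$

For a channel consisting of n components connected in series,

$$\mathrm{MUnT} = \frac{1}{\sum_{i=1}^{n} \lambda_i} \sum_{i=1}^{n} \lambda_i \mathrm{MUnT}_i.$$

Since $\lambda \cdot \mathrm{MUnT} \ll 1$, PFDG for a single channel in the time interval [0, MUnT] is

$$PFD_G = 1 - R(\mathrm{MUnT}) = 1 - e^{-\lambda \cdot \mathrm{MUnT}} \approx \lambda \cdot \mathrm{MUnT}.$$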
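The series-channel result above can be sketched in a few lines of Python. The function names and the numerical values are hypothetical and chosen only for illustration; the code compares the first-order MUnT with the two-component expression that still contains the cross term, to show how small the neglected term is when each λ·MUnT is much smaller than 1.

```python
def equivalent_channel(components):
    """Return (lambda, MUnT) of the equivalent component for a series channel.

    components: iterable of (failure_rate_per_hour, munt_hours) pairs.
    Uses the first-order result MUnT = sum(l_i * MUnT_i) / sum(l_i).
    """
    components = list(components)
    lam = sum(rate for rate, _ in components)
    munt = sum(rate * m for rate, m in components) / lam
    return lam, munt


def munt_two_components_exact(lam1, munt1, lam2, munt2):
    """Two-component expression before dropping the lam1*lam2 cross term."""
    cross = lam1 * lam2 * munt1 * munt2
    return (lam1 * munt1 + lam2 * munt2 + cross) / (lam1 + lam2)


def pfd_single_channel(components):
    """PFD_G ~ lambda * MUnT, valid when lambda * MUnT << 1."""
    lam, munt = equivalent_channel(components)
    return lam * munt


if __name__ == "__main__":
    # Hypothetical values, not taken from the case study.
    parts = [(2e-6, 4380.0), (1e-6, 8.0)]
    print(equivalent_channel(parts))
    print(munt_two_components_exact(2e-6, 4380.0, 1e-6, 8.0))
    print(pfd_single_channel(parts))
```

Note that the channel PFDG reduces to the sum of the individual λi·MUnTi products, which is why the equivalent component can stand in for the series chain.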

1.2 1oo2 voting group

Using Simplified Formulas ($\lambda_{DD}$ is not considered by the method), PFDG of the 1oo2 voting group is

$$PFD_G = \frac{\lambda_{DU,1} \lambda_{DU,2} T_1^2}{3}$$

and the probability distribution function of the 1oo2 voting group is

$$F_{DU}(t) = 1 - R_{DU}(t) = 1 - \left(e^{-\lambda_{DU,1} t} + e^{-\lambda_{DU,2} t} - e^{-(\lambda_{DU,1} + \lambda_{DU,2}) t}\right) = 1 - e^{-\lambda_{DU,1} t} - e^{-\lambda_{DU,2} t} + e^{-(\lambda_{DU,1} + \lambda_{DU,2}) t}.$$

Then, the MUnT of DU failures for the 1oo2 voting group is

$$\mathrm{MUnT}_{DU} = \frac{T_1}{F_{DU}(T_1)} \cdot PFD_G = \frac{T_1}{1 - e^{-\lambda_{DU,1} T_1} - e^{-\lambda_{DU,2} T_1} + e^{-(\lambda_{DU,1} + \lambda_{DU,2}) T_1}} \cdot \frac{\lambda_{DU,1} \lambda_{DU,2} T_1^2}{3} \approx \frac{T_1}{3}.$$

The last approximation follows from the Taylor series expansion,

$$1 - e^{-\lambda_{DU,1} T_1} - e^{-\lambda_{DU,2} T_1} + e^{-(\lambda_{DU,1} + \lambda_{DU,2}) T_1} \approx \lambda_{DU,1} \lambda_{DU,2} T_1^2.$$
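The approximation MUnT ≈ T1/3 for the 1oo2 group can be checked numerically. The sketch below is illustrative only: the DU rates are assumed to be (1 − DC)·λD with the Table 4 parameters, T1 is assumed to be one year (8760 hours), and the function names are made up for this example.

```python
import math


def f_du_1oo2(lam1, lam2, t):
    """F_DU(t) = (1 - exp(-lam1*t)) * (1 - exp(-lam2*t)).

    Algebraically equal to 1 - e^{-lam1 t} - e^{-lam2 t} + e^{-(lam1+lam2) t};
    expm1 avoids cancellation when lam*t is small.
    """
    return math.expm1(-lam1 * t) * math.expm1(-lam2 * t)


def munt_du_1oo2(lam1, lam2, t1):
    """MUnT_DU = T1 / F_DU(T1) * PFD_G with PFD_G = lam1*lam2*T1^2/3."""
    pfd_g = lam1 * lam2 * t1**2 / 3.0
    return t1 / f_du_1oo2(lam1, lam2, t1) * pfd_g


if __name__ == "__main__":
    lam1, lam2 = 6e-8, 5e-8   # assumed DU rates per hour
    t1 = 8760.0               # assumed proof test interval in hours
    ratio = munt_du_1oo2(lam1, lam2, t1) / (t1 / 3.0)
    print(f"MUnT_DU / (T1/3) = {ratio:.6f}")
```

Under these assumptions the ratio differs from 1 only by a term of the order of (λDU,1 + λDU,2)·T1/2, which is the first Taylor term that the derivation neglects.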


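1.3 2oo3 voting group

Using Simplified Formulas, PFDG of the 2oo3 voting group is

$$PFD_G = \frac{\left[(\lambda_{DU,1} \lambda_{DU,2}) + (\lambda_{DU,1} \lambda_{DU,3}) + (\lambda_{DU,2} \lambda_{DU,3})\right] T_1^2}{3}$$

and the probability distribution function of the 2oo3 voting group is

$$
\begin{aligned}
F_{DU}(t) = 1 - R_{DU}(t) &= 1 - \left(e^{-(\lambda_{DU,1} + \lambda_{DU,2}) t} + e^{-(\lambda_{DU,1} + \lambda_{DU,3}) t} + e^{-(\lambda_{DU,2} + \lambda_{DU,3}) t} - 2 e^{-(\lambda_{DU,1} + \lambda_{DU,2} + \lambda_{DU,3}) t}\right) \\
&= 1 - e^{-(\lambda_{DU,1} + \lambda_{DU,2}) t} - e^{-(\lambda_{DU,1} + \lambda_{DU,3}) t} - e^{-(\lambda_{DU,2} + \lambda_{DU,3}) t} + 2 e^{-(\lambda_{DU,1} + \lambda_{DU,2} + \lambda_{DU,3}) t}.
\end{aligned}
$$

Then, the MUnT of DU failures of the 2oo3 voting group is

$$
\begin{aligned}
\mathrm{MUnT}_{DU} &= \frac{T_1}{F_{DU}(T_1)} \cdot PFD_G \\
&= \frac{T_1}{1 - e^{-(\lambda_{DU,1} + \lambda_{DU,2}) T_1} - e^{-(\lambda_{DU,1} + \lambda_{DU,3}) T_1} - e^{-(\lambda_{DU,2} + \lambda_{DU,3}) T_1} + 2 e^{-(\lambda_{DU,1} + \lambda_{DU,2} + \lambda_{DU,3}) T_1}} \cdot \frac{\left[(\lambda_{DU,1} \lambda_{DU,2}) + (\lambda_{DU,1} \lambda_{DU,3}) + (\lambda_{DU,2} \lambda_{DU,3})\right] T_1^2}{3} \approx \frac{T_1}{3}.
\end{aligned}
$$

The last approximation follows from the Taylor series expansion,

$$1 - e^{-(\lambda_{DU,1} + \lambda_{DU,2}) T_1} - e^{-(\lambda_{DU,1} + \lambda_{DU,3}) T_1} - e^{-(\lambda_{DU,2} + \lambda_{DU,3}) T_1} + 2 e^{-(\lambda_{DU,1} + \lambda_{DU,2} + \lambda_{DU,3}) T_1} \approx \left[(\lambda_{DU,1} \lambda_{DU,2}) + (\lambda_{DU,1} \lambda_{DU,3}) + (\lambda_{DU,2} \lambda_{DU,3})\right] T_1^2.$$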
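The Taylor step for the 2oo3 group is stated without intermediate terms, so a short worked expansion may help. It uses the shorthand a, b, c (introduced here only for brevity) and the second-order expansion of the exponential; the zeroth- and first-order terms cancel, leaving only the pairwise products.

```latex
\begin{aligned}
&\text{Let } a = \lambda_{DU,1} T_1,\quad b = \lambda_{DU,2} T_1,\quad c = \lambda_{DU,3} T_1,
 \qquad e^{-x} \approx 1 - x + \tfrac{x^2}{2}. \\
&\text{Order 0: } 1 - 1 - 1 - 1 + 2 = 0 \\
&\text{Order 1: } (a+b) + (a+c) + (b+c) - 2(a+b+c) = 0 \\
&\text{Order 2: } -\tfrac{1}{2}\left[(a+b)^2 + (a+c)^2 + (b+c)^2\right] + (a+b+c)^2 = ab + ac + bc \\
&\Rightarrow\; 1 - e^{-(a+b)} - e^{-(a+c)} - e^{-(b+c)} + 2 e^{-(a+b+c)}
 \approx \left(\lambda_{DU,1}\lambda_{DU,2} + \lambda_{DU,1}\lambda_{DU,3} + \lambda_{DU,2}\lambda_{DU,3}\right) T_1^2
\end{aligned}
```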

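1.4 1oo3 voting group

Using Simplified Formulas, PFDG of the 1oo3 voting group is

$$PFD_G = \frac{\lambda_{DU,1} \lambda_{DU,2} \lambda_{DU,3} T_1^3}{4}$$

and the probability distribution function of the 1oo3 voting group is

$$F_{DU}(t) = \left(1 - e^{-\lambda_{DU,1} t}\right) \cdot \left(1 - e^{-\lambda_{DU,2} t}\right) \cdot \left(1 - e^{-\lambda_{DU,3} t}\right).$$

Then, the MUnT of DU failures of the 1oo3 voting group is

$$\mathrm{MUnT}_{DU} = \frac{T_1}{F_{DU}(T_1)} \cdot PFD_G = \frac{T_1}{\left(1 - e^{-\lambda_{DU,1} T_1}\right) \cdot \left(1 - e^{-\lambda_{DU,2} T_1}\right) \cdot \left(1 - e^{-\lambda_{DU,3} T_1}\right)} \cdot \frac{\lambda_{DU,1} \lambda_{DU,2} \lambda_{DU,3} T_1^3}{4} \approx \frac{T_1}{4}.$$

The last approximation follows from the Taylor series expansion,

$$\left(1 - e^{-\lambda_{DU,1} T_1}\right) \cdot \left(1 - e^{-\lambda_{DU,2} T_1}\right) \cdot \left(1 - e^{-\lambda_{DU,3} T_1}\right) \approx \lambda_{DU,1} \lambda_{DU,2} \lambda_{DU,3} T_1^3.$$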

1.5 MooN(D) voting group

The Simplified Formulas method is used to determine $t_{G(i-1)E}$ and $t_{G(N-M)E}$. A MooN voting group is functioning when at least M of its N channels are functioning, and fails to function when at least (N − M + 1) channels fail. Each minimal cut set has (N − M + 1) events (i.e., channels with DU failure) and the voting group has $\binom{N}{N-M+1}$ minimal cut sets. Let the minimal cut sets be denoted $C_1, C_2, \ldots, C_{\binom{N}{N-M+1}}$, and let the complementary set of $C_i$ be denoted by $\bar{C}_i$.

For a MooN voting group,

$$PFD_G^{MooN} = \left(\sum_{i=1}^{\binom{N}{N-M+1}} \prod_{j \in C_i} \lambda_{DU,j}\right) \frac{T_1^{N-M+1}}{N-M+2}$$

$$
\begin{aligned}
F_{DU}^{MooN}(T_1) ={}& \sum_{i=1}^{\binom{N}{N-M+1}} \prod_{j \in C_i} \left(1 - e^{-\lambda_{DU,j} T_1}\right) \prod_{h \in \bar{C}_i} \left(e^{-\lambda_{DU,h} T_1}\right) + \sum_{i=1}^{\binom{N}{N-M+2}} \prod_{j \in C_i} \left(1 - e^{-\lambda_{DU,j} T_1}\right) \prod_{h \in \bar{C}_i} \left(e^{-\lambda_{DU,h} T_1}\right) \\
& + \cdots + \sum_{i=1}^{\binom{N}{N-1}} \prod_{j \in C_i} \left(1 - e^{-\lambda_{DU,j} T_1}\right) \prod_{h \in \bar{C}_i} \left(e^{-\lambda_{DU,h} T_1}\right) + \sum_{i=1}^{\binom{N}{N}} \prod_{j \in C_i} \left(1 - e^{-\lambda_{DU,j} T_1}\right)
\end{aligned}
$$

When $\lambda_{DU,j} T_1$ is small, $1 - e^{-\lambda_{DU,j} T_1} \approx \lambda_{DU,j} T_1$ and $e^{-\lambda_{DU,h} T_1} \approx 1$, so

$$
\begin{aligned}
F_{DU}^{MooN}(T_1) \approx{}& \sum_{i=1}^{\binom{N}{N-M+1}} \prod_{j \in C_i} (\lambda_{DU,j} T_1) \prod_{h \in \bar{C}_i} (1) + \sum_{i=1}^{\binom{N}{N-M+2}} \prod_{j \in C_i} (\lambda_{DU,j} T_1) \prod_{h \in \bar{C}_i} (1) + \cdots + \sum_{i=1}^{\binom{N}{N-1}} \prod_{j \in C_i} (\lambda_{DU,j} T_1) \prod_{h \in \bar{C}_i} (1) + \sum_{i=1}^{\binom{N}{N}} \prod_{j \in C_i} (\lambda_{DU,j} T_1) \\
={}& \sum_{i=1}^{\binom{N}{N-M+1}} \prod_{j \in C_i} (\lambda_{DU,j} T_1) + \sum_{i=1}^{\binom{N}{N-M+2}} \prod_{j \in C_i} (\lambda_{DU,j} T_1) + \cdots + \sum_{i=1}^{\binom{N}{N-1}} \prod_{j \in C_i} (\lambda_{DU,j} T_1) + \sum_{i=1}^{\binom{N}{N}} \prod_{j \in C_i} (\lambda_{DU,j} T_1)
\end{aligned}
$$

Also, when $\lambda_{DU,j} T_1$ is small, the successive addends get smaller and smaller, and $F_{DU}^{MooN}(T_1)$ can be approximated by the first addend in the sum, such that


$$F_{DU}^{MooN}(T_1) \approx \sum_{i=1}^{\binom{N}{N-M+1}} \prod_{j \in C_i} (\lambda_{DU,j} T_1) = \left(\sum_{i=1}^{\binom{N}{N-M+1}} \prod_{j \in C_i} \lambda_{DU,j}\right) T_1^{N-M+1}$$

The MUnT of DU failures of the MooN voting group is therefore

$$\mathrm{MUnT}_{DU}^{MooN} = \frac{T_1}{F_{DU}^{MooN}(T_1)} \cdot PFD_G^{MooN} \approx \frac{T_1}{\sum_{i=1}^{\binom{N}{N-M+1}} \prod_{j \in C_i} (\lambda_{DU,j} T_1)} \cdot \left(\sum_{i=1}^{\binom{N}{N-M+1}} \prod_{j \in C_i} \lambda_{DU,j}\right) \frac{T_1^{N-M+1}}{N-M+2} = \frac{T_1}{N-M+2}$$

Here, it can be seen that, since the Taylor series is used and its higher-order terms are ignored, the MUnT of DU failures of a MooN voting group can be assumed not to be influenced by diverse redundancy.
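As a sketch of how the general MooN expression for the independent DU part can be evaluated, the following Python code enumerates the minimal cut sets with `itertools.combinations`. It covers only the simplified-formula DU term derived in this appendix (no CCF and no DD contribution), and the function names and sample rates are assumptions made for illustration.

```python
import math
from itertools import combinations


def pfd_g_moon(m, rates_du, t1):
    """Independent DU part of PFD_G for an MooN group with non-identical channels.

    rates_du: DU failure rate of each of the N channels (per hour).
    Each minimal cut set contains k = N - M + 1 channels.
    """
    n = len(rates_du)
    k = n - m + 1
    cut_set_sum = sum(math.prod(cut) for cut in combinations(rates_du, k))
    return cut_set_sum * t1**k / (k + 1)


def munt_du_moon(m, n, t1):
    """First-order MUnT of DU failures: T1 / (N - M + 2)."""
    return t1 / (n - m + 2)


if __name__ == "__main__":
    rates = [6e-8, 5e-8, 4e-8]  # assumed DU rates per hour
    t1 = 8760.0                 # assumed proof test interval in hours
    for m in (1, 2, 3):
        print(f"{m}oo3: PFD_G = {pfd_g_moon(m, rates, t1):.4e}, "
              f"MUnT_DU = {munt_du_moon(m, 3, t1):.1f} h")
```

For M = 1, N = 2 the function reduces to λDU,1·λDU,2·T1²/3, and for identical channels the cut-set sum collapses to a binomial coefficient times a power of λDU, so the same code covers identical redundancy as a special case.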

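Therefore, the equivalent mean down time of the MooN voting group is

$$t_{G(N-M)E} = \frac{\lambda_{DU,N-M+1}^{ind}}{\lambda_{D,N-M+1}^{ind}} \left(\frac{T_1}{N-M+2} + MRT\right) + \frac{\lambda_{DD,N-M+1}^{ind}}{\lambda_{D,N-M+1}^{ind}} MTTR$$

and

$$t_{G(i-1)E} = \frac{\lambda_{DU,i}^{ind}}{\lambda_{D,i}^{ind}} \left(\frac{T_1}{i-1+2} + MRT\right) + \frac{\lambda_{DD,i}^{ind}}{\lambda_{D,i}^{ind}} MTTR = \frac{\lambda_{DU,i}^{ind}}{\lambda_{D,i}^{ind}} \left(\frac{T_1}{i+1} + MRT\right) + \frac{\lambda_{DD,i}^{ind}}{\lambda_{D,i}^{ind}} MTTR$$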
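The equivalent mean down time at each degradation stage can be written as a small helper. This is a minimal sketch, assuming that the dangerous failure rate of the stage is the sum of its DU and DD parts; the function name and the sample values (DC = 0.9, T1 = 8760 hours, MRT = MTTR = 8 hours) are illustrative.

```python
def t_ge(i, lam_du_ind, lam_dd_ind, t1, mrt, mttr):
    """Equivalent mean down time t_{G(i-1)E} of degradation stage i.

    Assumes lambda_D = lambda_DU + lambda_DD for the independent rates.
    Stage i = 1 gives the channel value with T1/2, stage i = 2 uses T1/3, etc.
    """
    lam_d = lam_du_ind + lam_dd_ind
    du_share = lam_du_ind / lam_d
    dd_share = lam_dd_ind / lam_d
    return du_share * (t1 / (i + 1) + mrt) + dd_share * mttr


if __name__ == "__main__":
    # Hypothetical stage with lambda_D = 6e-7 per hour and DC = 0.9.
    print(t_ge(1, 6e-8, 5.4e-7, 8760.0, 8.0, 8.0))
    # Without diagnostics the expression reduces to T1/(i+1) + MRT.
    print(t_ge(2, 1e-7, 0.0, 8760.0, 8.0, 8.0))
```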

2. System degradation method of case study

2.1 Sensor subsystem

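$$t_{CE} = \frac{\lambda_{DU,1}}{\lambda_{D,1}} \left(\frac{T_1}{2} + MRT\right) + \frac{\lambda_{DD,1}}{\lambda_{D,1}} MTTR = \frac{T_1}{2} + MRT$$

$$t_{GE} = \frac{\lambda_{DU,2}}{\lambda_{D,2}} \left(\frac{T_1}{3} + MRT\right) + \frac{\lambda_{DD,2}}{\lambda_{D,2}} MTTR = \frac{T_1}{3} + MRT$$

$$
\begin{aligned}
PFD_{G-Sensor} &= 6 \prod_{i=1}^{2} \left(\lambda_{DU,i}^{ind} + \lambda_{DD,i}^{ind}\right) t_{CE} t_{GE} + C_{2oo3} \left(\beta \lambda_{DU} \left(\frac{T_1}{2} + MRT\right) + \beta_D \lambda_{DD} MTTR\right) \\
&= 6 \left(\lambda_{DU} - \beta \lambda_{DU}\right)^2 t_{CE} t_{GE} + 1.5 \left(\beta \lambda_{DU} \left(\frac{T_1}{2} + MRT\right)\right) = 1.97 \times 10^{-4}
\end{aligned}
$$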

2.2 Logic subsystem

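$$PFD_{G-Logic} = \lambda_D t_{CE} = \lambda_D \left[\frac{\lambda_{DU}}{\lambda_D} \left(\frac{T_1}{2} + MRT\right) + \frac{\lambda_{DD}}{\lambda_D} MTTR\right] = 2.59 \times 10^{-4}$$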

2.3 Actuator subsystem

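$$t_{CE} = \frac{\lambda_{DU,1}}{\lambda_{D,1}} \left(\frac{T_1}{2} + MRT\right) + \frac{\lambda_{DD,1}}{\lambda_{D,1}} MTTR = \frac{T_1}{2} + MRT$$

$$t_{GE} = \frac{\lambda_{DU,2}}{\lambda_{D,2}} \left(\frac{T_1}{3} + MRT\right) + \frac{\lambda_{DD,2}}{\lambda_{D,2}} MTTR = \frac{T_1}{3} + MRT$$

$$
\begin{aligned}
PFD_{G-Actuator} &= 2 \prod_{i=1}^{2} \left(\lambda_{DU,i}^{ind} + \lambda_{DD,i}^{ind}\right) t_{CE} t_{GE} + C_{1oo2} \left(\beta \lambda_{DU} \left(\frac{T_1}{2} + MRT\right) + \beta_D \lambda_{DD} MTTR\right) \\
&= 2 \lambda_{DU,1} \lambda_{DU,2} t_{CE} t_{GE} = 2.51 \times 10^{-4}
\end{aligned}
$$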

2.4 Safety shutdown system of NPCTF

$$PFD_{G-Shutdown} = PFD_{G-Sensor} + PFD_{G-Logic} + PFD_{G-Actuator} = 7.0658 \times 10^{-4}$$
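The final verification step, comparing the system PFDG with the target failure measure, can be sketched as a lookup against the low-demand-mode bands of IEC 61508-1. The helper name is hypothetical, and the subsystem values are the rounded figures quoted above, so their sum differs slightly from the unrounded total of 7.0658 × 10⁻⁴. Only the PFDG criterion is checked here; other requirements of the standard, such as architectural constraints, are outside this sketch.

```python
# Low-demand-mode PFDavg bands of IEC 61508-1:
# (SIL, lower bound inclusive, upper bound exclusive).
SIL_BANDS = [(4, 1e-5, 1e-4), (3, 1e-4, 1e-3), (2, 1e-3, 1e-2), (1, 1e-2, 1e-1)]


def sil_from_pfd(pfd_avg):
    """Return the SIL band containing pfd_avg, or None outside the tabulated bands."""
    for sil, low, high in SIL_BANDS:
        if low <= pfd_avg < high:
            return sil
    return None


# Rounded subsystem results of the case study (system degradation method).
SUBSYSTEM_PFD = {"sensor": 1.97e-4, "logic": 2.59e-4, "actuator": 2.51e-4}

if __name__ == "__main__":
    total = sum(SUBSYSTEM_PFD.values())
    print(f"PFD_G of shutdown system (rounded inputs): {total:.3e}")
    print(f"PFD_G falls in the SIL {sil_from_pfd(total)} band")
```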

3. FTA method of case study

The fault tree of the case study is shown in Fig. A-3-1.


3.1 Sensor subsystem

For every $t \in [0, T_1 + MRT]$,

$$PFD_{CCF}(t) = \beta \cdot (\lambda_{T2-1} + \lambda_{Splitter}) \cdot t$$

$$PFD_{T2-1}(t) = (1 - \beta) \lambda_{T2-1} \cdot t$$

$$PFD_{Splitter}(t) = (1 - \beta) \lambda_{Splitter} \cdot t$$

$$PFD_{T2-1-channel}(t) = PFD_{T2-1}(t) + PFD_{Splitter}(t)$$

$$PFD_{Sensor}(t) = 3 \, PFD_{T2-1-channel}^{2}(t) \left(1 - PFD_{T2-1-channel}(t)\right) + PFD_{T2-1-channel}^{3}(t) + PFD_{CCF}(t)$$
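The sensor-subsystem expressions above translate directly into code. The sketch below is illustrative: the function and parameter names are made up, the three sensor channels are treated as having the same rates (as the single channel expression above suggests), and the numerical values in the example are hypothetical rather than the case-study data.

```python
def pfd_sensor(t, lam_sensor, lam_splitter, beta):
    """Time-dependent unavailability of the 2oo3 sensor subsystem at time t (hours)."""
    # Common cause part shared by all channels.
    ccf = beta * (lam_sensor + lam_splitter) * t
    # Independent part of one channel: temperature sensor in series with its splitter.
    channel = (1 - beta) * lam_sensor * t + (1 - beta) * lam_splitter * t
    # 2oo3 fails when exactly two channels fail or all three fail.
    return 3 * channel**2 * (1 - channel) + channel**3 + ccf


if __name__ == "__main__":
    # Hypothetical rates per hour and beta factor.
    for hours in (0.0, 4380.0, 8760.0):
        print(hours, pfd_sensor(hours, 1e-6, 2e-7, 0.02))
```

With a non-zero β the common cause term grows linearly in t while the independent term grows quadratically, which is consistent with the observation that CCF dominates the redundant groups in Table 4.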

3.2 Logic subsystem

For every $t \in [0, T_1 + MRT]$,

$$PFD_{DU}(t) = \lambda_{Logic} (1 - \beta) \cdot t$$

$$PFD_{DD}(t) = \lambda_{Logic} \, \beta \cdot MTTR$$

$$PFD_{Logic}(t) = PFD_{DU}(t) + PFD_{DD}(t)$$

3.3 Actuator subsystem

For every $t \in [0, T_1 + MRT]$,

$$PFD_{RA}(t) = PFD_{RB}(t) = PFD_{RC}(t) = \lambda_{Relay} \cdot t$$

$$PFD_{RL1}(t) = \lambda_{Relay} \cdot t$$

Fig. A-3-1. Fault tree of the case study.


$$PFD_{LM-Contr}(t) = \lambda_{LM-Contr} \cdot t$$

$$PFD_{LM}(t) = \lambda_{LM} \cdot t$$

$$PFD_{SDS1-switch}(t) = \lambda_{SDS1-switch} \cdot t$$

$$PFD_{LM-channel}(t) = PFD_{RA}(t) + PFD_{RB}(t) + PFD_{RC}(t) + PFD_{RL1}(t) + PFD_{LM-Contr}(t) + PFD_{LM}(t) + PFD_{SDS1-switch}(t)$$

$$PFD_{Rod}(t) = \lambda_{Rod} \cdot t$$

$$PFD_{Gripper}(t) = \lambda_{Gripper} \cdot t$$

$$PFD_{Valve}(t) = \lambda_{Valve} \cdot t$$

$$PFD_{SDS2-switch}(t) = \lambda_{SDS2-switch} \cdot t$$

$$PFD_{Rod-channel}(t) = PFD_{Rod}(t) + PFD_{Gripper}(t) + PFD_{Valve}(t) + PFD_{SDS2-switch}(t)$$

$$PFD_{Actuator}(t) = PFD_{LM-channel}(t) \cdot PFD_{Rod-channel}(t)$$

3.4 Safety shutdown system of NPCTF

The time-dependent system unavailability function is

$$PFD_{Shutdown}(t) = PFD_{Sensor}(t) + PFD_{Logic}(t) + PFD_{Actuator}(t)$$

The average probability of dangerous failure on demand is

$$PFD_{G-Shutdown} = \frac{\int_{0}^{T_1 + MRT} PFD_{Shutdown}(t) \, dt}{T_1 + MRT} = 7.0543 \times 10^{-4}$$
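The averaging step can be illustrated numerically. The sketch below uses a plain trapezoidal rule, which is an assumption of this example and not necessarily the integration scheme used in the study; the rates are hypothetical. It also shows where the familiar factors come from: a single linear term λt averages to λH/2, and a product of two linear terms, as in a two-channel parallel group, averages to approximately λaλbH²/3.

```python
def average_pfd(pfd_of_t, horizon, steps=2000):
    """Trapezoidal average of pfd_of_t over [0, horizon]."""
    h = horizon / steps
    total = 0.5 * (pfd_of_t(0.0) + pfd_of_t(horizon))
    for k in range(1, steps):
        total += pfd_of_t(k * h)
    return total * h / horizon


if __name__ == "__main__":
    horizon = 8760.0 + 8.0        # assumed T1 + MRT in hours
    lam_a, lam_b = 1e-6, 2e-6     # hypothetical failure rates per hour

    single = average_pfd(lambda t: lam_a * t, horizon)
    parallel = average_pfd(lambda t: (lam_a * t) * (lam_b * t), horizon)

    print(f"single channel: {single:.6e} vs lam*H/2 = {lam_a * horizon / 2:.6e}")
    print(f"two in parallel: {parallel:.6e} vs lam_a*lam_b*H^2/3 = "
          f"{lam_a * lam_b * horizon**2 / 3:.6e}")
```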

4. FRANTIC of case study

4.1 Notation of FRANTIC

$q$: instantaneous component unavailability.
$q_o$: unavailability of the component due to test.
$p_f$: probability of component failure during the test.
$p$: probability that a failure is not detected by the test.
$T_1$: time from plant startup to commencement of first test.
$T_2$: time between complete test cycles for remaining tests.
$\tau$: average online testing time.
$T_R$: average online repair time for a failed component.
$\bar{x}$: the complement of $x$, i.e., $1 - x$.
$[x]$: greatest integer less than or equal to $x$.

4.2 Component instantaneous unavailability

According to the FRANTIC computer code, components with undetected failure rates are treated as periodically tested components. For periodically tested components, the instantaneous unavailability is calculated by the cyclic Eqs. (1)–(5) given in Table A-4-1. To start the test cycle, the first test interval T1 can differ from the remaining test intervals T2 to account for staggering of tests among different components. For the first test interval, q = λt, 0 ≤ t ≤ T1, and for the first test, λT1 replaces Q in (1) and (2).

Components with detected failure rates are treated as monitored components, whose failures are immediately detectable, for example by an alarm. Their instantaneous unavailability is calculated using the steady-state, asymptotic approximation, Eq. (6) given in Table A-4-1, where MTTR is the average downtime.

In addition, the components of every subsystem, classified by whether their failure rates are undetected or detected, are also given in Table A-4-1.
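The cyclic calculation for a periodically tested component can be sketched as follows. This is one possible reading of Eqs. (1)–(5) of Table A-4-1, not the FRANTIC code itself: it assumes the test phase uses $q = p_f + \bar{p}_f Q + \bar{p}_f \bar{Q} q_o$, the repair phase replaces $q_o$ by $\lambda T_R / 2$, and the remaining interval uses $q = \lambda (t - T_o - \tau)$ with $Q = \lambda (T_2 - \tau)$. The function name is hypothetical, and the instant t = T_o is assigned to the test phase as a simplification.

```python
import math


def frantic_q(t, lam, t1, t2, tau, t_r, q_o=1.0, p_f=0.0):
    """Instantaneous unavailability of a periodically tested component (assumed reading)."""
    if t <= t1:
        return lam * t                       # first interval: q = lam * t
    cycle = math.floor((t - t1) / t2)        # greatest-integer bracket of Eq. (5)
    t_o = cycle * t2 + t1                    # Eq. (5): start of the current test cycle
    # Eq. (4); for the first test, lam * T1 replaces Q.
    big_q = lam * t1 if cycle == 0 else lam * (t2 - tau)
    elapsed = t - t_o
    if elapsed <= tau:                       # Eq. (1): test in progress
        return p_f + (1 - p_f) * big_q + (1 - p_f) * (1 - big_q) * q_o
    if elapsed <= tau + t_r:                 # Eq. (2): repair window
        return p_f + (1 - p_f) * big_q + (1 - p_f) * (1 - big_q) * 0.5 * lam * t_r
    return lam * (elapsed - tau)             # Eq. (3): between tests


if __name__ == "__main__":
    # Assumed values: lam = 6e-8 per hour, T1 = T2 = 8760 h, tau = 1.5 h, T_R = 8 h.
    for hours in (4380.0, 8761.0, 8765.0, 9760.0):
        print(hours, frantic_q(hours, 6e-8, 8760.0, 8760.0, 1.5, 8.0))
```

With $q_o = 1.0$ and $p_f = 0.0$, as in the Table 4 parameter set, the component is counted as fully unavailable while it is being tested.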

4.3 Safety shutdown system instantaneous unavailability

The instantaneous unavailability of the safety shutdown system is built from the individual component unavailabilities, based on the same logical combination established by the FTA shown in Appendix 3. The time-dependent system unavailability function is

$$PFD_{Shutdown}(t) = PFD_{Sensor}(t) + PFD_{Logic}(t) + PFD_{Actuator}(t)$$

4.4 Safety shutdown system of NPCTF

The average probability of dangerous failure on demand is

$$PFD_{G-Shutdown} = \frac{\int_{0}^{T_1 + MRT} PFD_{Shutdown}(t) \, dt}{T_1 + MRT} = 7.0122 \times 10^{-4}$$


References

[1] IEC 61508. Functional safety of electrical/electronic/programmable electronic safety-related systems, Part 1-7. Geneva: International Electrotechnical Commission; 2010.

[2] Oliveira LF, Abramovitch RN. Extension of ISA TR84.00.02 PFD equations to MooN architectures. Reliab Eng Syst Saf 2010;95:707–15.

[3] Vaurio JK. Unavailability equations for k-out-of-n systems. Reliab Eng Syst Saf 2011;96:350–2.

[4] Jin H, Lundteigen MA, Rausand M. New PFH-formulas for k-out-of-n: f-systems. Reliab Eng Syst Saf 2013;111:112–8.

[5] Wang C, Xing L, Levitin G. Explicit and implicit methods for probabilistic common-cause failure analysis. Reliab Eng Syst Saf 2014;131:175–84.

[6] Wang Y, Rausand M. Reliability analysis of safety-instrumented systems operated in high-demand mode. J Loss Prev Process Ind 2014;32:254–64.

[7] Jin H, Rausand M. Reliability of safety-instrumented systems subject to partial testing and common-cause failures. Reliab Eng Syst Saf 2014;121:146–51.

[8] IEC 61508-6. Functional safety of electrical/electronic/programmable electronic safety-related systems – Part 6: guidelines on the application of IEC 61508-2 and IEC 61508-3. International Electrotechnical Commission, Geneva; April 2010.

[9] ISA-TR84.00.02. Safety Instrumented Functions (SIF) – Safety Integrity Level (SIL) evaluation techniques, Part 1-5. United States of America: The Instrumentation, Systems, and Automation Society; 2002.

[10] Flammini F, Marrone S, Mazzocca N, Vittorini V. A new modeling approach to the safety evaluation of N-modular redundant computer systems in presence of imperfect maintenance. Reliab Eng Syst Saf 2009;94:1422–32.

[11] Jin H, Lundteigen MA, Rausand M. Reliability performance of safety instrumented systems: a common approach for both low- and high-demand mode of operation. Reliab Eng Syst Saf 2011;96:365–73.

[12] Torres-Echeverría AC, Martorell S, Thompson HA. Modeling safety instrumented systems with MooN voting architectures addressing system reconfiguration for testing. Reliab Eng Syst Saf 2011;96:545–63.

[13] Jahanian H. Generalizing PFD formulas of IEC 61508 for MooN configurations. ISA Trans 2015;55:168–74.

[14] Torres-Echeverría AC, Martorell S, Thompson HA. Design optimization of a safety-instrumented system based on RAMS + C addressing IEC 61508 requirements and diverse redundancy. Reliab Eng Syst Saf 2009;94:162–79.

[15] Khatab A, Nahas N, Nourelfath M. Availability of K-out-of-N: G systems with non-identical components subject to repair priorities. Reliab Eng Syst Saf 2009;94:142–51.

[16] Moghaddass R, Zuo MJ, Wang W. Availability of a general k-out-of-n: G systems with non-identical components considering shut-off rules using quasi-birth-death process. Reliab Eng Syst Saf 2011;96:489–96.

[17] Zhang TL, Long W, Sato Y. Availability of systems with self-diagnostic components – applying Markov model to IEC 61508-6. Reliab Eng Syst Saf 2003;80:133–41.

[18] Kaczor G, Młynarski S, Szkoda M. Verification of safety integrity level with the application of Monte Carlo simulation and reliability block diagrams. J Loss Prev Process Ind 2016;41:31–9.

[19] Rausand M. Reliability of safety-critical systems: theory and applications. Hoboken, New Jersey: Wiley; 2014.

[20] Ding L, Wang H, Kang K, Wang K. A novel method for SIL verification based on system degradation using reliability block diagram. Reliab Eng Syst Saf 2014;132:36–45.

[21] Murthy DNP, Østerås T, Rausand M. Component reliability specification. Reliab Eng Syst Saf 2009;94:1609–17.

[22] Hokstad P, Rausand M. Common cause failure modeling: status and trends. In: Misra KB, editor. Handbook for performability engineering. London: Springer; 2008. p. 621–40.

[23] Hauge S, Kråkenes T, Hokstad P, Håbrekke S, Jin H. Reliability Prediction method for safety instrumented systems. Trondheim, Norway: SINTEF; 2013.

[24] Chebila M, Innal F. Generalized analytical expressions for safety instrumented systems' performance measures: PFDavg and PFH. J Loss Prev Process Ind 2015;34:167–76.

[25] Cai B, Liu Y, Fan Q. A multiphase dynamic Bayesian networks methodology for the determination of safety integrity levels. Reliab Eng Syst Saf 2016;150:105–15.

[26] Vesely WE, Goldberg FF. Time dependent unavailability analysis of nuclear safety systems. IEEE Trans Reliab 1977(4):257–60.

[27] Vesely WE, Goldberg FF, Powers JT, Dickey JM, Smith JM, Hall RE. FRANTIC II - A computer code for time dependent unavailability analysis. Nuclear Regulatory Commission NUREG/CR-1924; 1981.

[28] Goble WM. Control systems safety evaluation and reliability, Third edition. Research Triangle Park, NC: ISA; 2010.

[29] Hauge S, Hokstad P, Håbrekke S, Lundteigen MA. Common cause failures in safety instrumented systems: using field experience from the petroleum industry. Reliab Eng Syst Saf 2016;151:34–45.

[30] Eide SA, Wierman TE, Gentillon CD, Rasmuson DM, Atwood CL. Industry-average performance for components and initiating events at U.S. commercial nuclear power plants. NUREG/CR-6928. Washington: U.S. Nuclear Regulatory Research; 2007.

[31] International Atomic Energy Agency. Generic component reliability data for research reactor PSA. Vienna: IAEA-TECDOC-930; 1997.

[32] IEC 61025. Fault tree analysis (FTA). Geneva: International Electrotechnical Commission; December 2006.

[33] IEC 61131-6. Programmable controllers – Part 6: functional safety. Geneva: International Electrotechnical Commission; 2010.

Table A-4-1. Instantaneous unavailability equations of FRANTIC method and components.

| | Undetected failure rates | Detected failure rates |
|---|---|---|
| Instantaneous unavailability equations | $q = p_f + \bar{p}_f Q + \bar{p}_f \bar{Q} q_o; \quad T_o < t \le T_o + \tau$ (1) | $q = \lambda_{DD} \cdot MTTR$ (6) |
| | $q = p_f + \bar{p}_f Q + \bar{p}_f \bar{Q} \cdot \tfrac{1}{2} \lambda T_R; \quad T_o + \tau < t \le T_o + \tau + T_R$ (2) | |
| | $q = \lambda (t - T_o - \tau); \quad T_o + \tau + T_R < t \le T_o + T_2$ (3) | |
| | $Q \equiv \lambda (T_2 - \tau)$ (4) | |
| | $T_o \equiv \left[\frac{t - T_1}{T_2}\right] T_2 + T_1$ (5) | |
| Components: Sensor | temperature sensors, splitters | |
| Components: Logic | λDU part | λDD part |
| Components: Actuator | Relay A/B/C/RL1, LM controller, LM, SDS1 switch, Gravity rod, Gravity rod gripper, Gripper control valve, SDS2 switch | |
