DEVELOPMENT EVALUATIONS
Lecture Templates
Marko Nagode, Simon Oman, Domen Seruga
University of Ljubljana, Faculty of Mechanical Engineering
Askerceva 6, SI-1000 Ljubljana, Slovenia
25 February 2019
Contents
1 Introduction
1.1 Concepts, Terms and Definitions
1.2 Product Attributes
1.3 Basic Time Divisions
1.4 Effectiveness and Costs
2 Basic Reliability Models
2.1 The Failure Distribution
2.2 Mean Time to Failure
2.3 Hazard Rate Function
2.4 Bathtub Curve
2.5 Conditional Reliability
2.6 Probability Models
2.6.1 The Exponential Distribution
2.6.2 The Weibull Distribution
2.6.3 The Normal Distribution
2.6.4 The Log-normal Distribution
3 Reliability of Systems
3.1 Series and Parallel Configurations
3.2 Parallel configurations k out of n
3.3 Combined Series-Parallel Systems
3.4 Complex Configurations
3.4.1 Decomposition
3.4.2 Enumeration
3.4.3 The System Structure Function
3.4.4 Coherent Systems
3.4.5 Minimal Path and Cut Sets
3.4.6 System Bounds
3.5 Low and High Level Redundancy
3.6 Three State Devices
3.6.1 Series Structure
3.6.2 Parallel Structure
3.6.3 Low Level Redundancy
3.6.4 High Level Redundancy
4 State Dependent Systems
4.1 Markov Analysis
4.2 Load Sharing Systems
4.3 Standby Systems or Passive Parallel Configurations
4.4 Passive Parallel System of Identical Components
4.5 Standby Systems with Switching Failure
4.6 Degraded Systems
4.7 Three State Devices
5 Physical Reliability Models
5.1 Covariate Models
5.2 Static Models
5.3 Dynamic Models
5.3.1 Periodic Loads
5.3.2 Random Loads
5.3.3 Random Fixed Stress and Strength
5.4 Physics of Failure Models
6 Design for Reliability
6.1 Reliability Objectives
6.1.1 Life Cycle Costs
6.2 Reliability Allocation
6.2.1 Exponential Case
6.2.2 Optimal allocation
6.2.3 ARINC Method
6.2.4 AGREE Method
6.2.5 Redundancies
6.3 Design Methods
6.3.1 Parts and Material Selection
6.3.2 Derating
6.3.3 Stress Strength Analysis
6.3.4 Complexity and Technology
6.3.5 Redundancy
6.4 Failure and Effect Analysis
6.4.1 Review of the Process
6.4.2 Identification of the Potential Failure Modes
6.4.3 Identification of Potential Effects of Each Failure Mode
6.4.4 Assigning a Severity Rating for Each Failure Mode
6.4.5 Assigning an Occurrence Rating for Each Failure Mode
6.4.6 Assigning a Detection Rating for Each Failure Mode and/or Effect
6.4.7 Calculation of the Risk Priority Number
6.4.8 Prioritizing the Failure Modes for Action
6.4.9 Taking Action to Eliminate or Reduce the High Risk Failure Modes
6.4.10 Calculation of the Resulting RPN
6.5 System Safety and Fault Tree Analysis
6.5.1 Fault Tree Analysis
6.5.2 Minimal Cut Sets
6.5.3 Quantitative Analysis
7 Maintainability
7.1 Analysis of Downtime
7.2 The Repair Time Distribution
7.3 System Repair Time
7.4 Reliability under Preventive Maintenance
7.5 Stochastic Point Processes
7.5.1 Renewal Process
7.5.2 Minimal Repair Process
7.5.3 Overhaul
Note: These lecture templates summarize the book by Charles E. Ebeling, An Introduction to Reliability and Maintainability Engineering, McGraw-Hill, 1997. Readers who would like an in-depth knowledge of the subject are advised to read the entire book [Ebeling].
Chapter 1
Introduction
Causes of failures include:
• bad engineering design,
• faulty construction or manufacturing processes,
• improper use,
• human error,
• poor maintenance,
• inadequate testing and inspection,
• lack of protection against excessive environmental stress, etc.
The impact of product and system failures can be:
• so insignificant that the user does not even notice it, or
• so severe that it leads to catastrophic consequences for humans and the environment.
When evaluating safety, we conclude that the product will operate safely if
$$\frac{Y_{\min}}{X_{\max}} \ge SF$$
where SF is a safety factor. Because durability Y and load X are random variables with probability density functions $f_Y(y)$ and $f_X(x)$, even a safety factor $SF \gg 1$ does not guarantee operation without defects. An evaluation of safety alone is therefore not sufficient, because it does not take into account the scatter of the load and the durability.
Figure 1.1. Circular cable car. (Labels in the figure: densities $f_X(x)$ and $f_Y(y)$, with $X_{\max}$ and $Y_{\min}$ marked.)
A modern development process, therefore, in addition to evaluations on:
• functionality,
• safety,
• life cycle costs,
• ecological value,
• suitability of technology,
• aesthetic value,
• usefulness value,
• ergonomics,
• suitability for recycling, etc.
also includes evaluations on:
• reliability,
• maintainability,
• supportability,
• availability,
• effectiveness and
• product value.
1.1 Concepts, Terms and Definitions
Figure 1.2. Elements of the product effectiveness. (Effectiveness comprises readiness for operation, availability and capability; availability in turn comprises reliability, maintainability and supportability.)
The ultimate objective of each product is the quality performance of the required
functions within the permissible deviations for a certain amount of time when
used under certain conditions of use, environmental and maintenance conditions,
at acceptable costs and minimizing environmental pollution. The function can be
described with the output characteristics of the product (the quality of message
transmission in communication systems, the load capacity of the conveyor or the
reliability of the brake) and the quality of the product is determined by reliability,
maintainability, supportability, availability, effectiveness, and product value.
Reliability is defined to be the probability that a component or system will
perform a required function within the permissible deviations for a given period
of time when used under stated operating and environmental conditions. Prior
to the determination of reliability, it is necessary to clearly define the failures, to
connect them with the functions of the product and to select the unit of time.
The specified time interval can be based on calendar or clock time, operating
hours or cycles. The cycle can be a turn of the engine, a load cycle or a block of
time load history. The terms of use and the environmental conditions in which
the product operates must meet the design requirements.
Maintainability is defined to be the probability that a failed component or
system will be restored or repaired to a specified condition within a period of
time when maintenance is performed in accordance with prescribed procedures.
Maintainability is usually a function of time and depends only on the time of repair. Delays in logistical support, waiting time for maintenance personnel and parts, and administrative time are not included in maintainability and are usually treated separately.
Supportability is defined as the degree to which product characteristics and planned support resources, including personnel, meet project requirements. Supportability is a function of time and depends on the delay in logistic support and the maintenance delay.
Figure 1.3. Elements of the product value. (Product value comprises effectiveness, life cycle costs, timetable and personnel.)
Reliability, maintainability and supportability are linked at a higher level to availability. Availability is defined as the probability that a component or system is performing its required function at a given point in time or time interval when used under stated operating, environmental and maintenance conditions. Availability is always greater than or equal to reliability and is the preferred measure when the system or component can be restored, since it accounts for both failures (reliability) and repairs (maintainability).
In the broadest sense, product quality is determined by its effectiveness and prod-
uct value. Effectiveness is the probability that the product will meet project re-
quirements under certain operating, environmental and maintenance conditions,
depending on the readiness for operation, availability and capability. Readiness
for operation corresponds to a state vector at the first entry into operation,
capability however specifies the probability that the product will fulfill the re-
quired functions depending on the state in which it is located. The breakdown
of effectiveness is shown in Figure 1.2.
Product value links effectiveness, life cycle costs, timetables and personnel (Figure 1.3). Previous analyses have shown that life cycle costs can be reduced by devoting enough attention to reliability, maintainability and supportability in the early stages of the product's life cycle. If we want to increase the value, we must increase the effectiveness at minimal cost, in as short a time as possible and with as few personnel as possible.
1.2 Product Attributes
Table 1.1. Product attributes.
Quality
• Performance
  - operating: reach, speed, precision class, sensitivity, useful cargo, output power, ease of use
  - physical: volume and density, mass, shape
  - functional: safety, fulfillment rate, functions
• Effectiveness
  - reliability: selection of materials and parts, derating, stress analysis, strength analysis, complexity and technology, redundancy
  - maintainability: failure isolation and diagnostics, standardization and parts exchangeability, modularity and accessibility, repair or replacement, proactive maintenance
  - supportability: number of redundant components, number of spare parts, number of maintenance channels
  - availability
• Product value
  - effectiveness, acquisition costs, operations and support costs, salvage value, timetables, personnel
Figure 1.4. Cost breakdown by product life cycle stages. (Horizontal axis: costs share (%); stages shown: conceiving, detailing and optimization, testing, preparation of production, production.)
1.3 Basic Time Divisions
The operating time may include:
• working hours,
• free time and time in standby mode.
Working hours are the time during which the product performs the required functions within the permissible tolerances. During free time there is no need for operation, whereas during standby time the product performs only a limited number of functions or does not operate. Free time and standby time differ in the speed of the product's transition to the operating state. The non-operating time is divided into:
• delay in logistics support,
• delay in maintenance and
• repair time.
Figure 1.5. Time bus of the product status. (Time of operation: working hours, free time, time in standby mode. Down time: delay in logistics support, delay in maintenance, repair time.)
1.4 Effectiveness and Costs
Figure 1.6. Effectiveness and the manufacturer costs. (Axes: effectiveness, costs; curves: pre-delivery costs, after-delivery costs, total costs; optimum marked $E_{p,opt}$.)
Figure 1.7. Effectiveness and the user costs. (Axes: effectiveness, costs; curves: purchase costs, after-delivery costs, total costs; optimum marked $E_{u,opt}$.)
Chapter 2
Basic Reliability Models
We distinguish four characteristic functions that describe the reliability of the
product:
• reliability function,
• cumulative distribution function,
• probability density function and
• hazard rate function.
Each of these functions provides a unique and complete reliability description
from a different angle of view. If we know one, we can use it to express all others.
Knowing these functions we can also calculate:
• mean time to failure,
• failure distribution variance,
• median time to failure and
• most likely observed failure time (mode).
2.1 The Failure Distribution
Reliability R(t) is the probability that the time to failure is going to be T ≥ t
$$R(t) = \Pr\{T \ge t\} = \int_t^\infty f(t)\,dt \tag{2.1}$$
Cumulative distribution function F(t) is the probability that a failure occurs before time t
$$F(t) = \Pr\{T < t\} = 1 - R(t) = \int_0^t f(t)\,dt \tag{2.2}$$
Figure 2.1. Reliability and cumulative distribution functions. (Axes: t; F(t), R(t).)
Probability density function is defined as
$$f(t) = \frac{dF(t)}{dt} = -\frac{dR(t)}{dt} \tag{2.3}$$
Figure 2.2. Probability density function.
2.2 Mean Time to Failure
Expected mean time to failure MTTF is defined as
$$\mathrm{MTTF} = E[T] = \int_0^\infty t f(t)\,dt \tag{2.4}$$
It can also be shown that
$$\mathrm{MTTF} = \int_0^\infty R(t)\,dt \tag{2.5}$$
Median time to failure $t_{med}$
$$R(t_{med}) = \Pr\{T \ge t_{med}\} = 0.5$$
Most likely observed failure time $t_{mod}$
$$f(t_{mod}) = \max_{0 \le t < \infty} f(t)$$
Failure distribution variance
$$\sigma^2 = E[(T - \mathrm{MTTF})^2] = \int_0^\infty (t - \mathrm{MTTF})^2 f(t)\,dt \tag{2.6}$$
Figure 2.3. Comparison between mean time to failure, median and mode time. (Axes: t, f(t); marked points: $t_{mod}$, $t_{med}$ and MTTF; label: area = 0.5.)
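As a minimal numerical sketch of equations (2.4) to (2.6), the following Python snippet evaluates the integrals with a simple trapezoidal rule. The exponential density with rate `LAM = 0.5`, the cut-off `T_MAX` that stands in for the infinite upper limit, and the helper `integrate` are assumptions made for illustration only; they are not part of the lecture templates.

```python
import math

LAM = 0.5      # assumed constant failure rate (illustrative value)
T_MAX = 60.0   # finite cut-off standing in for the infinite upper limit


def pdf(t):
    """Assumed density f(t) = LAM * exp(-LAM * t)."""
    return LAM * math.exp(-LAM * t)


def reliability(t):
    """Reliability R(t) that belongs to the assumed density."""
    return math.exp(-LAM * t)


def integrate(func, a, b, steps=20_000):
    """Composite trapezoidal rule on [a, b]."""
    h = (b - a) / steps
    total = 0.5 * (func(a) + func(b))
    for k in range(1, steps):
        total += func(a + k * h)
    return total * h


mttf_from_pdf = integrate(lambda t: t * pdf(t), 0.0, T_MAX)    # equation (2.4)
mttf_from_rel = integrate(reliability, 0.0, T_MAX)             # equation (2.5)
variance = integrate(                                          # equation (2.6)
    lambda t: (t - mttf_from_rel) ** 2 * pdf(t), 0.0, T_MAX)
t_med = math.log(2.0) / LAM   # solves R(t_med) = 0.5 for this density

print(mttf_from_pdf, mttf_from_rel, variance, t_med)
```

For this assumed density both MTTF integrals should agree closely, which is a quick check that (2.4) and (2.5) describe the same quantity.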
2.3 Hazard Rate Function
Hazard rate function λ(t) provides an instantaneous (at time t) rate of failure given that the system has survived to time t
$$\lambda(t) = \frac{f(t)}{R(t)} \tag{2.7}$$
Reliability as a function of hazard rate function
$$R(t) = \exp\left\{-\int_0^t \lambda(t)\,dt\right\} \tag{2.8}$$
Cumulative failure rate
$$L(t) = \int_0^t \lambda(t)\,dt \tag{2.9}$$
Average failure rate defined between two times t1 ≤ T ≤ t2
$$\mathrm{AFR}(t_1, t_2) = \frac{1}{t_2 - t_1}\int_{t_1}^{t_2} \lambda(t)\,dt \tag{2.10}$$
2.4 Bathtub Curve
Figure 2.4. Bathtub curve. (Axes: t, λ(t); regions: early failures, random failures, wearout failures.)
Table 2.1. Types and causes of failures and failure prevention.
Failure type | model | cause of failure | prevention
early failures | DFR | production failure, poor quality control, contamination, poor workmanship | burn-in testing, screening, quality control, acceptance testing
random failures | CFR | operating conditions, environmental conditions, human error, random events | redundancy, safety factor
wearout failures | IFR | material fatigue, corrosion, aging, wear | derating, preventive maintenance, parts replacement, technology
2.5 Conditional Reliability
Conditional reliability is the probability that the product will be in operation
until the moment t provided that it was in the operating state after the burn-in
period or the warranty period T0.
$$R(t \mid T_0) = \frac{R(t + T_0)}{R(T_0)} \tag{2.11}$$
Residual mean time to failure (MTTF taking into account $T_0$)
$$\mathrm{MTTF}(T_0) = \int_0^\infty R(t \mid T_0)\,dt = \frac{1}{R(T_0)}\int_{T_0}^\infty R(t)\,dt \tag{2.12}$$
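Equation (2.11) can be illustrated with a short Python sketch. It uses the Weibull reliability function from Section 2.6.2; the parameter values (θ = 1000, β = 0.5, burn-in period of 50 time units) and the function names are hypothetical and serve only to show the effect of a burn-in period on a product with a decreasing failure rate.

```python
import math


def weibull_reliability(t, theta, beta):
    """Weibull reliability R(t) = exp(-(t/theta)^beta), equation (2.18)."""
    return math.exp(-((t / theta) ** beta))


def conditional_reliability(t, t0, reliability):
    """Equation (2.11): reliability for a further time t after surviving t0."""
    return reliability(t + t0) / reliability(t0)


def dfr(t):
    """Hypothetical decreasing-failure-rate product (beta < 1)."""
    return weibull_reliability(t, theta=1000.0, beta=0.5)


def cfr(t):
    """Hypothetical constant-failure-rate product (beta = 1)."""
    return weibull_reliability(t, theta=1000.0, beta=1.0)


mission = 100.0
burn_in = 50.0
print(dfr(mission), conditional_reliability(mission, burn_in, dfr))
print(cfr(mission), conditional_reliability(mission, burn_in, cfr))
```

Under these assumptions the burn-in raises the mission reliability of the DFR product, while it leaves the CFR product unchanged.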
2.6 Probability Models
2.6.1 The Exponential Distribution
Reliability function
$$R(t) = \exp\left\{-\lambda\int_0^t dt\right\} = e^{-\lambda t} \tag{2.13}$$
Cumulative distribution function of failures
$$F(t) = 1 - e^{-\lambda t} \tag{2.14}$$
Exponential probability density function
$$f(t) = -\frac{dR(t)}{dt} = \lambda e^{-\lambda t} \tag{2.15}$$
Figure 2.5. The exponential probability density function. (Axes: t, f(t); curves for λ = 5, λ = 1 and λ = 0.25.)
Mean time to failure
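$$\mathrm{MTTF} = \int_0^\infty R(t)\,dt = \int_0^\infty e^{-\lambda t}\,dt = \left.\frac{e^{-\lambda t}}{-\lambda}\right|_0^\infty = \frac{1}{\lambda} \tag{2.16}$$
Failure distribution variance
$$\sigma^2 = \int_0^\infty (t - \mathrm{MTTF})^2 f(t)\,dt = \int_0^\infty (t - 1/\lambda)^2 \lambda e^{-\lambda t}\,dt = \frac{1}{\lambda^2} \tag{2.17}$$
The probability that the product (component) will be in operation until MTTF is
$$R(\mathrm{MTTF}) = e^{-\mathrm{MTTF}/\mathrm{MTTF}} = e^{-1} = 0.367879$$
The CFR model has no memory.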
2.6.2 The Weibull Distribution
Hazard rate function
$$\lambda(t) = \frac{\beta}{\theta}\left(\frac{t}{\theta}\right)^{\beta-1}$$
for θ > 0, β > 0 and t ≥ 0.
Reliability function
$$R(t) = \exp\left\{-\int_0^t \frac{\beta}{\theta}\left(\frac{t}{\theta}\right)^{\beta-1} dt\right\} = e^{-\left(\frac{t}{\theta}\right)^{\beta}} \tag{2.18}$$
Probability density function
$$f(t) = \frac{\beta}{\theta}\left(\frac{t}{\theta}\right)^{\beta-1} e^{-\left(\frac{t}{\theta}\right)^{\beta}} \tag{2.19}$$
Figure 2.6. The Weibull probability density function for θ = 2.0. (Axes: t, f(t); curves for β = 0.5, β = 1.5, β = 2.0 and β = 4.0.)
Figure 2.7. The Normal probability density function for µ = 2.0. (Axes: t, f(t); curves for σ = 0.2, σ = 0.5 and σ = 1.0.)
Mean time to failure
$$\mathrm{MTTF} = \theta\,\Gamma[1 + 1/\beta] \tag{2.20}$$
Failure distribution variance
$$\sigma^2 = \theta^2\left(\Gamma[1 + 2/\beta] - \Gamma^2[1 + 1/\beta]\right) \tag{2.21}$$
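Equations (2.20) and (2.21) can be evaluated directly with the gamma function from the Python standard library. The sketch below is illustrative; the function names and the parameter values in the usage lines are assumptions.

```python
import math


def weibull_mttf(theta, beta):
    """Equation (2.20): MTTF = theta * Gamma(1 + 1/beta)."""
    return theta * math.gamma(1.0 + 1.0 / beta)


def weibull_variance(theta, beta):
    """Equation (2.21): theta^2 * (Gamma(1 + 2/beta) - Gamma(1 + 1/beta)^2)."""
    g1 = math.gamma(1.0 + 1.0 / beta)
    g2 = math.gamma(1.0 + 2.0 / beta)
    return theta ** 2 * (g2 - g1 ** 2)


# Illustrative values: theta = 2.0 as in Figure 2.6, several shape parameters.
for beta in (0.5, 1.0, 1.5, 2.0, 4.0):
    print(beta, weibull_mttf(2.0, beta), weibull_variance(2.0, beta))
```

For β = 1 the results reduce to the exponential case with λ = 1/θ, that is MTTF = θ and σ² = θ².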
2.6.3 The Normal Distribution
Probability density function
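$$f(t) = \frac{1}{\sqrt{2\pi}\,\sigma}\exp\left\{-\frac{(t-\mu)^2}{2\sigma^2}\right\} \tag{2.22}$$
Reliability function
$$R(t) = \frac{1}{\sqrt{2\pi}\,\sigma}\int_t^\infty \exp\left\{-\frac{(t-\mu)^2}{2\sigma^2}\right\} dt \tag{2.23}$$
Mean time to failure
$$\mathrm{MTTF} = \mu \tag{2.24}$$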
2.6.4 The Log-normal Distribution
Figure 2.8. The Log-normal probability density function for $t_{med}$ = 2.0. (Axes: t, f(t); curves for s = 0.1, s = 1.0 and s = 2.0.)
Probability density function
$$f(t) = \frac{1}{\sqrt{2\pi}\,t\,s}\exp\left\{-\frac{1}{2}\frac{(\ln(t/t_{med}))^2}{s^2}\right\} \tag{2.25}$$
Reliability function expressed by Laplace function
$$R(t) = 1 - \Phi\left((\ln t - \ln t_{med})/s\right) \tag{2.26}$$
Mean time to failure
$$\mathrm{MTTF} = t_{med}\exp(s^2/2) \tag{2.27}$$
Failure distribution variance
$$\sigma^2 = t_{med}^2 \exp(s^2)\left(\exp(s^2) - 1\right) \tag{2.28}$$
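A minimal Python sketch of equations (2.26) to (2.28) follows. It assumes that Φ in (2.26) denotes the standard normal cumulative distribution function, which is expressed here through `math.erf`; the function names and the parameter values in the usage lines are illustrative.

```python
import math


def std_normal_cdf(z):
    """Standard normal CDF (assumed meaning of Phi in equation (2.26))."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def lognormal_reliability(t, t_med, s):
    """Equation (2.26), valid for t > 0."""
    return 1.0 - std_normal_cdf((math.log(t) - math.log(t_med)) / s)


def lognormal_mttf(t_med, s):
    """Equation (2.27)."""
    return t_med * math.exp(s ** 2 / 2.0)


def lognormal_variance(t_med, s):
    """Equation (2.28)."""
    return t_med ** 2 * math.exp(s ** 2) * (math.exp(s ** 2) - 1.0)


# Illustrative values: t_med = 2.0 as in Figure 2.8.
for s in (0.1, 1.0, 2.0):
    print(s, lognormal_reliability(3.0, 2.0, s),
          lognormal_mttf(2.0, s), lognormal_variance(2.0, s))
```

By construction the reliability at t = t_med equals 0.5 for any shape parameter s, which matches the definition of the median time to failure.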
Chapter 3
Reliability of Systems
Figure 3.1. Reliability block diagram of a system. (Hierarchy levels shown: system, product, assembly, sub-assembly, element, component, sub-component.)
Figure 3.2. Reliability block diagram for n components in series (a) and in parallel (b).
3.1 Series and Parallel Configurations
Reliability of a system made of n components in series
$$R_s(t) = \prod_{i=1}^{n} R_i(t) \tag{3.1}$$
Reliability of a system made of n statistically independent components in parallel
$$R_s(t) = 1 - \prod_{i=1}^{n} (1 - R_i(t)) \tag{3.2}$$
Table 3.1. The influence of component number n (reliability of each component R = 0.9) on system reliability for serial and parallel configurations.
n | serial | parallel
1 | 9.000000E-01 | 9.000000E-01
2 | 8.100000E-01 | 9.900000E-01
3 | 7.290000E-01 | 9.990000E-01
4 | 6.561000E-01 | 9.999000E-01
5 | 5.904900E-01 | 9.999900E-01
6 | 5.314410E-01 | 9.999990E-01
7 | 4.782969E-01 | 9.999999E-01
8 | 4.304672E-01 | 1.000000E+00
9 | 3.874205E-01 | 1.000000E+00
10 | 3.486784E-01 | 1.000000E+00
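Equations (3.1) and (3.2) translate directly into code. The following Python sketch, with illustrative function names, recomputes the serial and parallel columns of Table 3.1 for R = 0.9.

```python
import math


def series_reliability(reliabilities):
    """Equation (3.1): product of the component reliabilities."""
    return math.prod(reliabilities)


def parallel_reliability(reliabilities):
    """Equation (3.2): one minus the product of the component unreliabilities."""
    return 1.0 - math.prod(1.0 - r for r in reliabilities)


# Recompute Table 3.1 for n = 1..10 identical components with R = 0.9.
for n in range(1, 11):
    components = [0.9] * n
    print(f"{n:2d}  {series_reliability(components):.6E}  "
          f"{parallel_reliability(components):.6E}")
```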
3.2 Parallel configurations k out of n
A generalization of n parallel components occurs when a requirement exists for k out of n identical and independent components to function for the system to function. Probability that at least k out of n statistically independent parallel components will function is
$$R_s(t) = \sum_{i=k}^{n} \binom{n}{i} R(t)^i (1 - R(t))^{n-i} \tag{3.3}$$
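A short Python sketch of equation (3.3) is given below. The function name and the example values are assumptions for illustration; `math.comb` supplies the binomial coefficient.

```python
import math


def k_out_of_n_reliability(k, n, r):
    """Equation (3.3): probability that at least k of n identical,
    independent components with reliability r are functioning."""
    return sum(math.comb(n, i) * r ** i * (1.0 - r) ** (n - i)
               for i in range(k, n + 1))


# Illustrative case: n = 3 components with R = 0.9.
for k in (1, 2, 3):
    print(k, k_out_of_n_reliability(k, 3, 0.9))
```

The limiting cases are a useful check: k = 1 reproduces the parallel configuration (3.2) and k = n the series configuration (3.1).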
3.3 Combined Series-Parallel Systems
Figure 3.3. Combined series-parallel system.
3.4 Complex Configurations
For certain systems, the component configuration is such that the system relia-
bility cannot be simply decomposed into series and parallel relationship. Such
systems can be solved using the following procedures:
• decomposition,
• enumeration,
• minimal paths,
• minimal cuts and
• Monte Carlo simulations.
3.4.1 Decomposition
Figure 3.4. Decomposition of a linked network: complex configuration (a), component 5 does not fail (b) and component 5 fails (c).
3.4.2 Enumeration
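Figure 3.5. Enumeration method for complex configuration from Figure 3.4. (S = operates, F = has failed.)
i | 1 | 2 | 3 | 4 | 5 | system | Pi (system operates) | Pi (system has failed)
1 | S | S | S | S | S | S | 0.584820 |
2 | F | S | S | S | S | S | 0.064980 |
3 | S | F | S | S | S | S | 0.064980 |
4 | S | S | F | S | S | S | 0.030780 |
5 | S | S | S | F | S | S | 0.030780 |
6 | S | S | S | S | F | S | 0.146205 |
7 | F | F | S | S | S | F | | 0.007220
8 | S | F | F | S | S | S | 0.003420 |
9 | S | S | F | F | S | F | | 0.001620
10 | S | S | S | F | F | S | 0.007695 |
11 | F | S | F | S | S | S | 0.003420 |
12 | F | S | S | F | S | S | 0.003420 |
13 | F | S | S | S | F | S | 0.016245 |
14 | S | F | S | F | S | S | 0.003420 |
15 | S | F | S | S | F | S | 0.016245 |
16 | S | S | F | S | F | S | 0.007695 |
17 | F | F | F | S | S | F | | 0.000380
18 | S | F | F | F | S | F | | 0.000180
19 | S | S | F | F | F | F | | 0.000405
20 | F | S | F | F | S | F | | 0.000180
21 | F | S | S | F | F | F | | 0.000855
22 | S | F | S | F | F | S | 0.000855 |
23 | F | F | S | F | S | F | | 0.000380
24 | F | F | S | S | F | F | | 0.001805
25 | S | F | F | S | F | F | | 0.000855
26 | F | S | F | S | F | S | 0.000855 |
27 | F | F | F | F | S | F | | 0.000020
28 | S | F | F | F | F | F | | 0.000045
29 | F | S | F | F | F | F | | 0.000045
30 | F | F | S | F | F | F | | 0.000095
31 | F | F | F | S | F | F | | 0.000095
32 | F | F | F | F | F | F | | 0.000005
Sum | | | | | | | 0.985815 | 0.014185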
3.4.3 The System Structure Function
A very general approach for analyzing the reliability of complex systems is through the use of the system structure function. First, the component state needs to be defined by
$$X_i = \begin{cases} 1 & \text{if component operates} \\ 0 & \text{if component has failed} \end{cases} \tag{3.4}$$
If the system configuration consists of n components there are $2^n$ possible states. Each state is defined by the state vector
$$\mathbf{X} = [X_1, \dots, X_n] \tag{3.5}$$
and the system structure function
$$\Psi(\mathbf{X}) = \begin{cases} 1 & \text{if system operates} \\ 0 & \text{if system has failed} \end{cases} \tag{3.6}$$
Therefore for a series system the system structure function is defined by
$$\Psi(\mathbf{X}) = \prod_{i=1}^{n} X_i = \min\{X_1, \dots, X_n\}$$
and for a parallel system by
$$\Psi(\mathbf{X}) = 1 - \prod_{i=1}^{n} (1 - X_i) = \max\{X_1, \dots, X_n\}$$
Reliability of the system and the system structure function are interconnected
$$R_s = E[\Psi(\mathbf{X})] = 0 \cdot \Pr\{\Psi(\mathbf{X}) = 0\} + 1 \cdot \Pr\{\Psi(\mathbf{X}) = 1\} \tag{3.7}$$
From (3.4) and (3.7) it follows
$$X_i^n = X_i, \qquad E[X_i] = R_i$$
For the system composed of n statistically independent series components it applies
$$\begin{aligned} R_s &= \Pr\{\Psi(\mathbf{X}) = 1\} \\ &= \Pr\{\min\{X_1, \dots, X_n\} = 1\} \\ &= \Pr\{X_1 = 1, \dots, X_n = 1\} \\ &= \Pr\{X_1 = 1\} \cdots \Pr\{X_n = 1\} \\ &= R_1 \cdots R_n \end{aligned}$$
and for the system composed of n statistically independent parallel components it applies
$$\begin{aligned} R_s &= \Pr\{\Psi(\mathbf{X}) = 1\} \\ &= 1 - \Pr\{X_1 = 0, \dots, X_n = 0\} \\ &= 1 - \Pr\{X_1 = 0\} \cdots \Pr\{X_n = 0\} \\ &= 1 - (1 - R_1) \cdots (1 - R_n) \end{aligned}$$
3.4.4 Coherent Systems
A system is coherent when a component reliability improvement does not degrade the system reliability. A coherent system has a structure function that is monotonically non-decreasing. That is, if Y ≥ X then Ψ(Y) ≥ Ψ(X).
3.4.5 Minimal Path and Cut Sets
A path is a set of components whose functioning ensures that the system functions. A minimal path $P_i$ is one in which all the components within the set must function for the system to function. The number of minimal paths is bounded from above and denoted by p. A state vector X is a vector of a minimal path when Ψ(X) = 1 and Ψ(Y) = 0 for each Y < X. The system structure function is derived from the rules that apply to reliability calculation
$$\Psi(\mathbf{X}) = 1 - \prod_{i=1}^{p}\Bigl(1 - \prod_{j \in P_i} X_j\Bigr) \tag{3.8}$$
Table 3.2. Minimal path sets for complex configuration from Figure 3.4.
i | minimal path Pi | vector of minimal path Xi
1 | {1,3} | [1,0,1,0,0]
2 | {2,4} | [0,1,0,1,0]
3 | {1,4,5} | [1,0,0,1,1]
p = 4 | {2,3,5} | [0,1,1,0,1]
Figure 3.6. Replacement block diagram formed from path sets.
A cut is a set of components whose failure will result in a system failure. A minimal cut $C_i$ is one in which all the components must fail in order for the system to fail. The number of minimal cuts is bounded from above and denoted by c. X is a vector of a minimal cut when Ψ(X) = 0 and Ψ(Y) = 1 for each Y > X. The system structure function is also derived from the rules that apply to reliability calculation
$$\Psi(\mathbf{X}) = \prod_{i=1}^{c}\Bigl(1 - \prod_{j \in C_i} (1 - X_j)\Bigr) \tag{3.9}$$
Table 3.3. Minimal cut sets for complex configuration from Figure 3.4.
i | minimal cut Ci | vector of minimal cut Xi
1 | {1,2} | [0,0,1,1,1]
2 | {3,4} | [1,1,0,0,1]
3 | {1,4,5} | [0,1,1,0,0]
c = 4 | {2,3,5} | [1,0,0,1,0]
Figure 3.7. Replacement block diagram formed from cut sets.
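The structure functions (3.8) and (3.9) and the enumeration method can be sketched in Python as follows. The minimal path and cut sets are those of Tables 3.2 and 3.3. The component reliabilities in `REL` are an assumption: they are not stated in the text and were chosen so that the state probabilities match those listed in Figure 3.5. All function names are illustrative.

```python
import math
from itertools import product

MIN_PATHS = [(1, 3), (2, 4), (1, 4, 5), (2, 3, 5)]   # Table 3.2
MIN_CUTS = [(1, 2), (3, 4), (1, 4, 5), (2, 3, 5)]    # Table 3.3
# Assumed component reliabilities (inferred from Figure 3.5, not given in the text).
REL = {1: 0.9, 2: 0.9, 3: 0.95, 4: 0.95, 5: 0.8}


def psi_from_paths(x):
    """Equation (3.8): structure function built from the minimal paths."""
    return 1 - math.prod(1 - math.prod(x[j] for j in path) for path in MIN_PATHS)


def psi_from_cuts(x):
    """Equation (3.9): structure function built from the minimal cuts."""
    return math.prod(1 - math.prod(1 - x[j] for j in cut) for cut in MIN_CUTS)


def all_states():
    """Yield every state vector as a dict {component: 0 or 1}."""
    for bits in product((1, 0), repeat=len(REL)):
        yield dict(zip(sorted(REL), bits))


def state_probability(x):
    """Probability of one state for statistically independent components."""
    return math.prod(REL[j] if x[j] else 1.0 - REL[j] for j in REL)


def system_reliability():
    """Enumeration: sum the probabilities of all states in which the system operates."""
    return sum(state_probability(x) * psi_from_paths(x) for x in all_states())


print(system_reliability())
```

Both structure functions should return the same value for each of the 32 states, since the path sets and the cut sets describe the same configuration.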
3.4.6 System Bounds
The rough upper-bound reliability is given by
$$R_{su} = 1 - \prod_{i=1}^{n} (1 - R_i) \tag{3.10}$$
The rough lower-bound reliability is given by
$$R_{sl} = \prod_{i=1}^{n} R_i \tag{3.11}$$
The estimation of upper-bound reliability is given by
$$R_{su} = 1 - \prod_{i=1}^{p}\Bigl(1 - \prod_{j \in P_i} R_j\Bigr) \tag{3.12}$$
The estimation of lower-bound reliability is given by
$$R_{sl} = \prod_{i=1}^{c}\Bigl(1 - \prod_{j \in C_i} (1 - R_j)\Bigr) \tag{3.13}$$
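The bounds (3.10) to (3.13) can be compared numerically for the configuration from Figure 3.4. In the Python sketch below the path and cut sets come from Tables 3.2 and 3.3, while the component reliabilities are assumed values inferred from the probabilities in Figure 3.5; the function names are illustrative.

```python
import math

MIN_PATHS = [(1, 3), (2, 4), (1, 4, 5), (2, 3, 5)]   # Table 3.2
MIN_CUTS = [(1, 2), (3, 4), (1, 4, 5), (2, 3, 5)]    # Table 3.3
# Assumed component reliabilities (inferred from Figure 3.5, not given in the text).
REL = {1: 0.9, 2: 0.9, 3: 0.95, 4: 0.95, 5: 0.8}


def rough_upper_bound(rel):
    """Equation (3.10): all components treated as if in parallel."""
    return 1.0 - math.prod(1.0 - r for r in rel.values())


def rough_lower_bound(rel):
    """Equation (3.11): all components treated as if in series."""
    return math.prod(rel.values())


def path_upper_bound(rel, paths):
    """Equation (3.12): minimal paths treated as independent parallel branches."""
    return 1.0 - math.prod(1.0 - math.prod(rel[j] for j in p) for p in paths)


def cut_lower_bound(rel, cuts):
    """Equation (3.13): minimal cuts treated as independent series blocks."""
    return math.prod(1.0 - math.prod(1.0 - rel[j] for j in c) for c in cuts)


print(rough_lower_bound(REL), cut_lower_bound(REL, MIN_CUTS),
      path_upper_bound(REL, MIN_PATHS), rough_upper_bound(REL))
```

With these assumed values the system reliability 0.985815 obtained by enumeration in Figure 3.5 lies between the cut-based lower bound and the path-based upper bound, and both are much tighter than the rough bounds.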
3.5 Low and High Level Redundancy
Reliability of a system with a low level redundancy
$$R_l = (1 - (1 - R)^2)^2 = R^2 (2 - R)^2$$
Reliability of a system with a high level redundancy
$$R_h = 1 - (1 - R^2)^2 = R^2 (2 - R^2)$$
Figure 3.8. Series configuration (a), low level redundancy (b) and high level redundancy (c).
3.6 Three State Devices
Three state devices typically have, besides an operating state, two failure modes. For example, a valve or an electrical switch may fail at closing (short failure) or at opening (open failure).
3.6.1 Series Structure
Figure 3.9. Series configuration. Failure of both valves at closing (a), failure of valve no. 1 at opening (b), failure of valve no. 2 at opening (c) and failure of both valves at opening (d).
3.6.2 Parallel Structure
Figure 3.10. Parallel configuration. Failure of both valves at opening (a), failure of valve no. 1 at closing (b), failure of valve no. 2 at closing (c) and failure of both valves at closing (d).
3.6.3 Low Level Redundancy
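Reliability of a system with low level redundancy is calculated by
$$R_l = 1 - (F_o + F_s)$$
where $F_o$ and $F_s$ are the probabilities of system failure at opening and at closing. In general it can be written
$$R_l = \prod_{i=1}^{n} (1 - q_{oi}^m) - \prod_{i=1}^{n} (1 - (1 - q_{si})^m) \tag{3.14}$$
Figure 3.11. Block diagram of a system composed of n assemblies in series where each assembly has m three state components in parallel.
Figure 3.12. Low level redundancy for n = m = 2.
Table 3.4. Low level redundancy for n = m = 2, $q_{o1}$ = 0.05, $q_{o2}$ = 0.06, $q_{s1}$ = 0.15, $q_{s2}$ = 0.10, $P_{s7} = (1 - q_{s1}) q_{s1} q_{s2} (1 - q_{s2})$ = 0.011475, $P_{o8} = (1 - q_{o1})^2 q_{o2}^2$ = 0.003249 and $R_l = 1 - (F_o + F_s)$ = 0.941184. (S = operates, F = has failed.)
i | 1 left valve | 1 right valve | 2 left valve | 2 right valve | system at closing | Psi at closing | system at opening | Poi at opening
1 | S | S | S | S | S | | S |
2 | F | S | S | S | S | | S |
3 | S | S | F | S | S | | S |
4 | S | F | S | S | S | | S |
5 | S | S | S | F | S | | S |
6 | F | S | F | S | S | | F | 0.002209
7 | S | F | F | S | F | 0.011475 | S |
8 | S | F | S | F | S | | F | 0.003249
9 | F | S | S | F | F | 0.011475 | S |
10 | F | F | S | S | F | 0.011475 | S |
11 | S | S | F | F | F | 0.011475 | S |
12 | F | F | F | S | F | 0.002025 | F | 0.000141
13 | S | F | F | F | F | 0.001275 | F | 0.000171
14 | F | F | S | F | F | 0.001275 | F | 0.000171
15 | F | S | F | F | F | 0.002025 | F | 0.000141
16 | F | F | F | F | F | 0.000225 | F | 0.000009
Sum | | | | | | Fs = 0.052725 | | Fo = 0.006091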
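Equation (3.14) can be checked against the values in Table 3.4 with a few lines of Python. The function name and argument names are illustrative; `q_open` and `q_short` hold the probabilities of failure at opening and at closing for each of the n assemblies.

```python
import math


def low_level_reliability(q_open, q_short, m):
    """Equation (3.14) for n assemblies in series, each with m identical
    three state components in parallel."""
    no_open_failure = math.prod(1.0 - qo ** m for qo in q_open)
    short_failure = math.prod(1.0 - (1.0 - qs) ** m for qs in q_short)
    return no_open_failure - short_failure


# Data of Table 3.4: n = m = 2.
q_open = [0.05, 0.06]
q_short = [0.15, 0.10]
f_open = 1.0 - math.prod(1.0 - qo ** 2 for qo in q_open)         # Fo
f_short = math.prod(1.0 - (1.0 - qs) ** 2 for qs in q_short)     # Fs
print(f_open, f_short, low_level_reliability(q_open, q_short, 2))
```

The two failure probabilities computed this way agree with the column sums of Table 3.4, and their sum subtracted from one gives the system reliability.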
3.6.4 High Level Redundancy
Reliability of a system with high level redundancy is calculated by
$$R_h = \left(1 - \prod_{i=1}^{m} q_{si}\right)^{n} - \left(1 - \prod_{i=1}^{m} (1 - q_{oi})\right)^{n} \tag{3.15}$$
Figure 3.13. Block diagram of a system composed of n assemblies in parallel where each assembly has m three state components in series.
Chapter 4
State Dependent Systems
4.1 Markov Analysis
Table 4.1. Possible states of the system. (S = operates, F = has failed.)
state | component 1 | component 2 | system, parallel conf. | system, serial conf.
1 | S | S | S | S
2 | F | S | S | F
3 | S | F | S | F
4 | F | F | F | F
For a two component series system the system reliability is
$$R_s(t) = P_1(t)$$
and for a two component parallel system it is
$$R_s(t) = P_1(t) + P_2(t) + P_3(t)$$
Since the system must be in one of the states we can further write
$$P_1(t) + P_2(t) + P_3(t) + P_4(t) = 1 \tag{4.1}$$
If the system is in state no. 1 at time t, then the probability that it will still be in the same state at time t + ∆t equals the probability $P_1(t)$ reduced by the probability of transition to state no. 2 or 3. The probability that a failure occurs in the time interval t ≤ T ≤ t + ∆t, given that component no. 1 operates until time t, is
$$\Pr\{t \le T \le t + \Delta t \mid T \ge t\} = \frac{R_1(t) - R_1(t + \Delta t)}{R_1(t)}$$
The probability that the system will be in state no. 1 at time t and will go into state no. 2 during the time interval t ≤ T ≤ t + ∆t corresponds to the product of the probabilities of both events
$$\frac{R_1(t) - R_1(t + \Delta t)}{R_1(t)} P_1(t)$$
The probability that the system will still be in state no. 1 at time t + ∆t is therefore
$$P_1(t + \Delta t) = P_1(t) - \frac{R_1(t) - R_1(t + \Delta t)}{R_1(t)} P_1(t) - \frac{R_2(t) - R_2(t + \Delta t)}{R_2(t)} P_1(t) \tag{4.2}$$
The probability that the system will be in state no. 2 at time t + ∆t corresponds to the probability that the system is in state no. 2 at time t, increased by the probability of its transition from state no. 1 to state no. 2 and decreased by the probability of its transition from state no. 2 to state no. 4
$$P_2(t + \Delta t) = P_2(t) + \frac{R_1(t) - R_1(t + \Delta t)}{R_1(t)} P_1(t) - \frac{R_2(t) - R_2(t + \Delta t)}{R_2(t)} P_2(t) \tag{4.3}$$
Similarly we can derive the probability that the system at time t + ∆t is going to be in state no. 3
$$P_3(t + \Delta t) = P_3(t) + \frac{R_2(t) - R_2(t + \Delta t)}{R_2(t)} P_1(t) - \frac{R_1(t) - R_1(t + \Delta t)}{R_1(t)} P_3(t) \tag{4.4}$$
or in state no. 4
$$P_4(t + \Delta t) = P_4(t) + \frac{R_2(t) - R_2(t + \Delta t)}{R_2(t)} P_2(t) + \frac{R_1(t) - R_1(t + \Delta t)}{R_1(t)} P_3(t) \tag{4.5}$$
If we divide equation (4.2) by ∆t we get
$$\frac{P_1(t + \Delta t) - P_1(t)}{\Delta t} = -\frac{R_1(t) - R_1(t + \Delta t)}{\Delta t\, R_1(t)} P_1(t) - \frac{R_2(t) - R_2(t + \Delta t)}{\Delta t\, R_2(t)} P_1(t)$$
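Because
$$\lim_{\Delta t \to 0} \frac{P(t + \Delta t) - P(t)}{\Delta t} = \frac{dP(t)}{dt} = P'(t)$$
and
$$\lim_{\Delta t \to 0} -\frac{R(t + \Delta t) - R(t)}{\Delta t} = f(t)$$
and, according to (2.7), $\lambda_i(t) = f_i(t)/R_i(t)$, equations (4.2) to (4.5) form a system of first-order ordinary differential equations
$$\begin{bmatrix} P_1'(t) \\ P_2'(t) \\ P_3'(t) \\ P_4'(t) \end{bmatrix} = \begin{bmatrix} -\lambda_1 - \lambda_2 & 0 & 0 & 0 \\ \lambda_1 & -\lambda_2 & 0 & 0 \\ \lambda_2 & 0 & -\lambda_1 & 0 \\ 0 & \lambda_2 & \lambda_1 & 0 \end{bmatrix} \begin{bmatrix} P_1(t) \\ P_2(t) \\ P_3(t) \\ P_4(t) \end{bmatrix} \tag{4.6}$$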
If failure rates are constant then the system of differential equations is analytically
solvable. Probability that the system will be in state no. 1 can be calculated by
an integration method for separating variables∫dP1(t)
P1(t)= −(λ1 + λ2)
∫dt+ C1
from there it follows
lnP1(t) = −(λ1 + λ2)t+ C1
and
P1(t) = e−(λ1+λ2)t+C1
Value of the integration constant C1 is calculated from the condition P1(0) = 1
which says that at time t = 0 the system is in state no. 1 with probability of 1.
From there it follows C1 = 0 and
P1(t) = e−(λ1+λ2)t (4.7)
Probability that the system at time t is in state no. 2 is a result of a linear
differential equation
P ′2(t) + λ2P2(t) = λ1e−(λ1+λ2)t
where
P2(t) = e−λ2∫dt
{λ1
∫e−(λ1+λ2)t+λ2
∫dtdt+ C2
}or
P2(t) = −e−(λ1+λ2)t + C2e−λ2t
Value of the integration constant C2 = 1 comes from boundary condition P2(0) =
0. Probability that the system at time t will be in state no. 2 is therefore
P2(t) = −e−(λ1+λ2)t + e−λ2t (4.8)
Similarly, we can derive the following equation
$$P_3(t) = e^{-\lambda_1 t} - e^{-(\lambda_1+\lambda_2)t} \tag{4.9}$$
The probability that the system will be in state no. 4 is
$$P_4(t) = 1 - P_1(t) - P_2(t) - P_3(t) \tag{4.10}$$
Whilst the reliability of a system composed of two components in series equals
$$R_s(t) = P_1(t) = e^{-(\lambda_1+\lambda_2)t} \tag{4.11}$$
for a system composed of two components in parallel it is
$$R_s(t) = P_1(t) + P_2(t) + P_3(t) = e^{-\lambda_1 t} + e^{-\lambda_2 t} - e^{-(\lambda_1+\lambda_2)t} \tag{4.12}$$
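The closed-form state probabilities (4.7) to (4.10) can be cross-checked by integrating the system (4.6) numerically. The following is a minimal sketch, not part of the original notes: the function names, the Runge-Kutta integrator and the failure rates are illustrative assumptions.

```python
import math

def two_component_markov(lam1, lam2, t_end, steps=2000):
    """Integrate system (4.6) with classical RK4; returns [P1, P2, P3, P4] at t_end."""
    def deriv(p):
        p1, p2, p3, _ = p
        return [-(lam1 + lam2) * p1,
                lam1 * p1 - lam2 * p2,
                lam2 * p1 - lam1 * p3,
                lam2 * p2 + lam1 * p3]

    p = [1.0, 0.0, 0.0, 0.0]          # the system starts in state no. 1
    h = t_end / steps
    for _ in range(steps):
        k1 = deriv(p)
        k2 = deriv([x + 0.5 * h * k for x, k in zip(p, k1)])
        k3 = deriv([x + 0.5 * h * k for x, k in zip(p, k2)])
        k4 = deriv([x + h * k for x, k in zip(p, k3)])
        p = [x + h / 6 * (a + 2 * b + 2 * c + d)
             for x, a, b, c, d in zip(p, k1, k2, k3, k4)]
    return p

def closed_form(lam1, lam2, t):
    """State probabilities from equations (4.7) to (4.10)."""
    p1 = math.exp(-(lam1 + lam2) * t)
    p2 = math.exp(-lam2 * t) - p1
    p3 = math.exp(-lam1 * t) - p1
    return [p1, p2, p3, 1.0 - p1 - p2 - p3]

# Hypothetical failure rates, chosen only for illustration.
numeric = two_component_markov(0.5, 0.2, 3.0)
exact = closed_form(0.5, 0.2, 3.0)
series_reliability = exact[0]                # equation (4.11)
parallel_reliability = sum(exact[:3])        # equation (4.12)
```

Under these assumptions the numerical and analytical probabilities should agree to well below plotting accuracy, and the four probabilities sum to one at every time.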
4.2 Load Sharing Systems
Figure 4.1. Block and rate diagrams for a two component load sharing system.
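A system of differential equations for a load-sharing system
$$\begin{bmatrix} P_1'(t)\\ P_2'(t)\\ P_3'(t)\\ P_4'(t) \end{bmatrix} =
\begin{bmatrix}
-\lambda_1-\lambda_2 & 0 & 0 & 0\\
\lambda_1 & -\lambda_2^+ & 0 & 0\\
\lambda_2 & 0 & -\lambda_1^+ & 0\\
0 & \lambda_2^+ & \lambda_1^+ & 0
\end{bmatrix}
\begin{bmatrix} P_1(t)\\ P_2(t)\\ P_3(t)\\ P_4(t) \end{bmatrix} \tag{4.13}$$
Solution of the system of differential equations
$$\begin{aligned}
P_1(t) &= e^{-(\lambda_1+\lambda_2)t}\\
P_2(t) &= \frac{\lambda_1}{\lambda_1+\lambda_2-\lambda_2^+}\left(e^{-\lambda_2^+ t} - e^{-(\lambda_1+\lambda_2)t}\right)\\
P_3(t) &= \frac{\lambda_2}{\lambda_1+\lambda_2-\lambda_1^+}\left(e^{-\lambda_1^+ t} - e^{-(\lambda_1+\lambda_2)t}\right)
\end{aligned} \tag{4.14}$$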
4.3 Standby Systems or Passive Parallel
Configurations
Figure 4.2. Block and rate diagrams for a two component standby system with failures in standby (p = 0).
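A system of differential equations for a standby system without switching failure ($p = 0$)
$$\begin{bmatrix} P_1'(t)\\ P_2'(t)\\ P_3'(t)\\ P_4'(t) \end{bmatrix} =
\begin{bmatrix}
-\lambda_1-\lambda_2^- & 0 & 0 & 0\\
\lambda_1 & -\lambda_2 & 0 & 0\\
\lambda_2^- & 0 & -\lambda_1 & 0\\
0 & \lambda_2 & \lambda_1 & 0
\end{bmatrix}
\begin{bmatrix} P_1(t)\\ P_2(t)\\ P_3(t)\\ P_4(t) \end{bmatrix} \tag{4.15}$$
Solution of the system of differential equations
$$\begin{aligned}
P_1(t) &= e^{-(\lambda_1+\lambda_2^-)t}\\
P_2(t) &= \frac{\lambda_1}{\lambda_1+\lambda_2^- -\lambda_2}\left(e^{-\lambda_2 t} - e^{-(\lambda_1+\lambda_2^-)t}\right)\\
P_3(t) &= e^{-\lambda_1 t} - e^{-(\lambda_1+\lambda_2^-)t}
\end{aligned} \tag{4.16}$$
System reliability
$$R_s(t) = e^{-\lambda_1 t} + \frac{\lambda_1}{\lambda_1+\lambda_2^- -\lambda_2}\left(e^{-\lambda_2 t} - e^{-(\lambda_1+\lambda_2^-)t}\right) \tag{4.17}$$
Mean time to failure MTTF
$$\mathrm{MTTF} = \frac{1}{\lambda_1} + \frac{\lambda_1}{\lambda_2(\lambda_1+\lambda_2^-)} \tag{4.18}$$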
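As a plausibility check, equation (4.18) can be compared with a numerical integral of the reliability (4.17). This is only a sketch: the function names, the trapezoidal rule and the rates are assumptions made for illustration, and the formula requires $\lambda_1 + \lambda_2^- \neq \lambda_2$.

```python
import math

def standby_reliability(t, lam1, lam2, lam2_standby):
    """Equation (4.17); lam2_standby is the standby failure rate of unit 2."""
    a = lam1 + lam2_standby
    return (math.exp(-lam1 * t)
            + lam1 / (a - lam2) * (math.exp(-lam2 * t) - math.exp(-a * t)))

def standby_mttf(lam1, lam2, lam2_standby):
    """Equation (4.18)."""
    return 1.0 / lam1 + lam1 / (lam2 * (lam1 + lam2_standby))

def trapezoid(f, a, b, n):
    """Composite trapezoidal rule on [a, b] with n intervals."""
    h = (b - a) / n
    return h * (0.5 * f(a) + sum(f(a + i * h) for i in range(1, n)) + 0.5 * f(b))

# Hypothetical rates per hour, for illustration only.
lam1, lam2, lam2_standby = 0.01, 0.02, 0.005
mttf_formula = standby_mttf(lam1, lam2, lam2_standby)
mttf_numeric = trapezoid(
    lambda t: standby_reliability(t, lam1, lam2, lam2_standby), 0.0, 3000.0, 30000)
```

With these values both routes give an MTTF of about 133.3 hours.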
4.4 Passive Parallel System of Identical
Components
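A system of differential equations for a passive parallel system of identical components
$$\begin{bmatrix} P_1'(t)\\ P_2'(t)\\ P_3'(t)\\ P_4'(t)\\ P_5'(t) \end{bmatrix} =
\begin{bmatrix}
-\lambda & 0 & 0 & 0 & 0\\
\lambda & -\lambda & 0 & 0 & 0\\
0 & \lambda & -\lambda & 0 & 0\\
0 & 0 & \lambda & -\lambda & 0\\
0 & 0 & 0 & \lambda & 0
\end{bmatrix}
\begin{bmatrix} P_1(t)\\ P_2(t)\\ P_3(t)\\ P_4(t)\\ P_5(t) \end{bmatrix} \tag{4.19}$$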
              state
component   1   2   3   4   5
1           S   F   F   F   F
2           S   S   F   F   F
3           S   S   S   F   F
4           S   S   S   S   F
system      S   S   S   S   F
Table 4.2. Possible states of the system.
Figure 4.3. Block and rate diagrams for a passive parallel system of identical components.
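The reliability of a passive parallel system composed of n identical components and the
corresponding MTTF are therefore
$$\begin{aligned}
R_s(t) &= \sum_{i=1}^{n} P_i(t) = e^{-\lambda t}\sum_{i=0}^{n-1}\frac{(\lambda t)^i}{i!}\\
\mathrm{MTTF} &= \int_0^\infty R_s(t)\,dt = \sum_{i=0}^{n-1}\int_0^\infty \frac{(\lambda t)^i e^{-\lambda t}}{i!}\,dt = \sum_{i=0}^{n-1}\frac{\Gamma[i+1]}{\lambda\, i!} = \frac{n}{\lambda}
\end{aligned} \tag{4.20}$$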
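Equation (4.20) is easy to evaluate directly. The sketch below (function names and numbers are illustrative assumptions) computes the reliability of n identical standby units and confirms numerically that the area under the curve is n/λ.

```python
import math

def passive_parallel_reliability(t, lam, n):
    """R_s(t) from equation (4.20) for n identical units with failure rate lam."""
    return math.exp(-lam * t) * sum((lam * t) ** i / math.factorial(i) for i in range(n))

def passive_parallel_mttf(lam, n):
    """MTTF from equation (4.20)."""
    return n / lam

def trapezoid(f, a, b, steps):
    """Composite trapezoidal rule on [a, b]."""
    h = (b - a) / steps
    return h * (0.5 * f(a) + sum(f(a + i * h) for i in range(1, steps)) + 0.5 * f(b))

# Hypothetical example: four units, lam = 0.1 failures per hour.
lam, n = 0.1, 4
mttf = passive_parallel_mttf(lam, n)
mttf_check = trapezoid(lambda t: passive_parallel_reliability(t, lam, n), 0.0, 600.0, 60000)
```

For n = 1 the expression reduces to the exponential reliability $e^{-\lambda t}$, which is a quick sanity check on the summation limits.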
4.5 Standby Systems with Switching Failure
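A system of differential equations for a standby system with switching failure ($p > 0$)
$$\begin{bmatrix} P_1'(t)\\ P_2'(t)\\ P_3'(t)\\ P_4'(t) \end{bmatrix} =
\begin{bmatrix}
-q\lambda_1-p\lambda_1-\lambda_2^- & 0 & 0 & 0\\
q\lambda_1 & -\lambda_2 & 0 & 0\\
\lambda_2^- & 0 & -\lambda_1 & 0\\
p\lambda_1 & \lambda_2 & \lambda_1 & 0
\end{bmatrix}
\begin{bmatrix} P_1(t)\\ P_2(t)\\ P_3(t)\\ P_4(t) \end{bmatrix} \tag{4.21}$$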
Figure 4.4. Block and rate diagrams for a two component standby system with failures in standby and with switching failure (p > 0).
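where $q = 1 - p$. Solution of the system of differential equations (4.21)
$$\begin{aligned}
P_1(t) &= e^{-(\lambda_1+\lambda_2^-)t}\\
P_2(t) &= \frac{q\lambda_1}{\lambda_1+\lambda_2^- -\lambda_2}\left(e^{-\lambda_2 t} - e^{-(\lambda_1+\lambda_2^-)t}\right)\\
P_3(t) &= e^{-\lambda_1 t} - e^{-(\lambda_1+\lambda_2^-)t}
\end{aligned} \tag{4.22}$$
and system reliability
$$R_s(t) = e^{-\lambda_1 t} + \frac{q\lambda_1}{\lambda_1+\lambda_2^- -\lambda_2}\left(e^{-\lambda_2 t} - e^{-(\lambda_1+\lambda_2^-)t}\right) \tag{4.23}$$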
4.6 Degraded Systems
Figure 4.5. Block and rate diagrams for a degraded system. States: (1) operating state, (2) degraded state, (3) state of failure.
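A system of differential equations for a degraded system
$$\begin{bmatrix} P_1'(t)\\ P_2'(t)\\ P_3'(t) \end{bmatrix} =
\begin{bmatrix}
-\lambda_1-\lambda_2 & 0 & 0\\
\lambda_2 & -\lambda_3 & 0\\
\lambda_1 & \lambda_3 & 0
\end{bmatrix}
\begin{bmatrix} P_1(t)\\ P_2(t)\\ P_3(t) \end{bmatrix} \tag{4.24}$$
Solution of the system of differential equations
$$\begin{aligned}
P_1(t) &= e^{-(\lambda_1+\lambda_2)t}\\
P_2(t) &= \frac{\lambda_2}{\lambda_1+\lambda_2-\lambda_3}\left(e^{-\lambda_3 t} - e^{-(\lambda_1+\lambda_2)t}\right)\\
P_3(t) &= 1 - P_1(t) - P_2(t)
\end{aligned} \tag{4.25}$$
and system reliability
$$R_s(t) = P_1(t) + P_2(t) \tag{4.26}$$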
4.7 Three State Devices
Figure 4.6. Block and rate diagrams for a three state device. States: (1) operating state, (2) failure at opening, (3) failure at closing.
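A system of differential equations for a three state device
$$\begin{bmatrix} P_1'(t)\\ P_2'(t)\\ P_3'(t) \end{bmatrix} =
\begin{bmatrix}
-\lambda_1-\lambda_2 & 0 & 0\\
\lambda_1 & 0 & 0\\
\lambda_2 & 0 & 0
\end{bmatrix}
\begin{bmatrix} P_1(t)\\ P_2(t)\\ P_3(t) \end{bmatrix} \tag{4.27}$$
Solution of the system of differential equations
$$\begin{aligned}
P_1(t) &= e^{-(\lambda_1+\lambda_2)t}\\
P_2(t) &= \frac{\lambda_1}{\lambda_1+\lambda_2}\left(1 - e^{-(\lambda_1+\lambda_2)t}\right)\\
P_3(t) &= 1 - P_1(t) - P_2(t)
\end{aligned} \tag{4.28}$$
and system reliability
$$R_s(t) = e^{-(\lambda_1+\lambda_2)t} \tag{4.29}$$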
Chapter 5
Physical Reliability Models
In previous chapters, we dealt with reliability models in which the reliability
of a component or system was considered as a function of time only. Because
the operating and environmental conditions can change over time, reliability
often also depends on other factors. Reliability models which, in addition to time,
also consider other parameters are called covariate models. Static stress-strength
models do not depend on time but on applied load and strength. Unlike static
stress-strength models, dynamic stress-strength models also take into account the
impact of load history. In addition to the stress-strength reliability models
there are also so-called physics of failure models.
5.1 Covariate Models
A simple example of a covariate model for the exponential probability density
function of failures is given by the equation
$$\lambda(x) = a_0 + \sum_{i=1}^{n} a_i x_i$$
The reliability of the component, taking into account the additional parameters, is in
the case of the exponential probability density function of failures calculated by
$$R(t) = e^{-\lambda(x)t}$$
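The linear covariate model can be sketched in a few lines. The coefficients and covariates below (temperature in degrees Celsius and a load ratio) are hypothetical values chosen only to illustrate the calculation.

```python
import math

def failure_rate(a0, a, x):
    """lambda(x) = a0 + sum(a_i * x_i) for covariates x."""
    return a0 + sum(ai * xi for ai, xi in zip(a, x))

def reliability(t, a0, a, x):
    """R(t) = exp(-lambda(x) * t) for the exponential case."""
    return math.exp(-failure_rate(a0, a, x) * t)

# Hypothetical coefficients: base rate plus temperature and load-ratio terms.
a0 = 1e-4
a = [2e-6, 5e-5]
x = [60.0, 1.2]
lam = failure_rate(a0, a, x)          # 1e-4 + 1.2e-4 + 6e-5 = 2.8e-4 per hour
r_1000h = reliability(1000.0, a0, a, x)
```

Note that a linear model of this kind can produce a negative failure rate for some covariate values, so in practice the range of x would need to be restricted.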
5.2 Static Models
The probability that the stress X will not be greater than x is calculated by
$$\Pr\{X < x\} = F_X(x) = \int_0^x f_X(x)\,dx \tag{5.1}$$
The probability that the strength Y will not exceed the value y can be calculated by
$$\Pr\{Y < y\} = F_Y(y) = \int_0^y f_Y(y)\,dy \tag{5.2}$$
If stress X is a random variable and strength a constant value Y = y (see Figure 5.1),
then the reliability is
$$R = \Pr\{X < y\} = F_X(y) = \int_0^y f_X(x)\,dx \tag{5.3}$$
Figure 5.1. Reliability for random stress and constant strength. (Axes: x, f_X(x); marked: R, y.)
If strength Y is a random variable and stress a constant value X = x (see Figure 5.2),
then the reliability is
$$R = \Pr\{Y \ge x\} = 1 - F_Y(x) = \int_x^\infty f_Y(y)\,dy \tag{5.4}$$
If both stress X and strength Y are random variables, then the reliability is the
probability that the stress will not exceed the strength
$$R = \Pr\{X < Y\} = \int_0^\infty F_X(y) f_Y(y)\,dy \tag{5.5}$$
Figure 5.2. Reliability for constant stress and random strength. (Axes: y, f_Y(y); marked: x, R.)
or the probability that the strength will be greater than the stress
$$R = \Pr\{Y \ge X\} = \int_0^\infty (1 - F_Y(x)) f_X(x)\,dx \tag{5.6}$$
Figure 5.3. Reliability for random stress and strength. (Axes: x, y and f_X(x), f_Y(y); marked: F_X(y), f_Y(y), f_X(x), y.)
If the stress and strength follow exponential probability density functions
$$f_X(x) = \frac{1}{\mu_X}e^{-x/\mu_X} \qquad f_Y(y) = \frac{1}{\mu_Y}e^{-y/\mu_Y}$$
with a mean value of stress $\mu_X$ and a mean value of strength $\mu_Y$, then from (5.6) the reliability is
$$R = \frac{1}{\mu_X}\int_0^\infty \exp\left\{-\frac{\mu_X+\mu_Y}{\mu_X\mu_Y}\,x\right\}dx = \frac{1}{1+\mu_X/\mu_Y} \tag{5.7}$$
µX/µY   1.0   0.9   0.8   0.7   0.6   0.5   0.4   0.3   0.2   0.1
R       0.50  0.53  0.56  0.59  0.63  0.67  0.71  0.77  0.83  0.91
Table 5.1. Reliability for the exponential probability density function of stress and strength.
As is obvious from Table 5.1, the value of $\mu_Y$ must be roughly ten times greater than the
value of $\mu_X$ (more precisely, $\mu_Y > 9\mu_X$) for the reliability to exceed 0.9.
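The values in Table 5.1 can be reproduced from equation (5.7), and the closed form can be compared with a direct numerical evaluation of the integral (5.6). This is a sketch under stated assumptions: the function names, the integration limit and the step count are illustrative choices.

```python
import math

def reliability_exponential(mu_x, mu_y):
    """Closed form (5.7) for exponentially distributed stress and strength."""
    return 1.0 / (1.0 + mu_x / mu_y)

def reliability_numeric(mu_x, mu_y, steps=100000):
    """Trapezoidal evaluation of (5.6) with 1 - F_Y(x) = exp(-x / mu_y)."""
    upper = 60.0 * mu_x                      # integrand is negligible beyond this point
    h = upper / steps

    def integrand(x):
        return math.exp(-x / mu_y) * math.exp(-x / mu_x) / mu_x

    total = 0.5 * (integrand(0.0) + integrand(upper))
    total += sum(integrand(i * h) for i in range(1, steps))
    return h * total

ratios = [1.0, 0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2, 0.1]
table = [reliability_exponential(r, 1.0) for r in ratios]
check = reliability_numeric(2.0, 10.0)       # hypothetical means, ratio 0.2
```

Solving $1/(1+\mu_X/\mu_Y) > 0.9$ for the ratio gives $\mu_X/\mu_Y < 1/9$, which is the threshold behind the statement about Table 5.1.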
5.3 Dynamic Models
Components for which the stress changes over time require separate treatment,
in which dynamic reliability is calculated. Stress may appear at completely
random times or may recur periodically over time.
5.3.1 Periodic Loads
If we neglect the impact of aging and assume that at a moment $t_i$ the component
is loaded with a random load $X_i$ having a probability density function $f_X(x)$,
and assume that the random strength $Y_i$ having a probability density function
$f_Y(y)$ is independent of time, then the dynamic reliability after n load cycles is
$$R_n = \Pr\{X_1 < Y_1, \ldots, X_n < Y_n\}$$
Since the events $X_1, X_2, \ldots$ and $Y_1, Y_2, \ldots$ are statistically independent, it follows that
$$R_n = \Pr\{X_1 < Y_1\}\cdots\Pr\{X_n < Y_n\} = R^n \tag{5.8}$$
If the time period $\Delta t$ is constant, $t_0 = 0$ and $t_i = t_{i-1} + \Delta t$ for $i = 1, 2, \ldots$, then
the dynamic reliability can be calculated by
$$R(t) = R^{t/\Delta t} \tag{5.9}$$
Figure 5.4. The stress time history and strength for a periodic load.
5.3.2 Random Loads
Figure 5.5. The stress time history and strength for a random load.
The probability that over the time t a component will be loaded i times corresponds
to a binomial distribution
$$f_n(i) = \binom{n}{i}p^i(1-p)^{n-i} \tag{5.10}$$
If the number of events $n \gg 1$ and the probability p is very small, then equation (5.10)
can be approximated by the Poisson distribution
$$f_n(i) = e^{-pn}\frac{(pn)^i}{i!} \tag{5.11}$$
From the mean period $\Delta t = t/n$ it follows $pn = (p/\Delta t)t = \alpha t$, where $\alpha$ is the mean
number of loads per unit of time. From here it follows
$$f_i(t) = e^{-\alpha t}\frac{(\alpha t)^i}{i!} \tag{5.12}$$
Dynamic reliability is a weighted sum over all possible numbers of loads, where by (5.8) the reliability after i loads is $R_i = R^i$
$$R(t) = \sum_{i=0}^{\infty} R_i f_i(t) = e^{-\alpha t}\sum_{i=0}^{\infty}\frac{(\alpha t R)^i}{i!} \tag{5.13}$$
Since
$$e^{\alpha t R} = \sum_{i=0}^{\infty}\frac{(\alpha t R)^i}{i!}$$
the reliability can be written as
$$R(t) = e^{-(1-R)\alpha t} \tag{5.14}$$
5.3.3 Random Fixed Stress and Strength
A different result is obtained if stress and strength are randomly determined once
and then fixed for each cycle. We assume that the first stress $X_1$ and strength
$Y_1$ are random variables and for all other realizations of stress and strength it
follows
$$X_i = X_1 \lor 0 \qquad Y_i = Y_1 \qquad i = 1, 2, \ldots$$
Figure 5.6. The load time history and strength for random fixed stress and strength.
From (5.13) the dynamic reliability is
$$R(t) = R_0 f_0(t) + \sum_{i=1}^{\infty} R_i f_i(t)$$
Because $R_0 = 1$ and $R_i = R$ for each $i \ge 1$, it follows that
$$R(t) = f_0(t) + R(1 - f_0(t)) = e^{-\alpha t} + R(1 - e^{-\alpha t}) \tag{5.15}$$
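The three dynamic models (5.9), (5.14) and (5.15) can be compared side by side. In the sketch below the function names and all numerical values are assumptions made for illustration; the truncated series implements (5.13) directly so that it can be checked against the closed form (5.14).

```python
import math

def periodic_loads(R, t, dt):
    """Equation (5.9): one load every dt time units."""
    return R ** (t / dt)

def random_loads(R, t, alpha):
    """Equation (5.14): loads arrive as a Poisson process with rate alpha."""
    return math.exp(-(1.0 - R) * alpha * t)

def random_loads_series(R, t, alpha, terms=300):
    """Equation (5.13), summed term by term to avoid large factorials."""
    term = math.exp(-alpha * t)              # i = 0 term
    total = 0.0
    for i in range(terms):
        total += term
        term *= alpha * t * R / (i + 1)      # next term of the series
    return total

def random_fixed(R, t, alpha):
    """Equation (5.15): stress and strength drawn once, then fixed."""
    return math.exp(-alpha * t) + R * (1.0 - math.exp(-alpha * t))

# Hypothetical values: static reliability per load 0.99, 0.5 loads per hour, 100 hours.
R, alpha, t = 0.99, 0.5, 100.0
r_periodic = periodic_loads(R, t, 1.0 / alpha)   # same mean spacing as the random case
r_random = random_loads(R, t, alpha)
r_fixed = random_fixed(R, t, alpha)
```

For the same static reliability R the fixed model never gives a lower value than the random-load model, because $e^{-(1-R)\alpha t} \le R + (1-R)e^{-\alpha t}$ by convexity of the exponential function.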
5.4 Physics of Failure Models
An alternative to static reliability models is called physics of failure, where the time
to failure is expressed by a function
$$t = f(\text{operating conditions}, \text{environmental conditions}, \text{material properties}, \text{manufacturing technology}, \text{geometry})$$
A typical representative of a physics of failure model is the calculation of the expected
operating time of a roller bearing
$$t = a_1 a_2 a_3 \frac{10^6}{60\,n}\left(\frac{C}{P}\right)^p$$
in hours, where $a_1$ is a reliability factor, $a_2$ a material factor, $a_3$ a factor of operating
conditions, $n$ the rotary speed in rev/min, $p$ an exponent, $C$ the dynamic load capacity and
$P$ the equivalent load in N.
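The bearing life formula is a direct calculation. The sketch below uses hypothetical input values; the exponent p = 3 is an assumption that is commonly used for ball bearings and should be replaced by the value appropriate to the bearing type.

```python
def bearing_life_hours(a1, a2, a3, n_rpm, C, P, p):
    """Expected operating time t = a1*a2*a3 * 1e6 / (60*n) * (C/P)**p in hours."""
    return a1 * a2 * a3 * 1e6 / (60.0 * n_rpm) * (C / P) ** p

# Hypothetical bearing: all factors 1, 1500 rev/min, C = 30 kN, P = 3 kN, p = 3.
life = bearing_life_hours(1.0, 1.0, 1.0, 1500.0, 30000.0, 3000.0, 3.0)
```

Because of the exponent, halving the equivalent load in this example would increase the expected life by a factor of eight.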
Chapter 6
Design for Reliability
Figure 6.1. The reliability design process. (Flowchart elements: definition of the reliability objectives; allocation of reliability by components; evaluation methods; failure mode and effect analysis FMEA/FMECA; are goals fulfilled (yes/no); system effectiveness and life cycle costs; system safety and fault tree analysis FTA; are goals fulfilled (yes/no); production.)
life cycle phases                       activities related to reliability
conceiving                              specification of reliability, allocation of reliability, evaluation methods
detailing, optimization and testing     evaluation methods, FMEA, testing reliability growth, safety analysis and FTA
production                              acceptance testing, quality control, burn-in testing and monitoring/screening
use and maintenance                     preventive maintenance, predictive maintenance, modifications, parts replacement
Table 6.1. Reliability and product life cycle phases.
6.1 Reliability Objectives
Reliability indicators:
• mean time to failure MTTF,
• mean time between failures MTBF,
• target reliability after a certain time and,
• probability density function of failures.
Other activities:
• define the failures precisely,
• eliminate a certain type of failure from the requirements,
• monitor and record operation and non-operation times and,
• precisely define normal operating, environmental and maintenance conditions.
acquisition costs                R&D costs, production, distribution
operations and support costs
    operations costs             personnel, energy and fuel
    failure costs                warranty, responsibility, repair or replacement, loss of trust
    support costs                repair resources, supply resources, support equipment or facilities
remaining costs                  salvage value, disposal costs
Table 6.2. Costs categories.
6.1.1 Life Cycle Costs
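In general, life cycle costs are
$$\begin{aligned}
\text{life cycle costs} = {} & \text{acquisition costs} + \text{operations costs} + \text{failure costs}\\
& + \text{support costs} - \text{net salvage value}
\end{aligned} \tag{6.1}$$
where
$$\text{net salvage value} = \text{salvage value} - \text{disposal costs}$$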
6.2 Reliability Allocation
Reliability allocation objective
$$R_s = h(R_1(t), \ldots, R_n(t)) \ge R_s^*(t) \tag{6.2}$$
In the case of components in series
$$R_s = \prod_{i=1}^{n} R_i(t) \ge R_s^*(t) \tag{6.3}$$
6.2.1 Exponential Case
If all components have constant failure rates, equation (6.3) can be written as
$$\prod_{i=1}^{n} e^{-\lambda_i t} \ge e^{-\lambda_s^* t}$$
or
$$\sum_{i=1}^{n} \lambda_i \le \lambda_s^* \tag{6.4}$$
6.2.2 Optimal allocation
The objective of optimal allocation is minimal cost. We therefore need to solve
the following
$$\min C = \sum_{i=1}^{n} C_i(x_i) \tag{6.5}$$
$$\prod_{i=1}^{n} (R_i + x_i) \ge R_s^* \tag{6.6}$$
subject to
$$0 < R_i + x_i \le B_i < 1 \qquad i = 1, \ldots, n \tag{6.7}$$
The cost function for reliability improvement $C_i(x)$ can be approximated with
a second order polynomial
$$\min C = \sum_{i=1}^{n} c_i x_i^2 \tag{6.8}$$
If we neglect (6.7) and treat (6.6) as an equality, we can determine
the optimal values of $x_i$ using the Lagrange method
$$L(x_i, \theta) = \sum_{i=1}^{n} c_i x_i^2 - \theta\left\{\prod_{i=1}^{n}(R_i + x_i) - R_s^*\right\} \tag{6.9}$$
The extreme value of the function $L(x_i, \theta)$ is then calculated by solving the system of equations
$$\begin{aligned}
\frac{\partial L(x_i, \theta)}{\partial x_i} &= 2c_i x_i - \theta\prod_{\substack{j=1\\ j\ne i}}^{n}(R_j + x_j) = 0 \qquad i = 1, \ldots, n\\
\frac{\partial L(x_i, \theta)}{\partial \theta} &= R_s^* - \prod_{i=1}^{n}(R_i + x_i) = 0
\end{aligned} \tag{6.10}$$
By multiplying the first equation of (6.10) by $(R_i + x_i)$ and rearranging slightly we
get
$$\theta\prod_{j=1}^{n}(R_j + x_j) = 2c_i x_i(R_i + x_i) = \theta R_s^*$$
From here it follows
$$2c_i x_i^2 + 2c_i R_i x_i - \theta R_s^* = 0 \tag{6.11}$$
The solution of the quadratic equation (6.11) is its positive root
$$x_i = \frac{-2c_i R_i + \sqrt{4c_i^2 R_i^2 + 8c_i\theta R_s^*}}{4c_i} \tag{6.12}$$
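Equation (6.12) still contains the unknown multiplier θ. One way to determine it, sketched below, is to search for the θ at which the product in (6.6) equals the target. The bisection search, the function names and the numerical values are assumptions for illustration; as in the derivation above, the bounds (6.7) are ignored.

```python
import math

def improvement(theta, c, R, target):
    """Positive root (6.12) for one component."""
    return (-2.0 * c * R + math.sqrt(4.0 * c * c * R * R + 8.0 * c * theta * target)) / (4.0 * c)

def allocate(R, c, target):
    """Find theta by bisection so that prod(R_i + x_i) equals the target reliability."""
    def product(theta):
        return math.prod(r + improvement(theta, ci, r, target) for r, ci in zip(R, c))

    if product(0.0) >= target:               # target already met, no improvement needed
        return [0.0] * len(R)
    lo, hi = 0.0, 1.0
    while product(hi) < target:              # the product grows monotonically with theta
        hi *= 2.0
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if product(mid) < target:
            lo = mid
        else:
            hi = mid
    return [improvement(hi, ci, r, target) for r, ci in zip(R, c)]

# Hypothetical series system: current reliabilities, cost coefficients, target 0.80.
R = [0.90, 0.85, 0.95]
c = [1.0, 2.0, 1.5]
x = allocate(R, c, 0.80)
achieved = math.prod(r + xi for r, xi in zip(R, x))
```

At the solution the quantity $2c_i x_i(R_i + x_i)$ takes the same value $\theta R_s^*$ for every component, which provides a convenient check of the result.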
6.2.3 ARINC Method
This method assumes that the components are in series, statistically independent
and have constant failure rates. If $\lambda_i$ is the current failure rate of the ith
component and $\lambda_s^*$ is the target system failure rate, then from the weights
$$w_i = \frac{\lambda_i}{\sum_{i=1}^{n}\lambda_i}$$
we can calculate a new failure rate for each component
$$\lambda_i = w_i\lambda_s^*$$
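The ARINC allocation amounts to scaling the current failure rates so that they sum to the target. A minimal sketch with hypothetical rates:

```python
def arinc_allocation(current_rates, target_rate):
    """Allocate the target system failure rate in proportion to the current rates."""
    total = sum(current_rates)
    weights = [lam / total for lam in current_rates]
    return [w * target_rate for w in weights]

# Hypothetical current failure rates (per hour) and system target.
allocated = arinc_allocation([0.003, 0.002, 0.005], 0.005)
```

Here the current system failure rate of 0.010 per hour is halved, so each component rate is halved as well.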
6.2.4 AGREE Method
Figure 6.2. Block diagram of a complex system suitable for solving with the AGREE method (n1 = 1, n2 = 3, n3 = 2; n = 3, N = 6).
The AGREE method (Advisory Group on Reliability of Electronic Equipment) assumes
that a system is comprised of n components (see Figure 6.2), each having
$n_i$ modules or sub-components. Let
t = system operating time
$R_s^*$ = system reliability target at time t
n = number of components
$N = \sum_{i=1}^{n} n_i$ = total number of modules in the system
$t_i$ = operating time of the ith component, where $t_i \le t$
$\lambda_i$ = failure rate of the ith component
$w_i$ = importance index
Because
$$\prod_{i=1}^{n} (R_s^*)^{n_i/N} = R_s^*$$
we can write the following
$$w_i\left(1 - e^{-\lambda_i t_i}\right) = 1 - (R_s^*)^{n_i/N}$$
where the left side of the equation is the joint probability that the ith component
fails and that this results in a system failure. The right side of the equation is the
failure probability allocated to the ith component. From this it follows
$$\lambda_i = -\frac{1}{t_i}\ln\left\{1 - \frac{1 - (R_s^*)^{n_i/N}}{w_i}\right\} \qquad i = 1, \ldots, n \tag{6.13}$$
Because the failure of a component does not necessarily cause a system failure,
the following applies: $\prod_{i=1}^{n} e^{-\lambda_i t_i} \le R_s^*$.
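Equation (6.13) can be evaluated component by component. In the sketch below the module counts follow Figure 6.2, while the target reliability, operating times and importance indices are hypothetical values chosen for illustration.

```python
import math

def agree_allocation(target, op_times, modules, importance):
    """Failure rates from equation (6.13) for each component."""
    N = sum(modules)
    rates = []
    for t_i, n_i, w_i in zip(op_times, modules, importance):
        allocated_unreliability = 1.0 - target ** (n_i / N)
        rates.append(-math.log(1.0 - allocated_unreliability / w_i) / t_i)
    return rates

modules = [1, 3, 2]                      # n1, n2, n3 as in Figure 6.2
op_times = [100.0, 80.0, 100.0]          # hypothetical t_i with t_i <= t = 100 h
importance = [1.0, 0.9, 0.8]             # hypothetical importance indices w_i
rates = agree_allocation(0.95, op_times, modules, importance)
product = math.prod(math.exp(-lam * t_i) for lam, t_i in zip(rates, op_times))
```

With all importance indices equal to one the product of component reliabilities equals the target exactly; with smaller indices it falls below the target, in line with the inequality stated above.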
6.2.5 Redundancies
Figure 6.3. Block diagram of a system.
6.3 Design Methods
6.3.1 Parts and Material Selection
• standard parts or parts manufacturing,
• material properties are a function of various factors.
6.3.2 Derating
Figure 6.4. The influence of temperature and stress on failure rate. (Axes: T, λ; curves for L/Ld = 0.9, 0.7 and 0.5.)
$$\lambda = \lambda_b\left(\frac{s}{s_d}\right)^{0.7}\left(\frac{L}{L_d}\right)^{4.69}\left(\frac{\upsilon_d}{\upsilon}\right)^{0.54}\left(\frac{c}{c_d}\right)^{0.67}\left(\frac{T}{T_d}\right)^{3}$$
where
$\lambda_b$ = failure rate defined by the manufacturer
$s$ = operating speed
$s_d$ = design speed
$L$ = operating load
$L_d$ = design load
$\upsilon$ = viscosity of lubricant used
$\upsilon_d$ = viscosity of specified lubricant
$c$ = concentration of contaminants
$c_d$ = standard contamination level
$T$ = operating temperature
$T_d$ = design temperature
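The derating relation is a product of power-law factors, so the effect of each ratio can be explored separately. A minimal sketch; the function name and the numerical values are illustrative assumptions.

```python
def derated_failure_rate(lam_b, s, s_d, L, L_d, v, v_d, c, c_d, T, T_d):
    """Failure rate from the derating formula; arguments follow the symbol list above."""
    return (lam_b
            * (s / s_d) ** 0.7
            * (L / L_d) ** 4.69
            * (v_d / v) ** 0.54
            * (c / c_d) ** 0.67
            * (T / T_d) ** 3)

# Hypothetical case: everything at design values except the load, which is halved.
lam_design = derated_failure_rate(1e-5, 1, 1, 1.0, 1.0, 1, 1, 1, 1, 1, 1)
lam_derated = derated_failure_rate(1e-5, 1, 1, 0.5, 1.0, 1, 1, 1, 1, 1, 1)
```

Because of the exponent 4.69, operating at half the design load reduces the failure rate in this sketch to roughly 4 % of the manufacturer's value.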
6.3.3 Stress Strength Analysis
Figure 6.5. Failure probability and safety factor. (Axes: SF, Pr{SF < 1}; curves for s_SF = 0.8, 1 and 1.2.)
Safety factor SF and safety margin SM
$$SF = \frac{Y}{X} \qquad SM = Y - X \tag{6.14}$$
Probability of a system failure as a function of the safety factor
$$\Pr\{SF < 1\} = 1 - \Phi\left(\frac{\ln SF}{s_{SF}}\right) \tag{6.15}$$
where
$$SF = \frac{m_y}{m_x} \qquad s_{SF} = \sqrt{s_x^2 + s_y^2}$$
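Equation (6.15) can be evaluated with the error function from the standard library. The sketch below assumes that $m_x$, $m_y$ are location parameters of stress and strength and $s_x$, $s_y$ the corresponding scale parameters on the logarithmic scale (a log-normal interpretation that the formula suggests but the text does not state); all numbers are hypothetical.

```python
import math

def std_normal_cdf(z):
    """Standard normal cumulative distribution function Phi(z)."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def failure_probability(m_x, m_y, s_x, s_y):
    """Pr{SF < 1} from equation (6.15) with SF = m_y / m_x."""
    sf = m_y / m_x
    s_sf = math.sqrt(s_x ** 2 + s_y ** 2)
    return 1.0 - std_normal_cdf(math.log(sf) / s_sf)

# Hypothetical values: safety factor 4 with s_SF = 1.
p_fail = failure_probability(10.0, 40.0, 0.6, 0.8)
p_even = failure_probability(10.0, 10.0, 0.6, 0.8)   # SF = 1 gives 0.5
```

A safety factor of one gives a failure probability of 0.5 regardless of the scatter, which matches the upper end of the vertical axis in Figure 6.5.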
6.3.4 Complexity and Technology
• part count of the system,
• variability of used parts in the system,
• technology.
6.3.5 Redundancy
Figure 6.6. A classification of redundancy. (Tree elements: redundancy; active; parallel k of n; load sharing; combined; passive; without switching failure; with switching failure.)
Optimization of active redundancy based on the costs
$$\max \prod_{i=1}^{n}\left\{1 - (1 - R_i(t))^{n_i}\right\} \tag{6.16}$$
subject to
$$\sum_{i=1}^{n} C_{u,i}\,n_i \le B + \sum_{i=1}^{n} C_{u,i} \tag{6.17}$$
C := 0; n_i := 1 for i := 1, ..., n;
s := false; calculate ∆_i for i := 1, ..., n;
while s = false do
    k := arg max {∆_1, ..., ∆_n}; C := C + C_{u,k};
    if C < B then
        n_k := n_k + 1; calculate ∆_k;
    else
        s := true; C := C − C_{u,k};
    end if
end while
Figure 6.7. Marginal analysis.
The optimal number of redundant components is determined using marginal analysis,
which requires the separation of the control variables $n_i$. This is achieved by
taking the logarithm of equation (6.16)
$$\max \sum_{i=1}^{n} \ln\left\{1 - (1 - R_i(t))^{n_i}\right\} \tag{6.18}$$
Because the logarithm is a monotonically increasing function, transformation (6.18)
does not affect the position of the optimum. The marginal value
$$\Delta_i = \frac{\ln\left\{1 - (1 - R_i(t))^{n_i+1}\right\} - \ln\left\{1 - (1 - R_i(t))^{n_i}\right\}}{C_{u,i}}$$
is defined as the increase in the logarithm of the component reliability per dollar
invested when the number of parallel components is increased by one.
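The marginal analysis of Figure 6.7 translates almost line by line into code. The sketch below follows the printed procedure, including its strict comparison C < B and its stopping rule (the loop ends as soon as the best candidate no longer fits the budget); the component data are hypothetical.

```python
import math

def marginal_value(R, n, unit_cost):
    """Delta_i: gain in log-reliability per unit cost when going from n to n + 1 units."""
    return (math.log(1.0 - (1.0 - R) ** (n + 1)) - math.log(1.0 - (1.0 - R) ** n)) / unit_cost

def marginal_analysis(R, unit_cost, budget):
    """Greedy allocation of redundant units as in Figure 6.7."""
    n = [1] * len(R)                         # one unit of every component to start with
    spent = 0.0
    delta = [marginal_value(r, 1, c) for r, c in zip(R, unit_cost)]
    while True:
        k = max(range(len(R)), key=lambda i: delta[i])
        if spent + unit_cost[k] < budget:    # strict comparison, as in the figure
            spent += unit_cost[k]
            n[k] += 1
            delta[k] = marginal_value(R[k], n[k], unit_cost[k])
        else:
            break
    return n, spent

# Hypothetical data: three components, budget B = 5 for additional units.
counts, cost = marginal_analysis([0.7, 0.8, 0.9], [1.0, 2.0, 3.0], 5.0)
system_reliability = math.prod(1.0 - (1.0 - r) ** k for r, k in zip([0.7, 0.8, 0.9], counts))
```

In this example the procedure adds two units of the first component and one of the second before the most attractive remaining candidate exceeds the budget.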
6.4 Failure and Effect Analysis
Failure mode and effect analysis FMEA is an iterative process that influences
design by identifying failure modes, assessing their probabilities of occurrence
and their effect on the system, isolating their causes and determining corrective
action or preventive measures. Main purpose of such analysis is to increase safety
and customer satisfaction. The company’s management needs to precisely define
the constraints within which the FMEA process will take place. There is a need
to answer a series of questions.
• Is the FMEA team responsible only for carrying out the analysis, or should
it also prepare proposals and/or implement improvements?
• What is the team budget?
• What other resources are available to the group?
• Has a date for carrying out the analysis been set, and are there
other time constraints that need to be taken into account?
• What is the procedure if the group has to exceed the limits?
• How should the group inform the other entities of the company of the
results of the analysis?
FMEA of a product or a process consists of 10 steps
• review of the process,
• identification of the potential failure modes,
• identification of potential effects of each failure mode,
• assigning a severity rating for each failure mode,
• assigning an occurrence rating for each failure mode,
• assigning a detection rating for each failure mode and/or effect,
• calculation of the risk priority number (RPN) for each effect,
• prioritizing the failure modes for action,
• taking action to eliminate or reduce the high risk failure modes and
• calculation of the resulting RPN as the failure modes are reduced or eliminated.
Figure 6.8. FMEA team start-up worksheet. Worksheet contents:
FMEA number: 019. Date started: 05.03.02. Date completed: (blank). Team leader: Martin M.
Team members: Martin M., Andrej N., Marta L., Boštjan V., Franc G.
1. Are all affected areas represented? (Yes / No, Action)
2. Are different levels and types of knowledge represented on the team? (Yes / No, Action)
3. Is the customer involved? (Yes / No, Action) Sales and marketing represent the customer.
4. Who will take minutes and maintain records? Andrej N.
FMEA team boundaries of freedom
5. What aspects of the FMEA is the team responsible for? (FMEA analysis, improvement recommendations, improvement implementation)
6. What is the budget of the FMEA? 5000 EUR
7. Does the project have a deadline? 15.04.02
8. Do team members have specific time constraints?
9. What is the procedure if the team needs to expand beyond boundaries? Revision of restrictions with the department head.
10. How should the FMEA be communicated with others? Final report.
11. What is the scope of the FMEA? Perform the FMEA analysis for the new model of fire extinguisher X-1050. The analysis must be completed before 01.05.02.
6.4.1 Review of the Process
To ensure that everyone on the FMEA team has the same understanding of the
process that is being worked on, the team should review a blueprint of the product
if they are conducting a product FMEA, or a detailed flowchart of the operation
if they are conducting a process FMEA.
Figure 6.9. FMEA of a product/process. Worksheet contents:
Subject (product / process): fire extinguisher X-1050. FMEA team: fire extinguisher X-1050 FMEA team. Team leader: Martin M. FMEA number: 019. FMEA date (original): 05.03.02, (revised): 01.05.02. Page: 1 of 2.
Item: tube.
• Failure mode: cracks. Effect: it does not trigger at activation. S = 10. Cause: exposure to high heat or frost during transportation. O = 5. Current controls: use of insulation materials for packaging; transportation in climatically controlled conditions. D = 6. RPN = 300. Recommended action: use of temperature non-sensitive material. Responsibility and target completion: Martin M., 01.04.02. Action taken: replacement of the current tube with a temperature non-sensitive tube. Resulting S = 10, O = 2, D = 6, RPN = 120.
• Failure mode: holes. Effect: low output pressure. S = 8. Cause: damage of the tube during production. O = 8. Current controls: no sharp objects allowed in the production process. D = 4. RPN = 256. Recommended action: protection of tubes with a Kevlar coating. Responsibility and target completion: Marta L., 15.04.02. Action taken: purchase of protective coatings for the tube. Resulting S = 8, O = 5, D = 4, RPN = 160.
• Failure mode: clogging. Effect: no output. S = 10. Cause: foreign object in the tube. O = 6. Current controls: input control; test using compressed air. D = 3. RPN = 180. Resulting S = 10, O = 6, D = 3, RPN = 180.
Further items listed: cylinder, valve mechanism, level indicator.
6.4.2 Identification of the Potential Failure Modes
potential cause of failure           potential failure mode       potential failure effect
biological effects                   putrefaction                 degradation
cyclical fatigue                     fatigue damage               unstable drive
human error                          delayed reaction             airplane accident
degradation                          magnetic degradation         drop of permeability
depolymerization                     short circuit                drop of resistance
evaporation                          filament breakage            lamp failure
exceptional load                     lightning strike             radio transmission failure
chemical and electrolytic changes    electrolytic corrosion       loss of contact
contact failure                      bad contact                  relay does not work
corrosion                            corrosion defects            leaking of the tank
creep                                deformation                  bad guidance
mechanical loads                     sudden breakage              power loss and increased noise
low-quality components               spillage of brake liquid     no brake force
pollution                            loss of contact              voltage drop
temperature fluctuation              fatigue damage               pressure drop
friction                             wear                         losses and noise
moisture                             steaming up                  bad visibility
Table 6.3. Potential causes of failure, failure modes and effects.
6.4.3 Identification of Potential Effects of Each Failure
Mode
In identifying the potential effects of a failure, we ask what the
consequences of each failure mode might be. This step must be thorough, because this
information will feed into the assignment of risk ratings for each failure.
6.4.4 Assigning a Severity Rating for Each Failure Mode
The severity rating S is based on a 10-point scale, with 1 being the lowest rating
and 10 being the highest. It is an estimation of how serious the effect would be if
a given failure did occur. Because each failure may have several different effects,
and each effect can have different level of severity, it is the effect, not the failure,
that is rated.
Rating   Detection rating D    Occurrence rating O (frequency)   Severity rating S (description)
10       almost uncertain      100 per 1000                      dangerously high
9        very remote           50 per 1000                       extremely high
8        remote                20 per 1000                       very high
7        very low              10 per 1000                       high
6        low                   5 per 1000                        moderate
5        moderate              2 per 1000                        low
4        moderately high       1 per 1000                        very low
3        high                  0.5 per 1000                      minor
2        very high             0.1 per 1000                      very minor
1        almost certain        0.01 per 1000                     none
Occurrence descriptions, from highest to lowest: very high (failure almost inevitable), high (repeated failures), moderate (occasional failures), low (relatively few failures), remote.
Severity definitions:
10 dangerously high: Failure could injure the customer or an employee without a warning.
9 extremely high: Failure would create noncompliance with federal regulations.
8 very high: Failure renders the unit inoperable or unfit for use.
7 high: Failure causes a high degree of customer dissatisfaction.
6 moderate: Failure results in a subsystem or partial malfunction of the product.
5 low: Failure creates enough of a performance loss to cause the customer to complain.
4 very low: Failure can be overcome with modifications to the customer's process or product, with minor performance loss.
3 minor: Failure would create a minor nuisance to the customer, but the customer can overcome it without performance loss.
2 very minor: Failure may not be readily apparent to the customer, but would have a minor effect on the process or product.
1 none: Failure would not be noticeable to the customer.
Table 6.4. Recommendations for Severity, Occurrence and Detection rating scales.
6.4.5 Assigning an Occurrence Rating for Each Failure
Mode
The occurrence rating O is based on a 10-point scale and it specifies the probability
that a certain failure mode appears during the system or process life time. In
the assessment phase, the team tries to answer various questions.
• What is the frequency of failures of similar components that are still on the
market?
• Is the considered component similar to the previous generation of components?
• What changes have been made when switching from the previous to the new generation?
• Is it a completely different or even completely new component?
• Have the operating conditions and/or environmental conditions changed?
• Has a statistical assessment of the probability of failure occurrence been
carried out?
6.4.6 Assigning a Detection Rating for Each Failure
Mode and/or Effect
The detection rating D is based on a 10-point scale and it expresses how likely we
are to detect a failure or the effect of a failure. We start this step by identifying
current controls that may detect a failure or the effect of a failure.
6.4.7 Calculation of the Risk Priority Number
The risk priority number (RPN) is simply calculated
by multiplying the severity rating S times the occurrence rating O times the
detection rating D for all of the items.
$$RPN = S \times O \times D \tag{6.19}$$
It is a number between 1 and 1000. Because the members of the group may rate S, O
and D differently, it is worthwhile
to use one of the following techniques to achieve consensus
• vote in the group,
• inclusion of an expert,
• deferring the decision to one of the team members,
• classification of failures and/or effects by size according to S, O or D,
• prolonged debate or,
• voting for higher RPN value.
6.4.8 Prioritizing the Failure Modes for Action
The failure modes can now be prioritized by ranking them in order from the
highest RPN number to the smallest. A Pareto diagram is helpful to visualize
the differences between the various ratings. The team must now decide which
items to work on. Usually the team sets a cut-off RPN, where any failure modes
with an RPN above that point are attended to.
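The RPN calculation and the ranking step can be sketched as follows. The ratings are the ones shown for the tube in Figure 6.9; the cut-off value of 200 and the function names are assumptions made for illustration.

```python
def rpn(severity, occurrence, detection):
    """Risk priority number, equation (6.19)."""
    return severity * occurrence * detection

def prioritize(failure_modes, cutoff):
    """Rank failure modes by RPN (highest first) and keep those above the cut-off."""
    ranked = sorted(
        ((name, rpn(s, o, d)) for name, s, o, d in failure_modes),
        key=lambda item: item[1],
        reverse=True,
    )
    return [(name, value) for name, value in ranked if value > cutoff]

# (failure mode, S, O, D) taken from Figure 6.9.
failure_modes = [
    ("tube clogging", 10, 6, 3),
    ("tube cracks", 10, 5, 6),
    ("tube holes", 8, 8, 4),
]
action_list = prioritize(failure_modes, cutoff=200)   # hypothetical cut-off
```

With this cut-off the cracks (RPN 300) and holes (RPN 256) are selected for action, while clogging (RPN 180) is not.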
6.4.9 Taking Action to Eliminate or Reduce the High
Risk Failure Modes
Using an organized problem solving process, the team identifies and implements
actions to eliminate or reduce the high risk failure modes. The following are the
measures that can contribute to the reduction of the RPN index:
S Use of personal protective equipment (helmet, safety glasses, protective
gloves). Safety switches. Use of materials, such as safety glass, that do
not cause such serious damage in case of failure.
O Increase the process performance index Cpk using test design and/or mod-
ification methods. Process of continuous improvement. Use of the mecha-
nisms to be activated in order for the product or process to work (example:
the garden lawnmower has a lever that needs to be pressed constantly dur-
ing operation).
D Statistical control of the process. Use of regularly calibrated measuring
devices. Preventive maintenance as a means of timely detection of potential
defects. Using encoding (colors and shapes) that informs the user about
what is right and what is not.
6.4.10 Calculation of the Resulting RPN
Once action has been taken to improve the product or process, new ratings for
severity, occurrence and detection should be determined and a resulting RPN
calculated.
6.5 System Safety and Fault Tree Analysis
Reliability and product safety are closely related. Safety is a state of the product
that does not lead to injuries, loss of life, or severe damage to equipment and
harmful consequences for the environment (MIL-STD-882). The safety analysis
is carried out primarily in cases where a certain type of failure could lead to
catastrophic consequences for humans or the environment. The aim of the anal-
ysis is to determine the types, probability of realization and corrective measures
for potential disasters with catastrophic consequences. Since the probability of
realization of these events is often very low, and because such failures are pre-
dominantly due to the sequence of several events, the fault tree analysis (FTA)
is used to detect and quantify them.
6.5.1 Fault Tree Analysis
• AND gate. An output event occurs only if all incoming events occur.
• OR gate. An output event occurs if at least one input event occurs.
• Basic event. An independent basic event, which can no longer be broken down.
• Resultant event. Description of the failure that arises from the logical combination of other failures, or description of the logic gate below it.
• Incomplete event. An event that is not completely broken down due to a lack of information.
• Transfer-in and transfer-out. Used to link sections of the fault tree that are not contiguous.
• Conditional event. A condition or restriction tied to a logic gate.
• Normal event. A normally occurring event that is not a fault.
Figure 6.10. Symbols of fault tree analysis.
A fault tree analysis is a graphical design technique for detection of potential
failures that could lead to catastrophic consequences. There are four major steps
to a fault tree analysis:
• definition of the system, its boundaries and the top event,
• construction of the fault tree,
• qualitative evaluation by identifying those combinations of events that will
cause the top event and
• quantitative evaluation by assigning failure probabilities or unavailabilities
to the basic events and computing the probability of the top event.
Figure 6.11. Alarm device diagram. (Elements: observer, sensor 1, sensor 2, switch, alarm, main source of electricity, auxiliary source of electricity, Pisa tower.)
Figure 6.12. The alarm system fault tree. (Events: T alarm failure; A power failure; B sensor failure; C alarm failure; D secondary alarm failure; E auxiliary source failure; F main source failure; G sensor 1 failure; H sensor 2 failure; I human error; J sensor 1 failure; K switch failure.)
Faults can be classified as primary, secondary and command. Faults may also be
classified as active or passive.
The top event T can be expressed as a function of basic and incomplete events
$$\begin{aligned}
T &= A \cup B \cup C \cup D\\
&= (E \cap F) \cup (G \cap H) \cup C \cup D\\
&= (E \cap F) \cup ((I \cup J \cup K) \cap H) \cup C \cup D
\end{aligned} \tag{6.20}$$
Figure 6.13. Equivalent tree of minimal cut sets.
Figure 6.14. A fault tree and an equivalent fault tree.
6.5.2 Minimal Cut Sets
A minimal cut set is a minimal set of basic events that cause the top event. If $M_i$
is the ith minimal cut set, the top event T can be expressed as the union of all k minimal
cut sets
$$T = M_1 \cup M_2 \cup \cdots \cup M_k \tag{6.21}$$
where $M_i = E_1 \cap E_2 \cap \cdots \cap E_{n_i}$ and $E_i$ are basic events.
E,FI,HJ,HK,HCD
iteration
ABCD
1 2
E,FG,HCD
3
iteration
AB
1 2
AC,D
3
AE,DF,D
AE,EE,AF,EF,A
4
AE
5
(a) (b)
Figure 6.15. Minimal cut sets examples: Figure 6.12 (a) and Figure 6.14 (b).
6.5.3 Quantitative Analysis
In quantitative analysis, we estimate the probability of the top event, given
that the probabilities of the basic and incomplete events are known. A tree of
minimal cut sets helps with the analysis, since
\[
P(T) = P(M_1 \cup M_2 \cup \cdots \cup M_k) \tag{6.22}
\]
If the minimal cut sets are mutually exclusive events, we can write
\[
P(T) = P(M_1) + P(M_2) + \cdots + P(M_k) \tag{6.23}
\]
and if the basic events are statistically independent, we can write
\[
P(M_i) = P(E_1 \cap E_2 \cap \cdots \cap E_{n_i}) = P(E_1) P(E_2) \cdots P(E_{n_i}) \tag{6.24}
\]
Because the probabilities $P(M_i)$ are small, equation (6.23) gives satisfactory
results even when the minimal cut sets are not mutually exclusive; in that case
it is an upper bound on $P(T)$.
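To illustrate how close the sum (6.23) is to the exact value, the sketch below evaluates both for the minimal cut sets of the alarm system. The basic-event probabilities are hypothetical values chosen only for illustration, the function names are not from the text, and the exact value is obtained by inclusion-exclusion under the assumption of independent basic events.

```python
from itertools import combinations
from math import prod

# Minimal cut sets of Figure 6.12 as listed in Figure 6.15 (a).
cut_sets = [{"E", "F"}, {"I", "H"}, {"J", "H"}, {"K", "H"}, {"C"}, {"D"}]

# Hypothetical basic-event probabilities (assumed, not from the text).
p = {"C": 0.001, "D": 0.002, "E": 0.01, "F": 0.02,
     "H": 0.01, "I": 0.05, "J": 0.01, "K": 0.01}


def cut_set_probability(cut_set, p):
    """Equation (6.24): product of independent basic-event probabilities."""
    return prod(p[e] for e in cut_set)


def top_event_rare(cut_sets, p):
    """Equation (6.23): sum of the cut set probabilities."""
    return sum(cut_set_probability(m, p) for m in cut_sets)


def top_event_exact(cut_sets, p):
    """Inclusion-exclusion over the union (6.22), independent basic events."""
    total = 0.0
    for r in range(1, len(cut_sets) + 1):
        sign = (-1) ** (r + 1)
        for group in combinations(cut_sets, r):
            # The intersection of cut sets occurs when all their events occur.
            events = set().union(*group)
            total += sign * prod(p[e] for e in events)
    return total


print(top_event_rare(cut_sets, p), top_event_exact(cut_sets, p))
```

Inclusion-exclusion needs 2^k - 1 terms for k cut sets, which is why the simple sum is attractive for large trees.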
Chapter 7
Maintainability
We distinguish between:
• reactive,
• preventive and
• predictive maintenance.
7.1 Analysis of Downtime
Delay time in logistic support includes:
• waiting time for spare parts,
• initial administrative time,
• initial production or purchasing time,
• maintenance time and
• transport time.
Figure 7.1. Breakdown of maintenance downtime. [Diagram: total downtime split
into delay in logistics support, maintenance delay time, access time, diagnostic
time, active time of repair or replacement, and time of verification and
alignment; a sub-span is labelled repair time.]
Maintenance delay time includes:
• waiting time for available maintenance resources,
• waiting time for available maintenance facilities,
• administrative (notification) time and
• travel time.
Maintenance resources include:
• personnel,
• test and support equipment,
• tools and
• manuals or other technical data.
Maintenance facilities may be:
• repair docks,
• service bays,
• fixed test stands and
• other facilities.
7.2 The Repair Time Distribution
Let $T$ be the continuous random variable representing the time to repair a failed
unit, with probability density function $h(t)$ and cumulative distribution
function
\[
\Pr\{T < t\} = H(t) = \int_0^t h(\tau)\,d\tau \tag{7.1}
\]
The mean time to repair may be found from
\[
\mathrm{MTTR} = \int_0^\infty t\,h(t)\,dt = \int_0^\infty (1 - H(t))\,dt \tag{7.2}
\]
and the variance of the repair distribution from
\[
\sigma^2 = \int_0^\infty (t - \mathrm{MTTR})^2\,h(t)\,dt \tag{7.3}
\]
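If the repair time distribution is exponential with $h(t) = r e^{-rt}$ and the rate
of repair $r = 1/\mathrm{MTTR}$, then
\[
H(t) = \int_0^t \frac{e^{-\tau/\mathrm{MTTR}}}{\mathrm{MTTR}}\,d\tau = 1 - e^{-t/\mathrm{MTTR}} \tag{7.4}
\]
The log-normal distribution is often used to represent the repair time
distribution
\[
h(t) = \frac{1}{\sqrt{2\pi}\,t\,s} \exp\left\{ -\frac{1}{2} \frac{(\ln(t/t_{\mathrm{med}}))^2}{s^2} \right\} \tag{7.5}
\]
where $t_{\mathrm{med}}$ is the median time to repair and $s$ is the shape parameter.
The probability of a repair being finished in time $t$ is found by utilizing the
relationship between the normal and log-normal distributions
\[
\Pr\{T < t\} = H(t) = \Phi\left( \frac{1}{s} \ln \frac{t}{t_{\mathrm{med}}} \right) \tag{7.6}
\]
where $\Phi$ is the standard normal cumulative distribution function. The mean
time to repair is given by
\[
\mathrm{MTTR} = t_{\mathrm{med}}\, e^{s^2/2} \tag{7.7}
\]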
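As a numerical illustration of equations (7.5) to (7.7), the sketch below evaluates the log-normal repair model and checks the closed-form MTTR against the integral form in (7.2). The parameter values (a median of 2 h and s = 0.5) and the function names are assumptions made for the example; the standard normal distribution function is built from `math.erf`.

```python
from math import erf, exp, log, pi, sqrt


def std_normal_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))


def lognormal_pdf(t, t_med, s):
    """Repair time density h(t), equation (7.5)."""
    return exp(-0.5 * (log(t / t_med) / s) ** 2) / (sqrt(2.0 * pi) * t * s)


def lognormal_cdf(t, t_med, s):
    """Probability that a repair is finished by time t, equation (7.6)."""
    return std_normal_cdf(log(t / t_med) / s)


def lognormal_mttr(t_med, s):
    """Mean time to repair, equation (7.7)."""
    return t_med * exp(s ** 2 / 2.0)


def mttr_numeric(t_med, s, t_max, steps=20_000):
    """Midpoint-rule estimate of the integral of 1 - H(t) over [0, t_max]."""
    dt = t_max / steps
    return sum(1.0 - lognormal_cdf((k + 0.5) * dt, t_med, s)
               for k in range(steps)) * dt


t_med, s = 2.0, 0.5  # hypothetical median repair time in hours and shape
print(lognormal_cdf(3.0, t_med, s))      # probability of finishing within 3 h
print(lognormal_mttr(t_med, s))          # closed form (7.7)
print(mttr_numeric(t_med, s, 100.0))     # numerical check via (7.2)
```

The upper limit of 100 h truncates the integral; for these parameters the neglected tail is far below the integration error.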
7.3 System Repair Time
Let $\mathrm{MTTR}_i$ be the mean time to repair of the $i$th component,
$m_{\mathrm{um},i}$ the expected number of unscheduled repairs of the $i$th
component over the system design life and $q_i$ the number of identical
components of type $i$. Then the system mean time to repair can be expressed as
a weighted sum
\[
\mathrm{MTTR}_s = \frac{\sum_{i=1}^{n} q_i\, m_{\mathrm{um},i}\, \mathrm{MTTR}_i}{\sum_{i=1}^{n} q_i\, m_{\mathrm{um},i}} \tag{7.8}
\]
The expected number of unscheduled repairs of the $i$th component is
\[
m_{\mathrm{um},i} =
\begin{cases}
\dfrac{t_{\mathrm{o},i}}{\mathrm{MTBF}_i} & \text{for a renewal process} \\[1ex]
\displaystyle\int_0^{t_{\mathrm{o},i}} \rho_i(t)\,dt & \text{for minimal repair}
\end{cases}
\]
Consider a k-out-of-n redundant system. Repair in such a case can be carried
out in various ways:
• repair of the component immediately after a failure,
• repair of one component after a failure of n − k + 1 of the n components,
• repair of one component while maintaining all other operating components
after a failure of n − k + 1 of the n components,
• repair of all faulty components while maintaining all other components after
a failure of n − k + 1 of the n components.
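The weighted sum (7.8) is straightforward to evaluate. The sketch below assumes a renewal process for every component, so that the expected number of repairs is the operating time divided by the MTBF; the component data and the function names are hypothetical and serve only to show the arithmetic.

```python
def expected_repairs_renewal(t_o, mtbf):
    """Expected number of repairs over operating time t_o (renewal process)."""
    return t_o / mtbf


def system_mttr(components):
    """Equation (7.8); components is a list of (q_i, m_i, MTTR_i) tuples."""
    numerator = sum(q * m * mttr for q, m, mttr in components)
    denominator = sum(q * m for q, m, _ in components)
    return numerator / denominator


# Hypothetical system: two identical units of type 1 and one unit of type 2,
# each operating 10000 h over the design life.
components = [
    (2, expected_repairs_renewal(10_000.0, 2_000.0), 1.5),  # MTTR 1.5 h
    (1, expected_repairs_renewal(10_000.0, 5_000.0), 4.0),  # MTTR 4.0 h
]
print(system_mttr(components))  # (2*5*1.5 + 1*2*4.0) / (2*5 + 1*2) = 23/12
```

Components that fail often dominate the result, so the system MTTR lies closer to 1.5 h than to 4.0 h here.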
7.4 Reliability under Preventive Maintenance
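The reliability over the first and the second preventive maintenance intervals is
\[
\begin{aligned}
R_{\mathrm{pm}}(t) &= R(t) && \text{for } 0 \le t < T_{\mathrm{pm}} \\
R_{\mathrm{pm}}(t) &= R(T_{\mathrm{pm}})\,R(t - T_{\mathrm{pm}}) && \text{for } T_{\mathrm{pm}} \le t < 2T_{\mathrm{pm}}
\end{aligned}
\]
where $R(T_{\mathrm{pm}})$ is the probability of survival until the first preventive
maintenance and $R(t - T_{\mathrm{pm}})$ is the probability of surviving the
additional time $t - T_{\mathrm{pm}}$, given that the system was restored to its
original condition at time $T_{\mathrm{pm}}$. In general,
\[
R_{\mathrm{pm}}(t) = R^i(T_{\mathrm{pm}})\,R(t - iT_{\mathrm{pm}}) \quad \text{for } iT_{\mathrm{pm}} \le t < (i+1)T_{\mathrm{pm}} \tag{7.9}
\]
where $R^i(T_{\mathrm{pm}})$ is the probability of survival until the $i$th preventive
maintenance and $R(t - iT_{\mathrm{pm}})$ is the probability of surviving the
additional time $t - iT_{\mathrm{pm}}$.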
Figure 7.2. The influence of preventive maintenance on reliability (increasing
failure rate). [Plot: reliability (0.0 to 1.0) versus t (0 to 6Tpm), showing
R(t), Rpm(t), R^i(Tpm) and R(t − iTpm).]
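MTTF under preventive maintenance can be calculated as
\[
\begin{aligned}
\mathrm{MTTF} &= \int_0^\infty R_{\mathrm{pm}}(t)\,dt
  = \sum_{i=0}^{\infty} \int_{iT_{\mathrm{pm}}}^{(i+1)T_{\mathrm{pm}}} R_{\mathrm{pm}}(t)\,dt \\
&= \sum_{i=0}^{\infty} R^i(T_{\mathrm{pm}}) \int_{iT_{\mathrm{pm}}}^{(i+1)T_{\mathrm{pm}}} R(t - iT_{\mathrm{pm}})\,dt \\
&= \sum_{i=0}^{\infty} R^i(T_{\mathrm{pm}}) \int_0^{T_{\mathrm{pm}}} R(t)\,dt
\end{aligned}
\]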
In some preventive maintenance situations there is a possibility of a
maintenance-induced failure. Let $p$ be the probability of a maintenance-induced
failure during an individual preventive maintenance. The reliability after $i$
preventive maintenance procedures, taking the probability $p$ into account, is
\[
R_{\mathrm{pm}}(t) = R^i(T_{\mathrm{pm}})\,(1 - p)^i\,R(t - iT_{\mathrm{pm}}) \quad \text{for } iT_{\mathrm{pm}} \le t < (i+1)T_{\mathrm{pm}}
\]
Figure 7.3. The influence of preventive maintenance on reliability (decreasing
failure rate). [Plot: reliability (0.0 to 1.0) versus t (0 to 6Tpm), showing
R(t), Rpm(t), R^i(Tpm) and R(t − iTpm).]
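The piecewise reliability (7.9), its variant with maintenance-induced failures and the MTTF series can be checked numerically. The sketch below is illustrative only: the Weibull reliability and its parameters are assumed for the example, the function names are not from the text, and the MTTF uses the closed form obtained by summing the geometric series, MTTF = (integral of R(t) from 0 to Tpm) / (1 − R(Tpm)), a step that the derivation above does not write out.

```python
from math import exp, floor


def weibull_reliability(t, theta, beta):
    """Assumed component reliability R(t) = exp(-(t/theta)^beta)."""
    return exp(-((t / theta) ** beta))


def reliability_pm(t, t_pm, reliability, p=0.0):
    """Reliability under preventive maintenance every t_pm, equation (7.9).

    p is the probability of a maintenance-induced failure per maintenance.
    """
    i = floor(t / t_pm)  # number of maintenance actions completed before t
    return (reliability(t_pm) * (1.0 - p)) ** i * reliability(t - i * t_pm)


def mttf_pm(t_pm, reliability, steps=10_000):
    """MTTF under preventive maintenance (p = 0), geometric series summed."""
    dt = t_pm / steps
    integral = sum(reliability((k + 0.5) * dt) for k in range(steps)) * dt
    return integral / (1.0 - reliability(t_pm))


def R(t):
    # Hypothetical increasing failure rate: theta = 1000 h, beta = 2.
    return weibull_reliability(t, 1000.0, 2.0)


print(R(250.0), reliability_pm(250.0, 100.0, R))  # maintenance helps
print(reliability_pm(250.0, 100.0, R, p=0.01))    # with induced failures
print(mttf_pm(100.0, R))
```

For an exponential R(t) the same function returns 1/λ for any Tpm, which reflects the fact that preventive maintenance does not help when the failure rate is constant.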
7.5 Stochastic Point Processes
The relationships between the individual random variables are given by
\[
Y_i = X_i + S_i = T_i - T_{i-1}, \qquad T_i = \sum_{j=1}^{i} Y_j \tag{7.10}
\]
The expected values of the operating times $X_i$ and the down-times $S_i$ are
\[
E[X_i] = \mathrm{MTBF}, \qquad E[S_i] = \mathrm{MTTR} \tag{7.11}
\]
Figure 7.4. An example of a time frame of a component state. [Timeline along t
with the instants T0, T1, T2 and the intervals X1, S1, Y1.]
7.5.1 Renewal Process
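If $S_i \approx 0$, then the random variable $X_i$ is the operating time between the
$(i-1)$th and the $i$th failure, and the random variable $T_i$ is the operating
time until the $i$th failure. If the random variables $X_i$ are statistically
independent and have the same probability density function of failures $f(t)$,
then the probability density of the operating times $T_i$ converges for
$i \to \infty$ to the normal probability density function
\[
f_i(t) = \frac{1}{\sqrt{2\pi}\,\sigma_i} \exp\left\{ -\frac{(t - \mu_i)^2}{2\sigma_i^2} \right\} \tag{7.12}
\]
with the mean value
\[
\mu_i = E[T_i] = \sum_{j=1}^{i} E[X_j] = i\,E[X_i] = i\,\mathrm{MTBF} \tag{7.13}
\]
and the variance
\[
\begin{aligned}
\sigma_i^2 &= E[(T_i - i\,\mathrm{MTBF})^2] \\
&= E\Big[\Big(\sum_{j=1}^{i} (X_j - \mathrm{MTBF})\Big)^2\Big] \\
&= \sum_{j=1}^{i} \sum_{k=1}^{i} E[(X_j - \mathrm{MTBF})(X_k - \mathrm{MTBF})] \\
&= \sum_{j=1}^{i} E[(X_j - \mathrm{MTBF})^2] = i\sigma^2
\end{aligned}
\tag{7.14}
\]
where $\sigma^2$ is the variance of a single operating time, since
\[
E[(X_j - \mathrm{MTBF})(X_k - \mathrm{MTBF})] = 0 \quad \text{for } j \ne k
\]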
A stochastic point process can also be described by the number of failures in
the time interval $[0, t]$. Let $N(t)$ be a discrete random variable giving the
cumulative number of failures in the interval $[0, t]$. Then
\[
\begin{aligned}
\Pr\{N(t) = 0\} &= \Pr\{T_1 > t\} \\
\Pr\{N(t) = i\} &= \Pr\{T_i \le t < T_{i+1}\} \\
&= \Pr\{T_i \le t\} - \Pr\{T_{i+1} \le t\} = F_i(t) - F_{i+1}(t)
\end{aligned}
\tag{7.15}
\]
where the cumulative distribution functions are
\[
F_0(t) =
\begin{cases}
0 & \text{for } t \le 0 \\
1 & \text{for } t > 0
\end{cases}
\]
and
\[
F_i(t) = \int_0^t F_{i-1}(t - u)\,f(u)\,du, \quad i = 1, 2, \ldots
\]
7.5.1.1 Homogeneous Poisson Process
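If
\[
f(t) = \lambda e^{-\lambda t} \tag{7.16}
\]
with a failure rate of $\lambda = 1/\mathrm{MTBF}$, then
\[
\begin{aligned}
F_0(t) &= 1 \\
F_1(t) &= 1 - e^{-\lambda t} \\
F_2(t) &= 1 - e^{-\lambda t} - \lambda t\,e^{-\lambda t} \\
F_3(t) &= 1 - e^{-\lambda t} - \lambda t\,e^{-\lambda t} - \frac{(\lambda t)^2}{2}\,e^{-\lambda t}
\end{aligned}
\]
from which the general form of the cumulative distribution function follows
\[
F_i(t) = 1 - \sum_{j=0}^{i-1} \frac{(\lambda t)^j}{j!}\,e^{-\lambda t}, \quad i = 0, 1, \ldots \tag{7.17}
\]
From here it follows
\[
\Pr\{N(t) = i\} = \sum_{j=0}^{i} \frac{(\lambda t)^j}{j!}\,e^{-\lambda t} - \sum_{j=0}^{i-1} \frac{(\lambda t)^j}{j!}\,e^{-\lambda t} = \frac{(\lambda t)^i}{i!}\,e^{-\lambda t} \tag{7.18}
\]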
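Equation (7.18) can be confirmed numerically by forming the difference of two consecutive distribution functions from (7.17). The following is a small sketch with assumed values of the failure rate and time; the function names are chosen for the example.

```python
from math import exp, factorial


def erlang_cdf(i, t, lam):
    """F_i(t) from equation (7.17); the empty sum gives F_0(t) = 1."""
    partial = sum((lam * t) ** j / factorial(j) for j in range(i))
    return 1.0 - partial * exp(-lam * t)


def prob_n_failures(i, t, lam):
    """Pr{N(t) = i} as F_i(t) - F_{i+1}(t), equation (7.15)."""
    return erlang_cdf(i, t, lam) - erlang_cdf(i + 1, t, lam)


def poisson_pmf(i, t, lam):
    """Closed form on the right-hand side of equation (7.18)."""
    return (lam * t) ** i / factorial(i) * exp(-lam * t)


lam, t = 0.5, 4.0  # hypothetical failure rate (1/h) and time (h)
for i in range(5):
    print(i, prob_n_failures(i, t, lam), poisson_pmf(i, t, lam))
```

The two columns agree to rounding error, which is the statement of (7.18).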
7.5.1.2 Renewal Function
The renewal function gives the expected number of failures in the interval
$[0, t]$
\[
m(t) = \sum_{i=1}^{\infty} i \Pr\{N(t) = i\} \tag{7.19}
\]
If $\Pr\{N(t) = i\}$ in this equation is replaced by $F_i(t) - F_{i+1}(t)$, we get
\[
m(t) = \sum_{i=1}^{\infty} F_i(t) \tag{7.20}
\]
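The step from (7.19) to (7.20) is a telescoping sum. A short worked version, assuming the series converge so that the terms may be regrouped, is:

```latex
\begin{aligned}
m(t) &= \sum_{i=1}^{\infty} i\,\bigl[F_i(t) - F_{i+1}(t)\bigr] \\
     &= \sum_{i=1}^{\infty} i\,F_i(t) - \sum_{i=2}^{\infty} (i-1)\,F_i(t) \\
     &= F_1(t) + \sum_{i=2}^{\infty} \bigl[i - (i-1)\bigr]\,F_i(t)
      = \sum_{i=1}^{\infty} F_i(t)
\end{aligned}
```

For the homogeneous Poisson process, (7.19) is the mean of the distribution in (7.18), so $m(t) = \lambda t$.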
7.5.2 Minimal Repair Process
Aging of the product can be treated as a stochastic point process described by
the intensity function
\[
\rho(t) = \frac{dE[N(t)]}{dt} \tag{7.21}
\]
The intensity and renewal functions are related by
\[
m(t) = E[N(t)] = \int_0^t \rho(\tau)\,d\tau \tag{7.22}
\]
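As an illustration of (7.22), the sketch below integrates an intensity function numerically. The power-law form ρ(t) = a b t^(b−1) is an assumption introduced for the example and is not given in the text; for it the integral has the closed form m(t) = a t^b, which serves as a check. Parameter values and function names are hypothetical.

```python
def power_law_intensity(t, a, b):
    """Assumed intensity rho(t) = a * b * t**(b - 1); b > 1 models aging."""
    return a * b * t ** (b - 1.0)


def expected_failures(t, intensity, steps=10_000):
    """Midpoint-rule estimate of m(t), the integral of rho over [0, t]."""
    dt = t / steps
    return sum(intensity((k + 0.5) * dt) for k in range(steps)) * dt


a, b = 1e-6, 2.0  # hypothetical parameters


def rho(t):
    return power_law_intensity(t, a, b)


print(expected_failures(1000.0, rho))  # numerical value of m(1000 h)
print(a * 1000.0 ** b)                 # closed form a * t**b for comparison
```

With b = 1 the intensity is constant and the result reduces to the homogeneous Poisson case m(t) = a t.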
7.5.3 Overhaul
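If the probability density function of failures $f(t)$ is known, then the expected
period of the overhaul is
\[
\begin{aligned}
E[T_{\mathrm{ov}}] &= R(T_{\mathrm{pm}})\,E[t \mid t \ge T_{\mathrm{pm}}] + F(T_{\mathrm{pm}})\,E[t \mid t < T_{\mathrm{pm}}] \\
&= R(T_{\mathrm{pm}})\,T_{\mathrm{pm}} + F(T_{\mathrm{pm}}) \int_0^\infty t\,f(t \mid t < T_{\mathrm{pm}})\,dt
\end{aligned}
\]
Because from Bayes's theorem it follows that
\[
f(t \mid t < T_{\mathrm{pm}}) =
\begin{cases}
0 & \text{for } t \ge T_{\mathrm{pm}} \\
\dfrac{f(t)}{F(T_{\mathrm{pm}})} & \text{for } t < T_{\mathrm{pm}}
\end{cases}
\]
we can finally write
\[
E[T_{\mathrm{ov}}] = R(T_{\mathrm{pm}})\,T_{\mathrm{pm}} + \int_0^{T_{\mathrm{pm}}} t\,f(t)\,dt = \int_0^{T_{\mathrm{pm}}} R(t)\,dt \tag{7.23}
\]
For an exponential probability density function of failures, the expected period
of the overhaul is
\[
E[T_{\mathrm{ov}}] = \int_0^{T_{\mathrm{pm}}} e^{-\lambda t}\,dt = \frac{1}{\lambda}\left(1 - e^{-\lambda T_{\mathrm{pm}}}\right)
\]
The case with $\lambda = 0.00001\ \mathrm{h}^{-1}$ and an overhaul period of
$T_{\mathrm{pm}} = 10000\ \mathrm{h}$ is shown in Figure 7.5.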
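The numbers of this example can be reproduced directly. The sketch below evaluates the closed form for the exponential case and, as a cross-check, the integral in (7.23) by the midpoint rule; the function names are chosen for the example and are not from the text.

```python
from math import exp


def expected_overhaul_period_exponential(lam, t_pm):
    """Closed form (1 - exp(-lam * t_pm)) / lam for exponential failures."""
    return (1.0 - exp(-lam * t_pm)) / lam


def expected_overhaul_period(reliability, t_pm, steps=10_000):
    """Equation (7.23): midpoint-rule integral of R(t) over [0, t_pm]."""
    dt = t_pm / steps
    return sum(reliability((k + 0.5) * dt) for k in range(steps)) * dt


lam, t_pm = 1e-5, 1e4  # values from the example: 1/h and h
print(expected_overhaul_period_exponential(lam, t_pm))  # about 9516 h
print(expected_overhaul_period(lambda t: exp(-lam * t), t_pm))
```

The general function accepts any reliability function, so the same check can be repeated for a non-exponential R(t).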
Figure 7.5. Example of the overhaul period calculation. [Plot of f(t) versus t
(0 to 15000 h) with Tpm marked, R(Tpm) indicated and E[Tov] = 9516 h.]